重庆大学学报:社会科学版
Journal of Chongqing University(Social Sciences Edition)
2008
Issue 6
pp. 130-135
标准误 不确定度 大学英语六级考试 多元概化理论 信度 语言测试
SEM; uncertainty; CET-6; MGT; reliability; language testing
通过对170名大学生在一套旧式大学英语六级考试客观题上的反应数据的分析,演示了如何用多元概化理论这一工具计算复杂结构语言试卷测试结果的信度系数。结果显示,对于这组学生,总分信度系数达0.921与达0.907,但各部分的差异很大,词汇和语法部分的信度系数最高,为0.802与0.782,听力部分的信度系数次之,为0.769与0.744,阅读理解部分的最低,为0.551与0.503。进一步的分析揭示,在这套试卷的70道客观题中,与各自部分不融洽的题目有23个,其中听力部分6道,阅读部分10道,词汇语法部分7道。如果这些不融洽题目上的成绩不记入总成绩,总分和各部分成绩的信度系数都大幅度提高,其中总分信度系数提高到0.937,听力部分提高到0.831,阅读部分提高到0.773,词汇语法部分提高到0.859。在分析的基础上,对语言测试工作者提出了5条积极的建议。
By analyzing the response data of 170 tertiary-level students on the objective items of an old-version CET-6 test form, we demonstrate how multivariate generalizability theory can be used to estimate reliability coefficients for a language test form with a complex structure. The results show that, for the given group of students and the population it represents, the objective section of the test as a whole has a generalizability coefficient of 0.921 and a dependability index of 0.907, but the three sections vary greatly in reliability: Vocabulary and Structure is highest, with a generalizability coefficient of 0.802 and a dependability index of 0.782; Listening Comprehension follows, with 0.769 and 0.744; and Reading Comprehension is lowest, with 0.551 and 0.503. Further analysis reveals that, of the 70 items, 23 are inconsistent with their respective sections: 6 in Listening Comprehension, 10 in Reading Comprehension, and 7 in Vocabulary and Structure. If responses to the inconsistent items are not counted, both the overall reliability coefficient and the reliability coefficient for each section improve substantially. The coefficient for the total score rises to 0.937, that for Listening Comprehension to 0.831, that for Reading Comprehension to 0.773, and that for Vocabulary and Structure to 0.859. On the basis of this analysis, 5 suggestions are proposed for language testers in China.