心理科学
Psychological Science
2015, Issue 2, pp. 452-456
computerized multistage adaptive testing (MST), paper-and-pencil test (P&P), computerized adaptive test (CAT), stage, module
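Computerized multistage adaptive testing is a computer-based test format that uses sets of items as the unit of testing and tests and scores examinees in a multistage adaptive manner. Recent research on various test formats has found that it shows greater advantages than computerized adaptive testing and paper-and-pencil testing. Compared with paper-and-pencil testing, its advantages include parameter invariance and more precise ability estimation. Compared with computerized adaptive testing, its advantages include controllable item characteristics and the opportunity for examinees to review items. How to reduce measurement error and make its application more convenient and effective is the direction for future research.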
Computerized multistage adaptive testing (MST) is a computer-based test format in which sets of items are administered and scored as a unit. These item sets are called modules or testlets. Each module is a short linear test that contributes a specified share of the test information, which helps reduce measurement error. Items in a module may share one or more common stems, such as a passage or a diagram, or they may be unrelated to one another. In MST, adaptation occurs at the level of the item set: the next module is selected on the basis of the examinee's cumulative performance on the items already administered. MST therefore adapts less often than item-level computerized adaptive testing (CAT) but more often than conventional paper-and-pencil (P&P) testing. It combines the components of conventional P&P testing with the adaptive character of CAT, and combining the strengths of the two formats can offset the weaknesses of each. In this sense, MST is a compromise between the two test forms.
Test developers must first consider how to build an MST. The number of stages, the number of modules in each stage, and the number of items in each module must all be decided before the test is built, as must the target statistics and the qualitative specifications. The methods of scoring, adapting, and assembling the test are equally vital. After the test has been assembled but before it is administered, test developers can check the non-statistical properties of the items, including content balance, item ordering and potential context effects, cognitive level, item format, answer key position, word count, and any other characteristic of interest or concern in developing the modules. MST can help satisfy the item response theory (IRT) assumptions of local independence and unidimensionality among modules.
Items that share a stem and therefore violate the local independence assumption are treated as a single polytomous item. In this way, test developers can assemble all modules optimally. When examinees take the test, they can preview and review the items within a module and change answers they believe to be wrong, so they too can work through the modules in the way that suits them best. Because both test developers and examinees can handle the modules optimally, better examination results can be obtained, and MST thus offers an opportunity to improve the quality of examinations. It has already been used in large-scale assessments such as the Uniform CPA Examination and the Graduate Record Examination (GRE). Studies of various test forms indicate that MST has clear advantages over both conventional P&P testing and CAT. Compared with conventional P&P testing, its advantages include parameter invariance, time savings, timely feedback, and more accurate ability estimation. Compared with CAT, its advantages include control over non-statistical properties and item exposure and the opportunity for examinees to review items. Future research should address how to minimize measurement error so that MST can be applied more conveniently and effectively.
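The module-level adaptation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not specify a routing rule, so the sketch assumes simple number-correct routing with fixed cut scores. The two-stage layout, the module names, the answer keys, and the cut scores are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Module:
    """A short linear set of items administered and scored as a unit."""
    name: str
    answer_key: List[str]  # correct option for each item (hypothetical data)


def score_module(module: Module, responses: List[str]) -> int:
    """Number-correct score for one module."""
    if len(responses) != len(module.answer_key):
        raise ValueError("one response per item is required")
    return sum(r == k for r, k in zip(responses, module.answer_key))


def route(cumulative_score: int, cuts: List[Tuple[int, str]]) -> str:
    """Pick the next module from cut scores listed from highest to lowest.

    Assumption: routing uses the cumulative number-correct score; an
    operational MST might route on an IRT ability estimate instead.
    """
    for min_score, name in cuts:
        if cumulative_score >= min_score:
            return name
    raise ValueError("no module matches the cumulative score")


def administer(modules: Dict[str, Module], first: str,
               cuts_by_stage: List[List[Tuple[int, str]]],
               responses: Dict[str, List[str]]) -> Tuple[List[str], int]:
    """Walk one examinee through the stages; return the path and total score."""
    path = [first]
    total = score_module(modules[first], responses[first])
    for cuts in cuts_by_stage:
        next_name = route(total, cuts)  # adaptation happens between modules
        path.append(next_name)
        total += score_module(modules[next_name], responses[next_name])
    return path, total


# Hypothetical two-stage panel: one routing module, three second-stage modules.
modules = {
    "routing": Module("routing", ["A", "B", "C", "D"]),
    "easy": Module("easy", ["A", "A", "B", "B"]),
    "medium": Module("medium", ["C", "D", "A", "B"]),
    "hard": Module("hard", ["D", "C", "B", "A"]),
}
cuts_by_stage = [[(3, "hard"), (2, "medium"), (0, "easy")]]

# One examinee's responses; only the modules actually routed to are needed.
responses = {
    "routing": ["A", "B", "C", "A"],
    "hard": ["D", "C", "A", "A"],
}

path, total = administer(modules, "routing", cuts_by_stage, responses)
print(path, total)
```

Because the examinee responds to a whole module before any routing decision is made, answers within a module can be reviewed and changed without affecting which items have already been selected, which is the property the abstract contrasts with item-level CAT.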