Files in this item

1_Jang_So Young.pdf (application/pdf, 1 MB)
Title: The development and evaluation of a systematic training program for increasing both rater reliability and rating accuracy
Author(s): Jang, So Young
Director of Research: Davidson, Frederick G.
Doctoral Committee Chair(s): Davidson, Frederick G.
Doctoral Committee Member(s): Chang, Hua-Hua; Zhang, Jinming; Sadler, Randall W.
Department / Program: Educational Psychology
Discipline: Educational Psychology
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Rater training; Development of rater training program
Abstract: The primary purposes of this study are to identify the characteristics of a rater training program model and to develop an efficient training model at the University of Illinois at Urbana-Champaign. The study centers on a basic conception of rater reliability, including true-score measurement of examinees' language proficiency, and rests on a definition of rater reliability derived from a reinterpretation of the various meanings of reliability. To these ends, a basic framework of standardization was built from training theories, and the study proposes that a rater training program can be standardized through systematic changes that consider (a) the relevant literature, (b) the test instrument itself, (c) the test procedure, and (d) contextual effects such as the characteristics of the stakeholders, their concerns, the structure of the test, and washback effects of test use. A modified version of Lynch's program evaluation model (1996, 2003) was used to collect evidence from multiple sources, drawing data from the entire evaluation process, from needs analysis through a feedback system based on the final product of the evaluation. The effectiveness of both the training program and individual rater performance was assessed by integrating all collected data sources within measurement theory. Mixed methods were used for the data analysis, which investigated training effectiveness by measuring raters' scoring reliability and informed a new training program for raters' professional development. Quantitative analysis was applied to the surveys, the rating corpus, and measures of training effectiveness; qualitative and document analysis were essential for examining the training materials and workshop observations and for exploring changes in the raters' perceptions.
The results of this study offer educational implications for language testing. At the program level, standardized training contributed to shared responsibility among test users. The results support the idea that raters' professionalism can be improved by providing access to input of similar quality, which reinforces their learning and skills through training. A salient contribution of this dissertation is its collaboration with stakeholders in an actual test administration setting: stakeholders' concerns and challenges were clearly identified, shared, and resolved with the practitioners (the EPT trainer and raters). The study also underscores the importance of balancing an understanding of fundamental theoretical underpinnings with the application of theory through practical experience. It can be concluded that this study contributes to enhanced rating validity and cumulative growth in scoring reliability, as well as a positive washback effect for future rater training programs.
Issue Date: 2010-05-14
Rights Information: Copyright 2010 So Young Jang
Date Available in IDEALS: 2010-05-14
Date Deposited: May 2010
