Pragmatic Rater Training: Does It Affect Non-native L2 Teachers' Rating Accuracy and Bias?

Document Type: Original Article


1 Allameh Tabataba’i University, Iran.

2 Sharif University of Technology, Iran.


The pragmatics assessment literature offers little research on rater consistency and bias. To address this underexplored topic, this study investigated whether a training program focused on pragmatic rating improves the accuracy of non-native English speaker (NNES) ratings of refusal production, as measured against native English speaker (NES) ratings, and whether NNES rating bias diminishes after training. To this end, 50 NNES teachers rated EFL learners’ responses to a 6-item written discourse completion task (WDCT) for the speech act of refusal before and after attending a rating workshop. The same WDCT was rated by 50 NES teachers, whose ratings served as a benchmark. Comparison of the pre-workshop non-native ratings with the native benchmark in terms of mean, SD, mean difference, and native/non-native correlation revealed that non-native raters tended to be more lenient and that their ratings were highly divergent, both on the total WDCT and across individual items. After training, however, the non-native ratings became more accurate and consistent, approximating the native benchmark more closely. Rater bias was examined through a FACETS analysis, which showed that many of the raters were outliers both before and after training. Moreover, after training, a few raters became biased in rating certain items. From these findings, it can be concluded that pragmatic rater training can positively influence non-native ratings by bringing them closer to native ratings and making them more consistent, but not necessarily less biased.
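
As a rough illustration of the benchmark comparison described above (mean, SD, mean difference, and native/non-native correlation), the following minimal Python sketch computes these four descriptive statistics for one set of ratings against a benchmark. It is not the authors' analysis: the function names, the use of per-item mean ratings, and all numbers are hypothetical and invented purely for illustration, and the FACETS bias analysis is not reproduced here.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def compare_to_benchmark(nnes, nes):
    """Descriptive comparison of non-native ratings against a native benchmark."""
    return {
        "nnes_mean": mean(nnes),
        "nes_mean": mean(nes),
        "nnes_sd": stdev(nnes),  # sample standard deviation
        "nes_sd": stdev(nes),
        # A positive value means the non-native raters were more lenient.
        "mean_difference": mean(nnes) - mean(nes),
        "correlation": pearson(nnes, nes),
    }


# Hypothetical per-item mean ratings for a 6-item WDCT (invented numbers,
# not taken from the study).
nes_benchmark = [2.0, 3.0, 4.0, 3.0, 2.0, 4.0]
nnes_pre = [3.5, 3.0, 4.5, 4.5, 3.0, 4.0]
nnes_post = [2.5, 3.0, 4.0, 3.5, 2.0, 4.0]

pre = compare_to_benchmark(nnes_pre, nes_benchmark)
post = compare_to_benchmark(nnes_post, nes_benchmark)

for label, result in (("pre-workshop", pre), ("post-workshop", post)):
    print(label, {name: round(value, 3) for name, value in result.items()})
```

In this invented data set, the post-workshop ratings show a smaller mean difference from the benchmark and a higher correlation with it than the pre-workshop ratings, which is the kind of pattern the abstract reports as approximation toward the native benchmark.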