Document Type: Original Research Article
Hakim Sabzevari University, Sabzevar, Iran.
University of Tehran, Tehran, Iran.
Universidad de Oviedo, Plaza de Feijoo, s/n, Oviedo, 33003, Spain.
The aim of the present study was twofold. First, it investigated whether the University of Tehran English Proficiency Test (UTEPT) exhibited substantial gender Differential Item Functioning (DIF). Second, the flagged DIF items were subjected to a content analysis to determine the underlying sources of DIF. Mantel-Haenszel (MH) and Logistic Regression (LR), two popular methods of DIF detection, were employed to analyze the data obtained from 1550 test takers in 2010. The findings indicated that, although 28% of the items were initially flagged by MH and LR as displaying gender DIF, the effect size of DIF was mostly negligible. Moreover, the content analysis showed that it is sometimes difficult to hypothesize which linguistic element causes DIF in an item. However, humanities-oriented subjects were rated as favoring females, and science-oriented subjects as favoring males. Finally, a correlation index of .90 indicated that MH and LR produce highly consistent DIF results. These findings are discussed, and implications for test developers and DIF researchers are provided.
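For readers unfamiliar with the MH procedure, the following is a minimal sketch of how a Mantel-Haenszel DIF effect size can be computed for a single item. It is an illustration, not the analysis code used in the study. The function name `mantel_haenszel_dif`, the input layout, and the example counts are hypothetical, and the conversion to the ETS delta scale (delta = -2.35 ln alpha) is a commonly used convention assumed here rather than one reported in the abstract.

```python
import math

def mantel_haenszel_dif(strata):
    """Return (alpha_MH, delta_MH) for one item.

    strata: one (a, b, c, d) tuple per matched total-score level, where
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    The group labels and this layout are assumptions made for illustration.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip score levels with no test takers
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("common odds ratio is undefined or degenerate for these counts")
    alpha = num / den                 # MH common odds ratio across score levels
    delta = -2.35 * math.log(alpha)   # ETS delta scale (assumed convention)
    return alpha, delta

# Hypothetical counts for one item at two matched score levels.
alpha, delta = mantel_haenszel_dif([(30, 10, 30, 10), (20, 20, 20, 20)])
print(alpha, delta)
```

Under this convention an odds ratio of 1 (delta of 0) means the item behaves the same for both groups once ability is matched, and a negative delta means the item favors the reference group. An absolute delta below 1 is commonly treated as negligible, which is the sense in which an item can be flagged by a significance test yet show a trivial effect size.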