ORIGINAL_ARTICLE
EFL Students’ Evaluation Apprehension and Their Academic Achievement, Gender, and Educational Level: Towards Designing and Validating a Comprehensive Scale
Student evaluation apprehension, one of the detrimental factors in an English as a foreign language (EFL) context, reduces and gradually diminishes student participation in classroom activities, since learners are mostly concerned with how others (the teacher and classmates) evaluate or judge their performance. Because studies addressing the important role of student evaluation apprehension are scarce, this study was conducted to validate a newly designed questionnaire via exploratory and confirmatory factor analyses and to examine the relationship between student evaluation apprehension and the academic achievement, gender, and educational level of 258 EFL students. The results of the EFA, CFA, and reliability analyses revealed that the new questionnaire is a valid and reliable instrument for measuring EFL students’ evaluation apprehension. Moreover, a significant negative correlation was observed between student evaluation apprehension and academic achievement. In addition, females were found to experience more evaluation apprehension than males, and BA students more than their MA counterparts.
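Two of the analyses reported in this abstract (internal-consistency reliability and the negative apprehension-achievement correlation) can be sketched numerically. The data below are simulated and every variable name is hypothetical; the study's actual items and achievement measure are not reproduced here.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
# Hypothetical data: 258 respondents, 20 five-point Likert items driven by
# a single latent "evaluation apprehension" trait.
latent = rng.normal(size=(258, 1))
items = np.clip(np.round(3 + latent + rng.normal(scale=0.8, size=(258, 20))), 1, 5)
apprehension = items.mean(axis=1)
# Simulated achievement that decreases as apprehension rises.
gpa = 3.0 - 0.3 * latent.ravel() + rng.normal(scale=0.3, size=258)

alpha = cronbach_alpha(items)                 # reliability of the scale
r = np.corrcoef(apprehension, gpa)[0, 1]      # negative, as the study reports
```

With a single trait driving all items, alpha is high and the correlation with the simulated achievement variable comes out negative, mirroring the direction of the reported finding.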
https://www.ijlt.ir/article_128355_1d92060d3f3d6944be68bfaf1acd4fc6.pdf
2021-03-01
1
21
evaluation apprehension
EFL students
academic achievement
EFA
CFA
Safoura
Jaheidzadeh
jahedi.s1310@gmail.com
1
English department, Imam Reza University
AUTHOR
Afsaneh
Ghanizadeh
afsanehghanizadeh@gmail.com
2
English department, Imam Reza International University, Mashhad, Iran
LEAD_AUTHOR
ORIGINAL_ARTICLE
Diagnostic Test Construction: Insights from Cognitive Diagnostic Modeling
Although Diagnostic Classification Models (DCMs) were introduced to the education system decades ago, they have seldom been employed for the original aims for which they were designed. DCMs have mostly been used to analyze large-scale non-diagnostic tests and have rarely been used to develop Cognitive Diagnostic Assessment (CDA) from scratch. Despite the prevalence of retrofitting CDA studies, true applications of CDA are believed to be rare since, firstly, a coherent framework for conducting such studies had not been available and, secondly, researchers were not able to evaluate various DCMs against the same model fit indices and criteria. This paper presents a summary of different types of DCMs and reviews true and retrofitting CDA studies. Having examined the limitations of previous CDA studies, the present study argues for the adoption and application of Ravand and Baghaei’s (2019) framework for conducting true CDA studies. This framework is of importance since not only does it fit into prominent frameworks in educational assessment, such as the Cognitive Design System and the Assessment Triangle, but it can also provide test developers with practical steps for constructing valid cognitive diagnostic tests.
https://www.ijlt.ir/article_128357_68e74c779145d7be54c24f610a549879.pdf
2021-03-01
22
35
cognitive diagnostic assessment
Diagnostic Classification Models
model fit indices
Q-Matrix construction
Somaye
Ketabi
s86k@hotmail.com
1
English Department, Faculty of Foreign Languages and Literatures, University of Tehran, Tehran, Iran
LEAD_AUTHOR
Seyyed Mohammad
Alavi
smalavi@ut.ac.ir
2
English Department, Faculty of Foreign Languages and Literatures, University of Tehran, Iran
AUTHOR
Hamdollah
Ravand
ravand@vru.ac.ir
3
English Department, Faculty of Literature & Humanities, Vali-e-Asr University of Rafsanjan, Rafsanjan, Iran
AUTHOR
ORIGINAL_ARTICLE
Assessment Alternatives in Developing L2 Listening Ability: Assessment FOR, OF, AS Learning or Integration? (Assessment x̄ Approach)
The unification of assessment and instruction has recently been realized in the form of purposeful assessment scenarios, here called Assessment x̄ Scenarios (analogous to Noam Chomsky's x̄ Theory). x̄ refers to any of the three assessment scenarios, namely Assessment for Learning (AFL), Assessment as Learning (AAL), and Assessment of Learning (AOL), as well as pairing each with another or integrating all three (i.e., the Integrated Assessment Scenario). Comparative investigation of the effect of each scenario on developing language skills, particularly listening, appears to be an unexplored area. In a bid to fill this gap, 100 conveniently sampled Iranian female EFL learners aged 13-19 were randomly divided into three experimental groups and one control group. Prior to the treatment, their listening ability was measured through a pre-test. Then, each experimental group (AFL, AAL, and Integrated Assessment) received listening instruction based on the principles of its specific scenario, while the control group was taught according to AOL principles. Their listening ability was then measured with a post-test identical to the pre-test. ANOVA, used to compare the performances of all groups, showed that the AFL and AAL groups significantly outperformed the AOL group, while the integrated assessment group significantly outperformed the other experimental groups. While the findings lend support to the unification approach, they also open up areas for further research.
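The omnibus comparison described here is a standard one-way ANOVA across the four conditions. The sketch below uses simulated post-test scores (all group means, sample sizes, and seeds are hypothetical, not the study's data) to show the shape of that analysis.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(42)
# Hypothetical post-test listening scores for the four conditions (n = 25 each).
aol = rng.normal(60, 5, 25)         # control: Assessment OF Learning
afl = rng.normal(70, 5, 25)         # Assessment FOR Learning
aal = rng.normal(72, 5, 25)         # Assessment AS Learning
integrated = rng.normal(80, 5, 25)  # Integrated Assessment Scenario

# One-way ANOVA on the four groups' post-test scores.
f_stat, p_value = f_oneway(aol, afl, aal, integrated)
```

A significant omnibus F, as simulated here, would then be followed by post-hoc pairwise comparisons to locate which scenarios differ, which is how the group-level conclusions in the abstract are reached.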
https://www.ijlt.ir/article_128359_27958c504fc9d88a0a5b315691c5df52.pdf
2021-03-01
36
57
Language assessment
Integrated assessment
Listening Performance
Elham
Ghorbanpour
elhamghorbanpour4@gmail.com
1
Department of English Teaching, Kish International Branch, Islamic Azad University, Kish Island, Iran
AUTHOR
Gholam
Abbasian
gabbasian@gmail.com
2
Department of English, Kish International Branch, Islamic Azad University, Kish Island, Iran
LEAD_AUTHOR
Ahmad
Mohseny
amohseny1328@gmail.com
3
English Language Department, Faculty of Persian Literature and Foreign Languages, Islamic Azad University, South Tehran Branch
AUTHOR
ORIGINAL_ARTICLE
The Construction and Validation of a Q-matrix for a High-stakes Reading Comprehension Test: A G-DINA Study
Investigating the processes underlying test performance is a major source of data for supporting the explanation inference in the validity argument (Chapelle, 2021). One way of modeling the cognitive processes underlying test performance is through the construction of a Q-matrix, which essentially summarizes the attributes explaining test takers’ response behavior. The current study documents the construction and validation of a Q-matrix for a high-stakes reading test within a generalized deterministic inputs, noisy “and” gate (G-DINA) model framework. To this end, the attributes underlying the 20 items of the reading comprehension test were specified through retrospective verbal reports and a Delphi procedure with domain experts. In the ensuing stage, the Q-matrix thus developed, along with the item response data of 2,625 test takers, was subjected to empirical analysis using the procedure suggested by de la Torre and Chiu (2016). Item-level results showed that, except for one item, the processes underlying the items were captured by compensatory and additive models. This finding has significant implications for model selection by DCM practitioners.
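A Q-matrix is simply a binary item-by-attribute table: entry (i, k) is 1 if item i requires attribute k. A minimal sketch, with a hypothetical 6-item, 4-attribute matrix (the attribute labels and entries are illustrative, not the study's 20-item Q-matrix), shows the structural checks typically run before model fitting.

```python
import numpy as np

# Hypothetical 6-item x 4-attribute Q-matrix (1 = item requires the attribute).
# Illustrative attributes: A1 vocabulary, A2 syntax, A3 inference, A4 main idea.
Q = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
])

# Basic identifiability checks before fitting a G-DINA-type model:
assert (Q.sum(axis=1) >= 1).all()  # every item taps at least one attribute
assert (Q.sum(axis=0) >= 2).all()  # every attribute is measured by >= 2 items
```

Empirical validation (e.g., the de la Torre and Chiu procedure cited in the abstract) then asks whether each row's 0/1 pattern is supported by the response data, flagging entries whose alteration would improve model fit.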
https://www.ijlt.ir/article_128361_4488710c7301b77e88a7aca333162624.pdf
2021-03-01
58
87
Cognitive Diagnostic Assessment
Reading Comprehension Test
Q-Matrix construction
Q-matrix Validation
Fateme
Roohani Tonekaboni
f_roohani_t@yahoo.com
1
English department, Shahid Chamran University, Ahvaz, Iran
LEAD_AUTHOR
Hamdollah
Ravand
ravand@vru.ac.ir
2
Vali-e-Asr University of Rafsanjan, 7315679823, Rafsanjan, Iran.
AUTHOR
Reza
Rezvani
rrezvani@yu.ac.ir
3
Yasouj University, Yasouj, Iran
AUTHOR
ORIGINAL_ARTICLE
Contrasting Groups Analysis of TOEFL® iBT Test Cut Scores and the Common European Framework of Reference (CEFR) Proficiency Levels: Kernel Density Estimation of an English Learners’ Corpus
Placing non-native speakers of English into appropriate classes involves mapping placement test scores onto proficiency levels based on predetermined cut scores. However, studies on how to set boundaries between different levels of proficiency have been lacking in the language testing literature. A top-down approach, in which a panel of experts sets cut scores, has dominated typical standard-setting procedures. A less utilized approach is to proceed bottom-up by clustering learners based on test scores. The purpose of this study was to fill this gap by examining Educational Testing Service’s (ETS) mapping of TOEFL® iBT Test scores to the Common European Framework of Reference (CEFR) levels. The study examined TOEFL® iBT score data from ICNALE (International Corpus Network of Asian Learners of English) and conducted optimal Kernel Density Estimation to find peaks in the distribution of test scores. In addition to the number of peaks, the local minima of the resulting distribution were chosen as cut-score boundaries delineating different ability groups. This method of separating scores, also known as the contrasting groups method, finds clusters of test takers based on maximum differences in scores. The results showed that ETS’s guide for cut scores linked to CEFR levels was comparable to Kernel Density Estimation with some exceptions; namely, two out of three cut scores were found to be similar. Implications are discussed in terms of test-centered versus examinee-centered methods of standard setting and the need to consider the demographics of the examinee population in determining cut scores.
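The core procedure described here, estimating a score density and taking its local minima as cut scores, can be sketched as follows. The three-cluster score distribution is simulated (the cluster means, spreads, and sample sizes are hypothetical, not the ICNALE data), and `scipy`'s Gaussian KDE stands in for whatever optimal-bandwidth estimator the study used.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(7)
# Hypothetical score distribution with three latent ability clusters.
scores = np.concatenate([
    rng.normal(45, 6, 400),
    rng.normal(72, 6, 400),
    rng.normal(95, 6, 200),
])

grid = np.linspace(0, 120, 1201)
density = gaussian_kde(scores)(grid)  # smoothed score distribution

# Interior local minima of the density; near-zero tail ripples are filtered out.
is_min = (density[1:-1] < density[:-2]) & (density[1:-1] < density[2:])
cut_scores = grid[1:-1][is_min & (density[1:-1] > 0.05 * density.max())]
# `cut_scores` are the data-driven boundaries between adjacent ability groups.
```

With three underlying clusters the density shows three peaks, and the two valleys between them become the cut scores, which is the bottom-up, examinee-centered logic the abstract contrasts with expert panels.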
https://www.ijlt.ir/article_128362_90d3b127f9959eeeb9833c490619ed00.pdf
2021-03-01
88
102
CEFR
cut scores
proficiency levels
standard-setting
TOEFL
Peter
Kim
pk2505@tc.columbia.edu
1
Teachers College, Columbia University
LEAD_AUTHOR
ORIGINAL_ARTICLE
Assessment Perceptions and Practices in Academic Domain: The Design and Validation of an Assessment Identity Questionnaire (TAIQ) for EFL Teachers
The present research aimed to conceptualize the construct of Teacher Assessment Identity (TAI) by designing and validating a questionnaire in the Iranian EFL context. To this end, a tentative scale with 96 items was piloted on 340 novice and experienced Iranian EFL teachers using Exploratory and Confirmatory Factor Analyses (EFA, CFA). The analyses led to the removal of 33 items, leaving a 61-item questionnaire on a five-point Likert scale. Moreover, the results revealed that the construct of TAI comprises 12 factors: assessment “knowledge”, “beliefs”, “attitudes”, “skills and confidence”, “practices”, “use assurance”, “feedback”, “rubric/criteria”, “consistency and consequence”, “grading/scoring”, “question-types”, and “roles”. Likewise, the convergent validity and reliability of the instrument for measuring the construct of concern were statistically confirmed (p>.05). The findings have various implications for EFL teachers, teacher trainers, course designers, and language researchers by raising their awareness of assessment identity and its underlying components.
https://www.ijlt.ir/article_128363_c23c977855725ac5b6008e2c7625a354.pdf
2021-03-01
103
131
Assessment identity
EFL teachers
questionnaire design
teacher identity
Validation
Masoomeh
Estaji
mestaji74@gmail.com
1
Allameh Tabataba'i University, Iran.
LEAD_AUTHOR
Farhad
Ghiasvand
f.ghiasvand70@yahoo.com
2
Allameh Tabataba'i University
AUTHOR
ORIGINAL_ARTICLE
A Mokken Scale Analysis of an English Reading Comprehension Test
Reading comprehension in English, as one of the most central skills, plays a vital role in the process of learning English as a foreign/second language. The current study used Mokken Scale Analysis (MSA), a probabilistic nonparametric approach to item response theory (IRT), to examine the unidimensionality and scalability of a 20-item reading comprehension test administered to 300 EFL university students in the Iranian context. The results showed no major concerns in terms of item scalability. The Monotone Homogeneity Model (MHM) fitted all the items of the test very well, as measured by the scalability coefficients and the rest-score groups method. Concerning invariant item ordering (IIO), it was concluded that the ordering of items according to their means is invariant across examinees, although the HT coefficient was small. Dimensionality analysis using the automated item selection procedure (AISP) showed that the test is unidimensional, providing evidence of the validity of the test in measuring a single ability dimension.
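The scalability coefficients at the heart of MSA can be illustrated for dichotomous items: the scale coefficient H is the ratio of summed observed inter-item covariances to the maximum covariances attainable given the item marginals. The sketch below uses simulated Rasch-like responses (300 examinees, 20 items, all parameters hypothetical); real MSA work is normally done with dedicated software such as the R `mokken` package.

```python
import numpy as np

def scalability_H(X: np.ndarray) -> float:
    """Loevinger/Mokken scale coefficient H for a binary (persons x items) matrix."""
    k = X.shape[1]
    p = X.mean(axis=0)                      # item difficulties (proportion correct)
    cov = np.cov(X, rowvar=False)
    num = den = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            num += cov[i, j]
            # Maximum covariance of two binary items given their marginals.
            den += min(p[i], p[j]) - p[i] * p[j]
    return num / den

rng = np.random.default_rng(1)
# Hypothetical responses: 300 examinees, 20 items with spread difficulties.
theta = rng.normal(size=(300, 1))
b = np.linspace(-1.5, 1.5, 20)
prob = 1 / (1 + np.exp(-1.7 * (theta - b)))
X = (rng.random((300, 20)) < prob).astype(int)

H = scalability_H(X)  # H >= .3 is the conventional floor for a (weak) Mokken scale
```

Item-level and item-pair versions of the same ratio (Hi and Hij) are what the abstract's "scalability coefficients" refer to; values near zero would flag items that do not belong on a common scale.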
https://www.ijlt.ir/article_130373_11e89cb4a60e31d46cab77d2a59755cd.pdf
2021-03-01
132
143
Double Monotonicity Model
Nonparametric item response theory
Monotone Homogeneity Model
Mokken scale analysis
reading comprehension
Mona
Tabatabaee-Yazdi
tabatabaee.mona@gmail.com
1
English Department, Tabaran Institute of Higher Education
LEAD_AUTHOR
Khalil
Motallebzadeh
kmotallebz@gmail.com
2
English Department, Tabaran Institute of Higher Education
AUTHOR
Purya
Baghaei
puryabaghaei@gmail.com
3
English Department, Islamic Azad University of Mashhad, Mashhad, Iran.
AUTHOR