Detecting Academic Field-based Differential Item Functioning and Differential Distractor Functioning in the Islamic Azad University EPT Employing the Rasch Model
Subject Areas: Research in English Language Pedagogy
Sarallah Jafaripour 1, Omid Tabatabaei 2, Hadi Salehi 3, Hossein Vahid Dastjerdi 4
1 - English Department, Najafabad Branch, Islamic Azad University, Najafabad, Iran
2 - English Department, Najafabad Branch, Islamic Azad University, Najafabad, Iran
3 - English Department, Najafabad Branch, Islamic Azad University, Najafabad, Iran
4 - English Department, Najafabad Branch, Islamic Azad University, Najafabad, Iran
Keywords: Differential Distractor Functioning (DDF), Differential Item Functioning (DIF), English Proficiency Test (EPT), Rasch Model, Test Bias
Abstract:
This study explored Differential Item Functioning (DIF) and Differential Distractor Functioning (DDF) across academic fields in the Islamic Azad University English Proficiency Test (IAUEPT). Using the Rasch model, DIF and DDF were analyzed among PhD candidates from different disciplines. The 1069 participants were divided into Social and Human Disciplines (SHD) and Non-Social and Human Disciplines (N-SHD) groups. The findings indicated minimal academic field-related DIF: only two of the 100 items displayed such effects. The DDF analysis likewise flagged few items per distractor: one for Choice A, three for Choice B, four for Choice C, and three for Choice D. These results have important implications for developers of high-stakes proficiency tests, since identifying potential biases in such tests helps promote fairness and equity for all test takers. In addition, academic field-based DIF can point to gaps in the curriculum, revealing areas where learners from different fields of study may be disadvantaged.
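For readers less familiar with the method, the following is a minimal sketch (not part of the original abstract) of the dichotomous Rasch model and the group-comparison logic that underlies this kind of DIF analysis. The 0.5-logit flagging threshold shown is a widely used convention in Rasch DIF work, not a value reported by this study.

% Dichotomous Rasch model: the probability that person n answers item i
% correctly, given person ability \theta_n and item difficulty b_i (in logits).
\[
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{e^{\theta_n - b_i}}{1 + e^{\theta_n - b_i}}
\]

% Uniform DIF for item i is the contrast between the item difficulties
% estimated separately within the two academic-field groups; the flagging
% rule below is a common convention, assumed here for illustration.
\[
\mathrm{DIF}_i = b_i^{\mathrm{SHD}} - b_i^{\mathrm{N\text{-}SHD}}, \qquad
\text{flag item } i \text{ if } |\mathrm{DIF}_i| \geq 0.5 \text{ logits and } p < .05
\]

DDF extends the same between-group comparison from the keyed answer to each incorrect option: for every distractor, the probability of selecting that option, conditional on ability, is compared across the SHD and N-SHD groups.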