Selecting the Best Fit Model in Cognitive Diagnostic Assessment: Differential Item Functioning Detection in the Reading Comprehension of the PhD Nationwide Admission Test
Subject Areas: All areas of language and translation
Niloufar Shahmirzadi 1, Masood Siyyari 2, Hamid Marashi 3, Masoud Geramipour 4
1 - Department of Foreign Languages, Tehran Central Branch, Islamic Azad University, Tehran, Iran
2 - Assistant Professor of Applied Linguistics, Department of Foreign Languages, Tehran Science and Research Branch, Islamic Azad University, Tehran, Iran
3 - Associate Professor of Applied Linguistics, Department of Foreign Languages, Central Tehran Branch, Islamic Azad University, Tehran, Iran
4 - Assistant Professor of Assessment, Kharazmi University, Tehran, Iran
Keywords: reading comprehension, cognitive diagnostic assessment, differential item functioning
Abstract:
This study sought to provide detailed information about the strengths and weaknesses of test takers' real ability through cognitive diagnostic assessment (CDA), and to detect differential item functioning (DIF) in each test item. The rationale for using CDA was that it estimates an item's discrimination power, whereas classical test theory and item response theory depict between-item rather than within-item multidimensionality. To this end, the latent attributes were specified in a Q-matrix, and 4,200 participants seeking to pursue PhD studies at state universities were randomly selected. The test consisted of two different reading passages with 10 multiple-choice items, each with four options. The data were analyzed in RStudio, applying the G-DINA and DINA models. Item and model fit indices were estimated, and the Wald test was run. The results revealed that some items were flagged for DIF. The study further concluded that CDA can provide pedagogically useful diagnostic information for test designers, teachers, syllabus and materials developers, and policymakers, because a proficiency test used in a high-stakes context needs to be valid, reliable, and fair so that it improves the knowledge of test takers.
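As a rough illustration of how a Q-matrix and the DINA model fit together, the sketch below computes the probability of a correct response for one examinee. It is not the study's analysis, which was run in R: the function name `dina_prob`, the three-item, two-attribute Q-matrix, and the guessing and slip values are all hypothetical and chosen only for demonstration. Under DINA, an examinee who has mastered every attribute an item requires answers correctly with probability 1 minus the slip parameter; otherwise the probability equals the guessing parameter.

```python
import numpy as np

def dina_prob(alpha, q_matrix, guess, slip):
    """Probability of a correct response to each item under the DINA model.

    alpha    : (K,) binary attribute-mastery profile of one examinee
    q_matrix : (J, K) binary matrix; entry 1 means item j requires attribute k
    guess    : (J,) guessing parameters
    slip     : (J,) slip parameters
    """
    alpha = np.asarray(alpha)
    q = np.asarray(q_matrix)
    # eta_j is True only when the examinee has mastered every required attribute
    eta = np.all(alpha >= q, axis=1)
    return np.where(eta, 1.0 - np.asarray(slip), np.asarray(guess))

# Hypothetical values, not taken from the study
Q = np.array([[1, 0],   # item 1 requires attribute 1
              [0, 1],   # item 2 requires attribute 2
              [1, 1]])  # item 3 requires both attributes
guess = np.array([0.20, 0.25, 0.10])
slip = np.array([0.10, 0.15, 0.20])

# Examinee who has mastered attribute 1 only
probs = dina_prob([1, 0], Q, guess, slip)
print(probs)
```

In this framing, an item would show DIF if its guessing or slip parameter differed between groups of examinees who share the same attribute profile, which is the kind of difference a Wald test on item parameters is meant to detect.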
Journal of language and translation, Volume 10, Number 3, 2020