Test Method Facet and the Construct Validity of Listening Comprehension Tests
Subject areas: language teaching
Roya Khoii 1, Sara Paydarnia 2
1 - North Tehran Branch, Islamic Azad University, Tehran, Iran
2 - North Tehran Branch, Islamic Azad University, Tehran, Iran
Keywords: factor analysis, construct validity, gap filling on summary (listening summary cloze), multiple-choice items, fill-in-the-blank task
Abstract:
The assessment of listening abilities is one of the least understood and least developed, yet most important, areas of language testing and assessment. It is particularly important because of its potential washback effects on classroom practices. Because listening tests play a major role in assessing students' language proficiency, they are expected to show a high level of construct validity. The present study investigated the construct validity of three test formats used to evaluate the listening comprehension of EFL learners: multiple-choice, gap filling on summary (also called listening summary cloze), and fill-in-the-blank. Three passages with relatively similar readability levels were used to construct nine listening tests, with each passage appearing in all three formats. Following a counterbalanced design, the tests were administered to 91 homogeneous EFL learners divided into three groups. The statistical analysis of the results revealed that the multiple-choice test had the highest level of construct validity. Moreover, a one-way repeated-measures ANOVA showed that the fill-in-the-blank task was the most difficult for the participants and the multiple-choice test the easiest.
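The counterbalanced design described in the abstract can be illustrated with a small sketch. The abstract does not say how passages and formats were rotated across the three groups, so the Latin-square rotation below is an assumption, and the group, passage, and format labels are hypothetical placeholders rather than the study's own materials.

```python
# Hypothetical labels; the abstract names the three formats but not the
# passages or groups, and it does not state the actual rotation scheme.
PASSAGES = ["passage_1", "passage_2", "passage_3"]
FORMATS = ["multiple_choice", "summary_cloze", "fill_in_the_blank"]
GROUPS = ["group_A", "group_B", "group_C"]


def latin_square_plan(groups, passages, formats):
    """Assign one format per passage to each group using a cyclic rotation.

    Each group hears every passage once and meets every format once,
    while across groups every passage is paired with every format.
    """
    n = len(formats)
    plan = {}
    for g, group in enumerate(groups):
        plan[group] = {
            passage: formats[(g + p) % n] for p, passage in enumerate(passages)
        }
    return plan


plan = latin_square_plan(GROUPS, PASSAGES, FORMATS)

# The set of distinct (passage, format) pairs is the set of tests to build.
tests = {(p, f) for assignment in plan.values() for p, f in assignment.items()}

for group, assignment in plan.items():
    print(group, assignment)
print("distinct tests:", len(tests))
```

Under this assumed rotation, three passages and three formats yield nine distinct tests, which matches the count reported in the abstract, and no group meets the same passage or the same format twice.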