A GENDER-BASED AND ATTITUDE-BASED STUDY ON WASHBACK EFFECT OF TRADITIONAL ASSESSMENT AND TASK-BASED ASSESSMENT ON DEVELOPING READING SKILL
Mojtaba Aghajani (1), Khatam ol-Anbia University, Tehran, Iran
Hamidreza Khalaji (2), English Language Department, Malayer Branch, Islamic Azad University, Malayer, Iran
Abbas Bayat (3), English Language Department, Malayer Branch, Islamic Azad University, Malayer, Iran
Keywords: Attitude, Gender, Reading skill, Task-Based Language Assessment, Washback effect
Abstract
This mixed-methods study examines the washback effect of task-based language assessment (TBLA) on the development of reading skill, as well as students' attitudes towards TBLA. The 108 EFL learners who participated were randomly chosen and allocated to two classes of 54. Both groups took a reading class that met for 90 minutes per week over 3 months. The homogeneity of the two groups was established by administering the PET. Both groups received similar instruction over 24 sessions, but only participants in the experimental class took the researcher-designed task-based reading quizzes, one after every four sessions. Both groups were given a reading post-test after the treatment. Results showed that TBLA has a positive washback effect on reading improvement among EFL students. Furthermore, learners' attitudes towards TBLA were measured with an “Attitude Questionnaire on Reading”, and a paired-samples t-test was run on the questionnaire data. The questionnaire results indicate that learners generally favor the use of TBLA in learning and teaching reading skill. The findings suggest that traditional assessment procedures should be replaced by TBLA, since educational planning aims to maximize educational achievement and improvement.
INTRODUCTION
Koné (2015) emphasizes the crucial role that proper assessment can play in the language learning and teaching process in the classroom. This process is inhibited if assessment does not meet the relevant needs and requirements or is not administered properly. Ever since tests became a tool for measuring language knowledge, researchers and test designers have sought ways to improve test quality so that learners' knowledge and competence are measured optimally and assessment becomes more helpful to language learners.
This role of tests in language assessment often forces teachers to teach students as if they were being taught how to take a test. It is widely accepted by teachers, instructors and students that teachers mostly teach to the exam and encourage students to study merely so that they pass it.
Buck (1988) describes washback as the effect of a test on a class, on teaching and on students. He adds another dimension to his definition when he states that this effect of tests can be either helpful or harmful. In general, test results are used to make decisions about examinees, their strengths and weaknesses, and their likely future achievement. The inferences drawn from test results reflect the tests' standards and values and are likely to influence the learning and teaching environment as well as the wider educational structure in which learners are studying (Bachman & Palmer, 1996). In applied linguistics, this influence is called "washback".
The following definitions of "washback" reflect the meanings mainly adopted in this study. Messick (1996) defines washback as the influence of testing on both teachers and learners, noting that teachers and learners may do things in class that they would not otherwise do. Bailey (1996) describes it as the general effect of an assessment tool on both teaching and learning.
Tests are part of educational courses, where decisions are made on the basis of the feedback obtained from a test. Tests are assumed to affect educational courses directly in many ways. Some hypotheses hold that teachers may be influenced by their awareness of the specific test their learners will take. In this case, educators and teachers often adjust their teaching and materials so as, in some way, to teach to the test. Through washback, test takers and teachers gain more information about learners, about learning and about their instruction. Washback is characterized in different ways by practitioners and researchers, depending on their fields of study. Some scholars and language practitioners focus solely on learners and the learning process, whereas a few have examined its outcomes and the consequences it may have for teachers and teaching, universities, programs of study, educational organizations, teaching materials, and even the educational environment and the society in which teaching takes place.
Brown (2000) states that the impact of a test on learning and teaching is called washback. Buck (1988) highlights the influence a test may have in teaching environments in which both instructors and learners are involved. This is one of the most prevalent descriptions of washback, framed in terms of the probable effect of a test on learning and teaching (Alderson & Wall, 1993; Cheng & Curtis, 2004). In the same way, Shohamy (1992) views learners as test takers and explains that language learning is influenced by the presence of language tests in the learning environment. On the basis of a study conducted in Hong Kong, Cheng (1997) describes washback as an active way of bringing about intended curriculum change by modifying public examinations.
In their survey, Shohamy et al. (1996) take washback to be the possible relationship between learning and testing. Similarly, Bailey (1996) takes a broad view, defining washback in terms of how a test may affect educational practices. Likewise, Bachman and Palmer (1996) describe an exam's influence on students, the instructional process and the general public as a subsection of washback. They distinguish micro and macro levels of the effect of a testing system: the former is the effect of an exam on students and teachers, and the latter is the effect of a test on the general public and the instructional process involved in it.
Many studies have investigated the effect of TBLT on learners' reading ability (Foster & Skehan, 1996; Skehan, 1998), but few have examined the effect of TBLA on the reading skills of EFL learners. While some researchers argue that traditional methods comprise prearranged phases that give teachers a clear plan of what to do, others emphasize task-based approaches to communicative instruction, which leave teachers and learners freer to find their own procedures in order to maximize communicative effectiveness (Long & Crookes, 1992; Nunan, 1989; Prabhu, 1987; Skehan, 1996).
Cheng et al. (2004) believe that "every model of the teaching-learning process requires that teachers base their decisions–instructional, grading, and reporting–on some knowledge of the students' attainment of and progress towards desired learning outcomes" (p. 361). Ellis (2003) clarifies that "Language classrooms strive to involve and support learners in the learning process. Instructional tasks are essential components of the language learning environment, and 'hold a central place' in the learning process" (p. 1).
Task-based testing systems engage language learners in goal-oriented performance and require them to use language in real-world settings to achieve their purposes. Teachers thus take the real world as the standard against which to assess students' performance on the proposed tasks. Areas of the EFL curriculum are defied by TBLA. As Norris (2009) claimed, task-based language assessment is closely connected to the establishment of task-based language teaching: it gives learners the material they need, so that it supports language learning and raises students' ability to use the language they have learnt.
Still, few studies are available on the use of TBA to test a single skill, such as reading comprehension. The present study examines the impact of task-based assessment on learners' improvement in reading comprehension and explores their attitudes towards it.
Alderson (1988) regarded washback as an evolving and distinct area within language assessment and testing. Along with Alderson, many other linguists and practitioners have addressed the term in their studies. Popham (1987) firmly states that, given the role of washback, tests may motivate and support the teaching and learning process, which he calls measurement-driven instruction. Whether the impact of washback is positive or negative has long been an open question for researchers and test designers, and this concern has prompted many investigations in the area (Alderson & Wall, 1993; Hughes, 2003). Hughes (2003) described washback as the influence of a test on both learning and teaching. Shohamy (1992) took the view that washback is the use of foreign language tests to affect and promote foreign language learning in an educational environment. She emphasized that “the influence and promotion of learning was due to the robust power of a test itself and its great influence on the life of a learner taking that test” (p. 513).
Furthermore, Messick (1996) defines washback as the extent to which testing leads language instructors and students to do things that they would not normally do. He refines the definition by noting that evidence of teaching and learning effects should be interpreted as washback only when that evidence can be linked to the administration and use of the test (p. 241).
The idea that a task-based assessment framework can be used to make predictions about learners' future language use outside the test itself underpins the claims of TBLPA supporters.
According to Mislevy et al. (2002), in task-based language assessment, language use is observed in settings that are realistic and complex. Therefore, the use of authentic material and real-life language is an important feature of task-based assessment.
RQ1. Is there a significant difference between TBLA and traditional assessment with regard to their washback effects on learners' reading skill?
RQ2. What are EFL learners' attitudes toward Task-Based Language Assessment (TBLA) and traditional assessment with regard to their reading progress?
RQ3. Is the washback effect of TBA gender-sensitive?
METHOD
Participants
The participants in this study were 108 EFL learners studying at Islamic Azad University, Tehran East Branch. Most participants were between 19 and 27 years old, and they were randomly assigned to either the control or the experimental group. After a proficiency test (Cambridge Preliminary English Test) was administered, the scores showed that all participants were at the intermediate level in reading skill. Table 1 shows the results of the proficiency test (PET).
Table 1
Statistics of the 108 learners' reading scores
Cambridge Preliminary English Test (PET) | |
N | 108 |
M | 21.7965 |
SD | 2.68749 |
Minimum | 19.0 |
Maximum | 27.0 |
The entire population selected for this experimental study consisted of students at Azad University, Tehran East Branch, Iran. This university was chosen because the researcher teaches there and had access to the sample during an academic term.
The experimental group had the same number of students (54) as the control group, for a total of 108 students: 54 experimental and 54 control. They were all aged between 19 and 27. The gender distribution of the two groups is shown in the table below.
Table 2
Gender Distribution of Participants
Group | Gender | N |
Exp. group | Male | 24 |
Exp. group | Female | 30 |
Cont. group | Male | 26 |
Cont. group | Female | 28 |
Total | | 108 |
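The random assignment of 108 learners to two groups of 54 can be illustrated with a minimal sketch. The function name `allocate`, the numeric participant IDs and the fixed seed are assumptions made for illustration only; the study does not report the exact randomization procedure that was used.

```python
import random

def allocate(participant_ids, seed=0):
    """Shuffle the IDs and split them into two equal-sized groups.

    Hypothetical helper: the seed only makes the illustration repeatable.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

# 108 hypothetical participant IDs, numbered 1..108
experimental, control = allocate(range(1, 109))
print(len(experimental), len(control))
```

Because each learner appears in exactly one of the two lists, the groups are disjoint and together cover the whole sample.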
Materials
The reading section of the Cambridge Preliminary English Test (PET) was administered to participants as a homogeneity measure. The PET is an examination for learners of English as a foreign language designed by Cambridge ESOL.
The PET measures candidates' general language proficiency up to level B1 of the CEFR, which corresponds to an intermediate level of proficiency. Given the variables in the present research, and with reading skill as one of them, the reading part of the PET was extracted and used as a complete test. The PET suits the purpose of the research, since all the participants were at the intermediate level.
One of the research instruments was a questionnaire named “Attitudes to TBLA in Reading Questionnaire”. Its final version was developed on the basis of two sources and a pilot study. The questionnaire consists of 38 items divided into two related sections, each using a 4-point Likert scale. Twelve of the 38 items were adapted from the “Motivations for Reading Questionnaire” designed by Wigfield and Guthrie (1997), and the remaining 26 items were adapted from “The Attitude Test Battery” as modified by Gardner (2004).
The 4-point Likert scale ranged from "Completely Disagree" (CD) to "Completely Agree" (CA), with values of 1 to 4 for the respective options. The modified questionnaire was given to participants in the experimental class twice: once before and once at the end of the educational program. The educational program, along with the reading quizzes and its measurement plan, was validated by the same board.
The questionnaire initially comprised 59 items. To check its reliability, its wording and the internal consistency of its items, it was piloted with similar students at another university (Islamic Azad University, Tonekabon Branch). Items that did not perform well were removed, leaving 38 items in the final form of the questionnaire. To confirm the reliability of the Attitude Questionnaire, Cronbach's alpha was calculated as .911. To check the internal consistency of the items, Cronbach's alpha was also calculated for each individual category and was around .6, which was taken as acceptable.
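For readers unfamiliar with the coefficient, the following is a minimal sketch of how Cronbach's alpha is computed from item-level responses, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix `demo` are hypothetical; they are not the study's data, and the study itself reports using SPSS.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one row per respondent, each row a list of item scores."""
    k = len(responses[0])                     # number of items
    items = list(zip(*responses))             # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical 4-point Likert answers (1 = Completely Disagree ... 4 = Completely Agree)
demo = [
    [4, 3, 4, 3],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
]
alpha = cronbach_alpha(demo)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together, which is why a coefficient such as .911 is read as high internal consistency.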
The reading pre-test contained 30 multiple-choice questions. The third instrument consisted of six task-based reading comprehension quizzes for the experimental group and six traditional multiple-choice reading comprehension quizzes for the control group. A quiz was given to each group after every four sessions. The quizzes for the control group consisted of 30 multiple-choice questions, and those for the experimental group consisted of six reading tasks to be completed.
At the end of the treatment, the same 30-item multiple-choice reading pre-test was administered to both classes as a post-test. The reliability of both the pre-test and the post-test was 0.73. The reliabilities of the experimental quizzes were 0.69 for experimental quiz 1, .73 for experimental quiz 2, .78 for experimental quiz 3, .75 for experimental quiz 4, .71 for experimental quiz 4, .76 for experimental quiz 5 and 0.77 for experimental quiz 6. The reliabilities of the control quizzes were 0.70 (control quiz 1), 0.72 (control quiz 2), 0.78 (control quiz 3), 0.68 (control quiz 4), 0.59 (control quiz 5) and 0.79 (control quiz 6). Furthermore, a reading booklet was assembled for the presentation of the blocking words to the two groups.
The researchers used a test-retest method to confirm the reliability of all instruments, with a two-week interval between the test and the retest. The Cronbach reliability coefficient of the Attitude Questionnaire as a whole was alpha = 0.8831, and that of the reading skill exam was 0.8329, which is considered statistically acceptable.
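Test-retest reliability is commonly summarized as the Pearson correlation between scores from the two administrations. The sketch below assumes that approach; the function name `pearson_r` and the two score lists are hypothetical and do not reproduce the study's data or its reported coefficients.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores of the same five learners, two weeks apart
first_administration = [20, 22, 25, 19, 27]
second_administration = [21, 23, 24, 20, 26]
r = pearson_r(first_administration, second_administration)
print(f"test-retest r = {r:.3f}")
```

A coefficient near 1 means learners keep roughly the same rank order across the two administrations.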
Procedure
The experiment was conducted at the beginning of an academic year to measure the effect of TBLA on learners' reading skill improvement in an EFL environment. First, the PET was used to establish the homogeneity of the participating learners. There were two groups: an experimental group (24 male and 30 female learners) and a control group (26 male and 28 female students). Both classes were randomly chosen. The same teaching method was used to teach the reading material to both classes, and the pre-test/post-test procedure described above was administered to both. During the treatment, which only the experimental class received, six different reading quizzes were given. These six quizzes were the independent variable in this research, whereas participants' reading skill and students' attitudes toward TBLA (the quizzes) were the dependent variables.
To answer the second research question, namely how much, if at all, the administered quizzes affected the attitudes of learners in the experimental group, the researchers employed another instrument. This instrument, the Attitude Questionnaire, consists of 38 items modified by the researcher and yielded important and somewhat surprising data about students' attitudes toward taking quizzes throughout the course. The Attitude Questionnaire was handed out after the post-test.
To simplify analysis of the results, all thirty-eight items in the learners' Attitude Questionnaire were closed questions on a four-point Likert scale. To encourage truthful and authentic answers, the researcher administered the Attitude Questionnaire anonymously, although the learners learnt this only after they had completed it. Students' responses had no effect on their final scores, a fact that was also disclosed only at the end. All participants gave their consent to answer the questionnaire. The researchers converted the familiar 5-point Likert scale (strongly agree to strongly disagree) into a 4-point scale (Strongly Agree, Agree, Disagree, Strongly Disagree).
Both descriptive and inferential statistics were calculated to analyze the collected scores. The questionnaire was administered twice to the experimental group, once before the treatment and once at its end. To determine participants' attitudes toward TBLA in reading comprehension, a paired-samples t-test was conducted.
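As an illustration of the paired-samples t-test mentioned here, the sketch below computes the t statistic from before/after scores of the same respondents: the mean of the pairwise differences divided by its standard error, with n - 1 degrees of freedom. The function name `paired_t` and the scores are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for a paired-samples t-test on matched scores."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical attitude scores of five learners before and after the treatment
before = [2, 3, 2, 3, 2]
after = [3, 4, 4, 4, 3]
t_value, df = paired_t(before, after)
print(f"t({df}) = {t_value:.2f}")
```

The test works on the differences within each pair, which is why it suits a questionnaire answered twice by the same group.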
Both classes were homogenized on the basis of the proficiency test results; afterwards, one group was randomly assigned as the experimental group and the other as the control group. An identical teaching method was used to teach the selected material, and a pre-test and a post-test were administered to both groups. As mentioned in the Materials section, the pre-test and post-test were based on the traditional method of assessing learners' reading comprehension. Class sessions were held twice a week. After every four sessions, the researcher gave a task-based quiz to assess the experimental group's progress in reading comprehension. Thus, each participant in the experimental group took 6 quizzes over the course of 24 sessions, in order to examine the possible differential influence of task-based assessment on learners' reading progress, in addition to the pre-test, the post-test and the Cambridge Preliminary English Test (PET) as a test of homogeneity. All six teacher-made quizzes were approved by two professors specialized in the field (the thesis advisor and reader), and SPSS was used to calculate the reliability of these quizzes. The material taught and the way the course was presented were the same in the control group; however, the control group took only three tests in the study: the PET as a test of proficiency, a pre-test and a post-test. After the last phase of the treatment, to address the second research question, the Attitude Questionnaire was distributed to both male and female students in the experimental class. The post-test was administered to both the experimental and the control class after the treatment.
Up to this point, data collection and analysis concern the first hypothesis, that is, the difference between Task-Based Language Assessment and traditional assessment in their washback effects on students' reading improvement. To compare the experimental and control groups' scores on the pre-test and the post-test, an independent-samples t-test was conducted, and to compare the mean quiz scores in the experimental class, a paired-samples t-test was conducted. Then, to answer the second research question, another paired-samples t-test was run to examine participants' attitudes toward TBLA in reading comprehension improvement (experimental group). After the learners' views on task-based assessment were recorded through interviews, content analysis was carried out. Finally, frequency analysis of the learners' questionnaire was performed to show differences in the experimental class learners' views on the tasks.
Design
This study used a pre-test/post-test non-equivalent-group design, which is a quasi-experimental design, with both control and experimental groups. All learners who participated in the study were selected randomly and then allocated to two intact classes, one experimental and one control, drawn from Azad University, Tehran East Branch, Iran. The experimental group received the treatment, task-based language assessment (TBLA) designed by the researchers, for 6 weeks. The control group was taught reading by the researcher using the usual method. Both groups took the reading pre-test and post-test, and only the experimental group completed the questionnaire, twice, on their attitudes towards the effectiveness of TBLA.
RESULTS
Descriptive and inferential statistics were applied to the collected data. In the first part of the analysis, students' performance before and after the treatment was analyzed using the pre-test and post-test. In the second part, participants' attitudes toward the probable effects of TBLA on their reading improvement were measured by the English Reading Attitudes Questionnaire. The first research question concerns the possible difference between traditional assessment and TBLA in their washback effect on students' reading skill. To investigate this difference, several statistical tests were run on the data. First, an independent-samples t-test was conducted on the pre-test results to compare the control and experimental groups. The descriptive statistics are shown in Table 3 and the inferential statistics in Table 4:
Table 3
Group Statistics of Control and Experimental Groups' Pre-Test Scores
Group | N | Mean | Std. Deviation | Std. Error Mean |
Sum pretest, Exp. group | 54 | 21.8609 | 5.61 | .55388 |
Sum pretest, Cont. group | 54 | 22.3586 | 5.56 | .44465 |
According to the table above, the mean pre-test score of the experimental class (mean = 23.8609, SD = 5.61) is not much greater than that of the control group (mean = 22.35, SD = 5.56). To compare the outcomes statistically, the researcher ran an independent-samples t-test on the scores, as presented in Table 4.
Table 4
Reading pre-test Scores
Variable | F (Levene's test) | Sig. (Levene's test) | t | df | Sig. (2-tailed) | Mean difference | Std. error difference | 95% CI of the difference, lower | 95% CI of the difference, upper |
Reading pretest, equal variances assumed | .78 | .41 | 1.28 | 108 | 1.894 | 1.83 | 1.5491 | -5.45826 | 5.49159 |
As presented in Table 4, the 2-tailed sig. is “1.894”, which is higher than the assumed p value of “0.05”; consequently, there is no statistically significant difference between the two groups before treatment.
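The logic of the independent-samples t-test used for these comparisons can be sketched as follows, assuming equal variances (the row SPSS labels "Equal variances assumed"). The function names, the Simpson's-rule p-value approximation and the two score lists are illustrative assumptions; the scores are hypothetical, so the output is not expected to match the SPSS tables.

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Student's t for two independent samples with pooled variance."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    area = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * area * h / 3)

# Hypothetical reading scores for two small groups
exp_scores = [27, 29, 26, 30, 28, 25]
ctrl_scores = [23, 24, 22, 26, 21, 25]
t_stat, dof = independent_t(exp_scores, ctrl_scores)
p_value = two_tailed_p(t_stat, dof)
print(f"t({dof}) = {t_stat:.2f}, p = {p_value:.4f}")
```

The difference between the groups is treated as statistically significant when the two-tailed p-value falls below the chosen level, here .05.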
Table 5
Group Statistics of Control and Experimental Groups' Post-Test Scores
Group | N | Mean | Std. Deviation | Std. Error Mean |
Sum post-test, Exp. group | 54 | 27.6541 | 3.163 | .8537 |
Sum post-test, Cont. group | 54 | 23.5360 | 3.654 | .6452 |
As Table 5 shows, the mean scores of the experimental group (27.65) and the control group (23.53) differed. In other words, participants in the experimental group, who received the treatment, obtained better reading scores than the control group.
Table 6
Post-Test Results of Reading
Variable | F (Levene's test) | Sig. (Levene's test) | t | df | Sig. (2-tailed) | Mean difference | Std. error difference | 95% CI of the difference, lower | 95% CI of the difference, upper |
Reading posttest, equal variances assumed | .43 | .49 | 3.03 | 108 | 0.046 | 3.23 | 0.19 | 1.6235 | 5.9456 |
As the above table shows, there is a statistically significant difference between the two groups, since the 2-tailed sig. (“0.046”) is less than the pre-determined p value of 0.05.
From the above tables, it can be concluded that at the beginning of the project there was no significant difference between the control and experimental groups. On the other hand, the post-test descriptive analysis and the independent-samples t-test on group mean scores show that the reading skill of participants in the experimental group improved significantly when task-based assessment was used to measure them during the treatment.
To find out whether TBLA had any influence on language learners' reading improvement, five independent-samples t-tests were run on the groups' performance on five quizzes, from quiz 2 to quiz 6, to identify potential differences. Scores from the task-based quizzes of the experimental group were compared with those from the traditional quizzes of the control group. The task-based quizzes were equal in difficulty to the traditional ones. The results of the first quiz were not compared, since any washback effect of that quiz could affect reading progress only afterwards. The following tables present the descriptive and inferential analyses carried out to detect a possible influence of TBLA on learners' reading progress.
Table 7
Descriptive Statistics of Scores of Control and Experimental Group in Quiz 2
Group | N | Mean | Std. Deviation | Std. Error Mean |
Quiz2Exp.Cont, Exp. group | 54 | 17.145 | 3.316 | .548 |
Quiz2Exp.Cont, Cont. group | 54 | 14.635 | 2.504 | .671 |
As shown in Table 7, the mean score of the experimental group on the second quiz (mean = 17.14, SD = 3.31) appears to be higher than that of the control group (mean = 14.63, SD = 2.50). To determine whether this difference is statistically significant, an independent-samples t-test was run, and it revealed a clear difference. The results of the t-test are shown in Table 8:
Table 8
Independent Samples T-Test for Quiz 2
Variable | F (Levene's test) | Sig. (Levene's test) | t | df | Sig. (2-tailed) | Mean difference | Std. error difference | 95% CI of the difference, lower | 95% CI of the difference, upper |
Quiz2Exp.Cont, equal variances assumed | 9.87 | .01 | 5.14 | 108 | 0.036 | 5.23 | 0.91 | 3.6235 | 6.9456 |
With regard to Table 8, the p value equals .01, which is smaller than the critical significance level of .05; thus, the t-test shows a statistically significant difference between the control and experimental groups in their second-quiz performance. We can conclude that TBA has a positive washback effect on students' reading improvement. The results of the third, fourth, fifth and sixth quizzes were compared in the same way to further validate this finding.
Table 9
Descriptive Statistics of Scores of Control and Experimental Groups' Quiz 3
Group | N | Mean | Std. Deviation | Std. Error Mean |
Quiz3Exp.Cont, Exp. group | 54 | 18.541 | 2.316 | .721 |
Quiz3Exp.Cont, Cont. group | 54 | 16.274 | 2.504 | .489 |
Table 9 shows that the experimental group's mean score (mean = 18.541, SD = 2.316) appears higher than the control group's (M = 16.274, SD = 2.504), though not by as much as in quiz 2.
Table 10
Independent Samples T-Test for Quiz 3
Variable | F (Levene's test) | Sig. (Levene's test) | t | df | Sig. (2-tailed) | Mean difference | Std. error difference | 95% CI of the difference, lower | 95% CI of the difference, upper |
Quiz3Exp.Cont, equal variances assumed | 4.53 | .04 | 10.24 | 108 | 0.001 | 7.08 | 0.65 | 7.4698 | 9.0289 |
Table 10 indicates a clearly fewer significance level (p=.04) than critical p value (p=0.05) (*p.05, two-tailed.); thus, the results show that this difference among experimental and control groups in quiz 3 is statistically significant.
For the next three quizzes, just two groups` mean scores and p value of independent samples t-test will be stated to avoid verbosity. In quiz number 4, (p=.01, p<.05), and for experimental group(mean=18.25) while for control group, (mean=15.84). in quiz number 5, (p=.00, p<.05), and for experimental group (mean=19.01) while for control group, (mean=16.39). Finally, in quiz number 6, (p=.03, p<.05), and for experimental group(mean=18.78) while for control group (mean=16.11).
The comparison of the groups' performances in these three quizzes affirms a higher washback influence of task-based language assessment on students' reading comprehension improvement. The first research question is therefore answered, and the null hypothesis is rejected: TBA has a positive washback influence on the participants' reading comprehension improvement.
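The figures reported for quizzes 4 to 6 can be tabulated to make the size of each gap explicit. The following is a small sketch that only restates the reported means and p values; the names `QUIZ_RESULTS`, `ALPHA` and `summarize_quizzes` are hypothetical and introduced purely for illustration.

```python
# Reported values for quizzes 4-6: (experimental mean, control mean, p value).
QUIZ_RESULTS = {
    4: (18.25, 15.84, 0.01),
    5: (19.01, 16.39, 0.00),
    6: (18.78, 16.11, 0.03),
}
ALPHA = 0.05  # critical significance level used throughout the study


def summarize_quizzes(results, alpha=ALPHA):
    """Return (quiz, mean difference, significant?) for each quiz."""
    rows = []
    for quiz, (exp_mean, cont_mean, p_value) in sorted(results.items()):
        difference = round(exp_mean - cont_mean, 2)
        rows.append((quiz, difference, p_value < alpha))
    return rows


for quiz, difference, significant in summarize_quizzes(QUIZ_RESULTS):
    print(f"Quiz {quiz}: mean difference = {difference:.2f}, significant = {significant}")
```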
The second research question concerned learners' attitudes towards Task-Based Language Assessment as an aid to progress in their reading comprehension. To identify these attitudes, frequency analysis was used. A 38-item survey questionnaire was administered to the experimental group only (N = 54), once before and once after the treatment. Students responded to each item by indicating their level of agreement: 4 for Strongly Agree; 3 for Agree; 2 for Disagree; 1 for Strongly Disagree. Table 11 displays the descriptive statistics of the language learners' attitudes to TBLA. In addition, the change in students' agreement with TBLA was examined with a paired-sample t-test.
Table 11
Descriptive Statistics of Participants' Attitude to TBLA (After the Treatment)
| N | Min | Max | Mean | Std. Deviation |
Strongly Agree | 38 | .00 | 40.00 | .350 | 7.568 |
Agree | 38 | .00 | 52.00 | .456 | 5.895 |
Disagree | 38 | .00 | 13.00 | .114 | 4.586 |
Strongly Disagree | 38 | .00 | 9.00 | .078 | 8.657 |
Valid N (Listwise) | 38 | | 114.00 | | |
As can be seen in table 11, agree (0.52) and strongly agree (0.40) have higher mean scores than the other two options. In other words, according to the questionnaire, students in the experimental group generally agreed that TBLA had a positive impact on their reading comprehension improvement, which can be considered a positive attitude towards TBLA. To compare the outcomes statistically, a paired-sample t-test was run on the data, as shown in table 12.
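A frequency analysis of four-point Likert responses amounts to counting how often each option is chosen and converting the counts to proportions. The sketch below illustrates this with the coding described for the questionnaire (4 = Strongly Agree to 1 = Strongly Disagree). The response list is hypothetical and is not taken from the study's data, and `response_proportions` is an illustrative name.

```python
from collections import Counter

# Coding scheme described for the attitude questionnaire.
LABELS = {4: "Strongly Agree", 3: "Agree", 2: "Disagree", 1: "Strongly Disagree"}


def response_proportions(responses):
    """Return the share of each Likert option among the given responses."""
    invalid = [r for r in responses if r not in LABELS]
    if invalid:
        raise ValueError(f"unexpected response codes: {invalid}")
    counts = Counter(responses)
    total = len(responses)
    return {LABELS[code]: counts[code] / total for code in sorted(LABELS, reverse=True)}


# Hypothetical answers of one student to ten items (illustration only).
sample = [4, 3, 3, 4, 2, 3, 1, 3, 4, 3]
for label, share in response_proportions(sample).items():
    print(f"{label}: {share:.2f}")
```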
Table 12
Paired-Samples Test for Attitude Questionnaire
| Paired Differences: Mean | Paired Differences: Std. Deviation | Paired Differences: Std. Error Mean | 95% Confidence Interval of the Difference: Lower | 95% Confidence Interval of the Difference: Upper | t | df | Sig. (2-tailed) |
Atti.pre-post | -2.22727 | 3.98726 | .85009 | -3.99513 | -.45942 | -2.620 | 54 | .041 |
As table 12 indicates, the difference between the results of the 38-item English Reading Attitude Questionnaire before and after the treatment is significant (Sig. 2-tailed = .041, p < .05). As a result, it is inferred that students had a positive attitude to TBLA in their reading improvement. In short, the participants generally agreed that the task-based quizzes had helped them improve their reading skill, which can be seen as a positive contribution to the learners' attitude.
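The paired-sample t statistic is the mean of the pre-minus-post differences divided by its standard error. The sketch below shows the calculation on raw paired scores and then on the summary values printed in table 12. The `pre` and `post` lists are hypothetical, and the function names `paired_t` and `t_from_summary` are illustrative and not part of any library.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (mean difference, SE of the difference, t, df) for paired scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [a - b for a, b in zip(pre, post)]  # pre minus post, as in table 12
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)  # sample SD over sqrt(n)
    return mean_diff, se_diff, mean_diff / se_diff, n - 1


def t_from_summary(mean_diff, se_diff):
    """t statistic from a reported mean difference and its standard error."""
    return mean_diff / se_diff


# Hypothetical questionnaire totals for six students (illustration only).
pre = [100, 104, 98, 110, 95, 102]
post = [103, 105, 102, 111, 99, 104]
mean_diff, se_diff, t_stat, df = paired_t(pre, post)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.3f}, df = {df}")

# Mean difference and standard error as printed in table 12.
print(f"t from table 12 summary = {t_from_summary(-2.22727, 0.85009):.3f}")
```

A negative t here means that post-treatment scores are higher than pre-treatment scores, since the difference is taken as pre minus post.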
The third research question examined the probable difference between the washback effects of traditional assessment and TBA on reading improvement among female and male language learners. The pre-test and post-test means of females and males in the experimental and control groups were compared to confirm or reject this hypothesis. For succinctness, the inferential and descriptive statistics of each comparison are shown in a single table.
Table 13
Inferential and Descriptive Statistics of Females' and Males' Scores in Pre-Test
Gender | Number | Mean | SD | t | Sig. |
Female | 58 | 29.45 | 8.01 | -0.41 | 0.61 |
Male | 58 | 30.97 | 5.75 | | |
As the table above indicates, the p value is 0.61, which is greater than the significance level (p > .05); therefore, no significant difference exists between the females' and males' pre-test scores among the task-based language assessment participants. However, a comparison of the same participants' post-test scores showed that female participants performed better than male students.
Table 14
Descriptive and Inferential Statistics of Females' and Males' Scores in Post-Test
Gender | Number | Mean | SD | t | Sig. |
Female | 58 | 46.32 | 4.11 | 3.57 | 0.04 |
Male | 58 | 43.17 | 4.01 | | |
Table 14 shows a p value (p = 0.04) smaller than the significance level of 0.05 (p < .05); therefore, the results show a significant difference between the females' and males' test results, with the females scoring higher than the males. The result suggests that TBLA had a more positive washback impact on reading improvement in female participants.
DISCUSSIONS
Wong (2001) stated that teaching and learning, which have long been discussed extensively, have been strongly affected by teachers' and instructors' quizzes and tests throughout the history of both general and language education.
For years, the negative effect of language exams has been on the minds of teachers and scholars in language teaching and learning, and there is considerable concern about it. Chapman and Snyder (2000) explained that language instructors' tendency to teach to the test frequently becomes an obstacle to introducing new teaching methods.
Lately, however, test designers have focused on designing tests that represent learners' actual ability to use the learnt language in real situations. The basic belief behind this viewpoint is summed up in Biggs's (1995) idea that "modifying assessment system can be a fast and simple method to change learners' learning." Despite the supposition that tests have a somewhat detrimental effect on the language learning process, using assessment as a means to promote curriculum change has become increasingly common in education in general and in language education specifically (Alderson & Wall, 1993; Cheng, 1997).
In this study, the researchers mainly tried to examine the possible washback effect of traditional assessment and TBLA on the reading skill of English language learners in Iran. According to the findings, the TBLA approach has a strong positive influence on language learners' reading improvement. The study compared traditional and task-based assessment techniques with regard to reading skill. The findings strongly support McNamara's (2001) position, since the assessment methods used in the study had a positive influence on the reading improvement of EFL students. McNamara (2001) believes that TBA, with integrated components and language skills, has a higher positive washback effect on teaching and learning than discrete testing items, which usually restrain communicative teaching approaches.
CONCLUSION
This conclusion can be explained by the stronger washback influence of TBLA compared with the traditional assessment approach: TBLA is goal-oriented in nature, and its tasks are more authentic, leading to a more communicative approach. In line with the finding that TBLA had a positive washback influence on reading improvement, Norris (2009) attributes the reasons for using TBLA to factors such as the positive washback impact of assessment on both teaching and learning, the alignment of TBA with task-based techniques, and the current limitations of separate-skill assessment.
Concerning the second research question, a positive attitude towards task-based assessment was found with regard to learners' reading skill improvement. To answer this question, a 38-item survey questionnaire was given to the experimental group participants at the beginning and at the end of the teaching course, and the frequencies of the responses to each item were calculated. The results showed a positive attitude towards task-based assessment. To discover the likely washback effect on learners' attitudes concerning TBLA, the questionnaire scores were examined, and they showed a significant difference between learners' attitudes on the two occasions. The analysis of the attitude questionnaire indicates that participants favored task-based assessments over traditional assessment in their reading classes.
It is hoped that the use of task-based language assessment in other skills and subskills of a language, such as grammar, vocabulary teaching/learning, listening, writing and speaking, will be investigated (Bacha, 2001; Boroughani et al., 2023; Schoonen et al., 2011; Zakian et al., 2022). Replicating the study with a different sample from other regions of Iran would add value to the task-based language assessment approach. It is also very likely that replicating the current research in a pedagogical setting other than the university context would increase the reliability of the findings.
In view of these results, the researchers concluded that there is a significant difference between traditional assessment and TBLA with regard to washback impact on language learners' reading improvement: TBLA has a positive washback influence, whereas traditional assessment approaches do not have a comparable one. This positive washback influence suggests that the task-based assessment approach could be employed as an alternative to traditional assessment approaches in an educational environment. Furthermore, the results of the survey questionnaire confirm students' positive attitude towards TBLA in comparison with traditional assessment. The results suggest that most language learners preferred innovative teaching, or at least involvement in authentic classroom activities, to classes with tedious teaching procedures when learning new skills and materials. Finally, this research provided valuable insights into the deployment of task-based assessment to assess learners' reading improvement.
With regard to pedagogical implications, the findings suggest, firstly, that teachers and practitioners need to rethink their assessment tools so that learners' learning improves. Secondly, it seems advisable to replace traditional testing methods with TBLA or performance-based testing equivalents. Even where practicality considerations prevent this replacement, the testing process can produce a more accurate representation of learners' knowledge if traditional testing approaches are combined with alternative task-based methods.
Moreover, the positive washback impact of TBLA on language learners' reading improvement was found to be higher in female than in male learners. This positive washback influence emphasizes the usefulness of these testing approaches as an alternative to the common traditional approach in an educational testing system (Alderson & Wall, 1993; Bachman & Palmer, 1996).
References
Alderson, J. C. (1988). Innovation in language testing: can the microcomputer help? University of Lancaster.
Alderson, J. C., & Wall, D. (1993). Does Washback Exist? Applied Linguistics, 14(2), 115–129. https://doi.org/10.1093/applin/14.2.115
Bacha, N. (2001). Writing evaluation: What can analytic versus holistic essay scoring tell us? System, 29(3), 371–383. https://doi.org/10.1016/S0346-251X(01)00025-2
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and Developing Useful Language Tests. Oxford University Press.
Bailey, K. M. (1996). Working for washback: a review of the washback concept in language testing. Language Testing, 13(3), 257–279. https://doi.org/10.1177/026553229601300303
Biggs, J. (1995). Assessing for learning: Some dimensions underlying new approaches to educational assessment. Alberta Journal of Educational Research, 41(1), 1–17.
Boroughani, T., Xodabande, I., & Karimpour, S. (2023). Self-regulated learning with mobile devices for university students: exploring the impacts on academic vocabulary development. Discover Education, 2(1), 5. https://doi.org/10.1007/s44217-023-00028-z
Brown, J. D. (2000). University entrance examinations: Strategies for creating positive washback on English language teaching in Japan. Shiken:JALT Testing & Evaluation SIG Newsletter, 3(2), 2–7. https://hosted.jalt.org/test/bro_5.htm
Buck, G. (1988). Testing listening comprehension in Japanese university entrance examinations. JALT Journal, 10(1), 15–42. https://jalt-publications.org/sites/default/files/pdf-article/jj-10.1-art1.pdf
Chapman, D. W., & Snyder, C. W. (2000). Can high stakes national testing improve instruction: reexamining conventional wisdom. International Journal of Educational Development, 20(6), 457–474. https://doi.org/10.1016/S0738-0593(00)00020-1
Cheng, L. (1997). How Does Washback Influence Teaching? Implications for Hong Kong. Language and Education, 11(1), 38–54. https://doi.org/10.1080/09500789708666717
Cheng, L., & Curtis, A. (2004). Washback or Backwash: A Review of the Impact of Testing on Teaching and Learning. In Washback in language testing: Research contexts and methods. (pp. 3–17). Lawrence Erlbaum Associates Publishers. https://doi.org/10.4324/9781410609731-9
Cheng, L., Rogers, T., & Hu, H. (2004). ESL/EFL instructors’ classroom assessment practices: purposes, methods, and procedures. Language Testing, 21(3), 360–389. https://doi.org/10.1191/0265532204lt288oa
Ellis, R. (2003). Task-based language learning and teaching. Oxford University Press.
Foster, P., & Skehan, P. (1996). The Influence of Planning and Task Type on Second Language Performance. Studies in Second Language Acquisition, 18(3), 299–323. https://doi.org/10.1017/S0272263100015047
Gardner, R. C. (2004). Attitude/Motivation Test Battery: International AMTB Research Project. https://publish.uwo.ca/~gardner/docs/englishamtb.pdf
Hughes, A. (2003). Testing for Language Teachers. Cambridge University Press.
Koné, K. (2015). The Impact of Performance-Based Assessment on University ESL Learners’ Motivation [Minnesota State University - Mankato]. https://cornerstone.lib.mnsu.edu/etds/402/
Long, M. H., & Crookes, G. (1992). Three Approaches to Task-Based Syllabus Design. TESOL Quarterly, 26(1), 27–56. https://doi.org/10.2307/3587368
McNamara, T. (2001). Language assessment as social practice: challenges for research. Language Testing, 18(4), 333–349. https://doi.org/10.1177/026553220101800402
Messick, S. (1996). Validity and washback in language testing. Language Testing, 13(3), 241–256. https://doi.org/10.1177/026553229601300302
Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2002). Design and analysis in task-based language assessment. Language Testing, 19(4), 477–496. https://doi.org/10.1191/0265532202lt241oa
Norris, J. M. (2009). Task-Based Teaching and Testing. In M. H. Long & C. J. Doughty (Eds.), The Handbook of Language Teaching (pp. 578–594). https://doi.org/10.1002/9781444315783.ch30
Nunan, D. (1989). Designing tasks for the communicative classroom. Cambridge University Press.
Popham, W. J. (1987). The Merits of Measurement-Driven Instruction. Phi Delta Kappan, 68(9), 679–682. https://www.jstor.org/stable/20403467
Prabhu, N. S. (1987). Second language pedagogy. Oxford University Press.
Schoonen, R., van Gelderen, A., Stoel, R. D., Hulstijn, J., & de Glopper, K. (2011). Modeling the Development of L1 and EFL Writing Proficiency of Secondary School Students. Language Learning, 61(1), 31–79. https://doi.org/10.1111/j.1467-9922.2010.00590.x
Shohamy, E. (1992). Beyond Proficiency Testing: A Diagnostic Feedback Testing Model for Assessing Foreign Language Learning. The Modern Language Journal, 76(4), 513–521. https://doi.org/10.1111/j.1540-4781.1992.tb05402.x
Shohamy, E., Donitsa-Schmidt, S., & Ferman, I. (1996). Test impact revisited: washback effect over time. Language Testing, 13(3), 298–317. https://doi.org/10.1177/026553229601300305
Skehan, P. (1996). Second language acquisition research and task-based instruction. In J. Willis & D. Willis (Eds.), Challenge and Change in Language Teaching. Heinemann.
Skehan, P. (1998). A cognitive approach to Language Learning. Oxford University Press.
Wigfield, A., & Guthrie, J. T. (1997). Relations of children’s motivation for reading to the amount and breadth of their reading. Journal of Educational Psychology, 89(3), 420–432. https://doi.org/10.1037/0022-0663.89.3.420
Wong, Y. (2001). The impact of a new oral exam on students’ spoken language performance. University of Hong Kong.
Zakian, M., Xodabande, I., Valizadeh, M., & Yousefvand, M. (2022). Out-of-the-classroom learning of English vocabulary by EFL learners: investigating the effectiveness of mobile assisted learning with digital flashcards. Asian-Pacific Journal of Second and Foreign Language Education, 7(1), 1–16. https://doi.org/10.1186/s40862-022-00143-8