A Review of the Use of Artificial Intelligence and Automated Writing Evaluation Systems as Sources of Provision of Feedback in Assessing Writing
Subjects: Journal of Applied Linguistics Studies
Samira Gharekhani 1, Seyyed Hassan Seyyedrezaei 2
1 - Department of English Language Teaching, Aliabad Katoul Branch, Islamic Azad University, Aliabad Katoul, Iran
2 - Department of English Language Teaching, Aliabad Katoul Branch, Islamic Azad University, Aliabad Katoul, Iran
Keywords: Artificial Intelligence (AI), Automated Writing Evaluation Systems (AWES), Provision of Feedback, Technology, Writing Assessment
Abstract:
The use of artificial intelligence (AI) and automated writing evaluation systems (AWES) as sources of feedback in language teaching is a relatively new field, and its positive and negative findings have yet to be gathered into a single study. This systematic review therefore examines the literature on current notions, terminologies, methodologies, and designs, as well as the main findings, concerning the use of AI and AWES for providing feedback in writing assessment. Using keywording and mapping search strategies, the study explored common themes in the articles to address its research questions. Analysis of the general themes revealed that all the articles employed quantitative methods with a quasi-experimental design. Moreover, the articles used a variety of notions and terminologies to capture the primary findings in the field. Based on these findings, the study emphasizes the need for researchers to adopt more interactive designs to further investigate the potential of AI and AWES in providing feedback in writing assessment. It also calls for additional studies that review findings in related domains to offer more productive and practical outcomes for educational settings.