A Corpus-based Investigation of Lexical Bundles in Iranian Advanced Learners’ Discussion of English and Natives
Subject Areas: Applied Linguistics
Ashraf Vaziri 1, Hamed Barjesteh 2 *, Atefeh Nasrollahi Mouziraji 3
1 - PhD Student of TEFL, AshrafV294@Yahoo.com; Department of English Language and Literature, Ayatollah Amoli Branch, Islamic Azad University, Amol, Iran.
2 - Associate Professor, ha_bar77@yahoo.com; Department of English Language and Literature, Ayatollah Amoli Branch, Islamic Azad University, Amol, Iran.
3 - Assistant Professor in TEFL, the Department of English Language and Literature, Ayatollah Amoli Branch, Islamic Azad University, Amol, Iran
Keywords: Corpus, Learner Corpora, Lexical Bundles, Speaking Skill
Abstract :
This study examines the use of four-word lexical bundles in spoken group discussions by advanced Iranian EFL learners and native speakers. A corpus of 21 discussions in an academic context is analyzed to explore the frequency, structure, and function of the lexical bundles used by the two groups. Biber et al.'s (2004) taxonomies are employed to categorize the extracted bundles by their structural and functional characteristics. The Michigan Corpus of Academic Spoken English (MICASE) serves as the reference point for comparison with native speaker usage. Quantitative analysis (frequency counts and chi-square tests) alongside qualitative content analysis reveals that native speakers use lexical bundles more frequently overall. They also show a preference for discourse organizer bundles (functional category) and noun phrases (structural category). In the academic context, non-native speakers rely more heavily on stance expressions (functional) and verb phrase fragments (structural). These findings have pedagogical implications for EFL instructors, materials developers, and learners themselves, and may inform improved instructional strategies and learning outcomes.
Ädel, A., & Erman, B. (2012). Recurrent word combinations in academic writing by native and non-native speakers of English: A lexical bundles approach. English for specific purposes, 31(2), 81-92. https://doi.org/10.1016/j.esp.2011.08.004
Ahmadi, M., Esfandiari, R., & Zarei, A. A. (2020). A corpus-based study of noun phrase complexity in applied linguistics research article abstracts in two contexts of publication. Iranian Journal of English for Academic Purposes, 9(1), 76-94. https://doi.org/20.1001.1.24763187.2020.9.1.6.3
Anthony, L. (2019). AntConc (Version 3.5.8) [Computer Software]. Tokyo, Japan: Waseda University. https://www.laurenceanthony.net/software
Appel, R. F. (2011). Lexical bundles in university EAP exam writing samples: CAEL test essays [Doctoral dissertation]. Carleton University.
Baroni, M., & Evert, S. (2008). Corpus Frequency Test Wizard [Computer program]. http://sigil.collocations.de/wizard.html
Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for specific purposes, 26(3), 263-286. https://doi.org/10.1016/j.esp.2006.08.003
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and textbooks. Applied linguistics, 25(3), 371-405. https://doi.org/10.1093/applin/25.3.371
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (2000). Longman grammar of spoken and written English.
Bychkovska, T., & Lee, J. J. (2017). At the same time: Lexical bundles in L1 and L2 university student argumentative writing. Journal of English for Academic Purposes, 20, 38-52. https://doi.org/10.1016/j.jeap.2017.10.008
Chen, Y. H., & Baker, P. (2016). Investigating criterial discourse features across second language development: Lexical bundles in rated learner essays, CEFR B1, B2 and C1. Applied linguistics, 37(6), 849-880.
Chen, Y., & Baker, P. (2010). Lexical bundles in L1 and L2 academic writing. Language Learning and Technology, 14(2), 30-49.
Cheng, S. W. (2010). A Corpus-Based Approach to the Study of Speech Act of Thanking. Concentric: Studies in Linguistics, 36(2).
Chun-Guang, T. (2014). An empirical research on the corpus-driven lexical chunks instruction. International Journal of English Language Teaching, 2(2), 1-36.
Cortes, V. (2004). Lexical bundles in published and student disciplinary writing: Examples from history and biology. English for specific purposes, 23(4), 397-423. https://doi.org/10.1016/j.esp.2003.12.001
Cortes, V. (2006). Teaching lexical bundles in the disciplines: An example from a writing intensive history class. Linguistics and education, 17(4), 391-406. https://doi.org/10.1016/j.linged.2007.02.001
Coxhead, A. (2008). Phraseology and English for academic purposes. Phraseology in language learning and teaching, 149-161.
Cui, X., & Kim, Y. (2023). Structural and functional differences between bundles of different lengths: A corpus-driven study. Frontiers in Psychology, 13. https://doi.org/10.3389/fpsyg.2022.1061097
Esfandiari, R., & Barbary, F. (2017). A contrastive corpus-driven study of lexical bundles between English writers and Persian writers in psychology research articles. Journal of English for Academic Purposes, 29, 21-42. https://doi.org/10.1016/j.jeap.2017.09.002
Esfandiari, R., Ahmadi, M., & Schaefer, E. (2021). A Corpus-based Study on the Use and Syntactic Functions of Lexical Bundles in Applied Linguistics Research Articles in Two Contexts of Publications. Applied Research on English Language, 10(4), 139-166. https://doi.org/10.22108/ARE.2021.130833.1787
Estaji, M., & Montazeri, M. R. (2022). Native English and non-native authors’ utilisation of lexical bundles: A corpus-based study of scholarly public health papers. Southern African Linguistics and Applied Language Studies, 40(2), 177-199. https://doi.org/10.2989/16073614.2022.2043169
Ganji, M., & Nasrabady, P. (2021). An analysis of the multiword units presented in IELTS speaking preparation books published by Iranian authors. Applied Research on English Language, 10(3), 137-160. https://doi.org/10.22108/ARE.2021.126390.1666
Gómez Burgos, E. (2015). First year university students' use of formulaic sequences in oral and written descriptions. Profile Issues in Teachers Professional Development, 17(1), 25-33. https://doi.org/10.15446/profile.v17n1.43438
Grabowski, Ł. (2015). Keywords and lexical bundles within English pharmaceutical discourse: A corpus-driven description. English for Specific Purposes, 38, 23-33. https://doi.org/10.1016/j.esp.2014.10.004
Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and lexical phrases. Phraseology: Theory, analysis and applications, 145-160.
Granger, S., & Meunier, F. (2008). Phraseology in language learning and teaching. Where to from where. Phraseology: An interdisciplinary perspective. New York/Amsterdam: John Benjamins, 247-252. http://digital.casalini.it/9789027291318
Gries, S. T., & Ellis, N. C. (2015). Statistical measures for usage‐based linguistics. Language Learning, 65(S1), 228-255. https://doi.org/10.1111/lang.12119
Hejazi, H. (2021). A Corpus-Based Investigation of Lexical Bundles and Keyness in B1, B2 and C1 ESL Learners’ Academic Writing [Doctoral dissertation, University of Liverpool]. University of Liverpool. The University of Liverpool Repository. https://livrepository.liverpool.ac.uk/id/eprint/3154170
Henriksen, B. (2013). Research on L2 learners’ collocational competence and development–a progress report. C. Bardel, C. Lindqvist, & B. Laufer (Eds.) L, 2, 29-56.
Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for specific purposes, 27(1), 4-21. https://doi.org/10.1016/j.esp.2007.06.001
Hyland, K. (2012). Bundles in academic discourse. Annual Review of Applied Linguistics, 32, 150-169. https://doi.org/10.1017/S0267190512000037
Hyland, K., & Jiang, F. (2018). Academic lexical bundles: How are they changing? International Journal of Corpus Linguistics, 23(4), 383-407.
Jablonkai, R. (2009). In the light of: A corpus-based analysis of lexical bundles in two EU-related registers. Corvinus University of Budapest: WopaLP, 3, 1-26.
Jalali, H. (2013). Lexical bundles in applied linguistics: Variations across postgraduate genres. Journal of Foreign Language Teaching and Translation Studies, 2(2), 1-29. https://doi.org/10.22034/EFL.2013.79199
Jiangang, S. (2017). A corpus-based study of contrastive/concessive linking adverbials in spoken English of Chinese EFL learners. Studies in Literature and Language, 14(2), 17-25. https://doi.org/10.3968/9273
Juknevičienė, R. (2009). Lexical bundles in learner language: Lithuanian learners vs. native speakers. Kalbotyra, 61, 61-72.
Karabacak, E., & Qin, J. (2013). Comparison of lexical bundles used by Turkish, Chinese, and American university students. Procedia-Social and Behavioral Sciences, 70, 622-628. https://doi.org/10.1016/j.sbspro.2013.01.101
Kashiha, H., & Chan, S. H. (2015). A little bit about: Differences in native and non-native speakers’ use of formulaic language. Australian Journal of Linguistics, 35(4), 297-310. https://doi.org/10.1080/07268602.2015.1067132
Khazaee, H., Maftoon, P., Birjandi, P., & Rezaie Golandoz, G. (2020). Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners. International Journal of Foreign Language Teaching and Research, 8(33), 109-122. https://sanad.iau.ir/journal/jfl/Article/677137?jid=677137
Kuswoyo, H., Sujatna, E. T. S., Indrayani, L. M., & Rido, A. (2020). Cohesive conjunctions and so as discourse strategies in English native and non-native engineering lecturers: A corpus-based study. International Journal of Advanced Science and Technology, 29(7), 2322-2335. http://sersc.org/journals/index.php/IJAST/article/view/17970
Lehecka, T. (2015). Collocation and colligation. In Handbook of pragmatics online. Benjamins.
Li, J., & Schmitt, N. (2009). The acquisition of lexical phrases in academic writing: A longitudinal case study. Journal of Second Language Writing, 18(2), 85-102. https://doi.org/10.1016/j.jslw.2009.02.001
Liu, C.-Y., & Chen, H. J. H. (2020). Analyzing the functions of lexical bundles in undergraduate academic lectures for pedagogical use. English for Specific Purposes, 58, 122-137. https://doi.org/10.1016/j.esp.2019.12.003
Lu, X., & Deng, J. (2019). With the rapid development: A contrastive analysis of lexical bundles in dissertation abstracts by Chinese and L1 English doctoral students. Journal of English for Academic Purposes, 39, 21-36. https://doi.org/10.1016/j.jeap.2019.03.008
Ma, G. H. (2009). Lexical bundles in L2 timed writing of English majors. Foreign Language Teaching and Research, 1, 54-59.
Mohammadi, M., & Enayati, B. (2018). The Effects of Lexical Chunks Teaching on EFL Intermediate Learners' Speaking Fluency. International Journal of Instruction, 11(3), 179-192. https://doi.org/10.12973/iji.2018.11313a
Nasrabady, P., Elahi Shirvan, M., & Ehsan Golparvar, S. (2020). Exploring lexical bundles in recent published papers in the field of applied linguistics. Journal of World Languages, 6(3), 175-197. https://doi.org/10.1080/21698252.2020.1797992
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge University Press.
Oakey, D. (2020). Phrases in EAP academic writing pedagogy: Illuminating Halliday’s influence on research and practice. Journal of English for Academic Purposes 44: 100829. https://doi.org/10.1016/j.jeap.2019.100829
O'Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language use and language teaching. Cambridge University Press.
Öztürk, Y., & Köse, G. D. (2016). Turkish and native English academic writers’ use of lexical bundles. Journal of language and linguistic studies, 12(1), 149-165.
Pan, F., & Liu, C. (2019). Comparing L1-L2 differences in lexical bundles in student and expert writing. Southern African Linguistics and Applied Language Studies 37(2), 142–157. https://doi.org/10.2989/16073614.2019.1625276
Pan, F., Reppen, R., & Biber, D. (2016). Comparing patterns of L1 versus L2 English academic professionals: Lexical bundles in Telecommunications research journals. Journal of English for Academic Purposes, 21, 60-71. https://doi.org/10.1016/j.jeap.2015.11.003
Ping, P. (2009). A study on the use of four-word lexical bundles in argumentative essays by Chinese English-majors: A comparative study based on WECCL and LOCNESS. CELEA journal, 32(3), 25-45. http://www.celea.org.cn/teic/85/85-25.pdf
Qin, J. (2014). Use of formulaic bundles by non-native English graduate writers and published authors in applied linguistics. System 42(1), 220–231. https://doi.org/10.1016/j.system.2013.12.003
Reppen, R., & Olson, S. B. (2020). Lexical bundles across disciplines. U. Römer, V. Cortes & E. Friginal. Advances in Corpus-based Research on Academic Writing: Effects of discipline, register, and writer expertise, 169-182.
Ruan, Z. (2017). Lexical bundles in Chinese undergraduate academic writing at an English medium university. RELC Journal, 48(3), 327-340. https://doi.org/10.1177/00336882166312
Saito, K., & Liu, Y. (2022). Roles of collocation in L2 oral proficiency revisited: Different tasks, L1 vs. L2 raters, and cross-sectional vs. longitudinal analyses. Second Language Research, 38(3), 531-554. https://doi.org/10.1177/0267658320988055
Schmitt, N. (2010). Researching vocabulary: A vocabulary research manual. Springer.
Schmitt, N., & Carter, R. (2004). Formulaic sequences in action. Formulaic sequences: Acquisition, processing and use, 1-22. https://digital.casalini.it/9789027295750
Shahriari Ahmadi, H., Ghonsooly, B., & Hosseini Fatemi, A. (2013). Analyzing research article introductions by Iranian and native English-speaking authors of Applied Linguistics. International Journal of Research, 2(3), 3-18. https://doi.org/10.5861/ijrsll.2012.158
Shin, Y. K. (2018). The Construction of English Lexical Bundles in Context by Native and Nonnative Freshman University Students. English teaching, 73(3), 115-139. https://doi.org/10.15858/engtea.73.3.201809.115
Shirazizadeh, M., & Amirfazlian, R. (2021). Lexical bundles in theses, articles and textbooks of applied linguistics: Investigating intra disciplinary uniformity and variation. Journal of English for Academic Purposes, 49, 100946. https://doi.org/10.1016/j.jeap.2020.100946
Simpson-Vlach, R., & Leicher, S. (2006). The MICASE Handbook. Ann Arbor: The University of Michigan Press.
Staples, S., Egbert, J., Biber, D., & McClair, A. (2013). Formulaic sequences and EAP writing development: Lexical bundles in the TOEFL iBT writing section. Journal of English for academic purposes, 12(3), 214-225. https://doi.org/10.1016/j.jeap.2013.05.002
Stengers, H., Boers, F., Housen, A., & Eyckmans, J. (2011). Formulaic sequences and L2 oral proficiency: Does the type of target language influence the association?
Svensén, B. (2009). A handbook of lexicography: The theory and practice of dictionary-making. Cambridge University Press.
Vaziri, A., Barjesteh, H., & Nasrollahi Mouziraji, A. (2023). Formulaic Sequences in Learners’ Spoken English: A Comparative Corpus- based Study between Native and Non-Native Speakers of English. Iranian Journal of English for Academic Purposes, 12(3), 56-72.
Wang, M., & Zhang, Y. (2021). ‘According to…’: The impact of language background and writing expertise on textual priming patterns of multi-word sequences in academic writing. Journal of English for Specific Purposes, 61, 47-59. https://doi.org/10.1177/0267658320988055
Wei, Y., & Lei, L. (2011). Lexical bundles in the academic writing of advanced Chinese EFL learners. RELC journal, 42(2), 155-166. https://doi.org/10.1177/0033688211407295
Wu, H. (2019). A Corpus-based Analysis of TESOL EFL Students' Use of Logical Connectors in Spoken English. Theory and Practice in Language Studies, 9(6), 625-636. https://dx.doi.org/10.17507/tpls.0906.04
Xu, R., & Wijitsopon, R. (2023). Corpus linguistics and cinematic discourse: Lexical bundles in mainstream film scripts. LEARN Journal: Language Education and Acquisition Research Network, 16(1), 545-574. https://so04.tcithaijo.org/index.php/LEARN/article/view/263456
Yakut, I., & Yuvayapan, F. (2022). Forms and functions of self-repetitions in spoken discourse: A corpus linguistics analysis of L1 and L2 English. Topics in Linguistics, 23(1), 83-96. https://doi.org/10.2478/topling-2022-0007
Yang, B. (2020). A Corpus-based study of synonymous epistemic adverbs perhaps, probably, maybe and possibly. Research Journal of Education, 6(8), 158-168. https://doi.org/10.32861/rje.68.158.168
Young, R. F., & Miller, E. R. (2004). Learning as changing participation: Discourse roles in ESL writing conferences. The Modern Language Journal, 88(4), 519-535. https://doi.org/10.1111/j.0026-7902.2004.t01-16-.x
Zago, R. (2020). Film discourse. In E. Friginal & J. A. Hardy (Eds.), The Routledge handbook of corpus approaches to discourse analysis, 168–182. Routledge.
INTRODUCTION
Corpus linguistics relies on statistical measures and deals with large amounts of empirical linguistic data (Granger, 1998; Gries & Ellis, 2015). The analytical procedures used in corpus linguistic studies include word lists and frequencies, lexical variation (type/token ratio), concordance lines, collocations, and lexical bundles (Biber et al., 2004; Coxhead, 2008; Granger & Meunier, 2008). Frequency is a basic feature of this kind of investigation; however, a corpus-based study does not merely count linguistic features but also includes qualitative interpretation of the numerical data. In other words, language corpora provide lexicographers with large amounts of authentic linguistic data, which enable them to determine the frequency of items and bundles and to describe their occurrence (Esfandiari et al., 2021; Svensén, 2009). According to Nasrabady et al. (2020), the goal of corpus-based research is not only to report quantitative linguistic data but also to bring to light patterns of language use through the analysis of language data.
Understanding how language is produced and how people acquire or learn it is a central concern of linguistic research, since language has been used throughout history to communicate ideas and feelings and to pass information and knowledge on to later generations (Gómez Burgos, 2015). Of the four language skills, speaking is often regarded as the most important in learning a second or foreign language. Given the importance of speaking proficiency in EFL curricula, it is essential to identify and use the instructional methods, materials, activities, media, and other resources that best help learners become proficient speakers (Vaziri et al., 2023; Young & Miller, 2004). In this regard, Yang (2020) noted that English learners who wish to achieve native-like fluency and accuracy are better served by acquiring multi-word strings and lexical bundles. It is increasingly acknowledged that certain sequences of words serve functions that play an important role in mastering a language (Schmitt & Carter, 2004; Qin, 2014). These expressions also allow learners to organize their ideas in context and facilitate fluent production and communication (Hejazi, 2021; Hyland, 2008; Li & Schmitt, 2009).
Lexical bundles are of special importance in speaking because they fulfill significant discourse functions and are a hallmark of advanced academic speech (Pan & Liu, 2019; Ruan, 2017). Some scholars (Cortes, 2004; Hyland, 2012; Xu & Wijitsopon, 2023) describe lexical bundles as combinations of more than two words that recur in discourse tied to a particular situation. According to Nasrabady et al. (2020), lexical bundles are produced automatically and unconsciously and have been studied for their pragmatic functions in different contexts.
Motivated by the need to help such learners improve their speaking competence, many studies have investigated lexical bundles in different contexts (Pan et al., 2016; Pan & Liu, 2019; Staples et al., 2013). In general, these studies have yielded valuable insights into how English learners at various proficiency levels use language. However, their results remain mixed, and it is not yet clear how L2 learners at particular proficiency levels differ in the forms and in the structural and functional patterns of the lexical bundles they use. This may be due to differences in corpus size, methodology, and academic register, or to heterogeneity in corpus design, all of which may affect the use of lexical bundles.
An inquiry into the frequency and range of lexical bundles may help teachers of English understand the influence of learners’ L1 on L2 acquisition and develop more practical teaching strategies. It may also fill a research gap, namely the lack of studies on the frequency and range of lexical bundles used by Iranian learners. The current study therefore investigates EFL learners’ use of lexical bundles in their spoken discourse with the help of corpora. The researcher first compiled an EFL learner corpus from participants whose proficiency level was advanced, and then compared it with an established spoken English corpus, the Michigan Corpus of Academic Spoken English (MICASE), to identify key features of the two corpora.
Specifically, this research examines the following research questions:
1. What are the overall frequencies of the lexical bundles in the Iranian advanced learners’ discussion corpora and MICASE?
2. What are the similarities and differences in utilizing lexical bundles between the Iranian advanced learners and native speakers?
LITERATURE REVIEW
The term ‘lexical bundle’ is narrowly defined as a continuous sequence of three or more words that occurs frequently in a corpus and satisfies particular frequency and dispersion thresholds, for instance, occurring at least 20–40 times per million words in three to six texts (Biber & Barbieri, 2007; Chen & Baker, 2016). This definition has been adopted in many studies of bundles, which have enhanced understanding of how lexical bundles are used and provided standard criteria for identifying them (Biber et al., 2004; Cortes, 2006; Grabowski, 2015; Liu & Chen, 2020).
Because of their role in language learning and fluency, researchers have studied the use of lexical bundles in both spoken and written language. Lexical bundles have been found to differ across registers, L1 backgrounds, levels of writer competence or expertise, disciplines, and proficiency levels (e.g., Biber et al., 2004; Chen & Baker, 2016; Hyland, 2008; Öztürk & Köse, 2016; Reppen & Olson, 2020). The study of lexical bundles is therefore relevant to broader areas such as applied linguistics, second-language acquisition, and language instruction.
Previous research on lexical bundles has used corpora to identify the most common recurrent sequences of words and to examine how those sequences can be interpreted as building blocks of discourse. According to Chun-Guang (2014), corpus linguistics has influenced vocabulary teaching (Nation, 2001; Schmitt & Carter, 2004). It is quite possible that data-driven learning methods offer new conceptions of meta-linguistic knowledge. A corpus provides a large amount of quantitative data and an opportunity to test ideas about a particular language. It can also reveal rare or exceptional cases that could not be identified by looking at single texts. According to Svensén (2009), language corpora provide lexicographers with large amounts of authentic linguistic data, which enable them to determine the frequency of items and bundles and to describe their occurrence.
The study of lexical bundles is essential because they are widespread in various registers and “are prominent due to their rigidity”, which makes them easy to recognize and a good basis for teaching and learning a foreign language (Lehecka, 2015). A number of studies have also claimed that learning these expressions helps learners not only to become fluent but also to use a greater range of expressions more accurately (Esfandiari & Barbary, 2017; Henriksen, 2013; Saito & Liu, 2022; Schmitt, 2010; Stengers et al., 2011; Wang & Zhang, 2021). These studies were based on various corpora and proposed structural and functional classifications for lexical bundles.
Biber et al. (2004) classified lexical bundles by structure and by function. Structurally, lexical bundles include “verb phrase fragments” (e.g., can be used to), “dependent clause fragments” (e.g., that there is a), and “noun phrase and prepositional fragments” (e.g., one of the things). Functionally, the first subcategory is “stance expressions” (e.g., I don’t know that, the fact that the, if you want to, it is important to, going to be a, it is possible to). The second functional subcategory is “discourse organizers”, which “reflect relationships between prior and coming discourse” (e.g., in this chapter we, as well as the). The last functional subcategory is “referential expressions”, which “make direct reference to physical or abstract entities, or to the textual context itself” (e.g., is one of the, something like that, the rest of the, the size of the, in terms of the, in the United States) (pp. 381, 384–388).
Many studies have examined lexical bundles using corpus linguistic methods. For example, Ädel and Erman (2012) compared the use of lexical bundles by advanced learners of English whose L1 was Swedish with that of their English native-speaker counterparts, all of whom were undergraduate students in the discipline of applied linguistics. Four-word lexical bundles were extracted from the corpora and analyzed both quantitatively and qualitatively in terms of the functions they served. The results showed that native speakers used a larger number and a wider variety of lexical bundles than the L2 writers.
To compare the use of lexical bundles in written and spoken discourse, Oakey (2020) examined lexical bundles in spoken and written university registers, based on a sub-component of the TOEFL 2000 Spoken and Written Academic Language Corpus. The findings made it clear that lexical bundles depend both on mode (written or spoken) and on the communicative purpose they serve. The study also found that lexical bundles are commonly used in university language, even in non-academic registers such as management course syllabi.
Zago (2020) studied lexical bundles in English and American films and found that lexical bundles are frequent in cinematic discourse, with proximal bundles being more common in film dialogues than in other spoken registers of English.
Saito and Liu (2022) investigated lexical bundles, as one particular kind of formulaic sequence, in relation to L2 speaking proficiency. The results revealed a strong positive relationship between the length of communication and language proficiency, suggesting that attending to the length of speech rather than to its specific features may help speakers produce language and sound more native-like. It has also been shown that using appropriate formulaic language enhances learners’ self-confidence and motivates them through a feeling of accomplishment, which may facilitate the learning process (Schmitt, 2010).
METHODOLOGY
Corpora
Native Corpus: MICASE
Between 1997 and 2001, the University of Michigan’s English Language Institute compiled the Michigan Corpus of Academic Spoken English (MICASE), which contains 200 hours of academic speech that was recorded and transcribed (Simpson-Vlach & Leicher, 2006). Comprising 152 speech events, MICASE covers four academic divisions: Physical Sciences and Engineering, Biological and Health Sciences, Humanities and Arts, and Social Sciences and Education. Transcripts from the Humanities and Arts and the Social Sciences and Education divisions were used in the present research. Although there are large spoken corpora with verbatim transcriptions of a considerable number of spoken samples from EFL learners all over the world, some of them are not free of charge and others are not easily accessible. Moreover, in the accessible corpora, the proportion of participants who speak advanced English is quite low. The data needed for this study had to come from speakers whose English proficiency was C1 or above. The researcher therefore used MICASE for the current study.
In addition to the reasons given above, a number of studies of spoken language have used this corpus (Ganji & Nasrabady, 2021; Kashiha & Chan, 2015; Kuswoyo et al., 2020; Wu, 2019). For the current study, the researcher chose five topics from MICASE because the corpus includes discussion sections, which match the context of this research. The topics, types, and word counts of the transcripts are presented in the following table.
Table 1
Description of transcripts of MICASE Corpus
Title | Type | Word Count | Academic Division |
1. Philosophy | Discussion section | 8355 | Humanities |
2. Economics | Discussion section | 8526 | Social science and education |
3. Intro Anthropology | Discussion section | 7893 | Social science and education |
4. History | Discussion section | 15679 | Social science and education |
5. Intro to American politics | Discussion section | 7220 | Social science and education |
Non-native Corpus: Iranian Non-native Academic Spoken Corpus
The non-native EFL learners’ spoken corpus in the present study was drawn from 21 group discussion transcriptions on topics such as anthropology, economics, philosophy, history, and American politics. The discussion subjects were the same as the topics of the discussions extracted from the MICASE corpus. The members of the group discussions were 18 female and 10 male undergraduate university students (distributed in 7 discussion groups) between the ages of 20 and 26. All participants had been studying English for at least 5-6 years, and their language proficiency level was C1. The Quick Oxford Placement Test (QOP) was administered, and the participants whose level was C1 or above took part in the group discussions (Mohammadi & Enayati, 2018). The participants were selected through convenience sampling; they therefore varied in age, gender, and years of learning experience. Four participants took part in each group discussion, and they talked with each other for about 15-20 minutes.
Table 2
Non-native Learner Corpus features
Features of Learners | Features of Tasks |
Age: 20-26 years old; Gender: 18 female and 10 male | Speaking; Academic genre |
English as a foreign language; Advanced university students | Speaking group discussion; Timed: 15-20 min. |
Table 3
Information of the Corpora
Corpus | Number of texts | Number of words |
Native corpus: MICASE | 5 | 47673 |
Non-native corpus: Iranian advanced university students | 21 | 35538 |
Lexical Bundles Identification
For this stage, two key points had to be considered: “the size of cluster” (i.e., lexical bundle) to seek for, and “the frequency cut-off”; they were difficult decisions. As mentioned before, this research focused on units consisting of 4-word sequences. Referring to Hyland (2008), 4-word lexical bundles are more general than 5-word bundles and usually signify clearer structures and functions than 3 -word bundles. Two word lexical bundles were left out too, since they are too frequent and often represent regular collocations. As proposed by Appel (2011), 4-word bundles “seem to have become a standard unit of length in this type of research, problems still persist” (p. 69).
Another requirement was to set an objective frequency cut-off point for the detection of lexical bundles. Hyland (2008) acknowledged that the cut-off point for bundles is fairly arbitrary. Biber et al. (2000) defined lexical bundles as groupings of words that occur at least 10 times per million words and in at least five different texts in the corpus. After some consideration, the frequency cut-off for the present study was set at four occurrences in the corpus, corresponding to roughly 12 incidences per million words. This falls within the range established by previous research; as noted earlier, cut-offs between five and 40 occurrences per million words have been employed (Biber et al., 2004; Jablonkai, 2009).
“Range” means that a bundle should occur in a certain number of different texts; this helps to “guard against idiosyncratic uses by individual speakers or authors” (Biber et al., 2004). Prior investigations have applied different range cut-offs; Biber et al. (2004), Cortes (2006), and Cheng (2010) all employed a cut-off of five texts. Hyland (2008) offered a percentage-based approach (i.e., a sequence appearing in fewer than 10 percent of the transcripts would be deleted). For the present research, it was decided that a bundle should occur in at least four of the twenty-one texts. This significantly reduced the total number of items in the list by omitting the bundles that were particular to a few speakers.
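The two cut-offs described above (a minimum frequency and a minimum range) can be illustrated with a short sketch. This is not the procedure used in the study, which relied on AntConc; the function name `extract_bundles`, the simple regular-expression tokenizer, and the toy transcripts are all assumptions made for illustration only.

```python
import re
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_freq=4, min_range=4):
    """Return {bundle: (frequency, range)} for n-word sequences meeting both cut-offs."""
    freq = Counter()              # total occurrences of each n-gram in the corpus
    text_ids = defaultdict(set)   # indices of the texts in which each n-gram occurs
    for idx, text in enumerate(texts):
        # Hypothetical tokenizer: lower-cased runs of letters and ASCII apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            freq[gram] += 1
            text_ids[gram].add(idx)
    return {
        gram: (count, len(text_ids[gram]))
        for gram, count in freq.items()
        if count >= min_freq and len(text_ids[gram]) >= min_range
    }


# Toy data only: four invented "transcripts", not taken from the study corpus.
toy_texts = ["on the other hand we agree"] * 4
print(extract_bundles(toy_texts))
```

A sequence repeated four times by a single speaker in one transcript would pass the frequency cut-off but fail the range cut-off, which is the idiosyncratic use the range criterion is meant to exclude.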
Since the corpora employed in this research were not equal in size, frequencies had to be normalized per 1000 words so that the two corpora could be compared in terms of the overall frequency of bundles used. Despite its limitations, this procedure has already been employed in previous corpus-based research on lexical bundles (Esfandiari et al., 2021; Jalali, 2013; Khazaee et al., 2020).
DATA ANALYSIS
The data were compiled from spoken group discussions in which the participants discussed specific topics assigned to them by the researcher. To compile the spoken discourse of non-native speakers of English, the group discussions of the Iranian undergraduate students were first audio-recorded and then stored as sound files on a computer. Since the research corpus was built with MICASE as its basis, the transcription followed the same method and manner as MICASE. The study corpus was transcribed according to the MICASE orthographic transcription principles and mark-up system, which are designed to allow for ease of readability while containing sufficient detail to ensure adequate comprehension from the text of the transcript alone. Because AntConc 3.3.0 (Anthony, 2019) requires plain text for data analysis, all conversations were typed and saved in text format. The researcher employed AntConc 3.3.0, a freeware concordance program, to identify and create a list of common 4-word lexical bundles in each sub-corpus. After the frequency of each bundle had been analyzed, a chi-square analysis was run to determine whether the differences in each category of bundles were significant. In this research, it was computed with an online calculator (http://sigil.collocations.de/wizard.html).
During transcription, speech errors made by the participants were not corrected and were transcribed as they actually occurred. After the frequency and patterns of use had been compared, the identified lexical bundles were classified both structurally, based on their grammatical forms, and functionally, according to their contextual meanings (Biber et al., 2004). To reduce subjectivity, the transcripts were verified by a second rater, and modifications were made where needed.
RESULTS
Division of the Extracted Bundles
A preliminary evaluation of the two lists of extracted bundles, for native speakers (NSs) and non-native speakers (NNSs), revealed that the discussion sessions of the NSs contained a larger number of lexical bundles. Overall, a total raw frequency of 505 target bundles was found in the native corpus and 315 in the non-native corpus. There were 67 different sequences in the native corpus and only 45 for the NNSs. Because raw frequencies from corpora of different sizes are not directly comparable, they were normalized per 1000 words. Normalization can be done manually with the formula (raw frequency × 1000) ÷ number of words in the corpus, or electronically through websites.
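The normalization formula above can be written as a one-line function. The sketch below is illustrative only: the function name `normalized_frequency` is an assumption, while the token counts and corpus sizes are the figures reported in this study.

```python
def normalized_frequency(raw_count, corpus_size, per=1000):
    """Occurrences per `per` words: (raw frequency x per) / number of words in the corpus."""
    return raw_count * per / corpus_size


# Bundle token counts and corpus sizes as reported in this study.
ns_nf = normalized_frequency(505, 47673)    # native corpus (MICASE)
nns_nf = normalized_frequency(315, 35538)   # non-native corpus
print(f"NS: {ns_nf:.2f}  NNS: {nns_nf:.2f}")  # NS: 10.59  NNS: 8.86
```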
Table 4
Division of Lexical Bundles in N and NN Corpora
Groups | Total No. of N-gram Types (Raw Freq.) | Total No. of N-gram Tokens (Raw Freq.) | NF per 1000 words |
Native speakers | 67 | 505 | 10.59 |
Non-native speakers | 45 | 315 | 8.86 |
Note: In this study, ‘N-grams’ refers to 4-word lexical bundles.
Raw Freq. = raw frequency; NF = normalized frequency
As Table 4 shows, based on normalized frequencies, native speakers employed more bundles than their non-native counterparts: the NF is 10.59 for NSs and 8.86 for NNSs. To compare the frequency of the extracted bundles in the two corpora, a chi-square test for statistical significance was conducted. The test was run with an online calculator named “Corpus Frequency Wizard Tool”, which compares “the frequency of two samples across two different data sets” (Baroni & Evert, 2008). Other researchers have used this procedure to run chi-square tests and compare frequencies (Jiangang, 2017; Khazaee et al., 2020; Yakut & Yuvayapan, 2022). Figure 1 shows the frequency comparison of lexical bundles in the native and non-native corpora, and Figure 2 shows the results of the comparison.
Figure 1
Frequency Comparison of Bundles in Both NS and NNS Corpora via Corpus Frequency Wizard Tool
Figure 2
Corpus Frequency Test Result
As Figure 2 shows, there was a statistically significant difference between the native and non-native corpora in the use of lexical bundles (χ² = 6.063; p < 0.05). That is, the native corpus contained significantly more bundles than the Iranian corpus. The finding that English L2 speakers employ fewer lexical bundles than native speakers has been confirmed in previous research (Ädel & Erman, 2012; Chen & Baker, 2010; Karabacak & Qin, 2013; Kashiha & Chan, 2015; Ma, 2009). As the goal of the research is to investigate the structural and functional characteristics of lexical bundles in group discussions, the complete analysis of these two sets is based on all the bundles found in the native and non-native corpora.
Below are a number of examples of lexical bundle use in the non-native corpus.
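The comparison can be approximated offline with a 2×2 chi-square test in which each corpus is split into bundle tokens and all remaining word tokens. The sketch below is an illustration rather than the online tool's own code: the function name `chi_square_2x2` is hypothetical, and applying Yates' continuity correction is an assumption. With the counts reported in this study, it gives a chi-square value of approximately 6.06 and a p-value below 0.05.

```python
import math


def chi_square_2x2(count_a, size_a, count_b, size_b, yates=True):
    """Pearson chi-square (1 df) comparing two frequencies in corpora of different sizes."""
    # 2x2 table: bundle tokens vs. all remaining word tokens in each corpus.
    a, b = count_a, size_a - count_a
    c, d = count_b, size_b - count_b
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)  # Yates' continuity correction (assumed here)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function for 1 df
    return chi2, p_value


# Bundle token counts and corpus sizes as reported in this study.
chi2, p = chi_square_2x2(505, 47673, 315, 35538)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Without the continuity correction (`yates=False`), the same counts give a somewhat larger statistic, so the choice of correction matters when reproducing a reported value.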
……honestly I have no idea about all the intricacies of anthropology, but it sounds like an extraordinary marvelous subject….
……when I talk about history I want you to know what I mean by studying events, people and societies from the past to gain insights into our current world…
….it doesn’t have to be daunting task to delve into the realm of philosophy; it can be a rewarding journey of self-discovery and intellectual exploration….
…..let’s talk about the various branches of philosophy and how they shape our understanding of the worlds….
…..a little bit more empathy and understanding can go a long way in bridging the political divides that exist in our country…..
…..I was just wondering, have you ever thought about how anthropology can shed lights on our cultural identities….
…….I understood that you have spent a lot of money recently, so I want to help find the way to manage your finances better…….
……thank you very much because of your extraordinary interesting favour for introducing me some historical places in your hometown….
…….that would be interesting to discover how various societies perceive and interpret the world around them….
…..can be used as a symbol or clue, especially in historical investigations…..
…...I don’t understand why people don’t pay attention to undeniable evidences…..
…..you don’t have to limit your soul between these issues…
…..I guess you cannot tolerate philosophy and it looks dreamy for you…..
……this is hard to control your consumption and think about pros and cons…
…..I am going to change your worldview and make you liberal……..
Structural Distribution of LBs
In addition to the frequency counts, analysis of the corpus showed that the undergraduate university learners used a combination of structures to form lexical bundles in their discussions. The analysis showed that most of the target bundles were phrasal rather than clausal. These sequences mostly consisted of noun, prepositional, or verb phrases. Table 5 presents the number of the most significant syntactic structures of the bundles in the group discussions. In addition to raw frequencies, the percentages and normalized frequencies of the bundles are reported in the table.
Table 5
Structural Division of Lexical Bundles in N and NN corpora
structural types | NS No. (%) | NF per 1000 words | (+/-) | NNS No. (%) | NF per 1000 words |
verb phrase | 23 (34.33) | 0.48 | +0.11 | 21 (46.67) | 0.59 |
noun phrase and prepositional phrase | 34 (50.74) | 0.71 | -0.20 | 18 (40) | 0.50 |
dependent clause | 10 (14.92) | 0.20 | -0.04 | 6 (13.33) | 0.16 |
Total | 67 | | | 45 | |
Note: NS refers to Native Speakers, NNS refers to Non-native Speakers
NF= normalized frequency; (+/-: A positive value denotes overuse, a negative value denotes underuse)
As the table shows, NSs employed ‘noun phrase and prepositional phrase’ fragments more than the other two categories: 0.71 per 1000 words (50.74%), compared with 0.48 (34.33%) for ‘verb phrase’ fragments and 0.20 (14.92%) for ‘dependent clause’ fragments. NNSs, in contrast, showed a stronger tendency toward ‘verb phrase’ fragments, at 0.59 (46.67%); the non-native learners overused VP fragments (+0.11) and underused the other two sub-groups. Notably, the native and non-native corpora made roughly similar use of ‘dependent clause’ fragments: NSs (14.92%) and NNSs (13.33%).
Functional Distribution of LBs
As mentioned before, the functional taxonomy in the current research is divided into three categories: stance bundles, discourse organizers, and referential expressions. To present precise results, the analysis of the sequences was sub-classified for each category. According to Biber et al. (2004), “Stance bundles express attitudes or assessments of certainty that frame some other proposition. ‘Discourse organizers’ reflect relationships between prior and coming discourse. ‘Referential bundles’ make direct reference to physical or abstract entities, or to the textual context itself, either to identify the entity or to single out some particular attribute of the entity as especially important” (p. 384). The functional classification of all the 4-word lexical bundles identified in the native and non-native corpora of the present study is presented in Table 6.
Table 6
Functional Division of Lexical Bundles in N and NN Corpora
functions | NS No. (%) | NF per 1000 words | (+/-) | NNS No. (%) | NF per 1000 words |
stance expressions | 21 (31.34) | 0.44 | +0.09 | 19 (42.22) | 0.53 |
discourse organizers | 27 (40.29) | 0.56 | -0.09 | 17 (37.77) | 0.47 |
referential expressions | 19 (28.35) | 0.39 | -0.14 | 9 (20) | 0.25 |
Total | 67 | 1.40 | | 45 | 1.26 |
Note: (+/-: A positive value denotes overuse, a negative value denotes underuse)
The results demonstrate that spoken language in the native corpus group discussions contained 27 discourse organizers (40.29%), followed by 21 stance expressions (31.34%) and 19 referential expressions (28.35%). The corresponding frequencies per 1000 words are 0.56, 0.44, and 0.39. The same calculation for the non-native corpus showed that the learners employed more stance expressions (0.53) than the native speakers (0.44). In contrast, NSs were more inclined to use discourse organizers (0.56), which were the second most used function in the non-native corpus (0.47). The difference between 0.53 and 0.44 is 0.09, a positive value indicating overuse of stance expressions by the Iranian undergraduate university learners. This result is not in line with Hyland (2008) and Esfandiari and Barbary (2017), who concluded that Iranian writers underuse stance bundles. That underuse may be due to cultural factors underlying a reluctance to make strong claims in academic writing (Estaji & Montazeri, 2022). Comparison of the other two sub-sets revealed negative values, indicating underuse of both discourse organizers and referential expressions in the non-native corpus.
Juknevičienė (2009) showed that spoken language is characterized by stance expressions and discourse organizers, whereas written language is characterized by referential expressions (Biber et al., 2000, 2004; O’Keeffe et al., 2007). Referential expressions account for approximately one third of the bundles in the native corpus and one fifth in the non-native corpus (28.35% and 20%, respectively). The distinctions among the functional classifications and their sub-sets are explained in the subsequent parts.
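The overuse and underuse values can be reproduced from the normalized frequencies reported in Table 6. The following is a minimal sketch; the function name `usage_difference` and the labels it returns are assumptions for illustration, and the difference is taken as the NNS value minus the NS value, as in the (+/-) column.

```python
def usage_difference(ns_nf, nns_nf):
    """Return the NNS-minus-NS gap in normalized frequency and a descriptive label."""
    diff = round(nns_nf - ns_nf, 2)
    if diff > 0:
        label = "overuse"
    elif diff < 0:
        label = "underuse"
    else:
        label = "no difference"
    return diff, label


# Normalized frequencies (per 1000 words) as reported in Table 6: (NS, NNS).
table6 = {
    "stance expressions": (0.44, 0.53),
    "discourse organizers": (0.56, 0.47),
    "referential expressions": (0.39, 0.25),
}
for function, (ns_nf, nns_nf) in table6.items():
    print(function, usage_difference(ns_nf, nns_nf))
```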
DISCUSSION
The purpose of the current research was to compare lexical bundles used by Iranian advanced English learners and native English speakers. The results showed that Iranian advanced English learners used LBs less frequently than English academic speakers. Structural analysis of the LBs showed that VP-based bundles made up the greatest proportion of all bundle types in the NNC, followed by NP-based bundles and dependent clause bundles. The NC, however, showed a different pattern, in which NP-based bundles comprised the largest proportion, followed by VP-based bundles and dependent clause bundles. Functional analysis of the LBs revealed that stance bundles made up the greatest proportion of all bundle types in the NNC, followed by discourse bundles and referential bundles. The NC again showed a different pattern, in which discourse bundles comprised the largest proportion, followed by stance bundles and referential bundles.
The following examples illustrate how the Iranian learners used lexical bundles in the main functions.
...... I don’t know if you have ever pondered the big questions of existence but philosophy offers fantastic insights….. (Stance expressions)
..... I’m going to explore the field of anthropology to learn more about the rich tapestry of human existence…… (Stance expressions)
….it seems to me as a complex subject which is the study of existence and moral…..(Stance expressions)
…..I’d like to say that history is fascinating subject that allows us to explore the past and understand how it has shaped the present…… (Discourse organizers)
……if you want to understand the complexities of American politics, it’s important to stay informed and engage in discussions……(Discourse organizers)
……as you can guess the first thing we learned was the principle of supply and demand……(Discourse organizers)
…….on the other hand it affects on your business and occupations…..(Discourse organizers)
……what do you think about financial affairs?..... (Discourse organizers)
…..that would be good if some governors disclose some secrets which are hidden…..(Discourse organizers)
…..at the end of the day, philosophy encourages us to critically examine our beliefs and seek deeper meaning…..(Referential expressions)
……for a long time anthropologists have been fascinated by the diversity of human cultures and their similarities ……(Referential expressions)
…..this is hard to grasp at first but delving into history can provide us with valuable lessons and perspectives that can help us navigate the complexities of our modern world ……(Referential expressions)
…..some people stuck to their prejudices for a long time…..(Referential expressions)
…….in the same way there are many citizens who are not satisfied with their government……..(Referential expressions)
A review of the literature shows that the findings of several scholars (Cortes, 2004; Hyland, 2008; Ping, 2009; Shahriari Ahmadi et al., 2013; Wei & Lei, 2011) are in line with the present study, since they found that non-native English learners used fewer lexical bundles than native speakers. There are, however, also some studies whose results differ from those of this study. Lu and Deng (2019) discovered that Chinese doctoral students used LBs more frequently than their native-speaker counterparts, although they “exhibited incomplete knowledge of some aspects of the English lexico-grammatical system” (p. 1). According to Ahmadi et al. (2020), Persian writers employed significantly more lexical bundles of all types as noun modifiers than native writers.
The results of Bychkovska and Lee (2017) are similar to the findings of this study. They also found that native English speakers employed more phrasal bundles than non-native English learners and, conversely, fewer clausal bundles. Pan et al. (2016) found that the majority of the bundles used in the native corpus were phrasal, whereas in the non-native corpus these bundles were not as common. This reveals that non-native learners tend to employ more clausal bundles even at the more advanced level. This means that both nativeness and expertise might be at play in the use of clausal bundles in the non-native corpus (Chen & Baker, 2010). As in the native corpus of this study, some studies have concluded that noun phrase-based and prepositional phrase-based bundles were the most prominent ones (Cui & Kim, 2023; Hyland & Jiang, 2018; Shirazizadeh & Amirfazlian, 2021).
The findings of this research, in which verb-based bundles were the most prominent structural category employed by Iranian English learners, correspond to preceding studies of non-native learners (Chen & Baker, 2010; Chen & Baker, 2016; Ruan, 2017). Pan and Liu (2019) also concluded that non-native learners employed more verb-based bundles, particularly the ‘passive verb + prepositional phrase fragment’, than native English writers.
As in previous studies (Biber et al., 2004; Chen & Baker, 2010), LBs containing first-person pronouns (e.g., I think that, I think it is, I want to) were overused by the Iranian learners. The increased use of the first-person pronoun helps readers follow the discussion and helps speakers build their identity by highlighting their voice when conveying an idea. Moreover, the use of lexical bundles containing personal pronouns and of stance bundles supports and builds on the results of previous studies (Chen & Baker, 2016; Staples et al., 2013).
On the other hand, some previous studies have shown that the use of bundles is discipline-specific (Cortes, 2004; Hyland, 2008). For example, Cortes (2004) found that students of history made frequent use of clausal bundles, whereas students of biology made frequent use of phrasal bundles. Thus, the more frequent use of clausal bundles in native and non-native student writing might result from data drawn from disciplines that make more frequent use of clausal bundles. Shin (2018) also reported results that differ from those of the current study. She compared the use of bundles in argumentative essays written by native and non-native English students and found that, in both the native and the non-native corpora, the majority of the bundles were clausal.
CONCLUSION
This study examined the use of four-word bundles in the spoken group discussions of native speakers and Iranian advanced English speakers. In this corpus-based study, two corpora were contrasted in terms of bundle frequency, function, and structure. The findings revealed that the NC included more lexical bundles than the Iranian corpus. Functionally, the results showed that Iranian speakers used stance expressions more than the other two categories, whereas NC speakers used discourse organizers most. Structurally, the learners’ discussions relied most on verb phrase fragments, followed by noun and prepositional phrase fragments, which contrasts with the NC. It is hoped that this research will be of practical use to instructors, English learners, and researchers. The results can serve as a reference for those working in English teaching who wish to improve students’ speaking skills and to create lively classes through practical techniques and authentic teaching. Shirazizadeh and Amirfazlian (2021) suggested that highlighting lexical bundles in teaching materials can improve overall language fluency for non-native English speakers. Instruction in LB usage may benefit from corpus-based learning approaches for exploring, comparing, and analyzing the positional distribution of bundles to resolve any discrepancies in the rhetorical conventions of LBs in advanced academic speaking (Cui & Kim, 2023). In the Iranian context in particular, the results of the current study will also be helpful for textbook compilers. Further studies can compile a list of lexical bundles from related studies and design creative exercises for boosting speaking skills. Researchers can also investigate larger corpora in different contexts such as university lectures, seminars, and dialogues.
References
Ädel, A., & Erman, B. (2012). Recurrent word combinations in academic writing by native and non-native speakers of English: A lexical bundles approach. English for specific purposes, 31(2), 81-92. https://doi.org/10.1016/j.esp.2011.08.004
Ahmadi, M., Esfandiari, R., & Zarei, A. A. (2020). A corpus-based study of noun phrase complexity in applied linguistics research article abstracts in two contexts of publication. Iranian Journal of English for Academic Purposes, 9(1), 76-94. https://doi.org/20.1001.1.24763187.2020.9.1.6.3
Anthony, L. (2019). AntConc (Version 3.5.8) [Computer Software]. Tokyo, Japan: Waseda University. https://www.laurenceanthony.net/software
Appel, R. F. (2011). Lexical bundles in university EAP exam writing samples: CAEL test essays [Doctoral dissertation]. Carleton University.
Baroni, M., & Evert, S. (2008). Corpus Frequency Test Wizard [Computer program]. http://sigil.collocations.de/wizard.html
Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for specific purposes, 26(3), 263-286. https://doi.org/10.1016/j.esp.2006.08.003
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and textbooks. Applied linguistics, 25(3), 371-405. https://doi.org/10.1093/applin/25.3.371
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (2000). Longman grammar of spoken and written English.
Bychkovska, T., & Lee, J. J. (2017). At the same time: Lexical bundles in L1 and L2 university student argumentative writing. Journal of English for Academic Purposes, 20, 38-52. https://doi.org/10.1016/j.jeap.2017.10.008
Chen, Y. H., & Baker, P. (2016). Investigating criterial discourse features across second language development: Lexical bundles in rated learner essays, CEFR B1, B2 and C1. Applied linguistics, 37(6), 849-880.
Chen, Y., & Baker, P. (2010). Lexical bundles in L1 and L2 academic writing. Language Learning and Technology, 14(2), 30-49.
Cheng, S. W. (2010). A Corpus-Based Approach to the Study of Speech Act of Thanking. Concentric: Studies in Linguistics, 36(2).
Chun-Guang, T. (2014). An empirical research on the corpus-driven lexical chunks instruction. International Journal of English Language Teaching, 2(2), 1-36.
Cortes, V. (2004). Lexical bundles in published and student disciplinary writing: Examples from history and biology. English for specific purposes, 23(4), 397-423. https://doi.org/10.1016/j.esp.2003.12.001
Cortes, V. (2006). Teaching lexical bundles in the disciplines: An example from a writing intensive history class. Linguistics and education, 17(4), 391-406. https://doi.org/10.1016/j.linged.2007.02.001
Coxhead, A. (2008). Phraseology and English for academic purposes. Phraseology in language learning and teaching, 149-161.
Cui, X., & Kim, Y. (2023). Structural and functional differences between bundles of different lengths: A corpus-driven study. Frontiers in Psychology, 13. https://doi.org/10.3389/fpsyg.2022.1061097
Esfandiari, R., & Barbary, F. (2017). A contrastive corpus-driven study of lexical bundles between English writers and Persian writers in psychology research articles. Journal of English for Academic Purposes, 29, 21-42. https://doi.org/10.1016/j.jeap.2017.09.002
Esfandiari, R., Ahmadi, M., & Schaefer, E. (2021). A Corpus-based Study on the Use and Syntactic Functions of Lexical Bundles in Applied Linguistics Research Articles in Two Contexts of Publications. Applied Research on English Language, 10(4), 139-166. https://doi.org/10.22108/ARE.2021.130833.1787
Estaji, M., & Montazeri, M. R. (2022). Native English and non-native authors’ utilisation of lexical bundles: A corpus-based study of scholarly public health papers. Southern African Linguistics and Applied Language Studies, 40(2), 177-199. https://doi.org/10.2989/16073614.2022.2043169
Ganji, M., & Nasrabady, P. (2021). An analysis of the multiword units presented in IELTS speaking preparation books published by Iranian authors. Applied Research on English Language, 10(3), 137-160. https://doi.org/10.22108/ARE.2021.126390.1666
Gómez Burgos, E. (2015). First year university students' use of formulaic sequences in oral and written descriptions. Profile Issues in Teachers Professional Development, 17(1), 25-33. https://doi.org/10.15446/profile.v17n1.43438
Grabowski, Ł. (2015). Keywords and lexical bundles within English pharmaceutical discourse: A corpus-driven description. English for Specific Purposes, 38, 23-33. https://doi.org/10.1016/j.esp.2014.10.004
Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and lexical phrases. Phraseology: Theory, analysis and applications, 145-160.
Granger, S., & Meunier, F. (2008). Phraseology in language learning and teaching. Where to from where. Phraseology: An interdisciplinary perspective. New York/Amsterdam: John Benjamins, 247-252. http://digital.casalini.it/9789027291318
Gries, S. T., & Ellis, N. C. (2015). Statistical measures for usage‐based linguistics. Language Learning, 65(S1), 228-255. https://doi.org/10.1111/lang.12119
Hejazi, H. (2021). A Corpus-Based Investigation of Lexical Bundles and Keyness in B1, B2 and C1 ESL Learners’ Academic Writing [Doctoral dissertation, University of Liverpool]. The University of Liverpool Repository. https://livrepository.liverpool.ac.uk/id/eprint/3154170
Henriksen, B. (2013). Research on L2 learners’ collocational competence and development–a progress report. C. Bardel, C. Lindqvist, & B. Laufer (Eds.) L, 2, 29-56.
Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for specific purposes, 27(1), 4-21. https://doi.org/10.1016/j.esp.2007.06.001
Hyland, K. (2012). Bundles in academic discourse. Annual Review of Applied Linguistics, 32, 150-169. https://doi.org/10.1017/S0267190512000037
Hyland, K., & Jiang, F. (2018). Academic lexical bundles: How are they changing? International Journal of Corpus Linguistics, 23(4), 383-407.
Jablonkai, R. (2009). In the light of: A corpus-based analysis of lexical bundles in two EU-related registers. Corvinus University of Budapest: WopaLP, 3, 1-26.
Jalali, H. (2013). Lexical bundles in applied linguistics: Variations across postgraduate genres. Journal of Foreign Language Teaching and Translation Studies, 2(2), 1-29. https://doi.org/10.22034/EFL.2013.79199
Jiangang, S. (2017). A corpus-based study of contrastive/concessive linking adverbials in spoken English of Chinese EFL learners. Studies in Literature and Language, 14(2), 17-25. https://doi.org/10.3968/9273
Juknevičienė, R. (2009). Lexical bundles in learner language: Lithuanian learners vs. native speakers. Kalbotyra, 61, 61-72.
Karabacak, E., & Qin, J. (2013). Comparison of lexical bundles used by Turkish, Chinese, and American university students. Procedia-Social and Behavioral Sciences, 70, 622-628. https://doi.org/10.1016/j.sbspro.2013.01.101
Kashiha, H., & Chan, S. H. (2015). A little bit about: Differences in native and non-native speakers’ use of formulaic language. Australian Journal of Linguistics, 35(4), 297-310. https://doi.org/10.1080/07268602.2015.1067132
Khazaee, H., Maftoon, P., Birjandi, P., & Rezaie Golandoz, G. (2020). Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners. International Journal of Foreign Language Teaching and Research, 8(33), 109-122. https://sanad.iau.ir/journal/jfl/Article/677137?jid=677137
Kuswoyo, H., Sujatna, E. T. S., Indrayani, L. M., & Rido, A. (2020). Cohesive conjunctions and so as discourse strategies in English native and non-native engineering lecturers: A corpus-based study. International Journal of Advanced Science and Technology, 29(7), 2322-2335. http://sersc.org/journals/index.php/IJAST/article/view/17970
Lehecka, T. (2015). Collocation and colligation. In Handbook of pragmatics online. Benjamins.
Li, J., & Schmitt, N. (2009). The acquisition of lexical phrases in academic writing: A longitudinal case study. Journal of Second Language Writing, 18(2), 85-102. https://doi.org/10.1016/j.jslw.2009.02.001
Liu, C.-Y., & Chen, H. J. H. (2020). Analyzing the functions of lexical bundles in undergraduate academic lectures for pedagogical use. English for Specific Purposes, 58, 122-137. https://doi.org/10.1016/j.esp.2019.12.003
Lu, X., & Deng, J. (2019). With the rapid development: A contrastive analysis of lexical bundles in dissertation abstracts by Chinese and L1 English doctoral students. Journal of English for Academic Purposes, 39, 21-36. https://doi.org/10.1016/j.jeap.2019.03.008
Ma, G. H. (2009). Lexical bundles in L2 timed writing of English majors. Foreign Language Teaching and Research, 1, 54-59.
Mohammadi, M., & Enayati, B. (2018). The Effects of Lexical Chunks Teaching on EFL Intermediate Learners' Speaking Fluency. International Journal of Instruction, 11(3), 179-192. https://doi.org/10.12973/iji.2018.11313a
Nasrabady, P., Elahi Shirvan, M., & Ehsan Golparvar, S. (2020). Exploring lexical bundles in recent published papers in the field of applied linguistics. Journal of World Languages, 6(3), 175-197. https://doi.org/10.1080/21698252.2020.1797992
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge University Press.
Oakey, D. (2020). Phrases in EAP academic writing pedagogy: Illuminating Halliday’s influence on research and practice. Journal of English for Academic Purposes 44: 100829. https://doi.org/10.1016/j.jeap.2019.100829
O'Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language use and language teaching. Cambridge University Press.
Öztürk, Y., & Köse, G. D. (2016). Turkish and native English academic writers’ use of lexical bundles. Journal of language and linguistic studies, 12(1), 149-165.
Pan, F., & Liu, C. (2019). Comparing L1-L2 differences in lexical bundles in student and expert writing. Southern African Linguistics and Applied Language Studies 37(2), 142–157. https://doi.org/10.2989/16073614.2019.1625276
Pan, F., Reppen, R., & Biber, D. (2016). Comparing patterns of L1 versus L2 English academic professionals: Lexical bundles in Telecommunications research journals. Journal of English for Academic Purposes, 21, 60-71. https://doi.org/10.1016/j.jeap.2015.11.003
Ping, P. (2009). A study on the use of four-word lexical bundles in argumentative essays by Chinese English-majors: A comparative study based on WECCL and LOCNESS. CELEA journal, 32(3), 25-45. http://www.celea.org.cn/teic/85/85-25.pdf
Qin, J. (2014). Use of formulaic bundles by non-native English graduate writers and published authors in applied linguistics. System, 42(1), 220-231. https://doi.org/10.1016/j.system.2013.12.003
Reppen, R., & Olson, S. B. (2020). Lexical bundles across disciplines. In U. Römer, V. Cortes, & E. Friginal (Eds.), Advances in Corpus-based Research on Academic Writing: Effects of discipline, register, and writer expertise, 169-182.
Ruan, Z. (2017). Lexical bundles in Chinese undergraduate academic writing at an English medium university. RELC Journal, 48(3), 327-340. https://doi.org/10.1177/00336882166312
Saito, K., & Liu, Y. (2022). Roles of collocation in L2 oral proficiency revisited: Different tasks, L1 vs. L2 raters, and cross-sectional vs. longitudinal analyses. Second Language Research, 38(3), 531-554. https://doi.org/10.1177/0267658320988055
Schmitt, N. (2010). Researching vocabulary: A vocabulary research manual. Springer.
Schmitt, N., & Carter, R. (2004). Formulaic sequences in action. Formulaic sequences: Acquisition, processing and use, 1-22. https://digital.casalini.it/9789027295750
Shahriari Ahmadi, H., Ghonsooly, B., & Hosseini Fatemi, A. (2013). Analyzing research article introductions by Iranian and native English-speaking authors of Applied Linguistics. International Journal of Research, 2(3), 3-18. https://doi.org/10.5861/ijrsll.2012.158
Shin, Y. K. (2018). The Construction of English Lexical Bundles in Context by Native and Nonnative Freshman University Students. English teaching, 73(3), 115-139. https://doi.org/10.15858/engtea.73.3.201809.115
Shirazizadeh, M., & Amirfazlian, R. (2021). Lexical bundles in theses, articles and textbooks of applied linguistics: Investigating intra disciplinary uniformity and variation. Journal of English for Academic Purposes, 49, 100946. https://doi.org/10.1016/j.jeap.2020.100946
Simpson-Vlach, R., & Leicher, S. (2006). The MICASE Handbook. Ann Arbor: The University of Michigan Press.
Staples, S., Egbert, J., Biber, D., & McClair, A. (2013). Formulaic sequences and EAP writing development: Lexical bundles in the TOEFL iBT writing section. Journal of English for academic purposes, 12(3), 214-225. https://doi.org/10.1016/j.jeap.2013.05.002
Stengers, H., Boers, F., Housen, A., & Eyckmans, J. (2011). Formulaic sequences and L2 oral proficiency: Does the type of target language influence the association?
Svensén, B. (2009). A handbook of lexicography: The theory and practice of dictionary-making. Cambridge University Press.
Vaziri, A., Barjesteh, H., & Nasrollahi Mouziraji, A. (2023). Formulaic Sequences in Learners’ Spoken English: A Comparative Corpus-based Study between Native and Non-Native Speakers of English. Iranian Journal of English for Academic Purposes, 12(3), 56-72.
Wang, M., & Zhang, Y. (2021). ‘According to…’: The impact of language background and writing expertise on textual priming patterns of multi-word sequences in academic writing. Journal of English for Specific Purposes, 61, 47-59. https://doi.org/10.1177/0267658320988055
Wei, Y., & Lei, L. (2011). Lexical bundles in the academic writing of advanced Chinese EFL learners. RELC journal, 42(2), 155-166. https://doi.org/10.1177/0033688211407295
Wu, H. (2019). A Corpus-based Analysis of TESOL EFL Students' Use of Logical Connectors in Spoken English. Theory and Practice in Language Studies, 9(6), 625-636. https://dx.doi.org/10.17507/tpls.0906.04
Xu, R., & Wijitsopon, R. (2023). Corpus linguistics and cinematic discourse: Lexical bundles in mainstream film scripts. LEARN Journal: Language Education and Acquisition Research Network, 16(1), 545-574. https://so04.tcithaijo.org/index.php/LEARN/article/view/263456
Yakut, I., & Yuvayapan, F. (2022). Forms and functions of self-repetitions in spoken discourse: A corpus linguistics analysis of L1 and L2 English. Topics in Linguistics, 23(1), 83-96. https://doi.org/10.2478/topling-2022-0007
Yang, B. (2020). A Corpus-based study of synonymous epistemic adverbs perhaps, probably, maybe and possibly. Research Journal of Education, 6(8), 158-168. https://doi.org/10.32861/rje.68.158.168
Young, R. F., & Miller, E. R. (2004). Learning as changing participation: Discourse roles in ESL writing conferences. The Modern Language Journal, 88(4), 519-535. https://doi.org/10.1111/j.0026-7902.2004.t01-16-.x
Zago, R. (2020). Film discourse. In E. Friginal & J. A. Hardy (Eds.), The Routledge handbook of corpus approaches to discourse analysis, 168-182. Routledge.