The Development of an Error-tagged Learner Corpus: TELC (Turkish-English Learner Corpus) and its Web-interface
Year 2024, pp. 279–307, 24.12.2024
Hakan Cangır
Kutay Uzun
Taner Can
Enis Oğuz
Ömer Faruk Kaya
Learner corpora, which consist of spoken and written texts produced by students from different L1 backgrounds, remain rather rare and are not favoured by corpus linguists because they pose computationally hard-to-handle problems. They can nevertheless benefit both researchers in second language acquisition and language teachers. Responding to this need, and considering the potential importance of corpora for language teachers and learners in the Turkish context, our L2 English learner corpus is a modest attempt to build an error-tagged learner corpus that focuses on lexical errors, which play a key role in the language production of second language learners. The corpus builds on Hemchua and Schmitt’s (2006) lexical error taxonomy and was developed following the strict methodological considerations in the literature (e.g., naming and correcting errors through several rounds of tagging). It consists of 369 written texts by 231 university students (104,864 words, with more than 3,000 tagged and corrected errors). The corpus database is accessible through a user-friendly web interface that offers statistical output, modules highlighting lexical errors alongside their corrected versions, search options that include searching by error type, and an error-tagging add-in for further development. Besides serving as a resource website that guides language practitioners and second language learners, it can be considered a platform that applied linguists working in this line of research can develop further. Finally, thanks to its easy-to-use interface and versatile features, it has the potential to become a reference learner corpus for English as a foreign/second language, with the contribution of other universities in Türkiye.
- Anthony, L. (2023). AntConc (Version 4.2.4) [Computer Software]. Waseda University. Available from
- Berberich, K., & Kleiber, I. (2023). Tools for corpus linguistics.
- Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45(1), 5–35.
- Bley-Vroman, R. (1989). What is the logical problem of foreign language learning? In S. M. Gass & J. Schachter (Eds.), Linguistic perspectives on second language acquisition (pp. 41–68). Cambridge University Press.
- Cangır, H., Uzun, K., Can, T., Küllü, K., Oğuz, E., & Kaya, Ö. M. (2025). Linguistic features and L2 English writing quality: A multidimensional analysis [Manuscript submitted for publication]. AILA Review.
- Cortes, V. (2018). Corpus tools for Writing Teachers. In The TESOL Encyclopedia of English Language Teaching (pp. 1–6). John Wiley & Sons, Ltd.
- Crosthwaite, P. (Ed.). (2024). Corpora for language learning: Bridging the research-practice divide (1st ed.). Routledge.
- Ellis, N. C., & Laporte, N. (2014). Contexts of acquisition: Effects of formal instruction and naturalistic exposure on second language acquisition. In Tutorials in bilingualism (pp. 53-83). Psychology Press.
- Francis, W., & Kučera, H. (1964). Manual of information to accompany a standard corpus of present-day edited American English, for use with digital computers. Brown University.
- Friginal, E. (2013). Developing research report writing skills using corpora. English for Specific Purposes, 32(4), 208–220.
- Gablasova, D., Brezina, V., & McEnery, T. (2017). Exploring learner language through corpora: Comparing and interpreting corpus frequency information. Language Learning, 67(S1), 130–154.
- Gilquin, G. (2015). From design to collection of learner corpora. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 9–34). Cambridge University Press.
- Gilquin, G., & Granger, S. (2015). Learner language. In D. Biber & R. Reppen (Eds.), The Cambridge handbook of English corpus linguistics (pp. 418–436). Cambridge University Press.
- Gilquin, G. (2023). Written learner corpora to inform teaching. In R. R. Jablonkai & E. Csomay (Eds.), The Routledge handbook of corpora and English language teaching and learning (pp. 281–295). Routledge.
- Granger, S. (1993). International Corpus of Learner English. In J. Aarts, P. de Haan, & N. Oostdijk (Eds.), English language corpora: Design, analysis and exploitation (pp. 57–71). Rodopi.
- Granger, S. (2002). A bird’s-eye view of learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition and foreign language teaching (pp. 3–33). John Benjamins.
- Granger, S. (2003). The International Corpus of Learner English: A new resource for foreign language learning and teaching and second language acquisition research. TESOL Quarterly, 37(3), 538–546.
- Granger, S. (2015). The contribution of learner corpora to reference and instructional materials design. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 485-510). Cambridge University Press.
- Granger, S. (2021). Commentary: Have Learner Corpus Research and Second Language Acquisition Finally Met? In B. Le Bruyn & M. Paquot (Eds.), Learner corpus research meets second language acquisition (pp. 243–257). Cambridge University Press.
- Granger, S., Dupont, M., Meunier, F., Naets, H., & Paquot, M. (2020). The International Corpus of Learner English. Version 3. Presses universitaires de Louvain.
- Granger, S., Gilquin, G., & Meunier, F. (2015). Introduction: learner corpus research – past, present and future. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 1–6). Cambridge University Press.
- Hemchua, S., & Schmitt, N. (2006). An analysis of lexical errors in the English compositions of Thai learners. Prospect, 21(3), 3–25.
- Hunston, S. (2002). Corpora in applied linguistics. Cambridge University Press.
- Kaya, F. Ö., Uzun, K., & Cangır, H. (2022). Using corpora for language teaching and assessment in L2: A narrative review. Focus on ELT Journal, 4(3), 46-62.
- Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., & Suchomel, V. (2014). The Sketch Engine: ten years on. Lexicography, 1(1), 7–36.
- Kučera, H., & Francis, W. (1967). Computational analysis of present day American English. Brown University Press.
- Kyle, K. (2016). Measuring syntactic development in L2 writing: Fine grained indices of syntactic complexity and usage-based indices of syntactic sophistication [Doctoral dissertation, Georgia State University]. ScholarWorks @ Georgia State University.
- Kyle, K., Crossley, S. A., & Berger, C. (2018). The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behavior Research Methods, 50, 1030–1046.
- Lee, J. J., Bychkovska, T., & Maxwell, J. D. (2019). Breaking the rules? A corpus-based comparison of informal features in L1 and L2 undergraduate student writing. System, 80, 143-153.
- Lee, S. (2011). Challenges of using corpora in language teaching and learning: Implications for secondary education. Linguistic Research, 28(1), 159–178.
- Leech, G. (1981). Semantics: The study of meaning (2nd ed.). Penguin.
- Liao, Y., & Fukuya, Y. J. (2004). Avoidance of phrasal verbs: The case of Chinese learners of English. Language Learning, 54(2), 193–226.
- Meunier, F. (2020). Introduction to learner corpus research. In The Routledge handbook of second language acquisition and corpora (pp. 23–36). Routledge.
- Moore, J. (2005). Common mistakes at Proficiency ... and how to avoid them. Cambridge University Press.
- Murakami, A., & Alexopoulou, T. (2016). L1 influence on the acquisition order of English grammatical morphemes: A learner corpus study. Studies in Second Language Acquisition, 38(3), 365-401.
- Myles, F. (2005). Interlanguage corpora and second language acquisition research. Second Language Research, 21(4), 373-391.
- Myles, F. (2021). Commentary: An SLA perspective on learner corpus research. In B. Le Bruyn & M. Paquot (Eds.), Learner Corpus Research Meets Second Language Acquisition (pp. 258–273). Cambridge University Press.
- Nesselhauf, N. (2003). The Use of Collocations by Advanced Learners of English and Some Implications for Teaching. Applied Linguistics, 24(2), 223–242.
- O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From Corpus to Classroom: Language Use and Language Teaching. Cambridge University Press.
- Paquot, M., & Granger, S. (2012). Formulaic language in learner corpora. Annual Review of Applied Linguistics, 32, 130–149.
- Paquot, M., Larsson, T., Hasselgård, H., Ebeling, S. O., De Meyere, D., Valentin, L., Laso, N. J., Verdaguer, I., & van Vuuren, S. (2022). The varieties of English for specific purposes database (VESPA): Towards a multi-L1 and multi-register learner corpus of disciplinary writing. Research in Corpus Linguistics, 10(2), 1–15. ricl.10.02.02
- Pérez-Paredes, P. (2022). A systematic review of the uses and spread of corpora and data-driven learning in CALL research during 2011–2015. Computer Assisted Language Learning, 35(1-2), 36–61.
- Schneider, G. (2023). Detecting and analysing learner difficulties using a learner corpus without error tagging. In K. Harrington & P. Ronan (Eds.), Demystifying corpus linguistics for English language teaching (pp. 229–257). Springer International Publishing.
- Selivan, L. (2023). Corpus linguistics and vocabulary teaching. In K. Harrington & P. Ronan (Eds.), Demystifying corpus linguistics for English language teaching (pp. 139–161). Springer International Publishing.
- Sinclair, J. M. (1990). Collins COBUILD English grammar. Collins.
- Thewissen, J. (2013). Capturing L2 accuracy developmental patterns: Insights from an error‐tagged EFL learner corpus. The Modern Language Journal, 97(S1), 77-101.
- Thewissen, J. (2015). Accuracy across proficiency levels: A learner corpus approach. Presses universitaires de Louvain.
- Xiao, R. (2009). How can corpora help in language pedagogy. In Postgraduate Conference in Applied Linguistics, Ningbo, China.
- Xu, Q. (2016). Application of learner corpora to second language learning and teaching: An overview. English Language Teaching, 9(8), 46–52. Available online at
The Development of an Error-Tagged Learner Corpus: TELC (Turkish-English Learner Corpus) and Its Web Interface
Although learner corpora are quite rare and are not preferred by corpus linguists because of the difficulties involved in developing them, such corpora, which consist of spoken and written texts by students from different L1 backgrounds, can benefit both researchers in second language acquisition and language teachers. Starting from this need, and considering the potential importance of corpora for language teachers and learners in the Turkish context, our L2 English learner corpus is an attempt to build an error-tagged learner corpus that examines in particular lexical errors, which play a key role in the language production of second language learners. Based on Hemchua and Schmitt's lexical error taxonomy and developed following the strict methodological considerations in the literature (e.g., error naming and correction through several rounds of tagging), the corpus consists of 369 written texts by 231 university students (104,864 words, more than 3,000 tagged and corrected errors). The corpus database, which has a user-friendly interface, allows users to access statistical output, view lexical errors and their correct versions, and search for different error types within the corpus. In addition, the interface includes an error-tagging add-in that enables further development of the database. Besides being a website that serves as a guiding resource for language teachers and second language learners, TELC can also be regarded as a digital platform that can be developed by applied linguists conducting studies in this field. Finally, thanks to its easy-to-use interface and versatile features, it has the potential to become a reference learner corpus for teaching and learning English as a foreign/second language, with the contribution of other universities in Türkiye.
Ethical Statement
The Ankara University Ethics Committee found the study ethically appropriate in its decision no. 3/9, taken at its meeting of 29/03/2021.