Research Article

The Development of an Error-tagged Learner Corpus: TELC (Turkish-English Learner Corpus) and its Web-interface

Volume: 35 Number: 2 December 24, 2024
Abstract

Learner corpora, which consist of spoken and written texts by students from different L1 backgrounds, remain rather rare and are not favoured by corpus linguists because they pose computationally hard-to-handle problems; nevertheless, they can benefit both researchers in the field of second language acquisition and language teachers. Growing from this need and considering the potential importance of corpora for language teachers and learners in the Turkish context, our L2 English learner corpus is a further attempt to build an error-tagged learner corpus that particularly scrutinizes lexical errors, which play a key role in the language production of second language learners. Building on Hemchua and Schmitt’s lexical error taxonomy and developed in line with the strict methodological considerations in the literature (e.g., error naming and fixing through several rounds of tagging), the corpus consists of 369 written texts by 231 university students (104,864 words, with 3000+ tagged and fixed errors). The corpus database is provided with a user-friendly web interface, which offers statistical output, modules highlighting lexical errors and their corrected versions, different search options including error types, and an error-tagging add-in for further development. In addition to being a resource-rich website that aims to guide language practitioners and second language learners, it can be considered a platform that applied linguists conducting studies in this line of research can develop further. Finally, thanks to its easy-to-use interface and versatile features, it has the potential to become a reference learner corpus for English as a foreign/second language with the contribution of other universities in Türkiye.

Supporting Institution

TÜBİTAK ARDEB

Project Number

220K289

Ethical Statement

At its meeting on 29/03/2021, the Ankara University Ethics Committee ruled, by decision no. 3/9, that the study is ethically appropriate.

Details

Primary Language

English

Subjects

Corpus Linguistics, Applied Linguistics and Educational Linguistics, Linguistics (Other)

Journal Section

Research Article

Publication Date

December 24, 2024

Submission Date

May 24, 2024

Acceptance Date

October 3, 2024

Published in Issue

Year 2024 Volume: 35 Number: 2

APA
Cangır, H., Uzun, K., Can, T., Oğuz, E., & Kaya, Ö. F. (2024). The Development of an Error-tagged Learner Corpus: TELC (Turkish-English Learner Corpus) and its Web-interface. Dilbilim Araştırmaları Dergisi, 35(2), 279-307. https://doi.org/10.18492/dad.1489654