Evaluating NLP models with written and spoken L2 samples

Kristopher Kyle*, Masaki Eguchi

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Natural language processing tools such as part-of-speech taggers and syntactic parsers are increasingly used in studies of second language (L2) proficiency and development. However, relatively little work has focused on reporting the accuracy of these tools or optimizing their performance in L2 contexts. While some studies reference the published overall accuracy of a particular tool or include a small-scale accuracy analysis, very few (if any) studies provide a comprehensive account of the performance of taggers and parsers across a range of written and spoken registers. In this study, we provide a large-scale accuracy analysis of popular taggers and parsers across L1 and L2 written and spoken texts, using both default and L2-optimized models. Accuracy is examined both at the feature level (e.g., identifying adjective-noun relationships) and at the text level (e.g., mean mutual information scores). The results highlight the strengths and weaknesses of these tools.
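The text-level measure mentioned above (a mean mutual information score) can be illustrated with a minimal sketch. This is not the authors' implementation; it simply computes pointwise mutual information (PMI) over adjacent-word bigrams in a toy token list, using the standard definition PMI(x, y) = log2(p(x, y) / (p(x)·p(y))), and averages the scores. Real analyses would instead use bigrams extracted from tagger/parser output (e.g., adjective-noun pairs) and frequencies from a reference corpus.

```python
import math
from collections import Counter


def mean_pmi(tokens):
    """Mean pointwise mutual information over adjacent-word bigrams.

    Illustrative toy measure: unigram and bigram probabilities are
    estimated from the token list itself, not a reference corpus.
    """
    bigrams = list(zip(tokens, tokens[1:]))
    uni_counts = Counter(tokens)
    bi_counts = Counter(bigrams)
    n_uni = len(tokens)
    n_bi = len(bigrams)

    pmis = []
    for (w1, w2), count in bi_counts.items():
        p_xy = count / n_bi          # joint probability of the bigram
        p_x = uni_counts[w1] / n_uni  # marginal probability of word 1
        p_y = uni_counts[w2] / n_uni  # marginal probability of word 2
        pmis.append(math.log2(p_xy / (p_x * p_y)))
    return sum(pmis) / len(pmis)


print(mean_pmi("the cat sat on the mat".split()))
```

In practice, text-level indices like this depend directly on the quality of the upstream tagging and parsing, which is why tool accuracy matters for the downstream scores.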

Original language: English
Article number: 100120
Journal: Research Methods in Applied Linguistics
Volume: 3
Issue number: 2
DOIs
Publication status: Published - Aug 2024

Keywords

  • L2 speaking
  • L2 writing
  • Learner corpus research
  • Linguistic analysis
  • Natural language processing

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Linguistics and Language
