Applying lexical sophistication models to wordlist development: A proof-of-concept study

Christopher Nicklin*, Daniel Bailey, Stuart McLean, Young Ae Kim, Hyeonah Kang, Joseph P. Vitta

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Language teaching stakeholders generally rely on frequency-derived wordlists to determine words for pedagogical purposes. However, words that are instinctively easier for many learners, such as “pizza”, occur less frequently in reference corpora than words that might be considered more difficult, such as “physics”. Furthermore, research demonstrates that modeling frequency alongside other lexical sophistication variables predicts word difficulty better than frequency alone. This study constitutes a proof of concept, the concept being that a lexical sophistication-based approach to wordlist construction can produce lists that outperform frequency as word difficulty predictors. The method resulted in lexical sophistication-derived difficulty scores for 14,054 of the 20,000 most frequent Corpus of Contemporary American English lemmas. When compared with other commonly used wordlists, these scores successfully addressed the “pizza/physics” problem in that “pizza” was ranked easier than “physics”, and they also displayed larger correlations with word difficulty than other lists across two linguistic domains. More importantly, the scores also performed comparably to a knowledge-based vocabulary list, but contained almost three times as many lemmas for a fraction of the time and financial costs. We envisage that the present study's methodology can be used by researchers and language teaching stakeholders to create bespoke wordlists for a range of contexts.

Original language: English
Article number: 100175
Journal: Research Methods in Applied Linguistics
Volume: 4
Issue number: 1
Publication status: Published - 2025 Apr

Keywords

  • Frequency
  • Lexical sophistication
  • Vocabulary
  • Word difficulty
  • Wordlists

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Linguistics and Language
