Transitional probability predicts native and non-native use of formulaic sequences

Randy Appel*, Pavel Trofimovich

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

18 Citations (Scopus)


Formulaic sequences (FSs), or prefabricated multi-word structures (e.g., on the other hand), are often difficult to identify objectively, and current corpus-driven methods yield structurally incomplete, overlapping, or overly extended structures of questionable psychological validity and pedagogical usefulness. To address these limitations, this study evaluated transitional probability as a potential metric for improving the identification of FSs. One hundred four-word sequences from the British National Corpus, varying in the transitional probabilities between their words, were presented to native and non-native speakers of English (N = 293) in a sequence completion task (e.g., for the sake ___). Results revealed that applying transitional probability reduces many of the problems associated with current approaches to FS identification and can produce lists of FSs that are more functionally salient and psychologically valid.
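
The abstract does not state how transitional probability was computed. As an illustration only, the sketch below uses one common definition, the forward transitional probability of a word pair, count(w1 w2) / count(w1), estimated from a tokenized corpus. The function names and the toy corpus are hypothetical and are not taken from the study, which drew its sequences from the British National Corpus.

```python
from collections import Counter


def forward_transitional_probabilities(tokens):
    """Map each adjacent pair (w1, w2) to count(w1 w2) / count(w1).

    Assumption: forward transitional probability, with w1 counted only
    where it is followed by another token.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    firsts = Counter(tokens[:-1])
    return {pair: n / firsts[pair[0]] for pair, n in bigrams.items()}


def sequence_transitions(sequence, tp):
    """Return the transitional probability at each word boundary of a sequence."""
    words = sequence.split()
    return [tp.get(pair, 0.0) for pair in zip(words, words[1:])]


# Toy corpus for illustration only (not data from the study).
toy_corpus = "on the other hand we left for the sake of the other team".split()
tp = forward_transitional_probabilities(toy_corpus)

# One value per boundary: on->the, the->other, other->hand
print(sequence_transitions("on the other hand", tp))
```

In a sketch like this, a marked drop in probability at a boundary could be read as a candidate edge of a sequence. Whether the study applied such a criterion is not stated in the abstract.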

Original language: English
Pages (from-to): 24-43
Number of pages: 20
Journal: International Journal of Applied Linguistics (United Kingdom)
Issue number: 1
Publication status: Published - 2017 Mar 1
Externally published: Yes


Keywords

  • corpus-driven research
  • formulaic language
  • formulaic sequences
  • lexical bundles
  • n-grams

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

