TY - JOUR
T1 - Morphological predictability of unseen words using computational analogy
AU - Fam, Rashel
AU - Lepage, Yves
N1 - Funding Information:
This work was supported by JSPS Grant Number 15K00317 (Kakenhi C), entitled "Language productivity: efficient extraction of productive analogical clusters and their evaluation using statistical machine translation".
Publisher Copyright:
Copyright © 2016 for this paper by its authors.
PY - 2016
Y1 - 2016
N2 - We address the problem of predicting unseen words by relying on the organization of the vocabulary of a language as exhibited by paradigm tables. We present a pipeline to automatically produce paradigm tables from all the words contained in a text. We measure how many unseen words from an unseen test text can be predicted using the paradigm tables obtained from a training text. Experiments are carried out in several languages to compare the morphological richness of languages, and also the richness of the vocabulary of different authors.
KW - Paradigm tables
KW - Unseen words
KW - Word predictability
UR - http://www.scopus.com/inward/record.url?scp=85017371499&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85017371499&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85017371499
SN - 1613-0073
VL - 1815
SP - 51
EP - 60
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 24th International Conference on Case-Based Reasoning Workshops, ICCBR-WS 2016
Y2 - 31 October 2016 through 2 November 2016
ER -