Tools for the production of analogical grids and a resource of N-gram analogical grids in 11 languages

Rashel Fam, Yves Lepage

Research output: Conference contribution

12 citations (Scopus)

Abstract

We release a Python module containing several tools to build analogical grids from words contained in a corpus. The module implements several previously presented algorithms. The tools are language-independent. This permits their use with any language and any writing system. We hope that the tools will ease research in morphology by allowing researchers to automatically obtain structured representations of the vocabulary contained in corpora or linguistic data. We also release analogical grids built on the vocabularies contained in 1,000 corresponding lines of the 11 different language versions of the Europarl corpus v.3. The grids were built on N-grams of different lengths, from words to 6-grams. We hope that the use of structured parallel data will foster research in comparative linguistics.
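As a rough illustration of the kind of input the grids are built on, the sketch below collects N-gram vocabularies of lengths 1 (words) to 6 from a handful of corpus lines. It is not the released module: the function names `word_ngrams` and `ngram_vocabulary` are hypothetical, and it assumes that an N-gram is a sequence of N whitespace-separated words.

```python
from collections import Counter


def word_ngrams(line, n):
    """Return the word N-grams of one line (hypothetical helper).

    Assumption: tokens are separated by whitespace.
    """
    tokens = line.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_vocabulary(lines, max_n=6):
    """Count the N-grams of each length from 1 to max_n over all lines."""
    vocab = {n: Counter() for n in range(1, max_n + 1)}
    for line in lines:
        for n in range(1, max_n + 1):
            vocab[n].update(word_ngrams(line, n))
    return vocab


if __name__ == "__main__":
    sample = ["the cat sat", "the cat ran"]
    vocab = ngram_vocabulary(sample, max_n=3)
    for n, counts in vocab.items():
        print(n, sorted(counts))
```

Each per-length vocabulary could then serve as the set of items from which a grid is built. The grid construction itself is not sketched here.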

Original language: English
Title of host publication: LREC 2018 - 11th International Conference on Language Resources and Evaluation
Editors: Hitoshi Isahara, Bente Maegaard, Stelios Piperidis, Christopher Cieri, Thierry Declerck, Koiti Hasida, Helene Mazo, Khalid Choukri, Sara Goggi, Joseph Mariani, Asuncion Moreno, Nicoletta Calzolari, Jan Odijk, Takenobu Tokunaga
Publisher: European Language Resources Association (ELRA)
Pages: 1060-1066
Number of pages: 7
ISBN (electronic): 9791095546009
Publication status: Published - 2019
Event: 11th International Conference on Language Resources and Evaluation, LREC 2018 - Miyazaki, Japan
Duration: 7 May 2018 to 12 May 2018

Publication series

Name: LREC 2018 - 11th International Conference on Language Resources and Evaluation

Other

Other: 11th International Conference on Language Resources and Evaluation, LREC 2018
Country/Territory: Japan
City: Miyazaki
Period: 7 May 2018 to 12 May 2018

ASJC Scopus subject areas

  • Linguistics and Language
  • Education
  • Library and Information Sciences
  • Language and Linguistics
