Introducing EM-FT for Manipuri-English Neural Machine Translation

Rudali Huidrom, Yves Lepage

Research output: Conference contribution

Abstract

This paper introduces pretrained word embeddings for Manipuri, a low-resourced Indian language. The pretrained word embeddings, based on fastText, are capable of handling the highly agglutinative language Manipuri (mni). We then perform machine translation (MT) experiments using neural network (NN) models. In this paper, we confirm the following observations. First, the Transformer architecture with the fastText word embedding model EM-FT achieves better BLEU scores than the same architecture without it in all the NMT experiments. Second, adding more training data from a domain different from that of the test data negatively impacts translation accuracy. The resources reported in this paper are made available in the ELRA catalogue to help the low-resourced languages community with MT/NLP tasks.

Original language: English
Title of host publication: 6th Workshop on Indian Language Data
Subtitle of host publication: Resources and Evaluation, WILDRE 2022 - held in conjunction with the International Conference on Language Resources and Evaluation, LREC 2022 - Proceedings
Editors: Girish Nath Jha, Sobha Lalitha Devi, Kalika Bali, Atul Kr. Ojha
Publisher: European Language Resources Association (ELRA)
Pages: 1-6
Number of pages: 6
ISBN (electronic): 9791095546870
Publication status: Published - 2022
Event: 6th Workshop on Indian Language Data: Resources and Evaluation, WILDRE 2022 - Marseille, France
Duration: 20 Jun 2022 → …

Publication series

Name: 6th Workshop on Indian Language Data: Resources and Evaluation, WILDRE 2022 - held in conjunction with the International Conference on Language Resources and Evaluation, LREC 2022 - Proceedings

Conference

Conference: 6th Workshop on Indian Language Data: Resources and Evaluation, WILDRE 2022
Country/Territory: France
City: Marseille
Period: 20 Jun 2022 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Library and Information Sciences
  • Linguistics and Language
