Abstract
In this paper, we explore the contribution of two Arabic morphological analyzers used as preprocessing tools for statistical machine translation. Similar investigations have already been reported for morphologically rich languages such as German, Turkish and Arabic. Here, we focus on Arabic and mainly discuss the use of the G-LexAr analyzer. A preliminary experiment was designed to choose the most promising translation system among the three G-LexAr-based systems; we concluded that the systems are equivalent. Nevertheless, we decided to use the lemmatized output of G-LexAr and to submit its translations as the primary run for the BTEC AE track. The results showed that the G-LexAr output degrades translation quality compared to the basic SMT system trained on the un-analyzed corpus.
Original language | English |
---|---|
Pages | 59-65 |
Number of pages | 7 |
Publication status | Published - 2010 |
Externally published | Yes |
Event | 7th International Workshop on Spoken Language Translation, IWSLT 2010 - Paris, France. Duration: 2 Dec 2010 → 3 Dec 2010 |
Conference
Conference | 7th International Workshop on Spoken Language Translation, IWSLT 2010 |
---|---|
Country/Territory | France |
City | Paris |
Period | 2 Dec 2010 → 3 Dec 2010 |
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Linguistics and Language