Improving sampling-based alignment by investigating the distribution of N-grams in phrase translation tables

Juan Luo*, Adrien Lardilleux, Yves Lepage

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper describes an approach to improving the performance of sampling-based multilingual alignment on translation tasks by investigating the distribution of n-grams in the translation tables. The approach consists of enforcing the alignment of n-grams. The quality of the phrase translation tables output by this approach is compared with that of the tables output by MGIZA++ in statistical machine translation tasks. Significant improvements for this approach are reported. In addition, merging translation tables is shown to outperform state-of-the-art techniques.

Original language: English
Title of host publication: PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation
Pages: 150-159
Number of pages: 10
Publication status: Published - 2011
Event: 25th Pacific Asia Conference on Language, Information and Computation, PACLIC 25, Singapore
Duration: 2011 Dec 16 to 2011 Dec 18

Publication series

Name: PACLIC 25 - Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

Conference

Conference: 25th Pacific Asia Conference on Language, Information and Computation, PACLIC 25
Country/Territory: Singapore
Period: 11/12/16 to 11/12/18

Keywords

  • Alignment
  • Phrase translation table
  • Statistical machine translation

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)
