Yet another symmetrical & real-time word alignment method: Hierarchical sub-sentential alignment using F-measure

Hao Wang, Yves Lepage

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

Symmetrization of word alignments is a fundamental issue in statistical machine translation (SMT). In this paper, we describe a novel reformulation of the Hierarchical Sub-sentential Alignment (HSSA) method using the F-measure. Starting with a soft alignment matrix, we use the F-measure to recursively split the matrix into two soft alignment submatrices. A direction is chosen at the same time on the basis of Inversion Transduction Grammar (ITG). In other words, our method reduces word alignment to recursive segmentation in a bipartite graph, which is simple and easy to implement. It can be considered an alternative to the grow-diag-final-and heuristic. We show its application to phrase-based SMT systems combined with state-of-the-art approaches. In addition, when fed with word-to-word associations, it can also serve as a real-time word aligner. Our experiments show that, given a reliable lexicon translation table, this simple method can yield results comparable to state-of-the-art approaches.
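The recursive splitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the block score used here (2 × block mass divided by the sum of its row mass and column mass within the parent block) is an assumed, plausible F-measure, and the names `hssa`, `f_measure`, and `block_sum` are hypothetical. At each step the sketch tries every split point and both ITG directions (straight and inverted) and recurses on the best-scoring pair of submatrices.

```python
def block_sum(m, r0, r1, c0, c1):
    """Total association mass in rows [r0, r1) x columns [c0, c1)."""
    return sum(m[r][c] for r in range(r0, r1) for c in range(c0, c1))


def f_measure(m, r0, r1, c0, c1, R0, R1, C0, C1):
    """Assumed F-measure of a sub-block inside its parent block.

    precision = block mass / mass of its rows within the parent
    recall    = block mass / mass of its columns within the parent
    Their harmonic mean simplifies to 2 * block / (rows + columns).
    """
    inside = block_sum(m, r0, r1, c0, c1)
    denom = block_sum(m, r0, r1, C0, C1) + block_sum(m, R0, R1, c0, c1)
    return 2.0 * inside / denom if denom > 0 else 0.0


def _split(m, R0, R1, C0, C1, out):
    # Stop when a single source word or a single target word remains.
    if R1 - R0 == 1 or C1 - C0 == 1:
        out.append(((R0, R1), (C0, C1)))
        return
    best = None
    for i in range(R0 + 1, R1):
        for j in range(C0 + 1, C1):
            parent = (R0, R1, C0, C1)
            # Straight: top-left with bottom-right.
            straight = (f_measure(m, R0, i, C0, j, *parent)
                        + f_measure(m, i, R1, j, C1, *parent))
            # Inverted: top-right with bottom-left.
            inverted = (f_measure(m, R0, i, j, C1, *parent)
                        + f_measure(m, i, R1, C0, j, *parent))
            for score, kind in ((straight, "straight"), (inverted, "inverted")):
                if best is None or score > best[0]:
                    best = (score, i, j, kind)
    _, i, j, kind = best
    if kind == "straight":
        _split(m, R0, i, C0, j, out)
        _split(m, i, R1, j, C1, out)
    else:
        _split(m, R0, i, j, C1, out)
        _split(m, i, R1, C0, j, out)


def hssa(m):
    """Return aligned (source span, target span) pairs for soft matrix m."""
    out = []
    _split(m, 0, len(m), 0, len(m[0]), out)
    return out


if __name__ == "__main__":
    # Toy 2x2 matrix with a monotone (diagonal) association pattern.
    print(hssa([[0.9, 0.1], [0.1, 0.9]]))
```

The exhaustive search over split points keeps the sketch short; a practical aligner would typically precompute cumulative sums so that each block mass is obtained in constant time.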

Original language: English
Title of host publication: Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016
Editors: Jong C. Park, Jin-Woo Chung
Publisher: Institute for the Study of Language and Information
Pages: 143-152
Number of pages: 10
ISBN (Electronic): 9788968174285
Publication status: Published - 2016
Event: 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016 - Seoul, Korea, Republic of
Duration: 2016 Oct 28 to 2016 Oct 30

Publication series

Name: Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016

Other

Other: 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016
Country/Territory: Korea, Republic of
City: Seoul
Period: 16/10/28 to 16/10/30

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)
  • Information Systems
