TY - GEN
T1 - Yet another symmetrical & real-time word alignment method
T2 - 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016
AU - Wang, Hao
AU - Lepage, Yves
N1 - Funding Information:
This work is supported in part by the China Scholarship Council (CSC) under CSC Grant No. 201406890026. We also thank the anonymous reviewers for their insightful comments.
PY - 2016
Y1 - 2016
N2 - Symmetrization of word alignments is a fundamental issue in statistical machine translation (SMT). In this paper, we describe a novel reformulation of the Hierarchical Subsentential Alignment (HSSA) method using the F-measure. Starting with a soft alignment matrix, we use the F-measure to recursively split the matrix into two soft alignment submatrices. A direction is chosen at the same time on the basis of Inversion Transduction Grammar (ITG). In other words, our method reduces word alignment to recursive segmentation in a bipartite graph, which is simple and easy to implement. It can be considered an alternative to the grow-diag-final-and heuristic. We show its application to phrase-based SMT systems in combination with state-of-the-art approaches. In addition, when fed with word-to-word associations, it can also serve as a real-time word aligner. Our experiments show that, given a reliable lexical translation table, this simple method can yield results comparable to those of state-of-the-art approaches.
AB - Symmetrization of word alignments is a fundamental issue in statistical machine translation (SMT). In this paper, we describe a novel reformulation of the Hierarchical Subsentential Alignment (HSSA) method using the F-measure. Starting with a soft alignment matrix, we use the F-measure to recursively split the matrix into two soft alignment submatrices. A direction is chosen at the same time on the basis of Inversion Transduction Grammar (ITG). In other words, our method reduces word alignment to recursive segmentation in a bipartite graph, which is simple and easy to implement. It can be considered an alternative to the grow-diag-final-and heuristic. We show its application to phrase-based SMT systems in combination with state-of-the-art approaches. In addition, when fed with word-to-word associations, it can also serve as a real-time word aligner. Our experiments show that, given a reliable lexical translation table, this simple method can yield results comparable to those of state-of-the-art approaches.
UR - http://www.scopus.com/inward/record.url?scp=85015821312&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015821312&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85015821312
T3 - Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016
SP - 143
EP - 152
BT - Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation, PACLIC 2016
A2 - Park, Jong C.
A2 - Chung, Jin-Woo
PB - Institute for the Study of Language and Information
Y2 - 28 October 2016 through 30 October 2016
ER -