TY - GEN
T1 - Leveraging the advantages of associative alignment methods for PB-SMT systems
AU - Yang, Baosong
AU - Lepage, Yves
N1 - Funding Information:
This work was supported by Business Finland (prev. Tekes, 644/31/2015) and the Academy of Finland (315376).
Publisher Copyright:
© 2018, Springer International Publishing AG, part of Springer Nature.
PY - 2018
Y1 - 2018
N2 - Training statistical machine translation systems used to require heavy computation times. It has been shown that approximations in the probabilistic approach could lead to impressive improvements (Fast align). We show that, by leveraging the advantages of the associative approach, we achieve similar, even faster, training times, while keeping comparable BLEU scores. Our contributions are of two types: of the engineering type, by introducing multi-processing both in sampling-based alignment and hierarchical sub-sentential alignment; of the modeling type, by introducing approximations in hierarchical sub-sentential alignment that lead to important reductions in time without affecting the alignments produced. We test and compare our improvements on six typical language pairs of the Europarl corpus.
AB - Training statistical machine translation systems used to require heavy computation times. It has been shown that approximations in the probabilistic approach could lead to impressive improvements (Fast align). We show that, by leveraging the advantages of the associative approach, we achieve similar, even faster, training times, while keeping comparable BLEU scores. Our contributions are of two types: of the engineering type, by introducing multi-processing both in sampling-based alignment and hierarchical sub-sentential alignment; of the modeling type, by introducing approximations in hierarchical sub-sentential alignment that lead to important reductions in time without affecting the alignments produced. We test and compare our improvements on six typical language pairs of the Europarl corpus.
UR - http://www.scopus.com/inward/record.url?scp=85049108485&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049108485&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-93782-3_16
DO - 10.1007/978-3-319-93782-3_16
M3 - Conference contribution
AN - SCOPUS:85049108485
SN - 9783319937816
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 214
EP - 228
BT - Human Language Technology. Challenges for Computer Science and Linguistics - 7th Language and Technology Conference, LTC 2015, Revised Selected Papers
A2 - Vetulani, Zygmunt
A2 - Kubis, Marek
A2 - Mariani, Joseph
PB - Springer Verlag
T2 - 7th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, LTC 2015
Y2 - 27 November 2015 through 29 November 2015
ER -