Abstract
This paper describes a method for automatically converting existing English-Japanese and Japanese-English machine translation dictionaries into English-Japanese transliteration rules and Japanese-English back-transliteration rules for cross-language information retrieval. An existing English-katakana word alignment module, which is part of our own machine translation system, is exploited in generating probabilistic rewriting rules. If our system is allowed to output 15 candidate spellings, it successfully transliterates more than 75% of a set of out-of-vocabulary English words into katakana, and successfully back-transliterates more than 55% of a set of out-of-vocabulary katakana words into English. Moreover, our preliminary cross-language information retrieval experiments, which treat the candidate spellings as a group of synonyms, suggest that our methods can indeed compensate for the failure of machine translation in some cases.
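The abstract does not give the form of the probabilistic rewriting rules, so the following is only a minimal sketch of one way such rules might be estimated and applied, not the authors' implementation. It assumes that the alignment module yields, for each dictionary entry, a list of (English chunk, katakana chunk) pairs; that rule probabilities are plain relative frequencies; and that candidates are ranked by the product of the rule probabilities along their best segmentation. The function names `learn_rules` and `transliterate`, the `max_chunk` limit, and the toy aligned data are all hypothetical.

```python
from collections import Counter, defaultdict
import heapq


def learn_rules(aligned_words):
    """Estimate P(target chunk | source chunk) by relative frequency.

    aligned_words: list of entries, each a list of (source, target) chunk
    pairs. This input shape is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for entry in aligned_words:
        for src, tgt in entry:
            counts[src][tgt] += 1
    rules = {}
    for src, tgts in counts.items():
        total = sum(tgts.values())
        rules[src] = {tgt: n / total for tgt, n in tgts.items()}
    return rules


def transliterate(word, rules, k=15, max_chunk=3):
    """Return up to k (candidate, score) pairs, best first.

    best[i] maps each partial output covering word[:i] to the score of its
    highest-scoring segmentation; only the top k are expanded at each step.
    """
    best = [dict() for _ in range(len(word) + 1)]
    best[0] = {"": 1.0}
    for i in range(len(word)):
        # best[i] is complete here, because it is only fed from positions < i.
        top = heapq.nlargest(k, best[i].items(), key=lambda kv: kv[1])
        for out, p in top:
            for j in range(i + 1, min(len(word), i + max_chunk) + 1):
                for tgt, q in rules.get(word[i:j], {}).items():
                    cand, score = out + tgt, p * q
                    if score > best[j].get(cand, 0.0):
                        best[j][cand] = score
    return heapq.nlargest(k, best[len(word)].items(), key=lambda kv: kv[1])


# Toy aligned entries (hypothetical, not taken from the paper's dictionaries).
aligned = [
    [("me", "メ"), ("mo", "モ")],               # memo
    [("mo", "モ"), ("de", "デ"), ("m", "ム")],  # modem
    [("de", "デ"), ("mo", "モ")],               # demo
    [("mo", "モー"), ("tor", "ター")],          # motor
]
rules = learn_rules(aligned)
candidates = transliterate("demo", rules)
```

Swapping the source and target sides of each aligned pair before calling `learn_rules` gives the corresponding back-transliteration rules under the same assumptions. The returned candidate list is the kind of output that the retrieval experiments treat as a group of synonyms.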
Original language | English |
---|---|
Pages (from-to) | 290-295 |
Number of pages | 6 |
Journal | Proceedings of the IEEE International Conference on Systems, Man and Cybernetics |
Volume | 6 |
Publication status | Published - 2002 |
Externally published | Yes |
Keywords
- Cross-language information retrieval
- Katakana
- Machine translation
- Transliteration
ASJC Scopus subject areas
- Control and Systems Engineering
- Hardware and Architecture