Abstract
This paper explores the use of an external (i.e., non-target) document collection in cross-language information retrieval (CLIR) based on machine translation (MT). In our CLIR and monolingual IR experiments using an external target language collection, we show that parallel pseudo-relevance feedback is comparable to collection enrichment. In our CLIR experiments using an external source language collection, we show that context-sensitive translation of pre-translation expansion terms outperforms word-by-word (or context-free) translation on average. Moreover, we show that the combination of context-sensitive translation with pseudo-relevance feedback significantly outperforms both the corresponding context-free combination and pseudo-relevance feedback alone. Thus, context-sensitive translation for pre-translation expansion is probably superior to context-free translation.
Original language | English |
---|---|
Pages (from-to) | 284-289 |
Number of pages | 6 |
Journal | Proceedings of the IEEE International Conference on Systems, Man and Cybernetics |
Volume | 6 |
Publication status | Published - 2002 |
Externally published | Yes |
Keywords
- Cross-language information retrieval
- External document collections
- Machine translation
- Pseudo-relevance feedback
ASJC Scopus subject areas
- Control and Systems Engineering
- Hardware and Architecture