A feature parameter space called PRPG (Probability Ratios between Phoneme Group pairs) is used for speaker-adaptive phoneme recognition. The conversion into PRPG coordinates is performed by neural networks. Each output node of a network represents the a posteriori probability of a phoneme group, so distance in the PRPG coordinate system corresponds directly to a difference in likelihood. A region that carries the same information for speech recognition is compressed into a single point. Moreover, by the definition of the coordinate system, the axes have the same meaning for different speakers, so speaker adaptation can be performed easily without trajectory mapping. The experimental results show that the scores of speaker-adaptive recognition in the PRPG domain are always superior to those of speaker-dependent recognition in the spectral domain.
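As a rough illustration of the idea, the sketch below turns a vector of phoneme-group posteriors (such as a network's output nodes might produce) into one coordinate per phoneme-group pair. The abstract does not give the exact definition, so the use of a logarithmic ratio, the function name `prpg_coordinates`, the clipping constant `eps`, and the example posteriors are all assumptions made for illustration, not the authors' formulation.

```python
import math
from itertools import combinations


def prpg_coordinates(posteriors, eps=1e-12):
    """Map phoneme-group posteriors to pairwise log probability ratios.

    Assumption: each coordinate is log(p_i / p_j) for a pair (i, j) with i < j.
    The source only states that the axes are probability ratios between
    phoneme-group pairs; the log form is chosen here for illustration.
    """
    # Clip to avoid log(0) when a posterior is exactly zero.
    clipped = [max(p, eps) for p in posteriors]
    return [
        math.log(clipped[i]) - math.log(clipped[j])
        for i, j in combinations(range(len(clipped)), 2)
    ]


# Hypothetical posteriors for three phoneme groups.
coords = prpg_coordinates([0.7, 0.2, 0.1])
```

With K phoneme groups this yields K(K-1)/2 coordinates. Under the log-ratio assumption, equal posteriors for two groups map to 0 on the corresponding axis regardless of the speaker, which is one way to read the claim that the axes carry the same meaning across speakers.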
|Publication status||Published - 1992|
|Event||2nd International Conference on Spoken Language Processing, ICSLP 1992 - Banff, Canada|
Duration: 13 Oct 1992 → 16 Oct 1992