Abstract
An automatic labeling technique for known speech samples is proposed for constructing a fine speech database. A word (or sentence) is represented by a phonetic network that covers the acoustic variation contained in utterances of that word (or sentence). An input speech sample is segmented using its parameter pattern dynamics and labeled with the optimal phonetic label (called APSEG) sequence by matching the segment sequence to the generated phonetic network using constrained dynamic programming. The feasibility of the method is confirmed by applying it to a word set containing 53 city names.
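The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual constraints, distance measure, or APSEG label inventory, so everything below is an assumption made for illustration: the network is a small directed graph of hypothetical labels, each segment is reduced to a single number, the local distance is an absolute difference, and the only constraint is that a path either stays in the current node or advances along one edge per segment (a Viterbi-style alignment).

```python
import math


def label_segments(segments, nodes, edges, starts, finals, dist):
    """Align a segment sequence to a phonetic network by dynamic programming.

    nodes:  dict mapping node id -> label (hypothetical labels, not APSEG)
    edges:  list of (from_node, to_node) pairs
    starts: node ids where a path may begin
    finals: node ids where a path may end
    dist:   function (segment, label) -> local cost
    Returns (total_cost, list of labels, one per segment).
    """
    preds = {n: [] for n in nodes}
    for a, b in edges:
        preds[b].append(a)

    T = len(segments)
    best = [{n: math.inf for n in nodes} for _ in range(T)]
    back = [{n: None for n in nodes} for _ in range(T)]

    # The first segment must be explained by an entry node.
    for n in starts:
        best[0][n] = dist(segments[0], nodes[n])

    for t in range(1, T):
        for n in nodes:
            # Assumed constraint: stay in the same node or advance one edge.
            candidates = [n] + preds[n]
            prev = min(candidates, key=lambda p: best[t - 1][p])
            if best[t - 1][prev] < math.inf:
                best[t][n] = best[t - 1][prev] + dist(segments[t], nodes[n])
                back[t][n] = prev

    end = min(finals, key=lambda n: best[T - 1][n])
    if best[T - 1][end] == math.inf:
        raise ValueError("no path through the network fits the segments")

    # Trace back the optimal node sequence.
    path = [end]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[T - 1][end], [nodes[n] for n in path]


# Toy network with two alternative pronunciations: k-a-i or k-o-i.
nodes = {0: "k", 1: "a", 2: "o", 3: "i"}
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
prototypes = {"k": 0.0, "a": 1.0, "o": 2.0, "i": 3.0}  # made-up values


def dist(segment, label):
    return abs(segment - prototypes[label])


cost, labels = label_segments([0.1, 1.9, 2.1, 3.0], nodes, edges, [0], [3], dist)
print(labels)  # ['k', 'o', 'o', 'i']
```

In this toy case the two middle segments lie closer to the "o" prototype, so the alignment selects the k-o-i branch of the network and assigns "o" to both segments.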
Original language | English |
---|---|
Pages (from-to) | 30-37 |
Number of pages | 8 |
Journal | Denshi Gijutsu Sogo Kenkyusho Iho/Bulletin of the Electrotechnical Laboratory |
Volume | 52 |
Issue number | 3 |
Publication status | Published - 1988 |
Externally published | Yes |
ASJC Scopus subject areas
- Condensed Matter Physics
- Electrical and Electronic Engineering