Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals

Toru Taniguchi*, Akishige Adachi, Shigeki Okawa, Masaaki Honda, Katsuhiko Shirai

*Corresponding author for this work

Research output: Paper › peer-reviewed

Abstract

We developed a method for discriminating speech, musical instruments, and singing voices based on sinusoidal decomposition of audio signals. Although many studies have been conducted, few have addressed the problem of temporal overlapping of sound categories. To cope with this problem, we used sinusoidal segments of variable length as the discrimination units, whereas most traditional work has used fixed-length units. The discrimination is based on the temporal characteristics of the sinusoidal segments. We achieved an average discrimination rate of 71.56% in classifying sinusoidal segments in non-mixed audio data. For time segments, accuracies of 87.9% on non-mixed-category audio data and 66.4% on 2-mixed-category data were achieved. A comparison of the proposed method with an MFCC-based method demonstrated the effectiveness of temporal features and the importance of using both spectral and temporal characteristics.
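The abstract's core idea is to decompose audio into sinusoidal segments of variable length by tracking spectral peaks over time. A minimal sketch of that kind of peak tracking is shown below; this is not the authors' implementation, and the window size, peak threshold, frequency-jump limit, and minimum track length are all illustrative assumptions.

```python
# Sketch (assumed parameters, not the paper's method): pick spectral
# peaks per STFT frame and link them across frames by frequency
# continuity into variable-length sinusoidal tracks.
import numpy as np
from scipy.signal import stft

def sinusoidal_segments(x, fs, nperseg=1024, max_jump_hz=40.0, min_len=5):
    """Return a list of tracks; each track is a list of (frame, freq_hz)."""
    f, t, Z = stft(x, fs=fs, nperseg=nperseg)
    mag = np.abs(Z)
    active, done = [], []          # open tracks / finished tracks
    for j in range(mag.shape[1]):
        col = mag[:, j]
        # local maxima above a frame-relative magnitude threshold
        peaks = [i for i in range(1, len(col) - 1)
                 if col[i] > col[i - 1] and col[i] > col[i + 1]
                 and col[i] > 0.1 * col.max()]
        freqs = [f[i] for i in peaks]
        next_active, used = [], set()
        for tr in active:
            last_f = tr[-1][1]
            # continue with the nearest unused peak within max_jump_hz
            cand = [(abs(fq - last_f), k) for k, fq in enumerate(freqs)
                    if k not in used and abs(fq - last_f) <= max_jump_hz]
            if cand:
                _, k = min(cand)
                used.add(k)
                tr.append((j, freqs[k]))
                next_active.append(tr)
            elif len(tr) >= min_len:   # track ends; keep if long enough
                done.append(tr)
        for k, fq in enumerate(freqs):  # unmatched peaks start new tracks
            if k not in used:
                next_active.append([(j, fq)])
        active = next_active
    done.extend(tr for tr in active if len(tr) >= min_len)
    return done
```

Because tracks end whenever frequency continuity breaks, the resulting segments naturally vary in length, which is the property the paper exploits in place of fixed-length analysis units.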

Original language: English
Pages: 589-592
Number of pages: 4
Publication status: Published - 1 Dec 2005
Event: 9th European Conference on Speech Communication and Technology - Lisbon, Portugal
Duration: 4 Sep 2005 → 8 Sep 2005

Conference

Conference: 9th European Conference on Speech Communication and Technology
Country/Territory: Portugal
City: Lisbon
Period: 05/9/4 → 05/9/8

ASJC Scopus subject areas

  • Engineering (all)
