Acoustic features for estimation of perceptional similarity

Yoshihiro Adachi*, Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

*Corresponding author for this work

Research output: Conference contribution

Abstract

This paper examines acoustic features for estimating the perceptional similarity between speech samples. First, we extract several acoustic features that reflect speaker personality from the speech of 36 persons. Second, we calculate the distance between the extracted features using a Gaussian Mixture Model (GMM) or Dynamic Time Warping (DTW), and then sort the speech samples by this physical similarity. Separately, subjects sort the same samples by perceptional similarity, which yields a second permutation. We evaluate the physical features with Spearman's rank correlation coefficient between the two permutations. The results show that the DTW distance with high STRAIGHT Cepstrum is an optimum feature for estimating perceptional similarity.
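The evaluation pipeline summarized in the abstract (DTW distance, ranking by distance, Spearman's rank correlation against a perceptual ranking) can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean local cost, the unconstrained warping path, the tie-free Spearman formula, the function names, and the toy one-dimensional "feature" sequences are all assumptions made for illustration. The abstract does not specify these details, and no STRAIGHT analysis is performed here.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of feature vectors.

    Assumption: Euclidean local cost and the basic step pattern
    (insertion, deletion, match) with no path constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def ranks(values):
    """Rank positions (1 = smallest value). Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(rank_x, rank_y):
    """Spearman's rank correlation for two tie-free rankings."""
    n = len(rank_x)
    d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical toy data: one reference utterance and three candidates,
# each a sequence of 1-dimensional feature vectors.
reference = [[0.0], [1.0], [2.0]]
candidates = [
    [[0.0], [1.0], [1.0], [2.0]],  # same contour, one frame repeated
    [[0.5], [1.5], [2.5]],         # slightly shifted
    [[3.0], [4.0], [5.0]],         # far away
]

distances = [dtw_distance(reference, c) for c in candidates]
physical_ranks = ranks(distances)

# Hypothetical ranking by a listener (1 = most similar to the reference).
perceptual_ranks = [1, 3, 2]

print(distances)                                       # [0.0, 1.5, 9.0]
print(physical_ranks)                                  # [1, 2, 3]
print(spearman_rho(physical_ranks, perceptual_ranks))  # 0.5
```

A coefficient near 1 would indicate that the physical ordering closely matches the perceptual one. With real data containing tied distances, a tie-aware implementation would be a safer choice than the simplified formula used here.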

Original language: English
Title of host publication: Advances in Multimedia Information Processing - PCM 2007 - 8th Pacific Rim Conference on Multimedia, Proceedings
Publisher: Springer Verlag
Pages: 306-314
Number of pages: 9
ISBN (Print): 9783540772545
DOI
Publication status: Published - 2007
Externally published: Yes
Event: 8th Pacific-Rim Conference on Multimedia, PCM 2007 - Hong Kong, Hong Kong
Duration: 2007 Dec 11 to 2007 Dec 14

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4810 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Pacific-Rim Conference on Multimedia, PCM 2007
Country/Territory: Hong Kong
City: Hong Kong
Period: 07/12/11 to 07/12/14

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
