A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large-scale data

Naohiro Tawara*, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi

*Corresponding author for this work

Research output: Article, peer-reviewed

1 citation (Scopus)

Abstract

An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization, which makes it possible to estimate the number of speakers. For this purpose, a non-parametric Bayesian modeling framework is implemented with Markov chain Monte Carlo and incorporated into the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet process mixture model (UO-DPMM). The paper demonstrates that UO-DPMM can be applied successfully to large-scale data and outperforms conventional hierarchical agglomerative clustering, especially for large numbers of utterances.
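The abstract does not spell out the sampler, so the following is only a rough sketch of the general idea: collapsed Gibbs sampling for a Dirichlet process mixture, where the number of clusters is inferred rather than fixed in advance. It is not the UO-DPMM from the paper. As simplifying assumptions, each "utterance" is reduced to a single scalar, each cluster is a one-dimensional Gaussian with known variance, and the names `dpmm_gibbs`, `predictive`, `alpha`, `sigma2`, `mu0` and `tau2` are hypothetical choices made for this illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predictive(x, n, total, sigma2, mu0, tau2):
    """Posterior predictive density of x for a cluster that currently
    holds n points whose sum is total (prior on the mean: N(mu0, tau2))."""
    precision = 1.0 / tau2 + n / sigma2
    post_mean = (mu0 / tau2 + total / sigma2) / precision
    return normal_pdf(x, post_mean, 1.0 / precision + sigma2)


def dpmm_gibbs(data, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=100.0, sweeps=50, seed=0):
    """Toy collapsed Gibbs sampler (assumed setup, not the paper's model).
    Returns one cluster label per data point after the final sweep."""
    rng = random.Random(seed)
    labels = [0] * len(data)      # start with every point in one cluster
    counts = {0: len(data)}       # cluster id -> number of members
    sums = {0: sum(data)}         # cluster id -> sum of member values
    next_id = 1                   # id reserved for a newly opened cluster
    for _ in range(sweeps):
        for i, x in enumerate(data):
            # Remove point i from its current cluster.
            k = labels[i]
            counts[k] -= 1
            sums[k] -= x
            if counts[k] == 0:
                del counts[k], sums[k]
            # Existing clusters: weight = size * predictive density.
            ids = list(counts)
            weights = [counts[c] * predictive(x, counts[c], sums[c], sigma2, mu0, tau2)
                       for c in ids]
            # New cluster: weight = alpha * prior predictive density.
            ids.append(next_id)
            weights.append(alpha * predictive(x, 0, 0.0, sigma2, mu0, tau2))
            choice = rng.choices(ids, weights=weights)[0]
            if choice == next_id:
                counts[choice] = 0
                sums[choice] = 0.0
                next_id += 1
            counts[choice] += 1
            sums[choice] += x
            labels[i] = choice
    return labels


if __name__ == "__main__":
    # Two well-separated synthetic groups standing in for two speakers.
    toy = [-10.0 + 0.1 * i for i in range(5)] + [10.0 + 0.1 * i for i in range(5)]
    print(dpmm_gibbs(toy, sweeps=20, seed=1))
```

In this sketch the concentration parameter `alpha` controls how readily a new cluster is opened, which is the mechanism that lets the cluster count grow with the data. A realistic speaker-clustering system would replace the scalar Gaussian with a model over acoustic feature sequences.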

Original language: English
Journal: APSIPA Transactions on Signal and Information Processing
Volume: 4
Publication status: Published - 28 Oct 2015

ASJC Scopus subject areas

  • Signal Processing
  • Information Systems
