Automatic indexing of multimedia content by integration of audio, spoken language, and visual information

Katsutoshi Ohtsuki, Katsuji Bessho, Yoshihiro Matsuo, Shoichi Matsunaga, Yoshihiko Hayashi

Research output: Conference contribution

7 Citations (Scopus)

Abstract

This paper describes an automatic multimedia content indexing system that includes acoustic segmentation, automatic speech recognition, topic segmentation, and video indexing features. The system is intended for indexing of multimedia news programs. Speech segments extracted from news content are delivered to the speech recognition module. The speech recognition result is segmented into topics using a segmentation algorithm based on word conceptual vectors. The indexing results derived from audio and speech information are integrated with video indexing results to extract the story structure. Experimental results show that topic segmentation using word conceptual vectors is superior to the conventional method using local word co-occurrence frequencies, and that the integrated segmentation provides better news story structures than would be possible with any single type of information.
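The topic segmentation step can be illustrated with a minimal sketch. The abstract does not give the paper's actual algorithm, so everything here is an assumption made for illustration: each recognized word is mapped to a small hypothetical concept vector, the mean vectors of the windows before and after each position are compared by cosine similarity, and a topic boundary is placed at a local similarity minimum that falls below a threshold. The names `segment_topics`, `window`, and `threshold`, and the toy vectors, are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def segment_topics(words, word_vectors, window=3, threshold=0.5):
    """Return indices into `words` where a new topic is assumed to start.

    Assumption: a boundary is a local minimum of the similarity between
    the `window` words before a position and the `window` words from it
    onward, provided that minimum is below `threshold`.
    """
    # Keep only words that have a vector, remembering their original index.
    known = [(i, word_vectors[w]) for i, w in enumerate(words) if w in word_vectors]
    vecs = [v for _, v in known]

    # Similarity between the left and right windows at each candidate position.
    scores = []
    for pos in range(window, len(vecs) - window + 1):
        left = mean_vector(vecs[pos - window:pos])
        right = mean_vector(vecs[pos:pos + window])
        scores.append((pos, cosine(left, right)))

    boundaries = []
    for k, (pos, sim) in enumerate(scores):
        prev_sim = scores[k - 1][1] if k > 0 else math.inf
        next_sim = scores[k + 1][1] if k + 1 < len(scores) else math.inf
        if sim < threshold and sim < prev_sim and sim <= next_sim:
            boundaries.append(known[pos][0])
    return boundaries


# Toy data (hypothetical): two concept dimensions, "politics" and "sports".
toy_vectors = {
    "election": [1.0, 0.0], "vote": [0.9, 0.1],
    "senate": [1.0, 0.0], "party": [0.9, 0.1],
    "goal": [0.0, 1.0], "match": [0.1, 0.9],
    "striker": [0.0, 1.0], "league": [0.1, 0.9],
}
toy_words = ["election", "vote", "senate", "party",
             "goal", "match", "striker", "league"]

print(segment_topics(toy_words, toy_vectors))
```

On this toy input the similarity dips where the vocabulary shifts from politics to sports, so the reported boundary is the index of the first sports word. A real system would tune the window size and threshold on held-out data and would derive the concept vectors from a corpus rather than setting them by hand.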

Original language: English
Title of host publication: 2003 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 601-606
Number of pages: 6
ISBN (Electronic): 0780379802, 9780780379800
Publication status: Published - 2003
Externally published: Yes
Event: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003 - St. Thomas, United States
Duration: 30 Nov 2003 to 4 Dec 2003

Publication series

Name: 2003 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003

Other

Other: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003
Country/Territory: United States
City: St. Thomas
Period: 2003/11/30 to 2003/12/4

ASJC Scopus subject areas

  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
