Decision Fusion by Boosting Method for Multi-Modal Voice Activity Detection

Shin'ichi Takeuchi, Takashi Hashiba, Satoshi Tamura, Satoru Hayamizu

Research output: Contribution to conference › Paper › peer-review

Abstract

In this paper, we propose a multi-modal voice activity detection (VAD) system that uses audio and visual information. In multi-modal (speech) signal processing, there are two methods for fusing audio and visual information: concatenating the audio and visual features, or employing audio-only and visual-only classifiers and then fusing the unimodal decisions. We investigate the effectiveness of decision fusion based on the results of AdaBoost. AdaBoost is a machine learning method that constructs an effective classifier by combining weak classifiers. It classifies input data into two classes based on the weighted results of the weak classifiers. In the proposed method, this fusion scheme is applied to the decision fusion of multi-modal VAD. Experimental results show that the proposed method is generally more effective.
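The weighted combination of weak classifiers described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the weak classifiers, features, or training setup, so the example assumes each frame is summarized by a hypothetical pair (audio_score, visual_score) and uses simple threshold stumps on those scores as weak classifiers. The function names and the toy data are likewise invented for illustration.

```python
import math

def stump_predict(stump, x):
    """Weak classifier (assumed form): threshold one modality's score.
    Returns +1 (speech) or -1 (non-speech)."""
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else -polarity

def train_adaboost(X, y, rounds=5):
    """Discrete AdaBoost over threshold stumps.
    X: list of (audio_score, visual_score) per frame; y: labels in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []                          # list of (alpha, stump)
    # Candidate stumps: each observed score as a threshold, both polarities.
    candidates = [(f, x[f], p) for f in (0, 1) for x in X for p in (1, -1)]
    for _ in range(rounds):
        # Choose the stump with the lowest weighted training error.
        best, best_err = None, float("inf")
        for s in candidates:
            err = sum(wi for wi, xi, yi in zip(w, X, y)
                      if stump_predict(s, xi) != yi)
            if err < best_err:
                best, best_err = s, err
        err = min(max(best_err, 1e-10), 1 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)      # weight of this stump
        ensemble.append((alpha, best))
        # Increase the weight of misclassified frames, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(best, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def fuse(ensemble, x):
    """Fused decision: sign of the alpha-weighted vote of the weak classifiers."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy frames (invented): the audio score is unreliable, the visual score is clean.
X = [(0.9, 0.8), (0.3, 0.7), (0.8, 0.9), (0.7, 0.2), (0.2, 0.1), (0.1, 0.3)]
y = [1, 1, 1, -1, -1, -1]
ensemble = train_adaboost(X, y)
decisions = [fuse(ensemble, x) for x in X]
```

On this toy data the audio score alone cannot separate the classes, so boosting puts its weight on a visual stump. With real unimodal classifiers, both modalities would typically receive non-zero weight.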

Original language: English
Publication status: Published - 2010
Externally published: Yes
Event: 2010 International Conference on Auditory-Visual Speech Processing, AVSP 2010 - Hakone, Japan
Duration: 2010 Sept 30 → 2010 Oct 3

Conference

Conference: 2010 International Conference on Auditory-Visual Speech Processing, AVSP 2010
Country/Territory: Japan
City: Hakone
Period: 10/9/30 → 10/10/3

Keywords

  • VAD
  • multi-modal
  • voice activity detection

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
  • Otorhinolaryngology
