Data collection for mobile audio-visual speech recognition in various environments

Satoshi Tamura, Takumi Seko, Satoru Hayamizu

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

1 Citation (Scopus)

Abstract

This paper introduces our recent activities in audio-visual speech recognition on mobile devices and in data collection in various environments. Audio-visual automatic speech recognition is effective in noisy or real-world conditions, where it enhances the robustness of the speech recognizer and improves recognition accuracy. We have developed an audio-visual speech recognition interface for mobile devices. To evaluate the recognizer and investigate issues related to audio-visual processing on mobile computers, we collected speech data and lip images from 16 subjects in eight conditions involving various audio noises and visual difficulties. Audio-only speech recognition and visual-only lipreading experiments were then conducted. Through these experiments, we identified issues and directions for future work, not only for the construction of audio-visual databases but also for robust audio-visual speech recognition.

Original language: English
Title of host publication: Oriental COCOSDA 2014 - 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment / CASLRE (Conference on Asian Spoken Language Research and Evaluation)
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479970940
Publication status: Published - 2014 Feb 27
Externally published: Yes
Event: 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, Oriental COCOSDA 2014 - Phuket, Thailand
Duration: 2014 Sept 10 to 2014 Sept 12

Publication series

Name: Oriental COCOSDA 2014 - 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment / CASLRE (Conference on Asian Spoken Language Research and Evaluation)

Other

Other: 17th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, Oriental COCOSDA 2014
Country/Territory: Thailand
City: Phuket
Period: 14/9/10 to 14/9/12

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Language and Linguistics
  • Linguistics and Language
