Multi-angle lipreading using angle classification and angle-specific feature integration

Shinnosuke Isobe, Satoshi Tamura, Satoru Hayamizu, Yuuto Gotoh, Masaki Nose

Research output: Conference contribution

1 Citation (Scopus)

Abstract

Visual speech recognition (VSR), also known as lipreading, has recently been widely studied thanks to advances in deep learning (DL). Most lipreading research focuses only on frontal face images. In real-world scenes, however, a lipreading system should correctly recognize spoken content not only from frontal faces but also from side views. In this paper, we propose a novel lipreading method applicable to faces captured at any angle, using convolutional neural networks (CNNs), one of the key deep-learning techniques. Our method consists of three parts: view classification, feature extraction, and integration. First, we apply angle classification to the input faces. Second, based on the results, we determine the best combination of pre-trained angle-specific feature extraction schemes. Finally, we integrate these features and perform DL-based lipreading. We evaluated our method on the open OuluVS2 dataset, which includes multi-angle audiovisual data, and confirmed that our approach achieved the best performance among conventional and other DL-based lipreading schemes in the phrase classification task.
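
The three-part flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the top-k selection rule, concatenation as the integration step, and every function and type name here (`select_angles`, `lipread`, `Clip`, the toy components) are assumptions made for the example. In the paper each component is a trained network; here they are simple stand-ins.

```python
from typing import Callable, Dict, List

ANGLES = (0, 30, 45, 60, 90)  # the five camera views in OuluVS2, in degrees

Clip = List[List[float]]  # hypothetical: one flattened mouth image per frame
Vector = List[float]


def select_angles(scores: Dict[int, float], top_k: int = 2) -> List[int]:
    """Pick the top_k most likely views (an assumed selection rule)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sorted(ranked[:top_k])


def lipread(
    clip: Clip,
    angle_classifier: Callable[[Clip], Dict[int, float]],
    extractors: Dict[int, Callable[[Clip], Vector]],
    phrase_classifier: Callable[[Vector], int],
    top_k: int = 2,
) -> int:
    # 1) View classification: score each candidate angle for this clip.
    scores = angle_classifier(clip)
    # 2) Feature extraction: run only the selected angle-specific extractors.
    chosen = select_angles(scores, top_k)
    features = [extractors[angle](clip) for angle in chosen]
    # 3) Integration: concatenate the features, then classify the phrase.
    fused = [x for f in features for x in f]
    return phrase_classifier(fused)


# Toy stand-ins for the trained components (assumptions, not real models).
def toy_angle_classifier(clip: Clip) -> Dict[int, float]:
    # Pretend the clip was recorded near 30 degrees.
    return {0: 0.10, 30: 0.60, 45: 0.25, 60: 0.04, 90: 0.01}


def make_toy_extractor(angle: int) -> Callable[[Clip], Vector]:
    def extract(clip: Clip) -> Vector:
        # A 2-dim "feature": the angle tag and the clip's mean pixel value.
        pixels = [p for frame in clip for p in frame]
        return [float(angle), sum(pixels) / len(pixels)]
    return extract


def toy_phrase_classifier(fused: Vector) -> int:
    # Placeholder rule mapping a fused vector to one of 10 assumed phrase ids.
    return int(sum(fused)) % 10


extractors = {angle: make_toy_extractor(angle) for angle in ANGLES}
clip = [[0.0, 1.0], [1.0, 1.0]]
phrase_id = lipread(clip, toy_angle_classifier, extractors, toy_phrase_classifier)
print(phrase_id)
```

Passing the three components in as callables keeps the sketch independent of any particular network architecture; a real system would replace the toy functions with the trained angle classifier, the pre-trained angle-specific feature extractors, and the final recognition network.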

Original language: English
Title of host publication: ICCSPA 2020 - 4th International Conference on Communications, Signal Processing, and their Applications
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728165356
DOI
Publication status: Published - Mar 16 2021
Externally published: Yes
Event: 4th International Conference on Communications, Signal Processing, and their Applications, ICCSPA 2020 - Sharjah, United Arab Emirates
Duration: Mar 16 2021 to Mar 18 2021

Publication series

Name: ICCSPA 2020 - 4th International Conference on Communications, Signal Processing, and their Applications
2021-January

Conference

Conference: 4th International Conference on Communications, Signal Processing, and their Applications, ICCSPA 2020
Country/Territory: United Arab Emirates
City: Sharjah
Period: 21/3/16 to 21/3/18

ASJC Scopus subject areas

  • Signal Processing
  • Computer Networks and Communications
  • Computer Science Applications
