Lipreading using deep bottleneck features for optical and depth images

Satoshi Tamura, Koichi Miyazaki, Satoru Hayamizu

Research output: Paper, peer-reviewed

Abstract

This paper investigates a lipreading scheme that employs optical and depth modalities together with deep bottleneck features. Optical and depth data are captured by a Microsoft Kinect v2, and an appearance-based feature set is computed for each modality. Each basic feature set is then converted into deep bottleneck features using a deep neural network that contains a narrow bottleneck layer. Multi-stream hidden Markov models are used for recognition. We evaluated the method on our connected-digit corpus, comparing it with our previous method. The results show that employing deep bottleneck features improves lipreading performance.
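
The bottleneck-feature extraction described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration in PyTorch, not the authors' implementation: the layer widths, the 120-dimensional input, the 30-dimensional bottleneck, and the 40 output classes are all assumed placeholders. The idea is that the network is trained as a frame-level classifier, after which the classifier head is discarded and the bottleneck activations serve as compact features for the HMM recognizer.

```python
# Minimal sketch of deep bottleneck feature (DBNF) extraction.
# All dimensions below are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn

class BottleneckDNN(nn.Module):
    """Feed-forward DNN with a narrow bottleneck layer.

    Trained as a frame classifier; after training, the bottleneck
    activations are used as features for the downstream recognizer.
    """
    def __init__(self, input_dim=120, bottleneck_dim=30, num_classes=40):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, bottleneck_dim),   # bottleneck layer
        )
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Linear(bottleneck_dim, 512), nn.ReLU(),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.encoder(x))

    def bottleneck_features(self, x):
        # Discard the classifier head; keep the bottleneck activations.
        with torch.no_grad():
            return self.encoder(x)

# Usage: extract DBNFs for a batch of appearance-based feature frames.
model = BottleneckDNN()
frames = torch.randn(8, 120)               # 8 frames of a 120-dim basic feature
dbnf = model.bottleneck_features(frames)   # shape: (8, 30)
```

In a multi-stream setup such as the one described, one such network would typically be trained per modality (optical and depth), and the resulting per-modality bottleneck features fed to the multi-stream hidden Markov models.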

Language: English
Pages: 76-77
Number of pages: 2
Publication status: Published - 2017
Externally published: Yes
Event: 14th International Conference on Auditory-Visual Speech Processing, AVSP 2017 - Stockholm, Sweden
Duration: 25 Aug 2017 - 26 Aug 2017

Conference

Conference: 14th International Conference on Auditory-Visual Speech Processing, AVSP 2017
Country/Territory: Sweden
City: Stockholm
Period: 25/8/17 - 26/8/17

ASJC Scopus subject areas

  • Language and Linguistics
  • Otorhinolaryngology
  • Speech and Hearing
