Multilingual End-to-End Speech Translation

Hirofumi Inaguma, Kevin Duh, Tatsuya Kawahara, Shinji Watanabe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

38 Citations (Scopus)

Abstract

In this paper, we propose a simple yet effective framework for multilingual end-to-end speech translation (ST), in which speech utterances in source languages are directly translated to the desired target languages with a universal sequence-to-sequence architecture. While multilingual models have been shown to be useful for automatic speech recognition (ASR) and machine translation (MT), this is the first time they have been applied to the end-to-end ST problem. We show the effectiveness of multilingual end-to-end ST in two scenarios, one-to-many and many-to-many translation, using publicly available data. We experimentally confirm that multilingual end-to-end ST models significantly outperform bilingual ones in both scenarios. The generalization of multilingual training is also evaluated in a transfer learning scenario to a very low-resource language pair. All of our code and the database are publicly available at https://github.com/espnet/espnet to encourage further research in this emergent multilingual ST topic.

Original language: English
Title of host publication: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 570-577
Number of pages: 8
ISBN (Electronic): 9781728103068
Publication status: Published - 2019 Dec
Externally published: Yes
Event: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Singapore, Singapore
Duration: 2019 Dec 15 to 2019 Dec 18

Publication series

Name: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings

Conference

Conference: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019
Country/Territory: Singapore
City: Singapore
Period: 19/12/15 to 19/12/18

Keywords

  • Speech translation
  • attention-based sequence-to-sequence
  • multilingual end-to-end speech translation
  • transfer learning

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Signal Processing
  • Linguistics and Language
  • Communication
