Development of the “VoiceTra” Multi-Lingual Speech Translation System

Shigeki MATSUDA  Teruaki HAYASHI  Yutaka ASHIKARI  Yoshinori SHIGA  Hidenori KASHIOKA  Keiji YASUDA  Hideo OKUMA  Masao UCHIYAMA  Eiichiro SUMITA  Hisashi KAWAI  Satoshi NAKAMURA  

IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.4   pp.621-632
Publication Date: 2017/04/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2016AWI0006
Type of Manuscript: INVITED PAPER (Special Section on Award-winning Papers)
Keywords: speech translation, statistical machine translation, speech recognition, speech synthesis

This study describes large-scale field experiments with VoiceTra, the world's first multilingual speech-to-speech translation application for smartphones. Approximately 10 million input utterances have been collected since the experiments began, and the usage shown by the collected data is analyzed and discussed. The study makes three contributions. First, it explains the system configuration, the communication protocol between clients and servers, and the details of the multilingual automatic speech recognition, multilingual machine translation, and multilingual speech synthesis subsystems. Second, it demonstrates the effects of mid-term system updates that used the collected data to improve the acoustic model, the language model, and the dictionary. Third, it analyzes system usage.