AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition

Satoshi NAKAMURA, Kazuya TAKEDA, Kazumasa YAMAMOTO, Takeshi YAMADA, Shingo KUROIWA, Norihide KITAOKA, Takanobu NISHIURA, Akira SASOU, Mitsunori MIZUMACHI, Chiyomi MIYAJIMA, Masakiyo FUJIMOTO, Toshiki ENDO

IEICE TRANSACTIONS on Information and Systems, Vol.E88-D, No.3, pp.535-544
Publication Date: 2005/03/01
DOI: 10.1093/ietisy/e88-d.3.535
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Corpus-Based Speech Technologies)
Category: Speech Corpora and Related Topics
Keyword: noisy speech recognition, evaluation platform, performance differences over speakers, evaluation categories

This paper introduces AURORA-2J, an evaluation framework for Japanese noisy speech recognition. Speech recognition systems still need to become more robust to noisy environments, and such progress requires a standard evaluation corpus and common assessment technologies. In recent years, the Aurora 2, 3, and 4 corpora and their evaluation scenarios have had a significant impact on noisy speech recognition research. AURORA-2J is a Japanese connected-digit corpus whose evaluation scripts were designed in the same way as those of Aurora 2, with the help of the European Telecommunications Standards Institute (ETSI) AURORA group. This paper describes the data collection, the baseline scripts, and the baseline performance. We also propose a new performance analysis method that considers differences in recognition performance among speakers. The method is based on per-speaker word accuracy and reveals the degree to which recognition performance differs between individuals. In addition, we propose a categorization of the modifications applied to the original HTK baseline system, which helps in comparing systems and in identifying the technologies that improve performance most within each category.
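
As an illustration of the per-speaker analysis mentioned in the abstract, the following is a minimal Python sketch and not the paper's own evaluation scripts. It assumes the HTK-style definition of word accuracy, (N - D - S - I) / N x 100, where N is the number of reference words and D, S, and I are the deletion, substitution, and insertion counts. The function names, speaker IDs, and error counts are hypothetical, and the summary statistics (mean, standard deviation, minimum, maximum) are only one possible way to express the spread across speakers; the abstract does not say which statistics the paper uses.

```python
from statistics import mean, pstdev


def word_accuracy(n_ref, deletions, substitutions, insertions):
    """Return HTK-style word accuracy in percent: (N - D - S - I) / N * 100."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * (n_ref - deletions - substitutions - insertions) / n_ref


def per_speaker_summary(counts):
    """Compute word accuracy per speaker and summarize the spread.

    counts maps a speaker ID to a tuple (N, D, S, I).
    Returns (accuracy per speaker, summary statistics over speakers).
    """
    acc = {spk: word_accuracy(*c) for spk, c in counts.items()}
    values = list(acc.values())
    summary = {
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation over speakers
        "min": min(values),
        "max": max(values),
    }
    return acc, summary


# Hypothetical counts for illustration only: speaker -> (N, D, S, I)
counts = {
    "spk01": (100, 1, 2, 1),
    "spk02": (100, 5, 10, 5),
    "spk03": (200, 4, 10, 6),
}

acc, summary = per_speaker_summary(counts)
for spk in sorted(acc):
    print(f"{spk}: {acc[spk]:.1f}%")
print(f"mean={summary['mean']:.1f}  std={summary['std']:.1f}  "
      f"min={summary['min']:.1f}  max={summary['max']:.1f}")
```

A large gap between the minimum and maximum, or a large standard deviation, would indicate that an overall word accuracy figure hides substantial differences between individual speakers.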
