For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
VLSI Architecture of GMM Processing and Viterbi Decoder for 60,000-Word Real-Time Continuous Speech Recognition
Hiroki NOGUCHI Kazuo MIURA Tsuyoshi FUJINAGA Takanobu SUGAHARA Hiroshi KAWAGUCHI Masahiko YOSHIMOTO
IEICE TRANSACTIONS on Electronics
Publication Date: 2011/04/01
Online ISSN: 1745-1353
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Circuits and Design Techniques for Advanced Large Scale Integration)
speech recognition, hidden Markov model (HMM), VLSI architecture,
Full Text: PDF(1.8MB)>>
We propose a low-memory-bandwidth, high-efficiency VLSI architecture for 60-k word real-time continuous speech recognition. Our architecture includes a cache architecture using the locality of speech recognition, beam pruning using a dynamic threshold, two-stage language model searching, a parallel Gaussian Mixture Model (GMM) architecture based on the mixture level and frame level, a parallel Viterbi architecture, and pipeline operation between Viterbi transition and GMM processing. Results show that our architecture achieves 88.24% required frequency reduction (66.74 MHz) and 84.04% memory bandwidth reduction (549.91 MB/s) for real-time 60-k word continuous speech recognition.