For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Statistical Language Models for On-Line Handwriting Recognition
Freddy PERRAUD Christian VIARD-GAUDIN Emmanuel MORIN Pierre-Michel LALLICAN
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2005/08/01
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Document Image Understanding and Digital Documents)
Category: On-line Word Recognition
handwriting recognition, language modeling, n-gram, n-class, perplexity,
Full Text: PDF(635.9KB)>>
This paper incorporates statistical language models into an on-line handwriting recognition system for devices with limited memory and computational resources. The objective is to minimize the error recognition rate by taking into account the sentence context to disambiguate poorly written texts. Probabilistic word n-grams have been first investigated, then to fight the curse of dimensionality problem induced by such an approach and to decrease significantly the size of the language model an extension to class-based n-grams has been achieved. In the latter case, the classes result either from a syntactic criterion or a contextual criteria. Finally, a composite model is proposed; it combines both previous kinds of classes and exhibits superior performances compared with the word n-grams model. We report on many experiments involving different European languages (English, French, and Italian), they are related either to language model evaluation based on the classical perplexity measurement on test text corpora but also on the evolution of the word error rate on test handwritten databases. These experiments show that the proposed approach significantly improves on state-of-the-art n-gram models, and that its integration into an on-line handwriting recognition system demonstrates a substantial performance improvement.