Recognition of Anomalously Deformed Kana Sequences in Japanese Historical Documents

Nam Tuan LY  Kha Cong NGUYEN  Cuong Tuan NGUYEN  Masaki NAKAGAWA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.8   pp.1554-1564
Publication Date: 2019/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7361
Type of Manuscript: PAPER
Category: Image Recognition, Computer Vision
Keyword: historical documents, deformed kana recognition, handwriting recognition, deep neural networks

Summary: 
This paper presents methods for recognizing anomalously deformed Kana sequences in Japanese historical documents, a task for which IEICE PRMU held a contest in 2017. The contest was divided into three levels according to the number of characters to be recognized: level 1, single characters; level 2, sequences of three vertically written Kana characters; and level 3, unrestricted sets of three or more characters, possibly in multiple lines. This paper focuses on the methods for levels 2 and 3 that won the contest. We follow the segmentation-free approach and employ a hierarchy of a Convolutional Neural Network (CNN) for feature extraction, Bidirectional Long Short-Term Memory (BLSTM) for frame prediction, and Connectionist Temporal Classification (CTC) for text recognition, which we call a Deep Convolutional Recurrent Network (DCRN). For level 2, we compare the pretrained CNN approach and the end-to-end approach, together with more detailed variations. For level 3, we propose a method of vertical text line segmentation and multiple line concatenation before applying DCRN, and we also examine a two-dimensional BLSTM (2DBLSTM) based method. We evaluate the best methods by cross validation. Without employing linguistic context, we achieved an accuracy of 89.10% for three-Kana-character sequence recognition and 87.70% for unrestricted Kana recognition. These results demonstrate the performance of the proposed models on the level 2 and 3 tasks.
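
To illustrate how per-frame predictions from the BLSTM become a character sequence under CTC, the following is a minimal sketch of greedy (best-path) CTC decoding. It is not taken from the paper: the summary does not state which decoding procedure the authors used, and the function name `ctc_greedy_decode`, the blank index 0, the toy alphabet, and the frame scores are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (not specified in the summary)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding: take the top label in each frame,
    merge consecutive repeats, then drop blank labels."""
    decoded: List[int] = []
    previous = blank
    for scores in frame_scores:
        # Index of the highest-scoring label in this frame.
        label = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the label changes and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical toy alphabet and per-frame scores, for illustration only.
ALPHABET = ["<blank>", "か", "な", "の"]
FRAMES = [
    [0.10, 0.80, 0.05, 0.05],
    [0.20, 0.70, 0.05, 0.05],
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.10, 0.70, 0.10],
    [0.10, 0.10, 0.60, 0.20],
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.10],
]

labels = ctc_greedy_decode(FRAMES)
text = "".join(ALPHABET[i] for i in labels)
```

The blank frame between the two runs of "な" is what lets a repeated character survive the merge step; without it, consecutive identical labels collapse into one. Because the reported accuracies are obtained without linguistic context, a decoder of this kind relies only on the frame scores, with no language model rescoring.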