A Contour-Based Robust Algorithm for Text Detection in Color Images

Yangxing LIU  Satoshi GOTO  Takeshi IKENAGA  

IEICE TRANSACTIONS on Information and Systems   Vol.E89-D   No.3   pp.1221-1230
Publication Date: 2006/03/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e89-d.3.1221
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Image Recognition, Computer Vision
Keywords: text detection, texture analysis, connected component analysis, region contour, edge detection

Text detection in color images has become an active research area in the past few decades. In this paper, we present a novel approach to accurately detecting text in color images, including those with complex backgrounds. The proposed algorithm combines connected component analysis with texture feature analysis of the contours of unknown text regions. First, we use an elaborate color image edge detection algorithm to extract all possible text edge pixels. Connected component analysis is then performed on these edge pixels to detect the external contour and any internal contours of potential text regions. The gradient and geometrical characteristics of each region contour are carefully examined to construct candidate text regions and to reject some of the non-text regions. Each candidate text region is then verified with texture features derived from the wavelet domain. Finally, the expectation-maximization (EM) algorithm is used to binarize each text region in preparation for recognition. In contrast to previous approaches, our algorithm combines the efficiency of connected component based methods with the robustness of texture based analysis. Experimental results show that the proposed algorithm is robust in text detection with respect to different character sizes, orientations, colors and languages, and that it provides reliable text binarization results.
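
The abstract does not specify the model behind the EM binarization step, so the following is only an illustrative sketch under stated assumptions: the gray levels of a detected text region are modeled as a two-component one-dimensional Gaussian mixture, and each pixel is assigned to the component with the larger posterior probability. The function name `em_binarize`, its parameters and the initialization scheme are hypothetical and are not taken from the paper.

```python
import math


def em_binarize(pixels, iterations=50, eps=1e-6):
    """Fit a two-component 1-D Gaussian mixture to gray levels with EM.

    Returns (labels, means); label 1 marks the brighter component.
    Assumption: this mixture model is illustrative, not the paper's.
    """
    lo, hi = min(pixels), max(pixels)
    n = len(pixels)
    if lo == hi:
        # A flat region cannot be split into two classes.
        return [0] * n, (float(lo), float(hi))

    # Hypothetical initialization: means at the extremes, shared variance.
    mu = [float(lo), float(hi)]
    var = [((hi - lo) / 4.0) ** 2 + eps] * 2
    weight = [0.5, 0.5]

    def density(x, k):
        # Weighted Gaussian density of component k at gray level x.
        norm = math.sqrt(2.0 * math.pi * var[k])
        return weight[k] * math.exp(-((x - mu[k]) ** 2) / (2.0 * var[k])) / norm

    resp = [0.0] * n
    for _ in range(iterations):
        # E-step: posterior probability that each pixel belongs to component 1.
        resp = []
        for x in pixels:
            p0, p1 = density(x, 0), density(x, 1)
            total = p0 + p1
            if total > 0.0:
                resp.append(p1 / total)
            else:
                # Both densities underflowed: fall back to the nearest mean.
                resp.append(1.0 if abs(x - mu[1]) < abs(x - mu[0]) else 0.0)

        # M-step: re-estimate weights, means and variances.
        n1 = sum(resp)
        n0 = n - n1
        if n0 < eps or n1 < eps:
            break  # one component collapsed; keep the last estimates
        mu[0] = sum((1.0 - r) * x for r, x in zip(resp, pixels)) / n0
        mu[1] = sum(r * x for r, x in zip(resp, pixels)) / n1
        var[0] = sum((1.0 - r) * (x - mu[0]) ** 2 for r, x in zip(resp, pixels)) / n0 + eps
        var[1] = sum(r * (x - mu[1]) ** 2 for r, x in zip(resp, pixels)) / n1 + eps
        weight = [n0 / n, n1 / n]

    labels = [1 if r >= 0.5 else 0 for r in resp]
    if mu[0] > mu[1]:
        # Keep the convention that label 1 is the brighter component.
        labels = [1 - b for b in labels]
        mu = [mu[1], mu[0]]
    return labels, (mu[0], mu[1])


# Usage: a flattened region with dark strokes on a bright background.
region = [10, 12, 11, 200, 205, 198]
labels, means = em_binarize(region)
print(labels, means)
```

Unlike a fixed global threshold, a mixture fit of this kind adapts to the contrast of each region separately, which is one plausible reason for choosing EM when text color and background vary from region to region.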