For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Table-Form Structure Analysis Based on Box-Driven Reasoning
Osamu HORI David S. DOERMANN
IEICE TRANSACTIONS on Information and Systems
Publication Date: 1996/05/25
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Character Recognition and Document Understanding)
Category: Document Recognition and Analysis
document understanding, table-form, form, structure analysis,
Full Text: PDF(514KB)>>
Table-form document structure analysis is an important problem in the document processing domain. This paper presents a new method called Box-Driven Reasoning (BDR) to robustly analyze the structure of table-form documents that include touching characters and broken lines. Real documents are copied repeatedly and overlaid with printed data, resulting in characters that touch cells and lines that are broken. Most previous methods employ a line-oriented approach, but touching characters and broken lines make the procedure fail at an early stage. BDR deals with regions directly in contrast with other previous methods and a reduced resolution image is introduced to supplement information deteriorated by noise. Experimental tests show that BDR reliably recognizes cells and strings in document images with touching characters and broken lines.