For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Minimum Covering Run Expression of Document Images Based on Matching of Bipartite Graph
Supoj CHINVEERAPHAN Ken'ichi DOUNIWA Makoto SATO
IEICE TRANSACTIONS on Information and Systems
Publication Date: 1993/04/25
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Image Processing and Understanding)
Minimum Covering Run (MCR) expression, run, bipartite graph, adjacency matrix, minimum vertex-covering, maximum matching, Hopcroft-Karp's algorithm,
Full Text: PDF(570.3KB)>>
An efficient technique for expressing document image is required as part of a unified approach to document image processing. This paper presents a new method, Minimum Covering Run (MCR), for expressing binary images. The name being adapted from horizontal or vertical run representation. The proposed technique uses some horizontal and vertical runs together to represent binary images in which the total number of representative runs is minimized. Considering the characteristic of above run types precisely, it is shown that horizontal and vertical runs of any binary image could be thought of as partite sets of a bipartite graph. Consequently, the MCR expression that corresponds to the construction of one of the most interesting problems in graphs; i.e., maximum matching, is analogously found by using an algorithm which solves this problem in a corresponding graph. The most efficient algorithm takes at most O(n5/2) computations for solving the problem where n is the sum of cardinalities of both partite sets. However, some patterns in images like tables or line drowings, generally, have a large number of runs representing them which results in a long processing time. Therefore, we provide the Rectangular Segment Analysis (RSA) as a pre-processing to define runs representing such patterns beforehand. We also show that horizontal and vertical covering parts of the proposed expression are able to represent stroke components of characters in document images. As an implementation, an efficient algorithm including arrangement for run data structure of the MCR expression is presented. The experimental results show the possibility of stroke extraction of characters in document images. As an application, some patterns such as tables can be extracted from document images.