Most existing page segmentation algorithms either cannot handle skewed document images or rely on time-consuming skew detection techniques. This paper presents an application of the hierarchical Hough transform, a computationally efficient skew detection algorithm, to connected components. The method detects the skew angle in many types of images, including scientific articles, postal labels, handwritten texts, forms, drawings, and bar codes. It remains robust when black margins introduced by photocopying are present in the image and when the document is scanned at a resolution as low as 50 dpi. The algorithm consists of two steps. In the first step, the centroids of connected components are extracted quickly using a graph data structure. In the second step, the hierarchical Hough transform is applied to the selected centroids at two different angular resolutions. The skew angle corresponds to the location of the highest peak in the Hough space.
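
The two steps can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes a breadth-first flood fill in place of the graph data structure, which the abstract does not specify. The names `component_centroids`, `hough_peak`, and `estimate_skew`, the 1-degree and 0.1-degree angular steps, and the 15-degree search range are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, deque


def component_centroids(image):
    """Centroids (x, y) of 8-connected foreground regions in a 0/1 image.

    Assumption: a plain breadth-first flood fill stands in for the
    graph-based extraction mentioned in the abstract.
    """
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    centroids = []
    for y0 in range(h):
        for x0 in range(w):
            if not image[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            sx = sy = n = 0
            while queue:
                x, y = queue.popleft()
                sx, sy, n = sx + x, sy + y, n + 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            centroids.append((sx / n, sy / n))
    return centroids


def hough_peak(points, angles_deg, rho_step=1.0):
    """Return (angle, votes) for the angle holding the fullest accumulator bin."""
    best_angle, best_votes = angles_deg[0], -1
    for a in angles_deg:
        t = math.radians(a)
        c, s = math.cos(t), math.sin(t)
        # Offset of each point from a line through the origin inclined at a;
        # points on one text line share a bin when a equals the skew angle.
        bins = Counter(round((y * c - x * s) / rho_step) for x, y in points)
        votes = max(bins.values())
        if votes > best_votes:
            best_angle, best_votes = a, votes
    return best_angle, best_votes


def estimate_skew(points, coarse_step=1.0, fine_step=0.1, max_skew=15.0):
    """Coarse-to-fine search: a wide, coarse sweep, then a narrow, fine one."""
    n = int(round(2 * max_skew / coarse_step))
    coarse = [-max_skew + i * coarse_step for i in range(n + 1)]
    a0, _ = hough_peak(points, coarse)
    m = int(round(2 * coarse_step / fine_step))
    fine = [a0 - coarse_step + i * fine_step for i in range(m + 1)]
    a1, _ = hough_peak(points, fine)
    return a1


# Synthetic centroids on three text lines skewed by 3 degrees.
slope = math.tan(math.radians(3.0))
demo_points = [(x, x * slope + b) for b in (0, 40, 80) for x in range(0, 601, 20)]
print(estimate_skew(demo_points))
```

Voting with one centroid per component rather than every foreground pixel keeps the accumulator small, and the coarse pass confines the fine pass to a narrow angular window. The sign of the returned angle depends on whether the image y-axis points up or down, so it should be checked against the coordinate convention in use.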

Issue: Vol 3 No 1 (2000)
Page No.: 5-18
Published: Jan 31, 2000
