Automatic page frame registration of digitized text images using connected components - Marco Klindt
- Illustration of the setup
While more and more documents are stored, transmitted, and used only in digital form, old books and other printed materials still have to be digitized, either for archival reasons or to make them usable in further processing applications. During image acquisition, whether by flatbed scanner or by digital camera, artefacts such as noise, borders, skew, perspective distortion, or warping may be introduced, all of which can diminish the usability of the digital copies. This thesis discusses a framework that deals with these artefacts and reconstructs the aligned text region of a single page. The input is adaptively thresholded into a binary representation, and connected component labeling is employed as a bottom-up method to extract entities. These entities serve as input to algorithms that determine the classes of distortion present in the image, detect the global skew angle, and, if applicable, estimate distortion parameters for flattening the page image onto a plane representing a sheet of paper. Using these parameters, the text image is finally skew corrected, flattened, cropped, and saved as the output image.
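The first two stages named in the abstract, adaptive thresholding and connected component labeling, can be illustrated with a minimal pure-Python sketch. This is not the implementation from the thesis: the function names, the local-mean thresholding rule, the window radius, the offset, and the choice of 8-connectivity are all assumptions made for illustration.

```python
from collections import deque


def adaptive_threshold(gray, radius=1, offset=10):
    """Mark a pixel as ink (1) if it is darker than its local mean minus offset.

    gray is a list of rows of grey values; radius and offset are
    illustrative parameters, not values taken from the thesis.
    """
    h, w = len(gray), len(gray[0])
    binary = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            if gray[y][x] < local_mean - offset:
                binary[y][x] = 1
    return binary


def label_components(binary):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected ink regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

The bounding boxes returned by `label_components` are the kind of bottom-up entities that later stages (skew detection, distortion estimation) could consume. A production pipeline would more likely rely on an image processing library than on nested Python loops.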