Am I readable? Transfer learning based document image rectification
Journal
International Journal on Document Analysis and Recognition
ISSN
1433-2833
Date Issued
2024-01-01
Author(s)
Kumari, Pooja
Abstract
Document image rectification is a widely studied problem in computer vision. In recent work, however, the improvements achieved on distorted document pages are mostly confined to a few specific types of distortion. Apart from projective and a few other distortions, many distortion types are largely ignored in the published literature. In parallel, progress has been made on rectifying real-world images (of outdoor scenes). Our work leverages the strengths of these existing real-world rectification models to solve the problem of unconstrained document image rectification. Subtle differences between the two tasks, however, prevent existing real-world image rectification models from being applied directly to document images. In this work, we therefore focus on narrowing this gap and propose a novel network called DocTLNet, based on transfer learning, that rectifies unconstrained document images from a single input image. The proposed network balances the transfer of knowledge from generic rectification models against the learning of document-specific features. In addition to rectification, the proposed work also addresses shading, contrast, and illumination correction in document images. The proposed approach incorporates non-uniform sampling and illumination correction while preserving the color information and contrast of the rectified image. To the best of our knowledge, this is the first method that uses transfer learning for the rectification of document images. The efficacy of our system is demonstrated by extensive experiments on datasets such as DIR300 and DocUNet.
Subjects