FADIT: Fast Document Image Thresholding
Abstract
:1. Introduction
2. FADIT
2.1. Bayesian Framework
2.2. Criterion Function
- Because both the dark text and the bright background are uniform in an ideal document image, we assume the probabilities of grayscales of the text and the background are two constants and respectively and the probabilities of other grayscales are 0. It means that once an image is segmented with threshold the image only have pure text and background. Under this assumption, a degraded document image is easily to be restored because Equation (3) implies that the actual probability tries to approximate the probability of the ideal document image. As Equation (2) is obtained by the maximum correct classification probability, and are given by the cumulative sum,The probabilities satisfy the constraint,
- A lower threshold makes its text pixels with lower grayscales, and a lower grayscale has larger posterior probability to be classified into text. The lower grayscale a pixel has, the larger probability it is classified into dark text, and the higher grayscale a pixel has, the larger probability it is classified into bright background, so we assume that the posterior probability of the dark text is a decreasing function of the threshold and the posterior probability of the bright background is an increasing function . The probabilities satisfy the constraint,
2.3. Posterior Probability Function
- The mean of an image can be used to measure the average intensity of the image, and the average intensity of a document image mainly related with the bright background.
- If a document image is degraded, the text is still dark but the background tends to become tarnished. The background of a degraded document image is not very bright.
2.4. Speed-up Algorithms
Algorithm 1: The FADIT algorithm. | |
1: | Compute the criterion function, , |
2: | for to do |
3: | , |
4: | , |
5: | , |
6: | end for |
7: | Obtain the FADIT threshold, . |
Algorithm 2: The fast Otsu’s algorithm. | |
1: | Compute the between-class variance, , |
2: | for to do |
3: | , |
4: | , |
5: | , |
6: | end for |
7: | Obtain the Otsu’s threshold, |
Algorithm 3: The Kittler et al.’s algorithm. | |
1: | Compute the criterion function, , |
2: | for to do |
3: | , |
4: | , |
5: | , |
6: | , |
7: | , |
8: | , |
9: | , |
10: | , |
11: | |
12: | |
13: | end for |
14: | Obtain the Kittler et al.’s threshold, . |
3. Grid-Based FADIT
4. Experimental Results
4.1. Experimental Setup
4.2. FADIT Compared with Classic Algorithms
4.3. Grid-based FADIT Compared with Other Algorithms
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Nagy, G. Twenty years of document image analysis in PAMI. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 38–62. [Google Scholar] [CrossRef]
- Lelore, T.; Bouchara, F. FAIR: A Fast Algorithm for Document Image Restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 2039–2048. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Rowley-Brooke, R.; Pitie, F.; Kokaram, A. A Non-parametric Framework for Document Bleed-through Removal. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, USA, 23–28 June 2013; pp. 2954–2960. [Google Scholar]
- Su, B.; Lu, S.; Tan, C.L. Robust document image binarization technique for degraded document images. IEEE Trans. Image Process. 2013, 22, 1408–1417. [Google Scholar] [PubMed]
- Solihin, Y.; Leedham, C. Integral ratio: A new class of global thresholding techniques for handwriting images. IEEE Trans. Pattern Anal. Mach. Intell. 1999, 21, 761–768. [Google Scholar] [CrossRef]
- Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 1979, SMC-9, 62–66. [Google Scholar] [CrossRef] [Green Version]
- Kittler, J.; Illingworth, J. Minimum error thresholding. Pattern Recognit. 1986, 19, 41–47. [Google Scholar] [CrossRef]
- Sezgin, M.; Sankur, B. Survey over image thresholding techniques and quantitative performance evaluation. J. Electron. Imaging 2004, 13, 146–168. [Google Scholar]
- Lee, S.U.; Chung, S.Y.; Park, R.H. A comparative performance study of several global thresholding techniques for segmentation. Comput. Vis. Graph. Image Process. 1990, 52, 171–190. [Google Scholar] [CrossRef]
- Trier, B.D.; Jain, A.K. Goal-directed evaluation of binarization methods. IEEE Trans. Pattern Anal. Mach. Intell. 1995, 17, 1191–1201. [Google Scholar] [CrossRef]
- Kittler, J.; Illingworth, J. On threshold selection using clustering criteria. IEEE Trans. Syst. Man Cybern. 1985, SMC-15, 652–655. [Google Scholar] [CrossRef]
- Cheriet, M.; Said, J.N.; Suen, C.Y. A recursive thresholding technique for image segmentation. IEEE Trans. Image Process. 1998, 7, 918–921. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Zhan, K.; Zhang, H.; Ma, Y. New spiking cortical model for invariant texture retrieval and image processing. IEEE Trans. Neural Netw. 2009, 20, 1980–1986. [Google Scholar] [CrossRef] [PubMed]
- Nina, O.; Morse, B.; Barrett, W. A recursive Otsu thresholding method for scanned document binarization. In Proceedings of the 2011 IEEE Workshop on Applications of Computer Vision (WACV), Kona, HI, USA, 5–7 January 2011; pp. 307–314. [Google Scholar]
- Moghaddam, R.F.; Cheriet, M. AdOtsu: An adaptive and parameterless generalization of Otsu’s method for document image binarization. Pattern Recognit. 2012, 45, 2419–2431. [Google Scholar] [CrossRef]
- Zhan, K.; Shi, J.; Li, Q.; Teng, J.; Wang, M. Image segmentation using fast linking SCM. In Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–17 July 2015; Volume 25, pp. 2093–2100. [Google Scholar]
- Sauvola, J.; Pietikäinen, M. Adaptive document image binarization. Pattern Recognit. 2000, 33, 225–236. [Google Scholar] [CrossRef] [Green Version]
- Moghaddam, R.F.; Cheriet, M. A multi-scale framework for adaptive binarization of degraded document images. Pattern Recognit. 2010, 43, 2186–2198. [Google Scholar] [CrossRef]
- Ye, Q.Z.; Danielsson, P.E. On minimum error thresholding and its implementations. Pattern Recognit. Lett. 1988, 7, 201–206. [Google Scholar] [CrossRef]
- Bishop, C.M. Pattern Recognition and Machine Learning; Springer Science Business Media: New York, NY, USA, 2006; pp. 21–24. [Google Scholar]
- Gonzalez, R.C.; Woods, R.E.; Eddins, S.L. Digital Image Processing Using MATLAB. Available online: https://www.mathworks.com/academia/books/digital-image-processing-using-matlab-gonzalez.html (accessed on 21 February 2020).
- Kerr, M.; Burrage, K. New Multivalue Methods for Differential Algebraic Equations. Numer. Algorithms 2002, 31, 193–213. [Google Scholar] [CrossRef]
- Yasnoff, W.A.; Mui, J.K.; Bacus, J.W. Error measures for scene segmentation. Pattern Recognit. 1977, 9, 217–231. [Google Scholar] [CrossRef]
- Xiao, Y.; Cao, Z.; Yuan, J. Entropic image thresholding based on GLGM histogram. Pattern Recognit. Lett. 2014, 40, 47–55. [Google Scholar] [CrossRef]
FADIT | Otsu’s | Kitter et al.’s | |
---|---|---|---|
✓ | ✓ | ✓ | |
× | ✓ | ✓ | |
× | × | ✓ | |
× | × | × | |
× | × | × | |
× | × | ✓ |
Peak Signal-to-Noise Ratio (dB) | |||
Otsu | Kittler | FADIT | |
Image 1 | 9.2647 | 7.1802 | 11.5618 |
Image 2 | 20.1543 | 20.3800 | 20.9538 |
Image 3 | 7.2727 | 6.2408 | 16.0214 |
Image 4 | 16.5733 | 13.1810 | 16.7075 |
Image 5 | 20.3387 | 19.9889 | 21.5522 |
Misclassification Error | |||
Otsu | Kittler | FADIT | |
Image 1 | 0.1184 | 0.1914 | 0.0698 |
Image 2 | 0.0097 | 0.0092 | 0.0080 |
Image 3 | 0.1874 | 0.2376 | 0.0250 |
Image 4 | 0.0220 | 0.0481 | 0.0213 |
Image 5 | 0.0092 | 0.0100 | 0.0070 |
Running Time in MATLAB (Microsecond) | |||
Otsu | Kittler | FADIT | |
Image 1 | 2.5779 | 3.4130 | 1.6194 |
Image 2 | 2.5601 | 3.3707 | 1.6029 |
Image 3 | 2.5841 | 3.4544 | 1.6101 |
Image 4 | 2.5750 | 3.4324 | 1.6071 |
Image 5 | 2.5761 | 3.4037 | 1.6105 |
Peak Signal-To-Noise Ratio (dB) | |||||
Sauvola | Grid-Sauvola | Grid-Kittler | GLGM | Grid-FADIT | |
Image 1 | 7.3912 | 10.6198 | 8.1273 | 9.4828 | 13.0383 |
Image 2 | 4.6402 | 16.4017 | 20.4056 | 19.9702 | 20.5779 |
Image 3 | 3.6341 | 14.0252 | 6.2377 | 15.6426 | 17.6719 |
Image 4 | 4.7587 | 14.4093 | 13.6244 | 14.0242 | 16.6563 |
Image 5 | 4.7563 | 17.4495 | 20.0865 | 15.0982 | 21.5843 |
Misclassification Error | |||||
Sauvola | Grid-Sauvola | Grid-Kittler | GLGM | Grid-FADIT | |
Image 1 | 0.1823 | 0.0867 | 0.1539 | 0.1126 | 0.0497 |
Image 2 | 0.3435 | 0.0229 | 0.0091 | 0.0101 | 0.0088 |
Image 3 | 0.4331 | 0.0396 | 0.2378 | 0.0273 | 0.0171 |
Image 4 | 0.3343 | 0.0362 | 0.0434 | 0.0396 | 0.0216 |
Image 5 | 0.3345 | 0.0180 | 0.0098 | 0.0309 | 0.0069 |
Running Time in MATLAB (Second) | |||||
Sauvola | Grid-Sauvola | Grid-Kittler | GLGM | Grid-FADIT | |
Image 1 | 38.5553 | 3.7726 | 1.8631 | 0.7000 | 1.8401 |
Image 2 | 32.3430 | 3.2322 | 1.6674 | 0.5537 | 1.5850 |
Image 3 | 75.5198 | 7.4040 | 3.6795 | 1.3186 | 3.6751 |
Image 4 | 87.3679 | 8.4955 | 4.2793 | 1.5257 | 4.2375 |
Image 5 | 75.0670 | 7.3851 | 3.6802 | 1.3247 | 3.6703 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Min, Y.; Zhang, Y. FADIT: Fast Document Image Thresholding. Algorithms 2020, 13, 46. https://doi.org/10.3390/a13020046
Min Y, Zhang Y. FADIT: Fast Document Image Thresholding. Algorithms. 2020; 13(2):46. https://doi.org/10.3390/a13020046
Chicago/Turabian StyleMin, Yufang, and Yaonan Zhang. 2020. "FADIT: Fast Document Image Thresholding" Algorithms 13, no. 2: 46. https://doi.org/10.3390/a13020046
APA StyleMin, Y., & Zhang, Y. (2020). FADIT: Fast Document Image Thresholding. Algorithms, 13(2), 46. https://doi.org/10.3390/a13020046