End-to-End Transcript Alignment of 17th Century Manuscripts: The Case of Moccia Code
Abstract
1. Introduction
2. State of the Art
3. The Manuscript Collection
4. Method
4.1. Image Preprocessing
4.2. Line-Level Segmentation
1. Divide the image into stripes.
2. Find the horizontal projection profile.
3. Carry out A* path planning along the search areas in each stripe.
4. Connect the cutting boundaries between adjacent stripes.
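The first two steps can be illustrated with a minimal sketch. It assumes a binarized page stored as a list of rows (1 = ink, 0 = background) and vertical stripes of nearly equal width; the function names, the `threshold` parameter, and the choice of the middle row of each low-ink run as a candidate cut are illustrative assumptions, not details taken from the paper. The A* path planning and the joining of boundaries across stripes are not covered here.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink pixel, 0 = background


def split_into_stripes(image: Image, n_stripes: int) -> List[Image]:
    """Cut the image into n_stripes vertical stripes of nearly equal width."""
    width = len(image[0])
    bounds = [k * width // n_stripes for k in range(n_stripes + 1)]
    return [[row[bounds[k]:bounds[k + 1]] for row in image]
            for k in range(n_stripes)]


def horizontal_projection(stripe: Image) -> List[int]:
    """Count the ink pixels in every row of the stripe."""
    return [sum(row) for row in stripe]


def valley_rows(profile: List[int], threshold: int = 0) -> List[int]:
    """Return the middle row of each run of rows with ink count <= threshold."""
    valleys: List[int] = []
    start = None
    # A sentinel above the threshold closes a run that reaches the last row.
    for y, value in enumerate(profile + [threshold + 1]):
        if value <= threshold and start is None:
            start = y
        elif value > threshold and start is not None:
            valleys.append((start + y - 1) // 2)
            start = None
    return valleys


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
    ]
    for stripe in split_into_stripes(page, 2):
        profile = horizontal_projection(stripe)
        print(profile, valley_rows(profile))
```

In a sketch like this, the valley rows of each stripe would serve as the starting points of the search areas in which a path between two text lines is later planned.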
4.3. Transcript Alignment
- Correct segmentation: the size of the box matches the number of characters in the transcript. In this case, the transcript is aligned with the bounding box, and the next unmatched pair is considered.
- Over-segmentation: the box is too small to accommodate the number of characters in the transcript. In this case, the algorithm assumes that an over-segmentation error has occurred, and the tentative word segmentation is modified by merging the box with an adjacent box. The box size thus increases, and the consistency test can be repeated. If the test is passed, the merged bounding boxes are associated with the transcript, and the next unmatched pair is considered.
- Under-segmentation: the box is too large to accommodate the number of characters in the transcript. In this case, the algorithm assumes that an under-segmentation error has occurred.
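The three cases can be sketched as a simple consistency test. The test itself is a hypothetical stand-in: it compares the box width with the word length multiplied by an assumed average character width `char_width`, within a tolerance `tol`. Merging with the following box is also an assumption. Under-segmentation is only detected here, not repaired.

```python
from typing import List, Tuple

Box = Tuple[int, int]  # (x_start, x_end) of a tentative word box on a text line


def classify(box: Box, word: str, char_width: float, tol: float = 0.3) -> str:
    """Compare the box width with the width expected for the word."""
    width = box[1] - box[0]
    expected = len(word) * char_width  # hypothetical size estimate
    if width < (1 - tol) * expected:
        return "over"     # box too small: over-segmentation
    if width > (1 + tol) * expected:
        return "under"    # box too large: under-segmentation
    return "correct"


def align(boxes: List[Box], words: List[str], char_width: float,
          tol: float = 0.3) -> List[Tuple[Box, str]]:
    """Greedy left-to-right alignment that merges boxes on over-segmentation."""
    aligned: List[Tuple[Box, str]] = []
    i = 0
    for word in words:
        if i >= len(boxes):
            break
        box = boxes[i]
        i += 1
        # Over-segmentation: merge with the following box and repeat the test.
        while classify(box, word, char_width, tol) == "over" and i < len(boxes):
            box = (box[0], boxes[i][1])
            i += 1
        if classify(box, word, char_width, tol) != "correct":
            break  # under-segmentation or unresolved case: stop in this sketch
        aligned.append((box, word))
    return aligned


if __name__ == "__main__":
    boxes = [(0, 50), (60, 80), (82, 100), (110, 140)]
    words = ["hello", "four", "and"]
    for box, word in align(boxes, words, char_width=10):
        print(box, word)
```

With these example values, the second and third boxes are each too narrow for "four" on their own, so they are merged before the test is repeated.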
5. Results
5.1. Line Segmentation
5.2. Alignment Method
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. DVL—Digital Vatican Library. Available online: https://digi.vatlib.it (accessed on 15 August 2022).
2. Gallica. Available online: https://gallica.bnf.fr (accessed on 15 August 2022).
3. e-codices—Virtual Manuscript Library of Switzerland. Available online: https://www.e-codices.unifr.ch (accessed on 15 August 2022).
4. Manuscripta Mediaevalia. Available online: http://www.manuscripta-mediaevalia.de/ (accessed on 15 August 2022).
5. Internet Culturale. Cataloghi e Collezioni Digitali Delle Biblioteche Italiane. Available online: http://www.internetculturale.it (accessed on 15 August 2022).
6. Lombardi, F.; Marinai, S. Deep Learning for Historical Document Analysis and Recognition—A Survey. J. Imaging 2020, 6, 110.
7. Sánchez, J.A.; Romero, V.; Toselli, A.H.; Villegas, M.; Vidal, E. A set of benchmarks for Handwritten Text Recognition on historical documents. Pattern Recognit. 2019, 94, 122–134.
8. Parziale, A.; Capriolo, G.; Marcelli, A. One Step Is Not Enough: A Multi-Step Procedure for Building the Training Set of a Query by String Keyword Spotting System to Assist the Transcription of Historical Document. J. Imaging 2020, 6, 109.
9. Tomai, C.I.; Zhang, B.; Govindaraju, V. Transcript mapping for historic handwritten document images. In Proceedings of the Eighth International Workshop on Frontiers in Handwriting Recognition, Niagara-on-the-Lake, ON, Canada, 6–8 August 2002; pp. 413–418.
10. Kornfield, E.; Manmatha, R.; Allan, J. Text alignment with handwritten documents. In Proceedings of the First International Workshop on Document Image Analysis for Libraries, Palo Alto, CA, USA, 23–24 January 2004; pp. 195–209.
11. Rothfeder, J.; Manmatha, R.; Rath, T.M. Aligning Transcripts to Automatically Segmented Handwritten Manuscripts. In Document Analysis Systems VII; Bunke, H., Spitz, A.L., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 84–95.
12. Toselli, A.H.; Romero, V.; Vidal, E. Viterbi based alignment between text images and their transcripts. In Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007), Prague, Czech Republic, 28 June 2007; pp. 9–16.
13. Zinger, S.; Nerbonne, J.; Schomaker, L. Text-image alignment for historical handwritten documents. In Proceedings of Document Recognition and Retrieval XVI, San Jose, CA, USA, 20–22 January 2009; Volume 7247, pp. 14–21.
14. Indermühle, E.; Liwicki, M.; Bunke, H. Combining alignment results for historical handwritten document analysis. In Proceedings of the 2009 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, 26–29 July 2009; pp. 1186–1190.
15. Stamatopoulos, N.; Louloudis, G.; Gatos, B. Efficient transcript mapping to ease the creation of document image segmentation ground truth with text-image alignment. In Proceedings of the 2010 12th International Conference on Frontiers in Handwriting Recognition, Kolkata, India, 16–18 November 2010; pp. 226–231.
16. Stamatopoulos, N.; Gatos, B.; Louloudis, G. A Novel Transcript Mapping Technique for Handwritten Document Images. In Proceedings of the 2014 14th International Conference on Frontiers in Handwriting Recognition, Crete, Greece, 1–4 September 2014; pp. 41–46.
17. Leydier, Y.; Églin, V.; Brès, S.; Stutzmann, D. Learning-Free Text-Image Alignment for Medieval Manuscripts. In Proceedings of the 2014 14th International Conference on Frontiers in Handwriting Recognition, Crete, Greece, 1–4 September 2014; pp. 363–368.
18. Romero-Gómez, V.; Toselli, A.H.; Bosch, V.; Sánchez, J.A.; Vidal, E. Automatic alignment of handwritten images and transcripts for training handwritten text recognition systems. In Proceedings of the 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), Vienna, Austria, 24–27 April 2018; pp. 328–333.
19. Ziran, Z.; Pic, X.; Innocenti, S.U.; Mugnai, D.; Marinai, S. Text alignment in early printed books combining deep learning and dynamic programming. Pattern Recognit. Lett. 2020, 133, 109–115.
20. Torras, P.; Souibgui, M.A.; Chen, J.; Fornés, A. A Transcription Is All You Need: Learning to Align Through Attention. In Proceedings of the International Conference on Document Analysis and Recognition, Lausanne, Switzerland, 5–10 September 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 141–146.
21. Capriolo, G. Paternas Literas Confirmamus: Il Libro dei Privilegi e Delle Facoltà del Mastro Portolano di Terra di Lavoro (secc. XV-XVII); FedOA-Federico II University Press: Napoli, Italy, 2017; Volume 2.
22. Sauvola, J.; Pietikäinen, M. Adaptive document image binarization. Pattern Recognit. 2000, 33, 225–236.
23. Wong, K.Y.; Casey, R.G.; Wahl, F.M. Document Analysis System. IBM J. Res. Dev. 1982, 26, 647–656.
24. Nagy, G. Twenty years of document image analysis in PAMI. IEEE Trans. Patt. Anal. Mach. Intell. 2000, 22, 38–63.
25. Namboodiri, A.; Jain, A.K. Document structure and layout analysis. In Digital Document Processing; Springer: London, UK, 2007; pp. 29–48.
26. Kise, K. Page segmentation techniques in document analysis. In Handbook of Document Image Processing and Recognition; Springer: London, UK, 2014; pp. 135–175.
27. Eskenazi, S.; Gomez-Krämer, P.; Ogier, J.M. A comprehensive survey of mostly textual document segmentation algorithms since 2008. Pattern Recognit. 2017, 64, 1–14.
28. Antonacopoulos, A.; Clausner, C.; Papadopoulos, C.; Pletschacher, S. ICDAR2009 Page segmentation competition. In Proceedings of the 2009 International Conference on Document Analysis and Recognition (ICDAR), Barcelona, Spain, 26–29 July 2009; pp. 1370–1374.
29. Murdock, M.; Reid, S.; Hamilton, B.; Reese, J. ICDAR 2015 Competition on text line detection in historical documents. In Proceedings of the 2015 International Conference on Document Analysis and Recognition (ICDAR), Tunis, Tunisia, 23–26 August 2015; pp. 1171–1175.
30. Diem, M.; Kleber, F.; Fiel, S.; Grüning, T.; Gatos, B. cBAD: ICDAR2017 competition on baseline detection. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; Volume 1, pp. 1355–1360.
31. Zhang, R.; Zhou, Y.; Jiang, Q.; Song, Q.; Li, N.; Zhou, K.; Wang, L.; Wang, D.; Liao, M.; Yang, M.; et al. ICDAR 2019 robust reading challenge on reading Chinese text on signboard. In Proceedings of the 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, Australia, 20–25 September 2019; pp. 1577–1581.
32. Surinta, O.; Holtkamp, M.; Karabaa, F.; Van Oosten, J.P.; Schomaker, L.; Wiering, M. A path planning for line segmentation of handwritten documents. In Proceedings of the 2014 14th International Conference on Frontiers in Handwriting Recognition, Crete, Greece, 1–4 September 2014; pp. 175–180.
33. De Gregorio, G.; Citro, I.; Marcelli, A. Transcript Alignment for Historical Handwritten Documents: The MiM Algorithm. In Proceedings of the 20th International Graphonomics Society Conference, Las Palmas de Gran Canaria, Spain, 7–9 June 2022. In press.
34. Alberti, M.; Vögtlin, L.; Pondenkandath, V.; Seuret, M.; Ingold, R.; Liwicki, M. Labeling, cutting, grouping: An efficient text line segmentation method for medieval manuscripts. In Proceedings of the 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, Australia, 20–25 September 2019; pp. 1200–1206.
35. Monnier, T.; Aubry, M. docExtractor: An off-the-shelf historical document element extraction. In Proceedings of the 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), Dortmund, Germany, 7–10 September 2020; pp. 91–96.
36. Oliveira, A.; Seguin, B.; Kaplan, F. dhSegment: A Generic Deep-Learning Approach for Document Segmentation. In Proceedings of the 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), Niagara Falls, NY, USA, 5–8 August 2018; pp. 7–12.
37. Transcribe Bentham. Available online: https://www.ucl.ac.uk/bentham-project/research-tools (accessed on 5 August 2022).
38. Rath, T.M.; Manmatha, R. Word spotting for historical documents. Int. J. Doc. Anal. Recognit. (IJDAR) 2007, 9, 139–152.
39. Fischer, A.; Frinken, V.; Fornés, A.; Bunke, H. Transcription alignment of Latin manuscripts using hidden Markov models. In Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, Beijing, China, 16–17 September 2011; pp. 29–36.

Dataset | N Lines | Our | Surinta et al. [32] | Alberti et al. [34] | docExtractor [35] | dhSegment [36] |
---|---|---|---|---|---|---|
Moccia Code | 275 | 253 (92.00%) | 153 (55.64%) | 144 (52.36%) | 274 (99.64%) | 267 (97.09%) |
Bentham Collection | 1056 | 1004 (95.08%) | 924 (87.50%) | 1003 (94.98%) | 1040 (98.48%) | 967 (92.45%) |
George Washington | 653 | 600 (91.88%) | 585 (89.59%) | 587 (89.89%) | 632 (96.78%) | 635 (97.24%) |
Jefferson Letter | 23 | 22 (95.65%) | 19 (82.61%) | 19 (82.61%) | 22 (95.65%) | 23 (100.00%) |
Saint Gall | 1430 | 1415 (98.95%) | 1351 (94.48%) | 1387 (96.99%) | 1419 (99.23%) | 1420 (99.30%) |

Alignment | Forward | MiM |
---|---|---|
Perfect | 45.05% | 42.47% |
Acceptable | 67.59% | 68.39% |

Ref. | Type | Dataset | Period | Available | Method | Result |
---|---|---|---|---|---|---|
[9] | Handwritten | Thomas Jefferson Letter | XVIII | Yes | Dynamic Programming | 72.00% |
[10] | Handwritten | George Washington | XVIII | Yes | Dynamic Time Warping | 75.40% |
[11] | Handwritten | George Washington | XVIII | Yes | HMM | 72.80% |
[12] | Handwritten | Corpus Cristo Salvador | XIX | Yes | HMM | 92.80% |
[14] | Handwritten | The Swiss Literary Archives | XX | No | HMM | 94.66% |
[13] | Handwritten | Kabinet van de Koningin (KdK) collection | XIX | Only images; transcription not available | Ink Projection Segmentation | 69.00% |
[15] | Handwritten | ICDAR2009 test set | XXI | No | Word Segmentation | 97.04% |
[16] | Handwritten | ICDAR2009 test set | XXI | No | Word Segmentation | 99.48% |
[17] | Handwritten | Queste del Saint Graal | IX | Yes | Segmentation Free | 72.90% |
[18] | Handwritten | C5 Hattem Manuscript | XVI | Only images; transcription not available | HMM and Dynamic Programming | 75.50% |
[33] | Handwritten | Bentham Collection | XVIII | Yes | Ink Projection | 75.93% |
[19] | Early Printed | Gutenberg Bible | XV | Yes | CNN-based | 90.00% |
[20] | Ciphered | Copiale ciphered manuscript | XVIII | Yes | Attention | 90.00% |

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
De Gregorio, G.; Capriolo, G.; Marcelli, A. End-to-End Transcript Alignment of 17th Century Manuscripts: The Case of Moccia Code. J. Imaging 2023, 9, 17. https://doi.org/10.3390/jimaging9010017