research-article

A User Perspective on HTR Methods for the Automatic Transcription of Rare Scripts: The Case of Codex Runicus

Published: 14 March 2023

Abstract

Recent breakthroughs in Artificial Intelligence, Deep Learning, and Document Image Analysis and Recognition have significantly eased the creation of digital libraries and the transcription of historical documents. However, for documents in rare scripts with little labelled training data available, current Handwritten Text Recognition (HTR) systems are too constraining. Moreover, research on HTR often focuses on technical aspects only, and rarely emphasises implementing software tools for scholars in the Humanities. In this article, we describe, compare, and analyse different transcription methods for rare scripts. We evaluate their performance in a real use case, a medieval manuscript written in the runic script (Codex Runicus), and discuss the advantages and disadvantages of each method from the user perspective. From this exhaustive analysis and a comparison with a fully manual transcription, we draw conclusions and provide recommendations to scholars interested in using automatic transcription tools.


        • Published in

          Journal on Computing and Cultural Heritage, Volume 15, Issue 4
          December 2022
          483 pages
          ISSN: 1556-4673
          EISSN: 1556-4711
          DOI: 10.1145/3572828

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 14 March 2023
          • Online AM: 25 July 2022
          • Accepted: 15 February 2022
          • Revised: 11 January 2022
          • Received: 3 August 2021