Ефективност на генерични модели HTR за историческа кирилица и глаголица: Сравнение на средства

Performance of Generic HTR Models on Historical Cyrillic and Glagolitic: Comparison of Engines

Author(s): Achim Rabus, Walker R. Thompson
Subject(s): Language studies // Language and Literature Studies // Theoretical Linguistics // Applied Linguistics // Historical Linguistics // Computational linguistics // South Slavic Languages // Philology // Translation Studies
Published by: Institute for Literature BAS
Print ISSN: 1312-238X
Summary/Abstract:

The present study offers a comparative evaluation of the performance of different AI-based digital tools for handwritten text recognition (HTR) on historical manuscripts and prints. The focus is on generic models capable of transcribing a range of texts in a similar script. The training dataset for these comprises Old Cyrillic ustav and poluustav manuscripts, on the one hand, and early Glagolitic printed books, on the other. We give an overview of the performance statistics for the HTR platforms Transkribus and eScriptorium as well as for the command-line tool Calamari. In each case, we additionally offer a close, qualitative analysis of select examples in order to convey a sense of the models’ real-world performance. In this way, our study supplies comparative data on the respective capabilities of these technologies that ought to be of interest to scholars working with them in digital humanities projects.

Journal: Scripta & e-Scripta vol. 23, 2023

Page Range: 11-34
No. of Pages: 24
Language: English

Year: 2023
Issue No: Scripta & e-Scripta vol. 23, 2023

Submitted on: 3 December 2023
LINK CEEOL: https://www.ceeol.com/search/article-detail?id=1196211
Achim Rabus

Germany

achim.rabus@slavistik.uni-freiburg.de

Department of Slavic Linguistics, University of Freiburg, Germany

Description

Prof. Dr. Achim Rabus is the current Head of the Department of Slavonic Studies at the University of Freiburg, Germany. Rabus defended his PhD thesis on the language of East Slavic spiritual songs in 2008 and his Habilitationsschrift on Slavic language contact in 2014. Since 2009, Rabus has been a member of the Special Commission on the Computer-Supported Processing of Mediæval Slavonic Manuscripts and Early Printed Books to the International Committee of Slavists, and since 2018 he has been the President of the Commission. His current research focuses on Slavic social dialectology, Handwritten Text Recognition, and corpus and (digital) historical linguistics.

Walker R. Thompson

walker.thompson@slav.uni-heidelberg.de

Slavic Institute, Heidelberg University

Description

Walker Thompson read Russian and German as an undergraduate at Magdalen College, Oxford, going on to complete his Master’s in Syriac Studies at Wolfson College in 2019. From 2019 to 2020, he was an Academic Assistant at the Slavic Institute of Heidelberg University. Since October 2020, he has been continuing his doctoral research on Epifanij Slavineckij’s lexicographical works as a DAAD-GSSP scholar at the Heidelberg Graduate School for the Humanities and Social Sciences (HGGS). His teaching and research interests span a diverse range of topics including digital humanities, graphematics, language contact, early modern lexicography, the history of Church Slavonic, and Eastern Orthodox liturgics.

KEYWORDS: handwritten text recognition // Transkribus // machine learning // Cyrillic palaeography // Glagolitic printings
