Old Russian

Evaluating Stanza and UDPipe for Morphosyntactic Annotation of Old Russian: A Case Study on Maximus the Greek
Scripta & e-Scripta vol. 25, 2025
floyd Tue, 08/19/2025 - 17:15
Bulgarian title (translated): An evaluation of Stanza and UDPipe for the morphosyntactic annotation of Old Russian: the case of Maximus the Greek

The automation of morphosyntactic annotation of Old Russian texts represents a key challenge in contemporary Slavistics, underscoring the need for computational tools capable of processing historical linguistic data with high accuracy. This study qualitatively evaluates the performance of two statistical taggers, Stanza and UDPipe, in annotating a text by Maximus the Greek, using the TOROT and RNC treebanks as reference corpora. The analysis assesses the accuracy of morphosyntactic annotation—specifically, part-of-speech tagging, morphological feature assignment, and lemmatisation—identifying recurring errors and structural limitations in applying these tools to historical Slavic texts. While both taggers facilitate annotation, they do not yet ensure a level of automation sufficient for fully reliable linguistic analysis. Key challenges include the misinterpretation of morphosyntactic relationships and inaccuracies in grammatical feature assignment. The comparison with their respective reference corpora highlights these issues, demonstrating the need for further refinement in automated annotation methods. This study critically examines the applicability of current NLP technologies to historical texts, emphasizing the necessity of adapting existing models.
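The abstract describes a qualitative comparison of tagger output against reference annotation on three layers: part of speech, morphological features, and lemma. As a complement, the same comparison can be tallied per layer. The following is a minimal sketch, not the study's procedure: the `Token` class, the `field_accuracy` function, and the toy tokens are all hypothetical, and the sketch assumes the gold and predicted sequences are already aligned token by token.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Token:
    """One annotated token (hypothetical structure, loosely CoNLL-U-like)."""
    form: str
    lemma: str
    upos: str
    feats: str


FIELDS = ("upos", "feats", "lemma")


def field_accuracy(gold: Sequence[Token], pred: Sequence[Token]) -> Dict[str, float]:
    """Return the share of tokens on which pred agrees with gold, per field.

    Assumes both sequences have identical tokenisation (same length, same order).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must be aligned token by token")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    hits = {field: 0 for field in FIELDS}
    for g, p in zip(gold, pred):
        for field in FIELDS:
            if getattr(g, field) == getattr(p, field):
                hits[field] += 1
    return {field: hits[field] / len(gold) for field in FIELDS}


# Invented placeholder tokens, not taken from any corpus.
gold = [
    Token("w1", "l1", "NOUN", "Case=Nom|Number=Sing"),
    Token("w2", "l2", "VERB", "Tense=Pres"),
    Token("w3", "l3", "ADJ", "Case=Gen"),
    Token("w4", "l4", "NOUN", "Case=Acc"),
]
pred = [
    Token("w1", "l1", "NOUN", "Case=Nom|Number=Sing"),  # fully correct
    Token("w2", "l2x", "VERB", "Tense=Past"),            # wrong lemma and feats
    Token("w3", "l3", "NOUN", "Case=Gen"),               # wrong part of speech
    Token("w4", "l4x", "NOUN", "Case=Acc"),              # wrong lemma
]

scores = field_accuracy(gold, pred)
```

Per-layer scores of this kind only indicate where disagreements cluster; interpreting why a tagger fails on a given form still requires the kind of manual inspection the study carries out.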

Subject: e-Scripta
Keywords: Stanza, UDPipe, natural language processing, morphosyntactic analysis, annotation, Old Russian, Maximus the Greek
The annotation of verbal aspect in diachrony: parameters, algorithms and problems
Scripta & e-Scripta vol. 21, 2021
floyd Sat, 11/20/2021 - 07:28
Bulgarian title (translated): Annotating verbal aspect in diachrony: parameters, algorithms and problems

Digital annotation of verbal aspect in Old Russian and Church Slavonic texts is a challenging task that requires a comprehensive approach. When Slavic aspect systems are studied synchronically, we always know whether a verb is perfective, imperfective or biaspectual; this is often not the case when aspect is studied from a diachronic perspective. The aspectual status of a particular verb at earlier stages can be determined only by considering several parameters together: actionality, lexical semantics, morphology, functional distribution, syntactic restrictions, collocations, statistics, etc. All essential parameters should be annotated in enough detail for a corpus to be used effectively. That would enable a researcher to quickly collect the information needed to build the aspectual profile of a verb. It is also important to understand the hierarchy of the parameters, as they may differ in importance, and for this purpose a special algorithm should be developed. The paper discusses preliminary results on the annotation parameters and on the algorithm for aspect determination, using ‘Morphy’, the system for digital morphological annotation of Old Russian and Church Slavonic manuscripts developed at the Vinogradov Russian Language Institute, RAS.
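The idea of a parameter hierarchy can be illustrated with a small sketch. The abstract does not specify the algorithm or the ranking of parameters, so everything below is an assumption made for illustration: the priority order in `HIERARCHY`, the `determine_aspect` function, and the verdict labels are hypothetical and do not describe ‘Morphy’ or the authors' method. The sketch simply lets the highest-ranked parameter that yields a verdict decide.

```python
from typing import Dict, Optional, Sequence

# Hypothetical priority order, highest first. The parameter names come from
# the abstract; their ranking here is invented for illustration only.
HIERARCHY = (
    "morphology",
    "syntactic_restrictions",
    "functional_distribution",
    "collocations",
    "statistics",
)


def determine_aspect(
    evidence: Dict[str, Optional[str]],
    hierarchy: Sequence[str] = HIERARCHY,
) -> str:
    """Return the verdict of the highest-ranked parameter that has one.

    `evidence` maps a parameter name to "perfective", "imperfective",
    "biaspectual", or None when that parameter is uninformative.
    """
    for parameter in hierarchy:
        verdict = evidence.get(parameter)
        if verdict is not None:
            return verdict
    return "undetermined"


# Invented evidence for an unnamed verb: morphology is silent, so the
# next informative parameter in the hierarchy decides.
example = {
    "morphology": None,
    "syntactic_restrictions": "imperfective",
    "statistics": "perfective",
}
result = determine_aspect(example)
```

A strict first-match rule is only one possible design. Weighted voting across parameters would be an alternative when no single parameter is considered decisive.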

Subject: Digital humanities
Keywords: verbal aspect, digital annotation, Old Russian, Old Church Slavonic