BelarusianGLUE: Analyzing Performance of Open-weight Models
BelarusianGLUE: анализ на продуктивността на модели от отворен тип

- Author(s): Maksim Aparovich, Volha Harytskaya, Vladislav Poritski, Oksana Volchek, Pavel Smrž
- Subject(s): e-Scripta
- Published by: Institute for Literature BAS
- Print ISSN: 1312-238X
- Summary/Abstract:
We use BelarusianGLUE, a recently introduced benchmark, to analyze the performance of open-weight large language models (LLMs) on Belarusian language understanding tasks. We investigate the impact of prompting language, few-shot prompts, orthography (modern/classical/Latin), chat templates, and evaluation mode (discriminative/generative). Our findings suggest that more recent models generally perform better, but improvements are gradual. Fine-tuning on related Slavic languages does not always improve Belarusian understanding. Classical orthography has limited impact, while latinization degrades performance. Analysis of specific tasks (sentiment analysis, Winograd schema challenge) reveals biases in the models, difficulties with understanding linguistic structure, and gaps in world knowledge and cultural context.
Journal: Scripta & e-Scripta vol. 25, 2025
Page Range: 25-38
No. of Pages: 14
Language: English
Maksim Aparovich, PhD student, Brno University of Technology, Brno (Czech Republic). Areas of interest: question answering, multilingual question answering, and cross-lingual transfer learning.
Volha Harytskaya, PhD, independent researcher, Vilnius (Lithuania). Areas of interest: sociolinguistics, corpus linguistics.
Vladislav Poritski, M.A., independent researcher, Vilnius (Lithuania), working on computational tools for the Belarusian language.
Oksana Volchek, PhD, independent researcher, Vilnius (Lithuania). Areas of interest: Belarusian and Russian lexicology, corpus linguistics.
Pavel Smrž, PhD, associate professor, Brno University of Technology, Brno (Czech Republic). Areas of interest: question answering, information retrieval.
Keywords: natural language processing // Belarusian language // large language models // language understanding evaluation