statistics

Scripta & e-Scripta vol. 24, 2024

Victor Baranov Cтатистическая значимость компонентов лексических синонимических рядов в древнеболгарских письменных памятниках: поиск метода

Statistical significance of the components of lexical synonymous series in ancient Bulgarian written manuscripts: search for a method

Summary/Abstract

The article presents the results of statistical experiments aimed at finding the characteristics of words traditionally considered the Ohrid-Moravian and Preslav components of synonymous series: иерѣи ‘priest’ – жьрьць ‘priest, cleric’ – свѧщеньникъ ‘priest, clergyman’; колѣно ‘knee, kindred’ – племѧ ‘tribe, genus’; коньчина ‘demise, end’ – коньць ‘end’; кънигы ‘books’ – писаниѥ ‘scripture’; любодѣица ‘adulteress, fornicator’ – блѫдьница ‘harlot’. Using the relative number of words in a subcorpus, significant deviations from the average values, and statistical characteristics of the lexemes calculated for each subcorpus made it possible, in particular, to detect opposed and non-opposed components of synonymous series. The methods used to identify the statistical characteristics of words have shown that the degree of opposition between synonyms can differ: it may be statistically significant or statistically insignificant. On this basis, it is concluded that the components of synonymous series should no longer be attributed unconditionally to the Ohrid-Moravian or Preslav vocabulary: the relations between the components of each synonymous series are individual and can range from statistically opposed in the texts of different schools to

Subject: e-Scripta

Keywords: Old Bulgarian writing; Western Bulgarian; Eastern Bulgarian; lexical synonyms; statistics; text corpus

Regina Vernyaeva Колокации с компонент -ьн(o) в руските летописи: количествено-статистически анализ (върху подкорпуса на руските летописи в ИАС «Манускрипт»)

Summary/Abstract

The article is devoted to the quantitative and statistical study of linguistic units in the Old Russian chronicles. The relevant samples were obtained with the n-gram module of the information-analytical system (IAS) “Manuscript”, which identifies textual combinations with varying numbers of components and supports statistical analysis of linguistic units using measures of association. The aim of this work is to show that the chronicles retain remnants of an indivisible noun that has preserved its semantic and grammatical unity, which gives insight into the formation of the part-of-speech system of the Old Russian language. Using the tools of the IAS “Manuscript”, the author concludes that the analyzed suffixal forms in -о predominantly perform a predicative function in the syntagmas. The research identified collocations with a component in -ьн(о) that are not lexically stable (not idiomatic) but are grammatically stable, that is, they are colligations. On the whole, the paper demonstrates the effectiveness of statistical measures in extracting collocations from Old Russian texts for complex analysis.
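As an illustration of what a measure of association computes for two-component combinations, the following minimal sketch scores adjacent word pairs with pointwise mutual information (PMI). The abstract does not say which measures the IAS “Manuscript” module implements, so the choice of PMI, the function name `bigram_pmi`, and the `min_count` parameter are assumptions made only for this example.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=1):
    """Return {(w1, w2): PMI in bits} for adjacent token pairs.

    Hypothetical helper for illustration; not the IAS "Manuscript" API.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs in the sample

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs too rare to score reliably
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores
```

A pair scores high when its components co-occur more often than their individual frequencies predict. PMI is known to overrate rare pairs, which is why a frequency threshold such as `min_count` is usually applied alongside it.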

Subject: Digital humanities

Keywords: chronicles; statistics; linguistic research automation

Regina Vernyaeva Collocations with a component -ьн(o) in Russian Chronicles: the quantitative-statistical analysis (based on the corpus of Russian Chronicles of the IAS “Manuscript”)
