Texts of corpus of Russian dialects of Udmurtia as a source of linguistic and culturological information
Scripta & e-Scripta vol. 21, 2021
Текстовете от корпуса на руските диалекти от Удмуртия като източник на лингвистична и културологична информация

The corpus of Russian dialects of Udmurtia, created on the platform of the linguistic and geographical information system (LGIS) “Dialect” (URL: http://dialect.manuscripts.ru/), contains recordings of the oral speech of residents of 166 localities of the Udmurt Republic, made in the 1970s–1990s. The texts are presented mainly as scanned copies of notebook pages containing transcriptions of the collectors' conversations with the informants. There are 9300 scanned pages, and all records are certified. The existing markup supports the creation of token samples and the display of contexts at the user's request (http://dialect.manuscripts.ru/Lexical/FindQuestPage), which makes it possible to analyze the lexical composition of the Russian dialects of Udmurtia, as well as some of their phonetic and grammatical features. Dialect words from the corpus texts can be plotted on lexical, word-formation and semantic maps of the Russian dialects of Udmurtia. At present, the texts of the corpus are publicly available at http://manuscripts.ru/dialect-test/notebooks. Recordings of dialect speech can also serve as a source of non-linguistic information, namely about historical events and personalities, material and spiritual culture, the customs and traditions of the local population, and the national composition and interethnic relations of Udmurtia in the 20th century. The paper gives examples of corpus texts in Russian transcription, preserving all lexical, phonetic and grammatical features of the dialects, together with information about the speakers and the time and place of each recording.

Subject: Digital humanities
Keywords: linguistic corpus, Russian dialects of Udmurtia, LGIS Dialect, history, cultural science, ethnography