Administrative documents of the Don Cossack Host in the 18th – 19th centuries: the issue of the creation of a linguistic corpus

Административните документи на Донската казашка армия от XVIII–XIX век: проблемът за изграждане на лингвистичен корпус

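  • Author(s): Oksana Gorban, Marina Kosova, Elena Sheptukhina, Andrey Svetlov, Anatoly Komendantov, Alexander Matveev, Daniil Filimonov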
  • Subject(s): Digital humanities
  • Published by: Institute for Literature BAS
  • Print ISSN: 1312-238X
  • Summary/Abstract:

    The article presents the basic principles for designing a diachronic linguistic corpus of documents from the offices of the Don Cossack Host, held in the State Archive of the Volgograd Region, Russia. These principles cover collecting documents for the corpus, setting up the technical base for automatic processing and text editing, planning automated tagging and morphological annotation, and choosing corpus software tools. The authors explain some technical aspects of corpus processing and corpus composition. They consider it reasonable to include any document in the corpus, including drafts with crossed-out fragments, because this ensures an accurate record of the grammar and vocabulary of the language in a given historical period. A set of language marker types has been developed for automated meta-tagging. The corpus software tools are specified so that obsolete fonts can be annotated accurately and processed alongside regular language units and expressions in morphological and genre meta-tagging; where a text is only partially adapted, the authentic old graphic symbols may have to be preserved.


  • Page Range: 139-150
    No. of Pages: 12
    Language: English
    Year: 2021
    Issue No: Scripta & e-Scripta vol. 21, 2021

  • Oksana Gorban

    Russia
    Russian Philology and Journalism, Volgograd State University
    Description

    Oksana Gorban – Doctor of Sciences (Philology), Professor of Department of Russian Philology and Journalism, Volgograd State University, history of the Russian language;

    Marina Kosova

    Russia
    Russian Philology and Journalism, Volgograd State University
    Description

    Marina Kosova – Doctor of Sciences (Philology), Professor of Department of Russian Philology and Journalism, Volgograd State University, Russian language, documentation studies;

    Elena Sheptukhina

    Russia
    Russian Philology and Journalism, Volgograd State University
    Description

    Elena Sheptukhina – Doctor of Sciences (Philology), Professor of Department of Russian Philology and Journalism, Volgograd State University, history of the Russian language;

    Andrey Svetlov

    Russia
    Volgograd State University
    Description

    Andrey Svetlov – Candidate of Physical and Mathematical Sciences, Associate Professor, Department of Mathematical Analysis and Function Theory, Volgograd State University, mathematical modelling, data mining;

    Anatoly Komendantov

    Russia
    Volgograd State University
    Description

    Anatoly Komendantov – Student, Institute of Mathematics and IT, Volgograd State University, software development;

    Alexander Matveev

    Russia
    Volgograd State University
    Description

    Alexander Matveev – Student, Institute of Mathematics and IT, Volgograd State University, software development;

    Daniil Filimonov

    Russia
    Volgograd State University
    Description

    Daniil Filimonov – Student, Institute of Mathematics and IT, Volgograd State University, software development;
