Linguistic annotation

Electronic Edition and Linguistic Annotation of Slavic Fragments
Scripta & e-Scripta vol. 18, 2018
floyd Fri, 12/28/2018 - 08:01

The paper introduces a project on the edition and linguistic annotation of Medieval and Early Modern South Slavic manuscript fragments. Its main topic is the implementation of various approaches to integrating electronic edition, manuscript description, and linguistic annotation. The corpus will include fragments from parchment manuscripts kept in Bulgarian repositories. We will illustrate the approach with several text excerpts from various fragments. The representation will be supplied with textual, part-of-speech, and basic syntactic annotation. On this basis, an attempt will be made at experimental anaphora and related morpho-syntactic annotation. The work will also discuss the features that are useful for such annotation. The project relies on the eXist database (http://exist-db.org) and on the initiatives Repertorium (http://repertorium.obdurodon.org/), PROIEL (http://www.hf.uio.no/ifikk/english/research/projects/proiel/), and TOROT (http://site.uit.no/slavhistcorp/files/2015/04/Eckhoff.pdf).

Language studies // Language and Literature Studies // Theoretical Linguistics // Applied Linguistics // Studies of Literature // Computational linguistics // South Slavic Languages // Philology // South Slavic manuscripts // Fragments // Linguistic annotation // Linguistic corpora // Electronic text edition // Electronic description // XML technologies //

Paul the Not-So-Simple
Scripta & e-Scripta vol. 6, 2008
floyd Fri, 12/26/2008 - 08:52

The present report describes the construction of a technologically innovative electronic edition of the Old Church Slavonic "Life of St. Paul the Simple" from the Codex Suprasliensis. From a philological perspective, the on-line edition of the Life of St. Paul the Simple described here has attempted to address the pedagogical and research needs of Slavic medievalists by providing a textual edition that presents the manuscript material in a way that is both richer and more easily accessible than any other edition, paper or electronic. From a technological perspective, it has also attempted to explore some of the ways in which modern electronic text technology can be used to produce research and teaching tools that are superior to those available without such technology.

Language and Literature Studies // Codex Suprasliensis // Electronic corpus // Linguistic annotation // XML // HTML //

The Linguistic Information in the Electronic Corpus of Old Slavonic Texts

  • Summary/Abstract

    The paper is devoted to the inclusion of linguistic data in an electronic corpus of old Slavic texts. Various contemporary approaches to this task are analyzed. The paper surveys the formats used so far in the Slavic scholarly tradition and draws parallels with how linguistic data are incorporated in the most popular projects for the electronic processing of modern and ancient languages. The authors' approach is based on the descriptive markup language XML (Extensible Markup Language), and all conclusions are drawn on the basis of that choice.

    Keywords:
