Kiril Ribarov

Ribarov, Kiril, RHDr,
Research fellow at Charles University in Prague, Prague (formerly)

Unicode U+2E2F, Cyrillic Yerik (Vertical Tilde)
Scripta & e-Scripta vol. 7, 2009 (posted 12/26/2009)

In the previous volume of Scripta & e-Scripta (vol. 6, 2008), the authors published a "White Paper" concerning "Early Cyrillic Writing after Unicode 5.1", which commented on what had been achieved in Unicode version 5.1 as well as on candidates for future inclusion and on variants. The White Paper was accompanied by a large table that included, among other things, a representative glyph for each character and its assigned code point in Unicode 5.1. Copies of both documents were distributed to participants in the XIVth International Congress of Slavists, held in Ohrid in September 2008. In the printed version, the Unicode code point for the vertical tilde, a new addition in Unicode version 5.1, was given as U+2E3A. However, as was later brought to our attention, the vertical tilde was assigned a different code point in the final published version of the Unicode standard, v. 5.1. In the standard, the correct code point for this character is U+2E2F (see http://www.unicode.org/charts/PDF/U2E00.pdf).
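Readers who want to check the corrected code point themselves can query a Unicode character database. The following is a minimal sketch in Python, assuming the standard-library `unicodedata` module; the helper name `describe` is introduced here for illustration only and does not come from the White Paper or its table.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for a code point.

    The name comes from the Unicode database bundled with the running
    Python interpreter; a fallback string is used when that database
    has no name for the code point.
    """
    ch = chr(code_point)
    name = unicodedata.name(ch, "<no name in this Unicode database>")
    return f"U+{code_point:04X} {name}"


# The corrected code point for the vertical tilde (Cyrillic yerik).
print(describe(0x2E2F))  # prints "U+2E2F VERTICAL TILDE"
```

The same call can be applied to any other code point from a printed table to confirm that the name in the database matches the intended character.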

Subject: Language and Literature Studies
Keywords: Unicode 5.1; Vertical Tilde; Characters and glyphs

Character Set Standardization for Early Cyrillic Writing after Unicode 5.1

  • Summary/Abstract

    A White Paper prepared on behalf of the Commission for Computer Processing of Slavic Manuscripts and Early Printed Books to the International Committee of Slavists.

    This White Paper emerged from discussions among the authors at the Slovo conference that took place in Sofia from 2008-02-21 through 2008-02-26. It is partially a response to three documents published by the Serbian Academy of Arts and Sciences: "Standard of the Old Slavic Cyrillic Script", "Standardisation of the Old Church Slavonic Cyrillic Script and its Registration in Unicode", and "Proposal for Registering the Old Slavic Cyrillic Script in Unicode".

    The purpose of this White Paper is to provide, for the benefit of medieval Slavic philologists:

    1. A review of the current state of Unicode with respect to encoding early Cyrillic writing.
    2. A brief statement of basic Unicode design principles.
    3. An overview of the relationship between character set and font technologies.
    4. A response to "Standard", "Standardisation", and "Proposal" that provides a realistic perspective on the compatibility of these documents with modern character set standards.
    5. A discussion of the possible need for further expansion of the early Cyrillic character inventory in Unicode.
    6. A discussion of strategies for meeting the encoding needs of Slavic medievalists in a standards-conformant way.

    This White Paper is contributed for discussion before and during the September 2008 International Congress of Slavists in Ohrid.


The Annotation Corpora of Text (ACT) Tool
Scripta & e-Scripta vol. 2, 2004 (posted 10/09/2004)

The article describes a computer application, developed by a team of authors at Charles University in Prague, that is intended to support multi-aspect processing of cultural heritage materials by means of information technology. The application can be used to describe and analyze medieval manuscripts, and it has been tried out in the Faculty of Mathematics and Physics project Annotation Corpora of Text (ACT, see http://prometheus.ms.mff.cuni.cz/act). The system is a language-independent product for lexical and corpus processing of written texts. It was created to process large corpora with linguistic annotation, which consists mostly of lexicographic and morphological analysis, with syntactic and semantic information marked at a basic level. The system allows users working in different places to compare their data with one another and preserves the results regardless of how many specialists are working at the same time. The product is offered as a complete solution to the problems of computer-assisted linguistic analysis of medieval records.

Subject: Language and Literature Studies
Keywords: Medieval Slavic manuscripts; Annotation corpora; network & lexicographic applications

Incorporation of Old Church Slavonic Card Files into a Corpus

  • Summary/Abstract

    The multimillion-card index of the Old Church Slavonic language, intended for the linguistic research of the Prague group working on the Old Church Slavonic dictionary, was compiled manually over several decades. The purpose of this work is to convert the linguistic data on the paper cards into computer-readable form. The task is not only to transfer the linguistic data (lemma, forms, Greek correspondence, and translation) but also to make all the information contained on a card available and to reconstruct, working backwards from the cards, the OCS texts from which the excerpts were taken.

    Subject: Language studies
