Kiril Ribarov

University in Prague

Unicode U+2E2F, Cyrillic Yerik (Vertical Tilde)

    In the previous volume of Scripta & e-Scripta (vol. 6, 2008), the authors published a "White Paper" concerning "Early Cyrillic Writing after Unicode 5.1", which commented on what had been achieved in Unicode version 5.1 as well as on candidates for future inclusion and on variants. The White Paper was accompanied by a large table that included, among other things, a representative glyph for each character and its assigned code point in Unicode 5.1. Copies of both documents were distributed to participants in the XIVth International Congress of Slavists, held in Ohrid in September 2008. In the printed version, the Unicode code point for the vertical tilde, a new addition in Unicode version 5.1, was given as U+2E3A. However, as was later brought to our attention, the vertical tilde was assigned a different code point in the final published version of the Unicode standard, v. 5.1. In the standard, the correct code point for this character is U+2E2F (see
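The correction can be checked against any Unicode character database. The following is a minimal sketch, assuming Python's standard `unicodedata` module (whose bundled database is newer than Unicode 5.1); the helper name `describe_code_point` and the two constants are illustrative and do not come from the White Paper.

```python
import unicodedata


def describe_code_point(cp: int) -> str:
    """Hypothetical helper: format a code point together with its Unicode name."""
    # The fallback string is returned when the bundled database has no name
    # for this code point (for example, if it is unassigned there).
    name = unicodedata.name(chr(cp), "<no name in this Unicode database>")
    return f"U+{cp:04X} {name}"


VERTICAL_TILDE = 0x2E2F  # code point in the published standard
PRINTED_VALUE = 0x2E3A   # value given in the printed table

if __name__ == "__main__":
    # Report which Unicode version the local database reflects.
    print("Unicode database version:", unicodedata.unidata_version)
    print(describe_code_point(VERTICAL_TILDE))
    print(describe_code_point(PRINTED_VALUE))
```

Looking up both values this way shows which of them the local database names VERTICAL TILDE; what is reported for the printed value depends on the Unicode version bundled with the interpreter.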

Character Set Standardization for Early Cyrillic Writing after Unicode 5.1

    A White Paper prepared on behalf of the Commission for Computer Processing of Slavic Manuscripts and Early Printed Books to the International Committee of Slavists.

    This White Paper emerged from discussions among the authors at the Slovo conference that took place in Sofia from 2008-02-21 through 2008-02-26. It is partially a response to three documents published by the Serbian Academy of Arts and Sciences: "Standard of the Old Slavic Cyrillic Script", "Standardisation of the Old Church Slavonic Cyrillic Script and its Registration in Unicode", and "Proposal for Registering the Old Slavic Cyrillic Script in Unicode".

    The purpose of this White Paper is to provide for the benefit of medieval Slavic philologists:

    1. A review of the current state of Unicode with respect to encoding early Cyrillic writing.
    2. A brief statement of basic Unicode design principles.
    3. An overview of the relationship between character set and font technologies.
    4. A response to "Standard", "Standardisation", and "Proposal" that provides a realistic perspective on the compatibility of these documents with modern character set standards.
    5. A discussion of the possible need for further expansion of the early Cyrillic character inventory in Unicode.
    6. A discussion of strategies for meeting the encoding needs of Slavic medievalists in a standards-conformant way.

    This White Paper is contributed for discussion before and during the September 2008 International Congress of Slavists in Ohrid.

The Annotation Corpora of Text (ACT) Tool

    The article describes a computer application, developed by a team of authors at Charles University in Prague, that supports the multi-aspect processing of cultural heritage materials by means of information technology. The application can be used to describe and analyze medieval manuscripts, as was tried out in the Faculty of Mathematics and Physics project Annotation Corpora of Text (ACT, see The system is a language-independent product for the lexical and corpus processing of written texts. It was created to process large corpora with linguistic annotation, which consists mostly of lexicographic and morphological analysis, with syntactic and semantic information marked at a basic level. The system allows users working in different places to compare their data with one another, and it preserves the results regardless of how many specialists are working at the same time. The article thus offers a product that completely solves the problems of computer-assisted linguistic analysis of medieval written records.

