MITT RU 1.0
MITT RU-RZ/NT 1.0 -- Standard for romanized and native-script text styles for the russian language.
This document contains many rare Unicode characters whose names are not mentioned. If you need to know the Unicode name of a character, you can copy it from this document into a website that identifies Unicode characters.
Letters marked with a hash sign # have a different design in the MITT fonts than you will see in this documentation with a standard font. Many of these design differences are explained in this document. A complete list of atypical letter designs related to the MITT text styles is available in the documentation of the MITT font.
This page is best viewed as source code, in Notepad or another text editor.
Main versions of writing styles:
RU-RZ -- fluent romanized russian:
** RU-RZ-P Precise romanized russian. Differentiates the pronunciations and literal spelling preferences a/ȧ, ă/ya/ẳ, e/ĕ/ye/ė/ŏ/yo, o/ọ, ŭ/yu, v/ṽ, g/ḡ, d/đ, ĭ/ī, l/ł, t/ŧ and ḫ/h. Foreign non-cyrillic proper nouns are written literally (letter by letter), not based on pronunciation as they are in RU-RZ-S. Foreign proper nouns use letters that are outside of the basic russian alphabet, such as C, H, Q, W, X, Ä, Å, Ö or Ü. (RU-RZ-S imitates the selection of letters that is available in the basic russian alphabet, and refrains from using any letters outside of it.)
-- RU-RZ-S Simple romanized russian. Does not differentiate any of the above-mentioned pronunciations or literal spelling preferences. Differentiating the letters е and ё is supported but not required by this standard. Any ordinary russian text can be automatically converted into RU-RZ-S text (and typically does not differentiate letters е and ё). Foreign non-cyrillic proper nouns are written based on pronunciation.
RU-BL -- russian romanized with Basic Latin characters only (ASCII 32 - 126):
-- RU-BL-P Precise basic-latinized russian.
-- RU-BL-S Simple basic-latinized russian.
RU-NT -- russian with the native script:
** RU-NT-P Precise russian with cyrillic script. Preferably should be shown with a customized special font, such as Malki, in which the following characters look different than is intended by the Unicode standard:
Ԭ ԭ (cyrillic DCHE) => Д д with macron above.
Ԉ ԉ (komi LJE) => Л л with macron above.
Ԏ ԏ (komi TJE) => Т т with macron above.
Ƣ ƣ (latin OI) => Ю ю with tilde above.
Ԅ ԅ (komi ZJE) => Я я with tilde above.
Ԇ ԇ (komi DZJE) => Я я with dot above.
If this text style is shown with a standard font, in which these characters are not customized as explained above, the text is probably still understandable but not very stylish in the places where these characters occur. The customized characters make the text more stylish and easier to understand. RU-NT-P differentiates all the same pronunciations and literal spelling preferences as RU-RZ-P. Foreign non-cyrillic proper nouns are written literally (letter by letter), not based on pronunciation as they are in RU-NT-S. Foreign proper nouns use letters that are outside of the basic russian alphabet, such as the equivalents of C, H, Q, W, X, Ä, Å, Ö or Ü. (RU-NT-S imitates the selection of letters that is available in the basic russian alphabet, and refrains from using any letters outside of it.)
-- RU-NT-D Detailed russian with cyrillic script. Otherwise similar to RU-NT-P, but does not need a nonstandard customized font design to look stylish and easily readable. Careful attention should be paid to the font choice nevertheless, because this text style uses combining diacritical marks in some scenarios: Я with hook above, я with dot above, and Д д, Ю ю or Я я with dot below. To display the text in a stylish and easily readable way, you need a font such as Calibri or Malki, which places the combining diacritical marks on the previous character, as the diacritical mark character is encoded in the text after the letter that it should be over or under.
The majority of popular fonts (such as Arial, Cascadia, Century, Garamond, Georgia, Lucida, Tahoma, Times New Roman, Verdana, etc.) position the combining diacritical marks in varying locations: some of them between the letters, and some of them on the previous letter or on the next letter. This is confusing and misleading for the reader if a diacritical mark is shown ambiguously between two letters or, in the worst case, on the wrong letter. The fonts Consolas and Courier New always position these three diacritical marks (hook above, dot above, and dot below) on the next letter, so it would be possible to make the text stylish and correct by encoding each diacritical mark before its letter, not after the letter. This practice is against the text standard RU-NT-D, however, and should be avoided.
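The encoding order that RU-NT-D requires (letter first, combining mark second) can be checked mechanically. The following Python sketch is only an illustration and not part of this standard: the function name misplaced_dots_below is hypothetical, and the set of letters that may carry a dot below is an assumption collected from the RU-NT-D entries of the CYR > LAT list in this document.

```python
from typing import List

DOT_BELOW = "\u0323"  # COMBINING DOT BELOW

# Assumed set of RU-NT-D letters that may carry a dot below: В в Д д Ӗ ӗ Ё ё Ю ю Я я
# (written as escapes so that the set holds single code points only).
DOT_BELOW_BASES = set(
    "\u0412\u0432\u0414\u0434\u04d6\u04d7\u0401\u0451\u042e\u044e\u042f\u044f"
)


def misplaced_dots_below(text: str) -> List[int]:
    """Return the indexes of dots below that do not directly follow an expected letter."""
    bad = []
    for i, ch in enumerate(text):
        if ch == DOT_BELOW and (i == 0 or text[i - 1] not in DOT_BELOW_BASES):
            bad.append(i)
    return bad


# Mark encoded after its letter д, as RU-NT-D requires: nothing is reported.
print(misplaced_dots_below("сер\u0434\u0323це"))
# Mark encoded before its letter (the Consolas / Courier New workaround): reported.
print(misplaced_dots_below("сер\u0323\u0434це"))
```

A check of this kind only sees the code point order. It cannot tell how a particular font will position the mark on screen.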
-- RU-NT-S Standard russian with cyrillic script. Does not differentiate any of the above-mentioned pronunciations or literal spelling preferences. Differentiating the letters е and ё is supported but not required by this standard. Foreign non-cyrillic proper nouns are written based on pronunciation.
Foreign non-cyrillic proper nouns are usually written based on pronunciation in the cyrillic script. This russian convention is also imitated in those romanized standards whose primary emphasis is replicating the cyrillic text as such in latin script, rather than producing optimally convenient text in latin script.
Only latin and cyrillic Unicode characters that do not force the text line to be taller than normal have been deemed acceptable for these standards. Sometimes a cyrillic text standard uses a latin Unicode character, even if a similar character could be achieved as a combination of a cyrillic letter (visually identical to the latin letter) and a separate diacritical mark. Single Unicode characters are always favoured, because they produce the expected visual look more reliably and precisely in various software.
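The preference for single Unicode characters can be illustrated with Unicode normalization. The Python sketch below is an illustration only, using the standard unicodedata module: a latin e with a combining dot above composes into one precomposed character, whereas the visually identical cyrillic е has no precomposed counterpart and remains a two-character sequence.

```python
import unicodedata

latin = "e\u0307"          # latin e + COMBINING DOT ABOVE
cyrillic = "\u0435\u0307"  # cyrillic е + COMBINING DOT ABOVE

latin_nfc = unicodedata.normalize("NFC", latin)
cyrillic_nfc = unicodedata.normalize("NFC", cyrillic)

# The latin pair composes into the single character U+0117 (ė) ...
print(len(latin_nfc), [f"U+{ord(c):04X}" for c in latin_nfc])
# ... while the cyrillic pair stays two code points, so its look depends
# on how each font positions the combining mark.
print(len(cyrillic_nfc), [f"U+{ord(c):04X}" for c in cyrillic_nfc])
```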
The RU-BL text styles may not be esthetically very pleasant to read. Their purpose is to provide an easy and safe way to write and store russian text (with the probable intention to later display the text in a more fluently readable text style), using the most universally supported Basic Latin character set only (ASCII 32 - 126): a...z A...Z 0 1 2 3 4 5 6 7 8 9 . , : ; - ' " ! ? ( ) [ ] { } < > / | \ _ ~ = + * ^ ` @ # $ & %.
Sample text, for comparing these text styles:
** RU-RZ-P Janice [{Dženis}] pọznakomilsă s Yevgeniĕm desẳtᶥ let nazad. Ọna vidėla čto u nĕḡo dobroĕ serđċe.
-- RU-RZ-S Dženis {[Janiċe]} poznakomilsă s Evgeniem desătᶥ let nazad. Ona videla čto u nego dobroe serdċe.
-- RU-BL-P Janice [{D^zenis}] p@znakomils^a s Yevgeni^em des*at= let nazad. `@na vid`ela ^cto u n^e`go dobro^e ser`d`ce.
-- RU-BL-S D^zenis {[Jani`ce]} poznakomils^a s Evgeniem des^at= let nazad. Ona videla ^cto u nego dobroe serd`ce.
** RU-NT-P Ҋаниџе [{Дженис}] пọзнакомился с Ẽвгениӗм деся̇ть лет назад. Өна видėла что у нӗӻо доброӗ серд̄це.
-- RU-NT-S Дженис {[Жанице]} познакомился с Евгением десять лет назад. Она видела что у него доброе сердце.
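The RU-RZ-S and RU-BL-S sample lines differ only by per-letter substitutions, so one can be derived from the other. The Python sketch below is an illustration under assumptions, not a reference implementation: the names RZ_TO_BL, rz_s_to_bl_s and is_basic_latin are hypothetical, and the codes are copied from the RU-BL notes in the CYR > LAT list of this document. The letter Ĭ ĭ is left out, because that list gives no RU-BL code for it.

```python
import unicodedata

# RU-BL codes for the special letters used by RU-RZ-S (from the CYR > LAT list).
RZ_TO_BL = {
    "Ŏ": "^O", "ŏ": "^o", "Ž": "^Z", "ž": "^z", "Ḫ": "^H", "ḫ": "^h",
    "Ċ": "`C", "ċ": "`c", "Č": "^C", "č": "^c", "Š": "^S", "š": "^s",
    "Ŝ": "*S", "ŝ": "*s", "ᶜ": "+", "ˁ": "~", "Ý": "`Y", "ý": "`y",
    "ᴵ": "|", "ᶥ": "=", "Ē": "*E", "ē": "*e", "Ŭ": "^U", "ŭ": "^u",
    "Ă": "^A", "ă": "^a",
}
# Normalize the keys so that each one is a single precomposed code point.
RZ_TO_BL = {unicodedata.normalize("NFC", k): v for k, v in RZ_TO_BL.items()}


def rz_s_to_bl_s(text: str) -> str:
    """Replace each special RU-RZ-S letter with its RU-BL code."""
    text = unicodedata.normalize("NFC", text)
    return "".join(RZ_TO_BL.get(ch, ch) for ch in text)


def is_basic_latin(text: str) -> bool:
    """True if the text stays inside ASCII 32 - 126."""
    return all(32 <= ord(ch) <= 126 for ch in text)


sample = "Dženis {[Janiċe]} poznakomilsă s Evgeniem desătᶥ let nazad."
converted = rz_s_to_bl_s(sample)
print(converted)
print(is_basic_latin(converted))
```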
Logical convertibility between these text standards (based on the text itself only, without using any dictionary data or human help):
                   RU-RZ-P    RU-RZ-S    RU-NT-P    RU-NT-S
* RU-RZ-P    =>               YES        YES        YES
- RU-RZ-S    =>    ---                   -->        YES
* RU-NT-P    =>    YES        YES                   YES
- RU-NT-S    =>    -->        YES        ---
A dash --- or an arrow --> is used in this diagram if the conversion between two text styles is not supported, because the source text does not contain enough information for logically concluding the correct orthography in the target text style (in all possible scenarios). If such an unsupported conversion is requested, the default option is not to convert the source text at all. However, if it is deemed preferable to convert the text into the closest possible text style (most notably, when the script should change from native to romanized, or vice versa), the arrow points towards the text style that is the recommended substitute for the unsupported conversion. A dash --- is used in the diagram when no logically supported substitute is available (in the same script) other than the source text style itself.
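The diagram and its fallback rules can be expressed as a small lookup. The Python sketch below is an illustration, not a normative part of this standard: the names STYLES, TABLE and resolve_target are hypothetical, and it assumes that an arrow --> points at the nearest YES cell to its right in the same row.

```python
STYLES = ["RU-RZ-P", "RU-RZ-S", "RU-NT-P", "RU-NT-S"]

# Cells copied from the diagram; None marks the diagonal (source equals target).
TABLE = {
    "RU-RZ-P": [None, "YES", "YES", "YES"],
    "RU-RZ-S": ["---", None, "-->", "YES"],
    "RU-NT-P": ["YES", "YES", None, "YES"],
    "RU-NT-S": ["-->", "YES", "---", None],
}


def resolve_target(source: str, target: str, allow_substitute: bool = False) -> str:
    """Return the style the text should end up in when `target` is requested."""
    if source == target:
        return source
    row = TABLE[source]
    col = STYLES.index(target)
    if row[col] == "YES":
        return target
    if row[col] == "-->" and allow_substitute:
        # Assumed reading: the arrow points at the next YES cell to its right.
        for later in range(col + 1, len(STYLES)):
            if row[later] == "YES":
                return STYLES[later]
    # Default for unsupported conversions: leave the text in its source style.
    return source


print(resolve_target("RU-RZ-S", "RU-NT-P"))
print(resolve_target("RU-RZ-S", "RU-NT-P", allow_substitute=True))
```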
Foreign proper nouns cannot be automatically converted between a literal format (replicated letter by letter) and a transliteration based on pronunciation. Such a conversion would be reliably possible only if both the literal and the pronounced form of the name are documented in the source text, using some kind of tags or footnotes. This table of logical convertibility between the text styles ignores this aspect of foreign proper nouns, and promises convertibility from a text style to another, if no other logical obstacles exist for the conversion than the writing of foreign proper nouns being based on different principles.
The sample texts above use a notation in which the form based on pronunciation is given in [{curly brackets inside square brackets}], and the literal form is given in {[square brackets inside curly brackets]}, after the spelling that is chosen for the main text. This makes it possible to recognize automatically which of the two forms is the literal one. These codes and alternative spellings are not intended to be seen by the human reader in the main text.
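The two bracket notations are easy to recognize with regular expressions. The Python sketch below is an illustration only: the names PRONOUNCED, LITERAL, strip_alternatives and tagged_forms are hypothetical, and it assumes that a tagged alternative never contains nested brackets.

```python
import re

# [{...}] holds the pronunciation-based form, {[...]} holds the literal form.
PRONOUNCED = re.compile(r"\s*\[\{(.*?)\}\]")
LITERAL = re.compile(r"\s*\{\[(.*?)\]\}")


def strip_alternatives(text: str) -> str:
    """Remove both kinds of tagged alternative spellings for display to a reader."""
    return LITERAL.sub("", PRONOUNCED.sub("", text))


def tagged_forms(text: str) -> dict:
    """Collect the tagged alternatives, grouped by the kind of bracket pair."""
    return {
        "pronounced": PRONOUNCED.findall(text),
        "literal": LITERAL.findall(text),
    }


line = "Janice [{D^zenis}] p@znakomils^a s Yevgeni^em"
print(strip_alternatives(line))
print(tagged_forms(line))
```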
These text styles use a strict logical correlation between latin letters and cyrillic letters, so that the text can be converted back and forth between cyrillic script and latin script and remain exactly identical through all these conversions. However, foreign proper nouns are not always fully compatible with this system. In some scenarios, converting a foreign proper noun from latin script to cyrillic script and then back into latin script produces a different latin spelling than the original form of the name. This can happen because these romanized cyrillic text styles are optimized for fluent reading and strict logical compatibility with the cyrillic script, not for strict compatibility with the way other languages use the latin script.
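The round trip can be demonstrated for the simple styles, which need only the base letter pairs of the CYR > LAT list in this document. The Python sketch below is an illustration under assumptions: the names nt_s_to_rz_s and rz_s_to_nt_s are hypothetical, the text is assumed to contain no foreign proper nouns, and ё is assumed to be written distinctly from е.

```python
import unicodedata


def _nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


# Base letter pairs in alphabet order (uppercase, then lowercase).
CYR = _nfc("АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ" "абвгдеёжзийклмнопрстуфхцчшщъыьэюя")
LAT = _nfc("ABVGDEŎŽZIĬKLMNOPRSTUFḪĊČŠŜᶜÝᴵĒŬĂ" "abvgdeŏžziĭklmnoprstufḫċčšŝˁýᶥēŭă")
assert len(CYR) == len(LAT), "every cyrillic letter needs exactly one latin letter"

TO_LAT = str.maketrans(CYR, LAT)
TO_CYR = str.maketrans(LAT, CYR)


def nt_s_to_rz_s(text: str) -> str:
    """Convert RU-NT-S text into RU-RZ-S, one letter at a time."""
    return _nfc(text).translate(TO_LAT)


def rz_s_to_nt_s(text: str) -> str:
    """Convert RU-RZ-S text back into RU-NT-S, one letter at a time."""
    return _nfc(text).translate(TO_CYR)


original = "Она видела что у него доброе сердце."
romanized = nt_s_to_rz_s(original)
print(romanized)
print(rz_s_to_nt_s(romanized) == original)
```

Because each cyrillic letter has exactly one latin counterpart and vice versa, the conversion is reversible without any dictionary data.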
The transliterated RU-RZ-P standard uses the alternative esthetic spelling preferences ă/ya, ĕ/ye, ŏ/yo and ŭ/yu, whose purpose is to make the text more stylish and convenient to read. While this feature was not originally intended for text written in cyrillic script, it is included in the RU-NT-P standard (mostly as diacritical marks over or under vowels) to make this standard fully convertible to the RU-RZ-P format without the loss of any information.
Technical reliability of these text styles, as Unicode characters:
To increase the grammatical information content of the text, these text styles use many uncommon diacritical marks and special characters, both in latin and in cyrillic script. This causes a higher risk that some fonts or software will fail to display the text correctly and beautifully. One of the esthetic risks is that some letters in the text are displayed with a different font than the rest of the text. This happens if the primary font does not include some rare character. In that case the software will use any other font that contains the character. If none of the available fonts contains the character, the software probably displays some generic character, such as a square or a question mark.
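When checking whether a font covers a text, it helps to list the rare characters that the text actually contains. The Python sketch below is an illustration only: the function name describe_non_ascii is hypothetical, and it relies on the standard unicodedata module to look up code points and names. Whether a given font contains each character must still be checked separately.

```python
import unicodedata
from typing import List, Tuple


def describe_non_ascii(text: str) -> List[Tuple[str, str, str]]:
    """List each distinct non-ASCII character as (character, code point, Unicode name)."""
    seen = []
    for ch in text:
        if ord(ch) > 126 and ch not in seen:
            seen.append(ch)
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in seen
    ]


for entry in describe_non_ascii("pọznakomilsă"):
    print(entry)
```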
Below is a guide for transliterating foreign proper nouns from latin script to cyrillic script literally, letter by letter -- regardless of the language, or how the word is pronounced. The standard RU-NT-P uses the primary variant only, which is not in parentheses. Some ambiguity and taking the pronunciation into consideration is allowed in standard RU-NT-S, whose acceptable alternatives are listed in parentheses, using the most basic russian alphabet only.
A a = А а
B b = Б б
C c = Џ џ (Ц ц / С с / К к)
D d = Д д
E e = Е е (Е е / Э э)
F f = Ф ф
G g = Г г (Г г / Ж ж)
Ġ ġ = Ѓ ѓ (Г г)
H h = Ӽ ӽ (Х х)
I i = И и
J j = Ҋ ҋ (Й й / Ж ж)
K k = К к
L l = Л л
M m = М м
N n = Н н
O o = О о
P p = П п
Q q = Қ қ (К к)
R r = Р р
S s = С с
T t = Т т
U u = У у
V v = В в
W w = Ӯ ӯ (У у / В в)
X x = Ӿ ӿ (КС Кс кс)
Y y = Ѝ ѝ (И и / Ы ы / Й й / УЕ Уе уе)
Z z = З з (З з / Ц ц)
Ä ä = Ӓ ӓ (А а / Я я / АЕ Ае ае)
Å å = Ҩ ҩ (О о / АА Аа аа / А а)
Ö ö = Ӧ ӧ (О о / ОЕ Ое ое)
Ü ü = Ӱ ӱ (Ы ы / У у / УЕ Уе уе)
RU-BL: Ä ä = ``A ``a, Å å = ^^A ^^a, Ö ö = ``O ``o, Ü ü = ``U ``u.
CYR > LAT : SPECIAL CASES
А а > A a => Ȧ ȧ, when pronounced "i": dvėnadċȧtᶥ. RU-BL: `A `a. In RU-NT-P and RU-NT-D: Foreign letter Å å => Ҩ ҩ. => Modified glyphs in MITT fonts: Ҩ ҩ => Å å (А а with ring above). In RU-NT-P: Ȧ ȧ => Ӑ ӑ. => Modified glyphs in MITT fonts: Ӑ ӑ => Ȧ ȧ (А а with dot above). In RU-NT-D: Ȧ ȧ => А а -- using combined characters: cyrillic letter А а and a separate character "combining comma above" (uppercase) or "combining dot above" (lowercase).
Б б > B b
В в > V v => Ṽ ṽ, when left unpronounced by most people, for example in letter combination vstv. RU-BL: `V `v. => Modified glyphs in MITT fonts: Ṽ ṽ => V v with macron above. In RU-NT-P: Ṽ ṽ => Ѣ ѣ. => Modified glyphs in MITT fonts: Ѣ ѣ => В в with macron above. In RU-NT-D: Ṽ ṽ => В̣ в̣ -- using combined characters: cyrillic letter В в and a separate character "combining dot below". (Also supported: combining overline/macron or comma/dot above.)
Г г > G g => Ḡ ḡ, in the ending -ḡo, which is pronounced "-vo": ĕḡo. RU-BL: `G `g. => Ĝ ĝ, in loanwords, in which the foreign word has H instead of G: e.g. ĝorizont (horizon). RU-BL: ^G ^g. In RU-NT-P and RU-NT-D: Ḡ ḡ => Ӻ ӻ. => Modified glyphs in MITT fonts: Ӻ ӻ => Г г with macron above.
Д д > D d => Đ đ, when left unpronounced by most people, for example in letter combinations ndsk, zdn or rdċ. RU-BL: `D `d. In RU-NT-P: Đ đ => Ԭ ԭ. => Modified glyphs in MITT fonts: Ԭ ԭ => Д д with macron above. In RU-NT-D: Đ đ => Д̣ д̣ -- using combined characters: cyrillic letter Д д and a separate character "combining dot below". (Also supported: combining overline/macron or comma/dot above.)
Е е > E e => Ĕ ĕ -- when pronounced "ye": ĕsli, nĕ, načalĕ. RU-BL: ^E ^e. => YE ye -- Ĕ ĕ alone as a 1-letter word, or at the end of a 2-letter word, whose first letter is ă, ĕ, ŏ or ŭ. (Unlike ŏ, ŭ and ă, the letter ĕ is not romanized with two letters [as "ye"] as the first letter of a word.) -- In proper nouns the preferred transliteration might be "ye" also in other situations (depending on personal esthetic preferences): Ĕlena => Yelena, Fedĕnka => Fedyenka. -- NOTE: ĕ and ye are synonyms: they mean the same sound, and behave similarly if text is converted into RU-NT-S. => Ė ė -- when pronounced "i": smọtrėli, vidėl, iŝėtĕ, mėnă, pọnėdelᶥnik. RU-BL: `E `e. In RU-NT-P and RU-NT-D: Ĕ ĕ => Ӗ ӗ (this is a conversion from latin script to cyrillic script, even if the characters look identical). In RU-NT-P: YE/Ye ye => Ҽ ҽ. => Modified glyphs in MITT fonts: Ҽ ҽ => Ĕ ĕ with descender in the right bottom corner. Ė ė => È ѐ. => Modified glyphs in MITT fonts: È ѐ => Ė ė (Е е with dot/comma above). In RU-NT-D: YE/Ye ye => Ӗ̣ ӗ̣ -- using combined characters: cyrillic letter Ӗ ӗ and a separate character "combining dot below". Ė ė => Е̓ е̇ -- using combined characters: cyrillic letter Е е and a separate character "combining comma above" (uppercase) or "combining dot above" (lowercase).
Ё ё > Ŏ ŏ RU-BL: ^O ^o. => YO yo -- as the first letter of a word, or at the end of a 2-letter word, whose first letter is ă, ĕ, ŏ or ŭ: ŏlka => yolka, ĕŏ => ĕyo. -- In proper nouns the preferred transliteration might be "yo" also in other situations (depending on personal esthetic preferences): Alŏna => Alyona, Artŏm => Artyom. -- NOTE: ŏ and yo are synonyms: they mean the same sound, and behave similarly if text is converted into RU-NT-S. In RU-NT-P, ONLY in cases of personal preference, when not the first letter of a word, and not immediately after ă, ĕ, ŏ or ŭ: YO/Yo yo => Ӭ ӭ. => Modified glyphs in MITT fonts: Ӭ ӭ => Ё ё with descender in the right bottom corner. RU-NT-D, ONLY in cases of personal preference, when not the first letter of a word, and not immediately after ă, ĕ, ŏ or ŭ: YO/Yo yo => Ё̣ ё̣ -- using combined characters: cyrillic letter Ё ё and a separate character "combining dot below". (Also supported: combining macron / tilde below.) -- NOTE: It is very common in cyrillic texts to misspell ё as е. The standards RU-NT-S and RU-RZ-S accept such ambiguity, but any other of these standards do not accept it.
Ж ж > Ž ž RU-BL: ^Z ^z.
З з > Z z
И и > I i
Й й > Ĭ ĭ => Ī ī, when left unpronounced by most people, for example in the word pọžaluīsta. RU-BL: `I `i. In RU-NT-P and RU-NT-D: Ī ī => Ӣ ӣ.
К к > K k
Л л > L l => Ł ł, when left unpronounced by most people, for example in letter combination lnċ. RU-BL: `L `l. In RU-NT-P: Ł ł => Ԉ ԉ. => Modified glyphs in MITT fonts: Ԉ ԉ => Л л with macron above. In RU-NT-D: Ł ł => Ԯ ԯ.
М м > M m
Н н > N n
О о > O o => Ọ ọ, when pronounced almost like "a": ḫọrọšo. RU-BL: `@ @. In RU-NT-P and RU-NT-D: Ọ ọ => Ө ө. => Modified glyphs in MITT fonts: Ө ө => Ọ ọ (O o with dot below).
П п > P p
Р р > R r
С с > S s
Т т > T t => Ŧ ŧ, when left unpronounced by most people, for example in letter combinations ntsk, ntst, stsk or stn, possibly also stl. RU-BL: `T `t. In RU-NT-P: Ŧ ŧ => Ԏ ԏ. => Modified glyphs in MITT fonts: Ԏ ԏ => Т т with macron above. In RU-NT-D: Ŧ ŧ => Ҭ ҭ.
У у > U u
Ф ф > F f
Х х > Ḫ ḫ RU-BL: ^H ^h.
Ц ц > Ċ ċ RU-BL: `C `c.
Ч ч > Č č RU-BL: ^C ^c.
Ш ш > Š š RU-BL: ^S ^s.
Щ щ > Ŝ ŝ RU-BL: *S *s.
Ъ ъ > ᶜ ˁ RU-BL: + ~ => Modified glyph in MITT fonts: ˁ => ᶜ positioned with its top edge on the same level with the top edge of lowercase "e".
Ы ы > Ý ý RU-BL: `Y `y.
Ь ь > ᴵ ᶥ RU-BL: | =
Э э > Ē ē RU-BL: *E *e.
Ю ю > Ŭ ŭ RU-BL: ^U ^u. => yu, as the first letter of a word, or at the end of a 2-letter word, whose first letter is ă, ĕ, ŏ or ŭ. -- In proper nouns the preferred transliteration might be "yu" also in other situations (depending on personal esthetic preferences): Ŭrí => Yurí, Katŭška => Katyuška. -- NOTE: ŭ and yu are synonyms: they mean the same sound, and behave similarly if text is converted into RU-NT-S. In RU-NT-P, ONLY in cases of personal preference, when not the first letter of a word, and not immediately after ă, ĕ, ŏ or ŭ: YU/Yu yu => Ӳ ӳ -- (these are latin characters) => Modified glyphs in MITT fonts: Ӳ ӳ => Ю ю with reversed hook in the left leg. In RU-NT-D, ONLY in cases of personal preference, when not the first letter of a word, and not immediately after ă, ĕ, ŏ or ŭ: YU/Yu yu => Ю̣ ю̣ -- using combined characters: letter Ю ю and a separate character "combining dot below". (Also supported: combining macron or tilde below.)
Я я > Ă ă RU-BL: ^A ^a. => ya, as the first letter of a word, or at the end of a 2-letter word, whose first letter is ă, ĕ, ŏ or ŭ: ă => ya, ăblọkọ => yablọkọ. -- In proper nouns the preferred transliteration might be "ya" also in other situations (depending on personal esthetic preferences): Ăna => Yana, Ženă => Ženya. -- NOTE: ă and ya are synonyms: they mean the same sound, and behave similarly if text is converted into RU-NT-S. => Ẳ ẳ, when pronounced "i": desẳtᶥ, verẳŝiĕ, svẳŝennik, prẳmoĭ. RU-BL: *A *a. => Modified glyphs in MITT fonts: Ẳ ẳ => Ă ă with dot above the breve. In RU-NT-P, ONLY in cases of personal preference, when not the first letter of a word, and not immediately after ă, ĕ, ŏ or ŭ: YA/Ya ya => Ԅ ԅ => Modified glyphs in MITT fonts: Ԅ ԅ => Я я with hook in the right leg. In RU-NT-D, ONLY in cases of personal preference, when not the first letter of a word, and not immediately after ă, ĕ, ŏ or ŭ: YA/Ya ya => Я̣ я̣ -- using combined characters: letter Я я and a separate character "combining dot below". (Also supported: combining macron or tilde below.) In RU-NT-P: Ẳ ẳ => Ԇ ԇ => Modified glyphs in MITT fonts: Ԇ ԇ => Я я with dot above. In RU-NT-D: Ẳ ẳ => Я̓ я̇ , using combined characters: letter Я я and a separate character "combining comma above" (uppercase) or "combining dot above" (lowercase).
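The positional rule for ya, ye, yo and yu in RU-RZ-P can be sketched in code. The Python example below is an illustration under assumptions, not a reference implementation: the name apply_y_digraphs is hypothetical, words are taken to be runs of letters, and the optional proper-noun preferences (such as Alŏna => Alyona) are not handled, because they depend on personal choice.

```python
import re
import unicodedata

# ă, ĕ, ŏ, ŭ and their two-letter spellings (escapes keep the keys precomposed).
DIGRAPHS = {"\u0103": "ya", "\u0115": "ye", "\u014f": "yo", "\u016d": "yu"}
E_BREVE = "\u0115"  # ĕ is not written as "ye" at the start of a longer word


def _digraph(ch: str) -> str:
    out = DIGRAPHS[ch.lower()]
    return out.capitalize() if ch.isupper() else out


def _rewrite_word(word: str) -> str:
    letters = list(word)
    first = letters[0].lower()
    # End of a 2-letter word whose first letter is ă, ĕ, ŏ or ŭ.
    if len(word) == 2 and first in DIGRAPHS and letters[1].lower() in DIGRAPHS:
        letters[1] = _digraph(letters[1])
    # First letter of a word: always for ă, ŏ, ŭ; for ĕ only as a 1-letter word.
    if first in DIGRAPHS and (first != E_BREVE or len(word) == 1):
        letters[0] = _digraph(letters[0])
    return "".join(letters)


def apply_y_digraphs(text: str) -> str:
    """Apply the positional ya / ye / yo / yu spellings word by word."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\w+", lambda m: _rewrite_word(m.group()), text)


for word in ["\u014flka", "\u0115\u014f", "\u0103", "\u0115sli"]:
    print(word, "=>", apply_y_digraphs(word))
```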
MITT RU-RZ/NT 1.0 -- Standard for romanized and native-script text styles for the russian language. Ion Mittler, 10 march 2025. Released in the public domain under CC0-1.0 license (Creative Commons 0 version 1.0). http://creativecommons.org/publicdomain/zero/1.0/
Modern International Text Types — mitt.fi
Keyword variants for search engines: The standard MITTRURZNT (MITTRURZ / MITTRUNT) defines the text styles MITT RU-RZ-P [MITTRURZP], MITT RU-RZ-S [MITTRURZS], MITT RU-BL-P [MITTRUBLP], MITT RU-BL-S [MITTRUBLS], MITT RU-NT-P [MITTRUNTP], MITT RU-NT-D [MITTRUNTD] and MITT RU-NT-S [MITTRUNTS].