Coverage • Design Goals • Text Handling
Architectural Context • Unicode Design Principles • Compatibility Characters • Code Points and Characters • Encoding Forms • Encoding Schemes • Unicode Strings • Unicode Allocation • Details of Allocation • Writing Direction • Combining Characters • Equivalent Sequences • Special Characters • Conforming to the Unicode Standard
Versions of the Unicode Standard • Conformance Requirements • Semantics • Characters and Encoding • Properties • Combination • Decomposition • Surrogates • Unicode Encoding Forms • Unicode Encoding Schemes • Normalization Forms • Conjoining Jamo Behavior • Default Case Algorithms
Unicode Character Database • Case • Combining Classes • Directionality • General Category • Numeric Value • Bidi Mirrored • Name • Unicode 1.0 Names • Letters, Alphabetic, and Ideographic • Properties for Text Boundaries • Characters with Unusual Properties • Characters and Sequences That Should Not Be Emitted
Data Structures for Character Conversion • Programming Languages and Data Types • Unknown and Missing Characters • Handling Surrogate Pairs in UTF-16 • Handling Numbers • Normalization • Compression • Newline Guidelines • Regular Expressions • Language Information in Plain Text • Editing and Selection • Strategies for Handling Nonspacing Marks • Rendering Nonspacing Marks • Locating Text Element Boundaries • Identifiers • Sorting and Searching • Binary Order • Case Mappings • Mapping Compatibility Variants • Unicode Security • Ignoring Characters in Processing • U+FFFD Substitution in Conversion
Writing Systems • General Punctuation
Latin • Greek • Coptic • Cyrillic • Glagolitic • Armenian • Georgian • Modifier Letters • Combining Marks
Linear A • Linear B • Cypriot Syllabary • Cypro-Minoan • Ancient Anatolian Alphabets • Old Italic • Runic • Old Hungarian • Gothic • Elbasan • Caucasian Albanian • Vithkuqi • Todhri • Old Permic • Ogham • Shavian • Sidetic
Hebrew • Arabic • Syriac • Samaritan • Mandaic • Yezidi
Old North Arabian • Old South Arabian • Phoenician • Imperial Aramaic • Manichaean • Pahlavi and Parthian • Avestan • Chorasmian • Elymaic • Nabataean • Palmyrene • Hatran
Sumero-Akkadian • Ugaritic • Old Persian • Egyptian Hieroglyphs • Meroitic • Anatolian Hieroglyphs
Devanagari • Bengali (Bangla) • Gurmukhi • Gujarati • Oriya (Odia) • Tamil • Telugu • Kannada • Malayalam
Thaana • Sinhala • Newa • Tibetan • Mongolian • Limbu • Meetei Mayek • Mro • Warang Citi • Ol Chiki • Ol Onal • Nag Mundari • Chakma • Lepcha • Saurashtra • Masaram Gondi • Gunjala Gondi • Wancho • Toto • Tangsa • Sunuwar • Gurung Khema • Kirat Rai • Tolong Siki
Brahmi • Kharoshthi • Bhaiksuki • Phags-pa • Marchen • Zanabazar Square • Soyombo • Old Turkic • Old Sogdian • Sogdian • Old Uyghur
Syloti Nagri • Kaithi • Sharada • Takri • Siddham • Mahajani • Khojki • Dogra • Khudawadi • Multani • Tirhuta • Modi • Nandinagari • Grantha • Dives Akuru • Ahom • Sora Sompeng • Tulu-Tigalari
Thai • Lao • Myanmar • Khmer • Tai Le • New Tai Lue • Tai Tham • Tai Viet • Kayah Li • Cham • Pahawh Hmong • Nyiakeng Puachue Hmong • Pau Cin Hau • Hanifi Rohingya • Tai Yo
Philippine Scripts: Tagalog, Hanunóo, Buhid, and Tagbanwa • Buginese • Balinese • Javanese • Rejang • Batak • Sundanese • Makasar • Kawi
Han • Ideographic Description Characters • Bopomofo • Hiragana and Katakana • Halfwidth and Fullwidth Forms • Hangul • Yi • Nüshu • Lisu • Miao • Tangut • Khitan Small Script
Ethiopic • Osmanya • Tifinagh • N’Ko • Vai • Bamum • Bassa Vah • Mende Kikakui • Adlam • Medefaidrin • Garay • Beria Erfe
Cherokee • Canadian Aboriginal Syllabics • Osage • Deseret
Braille • Western Musical Symbols • Byzantine Musical Symbols • Znamenny Musical Notation • Ancient Greek Musical Notation • Duployan • Sutton SignWriting
Currency Symbols • Letterlike Symbols • Numerals • Superscript and Subscript Symbols • Mathematical Symbols • Invisible Mathematical Operators • Technical Symbols • Geometrical Symbols • Miscellaneous Symbols • Enclosed and Square
Control Codes • Layout Controls • Deprecated Format Characters • Variation Selectors • Private-Use Characters • Surrogates Area • Noncharacters • Specials • Tag Characters
Character Names List • CJK and Other Ideographs • Hangul Syllables
Typographic Conventions • Extended BNF • Rendering
The Unicode Consortium • Unicode Publications • Other Unicode Online Resources
History • Encoding Forms in ISO/IEC 10646 • UTF-8 and UTF-16 • Synchronization of the Standards • Identification of Features for Unicode • Character Names • Character Functional Specifications
Development of the URO • Continuing Research on Ideographs • CJK Sources