Copyright | (c) 2020 Composewell Technologies and Contributors |
---|---|
License | Apache-2.0 |
Maintainer | streamly@composewell.com |
Stability | experimental |
Safe Haskell | None |
Language | Haskell2010 |
Low level Unicode database functions to facilitate Unicode normalization.
For more information on Unicode normalization please refer to the following sections of the Unicode standard:
2 General Structure
- 2.3 Compatibility Characters
- 2.12 Equivalent Sequences
3 Conformance
- 3.6 Combination
- 3.7 Decomposition
- 3.11 Normalization Forms
- 3.12 Conjoining Jamo Behavior
4 Character Properties
- 4.3 Combining Classes
- Unicode® Standard Annex #15 - Unicode Normalization Forms
- Unicode® Standard Annex #44 - Unicode Character Database
Synopsis
- isCombining :: Char -> Bool
- combiningClass :: Char -> Int
- isCombiningStarter :: Char -> Bool
- compose :: Char -> Char -> Maybe Char
- composeStarters :: Char -> Char -> Maybe Char
- data DecomposeMode
- isDecomposable :: DecomposeMode -> Char -> Bool
- decompose :: DecomposeMode -> Char -> [Char]
- decomposeHangul :: Char -> (Char, Char, Char)
Combining class
combiningClass :: Char -> Int
Returns the canonical combining class of a character. A starter has combining class 0.
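As an illustration of how the combining class is typically used, here is a minimal sketch of canonical reordering for a run of combining marks. It assumes this module is importable as Unicode.Char.Normalization (the module name is not stated above); reorderMarks and isStarter are hypothetical helper names, not part of the documented API.

```haskell
import Data.List (sortOn)
-- Assumption: the functions documented here live in this module.
import Unicode.Char.Normalization (combiningClass)

-- | Hypothetical helper: a character is a starter when its class is 0.
isStarter :: Char -> Bool
isStarter c = combiningClass c == 0

-- | Hypothetical helper: reorder a run of combining marks by ascending
-- combining class. 'sortOn' is a stable sort, so marks that share a class
-- keep their relative order, which canonical ordering requires.
reorderMarks :: [Char] -> [Char]
reorderMarks = sortOn combiningClass
```

For example, U+0301 COMBINING ACUTE ACCENT has class 230 and U+0327 COMBINING CEDILLA has class 202, so reorderMarks "\x0301\x0327" would place the cedilla first. The helper is only meant for runs of non-starters; a starter must never be moved across.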
isCombiningStarter :: Char -> Bool
Returns True if a starter character may combine with some preceding starter character.
Composition
compose :: Char -> Char -> Maybe Char
Compose a starter character (combining class 0) with a combining character
(non-zero combining class). Returns the composed character if the starter
combines with the combining character; returns Nothing otherwise.
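A minimal sketch of calling compose, assuming the module is importable as Unicode.Char.Normalization. The helper composeOrKeep is a hypothetical name introduced only for this example.

```haskell
-- Assumption: the functions documented here live in this module.
import Unicode.Char.Normalization (compose)

-- | Hypothetical helper: fold a combining mark into the preceding starter
-- when a precomposed character exists; otherwise keep both unchanged.
composeOrKeep :: Char -> Char -> [Char]
composeOrKeep starter mark =
  case compose starter mark of
    Just c  -> [c]              -- the pair has a precomposed form
    Nothing -> [starter, mark]  -- no composition; leave the sequence as is
```

With this helper, 'e' followed by U+0301 COMBINING ACUTE ACCENT collapses to U+00E9, while 'q' followed by the same mark stays as two characters because Unicode defines no precomposed form for that pair.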
composeStarters :: Char -> Char -> Maybe Char
Compose a starter character with another starter character. Returns the
composed character if the two starters combine; returns Nothing otherwise.
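The following sketch shows one way isCombiningStarter and composeStarters might be used together, assuming the module is importable as Unicode.Char.Normalization. The name composeStarterPair is hypothetical, and using isCombiningStarter as a pre-check is a design choice of this example, not something the documentation prescribes.

```haskell
-- Assumption: the functions documented here live in this module.
import Unicode.Char.Normalization (composeStarters, isCombiningStarter)

-- | Hypothetical helper: try to compose two adjacent starters. The second
-- character is first checked with 'isCombiningStarter', so the composition
-- lookup is skipped for the common case of ordinary starters.
composeStarterPair :: Char -> Char -> Maybe Char
composeStarterPair prev cur
  | isCombiningStarter cur = composeStarters prev cur
  | otherwise              = Nothing
```

For instance, U+09C7 BENGALI VOWEL SIGN E and U+09BE BENGALI VOWEL SIGN AA both have combining class 0, and together they compose to U+09CB BENGALI VOWEL SIGN O. Two plain letters such as 'a' and 'b' yield Nothing.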
Decomposition
Non-Hangul
data DecomposeMode
Selects whether decomposition is performed in canonical or compatibility mode.
isDecomposable :: DecomposeMode -> Char -> Bool
Given a non-Hangul character, determines whether the character is decomposable. Note that in the case of compatibility decompositions, a character may decompose into a single compatibility character.
decompose :: DecomposeMode -> Char -> [Char]
Decompose a non-Hangul character into its canonical or compatibility decomposition, depending on the mode. Note that the resulting characters may themselves decompose further, so a full decomposition requires applying this function repeatedly.
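Because a single call may return characters that are themselves decomposable, a caller that wants the full decomposition has to recurse. A minimal sketch follows, assuming the module is importable as Unicode.Char.Normalization; decomposeFully is a hypothetical helper name. The constructors of DecomposeMode are not listed above, so the usage note assumes the canonical mode is named Canonical, as in the unicode-data package.

```haskell
-- Assumption: the functions documented here live in this module.
import Unicode.Char.Normalization
  (DecomposeMode (..), decompose, isDecomposable)

-- | Hypothetical helper: fully decompose a non-Hangul character by expanding
-- each piece until nothing is decomposable any more. 'isDecomposable' is
-- checked before every 'decompose' call, so 'decompose' is only applied to
-- characters that actually have a decomposition. Hangul syllables are out
-- of scope for this sketch.
decomposeFully :: DecomposeMode -> Char -> [Char]
decomposeFully mode c
  | isDecomposable mode c = concatMap (decomposeFully mode) (decompose mode c)
  | otherwise             = [c]
```

Under that assumption, decomposeFully Canonical '\x1EA5' expands U+1EA5 first to U+00E2 U+0301 and then expands U+00E2 to 'a' and U+0302, giving 'a', U+0302, U+0301. The output is not canonically reordered; that step would still need the combining class.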