Portability | untested |
---|---|
Stability | experimental |
Maintainer | twanvl@gmail.com |
Different encodings of characters into bytes.
- data UTF8 = UTF8
- data BE = BE
- data LE = LE
- type Native = LE
- data UTF16 endianness = UTF16 endianness
- type UTF16BE = UTF16 BE
- type UTF16LE = UTF16 LE
- type UTF16Native = UTF16 Native
- data UTF32 endianness = UTF32 endianness
- type UTF32BE = UTF32 BE
- type UTF32LE = UTF32 LE
- type UTF32Native = UTF32 Native
- data ASCII = ASCII
- data Latin1 = Latin1
- data Compact = Compact
Unicode encodings
Tag representing the UTF-8 encoding.
Use CompactString UTF8 for UTF-8 encoded strings.
Tag representing the UTF-16 encoding, parameterized by endianness.
type UTF16Native = UTF16 Native
Tag representing the platform native UTF-16 encoding.
Tag representing the UTF-32 encoding, parameterized by endianness.
type UTF32Native = UTF32 Native
Tag representing the platform native UTF-32 encoding.
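One way to picture how an endianness tag can select a byte order is sketched below. The tag types are declared as in the synopsis, but the `ByteOrder` class and the functions `bytes16` and `unitBytes` are hypothetical names introduced only for this illustration; they are not taken from the library, whose own mechanism may differ.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word16, Word8)

-- Tag types, declared as in the synopsis.
data BE = BE
data LE = LE
data UTF16 endianness = UTF16 endianness

-- Hypothetical class (not from the library): how a tag orders
-- the two bytes of a 16-bit code unit.
class ByteOrder e where
  bytes16 :: e -> Word16 -> [Word8]

instance ByteOrder BE where
  bytes16 _ w = [highByte w, lowByte w]   -- most significant byte first

instance ByteOrder LE where
  bytes16 _ w = [lowByte w, highByte w]   -- least significant byte first

highByte, lowByte :: Word16 -> Word8
highByte w = fromIntegral (w `shiftR` 8)
lowByte w = fromIntegral (w .&. 0xFF)

-- | Serialise one UTF-16 code unit in the byte order named by the tag.
unitBytes :: ByteOrder e => UTF16 e -> Word16 -> [Word8]
unitBytes (UTF16 e) = bytes16 e
```

With these assumptions, `unitBytes (UTF16 BE) 0x0041` puts the zero byte first and `unitBytes (UTF16 LE) 0x0041` puts it last. The byte order is fixed by the type, so it is resolved at compile time rather than checked at run time.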
Other encodings
Non-standard encodings
Tag representing a custom encoding optimized for memory usage.
This encoding looks like UTF-8, but is slightly more efficient. It requires at most 3 bytes per character, as opposed to 4 for UTF-8.
The encoding looks like:
                      0zzzzzzz -> 0zzzzzzz
             00yyyyyy yzzzzzzz -> 1yyyyyyy 1zzzzzzz
    000xxxxx xxyyyyyy yzzzzzzz -> 1xxxxxxx 0yyyyyyy 1zzzzzzz
The reasoning behind the tag bits is that this allows the char to be read both forwards and backwards.
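The bit layout above can be sketched on plain byte lists as follows. This is only an illustration built from the documented layout, not the library's implementation: the names `encodeCompact`, `decodeCompact`, `decodeCompactRev`, `fromSeptets` and `withRest` are hypothetical, and the sketch assumes the high bit of each byte is the tag bit and the remaining seven bits carry the payload, most significant group first.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Encode one character with the three layouts from the table.
-- Hypothetical helper, not the library's API.
encodeCompact :: Char -> [Word8]
encodeCompact ch
  | n < 0x80   = [septet 0]
  | n < 0x4000 = [0x80 .|. septet 7, 0x80 .|. septet 0]
  | otherwise  = [0x80 .|. septet 14, septet 7, 0x80 .|. septet 0]
  where
    n = ord ch
    -- Seven payload bits of the code point, starting at bit s.
    septet :: Int -> Word8
    septet s = fromIntegral ((n `shiftR` s) .&. 0x7F)

-- | Combine 7-bit payloads (most significant first) into a character,
-- rejecting values beyond the last code point.
fromSeptets :: [Word8] -> Maybe Char
fromSeptets ws
  | n <= ord maxBound = Just (chr n)
  | otherwise         = Nothing
  where
    n = foldl (\acc w -> (acc `shiftL` 7) .|. (fromIntegral w .&. 0x7F)) 0 ws

-- | Pair a decoded character with the bytes that were not consumed.
withRest :: [Word8] -> [Word8] -> Maybe (Char, [Word8])
withRest rest ws = fmap (\ch -> (ch, rest)) (fromSeptets ws)

-- | Read the first character of a byte list (forwards).
-- The tag bit of the second byte separates the 2- and 3-byte forms.
decodeCompact :: [Word8] -> Maybe (Char, [Word8])
decodeCompact (a : rest)
  | a < 0x80 = withRest rest [a]
decodeCompact (a : b : rest)
  | b >= 0x80 = withRest rest [a, b]
decodeCompact (a : b : c : rest)
  | c >= 0x80 = withRest rest [a, b, c]
decodeCompact _ = Nothing

-- | Read the last character, given the bytes in reverse order (backwards).
-- The tag bit of the byte before the last one again separates the forms.
decodeCompactRev :: [Word8] -> Maybe (Char, [Word8])
decodeCompactRev (c : rest)
  | c < 0x80 = withRest rest [c]
decodeCompactRev (c : b : rest)
  | b >= 0x80 = withRest rest [b, c]
decodeCompactRev (c : b : a : rest)
  | a >= 0x80 = withRest rest [a, b, c]
decodeCompactRev _ = Nothing
```

In both directions the reader inspects at most two tag bits before it knows the length of the character, which is what makes backward traversal possible without scanning from the start of the string. Under these assumptions, decoding the result of `encodeCompact` with either function should give back the original character with no leftover bytes.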