compact-string-0.3.1: Fast, packed and strict strings with Unicode support, based on bytestrings.

Portability: untested
Stability: experimental
Maintainer: twanvl@gmail.com

Data.CompactString.Encodings

Description

Different encodings of characters into bytes.

Unicode encodings

data UTF8

Tag representing the UTF-8 encoding. Use CompactString UTF8 for UTF-8 encoded strings.
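
For reference, the byte layout that the UTF-8 tag stands for can be sketched as follows. This is a standalone illustration of standard UTF-8, not code taken from this package; the name utf8Bytes is hypothetical, and surrogate code points are not rejected.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Encode one character as UTF-8 bytes.
-- Hypothetical helper for illustration; not part of compact-string.
utf8Bytes :: Char -> [Word8]
utf8Bytes c
  | x < 0x80    = [fromIntegral x]
  | x < 0x800   = [0xC0 .|. hi 6, cont 0]
  | x < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    x = ord c
    -- Payload bits of the lead byte: whatever is left after dropping n low bits.
    hi n = fromIntegral (x `shiftR` n)
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont n = 0x80 .|. fromIntegral ((x `shiftR` n) .&. 0x3F)
```

Characters above U+FFFF take four bytes here, which is the case the Compact encoding described further down avoids.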

Constructors

UTF8 

Instances

data BE

Tag representing the big-endian byte order.

Constructors

BE 

Instances

Endian BE 

data LE

Tag representing the little-endian byte order.

Constructors

LE 

Instances

Endian LE 

type Native = LE

The platform native endianness
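
As a rough illustration of how a type-level tag can select a byte order, here is a minimal sketch. The class ByteOrder, its method word16Bytes, and the tags BigTag and LittleTag are all hypothetical stand-ins defined locally; they are not this package's Endian class or its BE and LE types, whose methods are not shown on this page.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word16, Word8)

-- Local stand-ins for endianness tags (assumed names, not the library's types).
data BigTag = BigTag
data LittleTag = LittleTag

-- Hypothetical class: the tag type alone decides the byte order.
class ByteOrder e where
  word16Bytes :: e -> Word16 -> [Word8]

instance ByteOrder BigTag where
  -- Most significant byte first.
  word16Bytes _ w = [fromIntegral (w `shiftR` 8), fromIntegral w]

instance ByteOrder LittleTag where
  -- Least significant byte first.
  word16Bytes _ w = [fromIntegral w, fromIntegral (w `shiftR` 8)]
```

With this sketch, word16Bytes BigTag 0x1234 gives the bytes 0x12, 0x34, and the LittleTag instance gives them in the opposite order.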

data UTF16 endianness

Tag representing the UTF-16 encoding with the given endianness.

Constructors

UTF16 endianness 

Instances

Endian e => Encoding (UTF16 e) 

type UTF16BE = UTF16 BE

Tag representing the big-endian UTF-16 encoding, also known as UTF-16BE.

type UTF16LE = UTF16 LE

Tag representing the little-endian UTF-16 encoding, also known as UTF-16LE.

type UTF16Native = UTF16 Native

Tag representing the platform native UTF-16 encoding.
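
The difference between the UTF-16 variants is only the order of the two bytes in each 16-bit code unit. A minimal standalone sketch of standard UTF-16, with hypothetical helper names (utf16Units, toBE, toLE) that are not part of this package:

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word16, Word8)

-- | UTF-16 code units for one character: a single unit, or a surrogate pair
-- for code points above U+FFFF. Hypothetical helper for illustration.
utf16Units :: Char -> [Word16]
utf16Units c
  | x < 0x10000 = [fromIntegral x]
  | otherwise   = [ fromIntegral (0xD800 + (y `shiftR` 10))
                  , fromIntegral (0xDC00 + (y .&. 0x3FF)) ]
  where
    x = ord c
    y = x - 0x10000

-- | Serialise code units most significant byte first (UTF-16BE).
toBE :: [Word16] -> [Word8]
toBE = concatMap (\w -> [fromIntegral (w `shiftR` 8), fromIntegral w])

-- | Serialise code units least significant byte first (UTF-16LE).
toLE :: [Word16] -> [Word8]
toLE = concatMap (\w -> [fromIntegral w, fromIntegral (w `shiftR` 8)])
```

For 'A' the big-endian bytes are 0x00, 0x41 and the little-endian bytes are 0x41, 0x00.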

data UTF32 endianness

Tag representing the UTF-32 encoding with the given endianness.

Constructors

UTF32 endianness 

Instances

Endian e => Encoding (UTF32 e) 

type UTF32BE = UTF32 BE

Tag representing the big-endian UTF-32 encoding, also known as UTF-32BE.

type UTF32LE = UTF32 LE

Tag representing the little-endian UTF-32 encoding, also known as UTF-32LE.

type UTF32Native = UTF32 Native

Tag representing the platform native UTF-32 encoding.
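
UTF-32 stores every code point in exactly four bytes, so the two variants differ only in byte order. A small standalone sketch, with hypothetical names (utf32BE, utf32LE) that do not come from this package:

```haskell
import Data.Bits (shiftR)
import Data.Char (ord)
import Data.Word (Word8)

-- | The four bytes of a code point, most significant first (UTF-32BE).
-- fromIntegral keeps only the low eight bits of each shifted value.
utf32BE :: Char -> [Word8]
utf32BE c = [fromIntegral (ord c `shiftR` s) | s <- [24, 16, 8, 0]]

-- | The same four bytes in reverse order (UTF-32LE).
utf32LE :: Char -> [Word8]
utf32LE = reverse . utf32BE
```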

Other encodings

data ASCII

Tag representing the ASCII encoding.

Constructors

ASCII 

Instances

data Latin1

Tag representing the ISO 8859-1 encoding (Latin-1).
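
Both single-byte encodings cover only a prefix of Unicode: ASCII the code points up to 0x7F, ISO 8859-1 those up to 0xFF. The sketch below shows that range check with hypothetical helpers (asciiByte, latin1Byte). How this package treats characters outside the range is not stated here, so returning Nothing is purely an assumption of the sketch.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- | ASCII covers code points 0..0x7F; anything else yields Nothing.
-- Hypothetical helper; the Nothing result is this sketch's own choice.
asciiByte :: Char -> Maybe Word8
asciiByte c
  | ord c < 0x80 = Just (fromIntegral (ord c))
  | otherwise    = Nothing

-- | ISO 8859-1 covers code points 0..0xFF, each mapped to the same byte value.
latin1Byte :: Char -> Maybe Word8
latin1Byte c
  | ord c < 0x100 = Just (fromIntegral (ord c))
  | otherwise     = Nothing
```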

Constructors

Latin1 

Instances

Non-standard encodings

data Compact

Tag representing a custom encoding optimized for memory usage.

This encoding looks like UTF-8, but is slightly more efficient. It requires at most 3 bytes per character, as opposed to 4 for UTF-8.

The encoding maps code point bits (left) to bytes (right) as follows:

                   0zzzzzzz -> 0zzzzzzz
          00yyyyyy yzzzzzzz -> 1yyyyyyy 1zzzzzzz
 000xxxxx xxyyyyyy yzzzzzzz -> 1xxxxxxx 0yyyyyyy 1zzzzzzz

The tag bits are chosen so that a character can be read both forwards and backwards: the high bit of the first byte, together with the high bit of its neighbour, determines the length from either end.
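
The layout in the table can be sketched as a standalone encoder with a forward and a backward decoder. All names here (compactBytes, decodeFront, decodeBack) are hypothetical and work on plain byte lists; this is an illustration of the bit layout, not the package's implementation. The decoders assume well-formed input and do not reject overlong forms.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Encode one character following the bit layout in the table.
compactBytes :: Char -> [Word8]
compactBytes c
  | x < 0x80   = [low7 0]
  | x < 0x4000 = [0x80 .|. low7 7, 0x80 .|. low7 0]
  | otherwise  = [0x80 .|. low7 14, low7 7, 0x80 .|. low7 0]
  where
    x = ord c
    -- Seven payload bits starting at bit n of the code point.
    low7 n = fromIntegral ((x `shiftR` n) .&. 0x7F)

-- | The seven payload bits of a byte.
bits :: Word8 -> Int
bits b = fromIntegral (b .&. 0x7F)

-- | Guard against values beyond the Unicode range before calling chr.
toChar :: Int -> Maybe Char
toChar n
  | n <= 0x10FFFF = Just (chr n)
  | otherwise     = Nothing

-- | Decode the first character; returns it with the remaining bytes.
decodeFront :: [Word8] -> Maybe (Char, [Word8])
decodeFront bs = case bs of
  a : r         | not (testBit a 7) -> ok (bits a) r
  a : b : r     | testBit b 7       -> ok (bits a `shiftL` 7 .|. bits b) r
  a : b : c : r | testBit c 7       ->
    ok (bits a `shiftL` 14 .|. bits b `shiftL` 7 .|. bits c) r
  _                                 -> Nothing
  where
    ok n r = fmap (\ch -> (ch, r)) (toChar n)

-- | Decode the last character; returns it with the preceding bytes.
decodeBack :: [Word8] -> Maybe (Char, [Word8])
decodeBack bs = case reverse bs of
  z : r         | not (testBit z 7) -> ok (bits z) r
  z : y : r     | testBit y 7       -> ok (bits y `shiftL` 7 .|. bits z) r
  z : y : x : r | testBit x 7       ->
    ok (bits x `shiftL` 14 .|. bits y `shiftL` 7 .|. bits z) r
  _                                 -> Nothing
  where
    -- r is in reversed order here, so flip it back for the caller.
    ok n r = fmap (\ch -> (ch, reverse r)) (toChar n)
```

Reading backwards works because a final byte with a clear high bit can only be a one-byte character, while a final byte with the high bit set belongs to a two-byte character if the byte before it also has the high bit set, and to a three-byte character otherwise.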

Constructors

Compact 

Instances