text-1.0.0.0: An efficient packed Unicode text type.

Portability: portable
Stability: experimental
Maintainer: bos@serpentine.com, rtomharper@googlemail.com, duncan@haskell.org
Safe Haskell: Trustworthy

Data.Text.Lazy.Encoding

Description

Functions for converting lazy Text values to and from lazy ByteString, using several standard encodings.

To gain access to a much larger variety of encodings, use the text-icu package: http://hackage.haskell.org/package/text-icu

Decoding ByteStrings to Text

All of the single-parameter functions for decoding bytestrings encoded in one of the Unicode Transformation Formats (UTF) operate in a strict mode: each will throw an exception if given invalid input.

Each function has a variant, whose name is suffixed with -With, that gives greater control over the handling of decoding errors. For instance, decodeUtf8 will throw an exception, but decodeUtf8With allows the programmer to determine what to do on a decoding error.
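The difference between the two families can be sketched as follows. This is a minimal illustration, not part of the library: it assumes the lenientDecode handler from Data.Text.Encoding.Error, which substitutes U+FFFD for invalid input, and the sample byte values are made up.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Text.Encoding.Error (lenientDecode)
import Data.Text.Lazy.Encoding (decodeUtf8, decodeUtf8With)

-- Illustrative inputs (hypothetical sample data).
valid, invalid :: BL.ByteString
valid   = BL.pack [0x68, 0x69]        -- the ASCII bytes for "hi"
invalid = BL.pack [0x68, 0xFF, 0x69]  -- 0xFF never occurs in well-formed UTF-8

main :: IO ()
main = do
  -- Strict variant: fine on valid input.
  print (decodeUtf8 valid)
  -- -With variant: the handler decides what happens at the bad byte.
  print (decodeUtf8With lenientDecode invalid)
  -- 'decodeUtf8 invalid' would throw an exception here instead.
```

Using print rather than writing the Text directly keeps the output independent of the terminal's encoding, since Show escapes non-ASCII characters.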

decodeASCII :: ByteString -> Text

Deprecated: use decodeUtf8 or decodeLatin1 instead. Both accept any valid 7-bit ASCII input.

Decode a ByteString containing 7-bit ASCII encoded text.

decodeLatin1 :: ByteString -> Text

Decode a ByteString containing Latin-1 (aka ISO-8859-1) encoded text.
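As a small sketch of what this means in practice: every Latin-1 byte maps directly to the Unicode code point with the same number, so decoding cannot fail. The byte values below are an illustrative sample.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeLatin1)

-- The bytes for "caf" followed by 0xE9, which Latin-1 assigns to U+00E9
-- (LATIN SMALL LETTER E WITH ACUTE).
cafe :: TL.Text
cafe = decodeLatin1 (BL.pack [0x63, 0x61, 0x66, 0xE9])

main :: IO ()
main = print (cafe == TL.pack "caf\233")  -- prints True
```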

decodeUtf8 :: ByteString -> Text

Decode a ByteString containing UTF-8 encoded text that is known to be valid.

If the input contains any invalid UTF-8 data, an exception will be thrown that cannot be caught in pure code. For more control over the handling of invalid data, use decodeUtf8' or decodeUtf8With.
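Because the exception cannot be caught in pure code, it has to be handled in IO. The following is a minimal sketch under that assumption, using try and evaluate from Control.Exception; forcing the length of the result is just one way to make sure the whole input is decoded inside the try.

```haskell
import Control.Exception (evaluate, try)
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (UnicodeException)
import Data.Text.Lazy.Encoding (decodeUtf8)

main :: IO ()
main = do
  -- 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte.
  let bad = BL.pack [0xC3, 0x28]
  -- Decoding is lazy, so force the full result while 'try' is active.
  result <- try (evaluate (TL.length (decodeUtf8 bad)))
  case result of
    Left e  -> putStrLn ("decode failed: " ++ show (e :: UnicodeException))
    Right n -> putStrLn ("decoded " ++ show n ++ " characters")
```

If the result were only partially forced, the exception could instead surface later, outside the try, wherever the rest of the text is consumed.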

decodeUtf16LE :: ByteString -> Text

Decode text from little endian UTF-16 encoding.

If the input contains any invalid little endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16LEWith.

decodeUtf16BE :: ByteString -> Text

Decode text from big endian UTF-16 encoding.

If the input contains any invalid big endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16BEWith.
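To illustrate why the byte order matters, the sketch below decodes the same two bytes with both UTF-16 decoders. The input is an illustrative sample chosen so that both interpretations are valid.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeUtf16BE, decodeUtf16LE)

-- One 16-bit code unit, stored as the bytes 0x41 then 0x00.
bytes :: BL.ByteString
bytes = BL.pack [0x41, 0x00]

main :: IO ()
main = do
  -- Little endian: low byte first, so the code unit is 0x0041 ('A').
  print (decodeUtf16LE bytes == TL.pack "A")       -- prints True
  -- Big endian: high byte first, so the code unit is 0x4100.
  print (decodeUtf16BE bytes == TL.pack "\x4100")  -- prints True
```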

decodeUtf32LE :: ByteString -> Text

Decode text from little endian UTF-32 encoding.

If the input contains any invalid little endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32LEWith.

decodeUtf32BE :: ByteString -> Text

Decode text from big endian UTF-32 encoding.

If the input contains any invalid big endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32BEWith.

Catchable failure

decodeUtf8' :: ByteString -> Either UnicodeException Text

Decode a ByteString containing UTF-8 encoded text.

If the input contains any invalid UTF-8 data, the relevant exception will be returned, otherwise the decoded text.

Note: this function is not lazy, as it must decode its entire input before it can return a result. If you need lazy (streaming) decoding, use decodeUtf8With in lenient mode.
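A minimal sketch of handling the Either result in pure code follows. The helper name describe and the sample inputs are hypothetical, chosen only for illustration.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeUtf8')

-- Hypothetical helper: summarise the outcome of decoding as a String.
describe :: BL.ByteString -> String
describe bytes =
  case decodeUtf8' bytes of
    Left err -> "invalid UTF-8: " ++ show err
    Right t  -> "decoded " ++ show (TL.length t) ++ " character(s)"

main :: IO ()
main = do
  putStrLn (describe (BL.pack [0xE2, 0x82, 0xAC]))  -- U+20AC, one character
  putStrLn (describe (BL.pack [0xC3, 0x28]))        -- malformed sequence
```

No IO is needed to detect the failure here, at the cost of decoding the whole input before any result is available.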

Controllable error handling

decodeUtf8With :: OnDecodeError -> ByteString -> Text

Decode a ByteString containing UTF-8 encoded text.
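The handler argument can be sketched as below. This assumes the definitions in Data.Text.Encoding.Error, where OnDecodeError is a function from an error description and the offending byte (if any) to an optional replacement character, and where replace and ignore are ready-made handlers. The name questionMark and the input bytes are hypothetical.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Text.Encoding.Error (OnDecodeError, ignore, replace)
import Data.Text.Lazy.Encoding (decodeUtf8With)

-- Hypothetical hand-written handler: disregard the description and the
-- offending byte, and always substitute a question mark.
questionMark :: OnDecodeError
questionMark _description _badByte = Just '?'

-- 'a', one invalid byte, then 'b' (illustrative sample data).
input :: BL.ByteString
input = BL.pack [0x61, 0xFF, 0x62]

main :: IO ()
main = do
  print (decodeUtf8With questionMark input)   -- substitutes '?'
  print (decodeUtf8With (replace '*') input)  -- substitutes '*'
  print (decodeUtf8With ignore input)         -- drops the invalid byte
```

Returning Nothing from a handler, as ignore does, omits the invalid input from the output rather than replacing it.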

decodeUtf16LEWith :: OnDecodeError -> ByteString -> Text

Decode text from little endian UTF-16 encoding.

decodeUtf16BEWith :: OnDecodeError -> ByteString -> Text

Decode text from big endian UTF-16 encoding.

decodeUtf32LEWith :: OnDecodeError -> ByteString -> Text

Decode text from little endian UTF-32 encoding.

decodeUtf32BEWith :: OnDecodeError -> ByteString -> Text

Decode text from big endian UTF-32 encoding.

Encoding Text to ByteStrings

encodeUtf16LE :: Text -> ByteString

Encode text using little endian UTF-16 encoding.

encodeUtf16BE :: Text -> ByteString

Encode text using big endian UTF-16 encoding.

encodeUtf32LE :: Text -> ByteString

Encode text using little endian UTF-32 encoding.

encodeUtf32BE :: Text -> ByteString

Encode text using big endian UTF-32 encoding.
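As a short sketch of how the encoders pair with the decoders above, the example below encodes a sample value, shows the resulting bytes, and checks that decoding with the matching function returns the original text. The sample string is illustrative.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding
  (decodeUtf16LE, decodeUtf32BE, encodeUtf16LE, encodeUtf32BE)

sample :: TL.Text
sample = TL.pack "hi"

main :: IO ()
main = do
  -- Two bytes per character, low byte first.
  print (BL.unpack (encodeUtf16LE sample))  -- [104,0,105,0]
  -- Four bytes per character, high byte first.
  print (BL.unpack (encodeUtf32BE sample))  -- [0,0,0,104,0,0,0,105]
  -- Decoding with the matching decoder recovers the original text.
  print (decodeUtf16LE (encodeUtf16LE sample) == sample)  -- True
  print (decodeUtf32BE (encodeUtf32BE sample) == sample)  -- True
```

The encoders name the byte order explicitly, so the reader of the bytes must be told which decoder to use.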