encoding-io-0.0.1: Encoding-aware file I/O.

System.IO.Encoding

Description

This module provides encoding-aware file I/O operations. They are lifted and operate with any textual instance of IOData. A strict read operation is also provided.

Synopsis

# Convenient re-exports

data TextEncoding :: *

A TextEncoding is a specification of a conversion scheme between sequences of bytes and sequences of Unicode characters.

For example, UTF-8 is an encoding of Unicode characters into a sequence of bytes. The TextEncoding for UTF-8 is utf8.
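To illustrate the byte-to-character conversion a TextEncoding describes, here is a minimal sketch that writes raw bytes to a file and decodes them with a chosen encoding. It uses the handle API from base's System.IO rather than this module's lifted operations, whose signatures are not shown in this section. The helper name decodeBytesWith and the file path passed to it are hypothetical.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: write raw bytes (each Char must be below '\256')
-- in binary mode, then decode the file with the given TextEncoding.
decodeBytesWith :: TextEncoding -> FilePath -> String -> IO String
decodeBytesWith enc path rawBytes = do
  withBinaryFile path WriteMode $ \h -> hPutStr h rawBytes
  withFile path ReadMode $ \h -> do
    hSetEncoding h enc
    text <- hGetContents h
    _ <- evaluate (length text)  -- force the lazy read before the handle closes
    return text
```

For instance, decodeBytesWith utf8 "demo.bin" "h\195\169" should yield "h\233": the two bytes 0xC3 0xA9 decode to the single character U+00E9.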

Instances

Methods

showList :: [TextEncoding] -> ShowS

The Latin1 (ISO8859-1) encoding. This encoding maps bytes directly to the first 256 Unicode code points, and is thus not a complete Unicode encoding. An attempt to write a character greater than '\255' to a Handle using the latin1 encoding will result in an error.
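The failure mode described above can be observed with a small sketch. It assumes the latin1 value exported by base's System.IO and catches the resulting IOException. The helper name writeLatin1 is hypothetical.

```haskell
import Control.Exception (IOException, try)
import System.IO

-- Hypothetical helper: write a string through a latin1 handle and report
-- any I/O error instead of letting it propagate.
writeLatin1 :: FilePath -> String -> IO (Either IOException ())
writeLatin1 path str = try $ withFile path WriteMode $ \h -> do
  hSetEncoding h latin1
  hPutStr h str
```

Writing "caf\233" should succeed because every character is at most '\255', while writing "\955" (Greek lambda) should return a Left value.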

The UTF-8 Unicode encoding.

The UTF-8 Unicode encoding, with a byte-order-mark (BOM; the byte sequence 0xEF 0xBB 0xBF). This encoding behaves like utf8, except that on input, the BOM sequence is ignored at the beginning of the stream, and on output, the BOM sequence is prepended.

The byte-order-mark is strictly unnecessary in UTF-8, but is sometimes used to identify the encoding of a file.
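The BOM handling described above can be sketched as follows, assuming the utf8 and utf8_bom values from base's System.IO. The file is written with utf8_bom and then read back twice: once with utf8_bom, which skips the BOM, and once with plain utf8, which decodes the BOM bytes as the character U+FEFF. The names readWith and bomDemo are hypothetical.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a whole file strictly with the given encoding.
readWith :: TextEncoding -> FilePath -> IO String
readWith enc path = withFile path ReadMode $ \h -> do
  hSetEncoding h enc
  s <- hGetContents h
  _ <- evaluate (length s)  -- force the lazy read before the handle closes
  return s

-- Write "hi" with a BOM, then decode it with and without BOM handling.
bomDemo :: FilePath -> IO (String, String)
bomDemo path = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8_bom
    hPutStr h "hi"
  skipped <- readWith utf8_bom path
  kept <- readWith utf8 path
  return (skipped, kept)
```

The expected result is ("hi", "\65279hi"), where '\65279' is U+FEFF.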

The UTF-16 Unicode encoding (a byte-order-mark should be used to indicate endianness).

The UTF-16 Unicode encoding (big-endian).

The UTF-16 Unicode encoding (little-endian).

The UTF-32 Unicode encoding (a byte-order-mark should be used to indicate endianness).

The UTF-32 Unicode encoding (big-endian).

The UTF-32 Unicode encoding (little-endian).
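To make the endianness variants concrete, the following sketch writes a string with a given encoding and returns the raw bytes that end up in the file. It assumes the utf16be, utf16le, utf32be and utf32le values from base's System.IO. The helper name encodedBytes is hypothetical.

```haskell
import Control.Exception (evaluate)
import Data.Char (ord)
import System.IO

-- Hypothetical helper: encode a string to a file, then read the file back
-- in binary mode so that each Char corresponds to exactly one byte.
encodedBytes :: TextEncoding -> FilePath -> String -> IO [Int]
encodedBytes enc path str = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h enc
    hPutStr h str
  withBinaryFile path ReadMode $ \h -> do
    raw <- hGetContents h
    _ <- evaluate (length raw)  -- force the lazy read before the handle closes
    return (map ord raw)
```

For the single character 'A' (U+0041), utf16be should give [0,65] and utf16le [65,0]. Likewise, utf32be should give [0,0,0,65] and utf32le [65,0,0,0].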

The Unicode encoding of the current locale.

This is the initial locale encoding: if it has been subsequently changed by setLocaleEncoding this value will not reflect that change.
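The difference between the initial value and the current setting can be sketched with getLocaleEncoding and setLocaleEncoding from GHC.IO.Encoding in base. The function name localeDemo is hypothetical, and the sketch restores the original setting before returning.

```haskell
import GHC.IO.Encoding (getLocaleEncoding, setLocaleEncoding)
import System.IO (localeEncoding, utf16le)

-- Returns the name of localeEncoding before and after a change, plus the
-- name reported by getLocaleEncoding after the change.
localeDemo :: IO (String, String, String)
localeDemo = do
  original <- getLocaleEncoding
  let before = show localeEncoding
  setLocaleEncoding utf16le
  current <- getLocaleEncoding
  let after = show localeEncoding  -- still the initial value
  setLocaleEncoding original       -- restore the previous setting
  return (before, after, show current)
```

The first two components should be equal, since localeEncoding is a constant, while the third should match show utf16le.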

An encoding in which Unicode code points are translated to bytes by taking the code point modulo 256. When decoding, bytes are translated directly into the equivalent code point.

This encoding never fails in either direction. However, encoding discards information, so encode followed by decode is not the identity.

Since: 4.4.0.0
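The information loss of the modulo-256 encoding described above, which base's System.IO exports as char8, can be seen in a short round trip. The helper name roundTripChar8 is hypothetical.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: encode a string with char8, then decode the
-- resulting bytes with char8 again.
roundTripChar8 :: FilePath -> String -> IO String
roundTripChar8 path str = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h char8
    hPutStr h str
  withFile path ReadMode $ \h -> do
    hSetEncoding h char8
    s <- hGetContents h
    _ <- evaluate (length s)  -- force the lazy read before the handle closes
    return s
```

With the input "a\955", the result should be "a\187": 'a' survives unchanged, but U+03BB (955) is reduced to 955 mod 256 = 187 and decodes to a different character.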