Portability | GHC |
---|---|
Stability | experimental |
Maintainer | bos@serpentine.com, rtharper@aftereternity.co.uk, duncan@haskell.org |
Types and functions for dealing with encoding and decoding errors in Unicode text.
The standard functions for encoding and decoding text are strict,
which is to say that they throw exceptions on invalid input. This
is often unhelpful on real world input, so alternative functions
exist that accept custom handlers for dealing with invalid inputs.
These OnError handlers are normal Haskell functions. You can use
one of the presupplied functions in this module, or you can write a
custom handler of your own.
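As a minimal sketch of both options, the block below passes a presupplied handler and a hand-written one to decodeUtf8With from Data.Text.Encoding. The names decodePresupplied, questionMark and decodeCustom are hypothetical and exist only for illustration.

```haskell
import qualified Data.ByteString as B
import Data.Text (Text)
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (OnDecodeError, replace)

-- Presupplied handler: substitute '?' for every invalid byte.
decodePresupplied :: B.ByteString -> Text
decodePresupplied = decodeUtf8With (replace '?')

-- Hypothetical hand-written handler with the same behaviour.
-- It ignores both the error description and the offending byte.
questionMark :: OnDecodeError
questionMark _description _input = Just '?'

decodeCustom :: B.ByteString -> Text
decodeCustom = decodeUtf8With questionMark
```

Because a handler is an ordinary function, it can be defined inline, partially applied, or shared between call sites like any other value.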
- data UnicodeException
- = DecodeError String (Maybe Word8)
- | EncodeError String (Maybe Char)
- type OnError a b = String -> Maybe a -> Maybe b
- type OnDecodeError = OnError Word8 Char
- type OnEncodeError = OnError Char Word8
- lenientDecode :: OnError Word8 Char
- strictDecode :: OnError Word8 Char
- strictEncode :: OnError Char Word8
- ignore :: OnError a b
- replace :: b -> OnError a b
Error handling types
data UnicodeException
An exception type for representing Unicode encoding errors.
DecodeError String (Maybe Word8) | Could not decode a byte sequence because it was invalid under the given encoding, or ran out of input in mid-decode. |
EncodeError String (Maybe Char) | Tried to encode a character that could not be represented under the given encoding, or ran out of input in mid-encode. |
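The following sketch shows one way the constructor fields might be inspected. It assumes a version of the text package that provides decodeUtf8' in Data.Text.Encoding, which returns the exception in an Either instead of throwing it. The helpers report and checkUtf8 are hypothetical names.

```haskell
import qualified Data.ByteString as B
import Data.Text.Encoding (decodeUtf8')
import Data.Text.Encoding.Error (UnicodeException (..))

-- Hypothetical helper: turn a coding failure into a readable report.
report :: UnicodeException -> String
report (DecodeError description (Just byte)) =
  "cannot decode byte " ++ show byte ++ ": " ++ description
report (DecodeError description Nothing) =
  "input ended or offending byte unknown: " ++ description
report other = show other -- any other constructor, e.g. EncodeError

-- Validate a byte string as UTF-8, keeping only the error report.
checkUtf8 :: B.ByteString -> Either String ()
checkUtf8 bytes = case decodeUtf8' bytes of
  Left err -> Left (report err)
  Right _ -> Right ()
```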
type OnError a b = String -> Maybe a -> Maybe b
Function type for handling a coding error. It is supplied with two inputs:
- A String that describes the error.
- The input value that caused the error. If the error arose
because the end of input was reached, or the offending input could
not be identified precisely, this value will be Nothing.
If the handler returns a value wrapped with Just, that value will
be used in the output as the replacement for the invalid input. If
it returns Nothing, no value will be used in the output.
Should the handler need to abort processing, it should use error
or throw an exception (preferably a UnicodeException). It may
use the description provided to construct a more helpful error
report.
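The three outcomes described above (replace, skip, abort) can be sketched in a single handler. The handler name, the choice of which bytes to skip, and the "aborting: " prefix are all assumptions made for illustration, not part of this module.

```haskell
import Control.Exception (throw)
import qualified Data.ByteString as B
import Data.Text (Text)
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (OnDecodeError, UnicodeException (DecodeError))

-- Hypothetical handler exercising all three outcomes.
skipReplaceOrAbort :: OnDecodeError
-- Returning Nothing: the invalid byte 0xFF contributes nothing to the output.
skipReplaceOrAbort _ (Just 0xFF) = Nothing
-- Returning Just: any other invalid byte is replaced with a space.
skipReplaceOrAbort _ (Just _) = Just ' '
-- No offending byte was identified: abort, reusing the description.
skipReplaceOrAbort description Nothing =
  throw (DecodeError ("aborting: " ++ description) Nothing)

decodeTolerant :: B.ByteString -> Text
decodeTolerant = decodeUtf8With skipReplaceOrAbort
```

Throwing a UnicodeException rather than calling error lets callers catch decoding failures by type.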
type OnDecodeError = OnError Word8 Char
type OnEncodeError = OnError Char Word8
Useful error handling functions
lenientDecode :: OnError Word8 Char
Replace an invalid input byte with the Unicode replacement character U+FFFD.
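A brief usage sketch, assuming decodeUtf8With from Data.Text.Encoding as the decoding function. The name decodeLenient is hypothetical.

```haskell
import qualified Data.ByteString as B
import Data.Text (Text)
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Decode UTF-8 without throwing; each invalid byte becomes U+FFFD.
decodeLenient :: B.ByteString -> Text
decodeLenient = decodeUtf8With lenientDecode
```

Note that the substitution loses information: the original invalid bytes cannot be recovered from the decoded text.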
strictDecode :: OnError Word8 Char
Throw a UnicodeException if decoding fails.
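Since the exception is thrown from pure code, it surfaces only when the result is forced. One possible way to observe it, sketched here with try and evaluate from Control.Exception, is shown below. The name tryDecode is hypothetical.

```haskell
import Control.Exception (evaluate, try)
import qualified Data.ByteString as B
import Data.Text (Text)
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (UnicodeException, strictDecode)

-- Force the decoded value inside IO so that a UnicodeException thrown by
-- strictDecode is caught here rather than at some later use site.
tryDecode :: B.ByteString -> IO (Either UnicodeException Text)
tryDecode = try . evaluate . decodeUtf8With strictDecode
```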
strictEncode :: OnError Char Word8
Throw a UnicodeException if encoding fails.