zenacy-unicode: Unicode utilities for Haskell

Zenacy Unicode includes tools for checking byte order marks (BOM) and cleaning data to remove invalid bytes. These tools can help ensure that data pulled from the web can be parsed and converted to text.
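To make the byte order mark check concrete, here is a minimal sketch of BOM detection written against the bytestring package only. The names `Bom`, `signatures`, and `stripBom` are hypothetical and chosen for this illustration; they are not the package's own API, and the package's actual implementation may differ. The sketch tries the longer signatures first, because the UTF-32 LE mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).

```haskell
import           Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- Hypothetical type for this sketch; not the package's BOM type.
data Bom = Utf8 | Utf16BE | Utf16LE | Utf32BE | Utf32LE
  deriving (Eq, Show)

-- BOM byte signatures, longest first, so that the UTF-32 LE mark
-- (FF FE 00 00) is tried before its UTF-16 LE prefix (FF FE).
signatures :: [(Bom, ByteString)]
signatures =
  [ (Utf32BE, B.pack [0x00, 0x00, 0xFE, 0xFF])
  , (Utf32LE, B.pack [0xFF, 0xFE, 0x00, 0x00])
  , (Utf8,    B.pack [0xEF, 0xBB, 0xBF])
  , (Utf16BE, B.pack [0xFE, 0xFF])
  , (Utf16LE, B.pack [0xFF, 0xFE])
  ]

-- Return the detected mark (if any) and the input with the mark removed.
-- Input without a recognised mark is returned unchanged.
stripBom :: ByteString -> (Maybe Bom, ByteString)
stripBom b =
  case [ (m, B.drop (B.length sig) b)
       | (m, sig) <- signatures
       , sig `B.isPrefixOf` b
       ] of
    (m, rest) : _ -> (Just m, rest)
    []            -> (Nothing, b)
```

One design caveat: FF FE 00 00 is inherently ambiguous, since it could also be a UTF-16 LE mark followed by a NUL character. This sketch resolves the ambiguity in favour of UTF-32 LE, which is an assumption rather than a rule.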

Properties

Versions	1.0.0, 1.0.1, 1.0.2, 1.0.3
Change log	CHANGES.md
Dependencies	base (>=4 && <5), bytestring (>=0.10.6.0 && <0.11), vector (>=0.11 && <0.13), word8 (>=0.1.2 && <0.2)
License	MIT
Copyright	Copyright (C) 2015-2020 Michael P Williams
Author	Michael Williams <mlcfp@icloud.com>
Maintainer	Michael Williams <mlcfp@icloud.com>
Category	Web
Home page	https://github.com/mlcfp/zenacy-unicode
Source repo	head: git clone https://github.com/mlcfp/zenacy-unicode.git
Uploaded	by mlcfp at 2020-08-28T14:11:12Z

Modules

Zenacy
- Zenacy.Unicode

Downloads

zenacy-unicode-1.0.0.tar.gz (Cabal source package)
Package description (as included in the package)

Readme for zenacy-unicode-1.0.0

Zenacy Unicode

The following example converts data of uncertain encoding to Text.

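import Data.ByteString (ByteString)
import Data.Text (Text)
import qualified Data.Text.Encoding as T
import Zenacy.Unicode

-- Strip any byte order mark, then decode according to the mark found.
-- Only the UTF-8 paths are cleaned first; the UTF-16 and UTF-32
-- decoders throw an exception on malformed input.
textDecode :: ByteString -> Text
textDecode b =
  case bomStrip b of
    (Nothing, s)           -> T.decodeUtf8 $ unicodeCleanUTF8 s -- No BOM: assume UTF-8
    (Just BOM_UTF8, s)     -> T.decodeUtf8 $ unicodeCleanUTF8 s
    (Just BOM_UTF16_BE, s) -> T.decodeUtf16BE s
    (Just BOM_UTF16_LE, s) -> T.decodeUtf16LE s
    (Just BOM_UTF32_BE, s) -> T.decodeUtf32BE s
    (Just BOM_UTF32_LE, s) -> T.decodeUtf32LE s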
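The UTF-16 and UTF-32 decoders used above come from the text package and throw an exception when the input is malformed. If that is a concern, one possible alternative (not part of this package, and only a sketch) is to use the `...With` variants from `Data.Text.Encoding` together with `lenientDecode`, which substitutes U+FFFD for invalid input instead of failing. The function names `decodeUtf8Lenient` and `decodeUtf16LELenient` are hypothetical helpers introduced for this illustration.

```haskell
import           Data.ByteString (ByteString)
import qualified Data.ByteString as B
import           Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import           Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical helper: decode UTF-8, replacing invalid bytes with
-- U+FFFD rather than throwing an exception.
decodeUtf8Lenient :: ByteString -> Text
decodeUtf8Lenient = TE.decodeUtf8With lenientDecode

-- Hypothetical helper: the same idea for little-endian UTF-16.
-- decodeUtf16BEWith, decodeUtf32LEWith, and decodeUtf32BEWith can be
-- used in the same way for the remaining branches.
decodeUtf16LELenient :: ByteString -> Text
decodeUtf16LELenient = TE.decodeUtf16LEWith lenientDecode
```

Whether replacing invalid input with U+FFFD or removing it beforehand is preferable depends on the application; replacement keeps a visible trace of where the data was damaged.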