The unicode-transforms package

[Tags:benchmark, bsd3, library, test]

Fast Unicode 9.0 normalization in Haskell (NFC, NFKC, NFD, NFKD).



Versions 0.2.0, 0.2.1, 0.3.0
Change log
Dependencies base (>=4.7 && <5), bitarray (>=0.0.1 && <0.1), bytestring (>=0.9 && <0.11), text (>=1.1.1 && <1.3) [details]
License BSD3
Copyright 2016 Harendra Kumar, 2014–2015 Antonio Nikishaev
Author Harendra Kumar
Category Data, Text, Unicode
Home page
Bug tracker
Source repository head: git clone
Uploaded Sun Feb 12 17:37:41 UTC 2017 by harendra
Distributions Arch:0.2.1, LTSHaskell:, NixOS:0.3.0, Stackage:0.3.0, Tumbleweed:0.2.1
Downloads 677 total (183 in the last 30 days)
Status Docs available [build log]
Last success reported on 2017-02-12 [all 1 reports]




Flags

Name      Description                                      Default   Type
dev       Developer build                                  Disabled  Manual
has-icu   Use text-icu for benchmark and test comparisons  Disabled  Manual
has-llvm  Use llvm backend (faster) for compilation        Disabled  Manual

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.


Readme for unicode-transforms-0.3.0

Unicode Transforms


Fast Unicode 9.0 normalization in Haskell (NFC, NFKC, NFD, NFKD).

What is normalization?

Unicode characters with adornments (e.g. Á) can be represented in two different forms: as a single precomposed character (U+00C1 = Á) or as a base character followed by a combining mark (U+0041 (A) U+0301 ( ́ ) = Á). The two forms are encoded as different byte sequences, but to a human reader they look exactly the same.
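To make the difference in encoding concrete, here is a minimal sketch that prints the UTF-8 bytes of both forms. It uses only the text and bytestring packages; `utf8Bytes` is a hypothetical helper written for this illustration and is not part of unicode-transforms.

```haskell
module Main where

import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)
import Data.Word (Word8)

-- | Hypothetical helper: the UTF-8 encoding of a string as a list of bytes.
utf8Bytes :: String -> [Word8]
utf8Bytes = B.unpack . encodeUtf8 . T.pack

main :: IO ()
main = do
  -- Precomposed form: U+00C1
  print (utf8Bytes "\193")                          -- [195,129]
  -- Decomposed form: U+0041 followed by U+0301
  print (utf8Bytes "\65\769")                       -- [65,204,129]
  -- Same visual appearance, different bytes
  print (utf8Bytes "\193" == utf8Bytes "\65\769")   -- False
```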

A plain byte-wise comparison may report that two strings are different even though they are equivalent. To compare them for equivalence, we first need to convert both strings into the same normalized form using the Unicode Character Database. For example:

>> :set -XOverloadedStrings
>> import Data.Text.Normalize
>> normalize NFC "\193" == normalize NFC "\65\769"
True
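The same comparison can be packaged as a small standalone program. This is only a sketch: it assumes the `normalize` function and the `NFC` and `NFD` constructors exported by `Data.Text.Normalize`, as used in the session above, and `canonicallyEqual` is a hypothetical helper name that the package does not provide.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Normalize (NormalizationMode (NFC, NFD), normalize)

-- | Hypothetical helper (not part of unicode-transforms): compare two
-- texts after bringing both into NFC.
canonicallyEqual :: Text -> Text -> Bool
canonicallyEqual a b = normalize NFC a == normalize NFC b

main :: IO ()
main = do
  let composed   = "\193"    :: Text  -- U+00C1
      decomposed = "\65\769" :: Text  -- U+0041 U+0301
  print (composed == decomposed)                 -- False: raw code points differ
  print (canonicallyEqual composed decomposed)   -- True: equal after NFC
  print (T.length (normalize NFD composed))      -- 2: base letter plus combining mark
  print (T.length (normalize NFC decomposed))    -- 1: single precomposed character
```

Normalizing both sides to the same form is what matters here; NFD would work equally well for an equality check, as long as both arguments use the same mode.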


Please use the project's source repository to raise issues or send pull requests.