The unicode-transforms package

Fast Unicode 8.0 normalization in Haskell (NFC, NFKC, NFD, NFKD).


Properties

Versions: 0.1.0.1, 0.2.0, 0.2.1, 0.2.1, 0.3.0
Change log: Changelog.md
Dependencies: base (>=4.7 && <5), bitarray (>=0.0.1 && <0.1), bytestring (>=0.9 && <0.11), text (>=1.1.1 && <1.3)
License: BSD3
Copyright: 2016 Harendra Kumar, 2014–2015 Antonio Nikishaev
Author: Harendra Kumar
Maintainer: harendra.kumar@gmail.com
Category: Data, Text, Unicode
Home page: http://github.com/harendra-kumar/unicode-transforms
Bug tracker: https://github.com/harendra-kumar/unicode-transforms/issues
Source repository: head: git clone https://github.com/harendra-kumar/unicode-transforms
Uploaded: Sun Jan 22 14:19:35 UTC 2017 by harendra

Flags

Name      Description                                       Default   Type
dev       Developer build                                   Disabled  Manual
has-icu   Use text-icu for benchmark and test comparisons   Disabled  Manual
has-llvm  Use llvm backend (faster) for compilation         Disabled  Manual

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.

Readme for unicode-transforms-0.2.1

Unicode Transforms

Fast Unicode 8.0 normalization in Haskell (NFC, NFKC, NFD, NFKD).

What is normalization?

Unicode characters with diacritical marks (e.g. Á) can be represented in two forms: as a single precomposed character (U+00C1 = Á) or as a base character followed by a combining mark (U+0041 (A) + U+0301 (combining acute accent) = Á). The two forms are encoded as different byte sequences, but to a human reader they look exactly the same.
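
To make the difference in encoding concrete, here is a minimal sketch that prints the UTF-8 bytes of both forms. It uses only the text and bytestring packages; the helper name `utf8Bytes` is an illustrative assumption and not part of unicode-transforms.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)
import Data.Word (Word8)

-- | UTF-8 bytes of a string (hypothetical helper, for illustration only).
utf8Bytes :: String -> [Word8]
utf8Bytes = B.unpack . encodeUtf8 . T.pack

-- The same visible character, written in its two forms.
composed, decomposed :: String
composed   = "\x00C1"        -- single precomposed code point
decomposed = "\x0041\x0301"  -- base letter A followed by combining acute accent

main :: IO ()
main = do
  print (utf8Bytes composed)                          -- [195,129]
  print (utf8Bytes decomposed)                        -- [65,204,129]
  print (utf8Bytes composed == utf8Bytes decomposed)  -- False
```

The two byte lists differ in both length and content, which is why comparing the raw strings is not enough.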

A plain byte-wise comparison may report that two strings are different even though they are canonically equivalent. To compare strings for equivalence, we must first convert both to the same normalization form, using data from the Unicode Character Database. For example:

>> :set -XOverloadedStrings
>> import Data.Text.Normalize
>> normalize NFC "\193" == normalize NFC "\65\769"
True
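
The same check can be wrapped in a small program. The sketch below assumes that `Data.Text.Normalize` exports `normalize` together with the `NormalizationMode` constructors, as the session above suggests; the helper `canonicallyEqual` is a hypothetical name introduced here for illustration and is not part of the package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Normalize (NormalizationMode (NFC, NFD), normalize)

-- | Hypothetical helper: treat two strings as equal when their NFC forms match.
canonicallyEqual :: Text -> Text -> Bool
canonicallyEqual a b = normalize NFC a == normalize NFC b

main :: IO ()
main = do
  let composed   = "\193"    :: Text  -- U+00C1
      decomposed = "\65\769" :: Text  -- U+0041 U+0301
  print (composed == decomposed)               -- raw comparison: False
  print (canonicallyEqual composed decomposed) -- True
  print (T.length (normalize NFD composed))    -- 2 code points after decomposition
  print (T.length (normalize NFC decomposed))  -- 1 code point after composition
```

Comparing NFD forms instead of NFC forms would work equally well for this purpose; what matters is that both sides use the same form.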

Contributing

Please use https://github.com/harendra-kumar/unicode-transforms to raise issues or send pull requests.