unicode-transforms: Unicode transforms (normalization NFC/NFD/NFKC/NFKD)

[ bsd3, data, library, text, unicode ]

This is a lightweight library supporting a limited set of Unicode transformations (currently only normalization) on ByteStrings (UTF-8) and Text, without requiring any other system libraries. It is based on the utf8proc C utility, which supports Unicode versions up to 5.1.0.

text-icu is a full-featured alternative for all Unicode operations, but it depends on the system-installed ICU libraries. This package aims to provide an API similar to that of text-icu.

For more details see the README.md file.


Flags

Manual Flags

Name | Description | Default
bench-icu | Use text-icu for benchmark comparison | Disabled

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Versions: 0.1.0.1, 0.2.0, 0.2.1, 0.3.0, 0.3.1, 0.3.2, 0.3.3, 0.3.4, 0.3.5, 0.3.6, 0.3.7, 0.3.7.1, 0.3.8, 0.4.0, 0.4.0.1
Dependencies: base (>=4.7 && <5), bytestring, text (<1.3)
License: BSD-3-Clause
Copyright: 2016 Harendra Kumar
Author: Harendra Kumar
Maintainer: harendra.kumar@gmail.com
Revised: Revision 1 made by sjakobi at 2022-09-08T13:33:52Z
Category: Data, Text, Unicode
Home page: http://github.com/harendra-kumar/unicode-transforms
Source repo (head): git clone https://github.com/harendra-kumar/unicode-transforms
Uploaded: by harendra at 2016-06-20T16:06:49Z
Distributions: Arch:0.4.0.1, Debian:0.3.6, Fedora:0.4.0.1, LTSHaskell:0.4.0.1, NixOS:0.4.0.1, Stackage:0.4.0.1, openSUSE:0.4.0.1
Reverse Dependencies: 11 direct, 181 indirect
Downloads: 37306 total (206 in the last 30 days)
Rating: no votes yet (estimated by Bayesian average)
Status: Docs available
Last success reported on 2016-06-20 (1 report in total)

Readme for unicode-transforms-0.1.0.1

Unicode Transforms

This is a lightweight Haskell library supporting commonly used Unicode transformations (currently only normalization) on ByteStrings (UTF-8) and Text.

The Haskell package text-icu provides a comprehensive set of Unicode transforms. Its drawback is that it requires the ICU library OS packages to be installed first. This package is self-contained and aims to provide an API similar to that of text-icu, so that it can be used as a drop-in replacement for the features it supports.

Features

Unicode normalization in the NFC, NFKC, NFD and NFKD forms is supported. This version of the library supports Unicode versions up to 5.1.0.
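
As a rough illustration of the four forms, the sketch below normalizes a few short Text values. It assumes the package exposes a `Data.Text.Normalize` module exporting `normalize` and a `NormalizationMode` type with constructors `NFC`, `NFD`, `NFKC` and `NFKD`; these names are not given in this README, so check them against the haddock documentation for the version you install. The names `decomposed`, `precomposed` and `ligature` are made up for the example.

```haskell
import qualified Data.Text as T
-- Assumed module and names; verify against the package's haddock docs.
import Data.Text.Normalize (NormalizationMode (..), normalize)

-- "é" written as the letter 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed :: T.Text
decomposed = T.pack "e\x0301"

-- "é" written as the single precomposed code point U+00E9.
precomposed :: T.Text
precomposed = T.pack "\x00E9"

-- U+FB01 LATIN SMALL LIGATURE FI, which has a compatibility decomposition.
ligature :: T.Text
ligature = T.pack "\xFB01"

main :: IO ()
main = do
  -- NFC composes a base letter and combining mark into one code point.
  print (normalize NFC decomposed == precomposed)
  -- NFD splits the precomposed code point back into letter plus mark.
  print (normalize NFD precomposed == decomposed)
  -- NFKC and NFKD also apply compatibility mappings such as ligature -> "fi".
  print (normalize NFKC ligature == T.pack "fi")
  print (normalize NFKD ligature == T.pack "fi")
```

If normalization follows the Unicode standard, each comparison should hold. The canonical forms (NFC, NFD) leave the ligature unchanged, which is the practical difference between them and the compatibility forms.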

Documentation

Please see the haddock documentation available with the package.

Implementation

This package is implemented as bindings to the utf8proc C utility. The utf8proc version bundled with this package is taken from the xqilla project (xqilla version 2.3.2).

In the future, the underlying utf8proc implementation will be replaced by a Haskell implementation supporting the latest Unicode versions.

Please see the NOTES.md file shipped with this package for more details on related packages, missing features and to-do items.

Contributing

Contributions are welcome! Please use the GitHub repository at https://github.com/harendra-kumar/unicode-transforms to raise issues, request features or send pull requests.