unicode-data: Access Unicode character database

unicode-data provides Haskell APIs to efficiently access the Unicode character database. Performance is the primary goal in the design of this package.

The Haskell data structures are generated programmatically from the Unicode character database (UCD) files. The latest Unicode version supported by this library is 14.0.0.


Flags

Manual Flags

Name         Description                        Default
ucd2haskell  Build the ucd2haskell executable   Disabled

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Versions: 0.1.0, 0.1.0.1, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 0.4.0.1
Change log: Changelog.md
Dependencies: base (>=4.7 && <4.18)
License: Apache-2.0
Copyright: 2020 Composewell Technologies and Contributors
Author: Composewell Technologies and Contributors
Maintainer: streamly@composewell.com
Revised: Revision 1 made by adithyaov at 2023-11-29T12:22:43Z
Category: Data, Text, Unicode
Home page: http://github.com/composewell/unicode-data
Bug tracker: https://github.com/composewell/unicode-data/issues
Source repo: head: git clone https://github.com/composewell/unicode-data
Uploaded: by harendra at 2021-11-18T21:09:14Z
Distributions: Arch:0.4.0.1, Fedora:0.3.1, LTSHaskell:0.4.0.1, NixOS:0.4.0.1, Stackage:0.4.0.1, openSUSE:0.4.0.1
Reverse Dependencies: 5 direct, 226 indirect
Executables: ucd2haskell
Downloads: 11539 total (232 in the last 30 days)
Rating: 2.25 (votes: 2) [estimated by Bayesian average]
Status: Docs available
Last success reported on 2021-11-18

Readme for unicode-data-0.2.0

README

This package is far from complete. It currently supports normalization-related functions and a few other properties, primarily to support the unicode-transforms package. More properties can be added as other packages or use cases need them.

Please see the Haddock documentation for the API reference.
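As a rough illustration of the normalization-related functions mentioned above, the sketch below looks up the canonical decomposition, combining class, and composition of a few characters. It assumes that Unicode.Char re-exports DecomposeMode, isDecomposable, decompose, combiningClass, and compose with the signatures used here; check the Haddock documentation for the version you build against. The helper decomposeOnce is hypothetical and not part of the package.

```haskell
-- Assumption: these names are exported by Unicode.Char in the
-- installed version of unicode-data.
import Unicode.Char
    ( DecomposeMode (Canonical)
    , combiningClass
    , compose
    , decompose
    , isDecomposable
    )

-- | Hypothetical helper: apply one canonical decomposition step, or
-- return the character unchanged when it is not decomposable.
decomposeOnce :: Char -> String
decomposeOnce c
    | isDecomposable Canonical c = decompose Canonical c
    | otherwise = [c]

main :: IO ()
main = do
    let eAcute = '\x00E9' -- LATIN SMALL LETTER E WITH ACUTE
        acute  = '\x0301' -- COMBINING ACUTE ACCENT
    -- Canonical decomposition of a precomposed character.
    print (decomposeOnce eAcute)
    -- Canonical combining class of a combining mark.
    print (combiningClass acute)
    -- Try to compose a starter with a combining character.
    print (compose 'e' acute)
```

Guarding decompose with isDecomposable keeps the helper total for characters that have no decomposition mapping.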

Unicode database version update

To update the Unicode version, change the version number in ucd.sh.

To download the Unicode database, run ucd.sh download from the top-level directory of the repo. This fetches the database files into ./ucd.

$ ./ucd.sh download

To generate the Haskell data structure files from the downloaded database files, run ucd.sh generate from the top-level directory of the repo.

$ ./ucd.sh generate

Running property doctests

Temporarily add QuickCheck to the build-depends of the library, then run:

$ cabal build
$ cabal-docspec --check-properties --property-variables c
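For orientation, a property doctest is a "prop>" line inside a Haddock comment. The sketch below uses a hypothetical module and function (Example, roundTrip) that depend only on base; they are not part of unicode-data. The free variable c in the property is assumed to be the one that --property-variables c tells cabal-docspec to quantify over.

```haskell
-- Hypothetical module, for illustration only.
module Example (roundTrip) where

import Data.Char (chr, ord)

-- | Convert a character to its code point and back.
--
-- The line below is a property doctest; @c@ is the quantified variable.
--
-- prop> roundTrip c == c
roundTrip :: Char -> Char
roundTrip = chr . ord
```

Checking the property needs generated values of c, which is presumably why QuickCheck has to be in the library's build-depends while the doctests run.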

Licensing

unicode-data is an open source project available under a liberal Apache-2.0 license.

Contributing

As an open project, we welcome contributions.