charset-0.0: Fast unicode character sets

Portability: portable
Stability: experimental
Maintainer: ekmett@gmail.com

Data.CharSet

Description

Encodes Unicode character sets as arbitrary-precision floating-point values, using the least character in the set as the exponent. This represents reasonably tightly grouped character sets efficiently, but a particularly sparse set may need up to about 136 KiB (139,264 bytes, at one bit per code point).
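The encoding described above can be modelled with nothing but an Integer and an offset. The following is a minimal sketch under that assumption, not the library's actual code: the Model type and the names leastCode, bits, encode and contains are hypothetical and exist only for illustration.

```haskell
import Data.Bits (setBit, testBit)
import Data.Char (ord)

-- Hypothetical model (not the library's type): bit i of 'bits' is set
-- when the character with code point (leastCode + i) is in the set.
data Model = Model
  { leastCode :: Int      -- the "exponent": code point of the least character
  , bits      :: Integer  -- the "mantissa": one bit per character above it
  } deriving (Eq, Show)

-- Build the model from a list of characters.
encode :: String -> Model
encode [] = Model 0 0
encode cs = Model lo (foldr mark 0 cs)
  where
    lo       = minimum (map ord cs)
    mark c m = setBit m (ord c - lo)

-- Membership is a single bit test relative to the least character.
contains :: Char -> Model -> Bool
contains c (Model lo m) = i >= 0 && testBit m i
  where i = ord c - lo

main :: IO ()
main = do
  print (encode "abc")                 -- Model {leastCode = 97, bits = 7}
  print (contains 'b' (encode "abc"))  -- True
  -- Worst case: one bit for every code point from '\0' to maxBound.
  print ((ord maxBound + 1) `div` 8)   -- 139264 (bytes)
```

A tightly grouped set such as "abc" needs only three bits in this model, because the offset absorbs the 97 leading zero bits. A set holding both '\0' and maxBound needs the full range.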

Designed to be imported qualified:

 import Data.CharSet (CharSet)
 import qualified Data.CharSet as CharSet

CharSet

Manipulation

empty :: CharSet

O(1) The empty set. Permits O(1) null and size.

singleton :: Char -> CharSet

O(1) Construct a CharSet with a single element. Permits O(1) null and size.

full :: CharSet

O(d) A CharSet containing every Char.

union :: CharSet -> CharSet -> CharSet

O(d). May force size to take O(d) if the ranges overlap. Preserves the order of null.

intersection :: CharSet -> CharSet -> CharSet

O(d). May force size and null both to take O(d).
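One way to picture why union and intersection are O(d) and why they can invalidate a cached size: both operands must be shifted to a common offset and combined bit by bit, and the element count of the result is not a simple function of the operands' counts when the ranges overlap. The sketch below assumes an offset-plus-Integer model. Model, fromChars, align, unionM, intersectionM and sizeM are hypothetical names, not part of Data.CharSet.

```haskell
import Data.Bits (popCount, setBit, shiftL, (.&.), (.|.))
import Data.Char (ord)

-- Hypothetical model: least code point plus a bit mask relative to it.
data Model = Model Int Integer

fromChars :: String -> Model
fromChars [] = Model 0 0
fromChars cs = Model lo (foldr (\c m -> setBit m (ord c - lo)) 0 cs)
  where lo = minimum (map ord cs)

-- Shift both masks so that bit 0 means the same code point in each.
align :: Model -> Model -> (Int, Integer, Integer)
align (Model a m) (Model b n) = (lo, m `shiftL` (a - lo), n `shiftL` (b - lo))
  where lo = min a b

unionM, intersectionM :: Model -> Model -> Model
unionM x y        = let (lo, m, n) = align x y in Model lo (m .|. n)
intersectionM x y = let (lo, m, n) = align x y in Model lo (m .&. n)

-- Recounting walks the whole mask, which is the O(d) case.
sizeM :: Model -> Int
sizeM (Model _ m) = popCount m

main :: IO ()
main = do
  let a = fromChars "abc"
      b = fromChars "cde"
  print (sizeM (unionM a b))         -- 5, not 3 + 3, because 'c' is shared
  print (sizeM (intersectionM a b))  -- 1
```

This sketch does not renormalise the offset after an intersection, so the result may carry leading zero bits. A real implementation would presumably trim them.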

complement :: CharSet -> CharSet

O(d) Complements a CharSet with respect to the bounds of Char. Preserves the order of null and size.

insert :: Char -> CharSet -> CharSet

O(d) Insert a single Char into the CharSet. Preserves the order of null and size.

delete :: Char -> CharSet -> CharSet

O(d) Delete a single Char from the CharSet. Preserves the order of null and size.

(\\) :: CharSet -> CharSet -> CharSet

O(d). Preserves order of null. May force O(d) size.
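Set difference can be sketched in the same offset-plus-Integer model as a bitwise "and not" over aligned masks. This is an illustration under that assumption only: Model, fromChars, toChars and difference are hypothetical names, not the library's API.

```haskell
import Data.Bits (complement, setBit, shiftL, shiftR, testBit, (.&.))
import Data.Char (chr, ord)

-- Hypothetical model: least code point plus a bit mask relative to it.
data Model = Model Int Integer

fromChars :: String -> Model
fromChars [] = Model 0 0
fromChars cs = Model lo (foldr (\c m -> setBit m (ord c - lo)) 0 cs)
  where lo = minimum (map ord cs)

-- List the members in ascending order by peeling off the low bit.
toChars :: Model -> String
toChars (Model lo m0) = go lo m0
  where
    go _ 0 = []
    go i m
      | testBit m 0 = chr i : rest
      | otherwise   = rest
      where rest = go (i + 1) (m `shiftR` 1)

-- Keep the bits of the first mask that are absent from the second.
difference :: Model -> Model -> Model
difference (Model a m) (Model b n) = Model lo (m' .&. complement n')
  where
    lo = min a b
    m' = m `shiftL` (a - lo)
    n' = n `shiftL` (b - lo)

main :: IO ()
main = putStrLn (toChars (difference (fromChars "abcd") (fromChars "bc")))  -- ad
```

The result's element count depends on how many bits the two masks share, which is why a cached size would have to be recounted after this operation.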

fromList :: String -> CharSet

O(d * n) Make a CharSet from a list of items.

fromDistinctAscList :: String -> CharSet

O(d * n) Make a CharSet from a distinct ascending list of items.
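The O(d * n) bound is what one would expect if construction is a fold of single-element inserts over an immutable Integer: each insert copies a mask of width d, and there are n of them. The sketch below assumes exactly that and uses a plain Integer indexed by absolute code point. Mask, insertM and fromListM are hypothetical names, not the library's functions.

```haskell
import Data.Bits (popCount, setBit)
import Data.Char (ord)

-- Hypothetical model: bit i is set when the code point i is present.
type Mask = Integer

-- Setting one bit builds a new Integer, so each insert costs O(d).
insertM :: Char -> Mask -> Mask
insertM c m = setBit m (ord c)

-- n inserts of O(d) each give O(d * n) overall.
fromListM :: String -> Mask
fromListM = foldr insertM 0

main :: IO ()
main = do
  print (popCount (fromListM "mississippi"))  -- 4 distinct characters
  print (fromListM "abc" == fromListM "cba")  -- True: input order is irrelevant
```

In this model a distinct ascending input produces the same mask as any other ordering of the same characters.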

Accessors

null :: CharSet -> Bool

O(1|d) Is the CharSet empty? May be faster than checking if size == 0 after union. Operations that require a recount are noted.

size :: CharSet -> Int

O(1|d) The number of elements in the CharSet.
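The O(1|d) notation reads naturally as "O(1) when the answer is already known, O(d) when a recount is required". One way to get that behaviour in Haskell is a lazy size field, sketched below. Whether the library caches sizes this way is an assumption. Model, cachedSize, mask, emptyM, singletonM, unionM, sizeM and nullM are hypothetical names.

```haskell
import Data.Bits (popCount, setBit, (.|.))
import Data.Char (ord)

-- Hypothetical model: the size field is lazy, so it holds either a known
-- number or an unevaluated recount that runs only if size is demanded.
data Model = Model
  { cachedSize :: Int      -- lazy on purpose
  , mask       :: Integer  -- bit i set means code point i is present
  }

emptyM :: Model
emptyM = Model 0 0

-- The size of a singleton is known up front.
singletonM :: Char -> Model
singletonM c = Model 1 (setBit 0 (ord c))

-- The operands may overlap, so the size becomes a deferred popCount.
unionM :: Model -> Model -> Model
unionM (Model _ m) (Model _ n) = Model (popCount mn) mn
  where mn = m .|. n

-- Forces the deferred recount at most once per value.
sizeM :: Model -> Int
sizeM = cachedSize

-- Never needs the count, so it stays cheap even after a union.
nullM :: Model -> Bool
nullM = (== 0) . mask

main :: IO ()
main = do
  let s = unionM (singletonM 'a') (unionM (singletonM 'b') (singletonM 'a'))
  print (nullM s)       -- False, without forcing the recount
  print (sizeM s)       -- 2
  print (nullM emptyM)  -- True
```

In this model null never touches the count, which matches the remark that null may be faster than testing size == 0 after a union.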

member :: Char -> CharSet -> Bool

O(1) Test for membership in a CharSet.

elem :: Char -> CharSet -> Bool

O(1) Alias for member.

notElem :: Char -> CharSet -> Bool

O(1) Alias for notMember.

isComplemented :: CharSet -> Bool

O(1) Check whether the set is represented as a complemented CharSet.

toInteger :: CharSet -> Integer

O(d) Convert to an Integer representation. Discards negative elements.
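In an offset-plus-mask model, converting to a plain Integer amounts to shifting the mask left by the offset, so that bit i stands for the code point i. The sketch below illustrates that reading. It is an assumption about what the conversion means rather than the library's definition, and Model, fromChars and toIntegerM are hypothetical names.

```haskell
import Data.Bits (setBit, shiftL, testBit)
import Data.Char (chr, ord)

-- Hypothetical model: least code point plus a bit mask relative to it.
data Model = Model Int Integer

fromChars :: String -> Model
fromChars [] = Model 0 0
fromChars cs = Model lo (foldr (\c m -> setBit m (ord c - lo)) 0 cs)
  where lo = minimum (map ord cs)

-- Undo the offset so that bit i corresponds to the code point i.
-- Code points are never negative, so nothing is discarded in this model.
toIntegerM :: Model -> Integer
toIntegerM (Model lo m) = m `shiftL` lo

main :: IO ()
main = do
  print (toIntegerM (fromChars (map chr [0, 1, 3])))      -- 11 (binary 1011)
  print (testBit (toIntegerM (fromChars "abc")) (ord 'b'))  -- True
```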

Builtins

POSIX

Unicode

Data.Char classifiers