dawg-0.8: Directed acyclic word graphs

Safe HaskellNone

Data.DAWG.Static

Contents

Description

The module implements directed acyclic word graphs (DAWGs) internaly represented as minimal acyclic deterministic finite-state automata.

In comparison to Data.DAWG module the automaton implemented here:

  • Keeps all nodes in one array and therefore uses much less memory,
  • When weighed, it can be used to perform static hashing with hash and unHash functions,
  • Doesn't provide insert/delete family of operations.

Transition backend has to be specified by a type signature. You can import the desired transition type and define your own dictionary construction function.

 import Data.DAWG.Static
 import Data.DAWG.Trans.Map (Trans)

 mkDict :: (Enum a, Ord b) => [([a], b)] -> DAWG Trans a Weight b
 mkDict = weigh . fromList

Synopsis

DAWG type

data DAWG t a b c Source

DAWG t a b c constitutes an automaton with alphabet symbols of type a, transition labels of type b and node values of type Maybe c, implemented on top of the Trans t backend. All nodes are stored in a Vector with positions of nodes corresponding to their IDs.

Instances

(Eq b, Eq c, Unbox b) => Eq (DAWG Trans a b c) 
(Eq (DAWG Trans a b c), Ord b, Ord c, Unbox b) => Ord (DAWG Trans a b c) 
(Show t, Show b, Show c, Unbox b) => Show (DAWG t a b c) 
(Binary t, Binary b, Binary c, Unbox b) => Binary (DAWG t a b c) 

Query

lookup :: (Enum a, Trans t, Unbox b) => [a] -> DAWG t a b c -> Maybe cSource

Find value associated with the key.

numStates :: DAWG t a b c -> IntSource

Number of states in the automaton.

Index

index :: (Enum a, Trans t) => [a] -> DAWG t a Weight c -> Maybe IntSource

Position in a set of all dictionary entries with respect to the lexicographic order.

byIndex :: (Enum a, Trans t) => Int -> DAWG t a Weight c -> Maybe [a]Source

Find dictionary entry given its index with respect to the lexicographic order.

Hash

hash :: (Enum a, Trans t) => [a] -> DAWG t a Weight c -> Maybe IntSource

Perfect hashing function for dictionary entries. A synonym for the index function.

unHash :: (Enum a, Trans t) => Int -> DAWG t a Weight c -> Maybe [a]Source

Inverse of the hash function and a synonym for the byIndex function.

Construction

empty :: (Trans t, Unbox b) => DAWG t a b cSource

Empty DAWG.

fromList :: (Enum a, MkNode t b) => [([a], b)] -> DAWG t a () bSource

Construct DAWG from the list of (word, value) pairs. First a DAWG is created and then it is frozen using the freeze function.

fromListWith :: (Enum a, MkNode t b) => (b -> b -> b) -> [([a], b)] -> DAWG t a () bSource

Construct DAWG from the list of (word, value) pairs with a combining function. The combining function is applied strictly. First a DAWG is created and then it is frozen using the freeze function.

fromLang :: (Enum a, MkNode t ()) => [[a]] -> DAWG t a () ()Source

Make DAWG from the list of words. Annotate each word with the () value. First a DAWG is created and then it is frozen using the freeze function.

freeze :: Trans t => DAWG t a b -> DAWG t a () bSource

Construct immutable version of the automaton.

Weight

type Weight = IntSource

Weight of a node corresponds to the number of final states reachable from the node. Weight of an edge is a sum of weights of preceding nodes outgoing from the same parent node.

weigh :: Trans t => DAWG t a b c -> DAWG t a Weight cSource

Compute node weights and store corresponding values in transition labels.

Conversion

assocs :: (Enum a, Trans t, Unbox b) => DAWG t a b c -> [([a], c)]Source

Return all key/value pairs in the DAWG in ascending key order.

keys :: (Enum a, Trans t, Unbox b) => DAWG t a b c -> [[a]]Source

Return all keys of the DAWG in ascending order.

elems :: (Trans t, Unbox b) => DAWG t a b c -> [c]Source

Return all elements of the DAWG in the ascending order of their keys.