dawg-0.8.2: Directed acyclic word graphs

Safe HaskellNone
LanguageHaskell2010

Data.DAWG.Static

Contents

Description

The module implements directed acyclic word graphs (DAWGs) internaly represented as minimal acyclic deterministic finite-state automata.

In comparison to Data.DAWG module the automaton implemented here:

  • Keeps all nodes in one array and therefore uses much less memory,
  • When weighed, it can be used to perform static hashing with hash and unHash functions,
  • Doesn't provide insert/delete family of operations.

Transition backend has to be specified by a type signature. You can import the desired transition type and define your own dictionary construction function.

import Data.DAWG.Static
import Data.DAWG.Trans.Map (Trans)

mkDict :: (Enum a, Ord b) => [([a], b)] -> DAWG Trans a Weight b
mkDict = weigh . fromList
Synopsis

DAWG type

data DAWG t a b c Source #

DAWG t a b c constitutes an automaton with alphabet symbols of type a, transition labels of type b and node values of type Maybe c, implemented on top of the Trans t backend. All nodes are stored in a Vector with positions of nodes corresponding to their IDs.

Instances
(Eq b, Eq c, Unbox b) => Eq (DAWG Trans a b c) Source # 
Instance details

Defined in Data.DAWG.Static

Methods

(==) :: DAWG Trans a b c -> DAWG Trans a b c -> Bool #

(/=) :: DAWG Trans a b c -> DAWG Trans a b c -> Bool #

(Ord b, Ord c, Unbox b) => Ord (DAWG Trans a b c) Source # 
Instance details

Defined in Data.DAWG.Static

Methods

compare :: DAWG Trans a b c -> DAWG Trans a b c -> Ordering #

(<) :: DAWG Trans a b c -> DAWG Trans a b c -> Bool #

(<=) :: DAWG Trans a b c -> DAWG Trans a b c -> Bool #

(>) :: DAWG Trans a b c -> DAWG Trans a b c -> Bool #

(>=) :: DAWG Trans a b c -> DAWG Trans a b c -> Bool #

max :: DAWG Trans a b c -> DAWG Trans a b c -> DAWG Trans a b c #

min :: DAWG Trans a b c -> DAWG Trans a b c -> DAWG Trans a b c #

(Unbox b, Show t, Show b, Show c) => Show (DAWG t a b c) Source # 
Instance details

Defined in Data.DAWG.Static

Methods

showsPrec :: Int -> DAWG t a b c -> ShowS #

show :: DAWG t a b c -> String #

showList :: [DAWG t a b c] -> ShowS #

(Binary t, Binary b, Binary c, Unbox b) => Binary (DAWG t a b c) Source # 
Instance details

Defined in Data.DAWG.Static

Methods

put :: DAWG t a b c -> Put #

get :: Get (DAWG t a b c) #

putList :: [DAWG t a b c] -> Put #

Query

lookup :: (Enum a, Trans t, Unbox b) => [a] -> DAWG t a b c -> Maybe c Source #

Find value associated with the key.

numStates :: DAWG t a b c -> Int Source #

Number of states in the automaton.

Index

index :: (Enum a, Trans t) => [a] -> DAWG t a Weight c -> Maybe Int Source #

Position in a set of all dictionary entries with respect to the lexicographic order.

byIndex :: (Enum a, Trans t) => Int -> DAWG t a Weight c -> Maybe [a] Source #

Find dictionary entry given its index with respect to the lexicographic order.

Hash

hash :: (Enum a, Trans t) => [a] -> DAWG t a Weight c -> Maybe Int Source #

Perfect hashing function for dictionary entries. A synonym for the index function.

unHash :: (Enum a, Trans t) => Int -> DAWG t a Weight c -> Maybe [a] Source #

Inverse of the hash function and a synonym for the byIndex function.

Construction

empty :: (Trans t, Unbox b) => DAWG t a b c Source #

Empty DAWG.

fromList :: (Enum a, MkNode t b) => [([a], b)] -> DAWG t a () b Source #

Construct DAWG from the list of (word, value) pairs. First a DAWG is created and then it is frozen using the freeze function.

fromListWith :: (Enum a, MkNode t b) => (b -> b -> b) -> [([a], b)] -> DAWG t a () b Source #

Construct DAWG from the list of (word, value) pairs with a combining function. The combining function is applied strictly. First a DAWG is created and then it is frozen using the freeze function.

fromLang :: (Enum a, MkNode t ()) => [[a]] -> DAWG t a () () Source #

Make DAWG from the list of words. Annotate each word with the () value. First a DAWG is created and then it is frozen using the freeze function.

freeze :: Trans t => DAWG t a b -> DAWG t a () b Source #

Construct immutable version of the automaton.

Weight

type Weight = Int Source #

Weight of a node corresponds to the number of final states reachable from the node. Weight of an edge is a sum of weights of preceding nodes outgoing from the same parent node.

weigh :: Trans t => DAWG t a b c -> DAWG t a Weight c Source #

Compute node weights and store corresponding values in transition labels.

Conversion

assocs :: (Enum a, Trans t, Unbox b) => DAWG t a b c -> [([a], c)] Source #

Return all key/value pairs in the DAWG in ascending key order.

keys :: (Enum a, Trans t, Unbox b) => DAWG t a b c -> [[a]] Source #

Return all keys of the DAWG in ascending order.

elems :: (Trans t, Unbox b) => DAWG t a b c -> [c] Source #

Return all elements of the DAWG in the ascending order of their keys.