hist-pl-dawg-0.1.0: A generic, DAWG-based dictionary

Safe HaskellNone

NLP.HistPL.DAWG

Contents

Description

A DAWG-based dictionary with additional information assigned to lexical entries and word forms.

Synopsis

Rule

data Rule Source

A rule for translating a form into another form.

Constructors

Rule 

Fields

cut :: !Int

Number of characters to cut from the end of the form.

suffix :: !Text

A suffix to paste.

Instances

apply :: Rule -> Text -> TextSource

Apply the rule.

between :: Text -> Text -> RuleSource

Determine a rule which translates between two strings.

DAWG

type DAWG i a b = DAWG Char () (Node i a b)Source

A dictionary parametrized over ID i, with info a for every (key, i) pair and info b for every (key, i, apply rule key) triple.

Entry

data Lex i a b Source

A dictionary entry consists of a Key and a Value.

Constructors

Lex 

Fields

lexKey :: Key i

Entry key.

lexVal :: Val a Text b

Entry value.

Instances

(Eq i, Eq a, Eq b) => Eq (Lex i a b) 
(Ord i, Ord a, Ord b) => Ord (Lex i a b) 
(Show i, Show a, Show b) => Show (Lex i a b) 

data Key i Source

A key of a dictionary entry.

Constructors

Key 

Fields

path :: Text

A path of the entry, i.e. DAWG key.

uid :: i

Unique identifier among entries with the same path.

Instances

Eq i => Eq (Key i) 
Ord i => Ord (Key i) 
Show i => Show (Key i) 

data Val a w b Source

A value of the entry.

Constructors

Val 

Fields

info :: a

Additional information assigned to the entry.

forms :: Map w b

A map of forms with additional info of type b assigned. Invariant: in case of a reverse dictionary (from word forms to base forms) this map should contain exactly one element (a base form and a corresonding information).

Instances

(Eq a, Eq w, Eq b) => Eq (Val a w b) 
(Ord a, Ord w, Ord b) => Ord (Val a w b) 
(Show a, Show w, Show b) => Show (Val a w b) 
(Ord w, Binary a, Binary w, Binary b) => Binary (Val a w b) 

type Node i a b = Map i (Val a Rule b)Source

Actual values stored in automaton states contain all entry information but path.

Entry set

type LexSet i a b = Map (Key i) (Val a Text b)Source

A set of dictionary entries.

mkLexSet :: Ord i => [Lex i a b] -> LexSet i a bSource

Make lexical set from a list of entries.

unLexSet :: LexSet i a b -> [Lex i a b]Source

List lexical entries.

decode :: Ord i => Text -> Node i a b -> LexSet i a bSource

Decode dictionary value given path.

Query

lookup :: Ord i => Text -> DAWG i a b -> LexSet i a bSource

Lookup the key in the dictionary.

Conversion

fromList :: (Ord i, Ord a, Ord b) => [(Text, i, a, Text, b)] -> DAWG i a bSource

Make dictionary from a list of (key, ID, entry info, form, entry/form info) tuples.

toList :: (Ord i, Ord a, Ord b) => DAWG i a b -> [(Text, i, a, Text, b)]Source

Transform dictionary back into the list of (key, ID, key/ID info, elem, key/ID/elem info) tuples.

entries :: Ord i => DAWG i a b -> [Lex i a b]Source

List dictionary lexical entries.

revDAWG :: (Ord i, Ord a, Ord b) => DAWG i a b -> DAWG i a bSource

Reverse the dictionary.