nerf-0.5.3: Nerf, the named entity recognition tool based on linear-chain CRFs

Safe Haskell: None
Language: Haskell98

NLP.Nerf.Dict.Base

Description

Basic types for dictionary handling.

Lexicon entry

type NeType = Text

A type of named entity.

type Form = Text

An orthographic form.

isMultiWord :: Form -> Bool

Is the form a multiword one?
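
The page does not say how "multiword" is decided. A minimal sketch, assuming a form counts as multiword when it contains more than one whitespace-separated token; isMultiWordSketch is a hypothetical stand-in, not the library's definition:

```haskell
import qualified Data.Text as T

type Form = T.Text

-- Hypothetical stand-in for isMultiWord.
-- Assumption: "multiword" means more than one whitespace-separated token.
isMultiWordSketch :: Form -> Bool
isMultiWordSketch form = length (T.words form) > 1
```

Under this assumption, "New York" is multiword and "Warszawa" is not.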

data Entry

A Named Entity entry from the LMF dictionary.

Constructors

Entry 

Fields

neOrth :: !Form

Orthographic form of the NE

neType :: !NeType

Type of the NE
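
For illustration, the record can be mirrored locally and an entry built with record syntax. This is a sketch only: the derived instances and the example values are assumptions, not taken from the library.

```haskell
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text

-- Local mirror of the documented record; both fields are strict.
-- The derived Show and Eq instances are an assumption for this sketch.
data Entry = Entry
    { neOrth :: !Form     -- orthographic form of the NE
    , neType :: !NeType   -- type of the NE
    } deriving (Show, Eq)

-- Hypothetical example entry.
exampleEntry :: Entry
exampleEntry = Entry
    { neOrth = T.pack "Jan Kowalski"
    , neType = T.pack "persName"
    }
```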

Dictionary

type Label = Text

Dictionary label.

type DAWG = DAWG Trans Char ()

A Dict is a map from forms to labels. Each form may be annotated with multiple labels. The map is represented as a directed acyclic word graph (DAWG).

type Dict = D.DAWG (S.Set Label)

fromPairs :: [(Form, Label)] -> Dict

Construct a dictionary from a list of form/label pairs.
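
Since a form may carry several labels, pairs that share a form presumably accumulate into one label set. A minimal sketch of that behaviour, using Data.Map as a stand-in for the DAWG; DictSketch and fromPairsSketch are hypothetical names:

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form  = T.Text
type Label = T.Text

-- Map-based stand-in for Dict (the real one is backed by a DAWG).
type DictSketch = M.Map Form (S.Set Label)

-- Assumption: labels of repeated forms are collected into one set.
fromPairsSketch :: [(Form, Label)] -> DictSketch
fromPairsSketch pairs = M.fromListWith S.union
    [ (form, S.singleton label) | (form, label) <- pairs ]
```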

fromEntries :: [Entry] -> Dict

Construct a dictionary from a list of entries.
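
The page does not state which part of an entry becomes the label. A sketch under the assumption that each entry contributes its NE type as the label of its orthographic form; the Entry mirror, DictSketch and fromEntriesSketch are hypothetical:

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text
type Label  = T.Text

-- Map-based stand-in for Dict (the real one is backed by a DAWG).
type DictSketch = M.Map Form (S.Set Label)

-- Local mirror of the documented Entry record.
data Entry = Entry
    { neOrth :: !Form
    , neType :: !NeType
    }

-- Assumption: the NE type of an entry is used as the label of its form.
fromEntriesSketch :: [Entry] -> DictSketch
fromEntriesSketch entries = M.fromListWith S.union
    [ (neOrth e, S.singleton (neType e)) | e <- entries ]
```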

siftDict :: (Form -> Set Label -> Bool) -> Dict -> Dict

Remove dictionary entries which do not satisfy the predicate.
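
On a Map-based stand-in for Dict, this kind of filtering corresponds to filterWithKey. A sketch with hypothetical names (DictSketch, siftDictSketch, ambiguous, exampleDict) and made-up data:

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form  = T.Text
type Label = T.Text

-- Map-based stand-in for Dict (the real one is backed by a DAWG).
type DictSketch = M.Map Form (S.Set Label)

-- Keep only the forms whose (form, labels) pair satisfies the predicate.
siftDictSketch :: (Form -> S.Set Label -> Bool) -> DictSketch -> DictSketch
siftDictSketch = M.filterWithKey

-- Example predicate: keep forms that carry more than one label.
ambiguous :: Form -> S.Set Label -> Bool
ambiguous _ labels = S.size labels > 1

-- Made-up example data.
exampleDict :: DictSketch
exampleDict = M.fromList
    [ (T.pack "Paris", S.fromList [T.pack "city", T.pack "persName"])
    , (T.pack "Oslo",  S.fromList [T.pack "city"])
    ]
```

Here siftDictSketch ambiguous exampleDict keeps only the "Paris" form.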

saveDict :: FilePath -> Dict -> IO ()

Save the dictionary to the file.

loadDict :: FilePath -> IO Dict

Load the dictionary from the file.
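
The on-disk format is not described here. One way to sketch a save/load round trip is binary serialisation with Data.Binary on a Map-based stand-in; the use of Data.Binary, the file name and all names ending in Sketch are assumptions of this example:

```haskell
import Data.Binary (decodeFile, encodeFile)
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form  = T.Text
type Label = T.Text

-- Map-based stand-in for Dict (the real one is backed by a DAWG).
type DictSketch = M.Map Form (S.Set Label)

-- Assumption: the dictionary is stored in a binary encoding.
saveDictSketch :: FilePath -> DictSketch -> IO ()
saveDictSketch = encodeFile

loadDictSketch :: FilePath -> IO DictSketch
loadDictSketch = decodeFile

-- Made-up example data.
exampleDict :: DictSketch
exampleDict = M.fromList
    [ (T.pack "Paris", S.fromList [T.pack "city"]) ]
```

Saving exampleDict and loading it back from the same path should give an equal dictionary.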

Merging dictionaries

merge :: [Dict] -> Dict

Merge dictionary resources.
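
How clashes between dictionaries are resolved is not stated. A sketch assuming that label sets of forms present in several dictionaries are unioned, again with Data.Map standing in for the DAWG; mergeSketch is a hypothetical name:

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form  = T.Text
type Label = T.Text

-- Map-based stand-in for Dict (the real one is backed by a DAWG).
type DictSketch = M.Map Form (S.Set Label)

-- Assumption: a form found in several dictionaries keeps all its labels.
mergeSketch :: [DictSketch] -> DictSketch
mergeSketch = M.unionsWith S.union
```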

diff :: [Dict] -> [Dict]

Differentiate labels from separate dictionaries using dictionary-unique prefixes.
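
The prefix format is not given on this page. A sketch assuming the prefix is the dictionary's position in the input list followed by a hyphen, so that equal labels from different dictionaries become distinct; diffSketch and the "0-", "1-" prefixes are hypothetical:

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form  = T.Text
type Label = T.Text

-- Map-based stand-in for Dict (the real one is backed by a DAWG).
type DictSketch = M.Map Form (S.Set Label)

-- Assumption: the i-th dictionary gets the prefix "<i>-" on every label.
diffSketch :: [DictSketch] -> [DictSketch]
diffSketch dicts =
    [ M.map (S.map (addPrefix i)) dict
    | (i, dict) <- zip [0 :: Int ..] dicts ]
  where
    addPrefix i label = T.append (T.pack (show i ++ "-")) label
```

With two dictionaries that both label "Paris" as "city", the results carry "0-city" and "1-city" respectively, so the labels stay distinguishable after a later merge.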