nerf-0.5.0: Nerf, the named entity recognition tool based on linear-chain CRFs

Safe Haskell: None

NLP.Nerf.Dict.Base

Description

Basic types for dictionary handling.

Lexicon entry

type NeType = Text

A type of named entity.

type Form = Text

An orthographic form.

isMultiWord :: Form -> Bool

Is the form a multiword one?
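
The page does not say how "multiword" is decided. A minimal sketch, assuming a form counts as multiword when it contains more than one whitespace-separated token; the name isMultiWordSketch is hypothetical and is not part of the module:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- Mirrors the documented alias: a form is plain Text.
type Form = T.Text

-- Assumption: "multiword" means more than one whitespace-separated token.
isMultiWordSketch :: Form -> Bool
isMultiWordSketch = (> 1) . length . T.words

main :: IO ()
main = do
  print (isMultiWordSketch "New York")  -- True
  print (isMultiWordSketch "Warsaw")    -- False
```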

data Entry

A Named Entity entry from the LMF dictionary.

Constructors

Entry 

Fields

neOrth :: !Form

Orthographic form of the NE

neType :: !NeType

Type of the NE

Instances

Eq Entry 
Ord Entry 
Read Entry 
Show Entry 

Dictionary

type Label = Text

Dictionary label.

type DAWG = DAWG Trans Char ()

type Dict = DAWG (Set Label)

A Dict is a map from forms to labels. Each form may be annotated with multiple labels. The map is represented as a directed acyclic word graph (DAWG).

fromPairs :: [(Form, Label)] -> Dict

Construct a dictionary from a list of form/label pairs.

fromEntries :: [Entry] -> Dict

Construct a dictionary from a list of entries.
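
To illustrate the "form to set of labels" shape, here is a sketch that uses Data.Map as a stand-in for the DAWG. MapDict, fromPairsSketch and fromEntriesSketch are hypothetical names, and using neType as the label of an entry is an assumption rather than something this page states:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text
type Label  = T.Text

-- Same fields as the documented Entry record.
data Entry = Entry
  { neOrth :: !Form
  , neType :: !NeType
  } deriving (Eq, Ord, Read, Show)

-- Stand-in for Dict: an ordinary Map instead of a DAWG.
type MapDict = M.Map Form (S.Set Label)

-- Labels given to the same form are collected into one set.
fromPairsSketch :: [(Form, Label)] -> MapDict
fromPairsSketch xs =
  M.fromListWith S.union [ (f, S.singleton l) | (f, l) <- xs ]

-- Assumption: the NE type of an entry serves as its label.
fromEntriesSketch :: [Entry] -> MapDict
fromEntriesSketch = fromPairsSketch . map (\e -> (neOrth e, neType e))

main :: IO ()
main = do
  let d = fromEntriesSketch
            [ Entry "Paris" "LOC", Entry "Paris" "PER", Entry "New York" "LOC" ]
  print (M.lookup "Paris" d)  -- Just (fromList ["LOC","PER"])
```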

siftDict :: (Form -> Set Label -> Bool) -> Dict -> Dict

Remove dictionary entries which do not satisfy the predicate.
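
The predicate receives both the form and its label set. A sketch of the same filtering over a Data.Map stand-in (MapDict and siftSketch are hypothetical names; the real Dict is a DAWG):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form    = T.Text
type Label   = T.Text
type MapDict = M.Map Form (S.Set Label)

-- Keep only the entries for which the predicate holds.
siftSketch :: (Form -> S.Set Label -> Bool) -> MapDict -> MapDict
siftSketch = M.filterWithKey

sample :: MapDict
sample = M.fromList
  [ ("Paris",    S.fromList ["LOC", "PER"])
  , ("New York", S.singleton "LOC")
  ]

main :: IO ()
main = do
  -- Keep multiword forms only (more than one whitespace-separated token).
  let multi f _ = length (T.words f) > 1
  print (M.keys (siftSketch multi sample))  -- ["New York"]
```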

saveDict :: FilePath -> Dict -> IO ()

Save the dictionary to the given file.

loadDict :: FilePath -> IO Dict

Load the dictionary from the given file.
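
The on-disk format is not described here. One plausible way to get a save/load round trip is binary serialization with Data.Binary, sketched below over a Data.Map stand-in. MapDict, saveSketch, loadSketch and the file name dict.bin are hypothetical, and the sketch relies on the Binary instance for Text that ships with text 1.2.1.0 and later:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Binary (decodeFile, encodeFile)
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form    = T.Text
type Label   = T.Text
type MapDict = M.Map Form (S.Set Label)

-- Assumption: the dictionary is stored in binary-encoded form.
saveSketch :: FilePath -> MapDict -> IO ()
saveSketch = encodeFile

loadSketch :: FilePath -> IO MapDict
loadSketch = decodeFile

main :: IO ()
main = do
  let d = M.fromList [("Paris", S.fromList ["LOC", "PER"])] :: MapDict
  saveSketch "dict.bin" d        -- writes to the working directory
  d' <- loadSketch "dict.bin"
  print (d == d')                -- round trip preserves the contents
```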

Merging dictionaries

merge :: [Dict] -> Dict

Merge dictionary resources.
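
A sketch of merging, assuming that a form present in several dictionaries ends up with the union of its label sets (the page does not spell this out). MapDict and mergeSketch are hypothetical names over a Data.Map stand-in:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form    = T.Text
type Label   = T.Text
type MapDict = M.Map Form (S.Set Label)

-- Assumption: label sets of a shared form are unioned.
mergeSketch :: [MapDict] -> MapDict
mergeSketch = M.unionsWith S.union

main :: IO ()
main = do
  let a = M.fromList [("Paris", S.singleton "LOC")] :: MapDict
      b = M.fromList [("Paris", S.singleton "PER"), ("Nike", S.singleton "ORG")] :: MapDict
      m = mergeSketch [a, b]
  print (M.lookup "Paris" m)  -- Just (fromList ["LOC","PER"])
  print (M.size m)            -- 2
```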

diff :: [Dict] -> [Dict]

Differentiate labels from separate dictionaries using dictionary-unique prefixes.
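
The exact prefix format is not given on this page. The sketch below assumes the position of each dictionary in the input list, followed by a dash, as the prefix; diffSketch, addPrefix and MapDict are hypothetical names over a Data.Map stand-in:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form    = T.Text
type Label   = T.Text
type MapDict = M.Map Form (S.Set Label)

-- Assumption: the prefix is the dictionary's list index plus "-".
addPrefix :: Int -> Label -> Label
addPrefix i = T.append (T.pack (show i ++ "-"))

-- Rewrite every label so that it records which dictionary it came from.
diffSketch :: [MapDict] -> [MapDict]
diffSketch ds =
  [ M.map (S.map (addPrefix i)) d | (i, d) <- zip [0 :: Int ..] ds ]

main :: IO ()
main = do
  let a = M.fromList [("Paris", S.singleton "LOC")] :: MapDict
      b = M.fromList [("Paris", S.singleton "LOC")] :: MapDict
  mapM_ (print . M.toList) (diffSketch [a, b])
  -- [("Paris",fromList ["0-LOC"])]
  -- [("Paris",fromList ["1-LOC"])]
```

With prefixes in place, labels from different sources stay distinguishable even after the dictionaries are merged.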