nerf-0.1.0: Nerf, the named entity recognition tool based on linear-chain CRFs

Safe Haskell: None

NLP.Nerf.Dict.Base

Description

Basic types for dictionary handling.

Lexicon entry

type Form = Text

An orthographic form.

isMultiWord :: Form -> Bool

Is the form a multiword one?
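The page does not say how a multiword form is detected. As a minimal sketch, assuming a form is multiword when it has more than one whitespace-separated token, the check could look like the following; isMultiWordSketch is a hypothetical name and not the library's implementation.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Text as T

-- Mirrors the documented type synonym.
type Form = T.Text

-- Hypothetical stand-in for isMultiWord (an assumption, not the
-- library code): a form is multiword when it contains more than
-- one whitespace-separated token.
isMultiWordSketch :: Form -> Bool
isMultiWordSketch = (> 1) . length . T.words

main :: IO ()
main = do
  print (isMultiWordSketch "Warsaw")    -- False
  print (isMultiWordSketch "New York")  -- True
```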

type NeType = Text

A type of named entity.

data Entry

A Named Entity entry from the LMF dictionary.

Constructors

Entry 

Fields

neOrth :: !Form

Orthographic form of the NE

neType :: !NeType

Type of the NE
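To illustrate the record described above, the sketch below redeclares the types locally and builds one entry. The deriving clause and the sample values are assumptions made for the example; the page does not list the instances of Entry.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text

-- Local copy of the documented record with two strict fields.
-- The derived instances are an assumption for this example.
data Entry = Entry
  { neOrth :: !Form    -- orthographic form of the NE
  , neType :: !NeType  -- type of the NE
  } deriving (Show, Eq)

-- Sample entry; the values are illustrative only.
sampleEntry :: Entry
sampleEntry = Entry { neOrth = "Warsaw", neType = "LOC" }

main :: IO ()
main = do
  print (neOrth sampleEntry)
  print (neType sampleEntry)
```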

Dictionary

type NeDict = Map Form (Set NeType)

A NeDict maps each form to the set of NE types recorded for it. A single form may be annotated with several types.

mkDict :: [Entry] -> NeDict

Construct the dictionary from the list of entries.
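One way such a construction could work, given that NeDict is a Map from forms to sets of types, is to collect the types of entries sharing a form with Map.fromListWith. This is a sketch under that assumption; mkDictSketch and the sample entries are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text
type NeDict = M.Map Form (S.Set NeType)

data Entry = Entry { neOrth :: !Form, neType :: !NeType }

-- Hypothetical stand-in for mkDict: entries with the same form
-- have their types unioned into a single set.
mkDictSketch :: [Entry] -> NeDict
mkDictSketch es = M.fromListWith S.union
  [ (neOrth e, S.singleton (neType e)) | e <- es ]

-- Illustrative entries; "Paris" appears with two types.
sampleEntries :: [Entry]
sampleEntries =
  [ Entry "Paris" "LOC", Entry "Paris" "PER", Entry "Warsaw" "LOC" ]

main :: IO ()
main = mapM_ print (M.toList (mkDictSketch sampleEntries))
```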

siftDict :: (Form -> Set NeType -> Bool) -> NeDict -> NeDict

Remove dictionary entries which do not satisfy the predicate.
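The signature matches Map.filterWithKey from the containers package, so the filtering can be sketched as below. siftDictSketch and the sample dictionary are hypothetical; the predicate shown keeps only forms annotated with exactly one type.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text
type NeDict = M.Map Form (S.Set NeType)

-- Hypothetical stand-in for siftDict: keep only the bindings for
-- which the predicate holds.
siftDictSketch :: (Form -> S.Set NeType -> Bool) -> NeDict -> NeDict
siftDictSketch = M.filterWithKey

-- Illustrative dictionary.
sampleDict :: NeDict
sampleDict = M.fromList
  [ ("Paris",  S.fromList ["LOC", "PER"])
  , ("Warsaw", S.singleton "LOC") ]

-- Example predicate: the form has exactly one type.
unambiguous :: Form -> S.Set NeType -> Bool
unambiguous _ types = S.size types == 1

main :: IO ()
main = print (M.keys (siftDictSketch unambiguous sampleDict))
```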

saveDict :: FilePath -> NeDict -> IO ()

Save the dictionary to the given file.

loadDict :: FilePath -> IO NeDict

Load the dictionary from the file.
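The page does not state the on-disk format. Purely as an assumption, the sketch below serializes a dictionary with the binary package and checks that decoding returns the same value; it works in memory so that no file path has to be invented.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.Binary (decode, encode)
import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text
type NeDict = M.Map Form (S.Set NeType)

-- Illustrative dictionary.
sampleDict :: NeDict
sampleDict = M.fromList
  [ ("Paris",  S.fromList ["LOC", "PER"])
  , ("Warsaw", S.singleton "LOC") ]

-- Assumed format (not confirmed by the documentation): encode the
-- map with Data.Binary and decode it back.
roundTrip :: NeDict -> NeDict
roundTrip = decode . encode

main :: IO ()
main = print (roundTrip sampleDict == sampleDict)
```

Under the same assumption, Data.Binary's encodeFile and decodeFile would be the file-based counterparts of encode and decode.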

Merging dictionaries

merge :: [NeDict] -> NeDict

Merge dictionary resources.
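A natural reading of merging, given the NeDict type, is a union of the maps in which the type sets of a shared form are unioned as well. The sketch below assumes that behaviour; mergeSketch and the two sample dictionaries are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text
type NeDict = M.Map Form (S.Set NeType)

-- Hypothetical stand-in for merge: union all dictionaries and
-- combine the type sets of forms that occur in more than one.
mergeSketch :: [NeDict] -> NeDict
mergeSketch = M.unionsWith S.union

-- Two illustrative resources sharing the form "Paris".
places, people :: NeDict
places = M.fromList [("Paris", S.singleton "LOC")]
people = M.fromList [("Paris", S.singleton "PER"), ("Chopin", S.singleton "PER")]

main :: IO ()
main = mapM_ print (M.toList (mergeSketch [places, people]))
```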

diff :: [NeDict] -> [NeDict]

Differentiate labels from separate dictionaries using dictionary-unique prefixes.
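The exact prefix format is not given here. The sketch below assumes each dictionary's position in the list followed by a dash is used as the prefix, so that the same label coming from different dictionaries stays distinguishable after a later merge; diffSketch and the prefix scheme are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form   = T.Text
type NeType = T.Text
type NeDict = M.Map Form (S.Set NeType)

-- Hypothetical stand-in for diff: prefix every label with the
-- index of the dictionary it comes from (assumed "<index>-" format).
diffSketch :: [NeDict] -> [NeDict]
diffSketch ds =
  [ M.map (S.map (addPrefix i)) d | (i, d) <- zip [(0 :: Int) ..] ds ]
  where
    addPrefix i label = T.concat [T.pack (show i), "-", label]

-- Two illustrative dictionaries that both use the label "LOC".
first, second :: NeDict
first  = M.fromList [("Paris", S.singleton "LOC")]
second = M.fromList [("Warsaw", S.singleton "LOC")]

main :: IO ()
main = mapM_ (print . M.toList) (diffSketch [first, second])
```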