hist-pl-fusion-0.4.0: Merging historical dictionary with PoliMorf

Safe Haskell: None

NLP.HistPL.Fusion

Basic types

type UID = Int

A unique identifier among entries with the same keyForm.

type POS = Text

Part of speech.

type Word = Text

Word form.

type Base = Text

Base form.

type IsBase = Bool

Is the word form a base form?

Dictionary

Bilateral

data Bila i a b

Bilateral dictionary.

Constructors

Bila 

Fields

baseDict :: BaseDict i a b
 
formDict :: FormDict i a b
 

Instances

(Eq i, Eq a, Eq b) => Eq (Bila i a b) 
(Ord i, Ord a, Ord b) => Ord (Bila i a b) 
(Show i, Show a, Show b) => Show (Bila i a b) 

mkBila :: (Ord i, Ord a, Ord b) => [(Base, i, a, Word, b)] -> Bila i a b

Make a bilateral dictionary from a list of (base form, ID, additional lexeme info, word form, additional word form info) tuples.

withForm :: Ord i => Bila i a b -> Word -> LexSet i a b

Identify entries which contain the given word form.
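To illustrate the idea of a bilateral dictionary, here is a minimal, self-contained sketch. It is not the library's implementation: ToyBila, mkToyBila and toyWithForm are hypothetical names, String stands in for Text, and the additional lexeme and word form information is left out. It only assumes that such a dictionary keeps one map per lookup direction.

```haskell
import Prelude hiding (Word)
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Simplified stand-ins for the library's types (assumptions, not the real definitions).
type Base = String
type Word = String

-- | Hypothetical lexeme key: a base form plus an identifier.
type Key i = (Base, i)

-- | Toy bilateral dictionary: one map per lookup direction.
data ToyBila i = ToyBila
  { toyBaseDict :: M.Map (Key i) (S.Set Word)  -- lexeme -> its word forms
  , toyFormDict :: M.Map Word (S.Set (Key i))  -- word form -> lexemes containing it
  } deriving (Show, Eq)

-- | Build both directions from (base form, ID, word form) triples.
mkToyBila :: Ord i => [(Base, i, Word)] -> ToyBila i
mkToyBila xs = ToyBila
  { toyBaseDict = M.fromListWith S.union [ ((b, i), S.singleton w) | (b, i, w) <- xs ]
  , toyFormDict = M.fromListWith S.union [ (w, S.singleton (b, i)) | (b, i, w) <- xs ]
  }

-- | Lexemes which contain the given word form (the empty set if there are none).
toyWithForm :: ToyBila i -> Word -> S.Set (Key i)
toyWithForm d w = M.findWithDefault S.empty w (toyFormDict d)

-- | Two homonymous lexemes sharing the form "piec"; the POS tag serves as the ID.
example :: ToyBila String
example = mkToyBila
  [ ("piec", "subst", "piec")
  , ("piec", "subst", "pieca")
  , ("piec", "verb",  "piec")
  , ("piec", "verb",  "piecze")
  ]

main :: IO ()
main = do
  print (toyWithForm example "piec")
  print (toyWithForm example "pieca")
```

In this sketch the form "piec" maps to both lexemes, while "pieca" maps only to the noun, which is the kind of ambiguity a form-indexed lookup has to expose.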

Contemporary

type Poli = Bila POS () ()

PoliMorf dictionary in a bilateral form.

type PLex = Lex POS () ()

PoliMorf dictionary entry.

type PLexSet = LexSet POS () ()

Set of PoliMorf dictionary entries.

mkPoli :: [Entry] -> Poli

Make a bilateral dictionary from PoliMorf.

Correspondence

type Corresp = Poli -> LexEntry -> PLexSet

A function which determines entries from a bilateral dictionary corresponding to a given historical lexeme.

buildCorresp :: Core -> Filter -> Choice -> Corresp

Build a Corresp function from individual components.

Components

type Core = Poli -> LexEntry -> [PLexSet]

We provide three component types, Core, Filter and Choice, which can be combined using the buildCorresp function to construct a Corresp function. The first one, Core, is used to identify a list of potential sets of lexemes. It is natural to define the core function in this way because the task of determining corresponding lexemes can usually be divided into a set of smaller tasks with the same purpose. For example, we may want to identify LexSets corresponding to individual word forms of the historical lexeme.
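The way the three components fit together can be sketched as follows. This is one plausible composition, not necessarily the one buildCorresp uses: the core function proposes candidate sets, the filter is applied inside each set, and the choice function merges the filtered sets. The types are generalised over the dictionary, historical lexeme and contemporary lexeme so that the block is self-contained; buildCorrespSketch, ToyDict, ToyHist and the toy components are all hypothetical.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Generalised stand-ins for the library's component types (assumptions).
type Core d h l    = d -> h -> [S.Set l]
type Filter h l    = h -> l -> Bool
type Choice l      = [S.Set l] -> S.Set l
type Corresp d h l = d -> h -> S.Set l

-- | Hypothetical composition: filter every candidate set, then choose.
buildCorrespSketch :: Core d h l -> Filter h l -> Choice l -> Corresp d h l
buildCorrespSketch core keep choose dict hist =
  choose [ S.filter (keep hist) s | s <- core dict hist ]

-- Toy data: a form-indexed dictionary of (base form, POS) lexemes.
type ToyLex  = (String, String)
type ToyDict = M.Map String (S.Set ToyLex)
data ToyHist = ToyHist { histPos :: String, histForms :: [String] }

-- | One candidate set per word form of the historical lexeme.
toyCore :: Core ToyDict ToyHist ToyLex
toyCore dict hist = [ M.findWithDefault S.empty w dict | w <- histForms hist ]

-- | Keep lexemes whose POS equals that of the historical lexeme.
toyFilter :: Filter ToyHist ToyLex
toyFilter hist (_, pos) = pos == histPos hist

-- | Merge the candidate sets by taking their union.
toyChoice :: Choice ToyLex
toyChoice = S.unions

toyCorresp :: Corresp ToyDict ToyHist ToyLex
toyCorresp = buildCorrespSketch toyCore toyFilter toyChoice

toyDict :: ToyDict
toyDict = M.fromList
  [ ("piec",  S.fromList [("piec", "subst"), ("piec", "verb")])
  , ("pieca", S.fromList [("piec", "subst")])
  ]

main :: IO ()
main = print (toyCorresp toyDict (ToyHist "verb" ["piec", "pieca"]))
```

Applying the filter before the choice matters for strategies such as intersection or voting, where incompatible lexemes would otherwise influence the result; whether the library orders the steps this way is an assumption of the sketch.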

type Filter = LexEntry -> PLex -> Bool

A function which can be used to filter out lexemes which do not satisfy a particular predicate. For example, we may want to filter out lexemes with an incompatible POS value.

type Choice = [PLexSet] -> PLexSet

The final choice of lexemes. Many different strategies can be used here: sum of the sets, intersection, or voting.
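The three strategies mentioned above can be sketched over plain Data.Set values. The names unionChoice, interChoice and voteChoice are hypothetical, and the voting rule used here (keep elements that occur in strictly more than half of the sets) is only one of several reasonable definitions.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- | Sum of the sets: every element proposed by at least one set.
unionChoice :: Ord a => [S.Set a] -> S.Set a
unionChoice = S.unions

-- | Intersection: elements proposed by every set (empty for an empty list).
interChoice :: Ord a => [S.Set a] -> S.Set a
interChoice []       = S.empty
interChoice (s : ss) = foldr S.intersection s ss

-- | Voting: elements proposed by strictly more than half of the sets.
voteChoice :: Ord a => [S.Set a] -> S.Set a
voteChoice sets = M.keysSet (M.filter (\n -> 2 * n > total) counts)
  where
    total  = length sets
    -- Number of sets in which each element occurs.
    counts = M.fromListWith (+) [ (x, 1 :: Int) | s <- sets, x <- S.toList s ]

-- | Three candidate sets over characters, used only for illustration.
candidates :: [S.Set Char]
candidates = [S.fromList "abc", S.fromList "bc", S.fromList "cd"]

main :: IO ()
main = do
  print (unionChoice candidates)
  print (interChoice candidates)
  print (voteChoice candidates)
```

For the sample input the union contains a, b, c and d, the intersection only c, and the vote keeps b and c, which shows how the strategies trade recall for precision.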

byForms :: Core

Identify LexSets corresponding to individual word forms of the historical lexeme using the withForm function.

posFilter :: Filter

Filter out lexemes with a POS value incompatible with the set of POS values assigned to the historical lexeme.
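A POS-based filter of this kind might look like the sketch below. It assumes that compatibility means plain membership in the historical lexeme's POS set; the real posFilter may well map between the two tagsets instead. HistLex, ContLex and posFilterSketch are hypothetical stand-ins for LexEntry, PLex and posFilter.

```haskell
import qualified Data.Set as S

type POS = String

-- | Hypothetical historical lexeme: a base form and the POS values assigned to it.
data HistLex = HistLex { histBase :: String, histPOS :: S.Set POS }

-- | Hypothetical contemporary lexeme: a base form with a single POS value.
data ContLex = ContLex { contBase :: String, contPOS :: POS }
  deriving (Eq, Ord, Show)

-- | Accept a contemporary lexeme only if its POS is among the historical ones.
posFilterSketch :: HistLex -> ContLex -> Bool
posFilterSketch hist cont = contPOS cont `S.member` histPOS hist

main :: IO ()
main = do
  let hist = HistLex "piec" (S.fromList ["verb"])
  print (filter (posFilterSketch hist) [ContLex "piec" "subst", ContLex "piec" "verb"])
```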

sumChoice :: Choice

Sum of sets of lexemes.