polimorf-0.1.0: Working with the PoliMorf dictionary

Safe Haskell: None




The module provides functionality for manipulating PoliMorf, the morphological dictionary for Polish. Apart from IO utilities, there is a merge function that can be used to merge PoliMorf with other dictionary resources.



type Form = Text

A form.

type Base = Text

A base form.

type Tag = Text

A morphosyntactic tag.

data Entry

An entry from the PoliMorf dictionary.




form :: !Form
base :: !Base
tag :: !Tag


readPoliMorf :: FilePath -> IO [Entry]

Read the PoliMorf dictionary from the given file.

parsePoliMorf :: Text -> [Entry]

Parse the textual contents of the PoliMorf dictionary into a list of entries.
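To illustrate the shape of the parsing step, here is a minimal sketch that does not use this module. It assumes that the input holds one entry per line with tab-separated form, base and tag columns; the file format is not specified here, so treat that as an assumption. The names parseLine, parseSketch and sampleText are hypothetical, Entry is a local stand-in for the documented record, and the sample rows are made up for illustration.

```haskell
import qualified Data.Text as T
import Data.Maybe (mapMaybe)

type Form = T.Text
type Base = T.Text
type Tag  = T.Text

-- Local stand-in for the documented Entry record.
data Entry = Entry
  { form :: !Form
  , base :: !Base
  , tag  :: !Tag
  } deriving (Show, Eq)

-- Hypothetical line parser. ASSUMPTION: columns are separated by tabs
-- and appear in the order form, base, tag; extra columns are ignored.
parseLine :: T.Text -> Maybe Entry
parseLine line = case T.split (== '\t') line of
  (f : b : t : _) -> Just (Entry f b t)
  _               -> Nothing

-- Parse every line, silently skipping lines with fewer than three columns.
parseSketch :: T.Text -> [Entry]
parseSketch = mapMaybe parseLine . T.lines

-- Made-up sample input, including one malformed line.
sampleText :: T.Text
sampleText = T.unlines
  [ T.pack "psy\tpies\tsubst:pl:nom:m2"
  , T.pack "malformed line"
  , T.pack "kota\tkot\tsubst:sg:gen:m2"
  ]
```

Skipping malformed lines is a choice made for this sketch only; the behaviour of parsePoliMorf on such input is not documented here.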


type BaseMap = Map Form [Base]

A map from forms to their possible base forms. There may be several base forms for one form, since a form may belong to multiple lexemes.

mkBaseMap :: [Entry] -> BaseMap

Make the base map from the list of entries.
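The idea of a base map can be sketched without this module as follows. The function mkBaseMapSketch is hypothetical and works on plain (form, base) pairs rather than on Entry values; collecting bases in a set to drop duplicates is an assumption of the sketch, not a documented property of mkBaseMap. The sample pairs are illustrative.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Text as T

type Form = T.Text
type Base = T.Text
type BaseMap = M.Map Form [Base]

-- Hypothetical stand-in for mkBaseMap: group base forms by word form.
-- Bases are collected in a set so that a repeated (form, base) pair,
-- e.g. one occurring with several tags, is stored only once.
mkBaseMapSketch :: [(Form, Base)] -> BaseMap
mkBaseMapSketch pairs =
  M.map S.toList
    (M.fromListWith S.union [ (f, S.singleton b) | (f, b) <- pairs ])

-- Illustrative sample: the form "mam" belongs to two lexemes.
samplePairs :: [(Form, Base)]
samplePairs =
  [ (T.pack "mam",  T.pack "mieć")
  , (T.pack "mam",  T.pack "mama")
  , (T.pack "mam",  T.pack "mama")
  , (T.pack "mama", T.pack "mama")
  ]

sampleBaseMap :: BaseMap
sampleBaseMap = mkBaseMapSketch samplePairs
```

Looking up the ambiguous form with M.lookup (T.pack "mam") sampleBaseMap yields both base forms, which is the situation the BaseMap type is meant to capture.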

data RelCode

Reliability information: how a particular label was assigned to a particular word form.



Exact

Label assigned in a direct manner.

ByBase

Label assigned based on the label of the lemma (base form).

ByForm

Label assigned based on the labels of other forms within the same lexeme.

merge :: Monoid m => BaseMap -> Map Form m -> Map Form (Maybe (m, RelCode))

Merge the BaseMap with a dictionary resource that maps forms to monoidal labels. Depending on the inference technique, the resulting dictionary contains three kinds of labels: Exact labels, assigned in a direct manner; ByBase labels, assigned to all forms that have a base form with a label in the input dictionary; and ByForm labels, assigned to all forms that have a related form from the same lexeme with a label in the input dictionary.

For any particular form in the output dictionary, labels are extracted with at most one of the methods described above. Exact labels take precedence over ByBase labels, and ByBase labels take precedence over ByForm labels.

This function is currently far from memory efficient. If you plan to run it on the entire PoliMorf dictionary, do so on a machine with plenty of available memory.
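The precedence rules for merge can be illustrated with a small self-contained sketch. It is not the implementation of merge and may differ from it in details. In particular, it assumes that the output contains every form from either input, that a form without any inferable label maps to Nothing, and that labels found for several bases or related forms are combined with mconcat. The name mergeSketch and the sample data are hypothetical, and RelCode is redefined locally.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S
import qualified Data.Text as T
import Control.Applicative ((<|>))
import Data.Maybe (mapMaybe)

type Form = T.Text
type Base = T.Text
type BaseMap = M.Map Form [Base]

-- Local stand-in for the documented RelCode type.
data RelCode = Exact | ByBase | ByForm
  deriving (Show, Eq, Ord)

-- Illustrative sketch of the precedence Exact > ByBase > ByForm.
mergeSketch :: Monoid m
            => BaseMap -> M.Map Form m -> M.Map Form (Maybe (m, RelCode))
mergeSketch baseMap dict = M.fromSet label allForms
  where
    -- ASSUMPTION: the result covers the forms of both inputs.
    allForms = M.keysSet baseMap `S.union` M.keysSet dict

    -- Inverse of the base map: base form -> all forms of that lexeme.
    formMap = M.fromListWith S.union
      [ (b, S.singleton f) | (f, bs) <- M.toList baseMap, b <- bs ]

    basesOf f    = M.findWithDefault [] f baseMap
    siblingsOf f = S.toList $ S.unions
      [ M.findWithDefault S.empty b formMap | b <- basesOf f ]

    -- Combine the labels of all keys present in the input dictionary.
    lookupAll ks = case mapMaybe (`M.lookup` dict) ks of
      [] -> Nothing
      ms -> Just (mconcat ms)

    tagWith code m = (m, code)

    -- Try each method in order of precedence; the first hit wins.
    label f =  (tagWith Exact  <$> M.lookup f dict)
           <|> (tagWith ByBase <$> lookupAll (basesOf f))
           <|> (tagWith ByForm <$> lookupAll (siblingsOf f))

-- Made-up sample data: three lexemes, two labelled forms.
sampleBases :: BaseMap
sampleBases = M.fromList
  [ (T.pack f, [T.pack b])
  | (f, b) <- [ ("pies", "pies"), ("psy", "pies"), ("psa", "pies")
              , ("kot", "kot"), ("koty", "kot"), ("dom", "dom") ] ]

sampleDict :: M.Map Form [String]
sampleDict = M.fromList [(T.pack "psy", ["A"]), (T.pack "kot", ["B"])]

sampleMerged :: M.Map Form (Maybe ([String], RelCode))
sampleMerged = mergeSketch sampleBases sampleDict
```

In sampleMerged, "psy" and "kot" carry Exact labels, "koty" gets a ByBase label from its base form "kot", "pies" and "psa" get ByForm labels from the related form "psy", and "dom" maps to Nothing.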