Safe Haskell: None
This module provides functionality for manipulating PoliMorf, the morphological dictionary for Polish. Apart from IO utilities, there is a merge function which can be used to merge PoliMorf with another dictionary resource.
- type Form = Text
- type Base = Text
- type Tag = Text
- type Cat = Text
- data Entry = Entry {}
- readPoliMorf :: FilePath -> IO [Entry]
- parsePoliMorf :: Text -> [Entry]
- type BaseMap = Map Form (Set Base)
- mkBaseMap :: [Entry] -> BaseMap
- data RelCode
- merge :: Ord a => BaseMap -> Map Form (Set a) -> Map Form (Map a RelCode)
Types
An entry from the PoliMorf dictionary.
Parsing
readPoliMorf :: FilePath -> IO [Entry]
Read the PoliMorf dictionary from a file.
parsePoliMorf :: Text -> [Entry]
Parse the contents of the PoliMorf dictionary into a list of entries.
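To illustrate what this kind of parsing involves, here is a minimal sketch rather than the library's own implementation. It assumes that each line of the dictionary holds four tab-separated fields in the order form, base, tag, category; that layout should be checked against the PoliMorf release in use. The Row type, its field names, and the functions parseRow and parseRows are hypothetical and exist only for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Maybe (mapMaybe)
import           Data.Text  (Text)
import qualified Data.Text  as T

-- | Hypothetical stand-in for a dictionary entry; the real Entry
-- record of the library may have different fields.
data Row = Row
  { rowForm :: Text  -- ^ inflected word form
  , rowBase :: Text  -- ^ base form (lemma)
  , rowTag  :: Text  -- ^ morphosyntactic tag
  , rowCat  :: Text  -- ^ category
  } deriving (Show, Eq)

-- | Parse one line, assuming exactly four tab-separated fields.
-- Lines with any other number of fields are rejected.
parseRow :: Text -> Maybe Row
parseRow line = case T.splitOn "\t" line of
  [f, b, t, c] -> Just (Row f b t c)
  _            -> Nothing

-- | Parse a whole text, silently skipping malformed lines.
parseRows :: Text -> [Row]
parseRows = mapMaybe parseRow . T.lines
```

Skipping malformed lines is a design choice made for brevity; a stricter variant could return an Either carrying the offending line instead.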
Merging
type BaseMap = Map Form (Set Base)
A map from forms to their possible base forms (there may be many since the form may be a member of multiple lexemes).
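The shape of such a map can be sketched as follows. This is an illustration under assumptions, not the definition of mkBaseMap: it builds the map from plain (form, base) pairs, because the fields of Entry are not shown here. The function name baseMapFromPairs is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Map.Strict as M
import qualified Data.Set        as S
import           Data.Text       (Text)

type Form    = Text
type Base    = Text
type BaseMap = M.Map Form (S.Set Base)

-- | Collect all base forms seen for each word form.
-- A form that belongs to several lexemes ends up with several bases.
baseMapFromPairs :: [(Form, Base)] -> BaseMap
baseMapFromPairs pairs =
  M.fromListWith S.union [ (f, S.singleton b) | (f, b) <- pairs ]
```

For example, the Polish form "mam" belongs both to the verb "mieć" and to the noun "mama", so baseMapFromPairs [("mam", "mieć"), ("mam", "mama")] maps "mam" to a two-element set.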
data RelCode
Reliability information: how a particular label was assigned to a particular word form.
merge :: Ord a => BaseMap -> Map Form (Set a) -> Map Form (Map a RelCode)
Merge the BaseMap with a dictionary resource which maps forms to sets of labels. Every label is assigned a RelCode which describes the relation between the label and the form. There are three kinds of labels: Exact labels, assigned directly; ByBase labels, assigned to all forms which have a base form with a label in the input dictionary; and ByForm labels, assigned to all forms which have a related form from the same lexeme with a label in the input dictionary.
This function is currently far from memory efficient. If you plan to run it on the entire PoliMorf dictionary, do so on a machine with plenty of available memory.
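The three kinds of labels described for merge can be made concrete with a small, unoptimized sketch. It is not the library's implementation, and it rests on several assumptions: Rel is a local stand-in for RelCode; its constructor order (Exact, then ByBase, then ByForm) is chosen so that the derived Ord prefers the more direct relation when a label is reachable in several ways; and forms that receive no label at all are dropped from the result. The names Rel, formsOf, and mergeSketch are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Map.Strict as M
import qualified Data.Set        as S
import           Data.Text       (Text)

type Form    = Text
type Base    = Text
type BaseMap = M.Map Form (S.Set Base)

-- | Local stand-in for RelCode. Assumption: a smaller value means
-- a more direct (more reliable) relation.
data Rel = Exact | ByBase | ByForm
  deriving (Show, Eq, Ord)

-- | Invert a BaseMap: for each base form, all forms of that lexeme.
formsOf :: BaseMap -> M.Map Base (S.Set Form)
formsOf bm = M.fromListWith S.union
  [ (b, S.singleton f) | (f, bs) <- M.toList bm, b <- S.toList bs ]

-- | Assign labels to forms together with the way they were obtained.
mergeSketch
  :: Ord a => BaseMap -> M.Map Form (S.Set a) -> M.Map Form (M.Map a Rel)
mergeSketch bm dict = M.filter (not . M.null) $ M.fromList
  [ (f, labelsFor f) | f <- S.toList allForms ]
  where
    allForms  = M.keysSet bm `S.union` M.keysSet dict
    inv       = formsOf bm
    look k    = M.findWithDefault S.empty k dict  -- labels listed for k
    tagWith r = M.fromSet (const r)               -- mark a label set with r
    labelsFor f =
      let bases   = S.toList (M.findWithDefault S.empty f bm)
          -- other forms sharing at least one base form with f
          related = [ g | b <- bases
                        , g <- S.toList (M.findWithDefault S.empty b inv)
                        , g /= f ]
          exact   = tagWith Exact (look f)
          byBase  = M.unionsWith min [ tagWith ByBase (look b) | b <- bases ]
          byForm  = M.unionsWith min [ tagWith ByForm (look g) | g <- related ]
      in  M.unionsWith min [exact, byBase, byForm]
```

The sketch materializes an inverted index and one intermediate map per form, which also hints at why merging a full dictionary can be memory hungry.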