Safe Haskell: None
This module provides functionality for manipulating PoliMorf, the
morphological dictionary for Polish. Apart from IO utilities, there
is a merge function which can be used to merge PoliMorf with
other dictionary resources.
- type Form = Text
- type Base = Text
- type Tag = Text
- type Cat = Text
- data Entry = Entry {}
- atomic :: Entry -> Bool
- readPoliMorf :: FilePath -> IO [Entry]
- parsePoliMorf :: Text -> [Entry]
- data Rule = Rule {}
- apply :: Rule -> Text -> Text
- toBase :: Entry -> Maybe Rule
- mkRuleMap :: [(Text, Text)] -> DAWG (Set Rule)
- type BaseMap = DAWG (Set Rule)
- mkBaseMap :: [Entry] -> BaseMap
- type FormMap = DAWG (Set Rule)
- mkFormMap :: [Entry] -> FormMap
- data RelCode
- mergeWith :: Ord a => (String -> String -> a -> a) -> BaseMap -> DAWG (Set a) -> DAWG (Map a RelCode)
- merge :: Ord a => BaseMap -> DAWG (Set a) -> DAWG (Map a RelCode)
Core types
An entry from the PoliMorf dictionary.
Is the entry an atomic one? More precisely, we treat all negative forms starting with "nie" and all superlatives starting with "naj" as non-atomic entries.
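The atomicity test described above can be sketched as follows. This is only an illustration: the fields of Entry are not shown in this documentation, so the hypothetical atomicSketch takes the form and the tag directly, and it assumes that the tag is a colon-separated string in which negative forms carry a "neg" component and superlatives a "sup" component.

```haskell
import qualified Data.Text as T

-- Hypothetical stand-in for `atomic`, working on a (form, tag) pair.
-- Assumption: tags are colon-separated and mark negation with "neg"
-- and the superlative degree with "sup".
atomicSketch :: T.Text -> T.Text -> Bool
atomicSketch form tag
  | has "neg" && T.pack "nie" `T.isPrefixOf` form = False
  | has "sup" && T.pack "naj" `T.isPrefixOf` form = False
  | otherwise = True
  where
    -- Does the tag contain the given component?
    has v = T.pack v `elem` T.split (== ':') tag
```

Checking both the tag and the prefix means that a word which merely happens to start with "nie" or "naj" still counts as atomic.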
Parsing
readPoliMorf :: FilePath -> IO [Entry]
Read the PoliMorf from the file.
parsePoliMorf :: Text -> [Entry]
Parse the PoliMorf into a list of entries.
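To give an idea of what such parsing might involve, the sketch below assumes that each line of the dictionary holds four tab-separated fields (form, base, tag, category), mirroring the Form, Base, Tag and Cat synonyms listed in the synopsis. The EntrySketch record, its field names, and the functions parseLine and parseSketch are hypothetical. The actual fields of Entry and the behaviour of parsePoliMorf on malformed input are not documented here.

```haskell
import qualified Data.Text as T
import Data.Maybe (mapMaybe)

-- Hypothetical stand-in for Entry; the real record's fields are not
-- shown in this documentation.
data EntrySketch = EntrySketch
  { form :: T.Text
  , base :: T.Text
  , tag  :: T.Text
  , cat  :: T.Text
  } deriving (Show, Eq)

-- Parse a single line, assuming exactly four tab-separated fields.
parseLine :: T.Text -> Maybe EntrySketch
parseLine line = case T.split (== '\t') line of
  [f, b, t, c] -> Just (EntrySketch f b t c)
  _            -> Nothing

-- Parse the whole input, skipping lines that do not fit the assumed format.
parseSketch :: T.Text -> [EntrySketch]
parseSketch = mapMaybe parseLine . T.lines
```

A reader in the style of readPoliMorf could then be obtained by composing parseSketch with Data.Text.IO.readFile, again under the same format assumption.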
Merging
A rule for translating a form into another one.
toBase :: Entry -> Maybe Rule
Determine the rule needed to translate the form into its base form.
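One plausible way to represent such a rule is as a pair: the number of characters to cut from the end of the form, and a suffix to append afterwards. The sketch below illustrates this idea. The fields of Rule are not shown in this documentation, so RuleSketch, its fields cut and suffix, and the helpers applySketch and between are all assumptions made for illustration.

```haskell
import qualified Data.Text as T

-- Hypothetical rule representation: drop `cut` characters from the end
-- of the input, then append `suffix`.
data RuleSketch = RuleSketch
  { cut    :: Int
  , suffix :: T.Text
  } deriving (Show, Eq, Ord)

-- Apply a rule to a form.
applySketch :: RuleSketch -> T.Text -> T.Text
applySketch r x = T.take (T.length x - cut r) x `T.append` suffix r

-- Build the rule which turns `source` into `dest` while keeping their
-- longest common prefix untouched.
between :: T.Text -> T.Text -> RuleSketch
between source dest =
  let k = maybe 0 (\(p, _, _) -> T.length p) (T.commonPrefixes source dest)
  in  RuleSketch (T.length source - k) (T.drop k dest)
```

Under this representation, many forms share the same rule (for example, every form identical to its base form yields a rule with cut 0 and an empty suffix), which may help when sets of rules are stored as values in a DAWG.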
type BaseMap = DAWG (Set Rule)
A map from forms to their possible base forms (there may be many since the form may be a member of multiple lexemes).
Reliability information: how a particular label was assigned to a particular word form.
mergeWith :: Ord a => (String -> String -> a -> a) -> BaseMap -> DAWG (Set a) -> DAWG (Map a RelCode)
Merge the BaseMap with a dictionary resource which maps forms to sets
of labels. Every label is assigned a RelCode which describes the
relation between the label and the form. This is a generalized version
of the merge function, with an additional function f x y y'label which
can be used to determine the resultant set of labels for the form x
given a "similar" form y and its original label y'label.
There are three kinds of labels: Exact labels assigned in a direct manner,
ByBase labels assigned to all forms which have a base form with a label in
the input dictionary, and ByForm labels assigned to all forms which have a
related form from the same lexeme with a label in the input dictionary.
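The three kinds of labels can be illustrated with a simplified sketch which uses ordinary Data.Map and Data.Set structures in place of DAWGs, and which maps forms directly to their base forms instead of to rules. All names below (Rel, Bases, Dict, mergeWithSketch, mergeSketch) are hypothetical. The sketch also assumes that "the same lexeme" means "sharing a base form", that the function f is applied in the Exact case with y equal to x, and that the most reliable code wins when a label can be assigned in several ways. None of these details are confirmed by this documentation.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Ordered from most to least reliable, so that `min` keeps the
-- strongest evidence for a label (an assumption of this sketch).
data Rel = Exact | ByBase | ByForm deriving (Show, Eq, Ord)

type Bases  = M.Map String (S.Set String)  -- form -> its base forms
type Dict a = M.Map String (S.Set a)       -- form -> its labels

mergeWithSketch
  :: Ord a
  => (String -> String -> a -> a)
  -> Bases -> Dict a -> M.Map String (M.Map a Rel)
mergeWithSketch f bases dict = M.fromList [ (x, labelsFor x) | x <- allForms ]
  where
    allForms = S.toList (M.keysSet bases `S.union` M.keysSet dict)
    -- Elements stored under a key, or nothing if the key is absent.
    look k m = S.toList (M.findWithDefault S.empty k m)
    -- Inverse of `bases`: base form -> all forms of the lexeme.
    forms = M.fromListWith S.union
      [ (b, S.singleton y) | (y, bs) <- M.toList bases, b <- S.toList bs ]
    labelsFor x = M.fromListWith min $
         tagged Exact [x]
      ++ tagged ByBase (look x bases)
      ++ tagged ByForm [ y | b <- look x bases, y <- look b forms, y /= x ]
      where
        -- Labels of the similar forms `ys`, adapted to `x` by `f`.
        tagged rel ys = [ (f x y l, rel) | y <- ys, l <- look y dict ]

-- The plain merge: labels are copied unchanged.
mergeSketch :: Ord a => Bases -> Dict a -> M.Map String (M.Map a Rel)
mergeSketch = mergeWithSketch (\_ _ l -> l)
```

For instance, if only the form "pies" carries a label in the input dictionary and "psy" has "pies" as its base form, then "psy" receives that label with the ByBase code, while "pies" itself keeps it as Exact.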