Safe Haskell: None
- data Concraft = Concraft {}
- saveModel :: FilePath -> Concraft -> IO ()
- loadModel :: FilePath -> IO Concraft
- tag :: Word w => Concraft -> Sent w Tag -> [(Set Tag, Tag)]
- marginals :: Word w => Concraft -> Sent w Tag -> [WMap Tag]
- train :: (Word w, FromJSON w, ToJSON w) => Tagset -> Int -> TrainConf -> TrainConf -> IO [Sent w Tag] -> IO [Sent w Tag] -> IO Concraft
- reAnaTrain :: (Word w, FromJSON w, ToJSON w) => Tagset -> Analyse w Tag -> Int -> TrainConf -> TrainConf -> IO [SentO w Tag] -> IO [SentO w Tag] -> IO Concraft
- prune :: Double -> Concraft -> Concraft
Model
data Concraft
Concraft data.
Instances: Binary Concraft
saveModel :: FilePath -> Concraft -> IO ()
Save the model to a file. The data is compressed using the gzip format.
Tagging
tag :: Word w => Concraft -> Sent w Tag -> [(Set Tag, Tag)]
Tag a sentence using the model. In your own code you will probably want to run your analysis function, translate its results into a container of Sentences, evaluate tag on each sentence, and embed the tagging results into your own morphosyntactic structure.
The function returns guessing results as the fst elements of the output pairs and disambiguation results as the snd elements of the corresponding pairs.
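The pairing described above can be taken apart with ordinary list functions. The following is a minimal sketch, not part of the library: `Tag` is replaced by a `String` stand-in, and `splitResults`, `exampleOutput`, and the tag names are hypothetical names invented for illustration.

```haskell
import qualified Data.Set as S

-- Stand-in for the library's Tag type (an assumption made for this sketch).
type Tag = String

-- | Separate the guessing results (fst elements) from the
-- disambiguation results (snd elements) of a tag-style output list.
splitResults :: [(S.Set Tag, Tag)] -> ([S.Set Tag], [Tag])
splitResults = unzip

-- Hypothetical output for a two-word sentence.
exampleOutput :: [(S.Set Tag, Tag)]
exampleOutput =
  [ (S.fromList ["adj", "subst"], "subst")
  , (S.fromList ["fin"], "fin")
  ]

main :: IO ()
main = do
  let (guessed, chosen) = splitResults exampleOutput
  mapM_ (print . S.toList) guessed  -- one line per word: guessed tags
  mapM_ putStrLn chosen             -- one line per word: chosen tag
```

With the real library, the input list would come from applying tag to a model and a sentence; the splitting step itself would stay the same.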
marginals :: Word w => Concraft -> Sent w Tag -> [WMap Tag]
Determine marginal probabilities corresponding to individual tags w.r.t. the disambiguation model. Since the guessing model is used first, the resulting weighted maps corresponding to OOV words may contain tags not present in the input sentence.
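As an illustration of how such per-word results might be consumed, the sketch below picks the most probable tag for a word and lists the tags that were not among the word's input interpretations. It is written under loud assumptions: a plain `Map Tag Double` stands in for `WMap Tag`, `Tag` is a `String`, and `bestTag`, `extraTags`, and `exampleMarginals` are hypothetical helpers, not library functions.

```haskell
import           Data.List (maximumBy)
import qualified Data.Map  as M
import           Data.Ord  (comparing)
import qualified Data.Set  as S

-- Stand-ins for the library types (assumptions made for this sketch).
type Tag       = String
type Marginals = M.Map Tag Double

-- | The tag with the highest marginal probability, if there is any tag.
bestTag :: Marginals -> Maybe (Tag, Double)
bestTag m
  | M.null m  = Nothing
  | otherwise = Just (maximumBy (comparing snd) (M.toList m))

-- | Tags that appear in the marginals but were not among the word's
-- input interpretations, as can happen for OOV words.
extraTags :: S.Set Tag -> Marginals -> S.Set Tag
extraTags inputTags m = M.keysSet m `S.difference` inputTags

-- Hypothetical marginals for a single word.
exampleMarginals :: Marginals
exampleMarginals = M.fromList [("subst", 0.7), ("adj", 0.2), ("ger", 0.1)]

main :: IO ()
main = do
  print (bestTag exampleMarginals)
  print (S.toList (extraTags (S.fromList ["subst"]) exampleMarginals))
```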
Training
train
:: (Word w, FromJSON w, ToJSON w) | |
=> Tagset | A morphosyntactic tagset to which |
-> Int | How many tags is the guessing model supposed to produce for a given OOV word? It will be used (see |
-> TrainConf | Training configuration for the guessing model. |
-> TrainConf | Training configuration for the disambiguation model. |
-> IO [Sent w Tag] | Training dataset. This IO action will be executed more than once, so consider using lazy IO if your dataset is big. |
-> IO [Sent w Tag] | Evaluation dataset IO action. Consider using lazy IO if your dataset is big. |
-> IO Concraft |
reAnaTrain
:: (Word w, FromJSON w, ToJSON w) | |
=> Tagset | A morphosyntactic tagset to which |
-> Analyse w Tag | Analysis function. It will be used to reanalyse the input dataset. |
-> Int | How many tags is the guessing model supposed to produce for a given OOV word? It will be used (see |
-> TrainConf | Training configuration for the guessing model. |
-> TrainConf | Training configuration for the disambiguation model. |
-> IO [SentO w Tag] | Training dataset. This IO action will be executed more than once, so consider using lazy IO if your dataset is big. |
-> IO [SentO w Tag] | Evaluation dataset IO action. Consider using lazy IO if your dataset is big. |
-> IO Concraft |
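Both training functions take their datasets as IO actions rather than as plain lists, so the data can be re-read on every pass instead of being held in memory. The sketch below shows one way such an action might look. It is only an illustration under stated assumptions: a sentence is simplified to a list of whitespace-separated tokens in place of `Sent w Tag`, and `readDataset` and the file name are hypothetical.

```haskell
-- Stand-in for a sentence type (an assumption made for this sketch):
-- one sentence per line, tokens separated by whitespace.
type Sentence = [String]

-- | An IO action that reads the dataset from disk. Prelude's readFile
-- reads lazily, so sentences are produced as the consumer demands them.
-- Each execution of the action opens and reads the file again.
readDataset :: FilePath -> IO [Sentence]
readDataset path = map words . lines <$> readFile path

main :: IO ()
main = do
  let path = "sketch-dataset.txt"   -- hypothetical file name
  writeFile path "a b c\nd e\n"
  let dataset = readDataset path    -- an IO action, not yet executed
  firstPass <- dataset              -- first execution
  print (length firstPass)
  secondPass <- dataset             -- second execution re-reads the file
  print (secondPass == firstPass)
```

An action of this shape can be executed repeatedly by its consumer, which is the property the dataset arguments above rely on.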