concraft-pl-0.7.0: Morphological tagger for Polish

Safe Haskell: None

NLP.Concraft.Polish

Model

data Concraft

Concraft data.

Instances

Binary Concraft 

saveModel :: FilePath -> Concraft -> IO ()

Save model in a file. Data is compressed using the gzip format.

loadModel :: FilePath -> IO Concraft

Load model from a file.
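
As a minimal sketch of how the two functions above fit together, the helper below loads a model from one path and saves it to another. The name `copyModel` is hypothetical and not part of the library; only `loadModel` and `saveModel` come from this module.

```haskell
import NLP.Concraft.Polish (loadModel, saveModel)

-- | Hypothetical helper: read a stored model and write it to a new location.
copyModel :: FilePath -> FilePath -> IO ()
copyModel src dst = do
    model <- loadModel src   -- load the model from the source file
    saveModel dst model      -- save it again; saveModel gzip-compresses the data
```

A call such as `copyModel "model.gz" "backup.gz"` would use placeholder file names; the paths are up to the caller.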

Tagging

tag :: Concraft -> Sent Tag -> Sent Tag

Tag the analysed sentence.

marginals :: Concraft -> Sent Tag -> Sent Tag

Tag the sentence with marginal probabilities.

Analysis

macaPar :: MacaPool -> Text -> IO [Sent Tag]

Analyse paragraph with Maca. The function is thread-safe.
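
The analysis and tagging functions can be combined into a small pipeline. The sketch below is an illustration under assumptions: the helper name `tagParagraph` is hypothetical, `Text` is assumed to be the strict `Data.Text` type, and `MacaPool`, `Sent` and `Tag` are assumed to be importable from `NLP.Concraft.Polish.Maca` and `NLP.Concraft.Polish.Morphosyntax`. Creating the `MacaPool` is left to the caller.

```haskell
import Data.Text (Text)
import NLP.Concraft.Polish (Concraft, macaPar, tag)
-- Assumed module locations for the types used in the signature below.
import NLP.Concraft.Polish.Maca (MacaPool)
import NLP.Concraft.Polish.Morphosyntax (Sent, Tag)

-- | Hypothetical helper: analyse a paragraph with Maca, then
-- disambiguate every resulting sentence with the given model.
tagParagraph :: MacaPool -> Concraft -> Text -> IO [Sent Tag]
tagParagraph pool model par = do
    sents <- macaPar pool par          -- morphosyntactic analysis (IO)
    return (map (tag model) sents)     -- pure tagging, sentence by sentence
```

Replacing `tag` with `marginals` in the last line would give the variant that tags with marginal probabilities, since both functions share the same type.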

Training

data TrainConf

Training configuration.

Constructors

TrainConf 

Fields

tagset :: Tagset

Tagset.

sgdArgs :: SgdArgs

SGD parameters.

reana :: Bool

Perform reanalysis.

onDisk :: Bool

Store SGD dataset on disk.

guessNum :: Int

Number of guessed tags for each word.

r0 :: R0T

R0T parameter.

train

Arguments

:: TrainConf
-> IO [SentO Tag]    Training data
-> IO [SentO Tag]    Evaluation data
-> IO Concraft

Train a Concraft model. TODO: It should be possible to supply the two training procedures with different SGD arguments.
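
Because both datasets are passed as `IO` actions, data that is already in memory has to be wrapped with `return`. The sketch below shows one way to do that while overriding a single `TrainConf` field with record-update syntax. The helper name `trainInMemory` is hypothetical, and the import of `SentO` and `Tag` from `NLP.Concraft.Polish.Morphosyntax` is an assumption about where those types live.

```haskell
import NLP.Concraft.Polish (Concraft, TrainConf (..), train)
-- Assumed module location for the sentence and tag types.
import NLP.Concraft.Polish.Morphosyntax (SentO, Tag)

-- | Hypothetical helper: train from in-memory lists, keeping the
-- SGD dataset in memory regardless of what the given configuration says.
trainInMemory :: TrainConf -> [SentO Tag] -> [SentO Tag] -> IO Concraft
trainInMemory conf trainData evalData =
    train conf' (return trainData) (return evalData)
  where
    -- Record update: every other field is taken from the caller's configuration.
    conf' = conf { onDisk = False }
```

All remaining settings (`tagset`, `sgdArgs`, `reana`, `guessNum`, `r0`) are taken unchanged from the configuration supplied by the caller.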

Pruning

prune :: Double -> Concraft -> Concraft

Prune disambiguation model: discard model features with absolute values (in log-domain) lower than the given threshold.
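
To make the thresholding rule concrete, here is a self-contained sketch on a toy representation. It is not the library's implementation: the `Weights` type and the `pruneWeights` function are hypothetical stand-ins, with a plain association list taking the place of the real model's features.

```haskell
-- Hypothetical stand-in for model features: names paired with log-domain values.
type Weights = [(String, Double)]

-- | Discard features whose absolute value is lower than the threshold;
-- features at or above the threshold are kept in their original order.
pruneWeights :: Double -> Weights -> Weights
pruneWeights threshold = filter (\(_, w) -> abs w >= threshold)
```

With a threshold of `0.1`, the list `[("a", 0.5), ("b", -0.05), ("c", -0.3), ("d", 0.0)]` is reduced to the entries for `"a"` and `"c"`: the sign of a value does not matter, only its magnitude. On a real model the corresponding call has the form `prune 0.1 model`, where `0.1` is only an example threshold.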