concraft-pl-2.1.1: Morphological tagger for Polish

Safe Haskell: None
Language: Haskell2010

NLP.Concraft.Polish.DAGSeg

Description

DAG-based model for morphosyntactic tagging.

Types

Simplification

simplify4gsr :: Tagset -> Interp Tag -> Tag

Simplify the tag for use by the guessing model. TODO: this function is also used in the evaluation script, which assumes that simplify4gsr returns a positional tag. Either the name of the function should reflect this, or there should be two separate functions: one dedicated to the guesser and one dedicated to evaluation (and other, more generic uses).

simplify4dmb :: Tagset -> Interp Tag -> Tag

Simplify the tag for the sake of the disambiguation model.

Model

data Concraft t

Concraft data.

saveModel :: (Ord t, Binary t) => FilePath -> Concraft t -> IO ()

Save the model to a file. The data is compressed using the gzip format.

loadModel
  :: (Ord t, Binary t)
  => (Tagset -> t -> Tag)   -- ^ Guesser simplification function
  -> (Tagset -> t -> Tag)   -- ^ Disambiguation simplification function
  -> FilePath
  -> IO (Concraft t)

Load the model from a file.
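The save/load pair can be pictured as a gzip-compressed Data.Binary round trip. The following is only a minimal sketch under stated assumptions, not the library's implementation: ToyModel, saveGz, loadGz and roundTrip are hypothetical names introduced for illustration, and the code assumes the binary, bytestring, containers and zlib packages are available.

```haskell
import qualified Codec.Compression.GZip as GZip   -- from the zlib package
import           Data.Binary (Binary, decode, encode)
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map.Strict as M

-- | Hypothetical stand-in for a trained model; this is NOT the real
-- 'Concraft' type, just a small Binary-serializable value.
type ToyModel = M.Map String Double

-- | Serialize a value with Data.Binary, gzip-compress it, and write it out.
saveGz :: Binary a => FilePath -> a -> IO ()
saveGz path = BL.writeFile path . GZip.compress . encode

-- | Read a file, decompress it, and decode the stored value.
-- The value is forced to weak head normal form before returning, so that
-- decoding starts inside this action rather than at some later use site.
loadGz :: Binary a => FilePath -> IO a
loadGz path = do
  x <- decode . GZip.decompress <$> BL.readFile path
  x `seq` return x

-- | Save a toy model, load it back, and report whether it survived intact.
roundTrip :: FilePath -> IO Bool
roundTrip path = do
  let model = M.fromList [("subst", 0.7), ("adj", 0.3)] :: ToyModel
  saveGz path model
  model' <- loadGz path
  return (model' == model)
```

The (Ord t, Binary t) constraints in the signatures of saveModel and loadModel are consistent with this picture: a Binary instance is what allows values of the tag type t stored in the model to be written and read back.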

Tagging

guess :: Config Tag -> Concraft Tag -> Sent Tag -> Sent Tag

Tag the sentence with the marginal probabilities computed by the guessing model.

High level

data AnnoSent

Annotated sentence.

Constructors

AnnoSent 

Fields

data AnnoConf

Annotation config.

Constructors

AnnoConf 

Fields

annoAll :: AnnoConf -> Concraft Tag -> Sent Tag -> [AnnoSent]

Annotate the sentence with all potentially interesting information.

Training

data TrainConf

Training configuration.

Constructors

TrainConf 

Fields

train
  :: TrainConf
  -> IO [Sent Tag]          -- ^ Training data
  -> IO [Sent Tag]          -- ^ Evaluation data
  -> IO (Concraft Tag)

Train the Concraft model. TODO: it should be possible to supply the two training procedures (guessing and disambiguation) with different SGD arguments.

Pruning