Safe Haskell | None |
---|---|
Language | Haskell2010 |
DAG-based model for morphosyntactic tagging.
Synopsis
- type Tag = Interp Tag
- simplify4gsr :: Tagset -> Interp Tag -> Tag
- simplify4dmb :: Tagset -> Interp Tag -> Tag
- data Concraft t
- saveModel :: (Ord t, Binary t) => FilePath -> Concraft t -> IO ()
- loadModel :: (Ord t, Binary t) => (Tagset -> t -> Tag) -> (Tagset -> t -> Tag) -> FilePath -> IO (Concraft t)
- guess :: Config Tag -> Concraft Tag -> Sent Tag -> Sent Tag
- data AnnoSent = AnnoSent {}
- data AnnoConf = AnnoConf {}
- annoAll :: AnnoConf -> Concraft Tag -> Sent Tag -> [AnnoSent]
- data TrainConf = TrainConf {}
- train :: TrainConf -> IO [Sent Tag] -> IO [Sent Tag] -> IO (Concraft Tag)
Types
Simplification
simplify4gsr :: Tagset -> Interp Tag -> Tag
Simplify the tag for the sake of the guessing model.
TODO: it is also used in the evaluation script, which assumes that simplify4gsr simplifies to a positional tag. Perhaps the name of the function should reflect this, or there should be two separate functions: one dedicated to the guesser and one dedicated to evaluation (and other, more generic things).
simplify4dmb :: Tagset -> Interp Tag -> Tag
Simplify the tag for the sake of the disambiguation model.
Model
saveModel :: (Ord t, Binary t) => FilePath -> Concraft t -> IO ()
Save the model to a file. The data is compressed using the gzip format.
loadModel
  :: (Ord t, Binary t)
  => (Tagset -> t -> Tag)  -- Guesser simplification function
  -> (Tagset -> t -> Tag)  -- Disambiguation simplification function
  -> FilePath
  -> IO (Concraft t)
Load model from a file.
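The shape of saveModel and loadModel, where the simplification functions are passed again at load time, can be illustrated with a small stand-alone sketch. It does not use the Concraft types. ToyModel, saveToy, loadToy, the file name and the tag strings are all hypothetical stand-ins, and gzip compression is left out so that only GHC's bundled libraries are needed. The sketch assumes the functions are re-supplied because functions have no Binary instance; the page itself does not state the reason.

```haskell
module Main where

import           Data.Binary (Binary, decode, encode)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for a model: one serialisable part and one
-- function-valued part that cannot be written to disk.
data ToyModel t = ToyModel
  { weights  :: M.Map t Double  -- stored in the file
  , simplify :: t -> String     -- must be supplied again when loading
  }

-- Write only the serialisable part of the model.
saveToy :: (Ord t, Binary t) => FilePath -> ToyModel t -> IO ()
saveToy path = BL.writeFile path . encode . weights

-- Read the serialisable part back and pair it with the given function.
-- The file is read strictly so that the handle is closed before returning.
loadToy :: (Ord t, Binary t) => (t -> String) -> FilePath -> IO (ToyModel t)
loadToy simp path = do
  bytes <- BS.readFile path
  return (ToyModel (decode (BL.fromStrict bytes)) simp)

main :: IO ()
main = do
  let path    = "toy-model.bin"            -- illustrative file name
      posOnly = takeWhile (/= ':')         -- keep only the part before ':'
      model   = ToyModel
        (M.fromList [("subst:sg:nom", 0.7), ("adj:sg:nom", 0.3)])
        posOnly
  saveToy path model
  loaded <- loadToy posOnly path
  print (weights loaded == weights model)  -- round trip preserves the weights
  putStrLn (simplify loaded "subst:sg:nom")
```

With this design the caller can choose a different simplification function each time the same file is loaded, which is consistent with loadModel taking the guesser and disambiguation functions as separate arguments.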
Tagging
guess :: Config Tag -> Concraft Tag -> Sent Tag -> Sent Tag
Tag the sentence with the marginal probabilities of the guessing model.
High level
data AnnoSent
Annotated sentence.
Constructors
AnnoSent
data AnnoConf
Annotation configuration.
annoAll :: AnnoConf -> Concraft Tag -> Sent Tag -> [AnnoSent]
Annotate the sentence with all potentially interesting information.
Training
data TrainConf
Training configuration.
Constructors
TrainConf
train :: TrainConf -> IO [Sent Tag] -> IO [Sent Tag] -> IO (Concraft Tag)
Train a Concraft model.
TODO: it should be possible to supply the two training procedures with different SGD arguments.