Safe Haskell | None |
---|---|
Language | Haskell98 |

Provisional training module which accepts sequential external data but transforms it into the internal DAG form.

## Synopsis

- data CRF a b = CRF {}
- train :: (Ord a, Ord b) => SgdArgs -> Bool -> ([SentL a b] -> Set b) -> (AVec Lb -> [DAG () (X, Y)] -> [Feature]) -> IO [SentL a b] -> IO [SentL a b] -> IO (CRF a b)
- oovChosen :: Ord b => [SentL a b] -> Set b
- anyChosen :: Ord b => [SentL a b] -> Set b
- anyInterps :: Ord b => [SentL a b] -> Set b
- verifyDAG :: DAG a (X, Y) -> Maybe Error
- data Error
- = Malformed
- | Cyclic
- | SeveralSources [NodeID]
- | SeveralTargets [NodeID]
- | WrongBalance [NodeID]

# Model

A conditional random field model with an additional codec used for data encoding.

# Training

train

:: (Ord a, Ord b) | |
---|---|
=> SgdArgs | Args for SGD |
-> Bool | Store the dataset on disk |
-> ([SentL a b] -> Set b) | R0 construction |
-> (AVec Lb -> [DAG () (X, Y)] -> [Feature]) | Feature selection |
-> IO [SentL a b] | Training data |
-> IO [SentL a b] | Evaluation data |
-> IO (CRF a b) | Resulting model |

Train the CRF using the stochastic gradient descent method.

The resulting model will contain the features extracted with the user-supplied extraction function. You can use the functions provided by the Data.CRF.Chain1.Constrained.Feature.Present and Data.CRF.Chain1.Constrained.Feature.Hidden modules for this purpose.

You also have to supply an R0 construction method (e.g. `oovChosen`), which determines the contents of the default set of labels.

# R0 construction

anyInterps :: Ord b => [SentL a b] -> Set b

Collect the interpretations of words in a dataset, including the labels actually assigned to them.
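To illustrate the idea of collecting labels from a dataset, here is a minimal, self-contained sketch. It does not use the library's `SentL` type: `Word'`, `Sent` and `collectLabels` are hypothetical names introduced only for this example, and the real `anyInterps` may differ in detail.

```haskell
import qualified Data.Set as S

-- | Hypothetical, simplified stand-in for a labelled word
-- (not the library's own representation).
data Word' b = Word'
  { interps :: S.Set b  -- ^ potential interpretations of the word
  , chosen  :: [b]      -- ^ labels actually assigned to the word
  }

-- | Hypothetical stand-in for a sentence.
type Sent b = [Word' b]

-- | Collect every interpretation and every assigned label
-- that occurs anywhere in the dataset.
collectLabels :: Ord b => [Sent b] -> S.Set b
collectLabels dataset = S.unions
  [ interps w `S.union` S.fromList (chosen w)
  | sent <- dataset
  , w    <- sent
  ]
```

A function of this shape has the type expected for the R0 construction argument once it is specialised to the library's sentence type.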

# Utils

verifyDAG :: DAG a (X, Y) -> Maybe Error

Check whether the DAG satisfies all the desirable properties. A violation is reported as an `Error` value.
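The kind of checks suggested by the `Cyclic`, `SeveralSources` and `SeveralTargets` constructors can be sketched on a plain edge list. This is only an illustration under stated assumptions: `DagError` and `checkEdges` are hypothetical names, the sketch does not use the library's `DAG` type, and it leaves out `Malformed` and `WrongBalance`, whose exact meaning is not described here.

```haskell
import qualified Data.Set as S

type Node = Int

-- | Hypothetical error type mirroring three of the library's constructors.
data DagError
  = IsCyclic
  | ManySources [Node]
  | ManyTargets [Node]
  deriving (Show, Eq)

-- | Check an edge list: at most one source, at most one target, no cycle.
-- Returns 'Nothing' when no violation is found.
checkEdges :: [(Node, Node)] -> Maybe DagError
checkEdges edges
  | length sources > 1 = Just (ManySources sources)
  | length targets > 1 = Just (ManyTargets targets)
  | cyclic             = Just IsCyclic
  | otherwise          = Nothing
  where
    nodes   = S.fromList (concat [[u, v] | (u, v) <- edges])
    withIn  = S.fromList (map snd edges)  -- nodes with an incoming edge
    withOut = S.fromList (map fst edges)  -- nodes with an outgoing edge
    sources = S.toList (nodes `S.difference` withIn)
    targets = S.toList (nodes `S.difference` withOut)
    cyclic  = not (S.null (prune nodes))
    -- Repeatedly drop nodes that have no incoming edge from a remaining
    -- node; any node that survives lies on a cycle or is reachable from one.
    prune remaining
      | S.null free = remaining
      | otherwise   = prune (remaining `S.difference` free)
      where
        reached = S.fromList [v | (u, v) <- edges, u `S.member` remaining]
        free    = remaining `S.difference` reached
```

For instance, `checkEdges [(1,2),(2,3)]` yields `Nothing`, while `checkEdges [(1,2),(3,2)]` yields `Just (ManySources [1,3])`.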