Safe Haskell | None |
---|

- data Word a b
- mkWord :: Set a -> Set b -> Word a b
- type Sent a b = [Word a b]
- data Dist a
- mkDist :: Ord a => [(a, Double)] -> Dist a
- type WordL a b = (Word a b, Dist b)
- type SentL a b = [WordL a b]
- newtype Ob = Ob {}
- newtype Lb1 = Lb1 {}
- newtype Lb2 = Lb2 {}
- type Lb = (Lb1, Lb2)
- data Feat
- data CRF a b c = CRF {}
- train :: (Ord a, Ord b, Ord c) => SgdArgs -> FeatSel Ob Lb Feat -> IO [SentL a (b, c)] -> Maybe (IO [SentL a (b, c)]) -> IO (CRF a b c)
- tag :: (Ord a, Ord b, Ord c) => CRF a b c -> Sent a (b, c) -> [(b, c)]
- type FeatSel o t f = FeatGen o t f -> Xs o t -> Ys t -> [f]
- selectHidden :: FeatSel o t f
- selectPresent :: FeatSel o t f

# Data types

## External

A word consists of a set of observations and a set of potential labels.

mkWord :: Set a -> Set b -> Word a bSource

A word constructor which checks non-emptiness of the potential set of labels.

A probability distribution defined over elements of type a. All elements not included in the map have probability equal to 0.

type WordL a b = (Word a b, Dist b)Source

A WordL is a labeled word, i.e. a word with probability distribution defined over labels. We assume that every label from the distribution domain is a member of the set of potential labels corresponding to the word. TODO: Ensure the assumption using the smart constructor.

# Internal

# CRF

## Training

:: (Ord a, Ord b, Ord c) | |

=> SgdArgs | Args for SGD |

-> FeatSel Ob Lb Feat | Feature selection |

-> IO [SentL a (b, c)] | Training data |

-> Maybe (IO [SentL a (b, c)]) | Maybe evalation data |

-> IO (CRF a b c) | Resulting codec and model |

Train the CRF using the stochastic gradient descent method.
When the evaluation data `IO`

action is `Just`

, the iterative
training process will notify the user about the current accuracy
on the evaluation part every full iteration over the training part.
Use the provided feature selection function to determine model
features.

## Tagging

tag :: (Ord a, Ord b, Ord c) => CRF a b c -> Sent a (b, c) -> [(b, c)]Source

Find the most probable label sequence.

# Feature selection

selectHidden :: FeatSel o t fSource

The `hiddenFeats`

adapted to fit feature selection specs.

selectPresent :: FeatSel o t fSource

The `presentFeats`

adapted to fit feature selection specs.