crf-chain2-generic-0.3.0: Second-order, generic, constrained, linear conditional random fields

Data.CRF.Chain2.Pair

# Data types

## External

data Word a b

A word consists of a set of observations and a set of potential labels.

Instances

 (Eq a, Eq b) => Eq (Word a b)
 (Eq (Word a b), Ord a, Ord b) => Ord (Word a b)
 (Show a, Show b) => Show (Word a b)

mkWord :: Set a -> Set b -> Word a b

A smart constructor for words which checks that the set of potential labels is non-empty.
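
The non-emptiness check can be illustrated with a minimal, self-contained sketch. The names `Word'`, `obs`, `lbs` and `mkWord'` are hypothetical stand-ins, not the library's definitions, and returning `Maybe` on failure is an assumption of this sketch: the page does not say how `mkWord` reacts to an empty label set.

```haskell
import qualified Data.Set as S

-- Hypothetical stand-in for the library's Word type.
data Word' a b = Word'
  { obs :: S.Set a   -- set of observations
  , lbs :: S.Set b   -- set of potential labels
  } deriving (Show, Eq, Ord)

-- Smart constructor sketch: reject an empty set of potential labels.
-- Signalling failure with Nothing is an assumption of this sketch.
mkWord' :: S.Set a -> S.Set b -> Maybe (Word' a b)
mkWord' xs ys
  | S.null ys = Nothing
  | otherwise = Just (Word' xs ys)
```

With this sketch, `mkWord' (S.fromList "ab") S.empty` yields `Nothing`, while any non-empty label set yields a `Just` value.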

type Sent a b = [Word a b]

A sentence of words.

data Dist a

A probability distribution defined over elements of type a. All elements not included in the map have probability equal to 0.

mkDist :: Ord a => [(a, Double)] -> Dist a

Construct the probability distribution.
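
A map-backed distribution in which missing elements have probability 0 might look like the following sketch. `Dist'`, `mkDist'` and `prob` are hypothetical names used only for illustration. Summing duplicate keys and skipping normalisation are assumptions of this sketch, since the page does not state how the real `mkDist` treats either case.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for the library's Dist type.
newtype Dist' a = Dist' { unDist' :: M.Map a Double }
  deriving (Show, Eq)

-- Build a distribution from (element, probability) pairs.
-- Assumption: probabilities of duplicate elements are summed,
-- and no normalisation is performed.
mkDist' :: Ord a => [(a, Double)] -> Dist' a
mkDist' = Dist' . M.fromListWith (+)

-- Look up a probability; elements absent from the map have probability 0.
prob :: Ord a => Dist' a -> a -> Double
prob (Dist' m) x = M.findWithDefault 0 x m
```

For example, with `d = mkDist' [("N", 0.7), ("V", 0.3)]`, `prob d "N"` is `0.7` and `prob d "ADJ"` is `0`.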

type WordL a b = (Word a b, Dist b)

A WordL is a labeled word, i.e. a word with a probability distribution defined over its labels. We assume that every label in the domain of the distribution is a member of the set of potential labels of the word. TODO: Enforce this assumption with a smart constructor.
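
The assumption above could be checked along the following lines. This is only a sketch: `labelsConsistent` is a hypothetical helper that is not part of the library, and it works on a plain label set and a plain list of pairs rather than on the library's `Word` and `Dist` types.

```haskell
import qualified Data.Set as S

-- Hypothetical helper: True when every label that has an entry in the
-- distribution is also a member of the word's set of potential labels.
labelsConsistent :: Ord b => S.Set b -> [(b, Double)] -> Bool
labelsConsistent potential dist =
  S.fromList (map fst dist) `S.isSubsetOf` potential
```

A smart constructor for labeled words could run such a check before pairing a word with its distribution.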

type SentL a b = [WordL a b]

A sentence of labeled words.

## Internal

newtype Ob

Constructors

 Ob
  Fields
   unOb :: Int

Instances

 Eq Ob
 Ord Ob
 Show Ob
 Ix Ob
 Binary Ob

newtype Lb1

Constructors

 Lb1
  Fields
   unLb1 :: Int

Instances

 Eq Lb1
 Ord Lb1
 Show Lb1
 Ix Lb1
 Binary Lb1

newtype Lb2

Constructors

 Lb2
  Fields
   unLb2 :: Int

Instances

 Eq Lb2
 Ord Lb2
 Show Lb2
 Ix Lb2
 Binary Lb2

type Lb = (Lb1, Lb2)

data Feat

Constructors

 TFeat3'1 !Lb1 !Lb1 !Lb1
 TFeat3'2 !Lb2 !Lb2 !Lb2
 TFeat2'1 !Lb1 !Lb1
 TFeat2'2 !Lb2 !Lb2
 TFeat1'1 !Lb1
 TFeat1'2 !Lb2
 OFeat'1 !Ob !Lb1
 OFeat'2 !Ob !Lb2

Instances

 Eq Feat
 Ord Feat
 Show Feat
 Binary Feat
 FeatMap FeatMap Feat
 Binary (FeatMap Feat)

# CRF

data CRF a b c

Constructors

 CRF
  Fields
   codecData :: CodecData a b c
   model :: Model FeatMap Ob Lb Feat

Instances

 (Ord a, Ord b, Ord c, Binary a, Binary b, Binary c) => Binary (CRF a b c)

## Training

Arguments

 :: (Ord a, Ord b, Ord c)
 => SgdArgs                       -- Args for SGD
 -> FeatSel Ob Lb Feat            -- Feature selection
 -> IO [SentL a (b, c)]           -- Training data `IO` action
 -> Maybe (IO [SentL a (b, c)])   -- Maybe evaluation data
 -> IO (CRF a b c)                -- Resulting codec and model

Train the CRF using the stochastic gradient descent method. When the evaluation data `IO` action is `Just`, the iterative training process reports the current accuracy on the evaluation data after every full iteration over the training data. The provided feature selection function determines the model features.

## Tagging

tag :: (Ord a, Ord b, Ord c) => CRF a b c -> Sent a (b, c) -> [(b, c)]

Find the most probable label sequence.

# Feature selection

type FeatSel o t f = FeatGen o t f -> Xs o t -> Ys t -> [f]

A feature selection function type.

selectHidden :: FeatSel o t f

The `hiddenFeats` function, adapted to the feature selection interface.

selectPresent :: FeatSel o t f

The `presentFeats` function, adapted to the feature selection interface.