crf-chain1-constrained-0.3.0: First-order, constrained, linear-chain conditional random fields

Data.CRF.Chain1.Constrained

Description

The module provides first-order, linear-chain conditional random fields (CRFs) with position-wide constraints over label values.

# Data types

data Word a b

A Word is represented by a set of observations and a set of potential interpretation labels. When the set of potential labels is empty, the word is considered to be unknown and the default potential set is used in its place.

Constructors

 Word

 Fields

 obs :: Set a    The set of observations.
 lbs :: Set b    The set of potential interpretations.

Instances

 (Eq a, Eq b) => Eq (Word a b)
 (Ord a, Ord b) => Ord (Word a b)
 (Show a, Show b) => Show (Word a b)

unknown :: Word a b -> Bool

The word is considered to be unknown when the set of potential labels is empty.
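The relationship between Word and unknown can be sketched with a local stand-in type. The record below mirrors the documented fields but is defined here rather than imported from the library, and the example words and their labels are hypothetical.

```haskell
import Prelude hiding (Word)
import qualified Data.Set as S

-- Local stand-in mirroring the documented record (not the library's own type).
data Word a b = Word
    { obs :: S.Set a   -- observations
    , lbs :: S.Set b   -- potential interpretation labels
    } deriving (Show, Eq, Ord)

-- A word is unknown exactly when its set of potential labels is empty.
unknown :: Word a b -> Bool
unknown = S.null . lbs

-- Hypothetical example words.
knownWord, unknownWord :: Word String String
knownWord   = Word (S.fromList ["lower", "suffix=ing"]) (S.fromList ["NOUN", "VERB"])
unknownWord = Word (S.fromList ["capitalized"]) S.empty
```

Here unknown knownWord evaluates to False and unknown unknownWord to True. Prelude's own Word type is hidden because the name clashes with the stand-in.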

type Sent a b = [Word a b]

A sentence of words.

data Prob a

A probability distribution defined over elements of type a. All elements not included in the map have probability equal to 0.

Instances

 Eq a => Eq (Prob a)
 Ord a => Ord (Prob a)
 Show a => Show (Prob a)

mkProb :: Ord a => [(a, Double)] -> Prob a

Construct the probability distribution.
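As a rough illustration of a map-backed distribution in which missing elements have probability 0, the following sketch defines a local stand-in for Prob. The names mkProbSketch and probOf are hypothetical, and the handling of duplicate keys (summing them) is an assumption. This page does not say how mkProb treats duplicates, non-positive weights, or normalization.

```haskell
import qualified Data.Map.Strict as M

-- Local stand-in for the documented type (not the library's definition).
newtype Prob a = Prob (M.Map a Double)
    deriving (Show, Eq, Ord)

-- Hypothetical constructor: duplicate keys are summed (an assumption);
-- no filtering or normalization is attempted.
mkProbSketch :: Ord a => [(a, Double)] -> Prob a
mkProbSketch = Prob . M.fromListWith (+)

-- Hypothetical lookup: elements absent from the map have probability 0.
probOf :: Ord a => Prob a -> a -> Double
probOf (Prob m) x = M.findWithDefault 0 x m

-- Hypothetical example distribution over labels.
exampleDist :: Prob String
exampleDist = mkProbSketch [("NOUN", 0.75), ("VERB", 0.25)]
```

With these definitions, probOf exampleDist "NOUN" is 0.75, while probOf exampleDist "ADJ" is 0 because "ADJ" is not in the map.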

data WordL a b

A WordL is a labeled word, i.e. a word with a probability distribution defined over its labels. We assume that every label from the distribution domain is a member of the set of potential labels corresponding to the word. Use the mkWordL smart constructor to build a WordL.

mkWordL :: Word a b -> Prob b -> WordL a b

Ensure that every label from the distribution domain is a member of the set of potential labels corresponding to the word.
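The invariant amounts to a subset check between the distribution's domain and the word's label set. The sketch below uses local stand-in types and a hypothetical mkWordLChecked that returns Nothing on violation. How the real mkWordL reports a violation is not stated here, and the sketch does not model the default label set used for unknown words.

```haskell
import Prelude hiding (Word)
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Local stand-ins for the documented types (not the library's definitions).
data Word a b = Word { obs :: S.Set a, lbs :: S.Set b }
    deriving (Show, Eq, Ord)
newtype Prob a = Prob (M.Map a Double)
    deriving (Show, Eq, Ord)
data WordL a b = WordL (Word a b) (Prob b)
    deriving (Show, Eq, Ord)

-- Hypothetical checked constructor: succeeds only when every label in the
-- distribution's domain is one of the word's potential labels.
mkWordLChecked :: Ord b => Word a b -> Prob b -> Maybe (WordL a b)
mkWordLChecked w p@(Prob m)
    | M.keysSet m `S.isSubsetOf` lbs w = Just (WordL w p)
    | otherwise                        = Nothing

-- Hypothetical example data.
exampleWord :: Word String String
exampleWord = Word (S.fromList ["lower"]) (S.fromList ["NOUN", "VERB"])

goodDist, badDist :: Prob String
goodDist = Prob (M.fromList [("NOUN", 0.5), ("VERB", 0.5)])
badDist  = Prob (M.fromList [("ADJ", 1.0)])
```

In this sketch, mkWordLChecked exampleWord goodDist yields a Just value, while mkWordLChecked exampleWord badDist yields Nothing because "ADJ" is not a potential label of the word.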

type SentL a b = [WordL a b]

A sentence of labeled words.

# Tagging

tag :: (Ord a, Ord b) => CRF a b -> Sent a b -> [b]

Determine the most probable label sequence within the context of the given sentence using the model provided by the CRF.

tagK :: (Ord a, Ord b) => Int -> CRF a b -> Sent a b -> [[b]]

For each position in the input sentence, determine the most probable labels, returning at most the given number of labels per position.
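The shape of the result, one list of at most k labels per position, can be illustrated without a trained model. The sketch below assumes per-position label scores are already available as maps. How tagK derives such scores from the CRF is not described on this page, and the names topK, tagKSketch, and exampleScores are hypothetical.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)
import qualified Data.Map.Strict as M

-- Keep at most k labels, highest score first. The sort is stable, so ties
-- retain the map's ascending key order.
topK :: Int -> M.Map b Double -> [b]
topK k = map fst . take k . sortBy (comparing (Down . snd)) . M.toList

-- Apply the selection independently at every sentence position.
tagKSketch :: Int -> [M.Map b Double] -> [[b]]
tagKSketch k = map (topK k)

-- Hypothetical per-position scores for a two-word sentence.
exampleScores :: [M.Map String Double]
exampleScores =
    [ M.fromList [("N", 0.5), ("V", 0.25), ("ADJ", 0.25)]
    , M.fromList [("DET", 1.0)]
    ]
```

Evaluating tagKSketch 2 exampleScores gives [["N","ADJ"],["DET"]]: the second position has only one candidate, so its list is shorter than k.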