concraft-0.14.2: Morphological disambiguation based on constrained CRFs

Safe Haskell: None
Language: Haskell98

NLP.Concraft.DAG.DisambSeg

Description

A version of the disambiguation model adapted to also perform sentence segmentation.

Types

data Tag

The internal tag type.

Constructors

Tag 

Fields

Instances
Eq Tag

Defined in NLP.Concraft.DAG.DisambSeg

Methods

(==) :: Tag -> Tag -> Bool

(/=) :: Tag -> Tag -> Bool

Ord Tag

Defined in NLP.Concraft.DAG.DisambSeg

Methods

compare :: Tag -> Tag -> Ordering

(<) :: Tag -> Tag -> Bool

(<=) :: Tag -> Tag -> Bool

(>) :: Tag -> Tag -> Bool

(>=) :: Tag -> Tag -> Bool

max :: Tag -> Tag -> Tag

min :: Tag -> Tag -> Tag

Show Tag

Defined in NLP.Concraft.DAG.DisambSeg

Methods

showsPrec :: Int -> Tag -> ShowS

show :: Tag -> String

showList :: [Tag] -> ShowS

data Disamb t

A disambiguation model.

Constructors

Disamb 

Fields

  • tiers :: [Tier]
     
  • schemaConf :: SchemaConf
     
  • crf :: CRF Ob Atom
     
  • simplify :: t -> Tag

    A function that simplifies tags of the generic type t to (i) the corresponding positional tags and (ii) a flag indicating whether the segment ends a sentence.

    NOTE: in practice the model may encounter a tag it does not know. Ideally such a tag would be treated as the closest tag the model can handle, but that requires defining a notion of similarity between tags. This should probably be done at a different level, where more is known about the structure of t.
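
The note above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: SimpleTag is a hypothetical stand-in for the internal Tag type (whose fields are not listed on this page), raw tags are assumed to be colon-separated strings paired with a sentence-end flag, and "closest" is taken to mean "shares the most attributes". None of these names belong to concraft.

```haskell
import           Data.List (maximumBy)
import           Data.Ord  (comparing)
import qualified Data.Set  as S

-- Hypothetical stand-in for the internal tag: a set of positional
-- attributes plus a sentence-end flag.
data SimpleTag = SimpleTag
  { attrs :: S.Set String
  , eos   :: Bool
  } deriving (Show, Eq, Ord)

-- Split a colon-separated tag string into its attributes.
splitAttrs :: String -> S.Set String
splitAttrs = S.fromList . go
  where
    go s = case break (== ':') s of
      (x, [])       -> [x]
      (x, _ : rest) -> x : go rest

-- Simplify a raw tag, assumed to be a (tag string, sentence-end) pair.
simplifyRaw :: (String, Bool) -> SimpleTag
simplifyRaw (str, end) = SimpleTag (splitAttrs str) end

-- Map a tag to the known tag sharing the most attributes, keeping the
-- sentence-end flag of the input. Known tags are returned unchanged,
-- and so is any tag when the list of known tags is empty.
closest :: [SimpleTag] -> SimpleTag -> SimpleTag
closest []    t = t
closest known t
  | t `elem` known = t
  | otherwise      = (maximumBy (comparing overlap) known) { eos = eos t }
  where
    overlap k = S.size (S.intersection (attrs k) (attrs t))
```

A function such as `closest known . simplifyRaw` could then play the role of the simplification function, under the assumptions stated above.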

putDisamb :: Disamb t -> Put

Store the entire disambiguation model apart from the simplification function.

getDisamb :: (t -> Tag) -> Get (Disamb t)

Read the disambiguation model, given the simplification function (which putDisamb does not store).
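
Functions cannot be serialized, which is why the simplification function is left out when storing and supplied again when reading. The following sketch shows that pattern with the binary package. Model, putModel, getModel, and roundTrip are hypothetical stand-ins, not concraft definitions; only put, get, runPut, and runGet come from Data.Binary.

```haskell
import           Data.Binary     (get, put)
import           Data.Binary.Get (Get, runGet)
import           Data.Binary.Put (Put, runPut)

-- Hypothetical stand-in for a model: serializable data plus a function.
data Model t = Model
  { weights :: [Double]   -- stands in for the tiers, schema config, and CRF
  , simp    :: t -> Int   -- stands in for the simplification function
  }

-- Store everything except the function.
putModel :: Model t -> Put
putModel = put . weights

-- Restore the model; the caller supplies the function again.
getModel :: (t -> Int) -> Get (Model t)
getModel f = do
  ws <- get
  return (Model ws f)

-- Serialize to a lazy ByteString and read the model back.
roundTrip :: (t -> Int) -> Model t -> Model t
roundTrip f = runGet (getModel f) . runPut . putModel
```

The caller is responsible for passing the same function at load time as was used in training; nothing in the stored bytes can check this.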

Tiers

data Tier

A tier description.

Constructors

Tier 

Fields

Instances
Binary Tier

Defined in NLP.Concraft.Disamb.Positional

Methods

put :: Tier -> Put

get :: Get Tier

putList :: [Tier] -> Put

data Atom

An atomic part of a morphosyntactic tag, with an optional POS.

Constructors

Atom 

Fields

Instances
Eq Atom

Defined in NLP.Concraft.Disamb.Positional

Methods

(==) :: Atom -> Atom -> Bool

(/=) :: Atom -> Atom -> Bool

Ord Atom

Defined in NLP.Concraft.Disamb.Positional

Methods

compare :: Atom -> Atom -> Ordering

(<) :: Atom -> Atom -> Bool

(<=) :: Atom -> Atom -> Bool

(>) :: Atom -> Atom -> Bool

(>=) :: Atom -> Atom -> Bool

max :: Atom -> Atom -> Atom

min :: Atom -> Atom -> Atom

Show Atom

Defined in NLP.Concraft.Disamb.Positional

Methods

showsPrec :: Int -> Atom -> ShowS

show :: Atom -> String

showList :: [Atom] -> ShowS

Binary Atom

Defined in NLP.Concraft.Disamb.Positional

Methods

put :: Atom -> Put

get :: Get Atom

putList :: [Atom] -> Put

Disambiguation

disamb :: (Word w, Ord t) => Disamb t -> Sent w t -> DAG () (Map t Bool)

Perform disambiguation.
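
Each edge of the resulting DAG carries a Map t Bool. A minimal sketch of reading one such map follows, assuming that True marks the interpretations selected by the model (this page does not state that explicitly). The helper name chosen is hypothetical.

```haskell
import qualified Data.Map.Strict as M

-- Tags flagged True, assumed here to be the selected interpretations.
-- The result is in ascending key order.
chosen :: M.Map t Bool -> [t]
chosen = M.keys . M.filter id
```

For example, `chosen (M.fromList [("adj", False), ("subst", True)])` evaluates to `["subst"]`.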

Probs in general

data ProbType

Type of resulting probabilities.

Constructors

Marginals

Marginal probabilities

MaxProbs

TODO

probsSent :: (Word w, Ord t) => ProbType -> Disamb t -> Sent w t -> Sent w t

Determine the marginal probabilities of the individual labels in the sentence.

probs :: (Word w, Ord t) => ProbType -> Disamb t -> Sent w t -> DAG () (WMap t)

Determine the marginal probabilities of the individual labels in the sentence.

Training

data TrainConf t

Training configuration.

Constructors

TrainConf 

Fields

train

Arguments

:: (Word w, Ord t) 
=> TrainConf t

Training configuration

-> IO [Sent w t]

Training data

-> IO [Sent w t]

Evaluation data

-> IO (Disamb t) 

Train the disambiguation model.

Pruning

prune :: Double -> Disamb t -> Disamb t

Prune the disambiguation model: discard model features whose absolute values (in the log domain) are lower than the given threshold.
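
The pruning rule can be sketched on a plain map from features to log-domain weights. This illustrates the thresholding only; pruneWeights is a hypothetical helper, and the CRF's actual feature storage is not shown on this page.

```haskell
import qualified Data.Map.Strict as M

-- Keep only the features whose absolute weight reaches the threshold;
-- features with |w| < threshold are discarded.
pruneWeights :: Double -> M.Map f Double -> M.Map f Double
pruneWeights threshold = M.filter (\w -> abs w >= threshold)
```

With a threshold of 0.1, a feature weighted 0.05 is dropped, while features weighted -0.5 or exactly 0.1 are kept.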