concraft-0.9.2: Morphological disambiguation based on constrained CRFs

Safe Haskell: None

NLP.Concraft.Schema

Description

Observation schema blocks for Concraft.

Types

type Ob = ([Int], Text)

An observation consists of an index (of list type) and an actual observation value.

type Ox a = Ox Text a

The Ox monad specialized to word token type and text observations.

type Schema w t a = Vector (Seg w t) -> Int -> Ox a

A schema is a block of the Ox computation performed within the context of the sentence and the absolute sentence position.

void :: a -> Schema w t a

A dummy schema block.

sequenceS_ :: [Vector (Seg w t) -> a -> Ox b] -> Vector (Seg w t) -> a -> Ox ()

Sequence the list of schemas (or blocks) and discard individual values.
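
The combinator can be pictured as follows. This is a minimal sketch, generalized over any monad m and any sentence type s rather than tied to Ox and Vector (Seg w t); it illustrates the idea and is not necessarily how the library defines it.

```haskell
-- Sketch of sequenceS_ generalized over the monad `m` and the sentence type
-- `s` (both generalizations are assumptions made for self-containment).
sequenceS_ :: Monad m => [s -> a -> m b] -> s -> a -> m ()
sequenceS_ blocks sent k = mapM_ (\blk -> blk sent k) blocks
```

Each block receives the same sentence and the same position (or list of positions), the effects are run in list order, and the individual results are discarded.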

Usage

schematize :: Schema w t a -> Sent w t -> [[Ob]]

Use the schema to extract observations from the sentence.
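
As a rough illustration of the shape of this computation, the sketch below models a schema as a plain function from a sentence and a position to a list of observations and applies it at every position. SimpleSchema, schematizeSimple, and orthAround are hypothetical stand-ins that use lists and String instead of Vector, Text, and the Ox monad; this is not the library's implementation.

```haskell
-- Simplified stand-ins for the library types (assumptions, not Concraft's API).
type Ob = ([Int], String)                 -- observation: index and value
type SimpleSchema = [String] -> Int -> [Ob]

-- Apply the schema at every position of the sentence.
schematizeSimple :: SimpleSchema -> [String] -> [[Ob]]
schematizeSimple schema sent = [schema sent k | k <- [0 .. length sent - 1]]

-- Hypothetical schema: the word at position k - 1 and at k, when in bounds.
-- The index records which relative offset produced the observation.
orthAround :: SimpleSchema
orthAround sent k =
  [ ([i], sent !! pos)
  | (i, off) <- zip [0 ..] [-1, 0]
  , let pos = k + off
  , pos >= 0, pos < length sent ]
```

The result has one inner list per sentence position, matching the [[Ob]] return type of schematize.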

Configuration

data Body a

Body of configuration entry.

Constructors

Body 

Fields

range :: [Int]

Range argument for the schema block.

oovOnly :: Bool

When true, the entry is used only for OOV (out-of-vocabulary) words.

args :: a

Additional arguments for the schema block.

Instances

Show a => Show (Body a) 
Binary a => Binary (Body a) 

type Entry a = Maybe (Body a)

Maybe entry.

entry :: [Int] -> Entry ()

Plain entry with no additional arguments.

entryWith :: a -> [Int] -> Entry a

Entry with additional arguments.

data SchemaConf

Configuration of the schema. Each configuration element specifies the range over which a particular observation type should be taken into account. For example, the range [-1, 0, 2] means that observations of the given type are extracted from the previous (k - 1), the current (k), and the second-next (k + 2) positions when building the observation set for position k in the input sentence.

Constructors

SchemaConf 

Fields

orthC :: Entry ()

The orthB schema block.

lowOrthC :: Entry ()

The lowOrthB schema block.

lowPrefixesC :: Entry [Int]

The lowPrefixesB schema block. The list of ints given as additional arguments specifies the prefix lengths.

lowSuffixesC :: Entry [Int]

The lowSuffixesB schema block. The list of ints given as additional arguments specifies the suffix lengths.

knownC :: Entry ()

The knownB schema block.

shapeC :: Entry ()

The shapeB schema block.

packedC :: Entry ()

The packedB schema block.

begPackedC :: Entry ()

The begPackedB schema block.
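
The range semantics described for SchemaConf can be made concrete with a small helper. absolutePositions is a hypothetical name, and dropping positions that fall outside the sentence is an assumption of this sketch.

```haskell
-- Absolute positions addressed by a range `rng` at position `k` in a sentence
-- of length `n`. Positions outside the sentence are dropped (an assumption).
absolutePositions :: Int -> [Int] -> Int -> [Int]
absolutePositions n rng k = [p | off <- rng, let p = k + off, p >= 0, p < n]
```

For a five-word sentence, absolutePositions 5 [-1, 0, 2] 2 gives [1, 2, 4], while at the first position (k = 0) the same range gives [0, 2], because k - 1 lies before the start of the sentence.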

nullConf :: SchemaConf

Null configuration of the observation schema.

fromConf :: Word w => SchemaConf -> Schema w t ()

Build the schema based on the configuration.
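
A configuration would typically be written as a record update on nullConf. The sketch below uses a local, trimmed-down mock of the documented types so that it compiles without the library; the definitions of nullConf (every block disabled) and of entry and entryWith (oovOnly defaulting to False) are assumptions, and exampleConf is a hypothetical name.

```haskell
-- Local, trimmed-down mock of the documented types (only three fields kept).
data Body a = Body { range :: [Int], oovOnly :: Bool, args :: a }
  deriving (Show, Eq)

type Entry a = Maybe (Body a)

data SchemaConf = SchemaConf
  { orthC        :: Entry ()
  , lowPrefixesC :: Entry [Int]
  , knownC       :: Entry ()
  } deriving (Show, Eq)

-- Assumed definition: every schema block disabled.
nullConf :: SchemaConf
nullConf = SchemaConf Nothing Nothing Nothing

-- Assumed definitions: entries apply to all words, not only OOV ones.
entryWith :: a -> [Int] -> Entry a
entryWith v xs = Just (Body xs False v)

entry :: [Int] -> Entry ()
entry = entryWith ()

-- Hypothetical configuration: word forms in a window of one word on each
-- side, and lowercased prefixes of lengths 1 and 2 at the current position.
exampleConf :: SchemaConf
exampleConf = nullConf
  { orthC        = entry [-1, 0, 1]
  , lowPrefixesC = entryWith [1, 2] [0]
  }
```

Note the argument order of entryWith: the additional arguments (here the prefix lengths) come first and the range comes second. With the real library, a value like exampleConf would then be passed to fromConf to obtain the schema.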

Schema blocks

type Block w t a = Vector (Seg w t) -> [Int] -> Ox a

A block is a chunk of the Ox computation performed within the context of the sentence and the list of absolute sentence positions.

fromBlock :: Word w => Block w t a -> [Int] -> Bool -> Schema w t a

Transform a block into a schema, given a list of relative sentence positions and a boolean value. If the boolean is true, the block computation is performed only at positions where an OOV word resides.
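
The transformation can be sketched with simplified types. Here words are plain Strings, a block returns a list of observation values instead of running in the Ox monad, and isOov is a hypothetical predicate standing in for the library's OOV check; none of these names belong to Concraft.

```haskell
-- Simplified stand-ins (assumptions): no Vector, no Seg, no Ox monad.
type Block  = [String] -> [Int] -> [String]
type Schema = [String] -> Int -> [String]

-- Turn a block into a schema. Assumes k is a valid position in the sentence.
fromBlockSimple :: (String -> Bool) -> Block -> [Int] -> Bool -> Schema
fromBlockSimple isOov blk rng oovOnlyFlag sent k
  | oovOnlyFlag && not (isOov (sent !! k)) = []    -- known word: skip
  | otherwise = blk sent [k + off | off <- rng]    -- relative -> absolute

-- Hypothetical block: the word forms at the given positions, when in bounds.
orthSimple :: Block
orthSimple sent ks = [sent !! p | p <- ks, p >= 0, p < length sent]
```

In this sketch the OOV test is applied to the word at the current position k, not to each position in the range, which follows the wording of the description above.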

orthB :: Word w => Block w t ()

Orthographic form at the current position.

lowOrthB :: Word w => Block w t ()

Lowercased orthographic form at the current position.

lowPrefixesB :: Word w => [Int] -> Block w t ()

List of lowercased prefixes of given lengths.

lowSuffixesB :: Word w => [Int] -> Block w t ()

List of lowercased suffixes of given lengths.
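
The values these two blocks observe can be illustrated on plain Strings. lowPrefixes and lowSuffixes are hypothetical helpers; skipping lengths that are not positive or that exceed the word length is an assumption of this sketch, and the library may treat such lengths differently.

```haskell
import Data.Char (toLower)

-- Lowercased prefixes of the given lengths. Lengths that are not positive or
-- exceed the word length are skipped (an assumption of this sketch).
lowPrefixes :: [Int] -> String -> [String]
lowPrefixes lens w = [take n lw | n <- lens, n > 0, n <= length lw]
  where lw = map toLower w

-- Lowercased suffixes of the given lengths, with the same length handling.
lowSuffixes :: [Int] -> String -> [String]
lowSuffixes lens w = [drop (length lw - n) lw | n <- lens, n > 0, n <= length lw]
  where lw = map toLower w
```

For example, lowPrefixes [1, 2] "Warsaw" gives ["w", "wa"] and lowSuffixes [1, 2, 3] "Warsaw" gives ["w", "aw", "saw"].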

knownB :: Word w => Block w t ()

Whether the word at the current position is known (that is, not an OOV word).

shapeB :: Word w => Block w t ()

Shape of the word.

packedB :: Word w => Block w t ()

Packed shape of the word.

begPackedB :: Word w => Block w t ()

Packed shape of the word, combined with an indicator of whether the word is at the beginning of the sentence.
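
The notions of word shape and packed shape used by the shape blocks can be sketched as follows. The character classes and their letters ('l', 'u', 'd', 'x') are assumptions of this sketch, as are the names shape and packedShape; the library's exact encoding may differ.

```haskell
import Data.Char (isDigit, isLower, isUpper)
import Data.List (group)

-- Map each character to a class: 'l' lowercase, 'u' uppercase, 'd' digit,
-- 'x' anything else. The class letters are an assumption of this sketch.
shape :: String -> String
shape = map classify
  where
    classify c
      | isLower c = 'l'
      | isUpper c = 'u'
      | isDigit c = 'd'
      | otherwise = 'x'

-- Packed shape: collapse each run of identical classes to a single character.
packedShape :: String -> String
packedShape w = [c | (c : _) <- group (shape w)]
```

Under these assumptions, shape "Hello-42" is "ullllxdd" and packedShape "Hello-42" is "ulxd", so words of different lengths but similar make-up share the same packed shape.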