concraft-0.4.0: Morphosyntactic tagging tool based on constrained CRFs

Safe Haskell: None

NLP.Concraft.Format.Plain

Description

Simple format for morphosyntax representation which assumes that all tags have a textual representation with no spaces inside and that one of the tags indicates unknown words.

Types

data Token

A token.

Constructors

Token 

Fields

orth :: Text
 
space :: Space
 
known :: Bool
 
interps :: Map Interp Bool

Interpretations of the token, each annotated with a Boolean disamb value: True means the interpretation is correct within the context.

data Interp

Constructors

Interp 

Fields

base :: Maybe Text
 
tag :: Tag
 

data Space

No space, space or newline.

Constructors

None 
Space 
NewLine 
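
The three types above can be illustrated with a small, self-contained sketch. It does not import concraft: the declarations below are local mirrors of the fields and constructors listed on this page. The definition `type Tag = Text`, the derived instances, the sample token, and the helper `chosen` are assumptions made for illustration only, not part of the documented API.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Map as M
import           Data.Text (Text)

-- Assumption: tags are plain text. The page only says that tags have a
-- textual representation; it does not show how Tag is defined.
type Tag = Text

-- Local mirrors of the documented types (not imported from concraft).
-- The derived instances are assumptions; Ord on Interp is needed here
-- because Interp is used as a Map key.
data Space = None | Space | NewLine
    deriving (Show, Eq, Ord)

data Interp = Interp
    { base :: Maybe Text
    , tag  :: Tag }
    deriving (Show, Eq, Ord)

data Token = Token
    { orth    :: Text
    , space   :: Space
    , known   :: Bool
    , interps :: M.Map Interp Bool }
    deriving (Show, Eq)

-- Hypothetical sample token with two candidate interpretations,
-- one of which is marked as correct in context (disamb = True).
exampleToken :: Token
exampleToken = Token
    { orth    = "saw"
    , space   = Space
    , known   = True
    , interps = M.fromList
        [ (Interp (Just "see") "verb", True)
        , (Interp (Just "saw") "noun", False) ] }

-- Hypothetical helper: the interpretations whose disamb flag is True.
chosen :: Token -> [Interp]
chosen = M.keys . M.filter id . interps
```

Keeping the disamb flag as the map value means that the set of candidate interpretations and the selected subset live in one structure, so a helper such as `chosen` only has to filter on the value.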

Format handler

plainFormat :: Tag -> Doc [] [Token] Token

Create a document handler given the value of the ignore tag.

Parsing

parsePlain :: Tag -> Text -> [[Token]]

Parse the text in the plain format given the oov tag.

parseSent :: Tag -> Text -> [Token]

Parse the sentence in the plain format given the oov tag.

Printing

showPlain :: Tag -> [[Token]] -> Text

Show the plain data.

showSent :: Tag -> [Token] -> Text

Show the sentence.