concraft-pl-0.3.1: Morphological tagger for Polish

Safe Haskell: None

NLP.Concraft.Polish.Morphosyntax

Description

Morphosyntax data layer for Polish.

Tag

type Tag = Text

A textual representation of a morphosyntactic tag.

Segment

data Seg t

A segment consists of a word and a set of morphosyntactic interpretations.

Constructors

Seg 

Fields

word :: Word
 
interps :: Map (Interp t) Bool

Interpretations of the token, each annotated with a disamb Boolean value: True if the interpretation is correct within the context.

Instances

Eq t => Eq (Seg t) 
Ord t => Ord (Seg t) 
Show t => Show (Seg t) 
(Ord t, Binary t) => Binary (Seg t) 
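
As an illustration of how these records fit together, here is a minimal sketch of building a segment by hand. The types are local mirrors of the ones documented on this page, redefined so that the snippet compiles without the package installed. The example word, the tag strings, and the helper chosen are hypothetical and are not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Prelude hiding (Word)
import qualified Data.Map as M
import           Data.Text (Text)

-- Local mirrors of the documented types (not imported from the package).
data Space = None | Space | NewLine deriving (Show, Eq, Ord)

data Word = Word
    { orth  :: Text
    , space :: Space
    , known :: Bool
    } deriving (Show, Eq, Ord)

data Interp t = Interp
    { base :: Maybe Text
    , tag  :: t
    } deriving (Show, Eq, Ord)

data Seg t = Seg
    { word    :: Word
    , interps :: M.Map (Interp t) Bool
    } deriving (Show, Eq, Ord)

-- Hypothetical segment: the form "piec" read as a noun or as a verb.
-- The tag strings are illustrative only.
exampleSeg :: Seg Text
exampleSeg = Seg
    { word    = Word { orth = "piec", space = Space, known = True }
    , interps = M.fromList
        [ (Interp (Just "piec") "subst:sg:nom:m3", True)   -- correct in context
        , (Interp (Just "piec") "inf:imperf",      False)
        ]
    }

-- Hypothetical helper: the interpretations whose disamb flag is True.
chosen :: Seg t -> [Interp t]
chosen = M.keys . M.filter id . interps
```

Keying the map by interpretation means that each (base, tag) pair occurs at most once per segment, while the Boolean value carries the disambiguation.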

data Word

A word.

Constructors

Word 

Fields

orth :: Text
 
space :: Space
 
known :: Bool
 

data Interp t

A morphosyntactic interpretation. TODO: Should we allow base to be Nothing?

Constructors

Interp 

Fields

base :: Maybe Text
 
tag :: t
 

Instances

Eq t => Eq (Interp t) 
Ord t => Ord (Interp t) 
Show t => Show (Interp t) 
(Ord t, Binary t) => Binary (Interp t) 

data Space

No space, space or newline. TODO: Perhaps we should use a more informative data type.

Constructors

None 
Space 
NewLine 

select :: Ord a => a -> Seg a -> Seg a

Select one interpretation.
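
The page does not say how select treats the remaining interpretations, or what happens when the tag is absent from the segment. The sketch below shows one plausible reading, under the assumption that every matching interpretation is marked True, all others are marked False, and a missing tag is added with an unknown base form. The name selectSketch, the local type mirrors, and the example data are hypothetical; this is not the library's implementation.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Prelude hiding (Word)
import qualified Data.Map as M
import           Data.Text (Text)

-- Local mirrors of the documented types (not imported from the package).
data Space = None | Space | NewLine deriving (Show, Eq, Ord)
data Word = Word { orth :: Text, space :: Space, known :: Bool }
    deriving (Show, Eq, Ord)
data Interp t = Interp { base :: Maybe Text, tag :: t }
    deriving (Show, Eq, Ord)
data Seg t = Seg { word :: Word, interps :: M.Map (Interp t) Bool }
    deriving (Show, Eq, Ord)

-- Assumed behaviour: mark interpretations carrying tag x as True and all
-- others as False; if no interpretation carries x, add one with no base.
selectSketch :: Ord a => a -> Seg a -> Seg a
selectSketch x seg = seg { interps = marked }
  where
    cleared = M.map (const False) (interps seg)
    hits    = M.filterWithKey (\i _ -> tag i == x) cleared
    marked
        | M.null hits = M.insert (Interp Nothing x) True cleared
        | otherwise   = M.union (M.map (const True) hits) cleared

-- Hypothetical input segment with illustrative tag strings.
demo :: Seg Text
demo = Seg
    { word    = Word "piec" Space True
    , interps = M.fromList
        [ (Interp (Just "piec") "subst:sg:nom:m3", True)
        , (Interp (Just "piec") "inf:imperf",      False)
        ]
    }
```

M.union is left-biased, so the entries re-marked as True take precedence over their cleared counterparts.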

Sentence

type Sent t = [Seg t]

A sentence.

data SentO t

A sentence together with its original textual representation.

Constructors

SentO 

Fields

segs :: [Seg t]
 
orig :: Text
 

restore :: Sent t -> Text

Restore the textual representation of a sentence. The result is not very accurate; it could be improved by enriching the representation of a space.

withOrig :: Sent t -> SentO t

Use restore to translate a Sent to a SentO.
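
To make the relationship between restore and withOrig concrete, here is a minimal sketch under the assumption that the space field describes the whitespace preceding a word, and that the text is rebuilt by concatenating that whitespace with each orthographic form. The names spaceText, restoreSketch and withOrigSketch, the local type mirrors, and the example sentence are hypothetical; the library's own definitions may differ.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Prelude hiding (Word)
import qualified Data.Map as M
import qualified Data.Text as T

-- Local mirrors of the documented types (not imported from the package).
data Space = None | Space | NewLine deriving (Show, Eq, Ord)
data Word = Word { orth :: T.Text, space :: Space, known :: Bool }
    deriving (Show, Eq, Ord)
data Interp t = Interp { base :: Maybe T.Text, tag :: t }
    deriving (Show, Eq, Ord)
data Seg t = Seg { word :: Word, interps :: M.Map (Interp t) Bool }
    deriving (Show, Eq, Ord)
data SentO t = SentO { segs :: [Seg t], orig :: T.Text }
    deriving (Show)

-- Assumption: each Space value stands for exactly one separator string.
spaceText :: Space -> T.Text
spaceText None    = ""
spaceText Space   = " "
spaceText NewLine = "\n"

-- Rebuild the text by prefixing every word with its (assumed) preceding space.
restoreSketch :: [Seg t] -> T.Text
restoreSketch = T.concat . map render
  where
    render seg = T.append (spaceText (space (word seg))) (orth (word seg))

-- Pair the segments with the text restored from them.
withOrigSketch :: [Seg t] -> SentO t
withOrigSketch xs = SentO { segs = xs, orig = restoreSketch xs }

-- Hypothetical sentence with no interpretations attached.
exampleSent :: [Seg T.Text]
exampleSent =
    [ Seg (Word "Ala"  None  True) M.empty
    , Seg (Word "ma"   Space True) M.empty
    , Seg (Word "kota" Space True) M.empty
    , Seg (Word "."    None  True) M.empty
    ]
```

Because Space has only three values, runs of several spaces or tabs cannot be recovered, which is the inaccuracy the description of restore mentions.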

Conversion

packSeg :: Tagset -> Seg Tag -> Seg Word Tag

Convert a segment to a segment from the core library.

packSent :: Tagset -> Sent Tag -> Sent Word Tag

Convert a sentence to a sentence from the core library.

packSentO :: Tagset -> SentO Tag -> SentO Word Tag

Convert a sentence with its original text to the corresponding sentence from the core library.