nkjp-0.4.0: Manipulating the National Corpus of Polish (NKJP)

Safe Haskell: None

Text.NKJP.Morphosyntax

Description

Parsing the NKJP morphosyntax layer.

Data types

data Para t

A paragraph.

Constructors

Para 

Fields

paraID :: t
 
sentences :: [Sent t]
 

Instances

Functor Para 
Show t => Show (Para t) 

data Sent t

A sentence.

Constructors

Sent 

Fields

sentID :: t
 
segments :: [Seg t]
 

Instances

Functor Sent 
Show t => Show (Sent t) 

data Seg t

A segment.

Constructors

Seg 

Fields

segID :: t
 
orth :: t
 
nps :: Bool
 
lexs :: [Lex t]
 
choice :: (t, t)
 

Instances

Functor Seg 
Show t => Show (Seg t) 

data Lex t

A lexical entry: a potential interpretation of the segment.

Constructors

Lex 

Fields

lexID :: t
 
base :: t
 
ctag :: t
 
msds :: [(t, t)]
 

Instances

Functor Lex 
Show t => Show (Lex t) 
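
The four records nest as Para > Sent > Seg > Lex, so most traversals are plain list operations over the record fields. The following is a minimal sketch of that idea, not code from the package. So that it compiles on its own, it declares local mirrors of the records documented above; with the package installed one would presumably import Text.NKJP.Morphosyntax instead. The helper names (paraSegs, paraOrths, segBases) and the sample values are hypothetical, and the meaning given to the pairs in msds and choice is an assumption, since this page does not describe them.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Local mirrors of the documented records (field names and types copied
-- from this page), declared here only to keep the sketch self-contained.
data Lex t = Lex
    { lexID :: t, base :: t, ctag :: t, msds :: [(t, t)] }
    deriving (Show, Functor)

data Seg t = Seg
    { segID :: t, orth :: t, nps :: Bool, lexs :: [Lex t], choice :: (t, t) }
    deriving (Show, Functor)

data Sent t = Sent { sentID :: t, segments :: [Seg t] }
    deriving (Show, Functor)

data Para t = Para { paraID :: t, sentences :: [Sent t] }
    deriving (Show, Functor)

-- | All segments of a paragraph, in document order (hypothetical helper).
paraSegs :: Para t -> [Seg t]
paraSegs = concatMap segments . sentences

-- | Orthographic forms of a paragraph, in document order (hypothetical helper).
paraOrths :: Para t -> [t]
paraOrths = map orth . paraSegs

-- | Base forms of every candidate interpretation of one segment
-- (hypothetical helper).
segBases :: Seg t -> [t]
segBases = map base . lexs

-- Hand-built sample data. Every ID, tag, and pair below is invented for
-- illustration; in particular, treating msds as (id, tag) pairs and choice
-- as (lexID, msd id) is an assumption. nps is left False throughout because
-- the page does not document the flag.
example :: Para String
example = Para
    { paraID = "p1"
    , sentences =
        [ Sent
            { sentID = "s1"
            , segments =
                [ Seg "seg1" "Ala" False
                    [ Lex "lex1" "Ala" "subst" [("msd1", "sg:nom:f")] ]
                    ("lex1", "msd1")
                , Seg "seg2" "kota" False
                    [ Lex "lex2" "kot" "subst"
                        [("msd2", "sg:gen:m2"), ("msd3", "sg:acc:m2")] ]
                    ("lex2", "msd2")
                ]
            }
        ]
    }
```

Because every record has a Functor instance, the content type can be changed in one step: fmap applied to a Para rewrites every t inside it, including the IDs. The same helpers should apply unchanged to the [Para Text] values returned by the parsing functions below.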

Parsing

parseMorph :: Text -> [Para Text]

Parse the textual contents of an ann_morphosyntax.xml file.

readMorph :: FilePath -> IO [Para Text]

Parse a stand-alone ann_morphosyntax.xml file.

readCorpus :: FilePath -> IO [(FilePath, Maybe [Para Text])]

Parse all ann_morphosyntax.xml files from the NKJP .tar.gz archive.
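
Each element of the readCorpus result pairs an archive path with a Maybe, so callers have to decide what to do with the Nothing entries. This page does not say when Nothing is returned; the sketch below only assumes it marks an entry for which no parsed paragraphs are available. The helper splitResults is hypothetical and is kept polymorphic in the payload, so it compiles without the package.

```haskell
-- | Split results shaped like the output of readCorpus into entries that
-- carry parsed data and paths whose result was Nothing.
-- Hypothetical helper, not part of the nkjp package.
splitResults :: [(FilePath, Maybe a)] -> ([(FilePath, a)], [FilePath])
splitResults xs =
    ( [ (path, paras) | (path, Just paras) <- xs ]  -- entries with data
    , [ path          | (path, Nothing)    <- xs ]  -- entries without data
    )
```

With the package installed, something like splitResults <$> readCorpus "corpus.tar.gz" (the archive name is a placeholder) would yield the parsed entries alongside the list of paths that produced nothing, which can then be logged or ignored.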