chatter-0.8.0.2: A library of simple NLP algorithms.

Safe Haskell: None
Language: Haskell2010

NLP.Corpora.Conll

Description

Data types representing the POS tags and Chunk tags derived from the Conll2000 training corpus.

Documentation

data Chunk

Phrase chunk tags defined for the Conll task.

Constructors

ADJP 
ADVP 
CONJP 
INTJ 
LST 
NP

Noun Phrase.

PP

Prepositional Phrase.

PRT 
SBAR 
UCP 
VP

Verb Phrase.

O

"out"; not a chunk.

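As a rough illustration of how these chunk tags show up in Conll2000-style data, the sketch below parses labels such as `B-NP`, `I-NP`, and `O`. It assumes the usual `B-`/`I-` prefix convention of the chunking task. The `Chunk` type here is a local stand-in that mirrors the constructor list above rather than the type exported by this module, and `Position`, `parseLabel`, and `phrase` are hypothetical names introduced only for this example.

```haskell
module Main where

import Text.Read (readMaybe)

-- Local stand-in mirroring the constructors documented above;
-- not imported from NLP.Corpora.Conll.
data Chunk = ADJP | ADVP | CONJP | INTJ | LST | NP | PP | PRT | SBAR | UCP | VP | O
  deriving (Show, Read, Eq, Ord, Enum, Bounded)

-- Hypothetical helper type: where a token sits relative to a chunk.
data Position = Begin | Inside | Outside
  deriving (Show, Eq)

-- Parse a label such as "B-NP", "I-VP" or "O".
-- Anything else (unknown phrase name, missing prefix) yields Nothing.
parseLabel :: String -> Maybe (Position, Chunk)
parseLabel "O"              = Just (Outside, O)
parseLabel ('B' : '-' : nm) = (,) Begin  <$> phrase nm
parseLabel ('I' : '-' : nm) = (,) Inside <$> phrase nm
parseLabel _                = Nothing

-- Read a phrase name via the derived Read instance, rejecting "O",
-- which is only meaningful without a prefix.
phrase :: String -> Maybe Chunk
phrase nm = case readMaybe nm of
  Just O -> Nothing
  other  -> other

main :: IO ()
main = mapM_ (print . parseLabel) ["B-NP", "I-NP", "B-VP", "O", "B-XX"]
```

Relying on the derived `Read` instance keeps the sketch short; a real parser would more likely use an explicit lookup table so that malformed input is handled deliberately.
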
tagTxtPatterns :: [(Text, Text)]

Order matters here: the patterns are applied in reverse order when converting in one direction between tags and their text form, and top-to-bottom when converting in the other.
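
To see why the order of a replacement list can matter, here is a minimal sketch using `Data.Text.replace`. The pattern list is hypothetical and deliberately contrived so that the second pattern matches the output of the first; it is not the actual contents of `tagTxtPatterns`, and `forward`, `backward`, and `backwardWrongOrder` are names made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical patterns, for illustration only. The second pattern
-- matches text produced by the first, so the two interact.
examplePatterns :: [(Text, Text)]
examplePatterns =
  [ ("$", "dollar")
  , ("dollar", "DOLLAR")
  ]

-- Apply every (from, to) replacement, top-to-bottom.
forward :: [(Text, Text)] -> Text -> Text
forward pats txt = foldl (\acc (from, to) -> T.replace from to acc) txt pats

-- Undo the replacements: swap each pair and walk the list bottom-to-top.
backward :: [(Text, Text)] -> Text -> Text
backward pats txt =
  foldl (\acc (from, to) -> T.replace to from acc) txt (reverse pats)

-- Same swapped pairs, but walked top-to-bottom. With interacting
-- patterns this no longer undoes 'forward'.
backwardWrongOrder :: [(Text, Text)] -> Text -> Text
backwardWrongOrder pats txt =
  foldl (\acc (from, to) -> T.replace to from acc) txt pats

main :: IO ()
main = do
  let encoded = forward examplePatterns "PRP$"
  print encoded
  print (backward examplePatterns encoded)
  print (backwardWrongOrder examplePatterns encoded)
```

With these patterns, `backward` recovers `PRP$` from the encoded form, while `backwardWrongOrder` stops at `PRPdollar`, because the inner replacement is attempted before the outer one has been undone.
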

data Tag

These tags may actually be the Penn Treebank tags. But I have not (yet?) seen the punctuation tags added to the Penn set.

This particular list was compiled from the union of:

Constructors

START

START tag, used in training.

END

END tag, used in training.

Hash

#

Dollar

$

CloseDQuote

''

OpenDQuote

``

Op_Paren

(

Cl_Paren

)

Comma

,

Term

. Sentence Terminator

Colon

:

CC

Coordinating conjunction

CD

Cardinal number

DT

Determiner

EX

Existential there

FW

Foreign word

IN

Preposition or subordinating conjunction

JJ

Adjective

JJR

Adjective, comparative

JJS

Adjective, superlative

LS

List item marker

MD

Modal

NN

Noun, singular or mass

NNS

Noun, plural

NNP

Proper noun, singular

NNPS

Proper noun, plural

PDT

Predeterminer

POS

Possessive ending

PRP

Personal pronoun

PRPdollar

Possessive pronoun

RB

Adverb

RBR

Adverb, comparative

RBS

Adverb, superlative

RP

Particle

SYM

Symbol

TO

to

UH

Interjection

VB

Verb, base form

VBD

Verb, past tense

VBG

Verb, gerund or present participle

VBN

Verb, past participle

VBP

Verb, non-3rd person singular present

VBZ

Verb, 3rd person singular present

WDT

Wh-determiner

WP

Wh-pronoun

WPdollar

Possessive wh-pronoun

WRB

Wh-adverb

Unk
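
Several constructors above stand for tag text that is not a legal Haskell identifier (`#`, `$`, `PRP$`, and so on). The sketch below shows one way such a mapping could be written with a plain lookup table. It uses a small local subset of the constructors, not the module's own `Tag` type; `tagTable`, `tagText`, and `textTag` are hypothetical names, and falling back to `Unk` for unrecognised text is an assumption, since the listing above does not describe `Unk`.

```haskell
module Main where

import Data.Maybe (fromMaybe)
import Data.Tuple (swap)

-- Local subset of the constructors listed above, for illustration only.
data Tag = Hash | Dollar | CloseDQuote | OpenDQuote | Comma | Term
         | NN | PRPdollar | Unk
  deriving (Show, Eq)

-- Constructor paired with the tag text shown in the listing.
tagTable :: [(Tag, String)]
tagTable =
  [ (Hash, "#")
  , (Dollar, "$")
  , (CloseDQuote, "''")
  , (OpenDQuote, "``")
  , (Comma, ",")
  , (Term, ".")
  , (NN, "NN")
  , (PRPdollar, "PRP$")
  ]

-- Render a tag as corpus text. Tags missing from the table
-- (here only Unk) are rendered as "Unk".
tagText :: Tag -> String
tagText t = fromMaybe "Unk" (lookup t tagTable)

-- Read corpus text back into a tag.
-- Assumption: unrecognised text maps to Unk.
textTag :: String -> Tag
textTag s = fromMaybe Unk (lookup s (map swap tagTable))

main :: IO ()
main = do
  print (map textTag ["PRP$", "#", ".", "???"])
  putStrLn (tagText PRPdollar)
```
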