| Safe Haskell | None |
|---|---|
NLP.Corpora.Conll
Description
Data types representing the POS tags and Chunk tags derived from the Conll2000 training corpus.
- data NERTag
- data Chunk
- readTag :: Text -> Either Error Tag
- tagTxtPatterns :: [(Text, Text)]
- reversePatterns :: [(Text, Text)]
- showTag :: Tag -> Text
- replaceAll :: [(Text, Text)] -> Text -> Text
- data Tag
- = START
- | END
- | Hash
- | Dollar
- | CloseDQuote
- | OpenDQuote
- | Op_Paren
- | Cl_Paren
- | Comma
- | Term
- | Colon
- | CC
- | CD
- | DT
- | EX
- | FW
- | IN
- | JJ
- | JJR
- | JJS
- | LS
- | MD
- | NN
- | NNS
- | NNP
- | NNPS
- | PDT
- | POS
- | PRP
- | PRPdollar
- | RB
- | RBR
- | RBS
- | RP
- | SYM
- | TO
- | UH
- | VB
- | VBD
- | VBG
- | VBN
- | VBP
- | VBZ
- | WDT
- | WP
- | WPdollar
- | WRB
- | Unk
Documentation
data NERTag
Named entity categories defined for the Conll 2003 task.
data Chunk
Phrase chunk tags defined for the Conll task.
tagTxtPatterns :: [(Text, Text)]
Order matters here: the patterns are applied in reverse order for one direction of the conversion between tags and text, and top-to-bottom for the other.
reversePatterns :: [(Text, Text)]
replaceAll :: [(Text, Text)] -> Text -> Text
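The following is a minimal sketch of how a pattern list like this could drive the conversion between tag text such as `PRP$` and constructor names such as `PRPdollar`. It is not the module's actual implementation: the names `sketchPatterns`, `sketchReversePatterns`, and `sketchReplaceAll` are hypothetical, the single `("$", "dollar")` pattern is an assumption inferred from the constructor names, and the left-to-right application order is chosen only for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Text (Text)
import qualified Data.Text as T

-- Assumed pattern list: tag text on the left, the spelling used in
-- constructor names on the right. The real list may differ.
sketchPatterns :: [(Text, Text)]
sketchPatterns = [("$", "dollar")]

-- Swap each pair to map constructor spellings back to tag text.
sketchReversePatterns :: [(Text, Text)]
sketchReversePatterns = [ (to, from) | (from, to) <- sketchPatterns ]

-- Apply every (needle, replacement) pair in list order, left to right.
-- Needles must be non-empty, because T.replace rejects an empty needle.
sketchReplaceAll :: [(Text, Text)] -> Text -> Text
sketchReplaceAll patterns txt = foldl step txt patterns
  where
    step acc (needle, replacement) = T.replace needle replacement acc
```

In this sketch, `sketchReplaceAll sketchPatterns "PRP$"` yields `"PRPdollar"`, and `sketchReplaceAll sketchReversePatterns "WPdollar"` yields `"WP$"`. Order matters as soon as one pattern's output can match another pattern's needle: applying `[("a", "b"), ("b", "c")]` to `"a"` gives `"c"`, while the same pairs in the opposite order give `"b"`.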
data Tag
These tags may actually be the Penn Treebank tags, but I have not (yet?) seen the punctuation tags added to the Penn set.
This particular list was compiled from the union of:
- All tags used in the Conll2000 training corpus (contributing the punctuation tags).
- The Penn Treebank tags, listed here: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html (which contributed LS over the items in the corpus).
- The tags START, END, and Unk, which are used by Chatter.
Constructors
| Constructor | Description |
|---|---|
| START | START tag, used in training. |
| END | END tag, used in training. |
| Hash | # |
| Dollar | $ |
| CloseDQuote | '' |
| OpenDQuote | `` |
| Op_Paren | ( |
| Cl_Paren | ) |
| Comma | , |
| Term | . Sentence Terminator |
| Colon | : |
| CC | Coordinating conjunction |
| CD | Cardinal number |
| DT | Determiner |
| EX | Existential there |
| FW | Foreign word |
| IN | Preposition or subordinating conjunction |
| JJ | Adjective |
| JJR | Adjective, comparative |
| JJS | Adjective, superlative |
| LS | List item marker |
| MD | Modal |
| NN | Noun, singular or mass |
| NNS | Noun, plural |
| NNP | Proper noun, singular |
| NNPS | Proper noun, plural |
| PDT | Predeterminer |
| POS | Possessive ending |
| PRP | Personal pronoun |
| PRPdollar | Possessive pronoun |
| RB | Adverb |
| RBR | Adverb, comparative |
| RBS | Adverb, superlative |
| RP | Particle |
| SYM | Symbol |
| TO | to |
| UH | Interjection |
| VB | Verb, base form |
| VBD | Verb, past tense |
| VBG | Verb, gerund or present participle |
| VBN | Verb, past participle |
| VBP | Verb, non-3rd person singular present |
| VBZ | Verb, 3rd person singular present |
| WDT | Wh-determiner |
| WP | Wh-pronoun |
| WPdollar | Possessive wh-pronoun |
| WRB | Wh-adverb |
| Unk | |
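As a rough illustration of how the constructors above relate to their textual tags, here is a small sketch over a handful of them. It is an assumption-laden example, not the library's code: `MiniTag`, `miniShowTag`, and `miniReadTag` are hypothetical names, the error type is simplified to `Text`, and only a few constructors are included. Punctuation tags are matched explicitly, and the remaining tags go through the derived `Show` and `Read` instances with `$` swapped for `dollar`.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Text (Text)
import qualified Data.Text as T
import           Text.Read (readMaybe)

-- A hypothetical subset of the Tag constructors, for illustration only.
data MiniTag = Hash | Dollar | Comma | Term | NN | PRP | PRPdollar | WPdollar | Unk
  deriving (Show, Read, Eq)

-- Render a tag as text: punctuation is special-cased, everything else
-- uses the constructor name with "dollar" turned back into "$".
miniShowTag :: MiniTag -> Text
miniShowTag Hash   = "#"
miniShowTag Dollar = "$"
miniShowTag Comma  = ","
miniShowTag Term   = "."
miniShowTag tag    = T.replace "dollar" "$" (T.pack (show tag))

-- Parse tag text: punctuation is special-cased, everything else has "$"
-- rewritten to "dollar" and is then read as a constructor name.
miniReadTag :: Text -> Either Text MiniTag
miniReadTag "#" = Right Hash
miniReadTag "$" = Right Dollar
miniReadTag "," = Right Comma
miniReadTag "." = Right Term
miniReadTag txt =
  case readMaybe (T.unpack (T.replace "$" "dollar" txt)) of
    Just tag -> Right tag
    Nothing  -> Left ("unknown tag: " <> txt)
```

With these definitions, `miniReadTag "PRP$"` gives `Right PRPdollar` and `miniShowTag WPdollar` gives `"WP$"`, so the two functions round-trip for the tags in this subset. Falling back to `Left` for unrecognised text is a design choice of the sketch; a caller could instead map such input to `Unk`.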