Defaulting Map: a Map that returns a default value when queried for a key that does not exist.

- Create an empty defaulting map.
- Query the map for a value. Returns the default if the key is not found.
- Create a defaulting map from a default value and a list.
- Access the keys as a list.
- Access the non-default values as a list.
- Map a function over the values in a map.
- Fold over the values in the map. Note that this does *not* fold over the default value; this fold behaves in the same way as a standard fold.
- Compute the union of two maps using the specified per-value combination function and the specified new-map default value. Arguments: the function used to combine values; the new map's default value; the first map to combine; the second map to combine.

Path to the directory containing all the PLUG archives.

Boolean type to indicate case sensitivity for textual comparisons.

Just a handy alias for Text.

A fallback POS tag instance.

The class of POS tags. We use a typeclass here because POS tags just need a few things in excess of equality (they also need to be serializable and human readable). Passing around all the constraints everywhere becomes a hassle, and it's handy to have a uniform interface to the different kinds of tag types.

This typeclass also allows corpus-specific tags to be distinguished; they have different semantics, so they should not be merged. That said, if you wish to create a unifying POS tag set, and mappings into that set, you can use the type system to ensure that that is done correctly.
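The defaulting-map behavior described above can be sketched in Python (hypothetical names; chatter's actual type is a Haskell map, so this is only an illustration of the semantics):

```python
class DefaultMap:
    """A map that returns a fixed default for missing keys (sketch)."""

    def __init__(self, default, entries=()):
        self.default = default          # value returned for absent keys
        self.entries = dict(entries)    # the non-default key/value pairs

    def lookup(self, key):
        # Query the map; fall back to the default when the key is absent.
        return self.entries.get(key, self.default)

    def union_with(self, combine, new_default, other):
        # Union of two maps: overlapping keys are combined per-value,
        # and the result gets an explicitly chosen new default.
        merged = dict(self.entries)
        for k, v in other.entries.items():
            merged[k] = combine(merged[k], v) if k in merged else v
        return DefaultMap(new_default, merged.items())

m1 = DefaultMap(0, {"a": 1, "b": 2}.items())
m2 = DefaultMap(0, {"b": 10}.items())
m3 = m1.union_with(lambda x, y: x + y, -1, m2)
print(m3.lookup("b"), m3.lookup("zzz"))  # combined value, then the new default
```

Note that, as in the description above, the fold and map operations would act only on the stored (non-default) values; the default participates only in lookups and unions.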
This may get renamed to POSTag at some later date.

Check if a tag is a determiner tag.

The class of things that can be regarded as chunks. Chunk tags are much like POS tags, but should not be confused with them. Generally, chunks distinguish between different phrasal categories (e.g. noun phrases, verb phrases, prepositional phrases, etc.).

The class of named entity sets. This typeclass can be defined entirely in terms of the required class constraints.

Tag instance for unknown tagsets.

Raw tokenized text. This type has a convenience instance to simplify use.

A POS-tagged token.

A tagged sentence has POS tags. Generated by a part-of-speech tagger:

    tagger :: Tag tag => Sentence -> TaggedSentence tag

A Chunk that strictly contains chunks or POS tags.

A data type to represent the portions of a parse tree for chunks. Note that this part of the parse tree could be a POS tag with no chunk.

A chunked sentence has POS tags and chunk tags. Generated by a chunker:

    chunker :: (Chunk chunk, Tag tag) => TaggedSentence tag -> ChunkedSentence chunk tag

A sentence of tokens without tags. Generated by the tokenizer:

    tokenizer :: Text -> Sentence

Extract the token list from a sentence.

Apply a parallel list of tags to a sentence.

Generate a Text representation of a TaggedSentence in the common tagged format, e.g.: "the/at dog/nn jumped/vbd ./."

Remove the tags from a tagged sentence.

Extract the tags from a tagged sentence, returning a parallel list of tags along with the underlying Sentence.

Combine the results of POS taggers, using the second parameter to fill in unknown entries, where possible.

Merge two tagged sentences, preferring the tags in the first. Delegates to the per-token merge below.

Returns the first parameter, unless it is tagged as unknown. Throws an error if the text does not match.

Helper to create chunk parse-tree values.

Helper to create chunk parse-tree values that just hold POS-tagged data.

Show the underlying text token only.

Show the text and tag.
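The common tagged format mentioned above ("the/at dog/nn jumped/vbd ./.") is simple enough to sketch as plain string munging (hypothetical helper names, not chatter's API):

```python
def render_tagged(pairs):
    # Produce the common "word/tag" representation of a tagged sentence.
    return " ".join(f"{word}/{tag}" for word, tag in pairs)

def strip_tags(text):
    # Drop the tags, keeping only the tokens. rsplit on "/" so tokens
    # like "./." keep their text half intact.
    return [tok.rsplit("/", 1)[0] for tok in text.split()]

sent = [("the", "at"), ("dog", "nn"), ("jumped", "vbd"), (".", ".")]
print(render_tagged(sent))                # the/at dog/nn jumped/vbd ./.
print(strip_tags(render_tagged(sent)))   # ['the', 'dog', 'jumped', '.']
```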
Extract the text of a token.

Extract the last three characters of a token, if the token is long enough; otherwise returns the full token text.

Extract the list of POS tags from a tagged sentence.

Calculate the length of a tagged sentence (in terms of the number of tokens).

Brutally concatenate two tagged sentences.

True if the input sentence contains the given text token. Does not do partial or approximate matching, and compares details in a fully case-sensitive manner.

True if the input sentence contains the given POS tag. Does not do partial matching (such as prefix matching).

Compare the POS-tag token with a supplied tag string.

Compare the POS-tagged token with a text string.

Compare a token with a text string.

Data type to indicate IOB tags for chunking:
- Not in a chunk.
- In-chunk tag.
- Begin marker.

Turn an IOB result into a tree.

Parse an IOB-encoded corpus.

These tags may actually be the Penn Treebank tags, but I have not (yet?) seen the punctuation tags added to the Penn set. This particular list was compiled from the union of:
- All tags used in the Conll2000 training corpus (contributing the punctuation tags).
- The Penn Treebank tags, listed here: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html (which contributed LS over the items in the corpus).
- The tags START, END, and Unk, which are used by Chatter.
- Wh-adverb
- Possessive wh-pronoun
- Wh-pronoun
- Wh-determiner
- Verb, 3rd person singular present
- Verb, non-3rd person singular present
- Verb, past participle
- Verb, gerund or present participle
- Verb, past tense
- Verb, base form
- Interjection
- to
- Symbol
- Particle
- Adverb, superlative
- Adverb, comparative
- Adverb
- Possessive pronoun
- Personal pronoun
- Possessive ending
- Predeterminer
- Proper noun, plural
- Proper noun, singular
- Noun, plural
- Noun, singular or mass
- Modal
- List item marker
- Adjective, superlative
- Adjective, comparative
- Adjective
- Preposition or subordinating conjunction
- Foreign word
- Existential there
- Determiner
- Cardinal number
- Coordinating conjunction
- Punctuation tags: : (colon), . (sentence terminator), , ) ( `` '' $ #
- END tag, used in training.
- START tag, used in training.

Phrase chunk tags defined for the Conll task:
- Out; not a chunk.
- Verb phrase.
- Prepositional phrase.
- Noun phrase.

Named entity categories defined for the Conll 2003 task.

Order matters here: the patterns are replaced in reverse order when parsing tags, and in top-to-bottom order when generating tags.

- Unknown.
- WH-adverb + modal auxiliary, e.g. where'd
- WH-adverb + preposition, e.g. why'n
- WH-adverb + verb to do, present tense, 3rd person singular, e.g. how's
- WH-adverb + verb to do, past tense, negated, e.g. whyn't
- WH-adverb + verb to do, past tense, e.g. where'd how'd
- WH-adverb + verb to do, present, not 3rd person singular, e.g. howda
- WH-adverb + verb to be, present, 3rd person singular, e.g. how's where's
- WH-adverb + verb to be, present, 2nd person singular or all persons plural, e.g. where're
- WH-adverb, e.g. however when where why whereby wherever how whenever whereon wherein wherewith wheare wherefore whereof howsabout
- WH-qualifier, e.g. however how
- WH-pronoun, nominative + modal auxiliary, e.g. who'll that'd who'd that'll
- WH-pronoun, nominative + verb to have, present tense, 3rd person singular, e.g. who's that's
- WH-pronoun, nominative + verb to have, past tense, e.g. who'd
- WH-pronoun, nominative + verb to be, present, 3rd person singular, e.g. that's who's
- WH-pronoun, nominative, e.g. that who whoever whosoever what whatsoever
- WH-pronoun, accusative, e.g. whom that who
- WH-pronoun, genitive, e.g. whose whosever
- WH-determiner + verb to have, present tense, 3rd person singular, e.g. what's
- WH-determiner + verb to do, past tense, e.g. what'd
- WH-determiner + verb to do, uninflected present tense + pronoun, personal, nominative, not 3rd person singular, e.g. whaddya
- WH-determiner + verb to be, present tense, 3rd person singular, e.g. what's
- WH-determiner + verb to be, present, 2nd person singular or all persons plural + pronoun, personal, nominative, not 3rd person singular, e.g. whaddya
- WH-determiner + verb to be, present tense, 2nd person singular or all persons plural, e.g. what're
- WH-determiner, e.g. which what whatever whichever whichever-the-hell
- verb, present tense, 3rd person singular, e.g. deserves believes receives takes goes expires says opposes starts permits expects thinks faces votes teaches holds calls fears spends collects backs eliminates sets flies gives seeks reads ...
- verb, past participle + infinitival to, e.g. gotta
- verb, past participle, e.g. conducted charged won received studied revised operated accepted combined experienced recommended effected granted seen protected adopted retarded notarized selected composed gotten printed ...
- verb, present participle + infinitival to, e.g. gonna
- verb, present participle or gerund, e.g. modernizing improving purchasing Purchasing lacking enabling pricing keeping getting picking entering voting warning making strengthening setting neighboring attending participating moving ...
- verb, past tense, e.g. said produced took recommended commented urged found added praised charged listed became announced brought attended wanted voted defeated received got stood shot scheduled feared promised made ...
- verb, base: uninflected present, imperative or infinitive; hyphenated pair, e.g. say-speak
- verb, base: uninflected present, imperative or infinitive + infinitival to, e.g. wanta wanna
- verb, imperative + adverbial particle, e.g. g'ahn c'mon
- verb, uninflected present tense + pronoun, personal, accusative, e.g. let's lemme gimme
- verb, base: uninflected present, imperative or infinitive + adjective, e.g. die-dead
- verb, base: uninflected present, imperative or infinitive + preposition, e.g. lookit
- verb, base: uninflected present or infinitive + article, e.g. wanna
- verb, base: uninflected present, imperative or infinitive, e.g. investigate find act follow inure achieve reduce take remedy re-set distribute realize disable feel receive continue place protect eliminate elaborate work permit run enter force ...
- interjection, e.g. Hurrah bang whee hmpf ah goodbye oops oh-the-pain-of-it ha crunch say oh why see well hello lo alas tarantara rum-tum-tum gosh hell keerist Jesus Keeeerist boy c'mon 'mon goddamn bah hoo-pig damn ...
- infinitival to + verb, infinitive, e.g. t'jawn t'lah
- infinitival to, e.g. to t'
- adverb, particle + preposition, e.g. out'n outta
- adverb, particle, e.g. up out off down over on in about through across after
- adverb, nominal, e.g. here afar then
- adverb, superlative, e.g. most best highest uppermost nearest brightest hardest fastest deepest farthest loudest ...
- adverb, comparative + conjunction, coordinating, e.g. more'n
- adverb, comparative, e.g. further earlier better later higher tougher more harder longer sooner less faster easier louder farther oftener nearer cheaper slower tighter lower worse heavier quicker ...
- adverb + conjunction, coordinating, e.g. well's soon's
- adverb + verb to be, present tense, 3rd person singular, e.g. here's there's
- adverb, genitive, e.g. else's
- adverb, e.g. only often generally also nevertheless upon together back newly no likely meanwhile near then heavily there apparently yet outright fully aside consistently specifically formally ever just ...
- qualifier, post, e.g. indeed enough still 'nuff
- qualifier, pre, e.g. well less very most so real as highly fundamentally even how much remarkably somewhat more completely too thus ill deeply little overly halfway almost impossibly far severely such ...
- pronoun, personal, nominative, not 3rd person singular + verb, uninflected present tense, e.g. y'know
- pronoun, personal, nominative, not 3rd person singular + modal auxiliary, e.g. you'll we'll I'll we'd I'd they'll they'd you'd
- pronoun, personal, nominative, not 3rd person singular + verb to have, past tense, e.g. I'd you'd we'd they'd
- pronoun, personal, nominative, not 3rd person singular + verb to have, uninflected present tense, e.g. I've we've they've you've
- pronoun, personal, nominative, not 3rd person singular + verb to be, present tense, 3rd person singular, negated, e.g. taint
- pronoun, personal, nominative, not 3rd person singular + verb to be, present tense, 3rd person singular, e.g. you's
- pronoun, personal, nominative, not 3rd person singular + verb to be, present tense, 2nd person singular or all persons plural, e.g. we're you're they're
- pronoun, personal, nominative, not 3rd person singular + verb to be, present tense, 1st person singular, e.g. I'm Ahm
- pronoun, personal, nominative, not 3rd person singular, e.g. they we I you ye thou you'uns
- pronoun, personal, nominative, 3rd person singular + modal auxiliary, e.g. he'll she'll it'll he'd it'd she'd
- pronoun, personal, nominative, 3rd person singular + verb to have, present tense, 3rd person singular, e.g. it's he's she's
- pronoun, personal, nominative, 3rd person singular + verb to have, past tense, e.g. she'd he'd it'd
- pronoun, personal, nominative, 3rd person singular + verb to be, present tense, 3rd person singular, e.g. it's he's she's
- pronoun, personal, nominative, 3rd person singular, e.g. it he she thee
- pronoun, personal, accusative, e.g. them it him me us you 'em her thee we'uns
- pronoun, plural, reflexive, e.g. themselves ourselves yourselves
- pronoun, singular, reflexive, e.g. itself himself myself yourself herself oneself ownself
- pronoun, possessive, e.g. ours mine his hers theirs yours
- determiner, possessive, e.g. our its his their my your her out thy mine thine
- pronoun, nominal + modal auxiliary, e.g. someone'll somebody'll anybody'd
- pronoun, nominal + verb to have, present tense, 3rd person singular, e.g. nobody's somebody's one's
- pronoun, nominal + verb to have, past tense, e.g. nobody'd
- pronoun, nominal + verb to be, present tense, 3rd person singular, e.g. nothing's everything's somebody's nobody's someone's
- pronoun, nominal, genitive, e.g. one's someone's anybody's nobody's everybody's anyone's everyone's
- pronoun, nominal, e.g. none something everything one anyone nothing nobody everybody everyone anybody anything someone no-one nothin
- numeral, ordinal, e.g. first 13th third nineteenth 2d 61st second sixth eighth ninth twenty-first eleventh 50th eighteenth Thirty-ninth 72nd 1/20th twentieth mid-19th thousandth 350th sixteenth 701st ...
- noun, plural, adverbial, e.g. Sundays Mondays Saturdays Wednesdays Souths Fridays
- noun, singular, adverbial + modal auxiliary, e.g. today'll
- noun, singular, adverbial, genitive, e.g. Saturday's Monday's yesterday's tonight's tomorrow's Sunday's Wednesday's Friday's today's Tuesday's West's Today's South's
- noun, singular, adverbial, e.g. Friday home Wednesday Tuesday Monday Sunday Thursday yesterday tomorrow tonight West East Saturday west left east downtown north northeast southeast northwest North South right ...
- noun, plural, proper, genitive, e.g. Republicans' Orioles' Birds' Yanks' Redbirds' Bucs' Yankees' Stevenses' Geraghtys' Burkes' Wackers' Achaeans' Dresbachs' Russians' Democrats' Gershwins' Adventists' Negroes' Catholics' ...
- noun, plural, proper, e.g. Chases Aderholds Chapelles Armisteads Lockies Carbones French Marskmen Toppers Franciscans Romans Cadillacs Masons Blacks Catholics British Dixiecrats Mississippians Congresses ...
- noun, singular, proper + modal auxiliary, e.g. Gyp'll John'll
- noun, singular, proper + verb to have, present tense, 3rd person singular, e.g. Bill's Guardino's Celie's Skolman's Crosson's Tim's Wally's
- noun, singular, proper + verb to be, present tense, 3rd person singular, e.g. W.'s Ike's Mack's Jack's Kate's Katharine's Black's Arthur's Seaton's Buckhorn's Breed's Penny's Rob's Kitty's Blackwell's Myra's Wally's Lucille's Springfield's Arlene's
- noun, singular, proper, genitive, e.g. Green's Landis' Smith's Carreon's Allison's Boston's Spahn's Willie's Mickey's Milwaukee's Mays' Howsam's Mantle's Shaw's Wagner's Rickey's Shea's Palmer's Arnold's Broglio's ...
- noun, singular, proper, e.g. Fulton Atlanta September-October Durwood Pye Ivan Allen Jr. Jan. Alpharetta Grady William B. Hartsfield Pearl Williams Aug. Berry J. M. Cheshire Griffin Opelika Ala. E. Pelham Snodgrass ...
- noun, plural, common + modal auxiliary, e.g. duds'd oystchers'll
- noun, plural, common, genitive, e.g. taxpayers' children's members' States' women's cutters' motorists' steelmakers' hours' Nations' lawyers' prisoners' architects' tourists' Employers' secretaries' Rogues' ...
- noun, plural, common, e.g. irregularities presentments thanks reports voters laws legislators years areas adjustments chambers $100 bonds courts sales details raises sessions members congressmen votes polls calls ...
- noun, singular, common, hyphenated pair, e.g. stomach-belly
- noun, singular, common + modal auxiliary, e.g. cowhand'd sun'll
- noun, singular, common + preposition, e.g. buncha
- noun, singular, common + verb to have, present tense, 3rd person singular, e.g. guy's Knife's boat's summer's rain's company's
- noun, singular, common + verb to have, past tense, e.g. Pa'd
- noun, singular, common + verb to be, present tense, 3rd person singular, e.g. water's camera's sky's kid's Pa's heat's throat's father's money's undersecretary's granite's level's wife's fat's Knife's fire's name's hell's leg's sun's roulette's cane's guy's kind's baseball's ...
- noun, singular, common, genitive, e.g. season's world's player's night's chapter's golf's football's baseball's club's U.'s coach's bride's bridegroom's board's county's firm's company's superintendent's mob's Navy's ...
- noun, singular, common, e.g. failure burden court fire appointment awarding compensation Mayor interim committee fact effect airport management surveillance jail doctor intern extern night weekend duty legislation Tax Office ...
- modal auxiliary + infinitival to, e.g. oughta
- modal auxiliary + pronoun, personal, nominative, not 3rd person singular, e.g. willya
- modal auxiliary + verb to have, uninflected form, e.g. shouldda musta coulda must've woulda could've
- modal auxiliary, negated, e.g. cannot couldn't wouldn't can't won't shouldn't shan't mustn't musn't
- modal auxiliary, e.g. should may might will would must can could shall ought need wilt
- adjective, superlative, e.g. best largest coolest calmest latest greatest earliest simplest strongest newest fiercest unhappiest worst youngest worthiest fastest hottest fittest lowest finest smallest staunchest ...
- adjective, semantically superlative, e.g. top chief principal northernmost master key head main tops utmost innermost foremost uppermost paramount topmost
- adjective + conjunction, coordinating, e.g. lighter'n
- adjective, comparative, e.g. greater older further earlier later freer franker wider better deeper firmer tougher faster higher bigger worse younger lighter nicer slower happier frothier Greater newer Elder ...
- adjective, hyphenated pair, e.g. big-large long-far
- adjective, genitive, e.g. Great's
- adjective, e.g. recent over-all possible hard-fought favorable hard meager fit such widespread outmoded inadequate ambiguous grand clerical effective orderly federal foster general proportionate ...
- preposition + pronoun, personal, accusative, e.g. t'hi-im
- preposition, hyphenated pair, e.g. f'ovuh
- preposition, e.g. of in for by considering to on among at through with under into regarding than since despite according per before toward against as after during including between without except upon out over ...
- verb to have, present tense, 3rd person singular, negated, e.g. hasn't ain't
- verb to have, present tense, 3rd person singular, e.g. has hath
- verb to have, past participle, e.g. had
- verb to have, present participle or gerund, e.g. having
- verb to have, past tense, negated, e.g. hadn't
- verb to have, past tense, e.g. had
- verb to have, uninflected present tense + infinitival to, e.g. hafta
- verb to have, uninflected present tense or imperative, negated, e.g. haven't ain't
- verb to have, uninflected present tense, infinitive or imperative, e.g. have hast
- foreign word: WH-pronoun, nominative, e.g. qui
- foreign word: WH-pronoun, accusative, e.g. quibusdam
- foreign word: WH-determiner, e.g. quo qua quod que quok
- foreign word: verb, present tense, 3rd person singular, e.g. gouverne sinkt sigue diapiace
- foreign word: verb, past participle, e.g. vue verstrichen rasa verboten engages
- foreign word: verb, present participle or gerund, e.g. nolens volens appellant seq. obliterans servanda dicendi delenda
- foreign word: verb, past tense, e.g. stabat peccavi audivi
- foreign word: verb, present tense, not 3rd person singular, imperative or infinitive, e.g. nolo contendere vive fermate faciunt esse vade noli tangere dites duces meminisse iuvabit gosaimasu voulez habla ksuu'peli afo lacheln miuchi say allons strafe portant
- foreign word: interjection, e.g. sayonara bien adieu arigato bonjour adios bueno tchalo ciao o
- foreign word: infinitival to + verb, infinitive, e.g. d'entretenir
- foreign word: adverb + conjunction, coordinating, e.g. forisque
- foreign word: adverb, e.g. bas assai deja um wiederum cito velociter vielleicht simpliciter non zu domi nuper sic forsan olim oui semper tout despues hors
- foreign word: qualifier, e.g. minus
- foreign word: pronoun, personal, nominative, not 3rd person singular + verb to have, present tense, not 3rd person singular, e.g. j'ai
- foreign word: pronoun, personal, nominative, not 3rd person singular, e.g. ich vous sie je
- foreign word: pronoun, personal, nominative, 3rd person singular, e.g. il
- foreign word: pronoun, personal, accusative + preposition, e.g. mecum tecum
- foreign word: pronoun, personal, accusative, e.g. lui me moi mi
- foreign word: pronoun, singular, reflexive + verb, present tense, 3rd person singular, e.g. s'excuse s'accuse
- foreign word: pronoun, singular, reflexive, e.g. se
- foreign word: determiner, possessive, e.g. mea mon deras vos
- foreign word: pronoun, nominal, e.g. hoc
- foreign word: numeral, ordinal, e.g. 18e 17e quintus
- foreign word: noun, singular, adverbial, e.g. heute morgen aujourd'hui hoy
- foreign word: noun, plural, proper, e.g. Svenskarna Atlantes Dieux
- foreign word: noun, singular, proper, e.g. Karshilama Dieu Rundfunk Afrique Espanol Afrika Spagna Gott Carthago deus
- foreign word: noun, plural, common, e.g. al culpas vopos boites haflis kolkhozes augen tyrannis alpha-beta-gammas metis banditos rata phis negociants crus Einsatzkommandos kamikaze wohaws sabinas zorrillas palazzi engages coureurs corroborees yori Ubermenschen ...
- foreign word: noun, singular, common, genitive, e.g. corporis intellectus arte's dei aeternitatis senioritatis curiae patronne's chambre's
- foreign word: noun, singular, common, e.g. ballet esprit ersatz mano chatte goutte sang Fledermaus oud def kolkhoz roi troika canto boite blutwurst carne muzyka bonheur monde piece force ...
- foreign word: adjective, superlative, e.g. optimo
- foreign word: adjective, comparative, e.g. fortiori
- foreign word: adjective, e.g. avant Espagnol sinfonica Siciliana Philharmonique grand publique haute noire bouffe Douce meme humaine bel serieuses royaux anticus presto Sovietskaya Bayerische comique schwarzen ...
- foreign word: preposition + noun, singular, proper, e.g. d'Yquem d'Eiffel
- foreign word: preposition + noun, singular, common, e.g. d'etat d'hotel d'argent d'identite d'art
- foreign word: preposition + article, e.g. della des du aux zur d'un del dell'
- foreign word: preposition, e.g. ad de en a par con dans ex von auf super post sine sur sub avec per inter sans pour pendant in di
- foreign word: verb to have, present tense, not 3rd person singular, e.g. habe
- foreign word: determiner/pronoun, plural, e.g. haec
- foreign word: determiner + verb to be, present tense, 3rd person singular, e.g. c'est
- foreign word: determiner/pronoun, singular, e.g. hoc
- foreign word: conjunction, subordinating, e.g. bevor quam ma
- foreign word: numeral, cardinal, e.g. une cinq deux sieben unam zwei
- foreign word: conjunction, coordinating, e.g. et ma mais und aber och nec y
- foreign word: verb to be, present tense, 3rd person singular, e.g. ist est
- foreign word: verb to be, present tense, 2nd person singular or all persons plural, e.g. sind sunt etes
- foreign word: verb to be, infinitive or imperative, e.g. sit
- foreign word: article + noun, singular, proper, e.g. L'Astree L'Imperiale
- foreign word: article + noun, singular, common, e.g. l'orchestre l'identite l'arcade l'ange l'assistance l'activite L'Universite l'independance L'Union L'Unita l'osservatore
- foreign word: article, e.g. la le el un die der ein keine eine das las les Il
- foreign word: negator, e.g. pas non ne
- existential there + modal auxiliary, e.g. there'll there'd
- existential there + verb to have, present tense, 3rd person singular, e.g. there's
- existential there + verb to have, past tense, e.g. there'd
- existential there + verb to be, present tense, 3rd person singular, e.g. there's
- existential there, e.g. there
- determiner, pronoun or double conjunction, e.g. neither either one
- pronoun, plural + verb to be, present tense, 3rd person singular, e.g. them's
- determiner/pronoun, plural, e.g. these those them
- determiner/pronoun, singular or plural, e.g. any some
- determiner/pronoun + modal auxiliary, e.g. that'll this'll
- determiner/pronoun + verb to be, present tense, 3rd person singular, e.g. that's
- determiner/pronoun, singular, genitive, e.g. another's
- determiner/pronoun, singular, e.g. this each another that 'nother
- verb to do, present tense, 3rd person singular, negated, e.g. doesn't don't
- verb to do, present tense, 3rd person singular, e.g. does
- verb to do, past tense, negated, e.g. didn't
- verb to do, past tense, e.g. did done
- verb to do, past or present tense + pronoun, personal, nominative, not 3rd person singular, e.g. d'you
- verb to do, uninflected present tense or imperative, negated, e.g. don't
- verb to do, uninflected present tense, infinitive or imperative, e.g. do dost
- conjunction, subordinating, e.g. that as after whether before while like because if since for than altho until so unless though providing once lest sposin' till whereas whereupon supposing tho' albeit then so's 'fore
- numeral, cardinal, genitive, e.g. 1960's 1961's .404's
- numeral, cardinal, e.g. two one 1 four 2 1913 71 74 637 1937 five three million 87-31 29-5 seven 1,119 fifty-three 7.5 billion hundred 125,000 1,700 60 100 six ...
- conjunction, coordinating, e.g. and or but plus & either neither nor yet n and/or minus an'
- verb to be, present tense, 3rd person singular, negated, e.g. isn't ain't
- verb to be, present tense, 3rd person singular, e.g. is
- verb to be, present tense, 2nd person singular or all persons plural, negated, e.g. aren't ain't
- verb to be, present tense, 2nd person singular or all persons plural, e.g. are art
- verb to be, past participle, e.g. been
- verb to be, present tense, 1st person singular, negated, e.g. ain't
- verb to be, present tense, 1st person singular, e.g. am
- verb to be, present participle or gerund, e.g. being
- verb to be, past tense, 1st and 3rd person singular, negated, e.g. wasn't
- verb to be, past tense, 1st and 3rd person singular, e.g. was
- verb to be, past tense, 2nd person singular or all persons plural, negated, e.g. weren't
- verb to be, past tense, 2nd person singular or all persons plural, e.g. were
- verb to be, infinitive or imperative, e.g. be
- article, e.g. the an no a every th' ever' ye
- determiner/pronoun, post-determiner, hyphenated pair, e.g. many-much
- determiner/pronoun, post-determiner, genitive, e.g. other's
- determiner/pronoun, post-determiner, e.g. many other next more last former little several enough most least only very few fewer past same Last latter less single plenty 'nough lesser certain various manye next-to-last particular final previous present nuf
- determiner/pronoun, double conjunction or pre-quantifier, e.g. both
- determiner/pronoun, pre-quantifier, e.g. all half many nary
- determiner/pronoun, pre-qualifier, e.g. quite such rather
- : (colon)
- . (sentence terminator)
- , (comma)
- not, n't
- ) (
- END tag, used in training.
- START tag, used in training.

Chunk tags:
- Out; not a chunk.
- Clause.
- Prepositional phrase.
- Verb phrase.
- Noun phrase.

Order matters here: the patterns are replaced in reverse order when parsing tags, and in top-to-bottom order when generating tags.
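The chunk tags above are what an IOB-encoded corpus assigns per token: B- marks the beginning of a chunk, I- a token inside it, and O a token outside any chunk. A minimal decoding sketch in Python (a hypothetical helper, not chatter's actual IOB parser):

```python
def iob_to_chunks(tagged):
    """Group (token, iob_tag) pairs into (chunk_label, tokens) chunks.

    tagged: list of (token, tag) where tag is "O", "B-XXX", or "I-XXX".
    """
    chunks, current = [], None
    for token, tag in tagged:
        if tag == "O":
            current = None          # outside any chunk
        elif tag.startswith("B-") or current is None or current[0] != tag[2:]:
            # A B- marker (or a stray I- with no open chunk) starts a new chunk.
            current = (tag[2:], [token])
            chunks.append(current)
        else:
            current[1].append(token)  # I- continues the open chunk
    return chunks

sent = [("the", "B-NP"), ("dog", "I-NP"), ("jumped", "B-VP"), (".", "O")]
print(iob_to_chunks(sent))  # [('NP', ['the', 'dog']), ('VP', ['jumped'])]
```

chatter builds a parse tree rather than a flat chunk list, but the grouping decision per token is the same.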
Document corpus. This is a simple hashed corpus; the document content is not stored.
- The number of documents in the corpus.
- A count of the number of documents each term occurred in.

Part-of-speech tagger, with back-off tagger.

A sequence of POS taggers can be assembled by using backoff taggers. When tagging text, the first tagger is run on the input, possibly tagging some tokens as unknown ('Tag Unk'). The first backoff tagger is then recursively invoked on the text to fill in the unknown tags, but that may still leave some tokens marked with 'Tag Unk'. This process repeats until no more taggers are found. (The current implementation is not very efficient in this respect.)

Back-off taggers are particularly useful when there is a set of domain-specific vernacular that a general-purpose statistical tagger does not know of. A LiteralTagger can be created to map terms to fixed POS tags, and then delegate the bulk of the text to a statistical back-off tagger, such as an AvgPerceptronTagger.

POSTagger values can be serialized and deserialized by using NLP.POS.serialize and NLP.POS.deserialize. This is a bit tricky because the POSTagger abstracts away the implementation details of the particular tagging algorithm, and the model for that tagger (if any). To support serialization, each POSTagger value must provide a serialize value that can be used to generate a ByteString representation of the model, as well as a unique id (also a ByteString).
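The backoff chaining described above can be sketched as follows. This is a toy illustration, not chatter's API: each "tagger" here is just a dict lookup, and "Unk" stands in for the library's unknown tag.

```python
UNK = "Unk"

def tag_with_backoff(taggers, tokens):
    # Run the first tagger, then recursively let the backoff taggers
    # fill in any tokens still tagged as unknown. (Like the description
    # above notes, re-tagging the whole input each time is inefficient.)
    if not taggers:
        return [UNK] * len(tokens)
    first, rest = taggers[0], taggers[1:]
    tags = [first.get(tok, UNK) for tok in tokens]  # toy "tagger": dict lookup
    if rest and UNK in tags:
        fallback = tag_with_backoff(rest, tokens)
        tags = [f if t == UNK else t for t, f in zip(tags, fallback)]
    return tags

domain = {"chatter": "NN"}            # e.g. a literal tagger for vernacular
general = {"the": "DT", "dog": "NN"}  # e.g. a statistical tagger
print(tag_with_backoff([domain, general], ["the", "chatter", "dog", "xyzzy"]))
# ['DT', 'NN', 'NN', 'Unk']
```

Tokens no tagger knows remain tagged "Unk", exactly as in the description above.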
Furthermore, that ID must be added to a `Map ByteString (ByteString -> Maybe POSTagger -> Either String POSTagger)` that is provided to deserialize. The function in the map takes the serialized output, and possibly a backoff tagger, and reconstitutes the POSTagger that was serialized (assigning the proper functions, setting up closures as needed, etc.). Look at the source for the existing taggers for examples.

The POSTagger record fields:
- The initial part-of-speech tagger.
- Training function to train the immediate POS tagger.
- A tagger to invoke on unknown tokens.
- A tokenizer (the default tokenizer will work).
- A sentence splitter. If your input is formatted as one sentence per line, then use a line-based splitter; otherwise try Erik Kow's fullstop library.
- Store this POS tagger to a bytestring. This does not serialize the backoff taggers.
- A unique id that will identify the algorithm used for this POS tagger. This is used in deserialization.

Get the number of documents that a term occurred in.

Add a document to the corpus. This can be dangerous if the documents are pre-processed differently. All corpus-related functions assume that the documents have all been tokenized and the tokens normalized, in the same way.

Create a corpus from a list of documents, represented by normalized tokens.

Create a Literal Tagger using the specified back-off tagger as a fall-back, if one is specified. This uses a tokenizer adapted from the tokenize package, and Erik Kow's fullstop sentence segmenter as a sentence splitter.

Create a tokenizer that protects the provided terms (to tokenize multi-word terms).

Deserialization for Literal Taggers. The serialization logic is in the posSerialize record of the POSTagger created in mkTagger.

Create an unambiguous tagger, using the supplied tagger as a source of tags.

Trainer method for unambiguous taggers.

The perceptron model.
Perceptron fields:
- weights: Each feature gets its own weight vector, so weights is a dict-of-dicts.
- totals: The accumulated values, for the averaging. These will be keyed by feature/class tuples.
- tstamps: The last time the feature was changed, for the averaging. Also keyed by feature/class tuples. (tstamps is short for timestamps.)
- instances: Number of instances seen.

Weight: Typedef for doubles to make the code easier to read, and to make this simple to change if necessary.

Class: The classes that the perceptron assigns are represented with a newtype-wrapped String. Eventually, I think this should become a typeclass, so the classes can be defined by the users of the Perceptron (such as custom POS tag ADTs, or more complex classes).

emptyPerceptron: An empty perceptron, used to start training.

predict: Predict a class given a feature vector. Ported from Python:

    def predict(self, features):
        '''Dot-product the features and current weights and return the best label.'''
        scores = defaultdict(float)
        for feat, value in features.items():
            if feat not in self.weights or value == 0:
                continue
            weights = self.weights[feat]
            for label, weight in weights.items():
                scores[label] += value * weight
        # Do a secondary alphabetic sort, for stability
        return max(self.classes, key=lambda label: (scores[label], label))

update: Update the perceptron with a new example.

    def update(self, truth, guess, features):
        ...
        self.i += 1
        if truth == guess:
            return None
        for f in features:
            # setdefault is Map.findWithDefault, and destructive
            weights = self.weights.setdefault(f, {})
            upd_feat(truth, f, weights.get(truth, 0.0), 1.0)
            upd_feat(guess, f, weights.get(guess, 0.0), -1.0)
        return None

upd_feat: Ported from Python:

    def update(self, truth, guess, features):
        '''Update the feature weights.'''
        def upd_feat(c, f, w, v):
            param = (f, c)
            self._totals[param] += (self.i - self._tstamps[param]) * w
            self._tstamps[param] = self.i
            self.weights[f][c] = w + v

averageWeights: Average the weights. Ported from Python:

    def average_weights(self):
        for feat, weights in self.weights.items():
            new_feat_weights = {}
            for clas, weight in weights.items():
                param = (feat, clas)
                total = self._totals[param]
                total += (self.i - self._tstamps[param]) * weight
                averaged = round(total / float(self.i), 3)
                if averaged:
                    new_feat_weights[clas] = averaged
            self.weights[feat] = new_feat_weights
        return None

roundTo: Round a fractional number to a specified decimal place.

    >>> roundTo 2 3.1459
    3.15

NLP.Chunk.AvgPerceptronChunker

Chunker: The type of Chunkers; incorporates chunking, training, serialization, and unique IDs for deserialization.
- chId: The unique ID for this implementation of a Chunker.

readChunker: Deserialize an AvgPerceptronChunker from a ByteString.

mkChunker: Create a chunker from a Perceptron.

chunk: Chunk a list of POS-tagged sentences, generating a parse tree.

chunkSentence: Chunk a single POS-tagged sentence.

toTree: Turn an IOB result into a tree.

trainInt: Copied directly from the AvgPerceptronTagger; should be generalized?

startToks: Start markers to ensure all features in context are valid, even for the first real tokens.

endToks: End markers to ensure all features are valid, even for the last real tokens.

trainSentence: Train on one sentence.

Training parameters:
- The number of times to iterate over the training data, randomly shuffling after each iteration. (5 is a reasonable choice.)
- The Chunker to train.
- The training data. (A list of [(Text, Tag)]s.)
- Returns a trained perceptron. IO is needed for randomization.

Per-token context:
- The full sentence that this word is located in.
- The index of the current word.
- The current word/tag pair.
- The predicted class of the previous word.
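The roundTo helper shown above (used when averaging weights) is a one-liner. A direct Python transcription for reference — `round_to` is an illustrative name, and this assumes Python's default rounding behavior is acceptable for the values involved:

```python
def round_to(places, x):
    """Round a fractional number to the given number of decimal places."""
    factor = 10 ** places
    return round(x * factor) / factor
```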
NLP.Similarity.VectorSim

Document: An efficient (ish) representation for documents in the bag-of-words sense.

mkDocument: Make a document from a list of tokens.

fromTV: Access the underlying DefaultMap used to store term vector details.

mkVector: Generate a TermVector from a tokenized document.

sim: Invokes similarity on full strings, using the chatter tokenizer for tokenization, and no stemming. The return value will be in the range [0, 1]. There *must* be at least one document in the corpus.

similarity: Determine how similar two documents are. This function assumes that each document has been tokenized and (if desired) stemmed/case-normalized. This is a wrapper around tvSim, which is a *much* more efficient implementation. If you need to run similarity against any single document more than once, then you should create TermVectors for each of your documents and use tvSim instead of similarity. The return value will be in the range [0, 1]. There *must* be at least one document in the corpus.

tvSim: Determine how similar two documents are. Calculates the similarity between two documents, represented as TermVectors, returning a double in the range [0, 1], where 1 represents most similar.

tf: Return the raw frequency of a term in a body of text. The first argument is the term to find; the second is a tokenized document. This function does not do any stemming or additional text modification.

idf: Calculate the inverse document frequency. The IDF is, roughly speaking, a measure of how popular a term is.

tf_idf: Calculate the tf*idf measure for a term given a document and a corpus.

addVectors: Add two term vectors. When a term is added, its value in each vector is used (or that vector's default value is used if the term is absent from the vector). The new term vector resulting from the addition always uses a default value of zero.

zeroVector: A zero term vector (i.e. addVectors v zeroVector = v).

negate: Negate a term vector.

sum: Add a list of term vectors.

magnitude: Calculate the magnitude of a vector.

dotProd: Find the dot product of two vectors.

NLP.Extraction.Parsec

posTok: Consume a token with the given POS tag.

matches: Text equality matching with optional case sensitivity.

txtTok: Consume a token with the given lexical representation.
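Before moving on to the extraction parsers: the term-vector arithmetic above boils down to standard tf-idf and cosine similarity. A minimal sketch of the math, not chatter's implementation — in particular the `idf` formula `log(N / doc_count)` is an assumption consistent with the "roughly, how popular a term is" description, and may differ from chatter's exact smoothing:

```python
import math
from collections import Counter

def tf(term, doc_tokens):
    """Raw frequency of a term in a tokenized document (no stemming)."""
    return doc_tokens.count(term)

def idf(corpus_size, doc_count):
    """Assumed IDF formula: log(N / document frequency)."""
    return math.log(corpus_size / doc_count)

def mk_vector(doc_tokens):
    """Bag-of-words term vector as a dict of raw counts."""
    return dict(Counter(doc_tokens))

def cosine(v1, v2):
    """Cosine similarity: dot product over the product of magnitudes.

    For count vectors the result is in [0, 1]; 1 means most similar.
    """
    dot = sum(v1[t] * v2.get(t, 0.0) for t in v1)
    mag = lambda v: math.sqrt(sum(x * x for x in v.values()))
    denom = mag(v1) * mag(v2)
    return dot / denom if denom else 0.0
```

Treating a missing term as 0.0 in `v2.get(t, 0.0)` plays the same role as the term vector's zero default value described above.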
anyToken: Consume any one non-empty token.

followedBy: Skips any number of fill tokens, ending with the end parser, and returning the last parsed result. This is useful when you know what you're looking for and (for instance) don't care what comes first.

NLP.Extraction.Examples.ParsecExamples

findClause: Find a clause in a larger collection of text. A clause is defined by the clause extractor, and is a Noun Phrase followed (immediately) by a Verb Phrase. findClause skips over leading tokens, if needed, to locate a clause.

clause: Find a Noun Phrase followed by a Verb Phrase.

NLP.Corpora.Parsing

readPOS: Read a POS-tagged corpus out of a Text string of the form: token/tag token/tag ...

    >>> readPOS "Dear/jj Sirs/nns :/: Let/vb"
    [("Dear",JJ),("Sirs",NNS),(":",Other ":"),("Let",VB)]

safeInit: Returns all but the last element of a string, unless the string is empty, in which case it returns that string.

NLP.POS.AvgPerceptronTagger

mkTagger: Create an Averaged Perceptron Tagger using the specified back-off tagger as a fall-back, if one is specified. This uses a tokenizer adapted from the tokenize package for a tokenizer, and Erik Kow's fullstop sentence segmenter (http://hackage.haskell.org/package/fullstop) as a sentence splitter.

trainNew: Train a new POSTagger. The training corpus should be a collection of sentences, one sentence on each line, and with each token tagged with a part of speech. For example, the input:

    "The/DT dog/NN jumped/VB ./.\nThe/DT cat/NN slept/VB ./."

defines two training sentences.

    >>> tagger <- trainNew "Dear/jj Sirs/nns :/: Let/vb\nUs/nn begin/vb\n"
    >>> tag tagger $ map T.words $ T.lines "Dear sir"
    "Dear/jj Sirs/nns :/: Let/vb"

trainOnFiles: Train a new POSTagger on a corpus of files.

train: Add training examples to a perceptron.

    >>> tagger <- train emptyPerceptron "Dear/jj Sirs/nns :/: Let/vb\nUs/nn begin/vb\n"
    >>> tag tagger $ map T.words $ T.lines "Dear sir"
    "Dear/jj Sirs/nns :/: Let/vb"

If you're using multiple input files, this can be useful to improve performance (by folding over the files).

startToks: Start markers to ensure all features in context are valid, even for the first real tokens.
endToks: End markers to ensure all features are valid, even for the last real tokens.

tag: Tag a document (represented as a list of Sentences) with a trained POSTagger. Ported from Python:

    def tag(self, corpus, tokenize=True):
        '''Tags a string `corpus`.'''
        # Assume untokenized corpus has \n between sentences and ' ' between words
        s_split = nltk.sent_tokenize if tokenize else lambda t: t.split('\n')
        w_split = nltk.word_tokenize if tokenize else lambda s: s.split()
        def split_sents(corpus):
            for s in s_split(corpus):
                yield w_split(s)
        prev, prev2 = self.START
        tokens = []
        for words in split_sents(corpus):
            context = self.START + [self._normalize(w) for w in words] + self.END
            for i, word in enumerate(words):
                tag = self.tagdict.get(word)
                if not tag:
                    features = self._get_features(i, word, context, prev, prev2)
                    tag = self.model.predict(features)
                tokens.append((word, tag))
                prev2 = prev
                prev = tag
        return tokens

tagSentence: Tag a single sentence.

trainInt: Train a model from sentences. Ported from Python:

    def train(self, sentences, save_loc=None, nr_iter=5):
        self._make_tagdict(sentences)
        self.model.classes = self.classes
        prev, prev2 = START
        for iter_ in range(nr_iter):
            c = 0
            n = 0
            for words, tags in sentences:
                context = START + [self._normalize(w) for w in words] + END
                for i, word in enumerate(words):
                    guess = self.tagdict.get(word)
                    if not guess:
                        feats = self._get_features(i, word, context, prev, prev2)
                        guess = self.model.predict(feats)
                        self.model.update(tags[i], guess, feats)
                    prev2 = prev; prev = guess
                    c += guess == tags[i]
                    n += 1
            random.shuffle(sentences)
            logging.info("Iter {0}: {1}/{2}={3}".format(iter_, c, n, _pc(c, n)))
        self.model.average_weights()
        # Pickle as a binary file
        if save_loc is not None:
            pickle.dump((self.model.weights, self.tagdict, self.classes),
                        open(save_loc, 'wb'), -1)
        return None

trainSentence: Train on one sentence. Adapted from this portion of the Python train method:

        context = START + [self._normalize(w) for w in words] + END
        for i, word in enumerate(words):
            guess = self.tagdict.get(word)
            if not guess:
                feats = self._get_features(i, word, context, prev, prev2)
                guess = self.model.predict(feats)
                self.model.update(tags[i], guess, feats)
            prev2 = prev; prev = guess
            c += guess == tags[i]
            n += 1

predictPos: Predict a part of speech, defaulting to the Unk tag if no classification is found.

getFeatures: Default feature set. Ported from Python:

    def _get_features(self, i, word, context, prev, prev2):
        '''Map tokens into a feature representation, implemented as a
        {hashable: float} dict. If the features change, a new model must be
        trained.
        '''
        def add(name, *args):
            features[' '.join((name,) + tuple(args))] += 1
        i += len(self.START)
        features = defaultdict(int)
        # It's useful to have a constant feature, which acts sort of like a prior
        add('bias')
        add('i suffix', word[-3:])
        add('i pref1', word[0])
        add('i-1 tag', prev)
        add('i-2 tag', prev2)
        add('i tag+i-2 tag', prev, prev2)
        add('i word', context[i])
        add('i-1 tag+i word', prev, context[i])
        add('i-1 word', context[i-1])
        add('i-1 suffix', context[i-1][-3:])
        add('i-2 word', context[i-2])
        add('i+1 word', context[i+1])
        add('i+1 suffix', context[i+1][-3:])
        add('i+2 word', context[i+2])
        return features

Parameters:
- The POS tag parser.
- The initial model.
- Training data; formatted with one sentence per line, and standard POS tags after each space-delimited token.
- The number of times to iterate over the training data, randomly shuffling after each iteration. (5 is a reasonable choice.)
- The Perceptron to train.
- The training data. (A list of [(Text, Tag)]s.)
- Returns a trained perceptron. IO is needed for randomization.

NLP.POS — Part-of-speech tagging facilities.
Stability: experimental. Maintainer: creswick@gmail.com

defaultTagger: A basic POS tagger.

conllTagger: A POS tagger that has been trained on the Conll 2000 POS tags.

brownTagger: A POS tagger trained on a subset of the Brown corpus.

taggerTable: The default table of tagger IDs to readTagger functions. Each tagger packaged with Chatter should have an entry here.
By convention, the IDs used are the fully qualified module name of the tagger package.

saveTagger: Store a POSTagger to a file.

loadTagger: Load a tagger, using the internal taggerTable. If you need to specify your own mappings for new composite taggers, you should use deserialize. This function checks the filename to determine if the content should be decompressed. If the file ends with ".gz", then we assume it is a gzipped model.

tag: Tag a chunk of input text with part-of-speech tags, using the sentence splitter, tokenizer, and tagger contained in the POSTagger.

tagStr: Tag the tokens in a string. Returns a space-separated string of tokens, each token suffixed with the part of speech. For example:

    >>> tag tagger "the dog jumped ."
    "the/at dog/nn jumped/vbd ./."

tagText: Text version of tagStr.

trainStr: Train a tagger on string input in the standard form for POS-tagged corpora:

    trainStr tagger "the/at dog/nn jumped/vbd ./."

trainText: The Text version of trainStr.

train: Train a POSTagger on a corpus of sentences. This will recurse through the POSTagger stack, training all the backoff taggers as well. In order to do that, this function has to be generic to the kind of taggers used, so it is not possible to train up a new POSTagger from nothing: train wouldn't know what tagger to create. To get around that restriction, you can use the various mkTagger implementations, such as NLP.POS.LiteralTagger.mkTagger or NLP.POS.AvgPerceptronTagger.mkTagger. For example:

    import NLP.POS.AvgPerceptronTagger as APT

    let newTagger = APT.mkTagger APT.emptyPerceptron Nothing
    posTgr <- train newTagger trainingExamples

eval: Evaluate a POSTagger. Measures accuracy over all tags in the test corpus. Accuracy is calculated as: |tokens tagged correctly| / |all tokens|

NLP.Chunk — Phrase chunking facilities.
Stability: experimental. Maintainer: creswick@gmail.com

defaultChunker: A basic phrasal chunker.

conllChunker: Convenient function to load the Conll2000 chunker.

train: Train a chunker on a set of additional examples.

chunk: Chunk a TaggedSentence that has been produced by a Chatter tagger, producing a rich representation of the Chunks and the Tags detected.
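The accuracy measure used by eval is simple to state precisely. A small Python sketch of the same calculation (the names `accuracy`, `tagger`, and `gold_sentences` are illustrative; chatter's eval works over its own corpus types):

```python
def accuracy(tagger, gold_sentences):
    """|tokens tagged correctly| / |all tokens| over a tagged test corpus.

    `gold_sentences` is a list of [(token, tag)] sentences; `tagger`
    maps a token list to a parallel tag list.
    """
    correct = total = 0
    for sent in gold_sentences:
        tokens = [tok for tok, _ in sent]
        guesses = tagger(tokens)
        correct += sum(g == t for g, (_, t) in zip(guesses, sent))
        total += len(sent)
    return correct / total if total else 0.0
```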
If you just want to see chunked output from standard text, you probably want chunkText or chunkStr.

chunkText: Convenience function to tokenize, POS-tag, then chunk the provided text, and format the result in an easy-to-read format.

    > tgr <- defaultTagger
    > chk <- defaultChunker
    > chunkText tgr chk "The brown dog jumped over the lazy cat."
    "[NP The/DT brown/NN dog/NN] [VP jumped/VBD] [NP over/IN the/DT lazy/JJ cat/NN] ./."

chunkStr: A wrapper around chunkText that packs strings.

chunkerTable: The default table of chunker IDs to readChunker functions. Each chunker packaged with Chatter should have an entry here. By convention, the IDs used are the fully qualified module name of the chunker package.

saveChunker: Store a Chunker to disk.

loadChunker: Load a Chunker from disk, optionally gunzipping if needed (based on file extension).

NLP.Corpora.WikiNer

NERTag: Different classes of Named Entity used in the WikiNER data set. (O: out; not a chunk.)

parseWikiNer: Convert WikiNER format to basic IOB (one token per line, space-separated tags, and a blank line between each sentence). Translates a WikiNER sentence into a list of IOB lines, for parsing with the IOB parser.

trainChunker: Train a chunker on a provided corpus.