Mt      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTU V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~     None0_Defaulting Map; a Map that returns a default value when queried for a key that does not exist.Create an empty IQuery the map for a value. Returns the default if the key is not found. Create a ! from a default value and a list.Access the keys as a list. Fold over the values in the map.kNote that this *does* not fold over the default value -- this fold behaves in the same way as a standard       Safe-Inferred 7Path to the directory containing all the PLUG archives.    None0CBoolean type to indicate case sensitivity for textual comparisons.Just a handy alias for TextNone0"!Tag instance for unknown tagsets. !"#$% %$#"!  !"#$% Safe-Inferred  None0'Unknown.()WH-adverb + modal auxillary e.g.; where'd)#WH-adverb + preposition e.g.; why'n*IWH-adverb + verb "to do", present tense, 3rd person singular e.g.; how's+;WH-adverb + verb "to do", past tense, negated e.g.; whyn't,9WH-adverb + verb "to do", past tense e.g.; where'd how'd-GWH-adverb + verb "to do", present, not 3rd person singular e.g.; howda.KWH-adverb + verb "to be", present, 3rd person singular e.g.; how's where's/]WH-adverb + verb "to be", present, 2nd person singular or all persons plural e.g.; where're0WH-adverb e.g.; however when where why whereby wherever how whenever whereon wherein wherewith wheare wherefore whereof howsabout1WH-qualifier e.g.; however how2KWH-pronoun, nominative + modal auxillary e.g.; who'll that'd who'd that'll3`WH-pronoun, nominative + verb "to have", present tense, 3rd person singular e.g.; who's that's4@WH-pronoun, nominative + verb "to have", past tense e.g.; who'd5XWH-pronoun, nominative + verb "to be", present, 3rd person singular e.g.; that's who's6HWH-pronoun, nominative e.g.; that who whoever whosoever what whatsoever7*WH-pronoun, 
accusative e.g.; whom that who8)WH-pronoun, genitive e.g.; whose whosever9PWH-determiner + verb "to have", present tense, 3rd person singular e.g.; what's:6WH-determiner + verb "to do", past tense e.g.; what'd;WH-determiner + verb "to do", uninflected present tense + pronoun, personal, nominative, not 3rd person singular e.g.; whaddya<NWH-determiner + verb "to be", present tense, 3rd person singular e.g.; what's=WH-determiner + verb "to be", present, 2nd person singular or all persons plural + pronoun, personal, nominative, not 3rd person singular e.g.; whaddya>fWH-determiner + verb "to be", present tense, 2nd person singular or all persons plural e.g.; what're?EWH-determiner e.g.; which what whatever whichever whichever-the-hell@verb, present tense, 3rd person singular e.g.; deserves believes receives takes goes expires says opposes starts permits expects thinks faces votes teaches holds calls fears spends collects backs eliminates sets flies gives seeks reads ...A3verb, past participle + infinitival to e.g.; gottaBverb, past participle e.g.; conducted charged won received studied revised operated accepted combined experienced recommended effected granted seen protected adopted retarded notarized selected composed gotten printed ...C6verb, present participle + infinitival to e.g.; gonnaDverb, present participle or gerund e.g.; modernizing improving purchasing Purchasing lacking enabling pricing keeping getting picking entering voting warning making strengthening setting neighboring attending participating moving ...Everb, past tense e.g.; said produced took recommended commented urged found added praised charged listed became announced brought attended wanted voted defeated received got stood shot scheduled feared promised made ...FZverb, base: uninflected present, imperative or infinitive; hypenated pair e.g.; say-speakG^verb, base: uninflected present, imperative or infinitive + infinitival to e.g.; wanta wannaH8verb, imperative + adverbial particle e.g.; g'ahn 
c'monIXverb, uninflected present tense + pronoun, personal, accusative e.g.; let's lemme gimmeJUverb, base: uninflected present, imperative or infinitive + adjective e.g.; die-deadKUverb, base: uninflected present, imperative or infinitive + preposition e.g.; lookitLDverb, base: uninflected present or infinitive + article e.g.; wannaMverb, base: uninflected present, imperative or infinitive e.g.; investigate find act follow inure achieve reduce take remedy re-set distribute realize disable feel receive continue place protect eliminate elaborate work permit run enter force ...Ninterjection e.g.; Hurrah bang whee hmpf ah goodbye oops oh-the-pain-of-it ha crunch say oh why see well hello lo alas tarantara rum-tum-tum gosh hell keerist Jesus Keeeerist boy c'mon 'mon goddamn bah hoo-pig damn ...O5infinitival to + verb, infinitive e.g.; t'jawn t'lahPinfinitival to e.g.; to t'Q1adverb, particle + preposition e.g.; out'n outtaRMadverb, particle e.g.; up out off down over on in about through across afterS$adverb, nominal e.g.; here afar thenTvadverb, superlative e.g.; most best highest uppermost nearest brightest hardest fastest deepest farthest loudest ...U=adverb, comparative + conjunction, coordinating e.g.; more'nVadverb, comparative e.g.; further earlier better later higher tougher more harder longer sooner less faster easier louder farther oftener nearer cheaper slower tighter lower worse heavier quicker ...W7adverb + conjunction, coordinating e.g.; well's soon'sXOadverb + verb "to be", present tense, 3rd person singular e.g.; here's there'sYadverb, genitive e.g.; else'sZadverb e.g.; only often generally also nevertheless upon together back newly no likely meanwhile near then heavily there apparently yet outright fully aside consistently specifically formally ever just ...[/qualifier, post e.g.; indeed enough still 'nuff\qualifier, pre e.g.; well less very most so real as highly fundamentally even how much remarkably somewhat more completely too thus ill deeply little 
overly halfway almost impossibly far severly such ...]qpronoun, personal, nominative, not 3rd person singular + verb "to verb", uninflected present tense e.g.; y'know^pronoun, personal, nominative, not 3rd person singular + modal auxillary e.g.; you'll we'll I'll we'd I'd they'll they'd you'd_qpronoun, personal, nominative, not 3rd person singular + verb "to have", past tense e.g.; I'd you'd we'd they'd`pronoun, personal, nominative, not 3rd person singular + verb "to have", uninflected present tense e.g.; I've we've they've you'vea|pronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 3rd person singular, negated e.g.; taintbwpronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 3rd person singular e.g.; you'scpronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 2nd person singular or all persons plural e.g.; we're you're they'redypronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 1st person singular e.g.; I'm Ahme[pronoun, personal, nominative, not 3rd person singular e.g.; they we I you ye thou you'unsfopronoun, personal, nominative, 3rd person singular + modal auxillary e.g.; he'll she'll it'll he'd it'd she'dgpronoun, personal, nominative, 3rd person singular + verb "to have", present tense, 3rd person singular e.g.; it's he's she'shgpronoun, personal, nominative, 3rd person singular + verb "to have", past tense e.g.; she'd he'd it'di}pronoun, personal, nominative, 3rd person singular + verb "to be", present tense, 3rd person singular e.g.; it's he's she'sjHpronoun, personal, nominative, 3rd person singular e.g.; it he she theekNpronoun, personal, accusative e.g.; them it him me us you 'em her thee we'unslApronoun, plural, reflexive e.g.; themselves ourselves yourselvesmZpronoun, singular, reflexive e.g.; itself himself myself yourself herself oneself ownselfn:pronoun, possessive e.g.; ours mine his hers theirs yoursoNdeterminer, 
possessive e.g.; our its his their my your her out thy mine thinepJpronoun, nominal + modal auxillary e.g.; someone'll somebody'll anybody'dqgpronoun, nominal + verb "to have", present tense, 3rd person singular e.g.; nobody's somebody's one'sr=pronoun, nominal + verb "to have", past tense e.g.; nobody'dspronoun, nominal + verb "to be", present tense, 3rd person singular e.g.; nothing's everything's somebody's nobody's someone'stepronoun, nominal, genitive e.g.; one's someone's anybody's nobody's everybody's anyone's everyone'supronoun, nominal e.g.; none something everything one anyone nothing nobody everybody everyone anybody anything someone no-one nothinvnumeral, ordinal e.g.; first 13th third nineteenth 2d 61st second sixth eighth ninth twenty-first eleventh 50th eighteenth- Thirty-ninth 72nd 1/20th twentieth mid-19th thousandth 350th sixteenth 701st ...wRnoun, plural, adverbial e.g.; Sundays Mondays Saturdays Wednesdays Souths Fridaysx;noun, singular, adverbial + modal auxillary e.g.; today'llynoun, singular, adverbial, genitive e.g.; Saturday's Monday's yesterday's tonight's tomorrow's Sunday's Wednesday's Friday's today's Tuesday's West's Today's South'sznoun, singular, adverbial e.g.; Friday home Wednesday Tuesday Monday Sunday Thursday yesterday tomorrow tonight West East Saturday west left east downtown north northeast southeast northwest North South right ...{noun, plural, proper, genitive e.g.; Republicans' Orioles' Birds' Yanks' Redbirds' Bucs' Yankees' Stevenses' Geraghtys' Burkes' Wackers' Achaeans' Dresbachs' Russians' Democrats' Gershwins' Adventists' Negroes' Catholics' ...|noun, plural, proper e.g.; Chases Aderholds Chapelles Armisteads Lockies Carbones French Marskmen Toppers Franciscans Romans Cadillacs Masons Blacks Catholics British Dixiecrats Mississippians Congresses ...}>noun, singular, proper + modal auxillary e.g.; Gyp'll John'll~noun, singular, proper + verb "to have", present tense, 3rd person singular e.g.; Bill's Guardino's Celie's 
Skolman's Crosson's Tim's Wally'snoun, singular, proper + verb "to be", present tense, 3rd person singular e.g.; W.'s Ike's Mack's Jack's Kate's Katharine's Black's Arthur's Seaton's Buckhorn's Breed's Penny's Rob's Kitty's Blackwell's Myra's Wally's Lucille's Springfield's Arlene'snoun, singular, proper, genitive e.g.; Green's Landis' Smith's Carreon's Allison's Boston's Spahn's Willie's Mickey's Milwaukee's Mays' Howsam's Mantle's Shaw's Wagner's Rickey's Shea's Palmer's Arnold's Broglio's ...noun, singular, proper e.g.; Fulton Atlanta September-October Durwood Pye Ivan Allen Jr. Jan. Alpharetta Grady William B. Hartsfield Pearl Williams Aug. Berry J. M. Cheshire Griffin Opelika Ala. E. Pelham Snodgrass ...Anoun, plural, common + modal auxillary e.g.; duds'd oystchers'llnoun, plural, common, genitive e.g.; taxpayers' children's members' States' women's cutters' motorists' steelmakers' hours' Nations' lawyers' prisoners' architects' tourists' Employers' secretaries' Rogues' ...noun, plural, common e.g.; irregularities presentments thanks reports voters laws legislators years areas adjustments chambers $100 bonds courts sales details raises sessions members congressmen votes polls calls ...<noun, singular, common, hyphenated pair e.g.; stomach-belly@noun, singular, common + modal auxillary e.g.; cowhand'd sun'll2noun, singular, common + preposition e.g.; bunchanoun, singular, common + verb "to have", present tense, 3rd person singular e.g.; guy's Knife's boat's summer's rain's company's?noun, singular, common + verb "to have", past tense e.g.; Pa'dnoun, singular, common + verb "to be", present tense, 3rd person singular e.g.; water's camera's sky's kid's Pa's heat's throat's father's money's undersecretary's granite's level's wife's fat's Knife's fire's name's hell's leg's sun's roulette's cane's guy's kind's baseball's ...noun, singular, common, genitive e.g.; season's world's player's night's chapter's golf's football's baseball's club's U.'s coach's bride's 
bridegroom's board's county's firm's company's superintendent's mob's Navy's ...noun, singular, common e.g.; failure burden court fire appointment awarding compensation Mayor interim committee fact effect airport management surveillance jail doctor intern extern night weekend duty legislation Tax Office ...-modal auxillary + infinitival to e.g.; oughtaWmodal auxillary + pronoun, personal, nominative, not 3rd person singular e.g.; willyahmodal auxillary + verb "to have", uninflected form e.g.; shouldda musta coulda must've woulda could'veemodal auxillary, negated e.g.; cannot couldn't wouldn't can't won't shouldn't shan't mustn't musn'tWmodal auxillary e.g.; should may might will would must can could shall ought need wiltadjective, superlative e.g.; best largest coolest calmest latest greatest earliest simplest strongest newest fiercest unhappiest worst youngest worthiest fastest hottest fittest lowest finest smallest staunchest ...adjective, semantically superlative e.g.; top chief principal northernmost master key head main tops utmost innermost foremost uppermost paramount topmost6adjective + conjunction, coordinating e.g.; lighter'nadjective, comparative e.g.; greater older further earlier later freer franker wider better deeper firmer tougher faster higher bigger worse younger lighter nicer slower happier frothier Greater newer Elder ...4adjective, hyphenated pair e.g.; big-large long-far!adjective, genitive e.g.; Great'sadjective e.g.; recent over-all possible hard-fought favorable hard meager fit such widespread outmoded inadequate ambiguous grand clerical effective orderly federal foster general proportionate ...:preposition + pronoun, personal, accusative e.g.; t'hi-im)preposition, hyphenated pair e.g.; f'ovuhpreposition e.g.; of in for by considering to on among at through with under into regarding than since despite according per before toward against as after during including between without except upon out over ...Overb "to have", present tense, 3rd 
person singular, negated e.g.; hasn't ain'tBverb "to have", present tense, 3rd person singular e.g.; has hath)verb "to have", past participle e.g.; had:verb "to have", present participle or gerund e.g.; having1verb "to have", past tense, negated e.g.; hadn't$verb "to have", past tense e.g.; hadGverb "to have", uninflected present tense + infinitival to e.g.; haftaUverb "to have", uninflected present tense or imperative, negated e.g.; haven't ain'tTverb "to have", uninflected present tense, infinitive or imperative e.g.; have hast.foreign word: WH-pronoun, nominative e.g.; qui5foreign word: WH-pronoun, accusative e.g.; quibusdam8foreign word: WH-determiner e.g.; quo qua quod que quok[foreign word: verb, present tense, 3rd person singular e.g.; gouverne sinkt sigue diapiacePforeign word: verb, past participle e.g.; vue verstrichen rasa verboten engagesyforeign word: verb, present participle or gerund e.g.; nolens volens appellant seq. obliterans servanda dicendi delenda;foreign word: verb, past tense e.g.; stabat peccavi audiviforeign word: verb, present tense, not 3rd person singular, imperative or infinitive e.g.; nolo contendere vive fermate faciunt esse vade noli tangere dites duces meminisse iuvabit gosaimasu voulez habla ksuu'peli-afo lacheln miuchi say allons strafe portant_foreign word: interjection e.g.; sayonara bien adieu arigato bonjour adios bueno tchalo ciao oCforeign word: infinitival to + verb, infinitive e.g.; d'entretenir@foreign word: adverb + conjunction, coordinating e.g.; forisqueforeign word: adverb e.g.; bas assai deja um wiederum cito velociter vielleicht simpliciter non zu domi nuper sic forsan olim oui semper tout despues hors#foreign word: qualifier e.g.; minusforeign word: pronoun, personal, nominative, not 3rd person singular + verb "to have", present tense, not 3rd person singular e.g.; j'ai[foreign word: pronoun, personal, nominative, not 3rd person singular e.g.; ich vous sie jeJforeign word: pronoun, personal, nominative, 3rd person 
singular e.g.; ilLforeign word: pronoun, personal, accusative + preposition e.g.; mecum tecum2pronoun, personal, accusative e.g.; lui me moi mioforeign word: pronoun, singular, reflexive + verb, present tense, 3rd person singular e.g.; s'excuse s'accuse4foreign word: pronoun, singular, reflexive e.g.; se=foreign word: determiner, possessive e.g.; mea mon deras vos(foreign word: pronoun, nominal e.g.; hoc5foreign word: numeral, ordinal e.g.; 18e 17e quintusKforeign word: noun, singular, adverbial e.g.; heute morgen aujourd'hui hoyCforeign word: noun, plural, proper e.g.; Svenskarna Atlantes Dieuxvforeign word: noun, singular, proper e.g.; Karshilama Dieu Rundfunk Afrique Espanol Afrika Spagna Gott Carthago deus foreign word: noun, plural, common e.g.; al culpas vopos boites haflis kolkhozes augen tyrannis alpha-beta-gammas metis banditos rata phis negociants crus Einsatzkommandos kamikaze wohaws sabinas zorrillas palazzi engages coureurs corroborees yori Ubermenschen ...foreign word: noun, singular, common, genitive e.g.; corporis intellectus arte's dei aeternitatis senioritatis curiae patronne's chambre'sforeign word: noun, singular, common e.g.; ballet esprit ersatz mano chatte goutte sang Fledermaus oud def kolkhoz roi troika canto boite blutwurst carne muzyka bonheur monde piece force ...2foreign word: adjective, superlative e.g.; optimo4foreign word: adjective, comparative e.g.; fortioriforeign word: adjective e.g.; avant Espagnol sinfonica Siciliana Philharmonique grand publique haute noire bouffe Douce meme humaine bel serieuses royaux anticus presto Sovietskaya Bayerische comique schwarzen ...Jforeign word: preposition + noun, singular, proper e.g.; d'Yquem d'Eiffelcforeign word: preposition + noun, singular, common e.g.; d'etat d'hotel d'argent d'identite d'artNforeign word: preposition + article e.g.; della des du aux zur d'un del dell'foreign word: preposition e.g.; ad de en a par con dans ex von auf super post sine sur sub avec per inter sans pour 
pendant in diPforeign word: verb "to have", present tense, not 3rd person singular e.g.; habe4foreign word: determiner/pronoun, plural e.g.; haecYforeign word: determiner + verb "to be", present tense, 3rd person singular e.g.; c'est5foreign word: determiner/pronoun, singular e.g.; hoc=foreign word: conjunction, subordinating e.g.; bevor quam maEforeign word: numeral, cardinal e.g.; une cinq deux sieben unam zweiLforeign word: conjunction, coordinating e.g.; et ma mais und aber och nec yMforeign word: verb "to be", present tense, 3rd person singular e.g.; ist estkforeign word: verb "to be", present tense, 2nd person singular or all persons plural e.g.; sind sunt etes?foreign word: verb "to be", infinitive or imperative e.g.; sitJforeign word: article + noun, singular, proper e.g.; L'Astree L'Imperialeforeign word: article + noun, singular, common e.g.; l'orchestre l'identite l'arcade l'ange l'assistance l'activite L'Universite l'independance L'Union L'Unita l'osservatoreNforeign word: article e.g.; la le el un die der ein keine eine das las les Il&foreign word: negator e.g.; pas non ne;existential there + modal auxillary e.g.; there'll there'dUexistential there + verb "to have", present tense, 3rd person singular e.g.; there's=existential there + verb "to have", past tense e.g.; there'dSexistential there + verb "to be", present tense, 3rd person singular e.g.; there'sexistential there e.g.; thereCdeterminer, pronoun or double conjunction e.g.; neither either onePpronoun, plural + verb "to be", present tense, 3rd person singular e.g.; them's1determiner/pronoun, plural e.g.; these those them6determiner/pronoun, singular or plural e.g.; any some;determiner/pronoun + modal auxillary e.g.; that'll this'llSdeterminer/pronoun + verb "to be", present tense, 3rd person singular e.g.; that's7determiner/pronoun, singular, genitive e.g.; another'sBdeterminer/pronoun, singular e.g.; this each another that 'notherNverb "to do", present tense, 3rd person singular, negated e.g.; 
doesn't don't<verb "to do", present tense, 3rd person singular e.g.; does.verb "to do", past tense, negated e.g.; didn't'verb "to do", past tense e.g.; did donejverb "to do", past or present tense + pronoun, personal, nominative, not 3rd person singular e.g.; d'youKverb "to do", uninflected present tense or imperative, negated e.g.; don'tPverb "to do", uninflected present tense, infinitive or imperative e.g.; do dostconjunction, subordinating e.g.; that as after whether before while like because if since for than altho until so unless though providing once lest sposin> till whereas whereupon supposing tho' albeit then so's 'fore7numeral, cardinal, genitive e.g.; 1960's 1961's .404'snumeral, cardinal e.g.; two one 1 four 2 1913 71 74 637 1937 8 five three million 87-31 29-5 seven 1,119 fifty-three 7.5 billion hundred 125,000 1,700 60 100 six ...Jconjunction, coordinating e.g.; and or but plus & either neither nor yet n and/or minus an'Lverb "to be", present tense, 3rd person singular, negated e.g.; isn't ain't:verb "to be", present tense, 3rd person singular e.g.; isdverb "to be", present tense, 2nd person singular or all persons plural, negated e.g.; aren't ain'tUverb "to be", present tense, 2nd person singular or all persons plural e.g.; are art(verb "to be", past participle e.g.; beenFverb "to be", present tense, 1st person singular, negated e.g.; ain't:verb "to be", present tense, 1st person singular e.g.; am7verb "to be", present participle or gerund e.g.; beingLverb "to be", past tense, 1st and 3rd person singular, negated e.g.; wasn't@verb "to be", past tense, 1st and 3rd person singular e.g.; was[verb "to be", past tense, 2nd person singular or all persons plural, negated e.g.; weren'tOverb "to be", past tense, 2nd person singular or all persons plural e.g.; were/verb "to be", infinitive or imperative e.g.; be,article e.g.; the an no a every th' ever' yeEdeterminer/pronoun, post-determiner, hyphenated pair e.g.; many-much<determiner/pronoun, post-determiner, 
genitive e.g.; other's
determiner/pronoun, post-determiner e.g.; many other next more last former little several enough most least only very few fewer past same Last latter less single plenty 'nough lesser certain various manye next-to-last particular final previous present nuf
determiner/pronoun, double conjunction or pre-quantifier e.g.; both
determiner/pronoun, pre-quantifier e.g.; all half many nary
determiner/pronoun, pre-qualifier e.g.; quite such rather
:
. Sentence Terminator
,
not n't
)
(

Clause.
Prepositional Phrase.
Verb Phrase.
Noun Phrase.

Order matters here: The patterns are replaced in reverse order when generating tags, and in top-to-bottom when generating tags.

This type seems redundant; it exists only to support the differences between TaggedSentence and ChunkedSentence. See the t3 example below to see how verbose this becomes.

A tagged sentence has POS tags. Generated by a part-of-speech tagger. (tagger :: Tag tag => Sentence -> TaggedSentence tag)

A chunked sentence has POS tags and chunk tags. Generated by a chunker. (chunker :: (Chunk chunk, Tag tag) => TaggedSentence tag -> ChunkedSentence chunk tag)

A sentence of tokens without tags. Generated by the tokenizer. (tokenizer :: Text -> Sentence)

Generate a Text representation of a TaggedSentence in the common tagged format, e.g.: "the/at dog/nn jumped/vbd ./."

Remove the tags from a tagged sentence.

Extract the tags from a tagged sentence, returning a parallel list of tags along with the underlying Sentence.

Combine the results of POS taggers, using the second param to fill in unknown entries where possible.

Returns the first param, unless it is tagged as unknown. Throws an error if the text does not match.

Show the underlying text token only.

Show the text and tag.

True if the input sentence contains the given text token. Does not do partial or approximate matching, and compares details in a fully case-sensitive manner.

True if the input sentence contains the given POS tag. Does not do partial matching (such as prefix matching).

Compare the POS-tag token with a supplied tag string.

Compare the POS-tagged token with a text string.

Compare a token with a text string.

Document corpus. This is a simple hashed corpus; the document content is not stored.

The number of documents in the corpus.

A count of the number of documents each term occurred in.

Part of Speech tagger, with back-off tagger.

A sequence of POS taggers can be assembled by using back-off taggers. When tagging text, the first tagger is run on the input, possibly tagging some tokens as unknown ('Tag Unk'). The first back-off tagger is then recursively invoked on the text to fill in the unknown tags, but that may still leave some tokens marked with 'Tag Unk'. This process repeats until no more taggers are found. (The current implementation is not very efficient in this respect.)

Back-off taggers are particularly useful when there is a set of domain-specific vernacular that a general-purpose statistical tagger does not know of.
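The back-off chain described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the `Tag`, `Tagger`, `lookupTagger`, and `runChain` names are hypothetical stand-ins, and tokens are plain `String`s rather than the library's own sentence types.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for illustration; the library's own types differ.
data Tag = Tag String | Unk
  deriving (Eq, Show)

-- A tagger assigns one tag per token (Unk when it has no answer)
-- and optionally names a back-off tagger.
data Tagger = Tagger
  { tagTokens :: [String] -> [Tag]
  , backoff   :: Maybe Tagger
  }

-- A lookup tagger: a fixed term-to-tag table, Unk for everything else.
lookupTagger :: Map.Map String String -> Maybe Tagger -> Tagger
lookupTagger table next = Tagger
  { tagTokens = map (\tok -> maybe Unk Tag (Map.lookup tok table))
  , backoff   = next
  }

-- Run the first tagger, then let the back-off chain fill in
-- whichever positions are still tagged Unk.
runChain :: Tagger -> [String] -> [Tag]
runChain tagger toks =
  case backoff tagger of
    Nothing   -> firstPass
    Just next -> zipWith fill firstPass (runChain next toks)
  where
    firstPass = tagTokens tagger toks
    fill Unk other = other   -- unknown: take the back-off tagger's answer
    fill known _   = known   -- already tagged: keep it
```

In this sketch every tagger in the chain sees the whole token list, and earlier taggers win wherever they produced a tag; a token stays `Unk` only if no tagger in the chain knows it.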
A LiteralTagger can be created to map terms to fixed POS tags, and then delegate the bulk of the text to a statistical back-off tagger, such as an AvgPerceptronTagger.

POSTagger values can be serialized and deserialized by using / and `NLP.POS.deserialize`. This is a bit tricky because the POSTagger abstracts away the implementation details of the particular tagging algorithm, and the model for that tagger (if any). To support serialization, each POSTagger value must provide a serialize value that can be used to generate a ByteString representation of the model, as well as a unique id (also a ByteString). Furthermore, that ID must be added to a `Map ByteString (ByteString -> Maybe POSTagger -> Either String POSTagger)` that is provided to deserialize. The function in the map takes the output of the tagger's serialize value, and possibly a back-off tagger, and reconstitutes the POSTagger that was serialized (assigning the proper functions, setting up closures as needed, etc.). Look at the source for  and   for examples.

The initial part-of-speech tagger.

Training function to train the immediate POS tagger.

A tagger to invoke on unknown tokens.

A tokenizer; ( will work.)

A sentence splitter. If your input is formatted as one sentence per line, then use -, otherwise try Erik Kow's fullstop library.

Store this POS tagger to a bytestring. This does not serialize the back-off taggers.

A unique id that will identify the algorithm used for this POS tagger. This is used in deserialization.

Get the number of documents that a term occurred in.

Add a document to the corpus. This can be dangerous if the documents are pre-processed differently.
All corpus-related functions assume that the documents have all been tokenized and the tokens normalized, in the same way.

Create a corpus from a list of documents, represented by normalized tokens.

Create a Literal Tagger using the specified back-off tagger as a fall-back, if one is specified. This uses a tokenizer adapted from the tokenize package for a tokenizer, and Erik Kow's fullstop sentence segmenter as a sentence splitter.

Create a tokenizer that protects the provided terms (to tokenize multi-word terms).

Deserialization for Literal Taggers. The serialization logic is in the posSerialize record of the POSTagger created in mkTagger.

Create an unambiguous tagger, using the supplied  as a source of tags.

Trainer method for unambiguous taggers.

The perceptron model.

Each feature gets its own weight vector, so weights is a dict-of-dicts.

The accumulated values, for the averaging. These will be keyed by feature/clas tuples.

The last time the feature was changed, for the averaging.
Also keyed by feature/clas tuples (tstamps is short for timestamps).

Number of instances seen.

Typedef for doubles to make the code easier to read, and to make this simple to change if necessary.

The classes that the perceptron assigns are represented with a newtype-wrapped String. Eventually, I think this should become a typeclass, so the classes can be defined by the users of the Perceptron (such as custom POS tag ADTs, or more complex classes).

An empty perceptron, used to start training.

Predict a class given a feature vector.

Ported from Python:

    def predict(self, features):
        '''Dot-product the features and current weights and return the best label.'''
        scores = defaultdict(float)
        for feat, value in features.items():
            if feat not in self.weights or value == 0:
                continue
            weights = self.weights[feat]
            for label, weight in weights.items():
                scores[label] += value * weight
        # Do a secondary alphabetic sort, for stability
        return max(self.classes, key=lambda label: (scores[label], label))

Update the perceptron with a new example.

    update(self, truth, guess, features)
        ...
        self.i += 1
        if truth == guess:
            return None
        for f in features:
            weights = self.weights.setdefault(f, {})  # setdefault is Map.findWithDefault, and destructive.
            upd_feat(truth, f, weights.get(truth, 0.0), 1.0)
            upd_feat(guess, f, weights.get(guess, 0.0), -1.0)
        return None

Ported from Python:

    def update(self, truth, guess, features):
        '''Update the feature weights.'''
        def upd_feat(c, f, w, v):
            param = (f, c)
            self._totals[param] += (self.i - self._tstamps[param]) * w
            self._tstamps[param] = self.i
            self.weights[f][c] = w + v

Average the weights.

Ported from Python:

    def average_weights(self):
        for feat, weights in self.weights.items():
            new_feat_weights = {}
            for clas, weight in weights.items():
                param = (feat, clas)
                total = self._totals[param]
                total += (self.i - self._tstamps[param]) * weight
                averaged = round(total / float(self.i), 3)
                if averaged:
                    new_feat_weights[clas] = averaged
            self.weights[feat] = new_feat_weights
        return None

Round a fractional number to a specified decimal place.

    roundTo 2 3.1459
    3.15

An efficient (ish) representation for documents in the "bag of words" sense.

Generate a TermVector from a tokenized document.

Invokes similarity on full strings, using $ for tokenization, and no stemming. There *must* be at least one document in the corpus.

Determine how similar two documents are. This function assumes that each document has been tokenized and (if desired) stemmed/case-normalized. This is a wrapper around the TermVector-based similarity function documented next, which is a *much* more efficient implementation. If you need to run similarity against any single document more than once, then you should create TermVectors for each of your documents and use that function instead of this one. There *must* be at least one document in the corpus.

Determine how similar two documents are. Calculates the similarity between two documents, represented as TermVectors.

Return the raw frequency of a term in a body of text. The first argument is the term to find, the second is a tokenized document.
This function does not do any stemming or additional text modification.

Calculate the inverse document frequency. The IDF is, roughly speaking, a measure of how rare a term is across the corpus: the more documents a term occurs in, the lower its IDF.

Calculate the tf*idf measure for a term given a document and a corpus.

Calculate the magnitude of a vector.

Find the dot product of two vectors.

A Parsec parser.

Example usage:

    > :set -XOverloadedStrings
    > import Text.Parsec.Prim
    > parse myExtractor "interactive repl" someTaggedSentence

Consume a token with the given POS tag.

Consume a token with the specified POS prefix.

    > parse (posPrefix "n") "ghci" [(Bob, Tag "np")]
    Right [(Bob, Tag "np")]

Text equality matching with optional case sensitivity.

Consume a token with the given lexical representation.

Consume any one non-empty token.

Skips any number of fill tokens, ending with the end parser, and returning the last parsed result. This is useful when you know what you're looking for and (for instance) don't care what comes first.

Find a clause in a larger collection of text. findClause skips over leading tokens, if needed, to locate a clause.

Read a POS-tagged corpus out of a Text string of the form: "token/tag token/tag..."

    readPOS "Dear/jj Sirs/nns :/: Let/vb"
    [("Dear",JJ),("Sirs",NNS),(":",Other ":"),("Let",VB)]

Returns all but the last element of a string, unless the string is empty, in which case it returns that string.

Create an Averaged Perceptron Tagger using the specified back-off tagger as a fall-back, if one is specified. This uses a tokenizer adapted from the tokenize package for a tokenizer, and Erik Kow's fullstop sentence segmenter (http://hackage.haskell.org/package/fullstop) as a sentence splitter.
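The "token/tag" corpus format used by readPOS and by the training functions can be parsed with very little code. The sketch below is an assumption-laden illustration rather than the library's parser: `splitTagged` and `readTagged` are hypothetical names, tokens and tags are plain `String`s instead of Text and a tag ADT, and upper-casing the tag merely mimics the way the readPOS example maps "jj" to JJ.

```haskell
import Data.Char (toUpper)

-- Split one "token/tag" item on its LAST slash, so that tokens which
-- themselves contain a slash (for example "1/2/cd") keep their text intact.
splitTagged :: String -> (String, String)
splitTagged item =
  case break (== '/') (reverse item) of
    (revTag, _ : revTok) -> (reverse revTok, map toUpper (reverse revTag))
    (revAll, [])         -> (reverse revAll, "")  -- no slash: empty tag

-- One sentence per line; items within a line are separated by whitespace.
readTagged :: String -> [[(String, String)]]
readTagged = map (map splitTagged . words) . lines
```

Splitting on the last slash rather than the first is a deliberate choice here: the tag never contains a slash in this format, while the token occasionally does.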
Train a new Perceptron.
The training corpus should be a collection of sentences, one sentence on each line, and with each token tagged with a part of speech.
For example, the input:

    "The/DT dog/NN jumped/VB ./.\nThe/DT cat/NN slept/VB ./."

defines two training sentences.

    >>> tagger <- trainNew "Dear/jj Sirs/nns :/: Let/vb\nUs/nn begin/vb\n"
    >>> tag tagger $ map T.words $ T.lines "Dear sir"
    "Dear/jj Sirs/nns :/: Let/vb"

Train a new Perceptron on a corpus of files.

Add training examples to a perceptron.

    >>> tagger <- train emptyPerceptron "Dear/jj Sirs/nns :/: Let/vb\nUs/nn begin/vb\n"
    >>> tag tagger $ map T.words $ T.lines "Dear sir"
    "Dear/jj Sirs/nns :/: Let/vb"

If you're using multiple input files, this can be useful to improve performance (by folding over the files). For example, see trainOnFiles.

Start markers to ensure all features in context are valid, even for the first "real" tokens.

End markers to ensure all features are valid, even for the last "real" tokens.

Tag a document (represented as a list of Sentences) with a trained Perceptron.

Ported from Python:

    def tag(self, corpus, tokenize=True):
        '''Tags a string `corpus`.'''
        # Assume untokenized corpus has \n between sentences and ' ' between words
        s_split = nltk.sent_tokenize if tokenize else lambda t: t.split('\n')
        w_split = nltk.word_tokenize if tokenize else lambda s: s.split()
        def split_sents(corpus):
            for s in s_split(corpus):
                yield w_split(s)

        prev, prev2 = self.START
        tokens = []
        for words in split_sents(corpus):
            context = self.START + [self._normalize(w) for w in words] + self.END
            for i, word in enumerate(words):
                tag = self.tagdict.get(word)
                if not tag:
                    features = self._get_features(i, word, context, prev, prev2)
                    tag = self.model.predict(features)
                tokens.append((word, tag))
                prev2 = prev
                prev = tag
        return tokens

Tag a single sentence.

Train a model from sentences.

Ported from Python:

    def train(self, sentences, save_loc=None, nr_iter=5):
        self._make_tagdict(sentences)
        self.model.classes = self.classes
        prev, prev2 = START
        for iter_ in range(nr_iter):
            c = 0
            n = 0
            for words, tags in sentences:
                context = START + [self._normalize(w) for w in words] + END
                for i, word in enumerate(words):
                    guess = self.tagdict.get(word)
                    if not guess:
                        feats = self._get_features(i, word, context, prev, prev2)
                        guess = self.model.predict(feats)
                        self.model.update(tags[i], guess, feats)
                    prev2 = prev; prev = guess
                    c += guess == tags[i]
                    n += 1
            random.shuffle(sentences)
            logging.info("Iter {0}: {1}/{2}={3}".format(iter_, c, n, _pc(c, n)))
        self.model.average_weights()
        # Pickle as a binary file
        if save_loc is not None:
            pickle.dump((self.model.weights, self.tagdict, self.classes),
                        open(save_loc, 'wb'), -1)
        return None

Train on one sentence.

Adapted from this portion of the Python train method:

    context = START + [self._normalize(w) for w in words] + END
    for i, word in enumerate(words):
        guess = self.tagdict.get(word)
        if not guess:
            feats = self._get_features(i, word, context, prev, prev2)
            guess = self.model.predict(feats)
            self.model.update(tags[i], guess, feats)
        prev2 = prev; prev = guess
        c += guess == tags[i]
        n += 1

Predict a Part of Speech, defaulting to the Unk tag, if no classification is found.

Default feature set.

    def _get_features(self, i, word, context, prev, prev2):
        '''Map tokens into a feature representation, implemented as a
        {hashable: float} dict. If the features change, a new model must be
        trained.
        '''
        def add(name, *args):
            features[' '.join((name,) + tuple(args))] += 1

        i += len(self.START)
        features = defaultdict(int)
        # It's useful to have a constant feature, which acts sort of like a prior
        add('bias')
        add('i suffix', word[-3:])
        add('i pref1', word[0])
        add('i-1 tag', prev)
        add('i-2 tag', prev2)
        add('i tag+i-2 tag', prev, prev2)
        add('i word', context[i])
        add('i-1 tag+i word', prev, context[i])
        add('i-1 word', context[i-1])
        add('i-1 suffix', context[i-1][-3:])
        add('i-2 word', context[i-2])
        add('i+1 word', context[i+1])
        add('i+1 suffix', context[i+1][-3:])
        add('i+2 word', context[i+2])
        return features

The POS tag parser.

The initial model.

Training data; formatted with one sentence per line, and standard POS tags after each space-delimited token.

The number of times to iterate over the training data, randomly shuffling after each iteration. (5 is a reasonable choice.)

The Perceptron to train.

The training data. (A list of [(Text, Tag)]'s)

A trained perceptron. IO is needed for randomization.

The default table of tagger IDs to readTagger functions. Each tagger packaged with Chatter should have an entry here. By convention, the IDs used are the fully qualified module name of the tagger package.

Store a POSTagger to a file.

Load a tagger, using the internal taggerTable. If you need to specify your own mappings for new composite taggers, you should use deserialize.
This function checks the filename to determine if the content should be decompressed. If the file ends with ".gz", then we assume it is a gzipped model.

Tag a chunk of input text with part-of-speech tags, using the sentence splitter, tokenizer, and tagger contained in the POSTagger.

Tag the tokens in a string.
Returns a space-separated string of tokens, each token suffixed with the part of speech. For example:

    >>> tag tagger "the dog jumped ."
    "the/at dog/nn jumped/vbd ./."

Text version of tagStr.

Train a tagger on string input in the standard form for POS tagged corpora:

    trainStr tagger "the/at dog/nn jumped/vbd ./."

The Text version of trainStr.

Train a POSTagger on a corpus of sentences.
This will recurse through the POSTagger stack, training all the backoff taggers as well. In order to do that, this function has to be generic to the kind of taggers used, so it is not possible to train up a new POSTagger from nothing: train wouldn't know what tagger to create.
To get around that restriction, you can use the various mkTagger implementations, such as the mkTagger of a specific tagger module (for example, NLP.POS.AvgPerceptronTagger.mkTagger). For example:

    import NLP.POS.AvgPerceptronTagger as APT

    let newTagger = APT.mkTagger APT.emptyPerceptron Nothing
    posTgr <- train newTagger trainingExamples

Evaluate a POSTagger.
Measures accuracy over all tags in the test corpus.
Accuracy is calculated as:

    |tokens tagged correctly| / |all tokens|

chatter-0.3.0.0
Data.DefaultMap NLP.Corpora.Email NLP.Types.General NLP.Types.Tags NLP.Corpora.Brown NLP.Types.Tree NLP.Tokenize.Chatter NLP.Types NLP.POS.LiteralTagger NLP.POS.UnambiguousTagger NLP.POS.AvgPerceptron NLP.Similarity.VectorSim NLP.Extraction.Parsec NLP.Extraction.Examples.ParsecExamples NLP.Corpora.Parsing NLP.POS.AvgPerceptronTagger NLP.POS
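The "token/tag" corpus format and the accuracy measure described for tagger evaluation in these docs fit together as in the following sketch. It is written in Python, like the reference code quoted in the docs, and is only an illustration: `read_pos` and `accuracy` are hypothetical helper names, tags are kept as raw strings rather than parsed into a tag type, and treating an empty corpus as 0.0 accuracy is an assumption.

```python
def read_pos(text):
    """Parse 'token/tag token/tag ...' into a list of (token, tag) pairs.

    The split is on the last '/', so a token such as ':' in ':/:' survives.
    An item without any '/' is kept as a token with an empty tag (an
    assumption of this sketch).
    """
    pairs = []
    for item in text.split():
        token, sep, tag = item.rpartition("/")
        if not sep:
            token, tag = item, ""
        pairs.append((token, tag))
    return pairs

def accuracy(gold, predicted):
    """|tokens tagged correctly| / |all tokens| over two aligned tag lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(1 for (_, g), (_, p) in zip(gold, predicted) if g == p)
    return correct / len(gold)

gold = read_pos("the/at dog/nn jumped/vbd ./.")
predicted = read_pos("the/at dog/nn jumped/vb ./.")
print(accuracy(gold, predicted))
```

Here one of the four tokens carries the wrong tag, so three of four are counted as correct.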