Defaulting Map: a Map that returns a default value when queried for a key that does not exist.

- Create an empty defaulting map.
- Query the map for a value. Returns the default if the key is not found.
- Create a defaulting map from a default value and a list.
- Access the keys as a list.
- Access the non-default values as a list.
- Map a function over the values in a map.
- Fold over the values in the map. Note that this does *not* fold over the default value; this fold behaves in the same way as a standard fold.
- Compute the union of two maps using the specified per-value combination function and the specified new-map default value. Arguments, in order: the function used to combine values; the new map's default value; the first map to combine; the second map to combine.

Path to the directory containing all the PLUG archives.

Boolean type to indicate case sensitivity for textual comparisons.

Just a handy alias for Text.

A fallback POS tag instance.

A fall-back instance, analogous to the fallback POS tag instance.

The class of POS Tags. We use a typeclass here because POS tags just need a few things in excess of equality (they also need to be serializable and human readable). Passing around all the constraints everywhere becomes a hassle, and it's handy to have a uniform interface to the different kinds of tag types.

This typeclass also allows corpus-specific tags to be distinguished; they have different semantics, so they should not be merged.
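The defaulting-map operations described at the start of this section can be sketched as follows. This is a minimal illustration, not the library's actual implementation; the names (`DefaultMap`, `defEmpty`, `defLookup`, `defFromList`, `defFoldl`) are assumptions chosen for clarity.

```haskell
-- Illustrative sketch of a defaulting map; the names here are
-- assumptions, not necessarily the library's actual API.
import qualified Data.Map.Strict as Map

data DefaultMap k v = DefaultMap
  { defDefault :: v            -- value returned for missing keys
  , defMap     :: Map.Map k v  -- the explicitly stored entries
  }

-- An empty defaulting map with the given default.
defEmpty :: v -> DefaultMap k v
defEmpty def = DefaultMap def Map.empty

-- Query the map; returns the default if the key is absent.
defLookup :: Ord k => k -> DefaultMap k v -> v
defLookup k (DefaultMap def m) = Map.findWithDefault def k m

-- Build from a default value and an association list.
defFromList :: Ord k => v -> [(k, v)] -> DefaultMap k v
defFromList def = DefaultMap def . Map.fromList

-- Fold over the stored (non-default) values only, as documented above.
defFoldl :: (a -> v -> a) -> a -> DefaultMap k v -> a
defFoldl f z = Map.foldl' f z . defMap
```

Note that the fold only visits stored values, so the default never contributes to the result, matching the documented behaviour.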
That said, if you wish to create a unifying POS Tag set, and mappings into that set, you can use the type system to ensure that this is done correctly.

This may get renamed to POSTag at some later date.

Check if a tag is a determiner tag.

The class of things that can be regarded as chunks. Chunk tags are much like POS tags, but should not be confused with them. Generally, chunks distinguish between different phrasal categories (e.g. Noun Phrases, Verb Phrases, Prepositional Phrases, etc.).

The class of named entity sets. This typeclass can be defined entirely in terms of the required class constraints.

Tag instance for unknown tagsets.

Raw tokenized text. Token has an IsString instance to simplify use.

A POS-tagged token.

A tagged sentence has POS tags. Generated by a part-of-speech tagger. (tagger :: Tag tag => Sentence -> TaggedSentence tag)

A Chunk that strictly contains chunks or POS tags.

A data type to represent the portions of a parse tree for chunks. Note that this part of the parse tree could be a POS tag with no chunk.

A chunked sentence has POS tags and chunk tags. Generated by a chunker. (chunker :: (Chunk chunk, Tag tag) => TaggedSentence tag -> ChunkedSentence chunk tag)

A sentence of tokens without tags. Generated by the tokenizer. (tokenizer :: Text -> Sentence)

Extract the token list from a Sentence.

Apply a parallel list of tags to a Sentence.

Generate a Text representation of a TaggedSentence in the common tagged format, e.g.: "the/at dog/nn jumped/vbd ./."

Remove the tags from a tagged sentence.

Extract the tags from a tagged sentence, returning a parallel list of tags along with the underlying Sentence.

Combine the results of POS taggers, using the second parameter to fill in unknown-tag entries, where possible.

Merge TaggedSentence values, preferring the tags in the first TaggedSentence.

Returns the first parameter, unless it is tagged as unknown.
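The "token/tag" text format described above can be illustrated with a small sketch. The types here are simplified stand-ins (plain Strings) for the library's Sentence/TaggedSentence types, and the names `showTagged`/`stripTags` are assumptions for illustration.

```haskell
-- Sketch of the common "token/tag" format; types and names are
-- simplified stand-ins, not the library's actual API.
type Token = String
type Tag   = String

-- Render a tagged sentence in the common "token/tag" format,
-- e.g. "the/at dog/nn jumped/vbd ./."
showTagged :: [(Token, Tag)] -> String
showTagged = unwords . map (\(tok, tag) -> tok ++ "/" ++ tag)

-- Remove the tags, recovering the plain token sequence.
stripTags :: [(Token, Tag)] -> [Token]
stripTags = map fst
```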
The tag-preferring merge above throws an error if the text of the two sentences does not match.

Helper to create chunk parse-tree values.

Helper to create chunk parse-tree values that just hold POS-tagged data.

Show the underlying text token only.

Show the text and tag.

Extract the text of a Token.

Extract the last three characters of a Token, if the token is long enough; otherwise return the full token text.

Extract the list of POS tags from a TaggedSentence.

Calculate the length of a TaggedSentence (in terms of the number of tokens).

Brutally concatenate two TaggedSentences.

True if the input sentence contains the given text token. Does not do partial or approximate matching, and compares details in a fully case-sensitive manner.

True if the input sentence contains the given POS tag. Does not do partial matching (such as prefix matching).

Compare the POS-tag token with a supplied tag string.

Compare the POS-tagged token with a text string.

Compare a token with a text string.

Data type to indicate IOB tags for chunking: a Begin marker, an in-chunk tag, and "not in a chunk".

Parse an IOB-chunk encoded line of text. Assumes that the line has three space-delimited entries, in the format:

> token POSTag IOBChunk

For example:

> parseIOBLine "We PRP B-NP" :: IOBChunk B.Chunk B.Tag
> BChunk (POS B.PRP (Token "We")) B.C_NP

Turn an IOB result into a tree.

Parse an IOB-encoded corpus.

Just split a body of text into lines, and then into "paragraphs". Each resulting sub-list is separated by empty lines in the original text. e.g.:

> getSentences "He\njumped\n.\n\nShe\njumped\n."
> [["He", "jumped", "."], ["She", "jumped", "."]]

These tags may actually be the Penn Treebank tags, but I have not (yet?) seen the punctuation tags added to the Penn set. This particular list was compiled from the union of:

- All tags used on the Conll2000 training corpus (contributing the punctuation tags).
- The Penn Treebank tags, listed here: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html (which contributed LS over the items in the corpus).
- The tags START, END, and Unk, which are used by Chatter.

START tag, used in training.
END tag, used in training.

Punctuation tags: # $ '' `` ( ) , the sentence terminator (.), and the colon.

The remaining tags, in the standard Penn Treebank order:

- CC: Coordinating conjunction
- CD: Cardinal number
- DT: Determiner
- EX: Existential there
- FW: Foreign word
- IN: Preposition or subordinating conjunction
- JJ: Adjective
- JJR: Adjective, comparative
- JJS: Adjective, superlative
- LS: List item marker
- MD: Modal
- NN: Noun, singular or mass
- NNS: Noun, plural
- NNP: Proper noun, singular
- NNPS: Proper noun, plural
- PDT: Predeterminer
- POS: Possessive ending
- PRP: Personal pronoun
- PRP$: Possessive pronoun
- RB: Adverb
- RBR: Adverb, comparative
- RBS: Adverb, superlative
- RP: Particle
- SYM: Symbol
- TO: to
- UH: Interjection
- VB: Verb, base form
- VBD: Verb, past tense
- VBG: Verb, gerund or present participle
- VBN: Verb, past participle
- VBP: Verb, non-3rd person singular present
- VBZ: Verb, 3rd person singular present
- WDT: Wh-determiner
- WP: Wh-pronoun
- WP$: Possessive wh-pronoun
- WRB: Wh-adverb

Phrase chunk tags defined for the Conll task: Noun Phrase, Prepositional Phrase, Verb Phrase, and "out" (not a chunk).

Named entity categories defined for the Conll 2003 task.

Parse an IOB-formatted Conll corpus into TaggedSentences.

Order matters here: the patterns are replaced in reverse order when generating tags, and in top-to-bottom order when parsing tags.

Document corpus. This is a simple hashed corpus; the document content is not stored.

The number of documents in the corpus.

A count of the number of documents each term occurred in.

Part-of-speech tagger, with back-off tagger.

A sequence of POS taggers can be assembled by using backoff taggers. When tagging text, the first tagger is run on the input, possibly tagging some tokens as unknown ('Tag Unk'). The first backoff tagger is then recursively invoked on the text to fill in the unknown tags, but that may still leave some tokens marked with 'Tag Unk'. This process repeats until no more taggers are found.
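The backoff process described above can be sketched in a few lines. The types here are simplified stand-ins for the library's POSTagger, and `tagWithBackoff` is an illustrative name; the real implementation re-invokes the backoff tagger on the text, but the fill-in-the-Unk-slots behaviour is the same.

```haskell
-- Sketch of backoff tagging: each tagger fills in only the tokens
-- its predecessor left as Unk. Simplified stand-in types; not the
-- library's actual implementation.
data Tag = Tag String | Unk deriving (Eq, Show)

data POSTagger = POSTagger
  { posTagger  :: [String] -> [Tag]  -- tag one sentence of tokens
  , posBackoff :: Maybe POSTagger    -- optional fall-back tagger
  }

-- Run the tagger, then let backoff taggers fill the Unk slots.
tagWithBackoff :: POSTagger -> [String] -> [Tag]
tagWithBackoff tgr toks =
  let first = posTagger tgr toks
  in case posBackoff tgr of
       Nothing   -> first
       Just next -> zipWith pick first (tagWithBackoff next toks)
  where
    pick Unk fallback = fallback  -- unknown: defer to the backoff tag
    pick tag _        = tag       -- known: keep the first tagger's tag
```

For example, a literal tagger that only knows "the" can delegate everything else to a catch-all backoff tagger.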
(The current implementation is not very efficient in this respect.)

Back-off taggers are particularly useful when there is a set of domain-specific vernacular that a general-purpose statistical tagger does not know of. A LiteralTagger can be created to map terms to fixed POS tags, and then delegate the bulk of the text to a statistical back-off tagger, such as an AvgPerceptronTagger.

POSTagger values can be serialized and deserialized by using NLP.POS.serialize and NLP.POS.deserialize. This is a bit tricky because the POSTagger abstracts away the implementation details of the particular tagging algorithm, and the model for that tagger (if any). To support serialization, each POSTagger value must provide a serialize value that can be used to generate a ByteString representation of the model, as well as a unique id (also a ByteString). Furthermore, that ID must be added to a `Map ByteString (ByteString -> Maybe POSTagger -> Either String POSTagger)` that is provided to deserialize. The function in the map takes the serialized model, and possibly a backoff tagger, and reconstitutes the POSTagger that was serialized (assigning the proper functions, setting up closures as needed, etc.). Look at the source of the literal tagger and the averaged perceptron tagger for examples.

Fields of a POSTagger:

- The initial part-of-speech tagger.
- Training function to train the immediate POS tagger.
- A tagger to invoke on unknown tokens.
- A tokenizer.
- A sentence splitter. If your input is formatted as one sentence per line, then a line-based splitter will do; otherwise try Erik Kow's fullstop library.
- Store this POS tagger to a bytestring. This does not serialize the backoff taggers.
- A unique id that identifies the algorithm used for this POS tagger. This is used in deserialization.

Get the number of documents that a term occurred in.

Add a document to the corpus. This can be dangerous if the documents are pre-processed differently.
All corpus-related functions assume that the documents have all been tokenized and the tokens normalized, in the same way.

Create a corpus from a list of documents, represented by normalized tokens.

Create a Literal Tagger using the specified back-off tagger as a fall-back, if one is specified. This uses a tokenizer adapted from the tokenize package, and Erik Kow's fullstop sentence segmenter as a sentence splitter.

Create a tokenizer that protects the provided terms (to tokenize multi-word terms).

Deserialization for Literal Taggers. The serialization logic is in the posSerialize record of the POSTagger created in mkTagger.

Create an unambiguous tagger, using the supplied map as a source of tags.

Trainer method for unambiguous taggers.

The perceptron model:

- Each feature gets its own weight vector, so weights is a dict-of-dicts.
- The accumulated values, for the averaging. These will be keyed by feature/class tuples.
- The last time the feature was changed, for the averaging.
Also keyed by feature/class tuples (tstamps is short for timestamps).

- Number of instances seen.

Typedef for doubles to make the code easier to read, and to make this simple to change if necessary.

The classes that the perceptron assigns are represented with a newtype-wrapped String. Eventually, I think this should become a typeclass, so the classes can be defined by the users of the Perceptron (such as custom POS tag ADTs, or more complex classes).

An empty perceptron, used to start training.

Predict a class given a feature vector. Ported from Python:

> def predict(self, features):
>     '''Dot-product the features and current weights and return the best label.'''
>     scores = defaultdict(float)
>     for feat, value in features.items():
>         if feat not in self.weights or value == 0:
>             continue
>         weights = self.weights[feat]
>         for label, weight in weights.items():
>             scores[label] += value * weight
>     # Do a secondary alphabetic sort, for stability
>     return max(self.classes, key=lambda label: (scores[label], label))

Update the perceptron with a new example. Ported from Python:

> def update(self, truth, guess, features):
>     '''Update the feature weights.'''
>     def upd_feat(c, f, w, v):
>         param = (f, c)
>         self._totals[param] += (self.i - self._tstamps[param]) * w
>         self._tstamps[param] = self.i
>         self.weights[f][c] = w + v
>     self.i += 1
>     if truth == guess:
>         return None
>     for f in features:
>         # setdefault is Map.findWithDefault, and destructive.
>         weights = self.weights.setdefault(f, {})
>         upd_feat(truth, f, weights.get(truth, 0.0), 1.0)
>         upd_feat(guess, f, weights.get(guess, 0.0), -1.0)
>     return None

Average the weights. Ported from Python:

> def average_weights(self):
>     for feat, weights in self.weights.items():
>         new_feat_weights = {}
>         for clas, weight in weights.items():
>             param = (feat, clas)
>             total = self._totals[param]
>             total += (self.i - self._tstamps[param]) * weight
>             averaged = round(total / float(self.i), 3)
>             if averaged:
>                 new_feat_weights[clas] = averaged
>         self.weights[feat] = new_feat_weights
>     return None

Round a fractional number to a specified decimal place:

> roundTo 2 3.1459
> 3.15

The type of Chunkers; incorporates chunking, training, serialization, and unique IDs for deserialization.

The unique ID for this implementation of a Chunker.

Deserialize an AvgPerceptronChunker from a ByteString.

Create a chunker from a Perceptron.

Chunk a list of POS-tagged sentences, generating a parse tree.

Chunk a single POS-tagged sentence.

Turn an IOB result into a tree. Copied directly from the AvgPerceptronTagger; should be generalized?

Start markers to ensure all features in context are valid, even for the first "real" tokens.

End markers to ensure all features are valid, even for the last "real" tokens.

Train on one sentence.

Training parameters:

- The number of times to iterate over the training data, randomly shuffling after each iteration. (5 is a reasonable choice.)
- The Perceptron to train.
- The training data. (A list of [(Text, Tag)]'s.)
- Returns a trained perceptron. IO is needed for randomization.

Feature-context parameters:

- The full sentence that this word is located in.
- The index of the current word.
- The current word/tag pair.
- The predicted class of the previous word.
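The Python `predict` method quoted above maps fairly directly onto a pure Haskell sketch: dot-product the feature values with each feature's per-class weights, then take the best class with an alphabetic tie-break for stability. This is an illustration with plain String features and classes, not the library's actual implementation.

```haskell
-- Sketch of averaged-perceptron prediction; plain String features
-- and classes are stand-ins for the library's newtype wrappers.
import qualified Data.Map.Strict as Map
import Data.List (maximumBy)
import Data.Ord (comparing)

type Feature = String
type Class   = String
-- Each feature has its own weight vector: a dict-of-dicts.
type Weights = Map.Map Feature (Map.Map Class Double)

predict :: [Class] -> Weights -> Map.Map Feature Double -> Class
predict classes weights features = maximumBy (comparing score) classes
  where
    -- Accumulate value * weight for every (feature, class) pair,
    -- skipping zero-valued and unknown features, as in the Python.
    scores = Map.foldlWithKey' accum Map.empty features
    accum acc feat value
      | value == 0 = acc
      | otherwise  =
          case Map.lookup feat weights of
            Nothing -> acc
            Just ws -> Map.foldlWithKey'
                         (\a cls w -> Map.insertWith (+) cls (value * w) a)
                         acc ws
    -- Secondary alphabetic sort, for stability (matches Python's
    -- max(..., key=lambda label: (scores[label], label))).
    score cls = (Map.findWithDefault 0 cls scores, cls)
```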
An efficient(ish) representation for documents in the "bag of words" sense.

Make a document from a list of tokens.

Access the underlying DefaultMap used to store term vector details.

Generate a term vector from a tokenized document.

Invokes similarity on full strings, using the default tokenizer for tokenization, and no stemming. The return value will be in the range [0, 1]. There *must* be at least one document in the corpus.

Determine how similar two documents are. This function assumes that each document has been tokenized and (if desired) stemmed/case-normalized. It is a wrapper around the term-vector similarity function, which is a *much* more efficient implementation. If you need to run similarity against any single document more than once, then you should create term vectors for each of your documents and use the term-vector similarity function instead. The return value will be in the range [0, 1]. There *must* be at least one document in the corpus.

Determine how similar two documents are. Calculates the similarity between two documents, represented as term vectors, returning a double in the range [0, 1], where 1 represents "most similar".

Return the raw frequency of a term in a body of text. The first argument is the term to find; the second is a tokenized document. This function does not do any stemming or additional text modification.

Calculate the inverse document frequency. The IDF is, roughly speaking, a measure of how popular a term is.

Calculate the tf*idf measure for a term given a document and a corpus.

Add two term vectors. When a term is added, its value in each vector is used (or that vector's default value is used if the term is absent from the vector). The new term vector resulting from the addition always uses a default value of zero.

A "zero vector" term vector (i.e. addVector v zeroVector = v).

Negate a term vector.

Add a list of term vectors.

Calculate the magnitude of a vector.

Find the dot product of two vectors.

A Parsec parser. Example usage:

> :set -XOverloadedStrings
> import Text.Parsec.Prim
> parse myExtractor "interactive repl" someTaggedSentence

Consume a token with the given POS Tag.

Consume a token with the specified POS prefix. e.g.:

> parse (posPrefix "n") "ghci" [("Bob", Tag "np")]
> Right [("Bob", Tag "np")]

Text equality matching with optional case sensitivity.

Consume a token with the given lexical representation.

Consume any one non-empty token.

Skips any number of fill tokens, ending with the end parser, and returning the last parsed result. This is useful when you know what you're looking for and (for instance) don't care what comes first.

Read a POS-tagged corpus out of a Text string of the form "token/tag token/tag ...":

> readPOS "Dear/jj Sirs/nns :/: Let/vb"
> [("Dear",JJ),("Sirs",NNS),(":",Other ":"),("Let",VB)]

Returns all but the last element of a string, unless the string is empty, in which case it returns that string.

Create an Averaged Perceptron Tagger using the specified back-off tagger as a fall-back, if one is specified. This uses a tokenizer adapted from the tokenize package, and Erik Kow's fullstop sentence segmenter (http://hackage.haskell.org/package/fullstop) as a sentence splitter.
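The "token/tag" corpus format read by readPOS above can be parsed with a short sketch. Tags are kept as plain Strings rather than the library's Tag types, and `parseTagged` is an illustrative name; splitting on the *last* slash handles tokens that themselves contain a slash, such as "./.".

```haskell
-- Sketch of parsing "token/tag" text into (token, tag) pairs;
-- parseTagged is an illustrative name, not the library's API.
parseTagged :: String -> [(String, String)]
parseTagged = map splitTok . words
  where
    -- Split each token on its last '/' by working on the reversal.
    splitTok tok =
      case span (/= '/') (reverse tok) of
        (revTag, '/' : revWord) -> (reverse revWord, reverse revTag)
        _                       -> (tok, "")  -- no tag present
```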
Train a new POSTagger.

The training corpus should be a collection of sentences, one sentence on each line, and with each token tagged with a part of speech. For example, the input:

> "The/DT dog/NN jumped/VB ./.\nThe/DT cat/NN slept/VB ./."

defines two training sentences.

> tagger <- trainNew "Dear/jj Sirs/nns :/: Let/vb\nUs/nn begin/vb\n"
> tag tagger $ map T.words $ T.lines "Dear sir"
> "Dear/jj Sirs/nns :/: Let/vb"

Train a new POSTagger on a corpus of files.

Add training examples to a perceptron:

> tagger <- train emptyPerceptron "Dear/jj Sirs/nns :/: Let/vb\nUs/nn begin/vb\n"
> tag tagger $ map T.words $ T.lines "Dear sir"
> "Dear/jj Sirs/nns :/: Let/vb"

If you're using multiple input files, this can be useful to improve performance (by folding over the files).

Start markers to ensure all features in context are valid, even for the first "real" tokens.

End markers to ensure all features are valid, even for the last "real" tokens.

Tag a document (represented as a list of Sentences) with a trained POSTagger. Ported from Python:

> def tag(self, corpus, tokenize=True):
>     '''Tags a string `corpus`.'''
>     # Assume untokenized corpus has \n between sentences and ' ' between words
>     s_split = nltk.sent_tokenize if tokenize else lambda t: t.split('\n')
>     w_split = nltk.word_tokenize if tokenize else lambda s: s.split()
>     def split_sents(corpus):
>         for s in s_split(corpus):
>             yield w_split(s)
>     prev, prev2 = self.START
>     tokens = []
>     for words in split_sents(corpus):
>         context = self.START + [self._normalize(w) for w in words] + self.END
>         for i, word in enumerate(words):
>             tag = self.tagdict.get(word)
>             if not tag:
>                 features = self._get_features(i, word, context, prev, prev2)
>                 tag = self.model.predict(features)
>             tokens.append((word, tag))
>             prev2 = prev
>             prev = tag
>     return tokens

Tag a single sentence.

Train a model from sentences. Ported from Python:

> def train(self, sentences, save_loc=None, nr_iter=5):
>     self._make_tagdict(sentences)
>     self.model.classes = self.classes
>     prev, prev2 = START
>     for iter_ in range(nr_iter):
>         c = 0
>         n = 0
>         for words, tags in sentences:
>             context = START + [self._normalize(w) for w in words] + END
>             for i, word in enumerate(words):
>                 guess = self.tagdict.get(word)
>                 if not guess:
>                     feats = self._get_features(i, word, context, prev, prev2)
>                     guess = self.model.predict(feats)
>                     self.model.update(tags[i], guess, feats)
>                 prev2 = prev; prev = guess
>                 c += guess == tags[i]
>                 n += 1
>         random.shuffle(sentences)
>         logging.info("Iter {0}: {1}/{2}={3}".format(iter_, c, n, _pc(c, n)))
>     self.model.average_weights()
>     # Pickle as a binary file
>     if save_loc is not None:
>         pickle.dump((self.model.weights, self.tagdict, self.classes),
>                     open(save_loc, 'wb'), -1)
>     return None

Train on one sentence. Adapted from this portion of the Python train method:

> context = START + [self._normalize(w) for w in words] + END
> for i, word in enumerate(words):
>     guess = self.tagdict.get(word)
>     if not guess:
>         feats = self._get_features(i, word, context, prev, prev2)
>         guess = self.model.predict(feats)
>         self.model.update(tags[i], guess, feats)
>     prev2 = prev; prev = guess
>     c += guess == tags[i]
>     n += 1

Predict a part of speech, defaulting to the Unk tag if no classification is found.

Default feature set:

> def _get_features(self, i, word, context, prev, prev2):
>     '''Map tokens into a feature representation, implemented as a
>     {hashable: float} dict. If the features change, a new model must
>     be trained.'''
>     def add(name, *args):
>         features[' '.join((name,) + tuple(args))] += 1
>     i += len(self.START)
>     features = defaultdict(int)
>     # It's useful to have a constant feature, which acts sort of like a prior
>     add('bias')
>     add('i suffix', word[-3:])
>     add('i pref1', word[0])
>     add('i-1 tag', prev)
>     add('i-2 tag', prev2)
>     add('i tag+i-2 tag', prev, prev2)
>     add('i word', context[i])
>     add('i-1 tag+i word', prev, context[i])
>     add('i-1 word', context[i-1])
>     add('i-1 suffix', context[i-1][-3:])
>     add('i-2 word', context[i-2])
>     add('i+1 word', context[i+1])
>     add('i+1 suffix', context[i+1][-3:])
>     add('i+2 word', context[i+2])
>     return features

The POS tag parser.

The initial model.

Training data; formatted with one sentence per line, and standard POS tags after each space-delimited token.

- The number of times to iterate over the training data, randomly shuffling after each iteration. (5 is a reasonable choice.)
- The Perceptron to train.
- The training data. (A list of [(Text, Tag)]'s.)
- Returns a trained perceptron. IO is needed for randomization.

Brown corpus tags:

- START tag, used in training.
- END tag, used in training.
- Opening and closing parentheses; comma.
- Negator: not, n't.
- sentence terminator
- colon
- determiner/pronoun, pre-qualifier, e.g. quite such rather
- determiner/pronoun, pre-quantifier, e.g. all half many nary
- determiner/pronoun, double conjunction or pre-quantifier: both
- determiner/pronoun, post-determiner, e.g. many other next more last former little several enough most least only very few fewer past same Last latter less single plenty 'nough lesser certain various manye next-to-last particular final previous present nuf
- determiner/pronoun, post-determiner, genitive, e.g. other's
- determiner/pronoun, post-determiner, hyphenated pair, e.g. many-much
- article, e.g. the an no a every th' ever' ye
- verb "to be", infinitive or imperative, e.g. be
- verb "to be", past tense, 2nd person singular or all persons plural, e.g. were
- verb "to be", past tense, 2nd person singular or all persons plural, negated, e.g. weren't
- verb "to be", past tense, 1st and 3rd person singular, e.g. was
- verb "to be", past tense, 1st and 3rd person singular, negated, e.g. wasn't
- verb "to be", present participle or gerund, e.g. being
- verb "to be", present tense, 1st person singular, e.g. am
- verb "to be", present tense, 1st person singular, negated, e.g. ain't
- verb "to be", past participle, e.g. been
- verb "to be", present tense, 2nd person singular or all persons plural, e.g. are art
- verb "to be", present tense, 2nd person singular or all persons plural, negated, e.g. aren't ain't
- verb "to be", present tense, 3rd person singular, e.g. is
- verb "to be", present tense, 3rd person singular, negated, e.g. isn't ain't
- conjunction, coordinating, e.g. and or but plus & either neither nor yet n and/or minus an'
- numeral, cardinal, e.g. two one 1 four 2 1913 71 74 637 1937 8 five three million 87-31 29-5 seven 1,119 fifty-three 7.5 billion hundred 125,000 1,700 60 100 six ...
- numeral, cardinal, genitive, e.g. 1960's 1961's .404's
- conjunction, subordinating, e.g. that as after whether before while like because if since for than altho until so unless though providing once lest sposin' till whereas whereupon supposing tho' albeit then so's 'fore
- verb "to do", uninflected present tense, infinitive or imperative, e.g. do dost
- verb "to do", uninflected present tense or imperative, negated, e.g. don't
- verb "to do", past or present tense + pronoun, personal, nominative, not 3rd person singular, e.g. d'you
- verb "to do", past tense, e.g. did done
- verb "to do", past tense, negated, e.g. didn't
- verb "to do", present tense, 3rd person singular, e.g. does
- verb "to do", present tense, 3rd person singular, negated, e.g. doesn't don't
- determiner/pronoun, singular, e.g. this each another that 'nother
- determiner/pronoun, singular, genitive, e.g. another's
- determiner/pronoun + verb "to be", present tense, 3rd person singular, e.g. that's
- determiner/pronoun + modal auxiliary, e.g. that'll this'll
- determiner/pronoun, singular or plural, e.g. any some
- determiner/pronoun, plural, e.g. these those them
- pronoun, plural + verb "to be", present tense, 3rd person singular, e.g. them's
- determiner, pronoun or double conjunction, e.g. neither either one
- existential there, e.g. there
- existential there + verb "to be", present tense, 3rd person singular, e.g. there's
- existential there + verb "to have", past tense, e.g. there'd
- existential there + verb "to have", present tense, 3rd person singular, e.g. there's
- existential there + modal auxiliary, e.g. there'll there'd
- foreign word: negator, e.g. pas non ne
- foreign word: article, e.g. la le el un die der ein keine eine das las les I
- foreign word: article + noun, singular, common, e.g. l'orchestre l'identite l'arcade l'ange l'assistance l'activite L'Universite l'independance L'Union L'Unita l'osservatore
- foreign word: article + noun, singular, proper, e.g. L'Astree L'Imperiale
- foreign word: verb "to be", infinitive or imperative, e.g. sit
- foreign word: verb "to be", present tense, 2nd person singular or all persons plural, e.g. sind sunt etes
- foreign word: verb "to be", present tense, 3rd person singular, e.g. ist est
- foreign word: conjunction, coordinating, e.g. et ma mais und aber och nec y
- foreign word: numeral, cardinal, e.g. une cinq deux sieben unam zwei
- foreign word: conjunction, subordinating, e.g. bevor quam ma
- foreign word: determiner/pronoun, singular, e.g. hoc
- foreign word: determiner + verb "to be", present tense, 3rd person singular, e.g. c'est
- foreign word: determiner/pronoun, plural, e.g. haec
- foreign word: verb "to have", present tense, not 3rd person singular, e.g. habe
- foreign word: preposition, e.g. ad de en a par con dans ex von auf super post sine sur sub avec per inter sans pour pendant in di
- foreign word: preposition + article, e.g. della des du aux zur d'un del dell'
- foreign word: preposition + noun, singular, common, e.g. d'etat d'hotel d'argent d'identite d'art
- foreign word: preposition + noun, singular, proper, e.g. d'Yquem d'Eiffel
- foreign word: adjective, e.g. avant Espagnol sinfonica Siciliana Philharmonique grand publique haute noire bouffe Douce meme humaine bel serieuses royaux anticus presto Sovietskaya Bayerische comique schwarzen ...
- foreign word: adjective, comparative, e.g. fortiori
- foreign word: adjective, superlative, e.g. optimo
- foreign word: noun, singular, common, e.g. ballet esprit ersatz mano chatte goutte sang Fledermaus oud def kolkhoz roi troika canto boite blutwurst carne muzyka bonheur monde piece force ...
- foreign word: noun, singular, common, genitive, e.g. corporis intellectus arte's dei aeternitatis senioritatis curiae patronne's chambre's
- foreign word: noun, plural, common, e.g. al culpas vopos boites haflis kolkhozes augen tyrannis alpha-beta-gammas metis banditos rata phis negociants crus Einsatzkommandos kamikaze wohaws sabinas zorrillas palazzi engages coureurs corroborees yori Ubermenschen ...
- foreign word: noun, singular, proper, e.g. Karshilama Dieu Rundfunk Afrique Espanol Afrika Spagna Gott Carthago deus
- foreign word: noun, plural, proper, e.g. Svenskarna Atlantes Dieux
- foreign word: noun, singular, adverbial, e.g. heute morgen aujourd'hui hoy
- foreign word: numeral, ordinal, e.g. 18e 17e quintus
- foreign word: pronoun, nominal, e.g. hoc
- foreign word: determiner, possessive, e.g. mea mon deras vos
- foreign word: pronoun, singular, reflexive, e.g. se
- foreign word: pronoun, singular, reflexive + verb, present tense, 3rd person singular, e.g. s'excuse s'accuse
- foreign word: pronoun, personal, accusative, e.g. lui me moi mi
- foreign word: pronoun, personal, accusative + preposition, e.g. mecum tecum
- foreign word: pronoun, personal, nominative, 3rd person singular, e.g. il
- foreign word: pronoun, personal, nominative, not 3rd person singular, e.g. ich vous sie je
- foreign word: pronoun, personal, nominative, not 3rd person singular + verb "to have", present tense, not 3rd person singular, e.g. j'ai
- foreign word: qualifier, e.g. minus
- foreign word: adverb, e.g. bas assai deja um wiederum cito velociter vielleicht simpliciter non zu domi nuper sic forsan olim oui semper tout despues hors
- foreign word: adverb + conjunction, coordinating, e.g. forisque
- foreign word: infinitival to + verb, infinitive, e.g. d'entretenir
- foreign word: interjection, e.g. sayonara bien adieu arigato bonjour adios bueno tchalo ciao
- foreign word: verb, present tense, not 3rd person singular, imperative or infinitive, e.g. nolo contendere vive fermate faciunt esse vade noli tangere dites duces meminisse iuvabit gosaimasu voulez habla ksuu'peli-afo lacheln miuchi say allons strafe portant
- foreign word: verb, past tense, e.g. stabat peccavi audivi
- foreign word: verb, present participle or gerund, e.g. nolens volens appellant seq. obliterans servanda dicendi delenda
- foreign word: verb, past participle, e.g. vue verstrichen rasa verboten engages
- foreign word: verb, present tense, 3rd person singular, e.g. gouverne sinkt sigue diapiace
- foreign word: WH-determiner, e.g. quo qua quod que quok
- foreign word: WH-pronoun, accusative, e.g. quibusdam
- foreign word: WH-pronoun, nominative, e.g. qui
- verb "to have", uninflected present tense, infinitive or imperative, e.g. have hast
- verb "to have", uninflected present tense or imperative, negated, e.g. haven't ain't
- verb "to have", uninflected present tense + infinitival to, e.g. hafta
- verb "to have", past tense, e.g. had
- verb "to have", past tense, negated, e.g. hadn't
- verb "to have", present participle or gerund, e.g. having
- verb "to have", past participle, e.g. had
- verb "to have", present tense, 3rd person singular, e.g. has hath
- verb "to have", present tense, 3rd person singular, negated, e.g. hasn't ain't
- preposition, e.g. of in for by considering to on among at through with under into regarding than since despite according per before toward against as after during including between without except upon out over ...
- preposition, hyphenated pair, e.g. f'ovuh
- preposition + pronoun, personal, accusative, e.g. t'hi-im
- adjective, e.g. recent over-all possible hard-fought favorable hard meager fit such widespread outmoded inadequate ambiguous grand clerical effective orderly federal foster general proportionate ...
- adjective, genitive, e.g. Great's
- adjective, hyphenated pair, e.g. big-large long-far
- adjective, comparative, e.g. greater older further earlier later freer franker wider better deeper firmer tougher faster higher bigger worse younger lighter nicer slower happier frothier Greater newer Elder ...
- adjective + conjunction, coordinating, e.g. lighter'n
- adjective, semantically superlative, e.g. top chief principal northernmost master key head main tops utmost innermost foremost uppermost paramount topmost
- adjective, superlative, e.g. best largest coolest calmest latest greatest earliest simplest strongest newest fiercest unhappiest worst youngest worthiest fastest hottest fittest lowest finest smallest staunchest ...
- modal auxiliary, e.g. should may might will would must can could shall ought need wilt
- modal auxiliary, negated, e.g. cannot couldn't wouldn't can't won't shouldn't shan't mustn't musn't
- modal auxiliary + verb "to have", uninflected form, e.g. shouldda musta coulda must've woulda could've
- modal auxiliary + pronoun, personal, nominative, not 3rd person singular, e.g. willya
- modal auxiliary + infinitival to, e.g. oughta
- noun, singular, common, e.g. failure burden court fire appointment awarding compensation Mayor interim committee fact effect airport management surveillance jail doctor intern extern night weekend duty legislation Tax Office ...
- noun, singular, common, genitive, e.g. season's world's player's night's chapter's golf's football's baseball's club's U.'s coach's bride's bridegroom's board's county's firm's company's superintendent's mob's Navy's ...
- noun, singular, common + verb "to be", present tense, 3rd person singular, e.g. water's camera's sky's kid's Pa's heat's throat's father's money's undersecretary's granite's level's wife's fat's Knife's fire's name's hell's leg's sun's roulette's cane's guy's kind's baseball's ...
- noun, singular, common + verb "to have", past tense, e.g. Pa'd
- noun, singular, common + verb "to have", present tense, 3rd person singular, e.g. guy's Knife's boat's summer's rain's company's
- noun, singular, common + preposition, e.g. buncha
- noun, singular, common + modal auxiliary, e.g. cowhand'd sun'll
- noun, singular, common, hyphenated pair, e.g. stomach-belly
- noun, plural, common, e.g. irregularities presentments thanks reports voters laws legislators years areas adjustments chambers $100 bonds courts sales details raises sessions members congressmen votes polls calls ...
- noun, plural, common, genitive, e.g. taxpayers' children's members' States' women's cutters' motorists' steelmakers' hours' Nations' lawyers' prisoners' architects' tourists' Employers' secretaries' Rogues' ...
- noun, plural, common + modal auxiliary, e.g. duds'd oystchers'll
- noun, singular, proper, e.g. Fulton Atlanta September-October Durwood Pye Ivan Allen Jr. Jan. Alpharetta Grady William B. Hartsfield Pearl Williams Aug. Berry J. M. Cheshire Griffin Opelika Ala. E. Pelham Snodgrass ...
- noun, singular, proper, genitive, e.g. Green's Landis' Smith's Carreon's Allison's Boston's Spahn's Willie's Mickey's Milwaukee's Mays' Howsam's Mantle's Shaw's Wagner's Rickey's Shea's Palmer's Arnold's Broglio's ...
- noun, singular, proper + verb "to be", present tense, 3rd person singular, e.g. W.'s Ike's Mack's Jack's Kate's Katharine's Black's Arthur's Seaton's Buckhorn's Breed's Penny's Rob's Kitty's Blackwell's Myra's Wally's Lucille's Springfield's Arlene's
- noun, singular, proper + verb "to have", present tense, 3rd person singular, e.g. Bill's Guardino's Celie's Skolman's Crosson's Tim's Wally's
- noun, singular, proper + modal auxiliary, e.g. Gyp'll John'll
- noun, plural, proper, e.g. Chases Aderholds Chapelles Armisteads Lockies Carbones French Marskmen Toppers Franciscans Romans Cadillacs Masons Blacks Catholics British Dixiecrats Mississippians Congresses ...
- noun, plural, proper, genitive, e.g. Republicans' Orioles' Birds' Yanks' Redbirds' Bucs' Yankees' Stevenses' Geraghtys' Burkes' Wackers' Achaeans' Dresbachs' Russians' Democrats' Gershwins' Adventists' Negroes' Catholics' ...
- noun, singular, adverbial, e.g. Friday home Wednesday Tuesday Monday Sunday Thursday yesterday tomorrow tonight West East Saturday west left east downtown north northeast southeast northwest North South right ...
- noun, singular, adverbial, genitive, e.g. Saturday's Monday's yesterday's tonight's tomorrow's Sunday's Wednesday's Friday's today's Tuesday's West's Today's South's
- noun, singular, adverbial + modal auxiliary, e.g. today'll
- noun, plural, adverbial, e.g. Sundays Mondays Saturdays Wednesdays Souths Fridays
- numeral, ordinal, e.g. first 13th third nineteenth 2d 61st second sixth eighth ninth twenty-first eleventh 50th eighteenth- Thirty-ninth 72nd 1/20th twentieth mid-19th thousandth 350th sixteenth 701st ...
- pronoun, nominal, e.g. none something everything one anyone nothing nobody everybody everyone anybody anything someone no-one nothin
- pronoun, nominal, genitive, e.g. one's someone's anybody's nobody's everybody's anyone's everyone's
- pronoun, nominal + verb "to be", present tense, 3rd person singular, e.g. nothing's everything's somebody's nobody's someone's
- pronoun, nominal + verb "to have", past tense, e.g. nobody'd
- pronoun, nominal + verb "to have", present tense, 3rd person singular, e.g. nobody's somebody's one's
- pronoun, nominal + modal auxiliary, e.g. someone'll somebody'll anybody'd
- determiner, possessive, e.g. our its his their my your her out thy mine thine
- pronoun, possessive, e.g. ours mine his hers theirs yours
- pronoun, singular, reflexive, e.g. itself himself myself yourself herself oneself ownself
- pronoun, plural, reflexive, e.g. themselves ourselves yourselves
- pronoun, personal, accusative, e.g. them it him me us you 'em her thee we'uns
- pronoun, personal, nominative, 3rd person singular, e.g. it he she thee
- pronoun, personal, nominative, 3rd person singular + verb "to be", present tense, 3rd person singular, e.g. it's he's she's
- pronoun, personal, nominative, 3rd person singular + verb "to have", past tense, e.g. she'd he'd it'd
- pronoun, personal, nominative, 3rd person singular + verb "to have", present tense, 3rd person singular, e.g. it's he's she's
- pronoun, personal, nominative, 3rd person singular + modal auxiliary, e.g. he'll she'll it'll he'd
it'd she'd9[pronoun, personal, nominative, not 3rd person singular e.g.; they we I you ye thou you'uns:ypronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 1st person singular e.g.; I'm Ahm;pronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 2nd person singular or all persons plural e.g.; we're you're they're<wpronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 3rd person singular e.g.; you's=|pronoun, personal, nominative, not 3rd person singular + verb "to be", present tense, 3rd person singular, negated e.g.; taint>pronoun, personal, nominative, not 3rd person singular + verb "to have", uninflected present tense e.g.; I've we've they've you've?qpronoun, personal, nominative, not 3rd person singular + verb "to have", past tense e.g.; I'd you'd we'd they'd@pronoun, personal, nominative, not 3rd person singular + modal auxillary e.g.; you'll we'll I'll we'd I'd they'll they'd you'dAqpronoun, personal, nominative, not 3rd person singular + verb "to verb", uninflected present tense e.g.; y'knowBqualifier, pre e.g.; well less very most so real as highly fundamentally even how much remarkably somewhat more completely too thus ill deeply little overly halfway almost impossibly far severly such ...C/qualifier, post e.g.; indeed enough still 'nuffDadverb e.g.; only often generally also nevertheless upon together back newly no likely meanwhile near then heavily there apparently yet outright fully aside consistently specifically formally ever just ...Eadverb, genitive e.g.; else'sFOadverb + verb "to be", present tense, 3rd person singular e.g.; here's there'sG7adverb + conjunction, coordinating e.g.; well's soon'sHadverb, comparative e.g.; further earlier better later higher tougher more harder longer sooner less faster easier louder farther oftener nearer cheaper slower tighter lower worse heavier quicker ...I=adverb, comparative + conjunction, coordinating e.g.; 
more'nJvadverb, superlative e.g.; most best highest uppermost nearest brightest hardest fastest deepest farthest loudest ...K$adverb, nominal e.g.; here afar thenLMadverb, particle e.g.; up out off down over on in about through across afterM1adverb, particle + preposition e.g.; out'n outtaNinfinitival to e.g.; to t'O5infinitival to + verb, infinitive e.g.; t'jawn t'lahPinterjection e.g.; Hurrah bang whee hmpf ah goodbye oops oh-the-pain-of-it ha crunch say oh why see well hello lo alas tarantara rum-tum-tum gosh hell keerist Jesus Keeeerist boy c'mon 'mon goddamn bah hoo-pig damn ...Qverb, base: uninflected present, imperative or infinitive e.g.; investigate find act follow inure achieve reduce take remedy re-set distribute realize disable feel receive continue place protect eliminate elaborate work permit run enter force ...RDverb, base: uninflected present or infinitive + article e.g.; wannaSUverb, base: uninflected present, imperative or infinitive + preposition e.g.; lookitTUverb, base: uninflected present, imperative or infinitive + adjective e.g.; die-deadUXverb, uninflected present tense + pronoun, personal, accusative e.g.; let's lemme gimmeV8verb, imperative + adverbial particle e.g.; g'ahn c'monW^verb, base: uninflected present, imperative or infinitive + infinitival to e.g.; wanta wannaXZverb, base: uninflected present, imperative or infinitive; hypenated pair e.g.; say-speakYverb, past tense e.g.; said produced took recommended commented urged found added praised charged listed became announced brought attended wanted voted defeated received got stood shot scheduled feared promised made ...Zverb, present participle or gerund e.g.; modernizing improving purchasing Purchasing lacking enabling pricing keeping getting picking entering voting warning making strengthening setting neighboring attending participating moving ...[6verb, present participle + infinitival to e.g.; gonna\verb, past participle e.g.; conducted charged won received studied revised 
operated accepted combined experienced recommended effected granted seen protected adopted retarded notarized selected composed gotten printed ...]3verb, past participle + infinitival to e.g.; gotta^verb, present tense, 3rd person singular e.g.; deserves believes receives takes goes expires says opposes starts permits expects thinks faces votes teaches holds calls fears spends collects backs eliminates sets flies gives seeks reads ..._EWH-determiner e.g.; which what whatever whichever whichever-the-hell`fWH-determiner + verb "to be", present tense, 2nd person singular or all persons plural e.g.; what'reaWH-determiner + verb "to be", present, 2nd person singular or all persons plural + pronoun, personal, nominative, not 3rd person singular e.g.; whaddyabNWH-determiner + verb "to be", present tense, 3rd person singular e.g.; what'scWH-determiner + verb "to do", uninflected present tense + pronoun, personal, nominative, not 3rd person singular e.g.; whaddyad6WH-determiner + verb "to do", past tense e.g.; what'dePWH-determiner + verb "to have", present tense, 3rd person singular e.g.; what'sf)WH-pronoun, genitive e.g.; whose whoseverg*WH-pronoun, accusative e.g.; whom that whohHWH-pronoun, nominative e.g.; that who whoever whosoever what whatsoeveriXWH-pronoun, nominative + verb "to be", present, 3rd person singular e.g.; that's who'sj@WH-pronoun, nominative + verb "to have", past tense e.g.; who'dk`WH-pronoun, nominative + verb "to have", present tense, 3rd person singular e.g.; who's that'slKWH-pronoun, nominative + modal auxillary e.g.; who'll that'd who'd that'llmWH-qualifier e.g.; however hownWH-adverb e.g.; however when where why whereby wherever how whenever whereon wherein wherewith wheare wherefore whereof howsabouto]WH-adverb + verb "to be", present, 2nd person singular or all persons plural e.g.; where'repKWH-adverb + verb "to be", present, 3rd person singular e.g.; how's where'sqGWH-adverb + verb "to do", present, not 3rd person singular e.g.; 
howdar9WH-adverb + verb "to do", past tense e.g.; where'd how'ds;WH-adverb + verb "to do", past tense, negated e.g.; whyn'ttIWH-adverb + verb "to do", present tense, 3rd person singular e.g.; how'su#WH-adverb + preposition e.g.; why'nv)WH-adverb + modal auxillary e.g.; where'dwUnknown.y Noun Phrase.z Verb Phrase.{Prepositional Phrase.|Clause.}Out not a chunk.~+Parse a Brown corpus into TagagedSentences.Order matters here: The patterns are replaced in reverse order when generating tags, and in top-to-bottom when generating tags.      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~w   DHLNPQYZ\^_fn   !"#$%&'()*+,-./0123456789:;<=>?@ABCEFGIJKMORSTUVWX[]`abcdeghijklmopqrstuvxyz{|}~      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~None-Find a clause in a larger collection of text.A clause is defined by the I extractor, and is a Noun Phrase followed (immediately) by a Verb PhraseEfindClause skips over leading tokens, if needed, to locate a clause.,Find a Noun Phrase followed by a Verb Phrase"Part-of-Speech tagging facilities.Rogan Creswick, 2014creswick@gmail.com experimentalNone A basic POS tagger.>A POS tagger that has been trained on the Conll 2000 POS tags.5A POS tagger trained on a subset of the Brown corpus.The default table of tagger IDs to readTagger functions. Each tagger packaged with Chatter should have an entry here. By convention, the IDs use are the fully qualified module name of the tagger package.Store a POSTager to a file.!Load a tagger, using the interal X. If you need to specify your own mappings for new composite taggers, you should use .This function checks the filename to determine if the content should be decompressed. 
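The filename check described here is simple to sketch. The helper below is a self-contained illustration, not chatter's actual implementation; `shouldDecompress` is a hypothetical name for the test that loadTagger is documented to perform on the file extension.

```haskell
module Main where

import Data.List (isSuffixOf)

-- Hypothetical helper mirroring the documented behavior: decide whether
-- a serialized model should be gunzipped, based purely on its file name.
shouldDecompress :: FilePath -> Bool
shouldDecompress path = ".gz" `isSuffixOf` path

main :: IO ()
main = do
  print (shouldDecompress "brown.model.gz")
  print (shouldDecompress "brown.model")
```
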
If the file ends with ".gz", then we assume it is a gzipped model.

Tag a chunk of input text with part-of-speech tags, using the sentence splitter, tokenizer, and tagger contained in the POSTagger.

Tag the tokens in a string. Returns a space-separated string of tokens, each token suffixed with the part of speech. For example:

  tag tagger "the dog jumped ."
  "the/at dog/nn jumped/vbd ./."

Text version of tagStr.

Train a tagger on string input in the standard form for POS-tagged corpora:

  trainStr tagger "the/at dog/nn jumped/vbd ./."

The Text version of trainStr.

Train a POSTagger on a corpus of sentences.

This will recurse through the tagger stack, training all the backoff taggers as well. In order to do that, this function has to be generic to the kind of taggers used, so it is not possible to train up a new POSTagger from nothing: train wouldn't know what tagger to create.

To get around that restriction, you can use the various mkTagger implementations, such as NLP.POS.AvgPerceptronTagger.mkTagger. For example:

  import NLP.POS.AvgPerceptronTagger as APT

  let newTagger = APT.mkTagger APT.emptyPerceptron Nothing
  posTgr <- train newTagger trainingExamples

Evaluate a POSTagger.

Measures accuracy over all tags in the test corpus.

Accuracy is calculated as: |tokens tagged correctly| / |all tokens|

Phrase Chunking facilities.
Rogan Creswick, 2014; creswick@gmail.com; experimental

A basic Phrasal chunker.

Convenience function to load the Conll2000 chunker.

Train a chunker on a set of additional examples.

Chunk a TaggedSentence that has been produced by a Chatter tagger, producing a rich representation of the Chunks and the Tags detected.

If you just want to see chunked output from standard text, you probably want chunkText or chunkStr.

Convenience function to tokenize, POS-tag, then chunk the provided text, and format the result in an easy-to-read format.

  > tgr <- defaultTagger
  > chk <- defaultChunker
  > chunkText tgr chk "The brown dog jumped over the lazy cat."
"[NP The/DT brown/NN dog/NN] [VP jumped/VBD] [NP over/IN the/DT lazy/JJ cat/NN] ./."A wrapper around  that packs strings.The default table of tagger IDs to readTagger functions. Each tagger packaged with Chatter should have an entry here. By convention, the IDs use are the fully qualified module name of the tagger package.Store a L to disk.Load a LG from disk, optionally gunzipping if needed. (based on file extension)    None5?Different classes of Named Entity used in the WikiNER data set."out" not a chunk.vConvert wikiNer format to basic IOB (one token perline, space separated tags, and a blank line between each sentence)ITranlsate a WikiNER sentence into a list of IOB-lines, for parsing with %Train a chunker on a provided corpus.   !"#$%&'()*+,-./0123456789:;<=>?@ABBCCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abbccdefghhijklmnopqrstuvwxyz{|}~DchP                  ! " # $ % & ' ( ) * + , - . / 0 1 2  /  3 4 4 5 6 7 8 9 : : ; < = > ? @ 3 A B C D E F G H I J K L M N O P Q R S T U U V W X Y Z [ \ ] ^ _``abccdefghijklmnopqrs(&tuvwxyz{|}~/ 312_D      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRhSTUVW  XYZ[\]^_`ab1cdefg3hij3]klmnobhWpqr  stuvwxyz{|}~                      %chatter-0.9.0.0-GxHYkjdmawfoWNDKFHoysData.DefaultMapNLP.Corpora.EmailNLP.Types.GeneralNLP.Types.TagsNLP.Types.Tree NLP.Types.IOBNLP.Tokenize.ChatterNLP.Corpora.Conll NLP.TypesNLP.POS.LiteralTaggerNLP.POS.UnambiguousTaggerNLP.ML.AvgPerceptronNLP.Chunk.AvgPerceptronChunkerNLP.Similarity.VectorSimNLP.Extraction.ParsecNLP.Corpora.ParsingNLP.POS.AvgPerceptronTaggerNLP.Corpora.Brown&NLP.Extraction.Examples.ParsecExamplesNLP.POS NLP.ChunkNLP.Corpora.WikiNerData.Mapfoldl Paths_chatter serialize taggerTable readTagger Data.TextwordslinesmkTagger DefaultMapDefMap defDefaultdefMapemptylookupfromListkeyselemsmap unionWith$fArbitraryDefaultMap$fNFDataDefaultMap$fSerializeDefaultMap$fReadDefaultMap$fShowDefaultMap$fEqDefaultMap$fGenericDefaultMap plugDataPathplugArchiveTextplugArchiveTokensfullPlugArchivereadF 
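The slash-separated tagged format that appears throughout these docs ("the/at dog/nn jumped/vbd ./.") round-trips cleanly if each word is split on its last slash, so the token "." with tag "." survives. The sketch below is self-contained and hypothetical (`showTagged`/`parseTagged` are not chatter's readPOS), just illustrating the format.

```haskell
module Main where

-- A token paired with its POS tag, as in "the/at dog/nn jumped/vbd ./."
type Tagged = (String, String)

-- Render a tagged sentence in the slash-separated format.
showTagged :: [Tagged] -> String
showTagged = unwords . map (\(tok, tg) -> tok ++ "/" ++ tg)

-- Parse one "token/tag" word, splitting on the *last* slash so that
-- tokens which themselves contain '/' survive the round trip.
parseWord :: String -> Tagged
parseWord w = case break (== '/') (reverse w) of
  (gt, '/' : kot) -> (reverse kot, reverse gt)
  _               -> (w, "Unk")  -- no slash: fall back to an unknown tag

parseTagged :: String -> [Tagged]
parseTagged = map parseWord . words

main :: IO ()
main = do
  print (parseTagged "the/at dog/nn jumped/vbd ./.")
  putStrLn (showTagged (parseTagged "the/at dog/nn jumped/vbd ./."))
```
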