tokenizer-0.1.0.0
(c) Lev Dvorkin, 2022. License: MIT. Maintainer: lev_135@mail.ru.
Stability: Experimental.

Text.Tokenizer.BlackWhiteSet
----------------------------

BlackWhiteSet c
  Either a "white set" of allowed elements or a "black set" of forbidden
  ones. Constructors: WhiteSet, BlackSet.

singleton
  Make a BlackWhiteSet containing a single symbol.

intersection
  Intersect two BlackWhiteSets.

isEmpty
  Check whether a BlackWhiteSet is empty.
  NB: the number of all possible elements is assumed to be too large for a
  BlackSet ever to be empty.

member
  Check whether a symbol is a member of a BlackWhiteSet:

  >>> member 'a' (WhiteSet (S.fromList ['a', 'b']))
  True
  >>> member 'a' (BlackSet (S.fromList ['a', 'b']))
  False

Text.Tokenizer.Types
--------------------

RToken k c
  Type for internal needs. Contains an autogenerated TokId; the
  restrictions behind the token are stored reversed.
    tokId: the token's unique id (generated automatically)
    rbehind, ahead: constraints on symbols behind/ahead of the matchable part
    body: the matchable part of the string

TokId
  Type synonym for a token id.

Token k c
  A token with a name of type k (used for uniqueness error messages and
  tokenizing output) over char type c.
    name: the name of the token
    behind, ahead: restrictions on symbols before/after the matchable part.
      NB: they are assumed to be satisfied if there are no symbols
      before/after the matched part, respectively.
    body: the matchable sequence of char sets, with possible repetitions

Repeatable c
  A BlackWhiteSet that can be repeated. Selectors: getCnt, getBWS.

Count
  The number of symbols acceptable by a Repeatable. Constructors: One, Some.

Alt
  Type synonym for the list monad, used as a collection of alternatives.

makeRToken
  Construct an RToken from a Token and its id.

The Eq and Ord instances for RToken compare by token id.
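The BlackWhiteSet operations documented above can be modeled with a short, self-contained sketch. This is an assumed reimplementation of the documented semantics, not the package's actual code; the constructor and function names mirror the documentation.

```haskell
-- Minimal model of the documented BlackWhiteSet semantics (an assumption,
-- not the package's real implementation).
import qualified Data.Set as S

data BlackWhiteSet c
  = WhiteSet (S.Set c)  -- only these symbols are allowed
  | BlackSet (S.Set c)  -- these symbols are forbidden
  deriving (Eq, Ord, Show)

-- | A set containing exactly one symbol.
singleton :: c -> BlackWhiteSet c
singleton = WhiteSet . S.singleton

-- | A white set lists allowed symbols; a black set lists forbidden ones.
member :: Ord c => c -> BlackWhiteSet c -> Bool
member c (WhiteSet s) = c `S.member` s
member c (BlackSet s) = not (c `S.member` s)

-- | Intersection of two restrictions.
intersection :: Ord c => BlackWhiteSet c -> BlackWhiteSet c
             -> BlackWhiteSet c
intersection (WhiteSet a) (WhiteSet b) = WhiteSet (S.intersection a b)
intersection (WhiteSet a) (BlackSet b) = WhiteSet (a S.\\ b)
intersection (BlackSet a) (WhiteSet b) = WhiteSet (b S.\\ a)
intersection (BlackSet a) (BlackSet b) = BlackSet (S.union a b)

-- | Only a white set can be observably empty: the symbol universe is
-- assumed large, so a black set always admits something.
isEmpty :: BlackWhiteSet c -> Bool
isEmpty (WhiteSet s) = S.null s
isEmpty (BlackSet _) = False

main :: IO ()
main = do
  print (member 'a' (WhiteSet (S.fromList "ab")))  -- True
  print (member 'a' (BlackSet (S.fromList "ab")))  -- False
  -- Intersecting a white set with a black set over the same symbols
  -- leaves nothing:
  print (isEmpty (intersection (WhiteSet (S.fromList "ab"))
                               (BlackSet (S.fromList "ab"))))  -- True
```

Note how `isEmpty` reflects the documented caveat: a BlackSet is never reported empty, so emptiness can only arise from a depleted WhiteSet.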
Text.Tokenizer.Split
--------------------

TokenizeError k c
  An error during tokenizing. Wherever the type [(k, [c])] appears, it
  holds a list of pairs of a token's name and the part of the string
  matched by it. Constructors:
    NoWayTokenize: the position of the first character that cannot be
      tokenized, and the part of the string successfully tokenized (the
      longest among all attempts).
    TwoWaysTokenize: the length of the uniquely tokenized prefix, the
      first way to tokenize, and the second way to tokenize.

TokenizeMap k c
  Auxiliary structure for tokenizing. Should be used as an opaque type,
  initialized by makeTokenizeMap and combined via its Semigroup instance.

singleTokMap
  Make a TokenizeMap with a single element.

insert
  Insert a Token into a TokenizeMap.

makeTokenizeMap
  Create the auxiliary map for tokenizing. Should be called once, at
  initialization.

tokenize
  Split a list of symbols into tokens.

Text.Tokenizer.Uniqueness
-------------------------

ConflictTokens k c
  Two ways of tokenizing the same string, demonstrating non-uniqueness.
  Selectors: tokList1, tokList2.

Div k c
  The result of a division step. Schematically:

      rtoks    | lastTok |
    -----------|---------|~~~~~
      rprefToks          |  suff (the remaining part):
    -----|-----|---------|
      behind   |      current       | ahead
    -----------|====================|~~~~~

    rtoks: tokens in the main sequence, except the last one
    lastTok: the last token in the main sequence
    rprefToks: tokens in the alternative sequence
    processed: processed symbols
    suff: the remaining suffix

Suff c
  A dangling suffix.
    srbeh: symbols behind the suffix. Note that only maxBehind symbols
      are preserved.
    scur: symbols from the suffix's body
    sahead: symbols ahead of the suffix

Rem
  The remainder after merging two lists. Constructors:
    Rem1: the first list's remainder. May be empty if there is none.
    Rem2: the second list's remainder. Always nonempty.

checkUniqueTokenizing
  Check that no list of symbols can be decomposed in two different ways
  into the tokens from the given list.

Text.Tokenizer
--------------
  The umbrella module, re-exporting the public API of the submodules
  (tokenize, makeTokenizeMap, checkUniqueTokenizing, and the supporting
  types).
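The non-uniqueness that checkUniqueTokenizing detects can be illustrated with a toy, brute-force decomposer. This sketch is not the package's Div/Suff-based algorithm: tokens are simplified to plain named strings (no behind/ahead constraints), and all decompositions are enumerated directly.

```haskell
-- Toy illustration of tokenization ambiguity (an assumption for
-- exposition; the real library works with constrained char-set tokens).
import Data.List (stripPrefix)

-- | All ways to split an input into (name, body) tokens.
decompositions :: [(String, String)] -> String -> [[(String, String)]]
decompositions _    [] = [[]]
decompositions toks s =
  [ (name, body) : rest
  | (name, body) <- toks
  , Just s' <- [stripPrefix body s]   -- token matches a prefix of s
  , rest <- decompositions toks s'    -- decompose the remainder
  ]

main :: IO ()
main = do
  let toks = [("AB", "ab"), ("A", "a"), ("B", "b")]
  -- "ab" splits both as [AB] and as [A, B], so tokenization over this
  -- token list is not unique; a checker must report the conflict.
  mapM_ print (decompositions toks "ab")
```

With this token list the string "ab" yields exactly two decompositions, which is the kind of conflicting pair a ConflictTokens value records.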