| Copyright | (c) Lev Dvorkin 2022 |
|---|---|
| License | MIT |
| Maintainer | lev_135@mail.ru |
| Stability | Experimental |
| Safe Haskell | None |
| Language | Haskell2010 |
Text.Tokenizer.Split
Description
This module provides a simple tokenizing algorithm.
Synopsis
- data TokenizeMap k c = TokenizeMap {}
- singleTokMap :: Ord c => Token k c -> TokenizeMap k c
- insert :: Ord c => Token k c -> TokenizeMap k c -> TokenizeMap k c
- makeTokenizeMap :: Ord c => [Token k c] -> TokenizeMap k c
- data TokenizeError k c
- = NoWayTokenize Int [(k, [c])]
- | TwoWaysTokenize Int [(k, [c])] [(k, [c])]
- tokenize :: forall k c. Ord c => TokenizeMap k c -> [c] -> Either (TokenizeError k c) [(k, [c])]
Documentation
data TokenizeMap k c Source #
Auxiliary structure for tokenizing. It should be treated as an opaque type:
initialize it with makeTokenizeMap and combine maps via the Semigroup instance.
Constructors
| TokenizeMap | |
Instances
| (Show c, Show k) => Show (TokenizeMap k c) Source # | |
Defined in Text.Tokenizer.Split Methods showsPrec :: Int -> TokenizeMap k c -> ShowS # show :: TokenizeMap k c -> String # showList :: [TokenizeMap k c] -> ShowS # | |
| Ord c => Semigroup (TokenizeMap k c) Source # | |
Defined in Text.Tokenizer.Split Methods (<>) :: TokenizeMap k c -> TokenizeMap k c -> TokenizeMap k c # sconcat :: NonEmpty (TokenizeMap k c) -> TokenizeMap k c # stimes :: Integral b => b -> TokenizeMap k c -> TokenizeMap k c # | |
| Ord c => Monoid (TokenizeMap k c) Source # | |
Defined in Text.Tokenizer.Split Methods mempty :: TokenizeMap k c # mappend :: TokenizeMap k c -> TokenizeMap k c -> TokenizeMap k c # mconcat :: [TokenizeMap k c] -> TokenizeMap k c # | |
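As an illustrative sketch of what the Semigroup and Monoid instances are for (assuming a hypothetical smart constructor `tok :: k -> [c] -> Token k c`; the real way to build a Token is defined elsewhere in this package), maps for separate token groups can be built independently and merged:

```haskell
-- 'tok' is an assumed helper, not part of this module's documented API.
digits, letters, both :: TokenizeMap String Char
digits  = makeTokenizeMap [tok "digit"  [c] | c <- ['0' .. '9']]
letters = makeTokenizeMap [tok "letter" [c] | c <- ['a' .. 'z']]

-- The Semigroup instance merges the two auxiliary maps into one,
-- so tokenize can recognize tokens from either group:
both = digits <> letters
```

Because there is also a Monoid instance, a list of maps can likewise be combined with `mconcat`.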
singleTokMap :: Ord c => Token k c -> TokenizeMap k c Source #
Make a TokenizeMap containing a single token.
insert :: Ord c => Token k c -> TokenizeMap k c -> TokenizeMap k c Source #
Insert a Token into a TokenizeMap.
makeTokenizeMap :: Ord c => [Token k c] -> TokenizeMap k c Source #
Create the auxiliary map used for tokenizing. It should be called once, during initialization.
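A minimal construction sketch (again assuming a hypothetical `tok :: k -> [c] -> Token k c` helper, since building a Token is documented elsewhere in the package):

```haskell
-- Build the auxiliary map once, up front, from all known tokens.
-- Here k = String (token names) and c = Char (input symbols).
tokMap :: TokenizeMap String Char
tokMap = makeTokenizeMap
  [ tok "ab" "ab"  -- a two-symbol token named "ab"
  , tok "a"  "a"   -- a one-symbol token named "a"
  , tok "b"  "b"   -- a one-symbol token named "b"
  ]
```

Note that with this set the input "ab" is ambiguous (one token "ab" versus the pair "a", "b"), which is exactly the situation TwoWaysTokenize below reports.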
data TokenizeError k c Source #
An error encountered during tokenizing.
Wherever the [(k, [c])] type appears, it stores a list of pairs of a token's
name and the part of the string matched by that token.
Constructors
| NoWayTokenize | |
Fields
| |
| TwoWaysTokenize | |
Fields
| |
Instances
| (Eq k, Eq c) => Eq (TokenizeError k c) Source # | |
Defined in Text.Tokenizer.Split Methods (==) :: TokenizeError k c -> TokenizeError k c -> Bool # (/=) :: TokenizeError k c -> TokenizeError k c -> Bool # | |
| (Show k, Show c) => Show (TokenizeError k c) Source # | |
Defined in Text.Tokenizer.Split Methods showsPrec :: Int -> TokenizeError k c -> ShowS # show :: TokenizeError k c -> String # showList :: [TokenizeError k c] -> ShowS # | |
tokenize :: forall k c. Ord c => TokenizeMap k c -> [c] -> Either (TokenizeError k c) [(k, [c])] Source #
Split a list of symbols into tokens.
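A sketch of consuming the result, covering both error constructors. It assumes a `tokMap :: TokenizeMap String Char` built as above, and it assumes the Int field of each error is the position in the input where tokenizing failed or diverged (the field documentation above does not spell this out):

```haskell
-- Run the tokenizer and render the outcome as a human-readable string.
report :: TokenizeMap String Char -> String -> String
report tm input = case tokenize tm input of
  Right toks ->
    -- Success: each token name paired with the part of the input it matched.
    "tokens: " ++ show toks
  Left (NoWayTokenize pos done) ->
    -- No token matches; 'done' holds the pairs tokenized before the failure.
    "no token matches at position " ++ show pos
      ++ "; tokenized so far: " ++ show done
  Left (TwoWaysTokenize pos way1 way2) ->
    -- Ambiguity: two distinct tokenizations of the same input.
    "ambiguous at position " ++ show pos
      ++ ": " ++ show way1 ++ " vs " ++ show way2
```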