| Copyright | ©2019 James Brock | 
|---|---|
| License | BSD2 | 
| Maintainer | James Brock <jamesbrock@gmail.com> | 
| Safe Haskell | Safe-Inferred | 
| Language | Haskell2010 | 
Replace.Attoparsec.Text
Contents
Description
Replace.Attoparsec is for finding text patterns, and also replacing or splitting on the found patterns. This activity is traditionally done with regular expressions, but Replace.Attoparsec uses Data.Attoparsec parsers instead for the pattern matching.
Replace.Attoparsec can be used in the same sort of “pattern capture” or “find all” situations in which one would use Python re.findall, or Perl m//, or Unix grep.
Replace.Attoparsec can be used in the same sort of “stream editing” or “search-and-replace” situations in which one would use Python re.sub, or Perl s///, or Unix sed, or awk.
Replace.Attoparsec can be used in the same sort of “string splitting” situations in which one would use Python re.split or Perl split.
See the replace-attoparsec package README for usage examples.
Synopsis
- breakCap :: Parser a -> Text -> Maybe (Text, a, Text)
 - splitCap :: Parser a -> Text -> [Either Text a]
 - streamEdit :: Parser a -> (a -> Text) -> Text -> Text
 - streamEditT :: Applicative m => Parser a -> (a -> m Text) -> Text -> m Text
 - anyTill :: Parser a -> Parser (Text, a)
 - sepCap :: Parser a -> Parser [Either Text a]
 - findAll :: Parser a -> Parser [Either Text Text]
 - findAllCap :: Parser a -> Parser [Either Text (Text, a)]
 
Running parser
Functions in this section are ways to run parsers
 (like parse). They take
 as arguments a sep parser and some input, run the parser on the input,
 and return a result.
Arguments
| :: Parser a | The pattern matching parser   | 
| -> Text | The input stream of text  | 
| -> Maybe (Text, a, Text) | Maybe (prefix, parse_result, suffix)  | 
Break on and capture one pattern
Find the first occurence of a pattern in a text stream, capture the found pattern, and break the input text stream on the found pattern.
The breakCap function is like takeWhile, but can be predicated
 beyond more than just the next one token. It's also like breakOn,
 but the needle can be a pattern instead of a constant string.
Be careful not to look too far
 ahead; if the sep parser looks to the end of the input then breakCap
 could be O(n²).
The pattern parser sep may match a zero-width pattern (a pattern which
 consumes no parser input on success).
Output
Nothingwhen no pattern match was found.Just (prefix, parse_result, suffix)for the result of parsing the pattern match, and theprefixstring before and thesuffixstring after the pattern match.prefixandsuffixmay be zero-length strings.
Access the matched section of text
If you want to capture the matched string, then combine the pattern
 parser sep with match.
With the matched string, we can reconstruct the input string.
 For all input, sep, if
let (Just(prefix, (infix, _), suffix)) = breakCap (matchsep) input
then
input == prefix<>infix<>suffix
Arguments
| :: Parser a | The pattern matching parser   | 
| -> Text | The input stream of text  | 
| -> [Either Text a] | List of matching and non-matching input sections  | 
Split on and capture all patterns
Find all occurences of the pattern sep, split the input string, capture
 all the patterns and the splits.
The input string will be split on every leftmost non-overlapping occurence
 of the pattern sep. The output list will contain
 the parsed result of input string sections which match the sep pattern
 in Right, and non-matching sections in Left.
splitCap depends on sepCap, see sepCap for more details.
Access the matched section of text
If you want to capture the matched strings, then combine the pattern
 parser sep with match.
With the matched strings, we can reconstruct the input string.
 For all input, sep, if
let output = splitCap (match sep) input
then
input ==mconcat(secondfst<$>output)
Arguments
| :: Parser a | The pattern matching parser   | 
| -> (a -> Text) | The   | 
| -> Text | The input stream of text to be edited  | 
| -> Text | The edited input stream  | 
Stream editor
Also known as “find-and-replace”, or “match-and-substitute”. Finds all
 of the sections of the stream which match the pattern sep, and replaces
 them with the result of the editor function.
Access the matched section of text in the editor
If you want access to the matched string in the editor function,
 then combine the pattern parser sep
 with match. This will effectively change
 the type of the editor function to (Text,a) -> Text.
This allows us to write an editor function which can choose to not
 edit the match and just leave it as it is. If the editor function
 returns the first item in the tuple, then streamEdit will not change
 the matched string.
So, for all sep:
streamEdit (matchsep)fst≡id
Arguments
| :: Applicative m | |
| => Parser a | The pattern matching parser   | 
| -> (a -> m Text) | The   | 
| -> Text | The input stream of text to be edited  | 
| -> m Text | The edited input stream  | 
Stream editor
Monad transformer version of streamEdit.
The editor function will run in the underlying monad context.
If you want to do IO operations in the editor function then
 run this in IO.
If you want the editor function to remember some state,
 then run this in a stateful monad.
Parser combinator
Functions in this section are parser combinators. They take
 a sep parser for an argument, combine sep with another parser,
 and return a new parser.
Specialized manyTill_
Parser combinator to consume and capture input until the sep pattern
 matches, equivalent to
 .
 On success, returns the prefix before the pattern match and the parsed match.manyTill_ anyChar sep
sep may be a zero-width parser, it may succeed without consuming any
 input.
This combinator will produce a parser which acts
 like takeTill but is predicated beyond more than
 just the next one token. It is also like
 takeTill in that it is a “high performance” parser.
Separate and capture
Parser combinator to find all of the non-overlapping ocurrences
 of the pattern sep in a text stream.
 The sepCap parser will always consume its entire input and can never fail.
sepCap is similar to the sep* family of functions found in
 parser-combinators
 and
 parsers,
 but it returns the parsed result of the sep parser instead
 of throwing it away.
Output
The input stream is separated and output into a list of sections:
- Sections which can parsed by the pattern 
sepwill be parsed and captured asRight - Non-matching sections of the stream will be captured in 
Left. 
The output list also has these properties:
- If the input is 
""then the output list will be[]. - If there are no pattern matches, then
   the entire input stream will be returned as one non-matching 
Leftsection. - The output list will not contain two consecutive 
Leftsections. 
Zero-width matches forbidden
If the pattern matching parser sep would succeed without consuming any
 input then sepCap will force it to fail.
 If we allow sep to match a zero-width pattern,
 then it can match the same zero-width pattern again at the same position
 on the next iteration, which would result in an infinite number of
 overlapping pattern matches.
Deprecated: replace with `findAll sep = (fmap.fmap) (second fst) $ sepCap (match sep)`
Find all occurences
Parser combinator for finding all occurences of a pattern in a stream.
Will call sepCap with the match combinator and
 return the text which matched the pattern parser sep in
 the Right sections.
Definition:
findAll sep = (fmap.fmap) (secondfst) $sepCap(matchsep)
Deprecated: replace with `findAllCap sep = sepCap (match sep)`
Find all occurences, parse and capture pattern matches
Parser combinator for finding all occurences of a pattern in a stream.
Will call sepCap with the match combinator so that
 the text which matched the pattern parser sep will be returned in
 the Right sections, along with the result of the parse of sep.
Definition:
findAllCap sep =sepCap(matchsep)