| Copyright | ©2019 James Brock | 
|---|---|
| License | BSD2 | 
| Maintainer | James Brock <jamesbrock@gmail.com> | 
| Safe Haskell | Safe-Inferred | 
| Language | Haskell2010 | 
Replace.Megaparsec
Contents
Description
Replace.Megaparsec is for finding text patterns, and also replacing or splitting on the found patterns. This activity is traditionally done with regular expressions, but Replace.Megaparsec uses Text.Megaparsec parsers instead for the pattern matching.
Replace.Megaparsec can be used in the same sort of “pattern capture” or “find all” situations in which one would use Python re.findall, or Perl m//, or Unix grep.
Replace.Megaparsec can be used in the same sort of “stream editing” or “search-and-replace” situations in which one would use Python re.sub, or Perl s///, or Unix sed, or awk.
Replace.Megaparsec can be used in the same sort of “string splitting” situations in which one would use Python re.split or Perl split.
See the replace-megaparsec package README for usage examples.
Type constraints
output stream type Tokens s = input stream type s
All functions in the Running Parser section require the type of the
 stream of text that is input to be
 
 such that
 Stream s,
 because we want to output the same type of stream that was input.
 That requirement is satisfied for all the Tokens s ~ sStream instances
 included with Text.Megaparsec:
Custom error type e should be Void
Megaparsec parsers have a custom error data component e. When writing parsers
 to be used by this module, the custom error type e should usually
 be Void, because every function in this module expects a parser
 failure to occur on every token in a non-matching section of the input
 stream, so parser failure error descriptions are not returned, and you'll
 never see the custom error information.
Special fast input types
Functions in this module will be “fast” when the input stream
 type s is:
We mean “fast” in the same sense as MonadParsec:
 when returning subsections of the input stream,
 we return slices of the input stream data, rather than constructing a list
 of tokens and then building a new stream subsection from that list.
 This relies on implementation details of the stream representation,
 so there are specialization re-write rules in this module to make
 that possible without adding new typeclasses.
Synopsis
- breakCap :: forall e s a. (Ord e, Stream s, Tokens s ~ s) => Parsec e s a -> s -> Maybe (s, a, s)
 - breakCapT :: forall m e s a. (Ord e, Stream s, Tokens s ~ s, Monad m) => ParsecT e s m a -> s -> m (Maybe (s, a, s))
 - splitCap :: forall e s a. (Ord e, Stream s, Tokens s ~ s) => Parsec e s a -> s -> [Either s a]
 - splitCapT :: forall e s m a. (Ord e, Stream s, Tokens s ~ s, Monad m) => ParsecT e s m a -> s -> m [Either s a]
 - streamEdit :: forall e s a. (Ord e, Stream s, Monoid s, Tokens s ~ s) => Parsec e s a -> (a -> s) -> s -> s
 - streamEditT :: forall e s m a. (Ord e, Stream s, Monad m, Monoid s, Tokens s ~ s) => ParsecT e s m a -> (a -> m s) -> s -> m s
 - anyTill :: forall e s m a. MonadParsec e s m => m a -> m (Tokens s, a)
 - sepCap :: forall e s m a. MonadParsec e s m => m a -> m [Either (Tokens s) a]
 - findAll :: MonadParsec e s m => m a -> m [Either (Tokens s) (Tokens s)]
 - findAllCap :: MonadParsec e s m => m a -> m [Either (Tokens s) (Tokens s, a)]
 
Running parser
Functions in this section are ways to run parsers
 (like runParser). They take
 as arguments a sep parser and some input, run the parser on the input,
 and return a result.
Arguments
| :: forall e s a. (Ord e, Stream s, Tokens s ~ s) | |
| => Parsec e s a | The pattern matching parser   | 
| -> s | The input stream of text  | 
| -> Maybe (s, a, s) | Maybe (prefix, parse_result, suffix)  | 
Break on and capture one pattern
Find the first occurence of a pattern in a text stream, capture the found pattern, and break the input text stream on the found pattern.
The breakCap function is like takeWhile, but can be predicated
 beyond more than just the next one token. It's also like breakOn,
 but the needle can be a pattern instead of a constant string.
Be careful not to look too far
 ahead; if the sep parser looks to the end of the input then breakCap
 could be O(n²).
The pattern parser sep may match a zero-width pattern (a pattern which
 consumes no parser input on success).
Output
Nothingwhen no pattern match was found.Just (prefix, parse_result, suffix)for the result of parsing the pattern match, and theprefixstring before and thesuffixstring after the pattern match.prefixandsuffixmay be zero-length strings.
Access the matched section of text
If you want to capture the matched string, then combine the pattern
 parser sep with match.
With the matched string, we can reconstruct the input string.
 For all input, sep, if
let (Just(prefix, (infix, _), suffix)) = breakCap (matchsep) input
then
input == prefix<>infix<>suffix
Arguments
| :: forall m e s a. (Ord e, Stream s, Tokens s ~ s, Monad m) | |
| => ParsecT e s m a | The pattern matching parser   | 
| -> s | The input stream of text  | 
| -> m (Maybe (s, a, s)) | Maybe (prefix, parse_result, suffix)  | 
Break on and capture one pattern
Monad transformer version of breakCap.
The parser sep will run in the underlying monad context.
Arguments
| :: forall e s a. (Ord e, Stream s, Tokens s ~ s) | |
| => Parsec e s a | The pattern matching parser   | 
| -> s | The input stream of text  | 
| -> [Either s a] | List of matching and non-matching input sections.  | 
Split on and capture all patterns
Find all occurences of the pattern sep, split the input string, capture
 all the patterns and the splits.
The input string will be split on every leftmost non-overlapping occurence
 of the pattern sep. The output list will contain
 the parsed result of input string sections which match the sep pattern
 in Right, and non-matching sections in Left.
splitCap depends on sepCap, see sepCap for more details.
Access the matched section of text
If you want to capture the matched strings, then combine the pattern
 parser sep with match.
With the matched strings, we can reconstruct the input string.
 For all input, sep, if
let output = splitCap (match sep) input
then
input ==mconcat(secondfst<$>output)
Arguments
| :: forall e s m a. (Ord e, Stream s, Tokens s ~ s, Monad m) | |
| => ParsecT e s m a | The pattern matching parser   | 
| -> s | The input stream of text  | 
| -> m [Either s a] | List of matching and non-matching input sections.  | 
Split on and capture all patterns
Monad transformer version of splitCap.
The parser sep will run in the underlying monad context.
Arguments
| :: forall e s a. (Ord e, Stream s, Monoid s, Tokens s ~ s) | |
| => Parsec e s a | The pattern matching parser   | 
| -> (a -> s) | The   | 
| -> s | The input stream of text to be edited  | 
| -> s | The edited input stream  | 
Stream editor
Also known as “find-and-replace”, or “match-and-substitute”. Finds all
 non-overlapping sections of the stream which match the pattern sep,
 and replaces them with the result of the editor function.
Access the matched section of text in the editor
If you want access to the matched string in the editor function,
 then combine the pattern parser sep with match.
 This will effectively change the type of the editor function
 to (s,a) -> s.
This allows us to write an editor function which can choose to not
 edit the match and just leave it as it is. If the editor function
 returns the first item in the tuple, then streamEdit will not change
 the matched string.
So, for all sep:
streamEdit (matchsep)fst≡id
Arguments
| :: forall e s m a. (Ord e, Stream s, Monad m, Monoid s, Tokens s ~ s) | |
| => ParsecT e s m a | The pattern matching parser   | 
| -> (a -> m s) | The   | 
| -> s | The input stream of text to be edited  | 
| -> m s | The edited input stream  | 
Stream editor
Monad transformer version of streamEdit.
Both the parser sep and the editor function will run in the underlying
 monad context.
If you want to do IO operations in the editor function or the
 parser sep, then run this in IO.
If you want the editor function or the parser sep to remember some state,
 then run this in a stateful monad.
Parser combinator
Functions in this section are parser combinators. They take
 a sep parser for an argument, combine sep with another parser,
 and return a new parser.
Arguments
| :: forall e s m a. MonadParsec e s m | |
| => m a | The pattern matching parser   | 
| -> m (Tokens s, a) | parser  | 
Specialized manyTill_
Parser combinator to consume input until the sep pattern matches,
 equivalent to
 .
 On success, returns the prefix before the pattern match and the parsed match.manyTill_ anySingle sep
sep may be a zero-width parser, it may succeed without consuming any
 input.
This combinator will produce a parser which
 acts like takeWhileP but is predicated beyond more than
 just the next one token. anyTill is also like takeWhileP
 in that it will be “fast” when applied to an input stream type s
 for which there are specialization re-write rules.
Arguments
| :: forall e s m a. MonadParsec e s m | |
| => m a | The pattern matching parser   | 
| -> m [Either (Tokens s) a] | parser  | 
Separate and capture
Parser combinator to find all of the leftmost non-overlapping occurrences
 of the pattern parser sep in a text stream.
 The sepCap parser will always consume its entire input and can never fail.
sepCap is similar to the sep* family of parser combinators
 found in
 parser-combinators
 and
 parsers,
 but it returns the parsed result of the sep parser instead
 of throwing it away.
Output
The input stream is separated and output into a list of sections:
- Sections which can parsed by the pattern 
sepwill be parsed and captured asRight. - Non-matching sections of the stream will be captured in 
Left. 
The output list also has these properties:
- If the input is 
""then the output list will be[]. - If there are no pattern matches, then
   the entire input stream will be returned as one non-matching 
Leftsection. - The output list will not contain two consecutive 
Leftsections. 
Zero-width matches forbidden
If the pattern matching parser sep would succeed without consuming any
 input then sepCap will force it to fail.
 If we allow sep to match a zero-width pattern,
 then it can match the same zero-width pattern again at the same position
 on the next iteration, which would result in an infinite number of
 overlapping pattern matches.
Arguments
| :: MonadParsec e s m | |
| => m a | The pattern matching parser   | 
| -> m [Either (Tokens s) (Tokens s)] | parser  | 
Deprecated: replace with `findAll sep = (fmap.fmap) (second fst) $ sepCap (match sep)`
Find all occurences
Parser combinator for finding all occurences of a pattern in a stream.
Will call sepCap with the match combinator and
 return the text which matched the pattern parser sep in
 the Right sections.
Definition:
findAll sep = (fmap.fmap) (secondfst) $sepCap(matchsep)
Arguments
| :: MonadParsec e s m | |
| => m a | The pattern matching parser   | 
| -> m [Either (Tokens s) (Tokens s, a)] | parser  | 
Deprecated: replace with `findAllCap sep = sepCap (match sep)`
Find all occurences, parse and capture pattern matches
Parser combinator for finding all occurences of a pattern in a stream.
Will call sepCap with the match combinator so that
 the text which matched the pattern parser sep will be returned in
 the Right sections, along with the result of the parse of sep.
Definition:
findAllCap sep =sepCap(matchsep)