replace-megaparsec-1.3.0.0: Find, replace, and edit text patterns with Megaparsec parsers

Copyright: ©2019 James Brock
License: BSD2
Maintainer: James Brock <jamesbrock@gmail.com>
Safe Haskell: None
Language: Haskell2010

Replace.Megaparsec

Description

Replace.Megaparsec is for finding text patterns, and also editing and replacing the found patterns. This activity is traditionally done with regular expressions, but Replace.Megaparsec uses Text.Megaparsec parsers instead for the pattern matching.

Replace.Megaparsec can be used in the same sort of “pattern capture” or “find all” situations in which one would use Python re.findall, or Perl m//, or Unix grep.

Replace.Megaparsec can be used in the same sort of “stream editing” or “search-and-replace” situations in which one would use Python re.sub, or Perl s///, or Unix sed, or awk.

See the replace-megaparsec package README for usage examples.

Parser combinator

sepCap

Arguments

:: MonadParsec e s m 
=> m a

The pattern matching parser sep

-> m [Either (Tokens s) a] 

Separate and capture

Parser combinator to find all of the non-overlapping occurrences of the pattern parser sep in a text stream. The sepCap parser will always consume its entire input and can never fail.

Output

The input stream is separated into a list of sections:

  • sections which can be parsed by the pattern sep will be captured as matching sections in Right
  • non-matching sections of the stream will be captured in Left.

There are two constraints on the output:

  • The output list will be non-empty. If there are no pattern matches, then the entire input stream will be returned as one non-matching Left section. If the input is "" then the output list will be [Left ""].
  • The output list will not contain two consecutive Lefts.
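The separation into Left and Right sections can be illustrated with a small sketch. The hexParser pattern and the sample input are assumptions chosen for illustration, and parseMaybe from Text.Megaparsec is used here only as one convenient way to run the combinator.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (sepCap)
import Text.Megaparsec (Parsec, chunk, parseMaybe)
import Text.Megaparsec.Char.Lexer (hexadecimal)

-- Hypothetical pattern for illustration: "0x" followed by hexadecimal digits.
hexParser :: Parsec Void String Integer
hexParser = chunk "0x" >> hexadecimal

-- Matches are captured in Right, everything between them in Left.
sections :: Maybe [Either String Integer]
sections = parseMaybe (sepCap hexParser) "0xA 000 0xFFFF"
-- Just [Right 10,Left " 000 ",Right 65535]

-- With no matches, the whole input comes back as a single Left section.
noMatches :: Maybe [Either String Integer]
noMatches = parseMaybe (sepCap hexParser) "no hex here"
-- Just [Left "no hex here"]
```

Because sepCap always consumes its entire input, parseMaybe (which requires the parser to reach the end of input) is expected to return Just here.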

Zero-width matches forbidden

If the pattern matching parser sep would succeed without consuming any input then sepCap will force it to fail. If we allow sep to match a zero-width pattern, then it can match the same zero-width pattern again at the same position on the next iteration, which would result in an infinite number of overlapping pattern matches.
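As a rough sketch of this rule, consider a pattern that is allowed to succeed on empty input. The digits parser below is a hypothetical example: many digitChar succeeds without consuming anything at a non-digit position, so sepCap is expected to treat those positions as non-matching.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (sepCap)
import Text.Megaparsec (Parsec, many, parseMaybe)
import Text.Megaparsec.Char (digitChar)

-- Hypothetical pattern that can match zero characters.
digits :: Parsec Void String String
digits = many digitChar

-- Zero-width successes are forced to fail, so only the
-- non-empty run of digits is reported as a match.
zeroWidth :: Maybe [Either String String]
zeroWidth = parseMaybe (sepCap digits) "a1b"
-- Just [Left "a",Right "1",Left "b"]
```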

Special accelerated inputs

There are specialization rewrite rules to speed up this function when the input type is Data.Text or Data.ByteString.

Error parameter

The error type parameter e for sep should usually be Void, because sep fails on every token in a non-matching Left section, so parser failures will not be reported.

Notes

This sepCap parser combinator is the basis for all of the other features of this module.

It is similar to the sep* family of functions found in parser-combinators and parsers but, importantly, it returns the parsed result of the sep parser instead of throwing it away, as manyTill_ does for its end parser.

findAll

Arguments

:: MonadParsec e s m 
=> m a

The pattern matching parser sep

-> m [Either (Tokens s) (Tokens s)] 

Find all occurrences

Parser combinator for finding all occurrences of a pattern in a stream.

Will call sepCap with the match combinator and return the text which matched the pattern parser sep in the Right sections.

Definition:

findAll sep = (fmap.fmap) (second fst) $ sepCap (match sep)
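A minimal usage sketch, assuming the same hypothetical hexParser pattern and sample input as an illustration: the Right sections hold the matched text itself rather than the parsed value.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (findAll)
import Text.Megaparsec (Parsec, chunk, parseMaybe)
import Text.Megaparsec.Char.Lexer (hexadecimal)

-- Hypothetical pattern for illustration: "0x" followed by hexadecimal digits.
hexParser :: Parsec Void String Integer
hexParser = chunk "0x" >> hexadecimal

-- The parsed Integer is discarded; only the matching text is kept.
found :: Maybe [Either String String]
found = parseMaybe (findAll hexParser) "0xA 000 0xFFFF"
-- Just [Right "0xA",Left " 000 ",Right "0xFFFF"]
```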

findAllCap

Arguments

:: MonadParsec e s m 
=> m a

The pattern matching parser sep

-> m [Either (Tokens s) (Tokens s, a)] 

Find all occurrences, parse and capture pattern matches

Parser combinator for finding all occurrences of a pattern in a stream.

Will call sepCap with the match combinator so that the text which matched the pattern parser sep will be returned in the Right sections, along with the result of the parse of sep.

Definition:

findAllCap sep = sepCap (match sep)
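A minimal usage sketch, again assuming a hypothetical hexParser pattern: each Right section pairs the matched text with the value that sep parsed from it.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (findAllCap)
import Text.Megaparsec (Parsec, chunk, parseMaybe)
import Text.Megaparsec.Char.Lexer (hexadecimal)

-- Hypothetical pattern for illustration: "0x" followed by hexadecimal digits.
hexParser :: Parsec Void String Integer
hexParser = chunk "0x" >> hexadecimal

-- Each match carries both the source text and the parsed result.
captured :: Maybe [Either String (String, Integer)]
captured = parseMaybe (findAllCap hexParser) "0xA 000 0xFFFF"
-- Just [Right ("0xA",10),Left " 000 ",Right ("0xFFFF",65535)]
```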

Running parser

streamEdit

Arguments

:: (Ord e, Stream s, Monoid s, Tokens s ~ s, Show s, Show (Token s), Typeable s) 
=> Parsec e s a

The parser sep for the pattern of interest.

-> (a -> s)

The editor function. Takes a parsed result of sep and returns a new stream section for the replacement.

-> s

The input stream of text to be edited.

-> s 

Stream editor

Also known as “find-and-replace”, or “match-and-substitute”. Finds all of the sections of the stream which match the pattern sep, and replaces them with the result of the editor function.

This function is not a “parser combinator,” it is a “way to run a parser”, like parse or runParserT.
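As a small sketch of running a parser this way, the example below replaces every hexadecimal literal with its decimal rendering. The hexParser pattern and the input string are assumptions for illustration; show serves as the editor function.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (streamEdit)
import Text.Megaparsec (Parsec, chunk)
import Text.Megaparsec.Char.Lexer (hexadecimal)

-- Hypothetical pattern for illustration: "0x" followed by hexadecimal digits.
hexParser :: Parsec Void String Integer
hexParser = chunk "0x" >> hexadecimal

-- Every match is replaced by the editor's output; the rest is left alone.
decimalized :: String
decimalized = streamEdit hexParser show "0xA 000 0xFFFF"
-- "10 000 65535"
```

Note that streamEdit takes the input stream directly and returns the edited stream, so no separate call to parse is needed.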

Access the matched section of text in the editor

If you want access to the matched string in the editor function, then combine the pattern parser sep with match. This will effectively change the type of the editor function to (s,a) -> s.

This allows us to write an editor function which can choose to not edit the match and just leave it as it is. If the editor function returns the first item in the tuple, then streamEdit will not change the matched string.

So, for all sep:

streamEdit (match sep) fst ≡ id
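A sketch of an editor that sometimes leaves the match untouched, assuming a hypothetical hexParser pattern and an arbitrary threshold of 16 chosen only for illustration:

```haskell
import Data.Void (Void)
import Replace.Megaparsec (streamEdit)
import Text.Megaparsec (Parsec, chunk, match)
import Text.Megaparsec.Char.Lexer (hexadecimal)

-- Hypothetical pattern for illustration: "0x" followed by hexadecimal digits.
hexParser :: Parsec Void String Integer
hexParser = chunk "0x" >> hexadecimal

-- The editor sees the matched text s and the parsed value r.
-- Small values are rewritten in decimal; larger ones are returned as-is.
editSmall :: (String, Integer) -> String
editSmall (s, r) = if r <= 16 then show r else s

edited :: String
edited = streamEdit (match hexParser) editSmall "0xA 000 0xFFFF"
-- "10 000 0xFFFF"

-- Returning the matched text for every match leaves the stream unchanged.
unchanged :: String
unchanged = streamEdit (match hexParser) fst "0xA 000 0xFFFF"
-- "0xA 000 0xFFFF"
```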

Type constraints

The type of the stream of text that is input must be Stream s such that Tokens s ~ s, because we want to output the same type of stream that was input. That requirement is satisfied for all the Stream instances included with Text.Megaparsec: Data.Text, Data.Text.Lazy, Data.ByteString, Data.ByteString.Lazy, and Data.String.

We need the Monoid s instance so that we can mconcat the output stream.

The error type parameter e should usually be Void.

streamEditT

Arguments

:: (Ord e, Stream s, Monad m, Monoid s, Tokens s ~ s, Show s, Show (Token s), Typeable s) 
=> ParsecT e s m a

The parser sep for the pattern of interest.

-> (a -> m s)

The editor function. Takes a parsed result of sep and returns a new stream section for the replacement.

-> s

The input stream of text to be edited.

-> m s 

Stream editor transformer

Monad transformer version of streamEdit.

Both the parser sep and the editor function run in the underlying monad context.

If you want to do IO operations in the editor function or the parser sep, then run this in IO.

If you want the editor function or the parser sep to remember some state, then run this in a stateful monad.
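The stateful case might look like the following sketch. It assumes the State monad from the mtl package (Control.Monad.State.Strict), which is not mentioned in this module's documentation; the word pattern and the numberWord editor are hypothetical names for illustration.

```haskell
import Control.Monad.State.Strict (State, evalState, get, put)
import Data.Void (Void)
import Replace.Megaparsec (streamEditT)
import Text.Megaparsec (ParsecT, some)
import Text.Megaparsec.Char (letterChar)

-- Hypothetical pattern: a run of one or more letters,
-- running over the State Int monad.
word :: ParsecT Void String (State Int) String
word = some letterChar

-- Editor that prefixes each word with a counter kept in State.
numberWord :: String -> State Int String
numberWord w = do
  n <- get
  put (n + 1)
  pure (show n ++ ":" ++ w)

-- Run the edit with the counter starting at 1.
numbered :: String
numbered = evalState (streamEditT word numberWord "a bc def") 1
-- "1:a 2:bc 3:def"
```

The same shape applies to IO: with an editor of type a -> IO s, streamEditT returns an IO s action.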