Copyright | ©2019 James Brock |
---|---|
License | BSD2 |
Maintainer | James Brock <jamesbrock@gmail.com> |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
Replace.Attoparsec is for finding text patterns, and also replacing or splitting on the found patterns. This activity is traditionally done with regular expressions, but Replace.Attoparsec uses Data.Attoparsec parsers instead for the pattern matching.
Replace.Attoparsec can be used in the same sort of “pattern capture” or “find all” situations in which one would use Python re.findall, or Perl m//, or Unix grep.
Replace.Attoparsec can be used in the same sort of “stream editing” or “search-and-replace” situations in which one would use Python re.sub, or Perl s///, or Unix sed, or awk.
Replace.Attoparsec can be used in the same sort of “string splitting” situations in which one would use Python re.split or Perl split.
See the replace-attoparsec package README for usage examples.
Synopsis
- streamEdit :: forall a. Parser a -> (a -> Text) -> Text -> Text
- streamEditT :: Applicative m => Parser a -> (a -> m Text) -> Text -> m Text
- anyTill :: Parser a -> Parser (Text, a)
Running parser
Functions in this section are ways to run parsers (like parse). They take as arguments a sep parser and some input, run the parser on the input, and return a result.
streamEdit
:: forall a. Parser a | The pattern matching parser sep |
-> (a -> Text) | The editor function |
-> Text | The input stream of text to be edited |
-> Text | The edited input stream |
Stream editor
Also known as “find-and-replace”, or “match-and-substitute”. Finds all of the sections of the stream which match the pattern sep, and replaces them with the result of the editor function.
Access the matched section of text in the editor
If you want access to the matched string in the editor function, then combine the pattern parser sep with match. This will effectively change the type of the editor function to (Text,a) -> Text.
This allows us to write an editor function which can choose to not edit the match and just leave it as it is. If the editor function returns the first item in the tuple, then streamEdit will not change the matched string.
So, for all sep:
streamEdit (match sep) fst ≡ id
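As a rough illustration of the two editor styles described above, the sketch below first replaces every match outright, then uses match so the editor can hand some matches back unchanged. It assumes the module is importable as Replace.Attoparsec.Text.Lazy, that the input and output are lazy Text, and that the editor returns strict Text (the type that match yields). The names redact and summarize, the patterns, and the sample strings are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Assumption: the module name and the strict/lazy Text split are inferred
-- from the signatures on this page, not confirmed by it.
import Data.Attoparsec.Text (Parser, decimal, match, string)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import Replace.Attoparsec.Text.Lazy (streamEdit)

-- | Replace every occurrence of the literal "secret" (hypothetical pattern).
redact :: TL.Text -> TL.Text
redact = streamEdit (string "secret") (const "[redacted]")

-- | Replace numbers above 100 with "many". Smaller numbers are left as they
-- were, because the editor returns the originally matched text for them.
summarize :: TL.Text -> TL.Text
summarize = streamEdit (match (decimal :: Parser Int)) editor
  where
    editor :: (T.Text, Int) -> T.Text
    editor (original, n)
      | n > 100 = "many"
      | otherwise = original
```

Under these assumptions, redact "a secret is a secret" should give "a [redacted] is a [redacted]", and summarize "5 apples and 500 pears" should give "5 apples and many pears".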
Laziness
This is lazy in the input text chunks and should release processed chunks to the garbage collector promptly.
The output is constructed by a Builder and is subject to the chunk size used there.
streamEditT
:: Applicative m | |
=> Parser a | The pattern matching parser sep |
-> (a -> m Text) | The editor function |
-> Text | The input stream of text to be edited |
-> m Text | The edited input stream |
Stream editor
Monad transformer version of streamEdit.
The editor function will run in the underlying monad context.
If you want to do IO operations in the editor function then run this in IO.
If you want the editor function to remember some state, then run this in a stateful monad.
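A minimal sketch of the stateful case, assuming the module is importable as Replace.Attoparsec.Text.Lazy, that the editor returns strict Text, and that the State monad comes from the transformers package. The function name numberItems and the "item" pattern are made up for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Assumption: module name and Text variants are inferred from this page.
import Control.Monad.Trans.State.Strict (State, evalState, get, put)
import Data.Attoparsec.Text (string)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import Replace.Attoparsec.Text.Lazy (streamEditT)

-- | Append a running counter, starting at 1, to every "item" in the input.
numberItems :: TL.Text -> TL.Text
numberItems input = evalState (streamEditT (string "item") editor input) 1
  where
    -- The editor runs in State Int, so it can remember how many matches
    -- it has already seen.
    editor :: T.Text -> State Int T.Text
    editor matched = do
      n <- get
      put (n + 1)
      pure (matched <> T.pack (show n))
```

With these definitions, numberItems "item, item, item" should give "item1, item2, item3". Swapping State for IO would let the editor perform IO actions in the same way.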
Laziness
This is lazy in the input text chunks and should release processed chunks to the garbage collector promptly, i.e. as soon as the presence of a sep has been ruled out.
Note that this is only as lazy in the chunks as the selected monad allows it to be, i.e. if your monad requires running the entire computation before getting the result then this is effectively strict in the input stream.
The output is constructed by a Builder and is subject to the chunk size used there.
Parser combinator
Functions in this section are parser combinators. They take a sep parser for an argument, combine sep with another parser, and return a new parser.
Specialized manyTill_
Parser combinator to consume and capture input until the sep pattern matches, equivalent to manyTill_ anyChar sep. On success, returns the prefix before the pattern match and the parsed match.
sep may be a zero-width parser; it may succeed without consuming any input.
This combinator will produce a parser which acts like takeTill but whose stopping condition can depend on more than just the next token. It is also like takeTill in that it is a “high performance” parser.
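The behaviour described above can be sketched by running anyTill over a single strict chunk with attoparsec's parseOnly. This assumes the module is importable as Replace.Attoparsec.Text.Lazy. The helper describeSplit is hypothetical, and it renders the result with show so that the sketch does not depend on whether the returned prefix is strict or lazy Text.

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Assumption: the module name is inferred from this page.
import Data.Attoparsec.Text (char, parseOnly)
import qualified Data.Text as T
import Replace.Attoparsec.Text.Lazy (anyTill)

-- | Describe how the input splits at the first '='.
describeSplit :: T.Text -> String
describeSplit input =
  case parseOnly (anyTill (char '=')) input of
    -- prefix is the text before the match, sep is the parsed match itself
    Right (prefix, sep) -> "prefix " <> show prefix <> ", match " <> show sep
    Left err -> "no match: " <> err
```

Under these assumptions, describeSplit "key=value" should report the prefix "key" and the match '=', while an input without '=' should take the Left branch.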
Laziness
When the anyTill parser reaches the end of the current input chunk before finding the beginning of sep then the parser will fail.
When the anyTill parser reaches the end of the current input chunk while it is successfully parsing sep then it will lazily fetch more input and continue parsing.