lens-regex-pcre-1.1.0.0: A lensy interface to regular expressions

Copyright(c) Chris Penner 2019
LicenseBSD3
Safe HaskellNone
LanguageHaskell2010

Control.Lens.Regex.Text

Contents

Description

 
Synopsis

Basics

regex :: QuasiQuoter Source #

Builds a traversal over text using a Regex pattern

It's a QuasiQuoter which creates a Traversal out of the given regex string. It's equivalent to calling regexing on a Regex created using the re QuasiQuoter.

The "real" type is:

regex :: Regex -> IndexedTraversal' Int T.Text Match

It's a traversal which selects Matches; compose it with match or groups to get the relevant parts of your match.

>>> txt = "raindrops on roses and whiskers on kittens"

Search

>>> has ([regex|whisk|]) txt
True

Get matches

>>> txt ^.. [regex|\br\w+|] . match
["raindrops","roses"]

Edit matches

>>> txt & [regex|\br\w+|] . match %~ T.intersperse '-' . T.toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"

Get Groups

>>> txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

Edit Groups

>>> txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"

Get the third match

>>> txt ^? [regex|\w+|] . index 2 . match
Just "roses"

Edit matches

>>> txt & [regex|\br\w+|] . match %~ T.intersperse '-' . T.toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"

Get Groups

>>> txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

Edit Groups

>>> txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"

Get the third match

>>> txt ^? [regex|\w+|] . index 2 . match
Just "roses"

Match integers, Read them into ints, then sort them in-place dumping them back into the source text afterwards.

>>> "Monday: 29, Tuesday: 99, Wednesday: 3" & partsOf ([regex|\d+|] . match . unpacked . _Show @Int) %~ sort
"Monday: 3, Tuesday: 29, Wednesday: 99"

To alter behaviour of the regex you may wish to pass PCREOptions when compiling it. The default behaviour may seem strange in certain cases; e.g. it operates in 'single-line' mode. You can compile the Regex separately and add any options you like, then pass the resulting Regex into regex; Alternatively can make your own version of the QuasiQuoter with any options you want embedded by using mkRegexQQ. regex :: Regex -> IndexedTraversal' Int T.Text RBS.Match

match :: IndexedTraversal' [Text] Match Text Source #

Traverse each match

Stashes any matched groups into the index in case you need them.

Get a match if one exists:

>>> "find a needle in a haystack" ^? [regex|n..dle|] . match
Just "needle"

Collect all matches

>>> "one _two_ three _four_" ^.. [regex|_\w+_|] . match
["_two_","_four_"]

You can edit the traversal to perform a regex replace/substitution

>>> "one _two_ three _four_" & [regex|_\w+_|] . match %~ T.toUpper
"one _TWO_ three _FOUR_"

Here we use the group matches stored in the index to form key-value pairs, replacing the entire match.

>>> "abc-def, ghi-jkl" & [regex|(\w+)-(\w+)|] . match %@~ \[k, v] _ -> "{" <> k <> ":" <> v <> "}"
"{abc:def}, {ghi:jkl}"

groups :: IndexedLens' Text Match [Text] Source #

Access all groups of a match as a list. Also keeps full match text as the index in case you need it.

Note that you can edit the groups through this lens, Changing the length of the list has behaviour similar to partsOf.

Get all matched groups:

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

You can access a specific group combining with ix, or just use group instead

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups .  ix 1
["roses","kittens"]

Editing groups:

>>> "raindrops on roses and whiskers on kittens" & [regex|(\w+) on (\w+)|] . groups .  ix 1 %~ T.toUpper
"raindrops on ROSES and whiskers on KITTENS"

Editing the list rearranges groups

>>> "raindrops on roses and whiskers on kittens" & [regex|(\w+) on (\w+)|] . groups %~ Prelude.reverse
"roses on raindrops and kittens on whiskers"

You can traverse the list to flatten out all groups

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups . traversed
["raindrops","roses","whiskers","kittens"]

This replaces each group with the full match text wrapped in parens:

>>> "one-two" & [regex|(\w+)-(\w+)|] . groups <. traversed %@~ \mtch grp -> grp <> ":(" <> mtch <> ")"
"one:(one-two)-two:(one-two)"

group :: Int -> IndexedTraversal' Text Match Text Source #

Access a specific group of a match. Numbering starts at 0.

Stashes the full match text as the index in case you need it.

See groups for more info on grouping

>>> "key:value, a:b" ^.. [regex|(\w+):(\w+)|] . group 0
["key","a"]
>>> "key:value, a:b" ^.. [regex|(\w+):(\w+)|] . group 1
["value","b"]
>>> "key:value, a:b" & [regex|(\w+):(\w+)|] . group 1 %~ T.toUpper
"key:VALUE, a:B"
>>> "key:value, a:b" & [regex|(\w+):(\w+)|] . group 1 %~ T.toUpper
"key:VALUE, a:B"

Replace the first capture group with the full match:

>>> "a, b" & [regex|(\w+), (\w+)|] . group 0 .@~ \i -> "(" <> i <> ")"
"(a, b), b"

namedGroups :: IndexedLens' Text Match (Map Text Text) Source #

Access all the named groups of a match as a Map. Stashes the full match text as the index in case you need it.

Note that you can edit the groups through this lens, but the behaviour is undefined when editing inner elements of nested groups. Behaviour is undefined if groups are removed from the map (so don't do that).

NOTE: There's currently some strange behaviour in pcre-heavy where trailing unmatched optional groups are omitted, I'm looking into getting that patched, but for now, note the difference in behaviour:

>>> "A" ^? [regex|(?<a>A)|(?<b>B)|] . namedGroups
Just (fromList [("a","A")])
>>> "B" ^? [regex|(?<a>A)|(?<b>B)|] . namedGroups
Just (fromList [("a",""),("b","B")])

Get all matched groups:

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups
[fromList [("first","raindrops"),("second","roses")],fromList [("first","whiskers"),("second","kittens")]]

You can access a specific group combining with ix, or just use namedGroup instead

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups .  ix "second"
["roses","kittens"]

Editing groups:

>>> "raindrops on roses and whiskers on kittens" & [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups . ix "second" %~ T.toUpper
"raindrops on ROSES and whiskers on KITTENS"

Use indexed helpers to access the full match when operating on a group.

This replaces the "first" group with the full match text wrapped in parens:

>>> "one-two" & [regex|(?<first>\w+)-(\w+)|] . namedGroups <. ix "first" %@~ \mtch grp -> grp <> ":(" <> mtch <> ")"
"one:(one-two)-two"

namedGroup :: Text -> IndexedTraversal' Text Match Text Source #

Access a specific named group of a match

See namedGroups for caveats and more info.

Stashes the full match text as the index in case you need it.

>>> "key:value, a:b" ^.. [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "first"
["key","a"]
>>> "key:value, a:b" ^.. [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "second"
["value","b"]
>>> "key:value, a:b" & [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "second" %~ T.toUpper
"key:VALUE, a:B"

Replace the first capture group with the full match:

>>> "a, b" & [regex|(?<first>\w+), (?<second>\w+)|] . namedGroup "first" .@~ \i -> "(" <> i <> ")"
"(a, b), b"

matchAndGroups :: Getter Match (Text, [Text]) Source #

Collect both the match text AND all the matching groups

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . matchAndGroups
[("raindrops on roses",["raindrops","roses"]),("whiskers on kittens",["whiskers","kittens"])]

Compiling regexes to Traversals

regexing :: Regex -> IndexedTraversal' Int Text Match Source #

Build a traversal from the provided Regex, this is handy if you're QuasiQuoter averse, or if you already have a Regex object floating around.

Also see mkRegexTraversalQQ

mkRegexTraversalQQ :: [PCREOption] -> QuasiQuoter Source #

Build a QuasiQuoter just like regex but with the provided PCREOption overrides.

Types

data Match Source #

Match represents an opaque regex match. You can drill into it using match, groups, group, namedGroup, namedGroups or matchAndGroups

Instances
(TypeError (Text "You're trying to 'show' a raw 'Match' object." :$$: Text "You likely missed adding a 'match' or 'groups' or 'group' call after your 'regex' call :)") :: Constraint) => Show Match Source # 
Instance details

Defined in Control.Lens.Regex.ByteString

Methods

showsPrec :: Int -> Match -> ShowS #

show :: Match -> String #

showList :: [Match] -> ShowS #

data Regex #

An abstract pointer to a compiled PCRE Regex structure The structure allocated by the PCRE library will be deallocated automatically by the Haskell storage manager.

Instances
Eq Regex 
Instance details

Defined in Text.Regex.PCRE.Light.Base

Methods

(==) :: Regex -> Regex -> Bool #

(/=) :: Regex -> Regex -> Bool #

Ord Regex 
Instance details

Defined in Text.Regex.PCRE.Light.Base

Methods

compare :: Regex -> Regex -> Ordering #

(<) :: Regex -> Regex -> Bool #

(<=) :: Regex -> Regex -> Bool #

(>) :: Regex -> Regex -> Bool #

(>=) :: Regex -> Regex -> Bool #

max :: Regex -> Regex -> Regex #

min :: Regex -> Regex -> Regex #

Show Regex 
Instance details

Defined in Text.Regex.PCRE.Light.Base

Methods

showsPrec :: Int -> Regex -> ShowS #

show :: Regex -> String #

showList :: [Regex] -> ShowS #