Safe Haskell	None
Language	Haskell2010

Text.Regex.Pcre2

Contents

Matching and substitution
Compile-time validation
- Quasi-quoters
- Type-indexed capture groups
Options
- Callout interface
  - Substitution callouts
User errors
PCRE2 build configuration

Synopsis

match :: Alternative f => Text -> Text -> f Text
matchOpt :: Alternative f => Option -> Text -> Text -> f Text
matchAll :: Text -> Text -> [Text]
matchAllOpt :: Option -> Text -> Text -> [Text]
matches :: Text -> Text -> Bool
matchesOpt :: Option -> Text -> Text -> Bool
captures :: Text -> Text -> [Text]
capturesOpt :: Option -> Text -> Text -> [Text]
capturesA :: Alternative f => Text -> Text -> f (NonEmpty Text)
capturesAOpt :: Alternative f => Option -> Text -> Text -> f (NonEmpty Text)
capturesAll :: Text -> Text -> [NonEmpty Text]
capturesAllOpt :: Option -> Text -> Text -> [NonEmpty Text]
sub :: Text -> Text -> Text -> Text
gsub :: Text -> Text -> Text -> Text
subOpt :: Option -> Text -> Text -> Text -> Text
_match :: Text -> Traversal' Text Text
_matchOpt :: Option -> Text -> Traversal' Text Text
_captures :: Text -> Traversal' Text (NonEmpty Text)
_capturesOpt :: Option -> Text -> Traversal' Text (NonEmpty Text)
regex :: QuasiQuoter
_regex :: QuasiQuoter
data Captures (info :: CapturesInfo)
capture :: forall i info num. (CaptNum i info ~ num, KnownNat num) => Captures info -> Text
_capture :: forall i info num. (CaptNum i info ~ num, KnownNat num) => Lens' (Captures info) Text
data Option
- = AllowEmptyClass
- | AltBsux
- | AltBsuxLegacy
- | AltCircumflex
- | AltVerbNames
- | Anchored
- | AutoCallout
- | BadEscapeIsLiteral
- | Bsr Bsr
- | Caseless
- | DepthLimit Word32
- | DollarEndOnly
- | DotAll
- | DupNames
- | EndAnchored
- | EscapedCrIsLf
- | Extended
- | ExtendedMore
- | FirstLine
- | HeapLimit Word32
- | Literal
- | MatchLimit Word32
- | MatchLine
- | MatchUnsetBackRef
- | MatchWord
- | MaxPatternLength Word64
- | Multiline
- | NeverBackslashC
- | NeverUcp
- | Newline Newline
- | NoAutoCapture
- | NoAutoPossess
- | NoDotStarAnchor
- | NoStartOptimize
- | NotBol
- | NotEmpty
- | NotEmptyAtStart
- | NotEol
- | OffsetLimit Word64
- | ParensLimit Word32
- | PartialHard
- | PartialSoft
- | SubGlobal
- | SubLiteral
- | SubReplacementOnly
- | SubUnknownUnset
- | SubUnsetEmpty
- | Ucp
- | Ungreedy
- | UnsafeCallout (CalloutInfo -> IO CalloutResult)
- | UnsafeCompileRecGuard (Int -> IO Bool)
- | UnsafeSubCallout (SubCalloutInfo -> IO SubCalloutResult)
- | Utf
data Bsr
- = BsrUnicode
- | BsrAnyCrlf
data Newline
- = NewlineCr
- | NewlineLf
- | NewlineCrlf
- | NewlineAny
- | NewlineAnyCrlf
- | NewlineNul
data CalloutInfo = CalloutInfo {
- calloutIndex :: CalloutIndex
- calloutCaptures :: NonEmpty (Maybe Text)
- calloutSubject :: Text
- calloutMark :: Maybe Text
- calloutIsFirst :: Bool
- calloutBacktracked :: Bool
}
data CalloutIndex
- = CalloutNumber Int
- | CalloutName Text
- | CalloutAuto Int Int
data CalloutResult
- = CalloutProceed
- | CalloutNoMatchHere
- | CalloutNoMatch
data SubCalloutInfo = SubCalloutInfo {
- subCalloutSubsCount :: Int
- subCalloutCaptures :: NonEmpty (Maybe Text)
- subCalloutSubject :: Text
- subCalloutReplacement :: Text
}
data SubCalloutResult
- = SubCalloutAccept
- | SubCalloutSkip
- | SubCalloutAbort
data SomePcre2Exception
data Pcre2Exception
data Pcre2CompileException
defaultBsr :: Bsr
compiledWidths :: [Int]
defaultDepthLimit :: Int
defaultHeapLimit :: Int
supportsJit :: Bool
jitTarget :: Maybe Text
linkSize :: Int
defaultMatchLimit :: Int
defaultNewline :: Newline
defaultIsNeverBackslashC :: Bool
defaultParensLimit :: Int
defaultTablesLength :: Int
unicodeVersion :: Maybe Text
supportsUnicode :: Bool
pcreVersion :: Text

Matching and substitution

Introduction

Atop the low-level binding to the C API, we present a high-level interface to add regular expressions to Haskell programs.

All input and output strings are strict Text, which maps directly to how PCRE2 operates on strings of 16-bit-wide code units.

The C API requires pattern strings to be compiled and the compiled patterns to be executed on subject strings in discrete steps. We hide this procedure, accepting pattern and subject as arguments in a single function, essentially:

pattern -> subject -> result

The implementation guarantees that, when partially applied to pattern but not subject, the resulting function will close over the underlying compiled object and reuse it for every subject it is subsequently applied to.

Likewise, we do not require the user to know whether a PCRE2 option is to be applied at pattern compile time or match time. Instead we fold all possible options into a single datatype, Option. Most functions have vanilla and configurable variants; the latter have "Opt" in the name and accept a value of this type.

Similar to how head :: [a] -> a sacrifices totality for type simplicity, we represent user errors as imprecise exceptions. Unlike with head, these exceptions are typed (as SomePcre2Exceptions); moreover, we offer Template Haskell facilities that can intercept some of these errors before the program is run. (Failure to match is not considered a user error and is represented in the types.)

There's more than one way to do it with this library. The choices between functions and traversals, poly-kinded Captures and plain lists, string literals and quasi-quotations, quasi-quoted expressions and quasi-quoted patterns...these are left to the user. She will observe that advanced features' extra safety, power, and convenience entail additional language extensions, cognitive overhead, and (for lenses) library dependencies, so it's really a matter of finding the best trade-offs for her case.

Definitions

Pattern: The string defining a regular expression. Refer to the PCRE2 pattern syntax documentation.
Subject: The string the compiled regular expression is executed on.
Regex: A function of the form Text -> result, where the argument is the subject. It is "compiled" via partial application as discussed above. (Lens users: A regex has the more abstract form Traversal' Text result, but the concept is the same.)
Capture (or capture group): Any substring of the subject matched by the pattern, meaning the whole pattern or any parenthesized grouping. The PCRE2 docs do not refer to the former as a "capture"; however it is accessed the same way in the C API, just with index 0, so we will consider it the 0th capture for consistency. Parenthesized captures are implicitly numbered from 1.
Unset capture: A capture considered unset as distinct from empty. This can arise from matching the pattern (a)? to an empty subject—the 0th capture will be set as empty, but the 1st will be unset altogether. We represent both as empty Text for simplicity. See below for discussions about how unset captures may be detected or substituted using this library.
Named capture: A parenthesized capture can be named like this: (?<foo>...). Whether they have names or not, captures are always numbered as described above.

Performance

Each API function is designed such that, when a regex is obtained, the underlying C data generated from the pattern and any options is reused for that regex's lifetime. Care should be taken that the same regex is not recreated ex nihilo and discarded for each new subject:

isEmptyOrHas2Digits :: Text -> Bool
isEmptyOrHas2Digits s = Text.null s || matches "\\d{2}" s -- bad, fully applied

Instead, store it in a partially applied state:

isEmptyOrHas2Digits = (||) <$> Text.null <*> matches "\\d{2}" -- OK but abstruse

When in doubt, always create regexes as top-level values:

has2Digits :: Text -> Bool
has2Digits = matches "\\d{2}"

isEmptyOrHas2Digits s = Text.null s || has2Digits s -- good

Note: Template Haskell regexes are immune from this problem and may be freely inlined; see below.

Handling errors

In a few places we use the Alternative typeclass to optionally return match results, expressing success via pure and failure via empty. Typically the user will choose the instance Maybe, but other useful ones exist, notably STM, that of optparse-applicative, and those of parser combinator libraries such as megaparsec.

By contrast, user errors are thrown purely. If a user error is to be caught, it must be at the site where the match or substitution results are evaluated—in other words, wherever the regex is applied to a subject. Even pattern compile errors are deferred to match sites, due to the way this library employs unsafePerformIO to implement laziness.

>>> broken = match "*"
>>> broken "foo"
*** Exception: pcre2_compile: quantifier does not follow a repeatable item
                    *
                    ^

evaluate comes in handy to force results into the IO monad in order to catch errors reliably:

>>> evaluate (broken "foo") `catch` \(_ :: SomePcre2Exception) -> return Nothing
Nothing
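The same idea can be packaged as a reusable helper. The following is a minimal sketch, assuming only the exports shown in the synopsis; the name safeMatches is hypothetical and not part of the library. It uses try rather than catch so the caller receives the exception as a value. Because the pattern is an ordinary argument here, it is compiled afresh on every call, which suits patterns supplied at runtime.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Exception (evaluate, try)
import Data.Text (Text)
import Text.Regex.Pcre2 (SomePcre2Exception, matches)

-- Hypothetical helper (not part of the library): run a match in IO and
-- return any PCRE2 user error as a Left instead of letting it propagate.
-- The result is a Bool, so evaluating it to weak head normal form is
-- enough to force the whole match.
safeMatches :: Text -> Text -> IO (Either SomePcre2Exception Bool)
safeMatches patt subject = try (evaluate (matches patt subject))
```

With this, safeMatches "*" "foo" is expected to yield a Left carrying the compile error shown above, while a well-formed pattern yields Right True or Right False.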

Or simply select IO as the Alternative instance:

>>> :t broken
broken :: Alternative f => Text -> f Text
>>> broken "foo" `catch` \(_ :: SomePcre2Exception) -> return "broken"
"broken"

Basic matching functions

match :: Alternative f => Text -> Text -> f Text

Match a pattern to a subject once and return the portion that matched in an Alternative, or empty if no match.
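As an illustration of the Alternative result, here is a small sketch that picks Maybe as the instance and chains two regexes with <|>. The names hexLiteral, decimalLiteral and numericLiteral are hypothetical. Each regex is a partially applied top-level value, following the advice in the Performance section.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative ((<|>))
import Data.Text (Text)
import Text.Regex.Pcre2 (match)

-- Two regexes, each stored partially applied (pattern only).
hexLiteral, decimalLiteral :: Text -> Maybe Text
hexLiteral     = match "0x[0-9a-fA-F]+"
decimalLiteral = match "\\d+"

-- Try the hexadecimal form first; fall back to decimal if it is absent.
numericLiteral :: Text -> Maybe Text
numericLiteral s = hexLiteral s <|> decimalLiteral s
```

For example, numericLiteral "value 0x1F" gives Just "0x1F", numericLiteral "value 42" gives Just "42", and numericLiteral "none" gives Nothing.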

matchOpt :: Alternative f => Option -> Text -> Text -> f Text

matchOpt mempty = match

matchAll :: Text -> Text -> [Text]

Match a pattern to a subject and lazily return a list of all non-overlapping portions that matched.

Since: 1.1.0
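A short sketch of matchAll in use; the names numbers and firstTwoNumbers are hypothetical. Since the list is produced lazily, taking a prefix should not require searching the rest of the subject.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Regex.Pcre2 (matchAll)

-- All runs of digits in the subject, left to right.
numbers :: Text -> [Text]
numbers = matchAll "\\d+"

-- Only the first two runs of digits.
firstTwoNumbers :: Text -> [Text]
firstTwoNumbers = take 2 . numbers
```

For example, numbers "10 apples, 20 pears, 30 plums" gives ["10","20","30"].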

matchAllOpt :: Option -> Text -> Text -> [Text]

matchAllOpt mempty = matchAll

Since: 1.1.0

matches :: Text -> Text -> Bool

Does the pattern match the subject at least once?

matchesOpt :: Option -> Text -> Text -> Bool

matchesOpt mempty = matches

captures :: Text -> Text -> [Text]

Match a pattern to a subject once and return a list of captures, or [] if no match.

capturesOpt :: Option -> Text -> Text -> [Text]

capturesOpt mempty = captures

capturesA :: Alternative f => Text -> Text -> f (NonEmpty Text)

Match a pattern to a subject once and return a non-empty list of captures in an Alternative, or empty if no match. The non-empty list constructor :| serves as a cue to differentiate the 0th capture from the others:

let parseDate = capturesA "(\\d{4})-(\\d{2})-(\\d{2})"
in case parseDate "submitted 2020-10-20" of
    Just (date :| [y, m, d]) -> ...
    Nothing                  -> putStrLn "didn't match"

capturesAOpt :: Alternative f => Option -> Text -> Text -> f (NonEmpty Text)

capturesAOpt mempty = capturesA

Since: 1.1.0

capturesAll :: Text -> Text -> [NonEmpty Text]

Match a pattern to a subject and lazily produce a list of all non-overlapping portions, with all capture groups, that matched.

Since: 1.1.0
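A sketch of how capturesAll might be used to pull key/value pairs out of a line; keyValueMatches and keyValues are hypothetical names. The :| pattern separates the 0th capture (ignored here) from the two parenthesized ones.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.List.NonEmpty (NonEmpty ((:|)))
import Data.Text (Text)
import Text.Regex.Pcre2 (capturesAll)

-- Every "key=value" occurrence, with its capture groups.
keyValueMatches :: Text -> [NonEmpty Text]
keyValueMatches = capturesAll "(\\w+)=(\\w+)"

-- Keep only the two parenthesized captures of each match.
keyValues :: Text -> [(Text, Text)]
keyValues s = [(k, v) | _ :| [k, v] <- keyValueMatches s]
```

For example, keyValues "host=example port=8080" gives [("host","example"),("port","8080")].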

capturesAllOpt :: Option -> Text -> Text -> [NonEmpty Text]

capturesAllOpt mempty = capturesAll

Since: 1.1.0

PCRE2-native substitution

sub

Arguments

:: Text	pattern
-> Text	replacement
-> Text	subject
-> Text	result

Perform at most one substitution. See the PCRE2 documentation of pcre2_substitute for the special syntax of replacement.

>>> sub "(\\w+) calling the (\\w+)" "$2 calling the $1" "the pot calling the kettle black"
"the kettle calling the pot black"

gsub :: Text -> Text -> Text -> Text

Perform substitutions globally.

>>> gsub "a" "o" "apples and bananas"
"opples ond bononos"

subOpt :: Option -> Text -> Text -> Text -> Text

subOpt mempty = sub
subOpt SubGlobal = gsub
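Two further sketches of native substitution follow; lastFirst and dotsToDollars are hypothetical names. The first refers to named captures from the replacement using PCRE2's ${name} syntax. The second combines SubGlobal with SubLiteral so that the replacement "$" is taken literally rather than parsed as the start of a capture reference.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Regex.Pcre2 (Option (SubGlobal, SubLiteral), sub, subOpt)

-- Swap two words, referring to the captures by name in the replacement.
lastFirst :: Text -> Text
lastFirst = sub "(?<first>\\w+) (?<last>\\w+)" "${last}, ${first}"

-- Replace every dot with a dollar sign. SubLiteral prevents "$" from
-- being interpreted as replacement syntax.
dotsToDollars :: Text -> Text
dotsToDollars = subOpt (SubGlobal <> SubLiteral) "\\." "$"
```

For example, lastFirst "Ada Lovelace" should give "Lovelace, Ada", and dotsToDollars "a.b.c" should give "a$b$c".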

Lens-powered matching and substitution

To use this portion of the library, there are two prerequisites:

A basic working understanding of optics. Many tutorials and videos exist online.
A library providing combinators. For lens newcomers, it is recommended to grab microlens-platform—most of the examples in this library work with it, packed and unpacked are included for working with Text, and it is upwards-compatible with the full lens library.

We expose a set of traversals that focus on matched substrings within a subject. Like the basic functional regexes, they should be "compiled" and memoized, rather than created inline.

_nee :: Traversal' Text Text
_nee = _match "(?i)\\bnee\\b"

In addition to getting results, they support global substitution through setting; more generally, they can accrete effects while performing replacements.

>>> promptNee = traverseOf (_nee . unpacked) $ \s -> print s >> getLine
>>> promptNee "We are the knights who say...NEE!"
"NEE"
NOO
"We are the knights who say...NOO!"
>>>

In general these traversals are not law-abiding.

_match :: Text -> Traversal' Text Text

Given a pattern, produce a traversal (0 or more targets) that focuses from a subject to the portions of it that match.

_match patt = _captures patt . ix 0
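A pure getting-and-setting sketch, assuming the microlens package is available for the combinators (the full lens library exports the same names); _word, wordsOf and shout are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as Text
import Lens.Micro (Traversal', over, toListOf)
import Text.Regex.Pcre2 (_match)

-- A "compiled" traversal focusing on each run of word characters.
_word :: Traversal' Text Text
_word = _match "\\w+"

-- Getting: list every target of the traversal.
wordsOf :: Text -> [Text]
wordsOf = toListOf _word

-- Setting: rewrite every target in place, leaving the rest untouched.
shout :: Text -> Text
shout = over _word Text.toUpper
```

For example, wordsOf "say nee!" gives ["say","nee"] and shout "say nee!" gives "SAY NEE!".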

_matchOpt :: Option -> Text -> Traversal' Text Text

_matchOpt mempty = _match

_captures :: Text -> Traversal' Text (NonEmpty Text)

Given a pattern, produce a traversal (0 or more targets) that focuses from a subject to each non-empty list of captures that pattern matches globally.

Substitution works in the following way: If a capture is set such that the new Text is not equal to the old one, a substitution occurs, otherwise it doesn't. This matters in cases where a capture encloses another capture—notably, all parenthesized captures are enclosed by the 0th.

>>> threeAndMiddle = _captures ". (.) ."
>>> "A A A" & threeAndMiddle .~ "A A A" :| ["B"]
"A B A"
>>> "A A A" & threeAndMiddle .~ "A B A" :| ["A"]
"A B A"

Changing multiple overlapping captures won't do what you want and is unsupported.

Changing an unset capture is unsupported because the PCRE2 match API does not give location info about it. Currently we ignore all such attempts. (Native substitution functions like sub do not have this limitation. See also SubUnknownUnset and SubUnsetEmpty.)

If the list becomes longer for some reason, the extra elements are ignored. If it's shortened, the absent elements are considered to be unchanged.

It's recommended that the list be modified capture-wise, using ix.

let madlibs = _captures "(\\w+) my (\\w+)"

print $ "Well bust my buttons!" &~ do
    zoom madlibs $ do
        ix 1 . _head .= 'd'
        ix 2 %= Text.reverse
    _last .= '?'

-- "Well dust my snottub?"

_capturesOpt :: Option -> Text -> Traversal' Text (NonEmpty Text)

_capturesOpt mempty = _captures

Compile-time validation

Whatever its virtues, the API thus far has some fragility arising from various scenarios:

pattern malformation such as mismatched parentheses (runtime error)
out-of-bounds indexing of a capture group list (runtime error)
out-of-bounds ixing of a Traversal' target (spurious failure to match)
case expression containing a Haskell list pattern of the wrong length (spurious failure to match)
regex created and discarded inline (suboptimal performance)
precariously many backslashes in a pattern. Matching a literal backslash requires the sequence "\\\\"!

Using a combination of language extensions and pattern introspection features, we provide a Template Haskell API to mitigate these scenarios. To use it, the following extensions must be enabled:

Extension	Required for	When
`DataKinds`	`Nat`s (numbers), `Symbol`s (strings), and other type-level data powering compile-time capture lookups	Using `regex`/`_regex` with a pattern containing parenthesized captures
`QuasiQuotes`	`[f|...|]` syntax	Always
`TypeApplications`	`@i` syntax for supplying type index arguments to applicable functions	Using `capture`/`_capture`
`ViewPatterns`	Running code and binding variables in pattern context proper (pattern guards are off-limits for this)	Using `regex` as a Haskell pattern

The inspiration for this portion of the library is Ruby, which supports regular expressions with superior ergonomics.

Quasi-quoters

regex :: QuasiQuoter

As an expression

regex :: (Alternative f) => String -> Text -> f (Captures info)

in the presence of parenthesized captures, or

regex :: (Alternative f) => String -> Text -> f Text

if there are none. In other words, if there is more than the 0th capture, this behaves like capturesA (except returning an opaque Captures instead of a list), otherwise it behaves like match.

To retrieve an individual capture from a Captures, use capture.

case [regex|(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})|] "submitted 2020-10-20" of
    Just cs ->
        let date = capture @0 cs
            year = read @Int $ Text.unpack $ capture @"y" cs
            ...

forM_ ([regex|\s+$|] line :: Maybe Text) $ \spaces -> error $
    "line has trailing spaces (" ++ show (Text.length spaces) ++ " characters)"

As a pattern

This matches when the regex first matches, whereupon any named captures are bound to variables of the same names.

case "submitted 2020-10-20" of
    [regex|(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})|] ->
        let year = read @Int $ Text.unpack y
            ...

Note that it is not possible to access the 0th capture this way. As a workaround, explicitly capture the whole pattern and name it.

If there are no named captures, this simply acts as a guard.
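A complete version of the pattern form might look like the following sketch; isoDateParts and sample are hypothetical names. The named captures y, m and d become ordinary variables on the right-hand side of the equation.

```haskell
{-# LANGUAGE DataKinds, OverloadedStrings, QuasiQuotes, ViewPatterns #-}

import Data.Text (Text)
import Text.Regex.Pcre2 (regex)

-- The first equation applies when the regex matches somewhere in the
-- argument; y, m and d are bound to the named captures.
isoDateParts :: Text -> Maybe (Text, Text, Text)
isoDateParts [regex|(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})|] = Just (y, m, d)
isoDateParts _ = Nothing

sample :: Maybe (Text, Text, Text)
sample = isoDateParts "submitted 2020-10-20"
```

Here sample is expected to be Just ("2020","10","20"). No backslash doubling is needed inside the quasi-quotation.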

_regex :: QuasiQuoter

A global, optical variant of regex. Can only be used as an expression.

_regex :: String -> Traversal' Text (Captures info)
_regex :: String -> Traversal' Text Text

import Control.Lens
import Data.Text.Lens

embeddedNumber :: Traversal' String Int
embeddedNumber = packed . [_regex|\d+|] . unpacked . _Show

main :: IO ()
main = putStrLn $ "There are 14 competing standards" & embeddedNumber %~ (+ 1)

-- There are 15 competing standards

Type-indexed capture groups

data Captures (info :: CapturesInfo)

A wrapper around a list of captures that carries additional type-level information about the number and names of those captures.

This type is only intended to be created by regex/_regex and consumed by capture/_capture, relying on type inference. Specifying the info explicitly in a type signature is not supported—the definition of CapturesInfo is not part of the public API and may change without warning.

After obtaining Captures it's recommended to immediately consume them and transform them into application-level data, to avoid leaking the types to top level and having to write signatures. In times of need, "Captures _" may be written with the help of {-# LANGUAGE PartialTypeSignatures #-}.

capture :: forall i info num. (CaptNum i info ~ num, KnownNat num) => Captures info -> Text

Safely look up a capture in a Captures result obtained from a Template Haskell-generated matching function.

The ugly type signature may be interpreted like this: Given some capture group index i and some info about a regex, ensure that index exists and is resolved to the number num at compile time. Then, at runtime, get a capture group from a list of captures.

In practice the variable i is specified by type application and the other variables are inferred.

capture @3
capture @"bar"

Specifying a nonexistent number or name will result in a type error.
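Putting the pieces together, the following sketch consumes a Captures value immediately so that its type never needs to be written down; toInt and parseDate are hypothetical names. It mixes lookup by name and by number.

```haskell
{-# LANGUAGE DataKinds, OverloadedStrings, QuasiQuotes, TypeApplications #-}

import Data.Text (Text)
import qualified Data.Text as Text
import Text.Regex.Pcre2 (capture, regex)

-- Safe to use read here only because each capture matched \d+ digits.
toInt :: Text -> Int
toInt = read . Text.unpack

-- Look up two captures by name and one by number (group 3 is "d").
parseDate :: Text -> Maybe (Int, Int, Int)
parseDate subject = do
    cs <- [regex|(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})|] subject
    pure (toInt (capture @"y" cs), toInt (capture @"m" cs), toInt (capture @3 cs))
```

For example, parseDate "submitted 2020-10-20" is expected to give Just (2020,10,20). Writing capture @4 or capture @"z" instead should be rejected at compile time.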

_capture :: forall i info num. (CaptNum i info ~ num, KnownNat num) => Lens' (Captures info) Text

Like capture but focus from a Captures to a capture.

Options

data Option

A Monoid representing nearly every facility PCRE2 presents for tweaking the behavior of regex compilation and execution.

All library functions that take options have the suffix Opt in their names; for each of them, there's also a non-Opt convenience function that simply has the (unexported) mempty option. For many uses, options won't be needed.

Some options can be enabled by special character sequences in the pattern as an alternative to specifying them as an Option. See Caseless for example.

Documentation is scant here. For more complete, accurate information, including discussions of corner cases arising from specific combinations of options and pattern items, please see the C API documentation.
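Since Option is a Monoid, several options can be combined with <> before being handed to an Opt function. A minimal sketch follows, with the hypothetical names hasTodoLine and hasTodoLine'; the second definition requests the same two options from within the pattern.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Regex.Pcre2 (Option (Caseless, Multiline), matches, matchesOpt)

-- Does any line begin with "todo", in any letter case?
hasTodoLine :: Text -> Bool
hasTodoLine = matchesOpt (Caseless <> Multiline) "^todo\\b"

-- The same options, set inside the pattern with (?im).
hasTodoLine' :: Text -> Bool
hasTodoLine' = matches "(?im)^todo\\b"
```

For example, both give True for "intro\nTODO: fix this" and False for "nothing todo here".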

Constructors

AllowEmptyClass	Make `[]` not match anything, rather than counting the `]` as the first character of the class.
AltBsux	Like `AltBsuxLegacy`, except with ECMAScript 6 hex literal feature for `\u`.
AltBsuxLegacy	Behave like ECMAScript 5 for `\U`, `\u`, and `\x`. See `AltBsux`.
AltCircumflex	Match a `^` after a newline at the end of the subject. Only relevant in multiline mode.
AltVerbNames	Enable backslash escapes in verb names. E.g., `(*MARK:L$O$L)`.
Anchored	Equivalent to beginning pattern with `^`.
AutoCallout	Run callout for every pattern item. Only relevant if a callout is set.
BadEscapeIsLiteral	Do not throw an error for unrecognized or malformed escapes. "This is a dangerous option."
Bsr Bsr	Override what `\R` matches (default given by `defaultBsr`).
Caseless	Case-insensitive match. Equivalent to `(?i)`.
DepthLimit Word32	Override maximum depth of nested backtracking (default given by `defaultDepthLimit`). Equivalent to `(*LIMIT_DEPTH=number)`.
DollarEndOnly	Don't match `$` with a newline at the end of the subject.
DotAll	A dot also matches a (single-character) newline. Equivalent to `(?s)`.
DupNames	Allow non-unique capture names.
EndAnchored	More or less like ending pattern with `$`.
EscapedCrIsLf	Interpret `\r` as `\n`.
Extended	In the pattern, ignore whitespace, and enable comments starting with `#`. Equivalent to `(?x)`.
ExtendedMore	Like `Extended` but also ignore spaces and tabs within `[]`.
FirstLine	The match must begin in the first line of the subject.
HeapLimit Word32	Override maximum heap memory (in kibibytes) used to hold backtracking information (default given by `defaultHeapLimit`). Equivalent to `(*LIMIT_HEAP=number)`.
Literal	Treat the pattern as a literal string.
MatchLimit Word32	Override maximum value of the main matching loop's internal counter (default given by `defaultMatchLimit`), as a simple CPU throttle. Equivalent to `(*LIMIT_MATCH=number)`.
MatchLine	Only match complete lines. Equivalent to bracketing the pattern with `^(?:pattern)$`.
MatchUnsetBackRef	A backreference to an unset capture group matches an empty string.
MatchWord	Only match subjects that have word boundaries at the beginning and end. Equivalent to bracketing the pattern with `\b(?:pattern)\b`.
MaxPatternLength Word64	Default is `maxBound`.
Multiline	`^` and `$` mean "beginning/end of a line" rather than "beginning/end of the subject". Equivalent to `(?m)`.
NeverBackslashC	Do not allow the unsafe `\C` sequence.
NeverUcp	Don't count Unicode characters in some character classes such as `\d`. Overrides `(*UCP)`.
Newline Newline	Override what a newline is (default given by `defaultNewline`). Equivalent to `(*CRLF)` or similar.
NoAutoCapture	Disable numbered capturing parentheses.
NoAutoPossess	Turn off some optimizations, possibly resulting in some callouts not being called.
NoDotStarAnchor	Turn off an optimization involving `.*`, possibly resulting in some callouts not being called.
NoStartOptimize	Turn off some optimizations normally performed at the beginning of a pattern.
NotBol	First character of subject is not the beginning of line. Only affects `^`.
NotEmpty	The 0th capture doesn't match if it would be empty.
NotEmptyAtStart	The 0th capture doesn't match if it would be empty and at the beginning of the subject.
NotEol	End of subject is not the end of line. Only affects `$`.
OffsetLimit Word64	Limit how far an unanchored search can advance in the subject.
ParensLimit Word32	Override max depth of nested parentheses (default given by `defaultParensLimit`).
PartialHard	If the subject ends without finding a complete match, stop trying alternatives and signal a partial match immediately. Currently we do this by throwing a `Pcre2Exception` but we should do better.
PartialSoft	If the subject ends and all alternatives have been tried, but no complete match is found, signal a partial match. Currently we do this by throwing a `Pcre2Exception` but we should do better.
SubGlobal	Affects `subOpt`. Replace all, rather than just the first.
SubLiteral	Affects `subOpt`. Treat the replacement as a literal string.
SubReplacementOnly	Affects `subOpt`. Return just the rendered replacement instead of it within the subject. With `SubGlobal`, all results are concatenated.
SubUnknownUnset	Affects `subOpt`. References in the replacement to non-existent captures don't error but are treated as unset.
SubUnsetEmpty	Affects `subOpt`. References in the replacement to unset captures don't error but are treated as empty.
Ucp	Count Unicode characters in some character classes such as `\d`. Incompatible with `NeverUcp`.
Ungreedy	Invert the greediness of quantifiers: they are non-greedy by default and become greedy when followed by `?`. Equivalent to `(?U)`.
UnsafeCallout (CalloutInfo -> IO CalloutResult)	Run the given callout at every callout point (see the docs for more info). Multiples of this option before the rightmost are ignored. NOTE: The callout is run via `unsafePerformIO` within pure code!
UnsafeCompileRecGuard (Int -> IO Bool)	Run the given guard on every new descent into a level of parentheses, passing the current depth as argument. Returning `False` aborts pattern compilation with an exception. Multiples of this option before the rightmost are ignored. NOTE: Currently (PCRE2 version 10.35) patterns seem to be parsed in two passes, both times triggering the recursion guard. Also, it is triggered at the beginning of the pattern, passing 0. None of this is documented; expect the unexpected in the presence of side effects! NOTE: The guard is run via `unsafePerformIO` within pure code!
UnsafeSubCallout (SubCalloutInfo -> IO SubCalloutResult)	Run the given callout on every substitution. This is at most once unless `SubGlobal` is set. Multiples of this option before the rightmost are ignored. NOTE: The callout is run via `unsafePerformIO` within pure code!
Utf	Treat both the pattern and subject as UTF rather than fixed-width 16-bit code units.

Instances

Semigroup Option (defined in Text.Regex.Pcre2.Internal)
Monoid Option (defined in Text.Regex.Pcre2.Internal)

data Bsr

What \R, backslash R, can mean.

Constructors

BsrUnicode	any Unicode line ending sequence
BsrAnyCrlf	`\r`, `\n`, or `\r\n`

Instances

Eq Bsr (defined in Text.Regex.Pcre2.Internal)
Show Bsr (defined in Text.Regex.Pcre2.Internal)

data Newline

What's considered a newline.

Constructors

NewlineCr	`\r` only
NewlineLf	`\n` only
NewlineCrlf	`\r\n` only
NewlineAny	any Unicode line ending sequence
NewlineAnyCrlf	`\r`, `\n`, or `\r\n`
NewlineNul	binary zero

Instances

Eq Newline (defined in Text.Regex.Pcre2.Internal)
Show Newline (defined in Text.Regex.Pcre2.Internal)

Callout interface

data CalloutInfo

Input for user-defined callouts.

Constructors

CalloutInfo

Fields

calloutIndex :: CalloutIndex
The index of which callout point we're on.
calloutCaptures :: NonEmpty (Maybe Text)
The captures that have been set so far.
calloutSubject :: Text
The original subject.
calloutMark :: Maybe Text
The name of the most recently passed (*MARK), (*PRUNE), or (*THEN), if any.
calloutIsFirst :: Bool
Is this the first callout after the start of matching?
calloutBacktracked :: Bool
Has a backtrack occurred since the previous callout, or the beginning of matching if no previous callouts?

Instances

Eq CalloutInfo (defined in Text.Regex.Pcre2.Internal)
Show CalloutInfo (defined in Text.Regex.Pcre2.Internal)

data CalloutIndex

What caused the callout.

Constructors

CalloutNumber Int	Numerical callout.
CalloutName Text	String callout.
CalloutAuto Int Int	The item located at this half-open range of offsets within the pattern. See `AutoCallout`.

Instances

Eq CalloutIndex (defined in Text.Regex.Pcre2.Internal)
Show CalloutIndex (defined in Text.Regex.Pcre2.Internal)

data CalloutResult

Callout functions return one of these values, which dictates what happens next in the match.

Constructors

CalloutProceed	Keep going.
CalloutNoMatchHere	Fail the current capture, but not the whole match. For example, backtracking may occur.
CalloutNoMatch	Fail the whole match.
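To make the effect of these results concrete, here is a deliberately trivial sketch: the hypothetical helper abWith installs a callout that ignores its CalloutInfo and always returns the same verdict. The pattern contains one numbered callout point, (?C1), between the two letters. A real callout would normally inspect fields such as calloutIndex or calloutCaptures before deciding.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Regex.Pcre2 (CalloutResult (..), Option (UnsafeCallout), matchOpt)

-- Hypothetical helper: match "ab" with a callout point between the
-- letters, answering every callout with the given verdict.
abWith :: CalloutResult -> Text -> Maybe Text
abWith verdict = matchOpt (UnsafeCallout (\_ -> pure verdict)) "a(?C1)b"
```

Under these assumptions, abWith CalloutProceed "ab" should give Just "ab", whereas abWith CalloutNoMatch "ab" should give Nothing because the callout vetoes the match.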

Instances

Eq CalloutResult (defined in Text.Regex.Pcre2.Internal)
Show CalloutResult (defined in Text.Regex.Pcre2.Internal)

Substitution callouts

data SubCalloutInfo

Input for user-defined substitution callouts.

Constructors

SubCalloutInfo

Fields

subCalloutSubsCount :: Int
The 1-based index of which substitution we're on. Only goes past 1 during global substitutions.
subCalloutCaptures :: NonEmpty (Maybe Text)
The captures that have been set so far.
subCalloutSubject :: Text
The original subject.
subCalloutReplacement :: Text
The replacement.

Instances

Eq SubCalloutInfo (defined in Text.Regex.Pcre2.Internal)
Show SubCalloutInfo (defined in Text.Regex.Pcre2.Internal)

data SubCalloutResult

Substitution callout functions return one of these values, which dictates what happens next in the substitution.

Constructors

SubCalloutAccept	Succeed, and keep going if in global mode.
SubCalloutSkip	Do not perform this substitution, but keep going if in global mode.
SubCalloutAbort	Do not perform this or any subsequent substitutions.
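As a sketch of how a substitution callout might be used, the following caps a global substitution at two replacements; firstTwoOnly and replaceFirstTwo are hypothetical names. It relies on subCalloutSubsCount being 1-based, as stated above, and on SubCalloutAbort leaving the remainder of the subject unchanged.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Regex.Pcre2
    ( Option (SubGlobal, UnsafeSubCallout)
    , SubCalloutInfo (..)
    , SubCalloutResult (..)
    , subOpt
    )

-- Accept the first two substitutions and abort from the third onward.
firstTwoOnly :: SubCalloutInfo -> IO SubCalloutResult
firstTwoOnly info
    | subCalloutSubsCount info > 2 = pure SubCalloutAbort
    | otherwise                    = pure SubCalloutAccept

-- A global "a" -> "o" substitution limited by the callout above.
replaceFirstTwo :: Text -> Text
replaceFirstTwo = subOpt (SubGlobal <> UnsafeSubCallout firstTwoOnly) "a" "o"
```

Under these assumptions, replaceFirstTwo "banana" should give "bonona".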

Instances

Eq SubCalloutResult (defined in Text.Regex.Pcre2.Internal)
Show SubCalloutResult (defined in Text.Regex.Pcre2.Internal)

User errors

data SomePcre2Exception

The root of the PCRE2 exception hierarchy.

Instances

Show SomePcre2Exception (defined in Text.Regex.Pcre2.Internal)
Exception SomePcre2Exception (defined in Text.Regex.Pcre2.Internal)

data Pcre2Exception

Vanilla PCRE2 exceptions with messages generated by the underlying C library.

Instances

Show Pcre2Exception (defined in Text.Regex.Pcre2.Internal)
Exception Pcre2Exception (defined in Text.Regex.Pcre2.Internal)

data Pcre2CompileException

PCRE2 compile exceptions. Along with a message stating the cause, we show the pattern with a cursor pointing at where the error is (if not after the last character).

Instances

Show Pcre2CompileException (defined in Text.Regex.Pcre2.Internal)
Exception Pcre2CompileException (defined in Text.Regex.Pcre2.Internal)