pcre-heavy-0.2.4: A regexp library on top of pcre-light you can actually use.

Safe HaskellNone
LanguageHaskell2010

Text.Regex.PCRE.Heavy

Contents

Description

A usable regular expressions library on top of pcre-light.

Synopsis

Matching

(=~) :: Stringable a => a -> Regex -> Bool Source

Checks whether a string matches a regex.

>>> :set -XQuasiQuotes
>>> "https://unrelenting.technology" =~ [re|^http.*|]
True

(≈) :: Stringable a => a -> Regex -> Bool Source

Same as =~.

scan :: Stringable a => Regex -> a -> [(a, [a])] Source

Searches the string for all matches of a given regex.

>>> scan [re|\s*entry (\d+) (\w+)\s*&?|] " entry 1 hello  &entry 2 hi"
[(" entry 1 hello  &",["1","hello"]),("entry 2 hi",["2","hi"])]

It is lazy! If you only need the first match, just apply head (or headMay from the "safe" library) -- no extra work will be performed!

>>> head $ scan [re|\s*entry (\d+) (\w+)\s*&?|] " entry 1 hello  &entry 2 hi"
(" entry 1 hello  &",["1","hello"])

scanO :: Stringable a => Regex -> [PCREExecOption] -> a -> [(a, [a])] Source

Exactly like scan, but passes runtime options to PCRE.

scanRanges :: Stringable a => Regex -> a -> [((Int, Int), [(Int, Int)])] Source

Searches the string for all matches of a given regex, like scan, but returns positions inside of the string.

>>> scanRanges [re|\s*entry (\d+) (\w+)\s*&?|] " entry 1 hello  &entry 2 hi"
[((0,17),[(7,8),(9,14)]),((17,27),[(23,24),(25,27)])]

And just like scan, it's lazy.

scanRangesO :: Stringable a => Regex -> [PCREExecOption] -> a -> [((Int, Int), [(Int, Int)])] Source

Exactly like scanRanges, but passes runtime options to PCRE.

Replacement

sub :: (Stringable a, RegexReplacement r) => Regex -> r -> a -> a Source

Replaces the first occurence of a given regex.

>>> sub [re|thing|] "world" "Hello, thing thing" :: String
"Hello, world thing"
>>> sub [re|a|] "b" "c" :: String
"c"
>>> sub [re|bad|] "xxxbad" "this is bad, right?"
"this is xxxbad, right?"

You can use functions! A function of Stringable gets the full match. A function of [Stringable] gets the groups. A function of Stringable → [Stringable] gets both.

>>> sub [re|%(\d+)(\w+)|] (\(d:w:_) -> "{" ++ d ++ " of " ++ w ++ "}" :: String) "Hello, %20thing" :: String
"Hello, {20 of thing}"

subO :: (Stringable a, RegexReplacement r) => Regex -> [PCREExecOption] -> r -> a -> a Source

Exactly like sub, but passes runtime options to PCRE.

gsub :: (Stringable a, RegexReplacement r) => Regex -> r -> a -> a Source

Replaces all occurences of a given regex.

See sub for more documentation.

>>> gsub [re|thing|] "world" "Hello, thing thing" :: String
"Hello, world world"
>>> gsub [re||] "" "Hello, world" :: String
"Hello, world"

https://github.com/myfreeweb/pcre-heavy/issues/2 >>> gsub [re|good|] "bad" "goodgoodgood" "badbadbad"

>>> gsub [re|bad|] "xxxbad" "this is bad, right? bad"
"this is xxxbad, right? xxxbad"

gsubO :: (Stringable a, RegexReplacement r) => Regex -> [PCREExecOption] -> r -> a -> a Source

Exactly like gsub, but passes runtime options to PCRE.

Splitting

split :: Stringable a => Regex -> a -> [a] Source

Splits the string using the given regex.

Is lazy.

>>> split [re|%(begin|next|end)%|] "%begin%hello%next%world%end%"
["","hello","world",""]
>>> split [re|%(begin|next|end)%|] ""
[""]

splitO :: Stringable a => Regex -> [PCREExecOption] -> a -> [a] Source

Exactly like split, but passes runtime options to PCRE.

QuasiQuoter

re :: QuasiQuoter Source

A QuasiQuoter for regular expressions that does a compile time check.

mkRegexQQ :: [PCREOption] -> QuasiQuoter Source

Returns a QuasiQuoter like re, but with given PCRE options.

Types and stuff from pcre-light

data Regex :: *

An abstract pointer to a compiled PCRE Regex structure The structure allocated by the PCRE library will be deallocated automatically by the Haskell storage manager.

Instances

data PCREOption :: *

A type for PCRE compile-time options. These are newtyped CInts, which can be bitwise-or'd together, using '(Data.Bits..|.)'

compileM :: ByteString -> [PCREOption] -> Either String Regex

compileM A safe version of compile with failure wrapped in an Either.

Examples,

> compileM ".*" [] :: Either String Regex
Right (Regex 0x000000004bb5b980 ".*")
> compileM "*" [] :: Either String Regex
Left "nothing to repeat"

Advanced raw stuff

rawMatch :: Regex -> ByteString -> Int -> [PCREExecOption] -> Maybe [(Int, Int)] Source

Does raw PCRE matching (you probably shouldn't use this directly).

>>> :set -XOverloadedStrings
>>> rawMatch [re|\w{2}|] "a a ab abc ba" 0 []
Just [(4,6)]
>>> rawMatch [re|\w{2}|] "a a ab abc ba" 6 []
Just [(7,9)]
>>> rawMatch [re|(\w)(\w)|] "a a ab abc ba" 0 []
Just [(4,6),(4,5),(5,6)]

rawSub :: RegexReplacement r => Regex -> r -> ByteString -> Int -> [PCREExecOption] -> Maybe (ByteString, Int) Source