regex-0.6.0.0: Toolkit for regex-base

Copyright(C) 2016-17 Chris Dornan
LicenseBSD3 (see the LICENSE file)
MaintainerChris Dornan <chris.dornan@irisconnect.com>
StabilityRFC
Portabilityportable
Safe HaskellNone
LanguageHaskell2010

Text.RE

Contents

Description

 

Synopsis

Tutorial

We have a regex tutorial at http://tutorial.regex.uk. These API docs are mainly for reference.

How to use this library

This module won't provide any operators to match a regular expression against text as it merely provides the toolkit for working with the output of the match operators. You probably won't import it directly but import one of the modules that provides the match operators, which will in tuen re-export this module.

The module that you choose to import will depend upon two factors:

  • Which flavour of regular expression do you want to use? If you want Posix flavour REs then you want the TDFA modules, otherwise its PCRE for Perl-style REs.
  • What type of text do you want to match: (slow) Strings, ByteString, ByteString.Lazy, Text, Text.Lazy or the anachronistic Seq Char or indeed a good old-fashioned polymorphic operators?

While we aim to provide all combinations of these choices, some of them are currently not available. We have:

The Match Operators

The traditional =~ and =~~ operators are exported by the regex, but we recommend that you use the two new operators, especially if you are not familiar with the old operators. We have:

  • txt ?=~ re searches for a single match yielding a value of type Match a where a is the type of the text you are searching.
  • txt *=~ re searches for all non-overlapping matches in txt, returning a value of type Matches a.

See the sections below for more information on these Matches and Match result types.

Matches, Match & Capture

data Matches a Source

the result type to use when every match is needed, not just the first match of the RE against the source

Constructors

Matches 

Fields

matchesSource :: !a

the source text being matched

allMatches :: ![Match a]

all captures found, left to right

Instances

Functor Matches 
(RegexContext regex source [MatchText source], RegexLike regex source) => RegexContext regex source (Matches source)

for matching all REs against the source text

Eq a => Eq (Matches a) 
Show a => Show (Matches a) 
Typeable (* -> *) Matches 

data Match a Source

the result of matching a RE to a text once, listing the text that was matched and the named captures in the RE and all of the substrings matched, with the text captured by the whole RE; a complete failure to match will be represented with an empty array (with bounds (0,-1))

Constructors

Match 

Fields

matchSource :: !a

the whole source text

captureNames :: !CaptureNames

the RE's capture names

matchArray :: !(Array CaptureOrdinal (Capture a))
  1. .n-1 captures, starting with the text matched by the whole RE

Instances

Functor Match 
(RegexContext regex source (AllTextSubmatches (Array Int) (source, (Int, Int))), RegexLike regex source) => RegexContext regex source (Match source)

for matching just the first RE against the source text

Eq a => Eq (Match a) 
Show a => Show (Match a) 
Typeable (* -> *) Match 

data Capture a Source

the matching of a single sub-expression against part of the source text

Constructors

Capture 

Fields

captureSource :: !a

the whole text that was searched

capturedText :: !a

the text that was matched

captureOffset :: !Int

the number of characters preceding the match with -1 used if no text was captured by the RE (not even the empty string)

captureLength :: !Int

the number of chacter in the captured sub-string

Instances

Functor Capture 
Eq a => Eq (Capture a) 
Show a => Show (Capture a) 

noMatch :: a -> Match a Source

Construct a Match that does not match anything.

Matches functions

anyMatches :: Matches a -> Bool Source

tests whether the RE matched the source text at all

countMatches :: Matches a -> Int Source

count the matches

matches :: Matches a -> [a] Source

mainCaptures :: Matches a -> [Capture a] Source

extract the main capture from each match

Match functions

matched :: Match a -> Bool Source

tests whether the RE matched the source text at all

matchedText :: Match a -> Maybe a Source

tests whether the RE matched the source text at all

matchCapture :: Match a -> Maybe (Capture a) Source

the top-level capture if the source text matched the RE, Nothing otherwise

matchCaptures :: Match a -> Maybe (Capture a, [Capture a]) Source

the top-level capture and the sub captures if the text matched the RE, Nothing otherwise

(!$$) :: Match a -> CaptureID -> a infixl 9 Source

an alternative for captureText

captureText :: CaptureID -> Match a -> a Source

look up the text of the nth capture, 0 being the match of the whole RE against the source text, 1, the first bracketed sub-expression to be matched and so on

(!$$?) :: Match a -> CaptureID -> Maybe a Source

an alternative for captureTextMaybe

captureTextMaybe :: CaptureID -> Match a -> Maybe a Source

look up the text of the nth capture (0 being the match of the whole), returning Nothing if the Match doesn't contain the capture

(!$) :: Match a -> CaptureID -> Capture a infixl 9 Source

an alternative for capture

capture :: CaptureID -> Match a -> Capture a Source

look up the nth capture, 0 being the match of the whole RE against the source text, 1, the first bracketed sub-expression to be matched and so on

(!$?) :: Match a -> CaptureID -> Maybe (Capture a) Source

an alternative for capture captureMaybe

captureMaybe :: CaptureID -> Match a -> Maybe (Capture a) Source

look up the nth capture, 0 being the match of the whole RE against the source text, 1, the first bracketed sub-expression to be matched and so on, returning Nothing if there is no such capture, or if the capture failed to capture anything (being in a failed alternate)

Capture functions

hasCaptured :: Capture a -> Bool Source

test if the capture has matched any text

capturePrefix :: Extract a => Capture a -> a Source

returns the text preceding the match

captureSuffix :: Extract a => Capture a -> a Source

returns the text after the match

IsRegex

class Replace s => IsRegex re s where Source

Methods

matchOnce :: re -> s -> Match s Source

matchMany :: re -> s -> Matches s Source

regexSource :: re -> String Source

Instances

IsRegex RE String 
IsRegex RE ByteString 
IsRegex RE ByteString 
IsRegex RE Text 
IsRegex RE Text 
IsRegex RE (Seq Char) 

Options

data Options_ r c e Source

Constructors

Options 

Fields

optionsMacs :: !(Macros r)
 
optionsComp :: !c
 
optionsExec :: !e
 

Instances

IsOption Options RE CompOption ExecOption 
(Show r, Show c, Show e) => Show (Options_ r c e) 

class IsOption o r c e | e -> r, c -> e, e -> c, r -> c, c -> r, r -> e where Source

Methods

makeOptions :: o -> Options_ r c e Source

Instances

IsOption () RE CompOption ExecOption 
IsOption SimpleRegexOptions RE CompOption ExecOption 
IsOption ExecOption RE CompOption ExecOption 
IsOption CompOption RE CompOption ExecOption 
IsOption Options RE CompOption ExecOption 
IsOption (Macros RE) RE CompOption ExecOption 

newtype MacroID Source

Constructors

MacroID 

Fields

getMacroID :: String
 

Instances

Eq MacroID 
Ord MacroID 
Show MacroID 
IsString MacroID 
Hashable MacroID 
IsOption (Macros RE) RE CompOption ExecOption 

type Macros r = HashMap MacroID r Source

CaptureID

data CaptureID Source

CaptureID identifies captures, either by number (e.g., [cp|1|]) or name (e.g., [cp|foo|]).

type CaptureNames = HashMap CaptureName CaptureOrdinal Source

the dictionary for named captures stored in compiled regular expressions associates

newtype CaptureName Source

a CaptureName is just the text of the name

Constructors

CaptureName 

Fields

getCaptureName :: Text
 

newtype CaptureOrdinal Source

a CaptureOrdinal is just the number of the capture, starting with 0 for the whole of the text matched, then in leftmost, outermost

Constructors

CaptureOrdinal 

Edit

data Edits m re s Source

Constructors

Select [(re, Edit m s)] 
Pipe [(re, Edit m s)] 

data Edit m s Source

Constructors

Template s 
Function Context (LineNo -> Match s -> Location -> Capture s -> m (Maybe s)) 
LineEdit (LineNo -> Matches s -> m (LineEdit s)) 

data LineEdit s Source

Constructors

NoEdit 
ReplaceWith s 
Delete 

Instances

Show s => Show (LineEdit s) 

applyEdits :: (IsRegex re s, Monad m, Functor m) => LineNo -> Edits m re s -> s -> m s Source

applyEdit :: (IsRegex re s, Monad m, Functor m) => (s -> s) -> LineNo -> re -> Edit m s -> s -> m (Maybe s) Source

applyLineEdit :: Monoid s => (s -> s) -> LineEdit s -> Maybe s Source

LineNo

newtype LineNo Source

Constructors

ZeroBasedLineNo 

Instances

Parsers

parseString :: Replace a => a -> Maybe Text Source

parseDate :: Replace a => a -> Maybe Day Source

parseTimeOfDay :: Replace a => a -> Maybe TimeOfDay Source

parseTimeZone :: Replace a => a -> Maybe TimeZone Source

parseDateTime :: Replace a => a -> Maybe UTCTime Source

parseDateTime8601 :: Replace a => a -> Maybe UTCTime Source

parseDateTimeCLF :: Replace a => a -> Maybe UTCTime Source

severityKeywords :: Severity -> (Text, [Text]) Source

Replace

class (Extract a, Monoid a) => Replace a where Source

Replace provides the missing methods needed to replace the matched text; lengthE is the minimum implementation

Minimal complete definition

lengthE, packE, unpackE, parseTemplateE

Methods

lengthE :: a -> Int Source

length function for a

packE :: String -> a Source

inject String into a

unpackE :: a -> String Source

project a onto a String

textifyE :: a -> Text Source

inject into Text

detextifyE :: Text -> a Source

project Text onto a

appendNewlineE :: a -> a Source

append a newline

substE :: (a -> a) -> Capture a -> a Source

apply a substitution function to a Capture

parseTemplateE :: a -> Match a -> Location -> Capture a -> Maybe a Source

convert a template containing $0, $1, etc., in the first argument, into a phi replacement function for use with replaceAllCaptures and replaceCaptures

Instances

Replace ByteString 
Replace ByteString 
Replace Text 
Replace Text 
Replace [Char] 
Replace (Seq Char) 

data ReplaceMethods a Source

a selction of the Replace methods can be encapsulated with ReplaceMethods for the higher-order replacement functions

Constructors

ReplaceMethods 

Fields

methodLength :: a -> Int
 
methodSubst :: (a -> a) -> Capture a -> a
 

replaceMethods :: Replace a => ReplaceMethods a Source

replaceMethods encapsulates ReplaceMethods a from a Replace a context

data Context Source

Context specifies which contexts the substitutions should be applied

Constructors

TOP

substitutions should be applied to the top-level only, the text that matched the whole RE

SUB

substitutions should only be applied to the text captured by bracketed sub-REs

ALL

the substitution function should be applied to all captures, the top level and the sub-expression captures

Instances

data Location Source

the Location information passed into the substitution function specifies which sub-expression is being substituted

Constructors

Location 

Fields

locationMatch :: Int

the zero-based, i-th string to be matched, when matching all strings, zero when only the first string is being matched

locationCapture :: CaptureOrdinal

0, when matching the top-level string matched by the whole RE, 1 for the top-most, left-most redex captured by bracketed sub-REs, etc.

Instances

isTopLocation :: Location -> Bool Source

True iff the location references a complete match (i.e., not a bracketed capture)

replace :: Replace a => Match a -> a -> a Source

replace with a template containing $0 for whole text, $1 for first capture, etc.

replaceAll :: Replace a => a -> Matches a -> a Source

replace all with a template, $0 for whole text, $1 for first capture, etc.

replaceAllCaptures :: Replace a => Context -> (Match a -> Location -> Capture a -> Maybe a) -> Matches a -> a Source

substitutes using a function that takes the full Match context and returns the same replacement text as the _phi_phi context.

replaceAllCaptures_ :: Extract a => ReplaceMethods a -> Context -> (Match a -> Location -> Capture a -> Maybe a) -> Matches a -> a Source

replaceAllCaptures_ is like like replaceAllCaptures but takes the Replace methods through the ReplaceMethods argument

replaceAllCapturesM :: (Extract a, Monad m) => ReplaceMethods a -> Context -> (Match a -> Location -> Capture a -> m (Maybe a)) -> Matches a -> m a Source

replaceAllCapturesM is just a monadically generalised version of replaceAllCaptures_

replaceCaptures :: Replace a => Context -> (Match a -> Location -> Capture a -> Maybe a) -> Match a -> a Source

substitutes using a function that takes the full Match context and returns the same replacement text as the _phi_phi context.

replaceCaptures_ :: Extract a => ReplaceMethods a -> Context -> (Match a -> Location -> Capture a -> Maybe a) -> Match a -> a Source

replaceCaptures_ is like replaceCaptures but takes the Replace methods through the ReplaceMethods argument

replaceCapturesM :: (Monad m, Extract a) => ReplaceMethods a -> Context -> (Match a -> Location -> Capture a -> m (Maybe a)) -> Match a -> m a Source

replaceCapturesM is just a monadically generalised version of replaceCaptures_

expandMacros :: (r -> String) -> Macros r -> String -> String Source

expand all of the @{..} macros in the RE in the argument String according to the Macros argument, preprocessing the RE String according to the Mode argument (used internally)

expandMacros' :: (MacroID -> Maybe String) -> String -> String Source

expand the @{..} macos in the argument string using the given function

Tools

Grep

data Line Source

Constructors

Line 

Fields

getLineNumber :: LineNo
 
getLineMatches :: Matches ByteString
 

Instances

grep :: IsRegex re ByteString => re -> FilePath -> IO () Source

grepLines :: IsRegex re ByteString => re -> FilePath -> IO [Line] Source

type GrepScript re s t = [(re, LineNo -> Matches s -> Maybe t)] Source

grepScript :: IsRegex re s => GrepScript re s t -> [s] -> [t] Source

Lex

alex :: IsRegex re s => [(re, Match s -> Maybe t)] -> t -> s -> [t] Source

alex' :: Replace s => (re -> s -> Match s) -> [(re, Match s -> Maybe t)] -> t -> s -> [t] Source

Sed

type SedScript re = Edits IO re ByteString Source

sed :: IsRegex re ByteString => SedScript re -> FilePath -> FilePath -> IO () Source

sed' :: (IsRegex re ByteString, Monad m, Functor m) => Edits m re ByteString -> ByteString -> m ByteString Source