regex-0.1.0.0: Toolkit for regex-base

Copyright(C) 2016-17 Chris Dornan
LicenseBSD3 (see the LICENSE file)
MaintainerChris Dornan <chris.dornan@irisconnect.com>
StabilityRFC
Portabilityportable
Safe HaskellNone
LanguageHaskell2010

Text.RE

Contents

Description

 

Synopsis

Tutorial

We have a regex tutorial at http://tutorial.regex.uk. These API docs are mainly for reference.

How to use this library

This module won't provide any operators to match a regular expression against text as it merely provides the toolkit for working with the output of the match operators. You probably won't import it directly but import one of the modules that provides the match operators, which will in tuen re-export this module.

The module that you choose to import will depend upon two factors:

  • Which flavour of regular expression do you want to use? If you want Posix flavour REs then you want the TDFA modules, otherwise its PCRE for Perl-style REs.
  • What type of text do you want to match: (slow) Strings, ByteString, ByteString.Lazy, Text, Text.Lazy or the anachronistic Seq Char or indeed a good old-fashioned polymorphic operators?

While we aim to provide all combinations of these choices, some of them are currently not available. We have:

The Match Operators

The traditional =~ and =~~ operators are exported by the regex, but we recommend that you use the two new operators, especially if you are not familiar with the old operators. We have:

  • txt ?=~ re searches for a single match yielding a value of type Match a where a is the type of the text you are searching.
  • txt *=~ re searches for all non-overlapping matches in txt, returning a value of type Matches a.

See the sections below for more information on these Matches and Match result types.

Matches, Match, Capture Types and Functions

data Matches a Source

the result type to use when every match is needed, not just the first match of the RE against the source

Constructors

Matches 

Fields

matchesSource :: !a

the source text being matched

allMatches :: ![Match a]

all captures found, left to right

Instances

Functor Matches 
(RegexContext regex source [MatchText source], RegexLike regex source) => RegexContext regex source (Matches source)

for matching all REs against the source text

Eq a => Eq (Matches a) 
Show a => Show (Matches a) 

data Match a Source

the result of matching a RE to a text once, listing the text that was matched and the named captures in the RE and all of the substrings matched, with the text captured by the whole RE; a complete failure to match will be represented with an empty array (with bounds (0,-1))

Constructors

Match 

Fields

matchSource :: !a

the whole source text

captureNames :: !CaptureNames

the RE's capture names

matchArray :: !(Array CaptureOrdinal (Capture a))
  1. .n-1 captures, starting with the text matched by the whole RE

Instances

Functor Match 
(RegexContext regex source (AllTextSubmatches (Array Int) (source, (Int, Int))), RegexLike regex source) => RegexContext regex source (Match source)

for matching just the first RE against the source text

Eq a => Eq (Match a) 
Show a => Show (Match a) 

data Capture a Source

the matching of a single sub-expression against part of the source text

Constructors

Capture 

Fields

captureSource :: !a

the whole text that was searched

capturedText :: !a

the text that was matched

captureOffset :: !Int

the number of characters preceding the match with -1 used if no text was captured by the RE (not even the empty string)

captureLength :: !Int

the number of chacter in the captured sub-string

Instances

Functor Capture 
Eq a => Eq (Capture a) 
Show a => Show (Capture a) 

noMatch :: a -> Match a Source

Construct a Match that does not match anything.

Matches functions

anyMatches :: Matches a -> Bool Source

tests whether the RE matched the source text at all

countMatches :: Matches a -> Int Source

count the matches

matches :: Matches a -> [a] Source

mainCaptures :: Matches a -> [Capture a] Source

extract the main capture from each match

Match functions

matched :: Match a -> Bool Source

tests whether the RE matched the source text at all

matchedText :: Match a -> Maybe a Source

tests whether the RE matched the source text at all

matchCapture :: Match a -> Maybe (Capture a) Source

the top-level capture if the source text matched the RE, Nothing otherwise

matchCaptures :: Match a -> Maybe (Capture a, [Capture a]) Source

the top-level capture and the sub captures if the text matched the RE, Nothing otherwise

(!$$) :: Match a -> CaptureID -> a infixl 9 Source

an alternative for captureText

captureText :: CaptureID -> Match a -> a Source

look up the text of the nth capture, 0 being the match of the whole RE against the source text, 1, the first bracketed sub-expression to be matched and so on

(!$$?) :: Match a -> CaptureID -> Maybe a Source

an alternative for captureTextMaybe

captureTextMaybe :: CaptureID -> Match a -> Maybe a Source

look up the text of the nth capture (0 being the match of the whole), returning Nothing if the Match doesn't contain the capture

(!$) :: Match a -> CaptureID -> Capture a infixl 9 Source

an alternative for capture

capture :: CaptureID -> Match a -> Capture a Source

look up the nth capture, 0 being the match of the whole RE against the source text, 1, the first bracketed sub-expression to be matched and so on

(!$?) :: Match a -> CaptureID -> Maybe (Capture a) Source

an alternative for capture captureMaybe

captureMaybe :: CaptureID -> Match a -> Maybe (Capture a) Source

look up the nth capture, 0 being the match of the whole RE against the source text, 1, the first bracketed sub-expression to be matched and so on, returning Nothing if there is no such capture, or if the capture failed to capture anything (being in a failed alternate)

Capture functions

hasCaptured :: Capture a -> Bool Source

test if the capture has matched any text

capturePrefix :: Extract a => Capture a -> a Source

returns the text preceding the match

captureSuffix :: Extract a => Capture a -> a Source

returns the text after the match

IsRegex

class Replace s => IsRegex re s where Source

Methods

matchOnce :: re -> s -> Match s Source

matchMany :: re -> s -> Matches s Source

regexSource :: re -> String Source

Instances

IsRegex RE String 
IsRegex RE ByteString 
IsRegex RE ByteString 
IsRegex RE String 
IsRegex RE ByteString 
IsRegex RE ByteString 
IsRegex RE Text 
IsRegex RE Text 
IsRegex RE (Seq Char) 
IsRegex RE (Seq Char) 

Options

data Options_ r c e Source

Constructors

Options 

Instances

IsOption Options RE CompOption ExecOption 
IsOption Options RE CompOption ExecOption 
(Show r, Show c, Show e) => Show (Options_ r c e) 

class IsOption o r c e | e -> r, c -> e, e -> c, r -> c, c -> r, r -> e where Source

Methods

makeOptions :: o -> Options_ r c e Source

Instances

IsOption () RE CompOption ExecOption 
IsOption () RE CompOption ExecOption 
IsOption SimpleRegexOptions RE CompOption ExecOption 
IsOption SimpleRegexOptions RE CompOption ExecOption 
IsOption Mode RE CompOption ExecOption 
IsOption Mode RE CompOption ExecOption 
IsOption ExecOption RE CompOption ExecOption 
IsOption CompOption RE CompOption ExecOption 
IsOption ExecOption RE CompOption ExecOption 
IsOption CompOption RE CompOption ExecOption 
IsOption Options RE CompOption ExecOption 
IsOption Options RE CompOption ExecOption 
IsOption (Macros RE) RE CompOption ExecOption 
IsOption (Macros RE) RE CompOption ExecOption 

data Mode Source

Constructors

Simple 
Block 

Instances

Bounded Mode 
Enum Mode 
Eq Mode 
Ord Mode 
Show Mode 
IsOption Mode RE CompOption ExecOption 
IsOption Mode RE CompOption ExecOption 

newtype MacroID Source

Constructors

MacroID 

Fields

_MacroID :: String
 

Instances

Eq MacroID 
Ord MacroID 
Show MacroID 
IsString MacroID 
Hashable MacroID 
IsOption (Macros RE) RE CompOption ExecOption 
IsOption (Macros RE) RE CompOption ExecOption 

type Macros r = HashMap MacroID r Source

CaptureID

newtype CaptureName Source

Constructors

CaptureName 

Fields

getCaptureName :: Text
 

Edit

data Edits m re s Source

Constructors

Select [(re, Edit m s)] 
Pipe [(re, Edit m s)] 

data Edit m s Source

Constructors

EDIT_tpl s 
EDIT_phi (Phi s) 
EDIT_fun Context (LineNo -> Match s -> Location -> Capture s -> m (Maybe s)) 
EDIT_gen (LineNo -> Matches s -> m (LineEdit s)) 

data LineEdit s Source

Constructors

NoEdit 
ReplaceWith s 
Delete 

Instances

Show s => Show (LineEdit s) 

applyEdits :: (IsRegex re s, Monad m, Functor m) => LineNo -> Edits m re s -> s -> m s Source

applyEdit :: (IsRegex re s, Monad m, Functor m) => (s -> s) -> LineNo -> re -> Edit m s -> s -> m (Maybe s) Source

applyLineEdit :: Monoid s => (s -> s) -> LineEdit s -> Maybe s Source

LineNo

newtype LineNo Source

Constructors

ZeroBasedLineNo 

Instances

Parsers

parseString :: Replace a => a -> Maybe Text Source

parseDate :: Replace a => a -> Maybe Day Source

parseTimeOfDay :: Replace a => a -> Maybe TimeOfDay Source

parseTimeZone :: Replace a => a -> Maybe TimeZone Source

parseDateTime :: Replace a => a -> Maybe UTCTime Source

parseDateTime8601 :: Replace a => a -> Maybe UTCTime Source

parseDateTimeCLF :: Replace a => a -> Maybe UTCTime Source

severityKeywords :: Severity -> (Text, [Text]) Source

Replace

class (Extract a, Monoid a) => Replace a where Source

Replace provides the missing methods needed to replace the matched text; length_ is the minimum implementation

Minimal complete definition

length_, pack_, unpack_, parse_tpl

Methods

length_ :: a -> Int Source

length function for a

pack_ :: String -> a Source

inject String into a

unpack_ :: a -> String Source

project a onto a String

textify :: a -> Text Source

inject into Text

detextify :: Text -> a Source

project Text onto a

appendNewline :: a -> a Source

append a newline

subst :: (a -> a) -> Capture a -> a Source

apply a substitution function to a Capture

parse_tpl :: a -> Match a -> Location -> Capture a -> Maybe a Source

convert a template containing $0, $1, etc., in the first argument, into a phi replacement function for use with replaceAllCaptures' and replaceCaptures'

Instances

Replace ByteString 
Replace ByteString 
Replace Text 
Replace Text 
Replace [Char] 
Replace (Seq Char) 

data Replace_ a Source

a selction of the Replace methods can be encapsulated with Replace_ for the higher-order replacement functions

Constructors

Replace_ 

Fields

_r_length :: a -> Int
 
_r_subst :: (a -> a) -> Capture a -> a
 

replace_ :: Replace a => Replace_ a Source

replace_ encapsulates Replace_ a from a Replace a context

data Phi a Source

Phi specifies the substitution function for procesing the substrings captured by the regular expression.

Constructors

Phi 

Fields

_phi_context :: Context

the context for applying the substitution

_phi_phi :: Location -> a -> a

the substitution function takes the location and the text to be replaced and returns the replacement text to be substituted

data Context Source

Context specifies which contexts the substitutions should be applied

Constructors

TOP

substitutions should be applied to the top-level only, the text that matched the whole RE

SUB

substitutions should only be applied to the text captured by bracketed sub-REs

ALL

the substitution function should be applied to all captures, the top level and the sub-expression captures

Instances

data Location Source

the Location information passed into the substitution function specifies which sub-expression is being substituted

Constructors

Location 

Fields

_loc_match :: Int

the zero-based, i-th string to be matched, when matching all strings, zero when only the first string is being matched

_loc_capture :: CaptureOrdinal

0, when matching the top-level string matched by the whole RE, 1 for the top-most, left-most redex captured by bracketed sub-REs, etc.

Instances

isTopLocation :: Location -> Bool Source

True iff the location references a complete match (i.e., not a bracketed capture)

replace :: Replace a => Match a -> a -> a Source

replace with a template containing $0 for whole text, $1 for first capture, etc.

replaceAll :: Replace a => a -> Matches a -> a Source

replace all with a template, $0 for whole text, $1 for first capture, etc.

replaceAllCaptures :: Replace a => Phi a -> Matches a -> a Source

substitutes the PHI substitutions through the Matches

replaceAllCaptures' :: Replace a => Context -> (Match a -> Location -> Capture a -> Maybe a) -> Matches a -> a Source

substitutes using a function that takes the full Match context and returns the same replacement text as the _phi_phi context.

replaceAllCaptures_ :: Extract a => Replace_ a -> Context -> (Match a -> Location -> Capture a -> Maybe a) -> Matches a -> a Source

replaceAllCaptures_ is like like replaceAllCaptures' but takes the Replace methods through the Replace_ argument

replaceAllCapturesM :: (Extract a, Monad m) => Replace_ a -> Context -> (Match a -> Location -> Capture a -> m (Maybe a)) -> Matches a -> m a Source

replaceAllCapturesM is just a monadically generalised version of replaceAllCaptures_

replaceCaptures :: Replace a => Phi a -> Match a -> a Source

substitutes the PHI substitutions through the Match

replaceCaptures' :: Replace a => Context -> (Match a -> Location -> Capture a -> Maybe a) -> Match a -> a Source

substitutes using a function that takes the full Match context and returns the same replacement text as the _phi_phi context.

replaceCaptures_ :: Extract a => Replace_ a -> Context -> (Match a -> Location -> Capture a -> Maybe a) -> Match a -> a Source

replaceCaptures_ is like replaceCaptures' but takes the Replace methods through the Replace_ argument

replaceCapturesM :: (Monad m, Extract a) => Replace_ a -> Context -> (Match a -> Location -> Capture a -> m (Maybe a)) -> Match a -> m a Source

replaceCapturesM is just a monadically generalised version of replaceCaptures_

expandMacros :: (r -> String) -> Mode -> Macros r -> String -> String Source

expand all of the @{..} macros in the RE in the argument String according to the Macros argument, preprocessing the RE String according to the Mode argument (used internally)

expandMacros' :: (MacroID -> Maybe String) -> String -> String Source

expand the @{..} macos in the argument string using the given function

Tools

Grep

grep :: IsRegex re ByteString => re -> FilePath -> IO () Source

grepLines :: IsRegex re ByteString => re -> FilePath -> IO [Line] Source

type GrepScript re s t = [(re, LineNo -> Matches s -> Maybe t)] Source

grepScript :: IsRegex re s => GrepScript re s t -> [s] -> [t] Source

linesMatched :: [Line] -> [Line] Source

Lex

alex :: IsRegex re s => [(re, Match s -> Maybe t)] -> t -> s -> [t] Source

alex' :: Replace s => (re -> s -> Match s) -> [(re, Match s -> Maybe t)] -> t -> s -> [t] Source

Sed

type SedScript re = Edits IO re ByteString Source

sed :: IsRegex re ByteString => SedScript re -> FilePath -> FilePath -> IO () Source

sed' :: (IsRegex re ByteString, Monad m, Functor m) => Edits m re ByteString -> ByteString -> m ByteString Source