regex-pderiv-0.2.0: Replaces/Enhances Text.Regex. Implementing regular expression matching using Antimirov's partial derivatives.

Safe HaskellSafe



This module defines the data type of internal regular expression pattern, | as well as the partial derivative operations for regular expression patterns.



data Pat Source #

regular expression patterns


PVar Int [Range] Pat

variable pattern


pattern without binder

PPair Pat Pat

pair pattern

PChoice Pat Pat GFlag

choice pattern

PStar Pat GFlag

star pattern

PPlus Pat Pat

plus pattern, it is used internally to indicate that it is unrolled from a PStar

PEmpty Pat

empty pattern, it is used intermally to indicate that mkEmpty function has been applied.


Eq Pat Source #

The Eq instance for Pat data type NOTE: We ignore the 'consumed word' when comparing patterns (ie we only compare the pattern structure). Essential for later comparisons among patterns.


(==) :: Pat -> Pat -> Bool #

(/=) :: Pat -> Pat -> Bool #

Show Pat Source # 


showsPrec :: Int -> Pat -> ShowS #

show :: Pat -> String #

showList :: [Pat] -> ShowS #

Pretty Pat Source # 


pretty :: Pat -> String Source #

Key Pat Source # 


hash :: Pat -> [Int] Source #

IsGreedy Pat Source #

Function isGreedy checks whether a pattern is greedy


isGreedy :: Pat -> Bool Source #

Simplifiable Pat Source #

mainly interested in simplifying epsilon, p --> p could be made more optimal, e.g. (epsilon, epsilon) --> epsilon


simplify :: Pat -> Pat Source #

IsPhi Pat Source # 


isPhi :: Pat -> Bool Source #

IsEpsilon Pat Source # 


isEpsilon :: Pat -> Bool Source #

strip :: Pat -> RE Source #

function strip strips away the bindings from a pattern

pdPat :: Pat -> Letter -> [Pat] Source #

function pdPat computes the partial derivatives of a pattern w.r.t. a letter. Integrating non-greedy operator with PStar For p*, we need to unroll it into a special construct say PPlus p' p* where p' in p/l. When we push another label, say l' to PPlus p' p*, and p' is emptiable, naively, we would do [ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus (mkE p') (PPlus p''' p*) | (PPlus p''' p*) <- p*l ] Now the problem here is the shape of the pdpat are infinite, which breaks the requirement of getting a compilation scheme. The fix here is to simplify the second component, by combining the binding, of (mkE p') and p''' since they share the same set of variables. [ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus p4 p* | (PPlus p''' p*) <- p*l ] where p4 = combineBinding (mkE p') p''' For pdPat0 approach, we do not need to do this explicitly, we simply drop (mkE p') even in the PPair case. see the definitely of pdPat0 below

type Binder = IntMap [Range] Source #

The Binder type denotes a set of (pattern var * range) pairs type Binder = [(Int, [Range])]

toBinder :: Pat -> Binder Source #

Function toBinder turns a pattern into a binder

pdPat0 Source #


:: Pat

the source pattern

-> Letter

the letter to be "consumed"

-> [(Pat, Int -> Binder -> Binder)] 

Function pdPat0 is the abstracted form of the pdPat function It computes a set of pairs. Each pair consists a shape of the partial derivative, and an update function which defines the change of the pattern bindings from the source pattern to the resulting partial derivative. This is used in the compilation of the regular expression pattern

pdPat0Sim Source #


:: Pat

the source pattern

-> Letter

the letter to be "consumed"

-> [(Pat, Int -> Binder -> Binder)] 

Function pdPat0Sim applies simplification to the results of pdPat0

nub2 :: Eq a => [(a, b)] -> [(a, b)] Source #