regex-pderiv-0.1.0: Replaces/Enhances Text.Regex



This module defines the data type of internal regular expression pattern, | as well as the partial derivative operations for regular expression patterns.



data Pat Source

regular expression patterns


PVar Int [Range] Pat

variable pattern


pattern without binder

PPair Pat Pat

pair pattern

PChoice Pat Pat GFlag

choice pattern

PStar Pat GFlag

star pattern

PPlus Pat Pat

plus pattern, it is used internally to indicate that it is unrolled from a PStar

PEmpty Pat

empty pattern, it is used intermally to indicate that mkEmpty function has been applied.


Eq Pat

The Eq instance for Pat data type NOTE: We ignore the 'consumed word' when comparing patterns (ie we only compare the pattern structure). Essential for later comparisons among patterns.

Show Pat 
Key Pat 
IsGreedy Pat

Function isGreedy checks whether a pattern is greedy

Simplifiable Pat

mainly interested in simplifying epsilon, p --> p could be made more optimal, e.g. (epsilon, epsilon) --> epsilon

IsPhi Pat 
IsEpsilon Pat 
Pretty Pat 

strip :: Pat -> RESource

function strip strips away the bindings from a pattern

pdPat :: Pat -> Letter -> [Pat]Source

function pdPat computes the partial derivatives of a pattern w.r.t. a letter. Integrating non-greedy operator with PStar For p*, we need to unroll it into a special construct say PPlus p' p* where p' in p/l. When we push another label, say l' to PPlus p' p*, and p' is emptiable, naively, we would do [ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus (mkE p') (PPlus p''' p*) | (PPlus p''' p*) <- p*l ] Now the problem here is the shape of the pdpat are infinite, which breaks the requirement of getting a compilation scheme. The fix here is to simplify the second component, by combining the binding, of (mkE p') and p''' since they share the same set of variables. [ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus p4 p* | (PPlus p''' p*) <- p*l ] where p4 = combineBinding (mkE p') p''' For pdPat0 approach, we do not need to do this explicitly, we simply drop (mkE p') even in the PPair case. see the definitely of pdPat0 below

type Binder = IntMap [Range]Source

The Binder type denotes a set of (pattern var * range) pairs type Binder = [(Int, [Range])]

toBinder :: Pat -> BinderSource

Function toBinder turns a pattern into a binder



:: Pat

the source pattern

-> Letter

the letter to be consumed

-> [(Pat, Int -> Binder -> Binder)] 

Function pdPat0 is the abstracted form of the pdPat function It computes a set of pairs. Each pair consists a shape of the partial derivative, and an update function which defines the change of the pattern bindings from the source pattern to the resulting partial derivative. This is used in the compilation of the regular expression pattern



:: Pat

the source pattern

-> Letter

the letter to be consumed

-> [(Pat, Int -> Binder -> Binder)] 

Function pdPat0Sim applies simplification to the results of pdPat0

nub2 :: Eq a => [(a, b)] -> [(a, b)]Source