Safe Haskell | Safe-Infered |
---|
This module defines the data type of internal regular expression pattern, | as well as the partial derivative operations for regular expression patterns.
- data Pat
- strip :: Pat -> RE
- pdPat :: Pat -> Letter -> [Pat]
- type Binder = IntMap [Range]
- toBinder :: Pat -> Binder
- listifyBinder :: Binder -> [(Int, [Range])]
- pdPat0 :: Pat -> Letter -> [(Pat, Int -> Binder -> Binder)]
- pdPat0Sim :: Pat -> Letter -> [(Pat, Int -> Binder -> Binder)]
- nub2 :: Eq a => [(a, b)] -> [(a, b)]
Documentation
regular expression patterns
PVar Int [Range] Pat | variable pattern |
PE RE | pattern without binder |
PPair Pat Pat | pair pattern |
PChoice Pat Pat GFlag | choice pattern |
PStar Pat GFlag | star pattern |
PPlus Pat Pat | plus pattern, it is used internally to indicate that it is unrolled from a PStar |
PEmpty Pat | empty pattern, it is used intermally to indicate that mkEmpty function has been applied. |
Eq Pat | The Eq instance for Pat data type NOTE: We ignore the 'consumed word' when comparing patterns (ie we only compare the pattern structure). Essential for later comparisons among patterns. |
Show Pat | |
Key Pat | |
IsGreedy Pat | Function |
Simplifiable Pat | mainly interested in simplifying epsilon, p --> p could be made more optimal, e.g. (epsilon, epsilon) --> epsilon |
IsPhi Pat | |
IsEpsilon Pat | |
Pretty Pat |
pdPat :: Pat -> Letter -> [Pat]Source
function pdPat
computes the partial derivatives of a pattern w.r.t. a letter.
Integrating non-greedy operator with PStar
For p*, we need to unroll it into a special construct
say PPlus p' p* where p' in p/l.
When we push another label, say l' to PPlus p' p*, and
p' is emptiable, naively, we would do
[ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus (mkE p') (PPlus p''' p*) | (PPlus p''' p*) <- p*l ]
Now the problem here is the shape of the pdpat are infinite, which
breaks the requirement of getting a compilation scheme.
The fix here is to simplify the second component, by combining the binding, of (mkE p') and p'''
since they share the same set of variables.
[ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus p4 p* | (PPlus p''' p*) <- p*l ]
where p4 = combineBinding (mkE p') p'''
For pdPat0 approach, we do not need to do this explicitly, we simply drop
(mkE p') even in the PPair case. see the definitely of pdPat0 below
type Binder = IntMap [Range]Source
The Binder
type denotes a set of (pattern var * range) pairs
type Binder = [(Int, [Range])]
listifyBinder :: Binder -> [(Int, [Range])]Source
Function pdPat0
is the abstracted
form of the pdPat
function
It computes a set of pairs. Each pair consists a shape
of the partial derivative, and
an update function which defines the change of the pattern bindings from the source
pattern to
the resulting partial derivative. This is used in the compilation of the regular expression pattern