Safe Haskell | Safe-Infered |
---|

This module defines the data type of internal regular expression pattern, | as well as the partial derivative operations for regular expression patterns.

- data Pat
- strip :: Pat -> RE
- pdPat :: Pat -> Letter -> [Pat]
- type Binder = IntMap [Range]
- toBinder :: Pat -> Binder
- listifyBinder :: Binder -> [(Int, [Range])]
- pdPat0 :: Pat -> Letter -> [(Pat, Int -> Binder -> Binder)]
- pdPat0Sim :: Pat -> Letter -> [(Pat, Int -> Binder -> Binder)]
- nub2 :: Eq a => [(a, b)] -> [(a, b)]

# Documentation

regular expression patterns

PVar Int [Range] Pat | variable pattern |

PE RE | pattern without binder |

PPair Pat Pat | pair pattern |

PChoice Pat Pat GFlag | choice pattern |

PStar Pat GFlag | star pattern |

PPlus Pat Pat | plus pattern, it is used internally to indicate that it is unrolled from a PStar |

PEmpty Pat | empty pattern, it is used intermally to indicate that mkEmpty function has been applied. |

Eq Pat | The Eq instance for Pat data type NOTE: We ignore the 'consumed word' when comparing patterns (ie we only compare the pattern structure). Essential for later comparisons among patterns. |

Show Pat | |

Key Pat | |

IsGreedy Pat | Function |

Simplifiable Pat | mainly interested in simplifying epsilon, p --> p could be made more optimal, e.g. (epsilon, epsilon) --> epsilon |

IsPhi Pat | |

IsEpsilon Pat | |

Pretty Pat |

pdPat :: Pat -> Letter -> [Pat]Source

function `pdPat`

computes the partial derivatives of a pattern w.r.t. a letter.
Integrating non-greedy operator with PStar
For p*, we need to unroll it into a special construct
say PPlus p' p* where p' in p/l.
When we push another label, say l' to PPlus p' p*, and
p' is emptiable, naively, we would do
[ PPlus p'' p* | p'' <- p' * l ] ++ [ PPlus (mkE p') (PPlus p''' p*) | (PPlus p''' p*) <- p**l ]
Now the problem here is the shape of the pdpat are infinite, which
breaks the requirement of getting a compilation scheme.
The fix here is to simplify the second component, by combining the binding, of (mkE p') and p'''
since they share the same set of variables.
[ PPlus p'' p* | p'' <- p' * l ] ++ [ PPlus p4 p* | (PPlus p''' p*) <- p**l ]
where p4 = combineBinding (mkE p') p'''
For pdPat0 approach, we do not need to do this explicitly, we simply drop
(mkE p') even in the PPair case. see the definitely of pdPat0 below

type Binder = IntMap [Range]Source

The `Binder`

type denotes a set of (pattern var * range) pairs
type Binder = [(Int, [Range])]

listifyBinder :: Binder -> [(Int, [Range])]Source

Function `pdPat0`

is the `abstracted`

form of the `pdPat`

function
It computes a set of pairs. Each pair consists a `shape`

of the partial derivative, and
an update function which defines the change of the pattern bindings from the `source`

pattern to
the resulting partial derivative. This is used in the compilation of the regular expression pattern