regex-pderiv-0.2.0: Replaces/Enhances Text.Regex. Implementing regular expression matching using Antimirov's partial derivatives.

Text.Regex.PDeriv.IntPattern

Description

This module defines the data type of internal regular expression pattern, | as well as the partial derivative operations for regular expression patterns.

Synopsis

# Documentation

data Pat Source #

regular expression patterns

Constructors

 PVar Int [Range] Pat variable pattern PE RE pattern without binder PPair Pat Pat pair pattern PChoice Pat Pat GFlag choice pattern PStar Pat GFlag star pattern PPlus Pat Pat plus pattern, it is used internally to indicate that it is unrolled from a PStar PEmpty Pat empty pattern, it is used intermally to indicate that mkEmpty function has been applied.

Instances

 Source # The Eq instance for Pat data type NOTE: We ignore the 'consumed word' when comparing patterns (ie we only compare the pattern structure). Essential for later comparisons among patterns. Methods(==) :: Pat -> Pat -> Bool #(/=) :: Pat -> Pat -> Bool # Source # MethodsshowsPrec :: Int -> Pat -> ShowS #show :: Pat -> String #showList :: [Pat] -> ShowS # Source # Methods Source # Methodshash :: Pat -> [Int] Source # Source # Function isGreedy checks whether a pattern is greedy Methods Source # mainly interested in simplifying epsilon, p --> p could be made more optimal, e.g. (epsilon, epsilon) --> epsilon Methods Source # Methods Source # Methods

function strip strips away the bindings from a pattern

pdPat :: Pat -> Letter -> [Pat] Source #

function pdPat computes the partial derivatives of a pattern w.r.t. a letter. Integrating non-greedy operator with PStar For p*, we need to unroll it into a special construct say PPlus p' p* where p' in p/l. When we push another label, say l' to PPlus p' p*, and p' is emptiable, naively, we would do [ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus (mkE p') (PPlus p''' p*) | (PPlus p''' p*) <- p*l ] Now the problem here is the shape of the pdpat are infinite, which breaks the requirement of getting a compilation scheme. The fix here is to simplify the second component, by combining the binding, of (mkE p') and p''' since they share the same set of variables. [ PPlus p'' p* | p'' <- p' l ] ++ [ PPlus p4 p* | (PPlus p''' p*) <- p*l ] where p4 = combineBinding (mkE p') p''' For pdPat0 approach, we do not need to do this explicitly, we simply drop (mkE p') even in the PPair case. see the definitely of pdPat0 below

type Binder = IntMap [Range] Source #

The Binder type denotes a set of (pattern var * range) pairs type Binder = [(Int, [Range])]

Function toBinder turns a pattern into a binder

Arguments

 :: Pat the source pattern -> Letter the letter to be "consumed" -> [(Pat, Int -> Binder -> Binder)]

Function pdPat0 is the abstracted form of the pdPat function It computes a set of pairs. Each pair consists a shape of the partial derivative, and an update function which defines the change of the pattern bindings from the source pattern to the resulting partial derivative. This is used in the compilation of the regular expression pattern

Arguments

 :: Pat the source pattern -> Letter the letter to be "consumed" -> [(Pat, Int -> Binder -> Binder)]

Function pdPat0Sim applies simplification to the results of pdPat0

nub2 :: Eq a => [(a, b)] -> [(a, b)] Source #