Safe Haskell	None
Language	Haskell2010

RegExDot.DataSpanTree

Description

AUTHOR: Dr. Alistair Ward
DESCRIPTION: Permits transformation of MatchList, to facilitate standardisation.

Types

Type-synonyms

Functions

extractCaptureGroups

Arguments

:: Bool	Whether to strictly comply with POSIX.
-> DataSpanTreeList a	The tree-structure from which to extract the capture-groups.
-> [DataSpan a]

POSIX describes the contents of capture-groups, as summarised in http://www2.research.att.com/~gsf/testregex/.
Result is a complete description of the match between InputData & RegEx.ExtendedRegEx; this function extracts a POSIX-conformant list from it.
The major differences are that:

Only data from parenthesized sub-expressions (Alternatives) is captured.

Only the last repetition of a repeated sub-expression is returned; see http://www.opengroup.org/onlinepubs/009695399/functions/regcomp.html.

The data captured within each parenthesized sub-expression is summarised as a single DataSpan.

POSIX specifies a Span-offset of -1 for sub-expressions which match zero times; cf. sub-expressions which match once, but consume nothing. See http://www.opengroup.org/onlinepubs/009695399/functions/regcomp.html.

    ("ace" Text.Regex.Posix.=~ "a(b)*c(d)?e") :: Text.Regex.Base.RegexLike.MatchArray
    array (0,2) [(0,(0,3)),(1,(-1,0)),(2,(-1,0))]

    ("ace" Text.Regex.Posix.=~ "a(b*)c(d?)e") :: Text.Regex.Base.RegexLike.MatchArray
    array (0,2) [(0,(0,3)),(1,(1,0)),(2,(2,0))]

I consider this a poor convention, resulting from the focus of POSIX on C, which makes subsequent calculation from the list of DataSpans difficult & error-prone.
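To illustrate why the -1 sentinel is awkward, the following minimal sketch converts (offset, length) pairs, such as those in the MatchArray results above, into Maybe values before slicing the input. The names RawSpan, fromPosixSpan & captures are hypothetical, introduced only for this illustration; they are not part of RegExDot or regex-posix.

```haskell
-- Hypothetical alias: (offset, length), mirroring the pairs shown in the MatchArray output.
type RawSpan = (Int, Int)

-- Hypothetical helper: treat the POSIX offset of -1 as "group did not participate".
fromPosixSpan :: RawSpan -> Maybe RawSpan
fromPosixSpan s@(offset, _)
  | offset < 0 = Nothing
  | otherwise  = Just s

-- Slice the input for each span; groups that matched zero times yield Nothing,
-- whereas groups that matched once but consumed nothing yield Just "".
captures :: String -> [RawSpan] -> [Maybe String]
captures input = map (fmap slice . fromPosixSpan)
  where
    slice (offset, len) = take len (drop offset input)

main :: IO ()
main = do
  print (captures "ace" [(0, 3), (-1, 0), (-1, 0)]) -- [Just "ace",Nothing,Nothing]
  print (captures "ace" [(0, 3), (1, 0), (2, 0)])   -- [Just "ace",Just "",Just ""]
```

Without the explicit check for a negative offset, naive arithmetic on the spans (e.g. summing offset & length to find where a group ends) would silently produce meaningless positions.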

flattenTreeList

Arguments

:: DataLength	The offset into the input-data at which a match occurred.
-> DataSpanTreeList a	The tree to flatten.
-> [DataSpan a]

Condenses a DataSpanTreeList into a list of DataSpans, using join.
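One plausible reading of this condensation is sketched below, under loudly stated assumptions: the types Span, DataSpan & SpanTree are simplified stand-ins, and joinSpans, leaves & flatten are hypothetical names; the library's actual definitions may well differ. Each top-level tree is collapsed into a single span by joining its leaves, and the offset supplies a position for any tree which contains no leaves.

```haskell
-- Simplified, hypothetical stand-ins for the library's types.
type Span = (Int, Int)          -- (offset, length)
type DataSpan a = ([a], Span)   -- the captured data, with its span

data SpanTree a = Leaf (DataSpan a) | Node [SpanTree a]

-- Hypothetical join: merges contiguous spans into one;
-- an empty list yields an empty span at the specified offset.
joinSpans :: Int -> [DataSpan a] -> DataSpan a
joinSpans offset [] = ([], (offset, 0))
joinSpans _ spans@((_, (start, _)) : _) =
  (concatMap fst spans, (start, sum (map (snd . snd) spans)))

-- Collect the leaves of a tree, from left to right.
leaves :: SpanTree a -> [DataSpan a]
leaves (Leaf s)  = [s]
leaves (Node ts) = concatMap leaves ts

-- Collapse each top-level tree into one span, threading the end of
-- each result through as the offset for the next tree.
flatten :: Int -> [SpanTree a] -> [DataSpan a]
flatten _ [] = []
flatten offset (t : ts) = s : flatten (end s) ts
  where
    s = joinSpans offset (leaves t)
    end (_, (o, l)) = o + l

main :: IO ()
main = print (flatten 0 [Leaf ("a", (0, 1)), Node [], Node [Leaf ("b", (1, 1)), Leaf ("c", (2, 1))]])
-- [("a",(0,1)),("",(1,0)),("bc",(1,2))]
```

In this sketch the empty Node receives a zero-length span at the current offset, rather than a sentinel such as -1, so that subsequent arithmetic on the spans remains meaningful.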

toTreeList :: MatchList a -> DataSpanTreeList a

Converts a MatchList into a DataSpanTreeList, by transforming the Leafs.