Safe Haskell | None |
---|---|
Language | Haskell2010 |
AUTHOR
- Dr. Alistair Ward
DESCRIPTION
- Permits transformation of MatchList, to facilitate standardisation.
- extractCaptureGroups :: Bool -> DataSpanTreeList a -> [DataSpan a]
- flattenTreeList :: DataLength -> DataSpanTreeList a -> [DataSpan a]
- toTreeList :: MatchList a -> DataSpanTreeList a
Types
Type-synonyms
Functions
extractCaptureGroups
:: Bool | Whether to strictly comply with POSIX. |
-> DataSpanTreeList a | The tree-structure from which to extract the capture-groups. |
-> [DataSpan a] |
- POSIX describes the contents of capture-groups, as summarised in http://www2.research.att.com/~gsf/testregex/.
- Result is a complete description of the match between InputData & RegEx.ExtendedRegEx; this function extracts a POSIX-conformant list from it.
- The major differences are that:
  - Only data from parenthesized sub-expressions (Alternatives) is captured.
  - Only the last repetition of a repeated sub-expression is returned. http://www.opengroup.org/onlinepubs/009695399/functions/regcomp.html.
  - The data captured within each parenthesized sub-expression is summarised as a single DataSpan.
  - POSIX specifies a Span-offset of -1 for sub-expressions which match zero times; cf. sub-expressions which consume nothing, once. http://www.opengroup.org/onlinepubs/009695399/functions/regcomp.html.
@
("ace" Text.Regex.Posix.=~ "a(b)*c(d)?e") :: Text.Regex.Base.RegexLike.MatchArray
array (0,2) [(0,(0,3)),(1,(-1,0)),(2,(-1,0))]
("ace" Text.Regex.Posix.=~ "a(b*)c(d?)e") :: Text.Regex.Base.RegexLike.MatchArray
array (0,2) [(0,(0,3)),(1,(1,0)),(2,(2,0))]
@
I consider this a poor convention, resulting from the focus of POSIX on C; it makes subsequent calculation from the list of DataSpans difficult & error-prone.
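The difficulty described above can be illustrated with a minimal sketch. The names `Span`, `toMaybeSpan` and `spanEnd` are hypothetical and are not part of this package; the pairs merely mirror the (offset, length) entries shown in the MatchArray output above.

```haskell
-- Hypothetical stand-in for an (offset, length) pair, as in a MatchArray entry.
type Span = (Int, Int)

-- Make the POSIX sentinel explicit:
-- a negative offset means that the sub-expression matched zero times.
toMaybeSpan :: Span -> Maybe Span
toMaybeSpan s@(offset, _)
  | offset < 0 = Nothing
  | otherwise  = Just s

-- The offset just past a span; only meaningful when the sub-expression matched.
spanEnd :: Span -> Int
spanEnd (offset, len) = offset + len

-- Capture-groups from the two examples above (index 0, the whole match, is omitted).
zeroTimes, emptyOnce :: [Span]
zeroTimes = [(-1, 0), (-1, 0)]  -- from "a(b)*c(d)?e"
emptyOnce = [(1, 0), (2, 0)]    -- from "a(b*)c(d?)e"
```

Applied naively, `map spanEnd zeroTimes` gives `[-1,-1]`, which is not a valid position in the input. Routing through `toMaybeSpan` first, `map (fmap spanEnd . toMaybeSpan) zeroTimes` gives `[Nothing,Nothing]`, whereas the same expression over `emptyOnce` gives `[Just 1,Just 2]`.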
flattenTreeList
:: DataLength | The offset into the input-data at which a match occurred. |
-> DataSpanTreeList a | The tree to flatten. |
-> [DataSpan a] |
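As a rough illustration of what flattening a tree of spans into a list might look like, the following sketch uses hypothetical stand-in types; the package's real definitions of DataSpan and DataSpanTreeList may differ. The use made of the offset (a zero-length span when the tree-list is empty) is an assumption, not documented behaviour, and `flattenSketch` is not the package's function.

```haskell
-- Hypothetical stand-ins; the package's real types may differ.
type DataLength = Int
type Span = (DataLength, DataLength)  -- (offset, length)
type DataSpan a = ([a], Span)         -- consumed data paired with its span

-- A hypothetical rose-tree: a leaf holds one span,
-- a node groups the spans belonging to one sub-expression.
data DataSpanTree a = Leaf (DataSpan a) | Node [DataSpanTree a]
type DataSpanTreeList a = [DataSpanTree a]

-- Depth-first, left-to-right flattening.
-- Assumption: an empty tree-list yields one zero-length span at the supplied offset.
flattenSketch :: DataLength -> DataSpanTreeList a -> [DataSpan a]
flattenSketch offset []  = [([], (offset, 0))]
flattenSketch _ treeList = concatMap go treeList
  where
    go (Leaf dataSpan) = [dataSpan]
    go (Node children) = concatMap go children
```

For example, `flattenSketch 0 [Leaf ("a", (0, 1)), Node [Leaf ("c", (1, 1)), Leaf ("e", (2, 1))]]` evaluates to `[("a",(0,1)),("c",(1,1)),("e",(2,1))]`, while an empty tree-list at offset 3 evaluates to `[("",(3,0))]`.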
toTreeList :: MatchList a -> DataSpanTreeList a