regex-tdfa-rc- Replaces/Enhances Text.Regex

Safe Haskell: None



Text.Regex.TDFA.TNFA converts the CorePattern Q/P data (and its Pattern leaves) to a QNFA tagged non-deterministic finite automaton.

The QNFA holds every possible way to follow one state by another, whereas in the DFA these are reduced by picking a single best transition for each (source, destination) pair. The transitions are heavily and often redundantly annotated with tasks to perform, and this redundancy is reduced when picking the best transition. So far, keeping all this information has helped fix bugs in both the design and the implementation.
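The reduction described above can be sketched as follows. This is only an illustration under stated assumptions: `NodeId`, `Transition`, `trCost`, `trTasks` and `pickBest` are hypothetical names, and the integer cost stands in for whatever comparison the library actually uses to rank transitions.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified stand-ins; these are not the regex-tdfa types.
type NodeId = Int

data Transition = Transition
  { trCost  :: Int       -- assumed ranking: lower is better
  , trTasks :: [String]  -- assumed task annotations carried by the transition
  } deriving (Eq, Show)

-- Keep a single best transition for each (source, destination) pair.
-- On a tie the transition seen first in the list is kept.
pickBest :: [((NodeId, NodeId), Transition)]
         -> Map.Map (NodeId, NodeId) Transition
pickBest = Map.fromListWith better
  where
    -- fromListWith passes the newer value first and the stored value second.
    better new old = if trCost new < trCost old then new else old
```

Given several candidate transitions between the same two nodes, `pickBest` leaves one entry per pair, which mirrors how the redundant NFA annotations are collapsed when the DFA is built.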

The QNFA for a Pattern with a starTraned Q/P form with N one-character accepting leaves has at most N+1 nodes. These nodes represent the future choices after accepting a leaf. The processing of Or nodes often reduces this number by sharing at the end of the different paths. Turning off capturing while compiling the pattern may (future extension) reduce this further for some patterns by processing Star with optimizations. This compact design also means that tags are assigned not just to be updated before taking a transition (PreUpdate) but also after the transition (PostUpdate).

Uses recursive do notation.



data QNFA

Internal NFA node type




q_id :: Index
q_qt :: QT

data QT

Internal to QNFA type.




qt_win :: WinTags

empty transitions to the virtual winning state

qt_trans :: CharMap QTrans

all ways to leave this QNFA to other or the same QNFA

qt_other :: QTrans

default ways to leave this QNFA to other or the same QNFA



qt_test :: WhichTest

The test to perform

qt_dopas :: EnumSet DoPa

location(s) of the anchor(s) in the original regexp

qt_a :: QT

use qt_a if test is True, else use qt_b

qt_b :: QT

use qt_a if test is True, else use qt_b


Eq QT 
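The two groups of fields above suggest a tree whose inner nodes perform a test and whose leaves hold the actual transitions. A minimal sketch of walking such a tree is given below. All names (`Test`, `SketchQT`, `SimpleQT`, `TestingQT`, `resolve`, `step`) are hypothetical, and the field types are simplified to plain lists and a `Map` rather than the library's `WinTags`, `CharMap` and `QTrans`.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-in for WhichTest.
data Test = AtLineStart | AtLineEnd
  deriving (Eq, Show)

-- Destinations are reduced to a list of node indices for this sketch.
type Dests = [Int]

data SketchQT
  = SimpleQT Bool (Map.Map Char Dests) Dests  -- can win?, per-character moves, default moves
  | TestingQT Test SketchQT SketchQT          -- test, branch if True, branch if False
  deriving (Eq, Show)

-- Follow the test nodes until a leaf is reached, using the first
-- branch when the test holds and the second branch otherwise.
resolve :: (Test -> Bool) -> SketchQT -> SketchQT
resolve _ qt@(SimpleQT _ _ _) = qt
resolve holds (TestingQT t a b) = resolve holds (if holds t then a else b)

-- Look up the moves for a character, falling back to the default moves.
-- Returns Nothing if the tests have not been resolved yet.
step :: Char -> SketchQT -> Maybe Dests
step c (SimpleQT _ byChar dflt) = Just (Map.findWithDefault dflt c byChar)
step _ (TestingQT _ _ _) = Nothing
```

In this sketch a caller first resolves the anchor tests for the current input position and then steps on the next character, so the per-character lookup never has to know about anchors.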

type QTrans = IntMap [TagCommand]

Internal type to represent the tagged transition from one QNFA to another (or itself). The key is the Index of the destination QNFA.
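As an illustration of a map keyed by destination index, the sketch below collects several commands per destination in an `IntMap`. `Command`, `SketchTrans`, `addTransition` and `destinations` are hypothetical names; the real `TagCommand` type is not reproduced here and is replaced by a plain `String`.

```haskell
import qualified Data.IntMap as IntMap

-- Hypothetical stand-in for TagCommand.
type Command = String

-- Key: index of the destination node. Value: every command that leads there.
type SketchTrans = IntMap.IntMap [Command]

-- Record one more way of reaching a destination.
-- The newest command is placed at the front of the list.
addTransition :: Int -> Command -> SketchTrans -> SketchTrans
addTransition dest cmd = IntMap.insertWith (++) dest [cmd]

-- All destination indices reachable through this transition map, in ascending order.
destinations :: SketchTrans -> [Int]
destinations = IntMap.keys
```

Keeping a list per destination matches the idea that the NFA stores every way of moving between two nodes and leaves the choice of the best one to a later stage.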

data TagUpdate

When attached to a QTrans the TagTask can be done before or after accepting the character.
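The difference between the two timings can be sketched as below. This is an assumption-laden illustration, not the library's definition: `Task`, `SetTag`, `Update`, `Pre`, `Post` and `applyUpdate` are hypothetical names, and the sketch assumes that accepting a character advances the input offset by exactly one.

```haskell
-- Hypothetical stand-in for a tag task: record a position in the given tag.
data Task = SetTag Int
  deriving (Eq, Show)

-- Hypothetical stand-in for TagUpdate: when the task runs relative to the character.
data Update
  = Pre Task   -- run before accepting the character
  | Post Task  -- run after accepting the character
  deriving (Eq, Show)

-- Given the offset of the character about to be accepted, return the
-- tag and the position that would be stored in it.
applyUpdate :: Int -> Update -> (Int, Int)
applyUpdate pos (Pre (SetTag t))  = (t, pos)      -- position before the character
applyUpdate pos (Post (SetTag t)) = (t, pos + 1)  -- position after the character
```

Under these assumptions, the same task attached to the same transition stores offset 4 as a pre-update and offset 5 as a post-update when the character sits at offset 4.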