kleene-0: Kleene algebra

Safe HaskellSafe
LanguageHaskell2010

Kleene.DFA

Contents

Synopsis

Documentation

data DFA c Source #

Deterministic finite automaton.

A deterministic finite automaton (DFA) over an alphabet \(\Sigma\) (type variable c) is 4-tuple \(Q\), \(q_0\) , \(F\), \(\delta\), where

  • \(Q\) is a finite set of states (subset of Int),
  • \(q_0 \in Q\) is the distinguised start state (0),
  • \(F \subset Q\) is a set of final (or accepting) states (dfaAcceptable), and
  • \(\delta : Q \times \Sigma \to Q\) is a function called the state transition function (dfaTransition).

Constructors

DFA 

Fields

Instances

Complement c (DFA c) Source #

Complement DFA.

Complement of DFA is way easier than of RE: complement accept states.

>>> let dfa = complement $ fromRE $ RE.star "abc"
>>> putPretty dfa
0 -> \x -> if
    | x <= '`'  -> 3
    | x <= 'a'  -> 2
    | otherwise -> 3
1+ -> \x -> if
    | x <= 'b'  -> 3
    | x <= 'c'  -> 0
    | otherwise -> 3
2+ -> \x -> if
    | x <= 'a'  -> 3
    | x <= 'b'  -> 1
    | otherwise -> 3
3+ -> \_ -> 3 -- black hole
>>> map (match dfa) ["", "abc", "abcabc", "aa","abca", 'a' : 'a' : undefined]
[False,False,False,True,True,True]

Methods

complement :: DFA c -> DFA c Source #

Ord c => Match c (DFA c) Source #

Run DFA on the input.

Because we have analysed a language, in some cases we can determine an input without traversing all of the input. That's not the cases with RE match.

>>> let dfa = fromRE $ RE.star "abc"
>>> map (match dfa) ["", "abc", "abcabc", "aa", 'a' : 'a' : undefined]
[True,True,True,False,False]

Holds:

match (fromRE re) xs == match re xs
all (match (fromRE r)) $ take 10 $ RE.generate (curry QC.choose) 42 (r :: RE.RE Char)

Methods

match :: DFA c -> [c] -> Bool Source #

Show c => Show (DFA c) Source # 

Methods

showsPrec :: Int -> DFA c -> ShowS #

show :: DFA c -> String #

showList :: [DFA c] -> ShowS #

Show c => Pretty (DFA c) Source # 

Methods

pretty :: DFA c -> String Source #

prettyS :: DFA c -> ShowS Source #

Conversions

fromRE :: forall c. (Ord c, Enum c, Bounded c) => RE c -> DFA c Source #

Convert RE to DFA.

>>> putPretty $ fromRE $ RE.star "abc"
0+ -> \x -> if
    | x <= '`'  -> 3
    | x <= 'a'  -> 2
    | otherwise -> 3
1 -> \x -> if
    | x <= 'b'  -> 3
    | x <= 'c'  -> 0
    | otherwise -> 3
2 -> \x -> if
    | x <= 'a'  -> 3
    | x <= 'b'  -> 1
    | otherwise -> 3
3 -> \_ -> 3 -- black hole

Everything and nothing result in blackholes:

>>> traverse_ (putPretty . fromRE) [RE.empty, RE.star RE.anyChar]
0 -> \_ -> 0 -- black hole
0+ -> \_ -> 0 -- black hole

Character ranges are effecient:

>>> putPretty $ fromRE $ RE.charRange 'a' 'z'
0 -> \x -> if
    | x <= '`'  -> 2
    | x <= 'z'  -> 1
    | otherwise -> 2
1+ -> \_ -> 2
2 -> \_ -> 2 -- black hole

An example with two blackholes:

>>> putPretty $ fromRE $ "c" <> RE.star RE.anyChar
0 -> \x -> if
    | x <= 'b'  -> 2
    | x <= 'c'  -> 1
    | otherwise -> 2
1+ -> \_ -> 1 -- black hole
2 -> \_ -> 2 -- black hole

toRE :: forall c. (Ord c, Enum c, Bounded c) => DFA c -> RE c Source #

Convert DFA to RE.

>>> putPretty $ toRE $ fromRE "foobar"
^foobar$

For string regular expressions, toRE . fromRE = id:

let s = take 5 s' in RE.string (s :: String) === toRE (fromRE (RE.string s))

But in general it isn't:

>>> let aToZ = RE.star $ RE.charRange 'a' 'z'
>>> traverse_ putPretty [aToZ, toRE $ fromRE aToZ]
^[a-z]*$
^([a-z]|[a-z]?[a-z]*[a-z]?)?$
not-prop> (re :: RE.RE Char) === toRE (fromRE re)

However, they are equivalent:

>>> RE.equivalent aToZ (toRE (fromRE aToZ))
True

And so are others

>>> all (\re -> RE.equivalent re (toRE (fromRE re))) [RE.star "a", RE.star "ab"]
True
expensive-prop> RE.equivalent re (toRE (fromRE (re :: RE.RE Char)))

Note, that toRE . fromRE can, and usually makes regexp unrecognisable:

>>> putPretty $ toRE $ fromRE $ RE.star "ab"
^(a(ba)*b)?$

We can complement DFA, therefore we can complement RE. For example. regular expression matching string containing an a:

>>> let withA = RE.star RE.anyChar <> "a" <> RE.star RE.anyChar
>>> let withoutA = toRE $ complement $ fromRE withA
>>> putPretty withoutA
^([^a]|[^a]?[^a]*[^a]?)?$
>>> let withoutA' = RE.star $ RE.REChars $ RSet.complement $ RSet.singleton 'a'
>>> putPretty withoutA'
^[^a]*$
>>> RE.equivalent withoutA withoutA'
True

Quite small, for example 2 state DFAs can result in big regular expressions:

>>> putPretty $ toRE $ complement $ fromRE $ star "ab"
^([^]|a(ba)*(ba)?|a(ba)*([^b]|b[^a])|([^a]|a(ba)*([^b]|b[^a]))[^]*[^]?)$

We can use toRE . fromERE to convert ERE to RE:

>>> putPretty $ toRE $ fromERE $ complement $ star "ab"
^([^]|a(ba)*(ba)?|a(ba)*([^b]|b[^a])|([^a]|a(ba)*([^b]|b[^a]))[^]*[^]?)$
>>> putPretty $ toRE $ fromERE $ "a" /\ "b"
^[]$

See https://mathoverflow.net/questions/45149/can-regular-expressions-be-made-unambiguous for the description of the algorithm used.

fromERE :: forall c. (Ord c, Enum c, Bounded c) => ERE c -> DFA c Source #

Convert ERE to DFA.

We don't always generate minimal automata:

>>> putPretty $ fromERE $ "a" /\ "b"
0 -> \_ -> 1
1 -> \_ -> 1 -- black hole

Compare this to an complement example

Using fromTMEquiv, we can get minimal automaton, for the cost of higher complexity (slow!).

>>> putPretty $ fromTMEquiv $ ("a" /\ "b" :: ERE.ERE Char)
0 -> \_ -> 0 -- black hole
>>> putPretty $ fromERE $ complement $ star "abc"
0 -> \x -> if
    | x <= '`'  -> 3
    | x <= 'a'  -> 2
    | otherwise -> 3
1+ -> \x -> if
    | x <= 'b'  -> 3
    | x <= 'c'  -> 0
    | otherwise -> 3
2+ -> \x -> if
    | x <= 'a'  -> 3
    | x <= 'b'  -> 1
    | otherwise -> 3
3+ -> \_ -> 3 -- black hole

toERE :: forall c. (Ord c, Enum c, Bounded c) => DFA c -> ERE c Source #

Convert DFA to ERE.

fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c Source #

Create from TransitionMap.

See fromRE for a specific example.

fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c Source #

Create from TransitonMap minimising states with Equivalent.

See fromERE for an example.

toKleene :: forall k c. (Ord c, Enum c, Bounded c, FiniteKleene c k) => DFA c -> k Source #

Convert to any Kleene.

See toRE for a specific example.