Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
Kleene algebra.
This package provides means to work with kleene algebra,
at the moment specifically concentrating on regular expressions over Char
.
Implements ideas from Regular-expression derivatives re-examined by Scott Owens, John Reppy and Aaron Turon https://doi.org/10.1017/S0956796808007090.
>>>
:set -XOverloadedStrings
>>>
import Algebra.Lattice
>>>
import Algebra.PartialOrd
>>>
import Data.Semigroup
>>>
import Kleene.Internal.Pretty (putPretty)
Kleene.RE module provides RE
type. Kleene.Classes module provides various
classes to work with the type. All of that is re-exported from Kleene module.
First let's construct a regular expression value:
>>>
let re = star "abc" <> "def" <> ("x" \/ "yz") :: RE Char
>>>
putPretty re
^(abc)*def(x|yz)$
We can convert it to DFA
(there are 8 states)
>>>
let dfa = fromTM re
>>>
putPretty dfa
0 -> \x -> if | x <= '`' -> 8 | x <= 'a' -> 5 | x <= 'c' -> 8 | x <= 'd' -> 3 | otherwise -> 8 1 -> \x -> if | x <= 'w' -> 8 | x <= 'x' -> 6 | x <= 'y' -> 7 | otherwise -> 8 2 -> ... ...
It's also possible to graphically visualise DFAs
λ> writeFile "example.dot' (toDot
dfa)
% dot -Tpng -oexample.png example.dot
And we can convert back from DFA
to RE
:
>>>
let re' = toKleene dfa :: RE Char
>>>
putPretty re'
^(a(bca)*bcdefx|defx|(a(bca)*bcdefy|defy)z)$
As you see, we don't get what we started with. Yet, these
regular expressions are equivalent
;
>>>
equivalent re re'
True
or using Equiv
wrapper
>>>
Equiv re == Equiv re'
True
(The paper doesn't outline decision procedure for the equivalence, though it's right there - seems to be fast enough at least for toy examples like here).
We can use regular expressions to generate word examples in the language:
>>>
import Data.Foldable
>>>
import qualified Test.QuickCheck as QC
>>>
import Kleene.RE (generate)
>>>
traverse_ print $ take 5 $ generate (curry QC.choose) 42 re
"abcabcabcabcabcabcdefyz" "abcabcabcabcdefyz" "abcabcabcabcabcabcabcabcabcdefx" "abcabcdefx" "abcabcabcabcabcabcdefyz"
In addition to the "normal" regular expressions, there are extended regular expressions.
Regular expressions which we can complement
, and therefore intersect:
>>>
let ere = star "aa" /\ star "aaa" :: ERE Char
>>>
putPretty ere
^~(~((aa)*)|~((aaa)*))$
We can convert ERE
to RE
via DFA
:
>>>
let re'' = toKleene (fromTM ere) :: RE Char
>>>
putPretty re''
^(a(aaaaaa)*aaaaa)?$
Machine works own ways, we don't (always) get as pretty results as we'd like:
>>>
equivalent re'' (star "aaaaaa")
True
Another feature of the library is an Applciative
Functor
,
>>>
import Control.Applicative
>>>
import qualified Kleene.Functor as F
>>>
let f = (,) <$> many (F.char 'x') <* F.few F.anyChar <*> many (F.char 'z')
>>>
putPretty f
^x*[^]*z*$
By relying on http://hackage.haskell.org/package/regex-applicative library, we can match and capture with regular expression.
>>>
F.match f "xyyzzz"
Just ("x","zzz")
Where with RE
we can only get True
or False
:
>>>
match (F.toRE f) "xyyzzz"
True
Which in this case is not even interesting because:
>>>
equivalent (F.toRE f) everything
True
Converting from RE
to K
is also possible, which may be handy:
>>>
let g = (,) <$> F.few F.anyChar <*> F.fromRE re''
>>>
putPretty g
^[^]*(a(aaaaaa)*aaaaa)?$
>>>
F.match g (replicate 20 'a')
Just ("aa","aaaaaaaaaaaaaaaaaa")
We got longest divisible by 6 prefix of as. That's because fromRE
uses many
for star
.
Synopsis
- data RE c
- data ERE c
- newtype Equiv r c = Equiv (r c)
- data DFA c = DFA {
- dfaTransition :: !(IntMap (SF c Int))
- dfaInitial :: !Int
- dfaAcceptable :: !IntSet
- dfaBlackholes :: !IntSet
- fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c
- fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c
- toKleene :: forall k c. (Ord c, Enum c, Bounded c, FiniteKleene c k) => DFA c -> k
- toDot :: DFA Char -> String
- class Kleene k where
- class Kleene k => CharKleene c k | k -> c where
- class CharKleene c k => FiniteKleene c k | k -> c where
- class Derivate c k | k -> c where
- class Match c k | k -> c where
- class Derivate c k => TransitionMap c k | k -> c where
- transitionMap :: k -> Map k (SF c k)
- class Complement c k | k -> c where
- complement :: k -> k
- class ToLatin1 k where
- data K c a
Regular expressions
Regular expression
Constructors are exposed, but you should use
smart constructors in this module to construct RE
.
The Eq
and Ord
instances are structural.
The Kleene
etc constructors do "weak normalisation", so for values
constructed using those operations Eq
witnesses "weak equivalence".
See equivalent
for regular-expression equivalence.
Structure is exposed in Kleene.RE module but consider constructors as
half-internal. There are soft-invariants, but violating them shouldn't
break anything in the package. (e.g. transitionMap
will eventually
terminate, but may create more redundant states if starting regexp is not
"weakly normalised").
Instances
ToLatin1 RE Source # | |
(Ord c, Enum c, Bounded c) => Complement c (RE c) Source # | |
Defined in Kleene.DFA complement :: RE c -> RE c Source # | |
(Ord c, Enum c, Bounded c) => TransitionMap c (RE c) Source # | |
Defined in Kleene.Internal.RE | |
(Ord c, Enum c, Bounded c) => Equivalent c (RE c) Source # | |
Defined in Kleene.Internal.RE | |
(Ord c, Enum c, Bounded c) => Match c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => Derivate c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => FiniteKleene c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => CharKleene c (RE c) Source # | |
Eq c => Eq (RE c) Source # | |
Ord c => Ord (RE c) Source # | |
Show c => Show (RE c) Source # | |
c ~ Char => IsString (RE c) Source # | |
Defined in Kleene.Internal.RE fromString :: String -> RE c # | |
Eq c => Semigroup (RE c) Source # | |
Eq c => Monoid (RE c) Source # | |
(Ord c, Enum c, Bounded c, Arbitrary c) => Arbitrary (RE c) Source # | |
CoArbitrary c => CoArbitrary (RE c) Source # | |
Defined in Kleene.Internal.RE coarbitrary :: RE c -> Gen b -> Gen b # | |
(Ord c, Enum c, Bounded c) => Lattice (RE c) Source # | WARNING: The
|
(Ord c, Enum c, Bounded c) => BoundedJoinSemiLattice (RE c) Source # | |
Defined in Kleene.DFA | |
(Ord c, Enum c, Bounded c) => BoundedMeetSemiLattice (RE c) Source # | |
Defined in Kleene.DFA | |
c ~ Char => Pretty (RE c) Source # | |
(Ord c, Enum c, Bounded c) => Kleene (RE c) Source # | |
Extended regular expression
It's both, Kleene and Boolean algebra. (If we add only intersections, it wouldn't be Boolean).
Note: we don't have special constructor for intersections. We use de Morgan formula \(a \land b = \neg (\neg a \lor \neg b)\).
>>>
putPretty $ asEREChar $ "a" /\ "b"
^~(~a|~b)$
There is no generator, as intersections
makes it hard.
Instances
Equivalence (and partial order)
Regular-expressions for which ==
is equivalent
.
>>>
let re1 = star "a" <> "a" :: RE Char
>>>
let re2 = "a" <> star "a" :: RE Char
>>>
re1 == re2
False
>>>
Equiv re1 == Equiv re2
True
Equiv
is also a PartialOrd
(but not Ord
!)
>>>
Equiv "a" `leq` Equiv (star "a" :: RE Char)
True
Not all regular expessions are comparable
:
>>>
let reA = Equiv "a" :: Equiv RE Char
>>>
let reB = Equiv "b" :: Equiv RE Char
>>>
(leq reA reB, leq reB reA)
(False,False)
Equiv (r c) |
Instances
Complement c (r c) => Complement c (Equiv r c) Source # | |
Defined in Kleene.Equiv complement :: Equiv r c -> Equiv r c Source # | |
Equivalent c (r c) => Equivalent c (Equiv r c) Source # | |
Defined in Kleene.Equiv | |
Match c (r c) => Match c (Equiv r c) Source # | |
Derivate c (r c) => Derivate c (Equiv r c) Source # | |
CharKleene c (r c) => CharKleene c (Equiv r c) Source # | |
Equivalent c (r c) => Eq (Equiv r c) Source # | |
Show (r c) => Show (Equiv r c) Source # | |
Semigroup (r c) => Semigroup (Equiv r c) Source # | |
Monoid (r c) => Monoid (Equiv r c) Source # | |
Lattice (r c) => Lattice (Equiv r c) Source # | |
BoundedJoinSemiLattice (r c) => BoundedJoinSemiLattice (Equiv r c) Source # | |
Defined in Kleene.Equiv | |
BoundedMeetSemiLattice (r c) => BoundedMeetSemiLattice (Equiv r c) Source # | |
Defined in Kleene.Equiv | |
(Lattice (r c), Equivalent c (r c)) => PartialOrd (Equiv r c) Source # | \(a \preceq b := a \lor b = b \) |
Pretty (r c) => Pretty (Equiv r c) Source # | |
Kleene (r c) => Kleene (Equiv r c) Source # | |
Deterministic finite automaton
Deterministic finite automaton.
A deterministic finite automaton (DFA) over an alphabet \(\Sigma\) (type
variable c
) is 4-tuple \(Q\), \(q_0\) , \(F\), \(\delta\), where
- \(Q\) is a finite set of states (subset of
s
), - \(q_0 \in Q\) is the distinguised start state (
dfaInitial
), - \(F \subset Q\) is a set of final (or accepting) states (
dfaAcceptable
), and - \(\delta : Q \times \Sigma \to Q\) is a function called the state
transition function (
dfaTransition
).
DFA | |
|
Instances
Complement c (DFA c) Source # | Complement DFA. Complement of
|
Defined in Kleene.DFA complement :: DFA c -> DFA c Source # | |
Ord c => Match c (DFA c) Source # | Run Because we have analysed a language, in some cases we can determine a result
without traversing all of the input.
That's not the cases with
Holds:
all (match (fromRE r)) $ take 10 $ RE.generate (curry QC.choose) 42 (r :: RE.RE Char) |
Ord c => Derivate c (DFA c) Source # | |
Show c => Show (DFA c) Source # | |
Show c => Pretty (DFA c) Source # | |
fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c Source #
Create from TransitionMap
.
See fromRE
for a specific example.
fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c Source #
Create from TransitonMap
minimising states with Equivalent
.
See fromERE
for an example.
toDot :: DFA Char -> String Source #
Get Graphviz dot-code of DFA.
>>>
let dfa = fromRE $ RE.star "abc"
>>>
putStr $ toDot dfa
digraph dfa { rankdir=LR; // states "0" [shape=doublecircle]; "1" [shape=circle]; "2" [shape=circle]; // initial state "" [shape=none]; "" -> "0"; // transitions "0" -> "2"[label="a"] "1" -> "0"[label="c"] "2" -> "1"[label="b"] }
Classes
Most operations are defined in following type-classes.
See Kleene.RE module for a specific version with examples.
Kleene algebra.
If k
is Monoid
it's expected that
;
if appends
= mappend
k
is Lattice
it's expected that
.unions
= joins
Empty regex. Doesn't accept anything.
Empty string. Note: different than empty
.
Concatenation.
Union.
Kleene star.
class Kleene k => CharKleene c k | k -> c where Source #
Instances
(Ord c, Enum c, Bounded c) => CharKleene c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => CharKleene c (ERE c) Source # | |
CharKleene c (M c) Source # | |
CharKleene c (r c) => CharKleene c (Equiv r c) Source # | |
class CharKleene c k => FiniteKleene c k | k -> c where Source #
everything :: k Source #
Everything. \(\Sigma^\star\).
charRange :: c -> c -> k Source #
.charRange
a
z
= ^[a-z]$
fromRSet :: RSet c -> k Source #
Generalisation of charRange
.
.
Every character except new line \n
.
Any character. Note: different than dot
!
class Derivate c k | k -> c where Source #
nullable :: k -> Bool Source #
Does language contain an empty string?
derivate :: c -> k -> k Source #
Derivative of a language.
class Match c k | k -> c where Source #
An f
can be used to match on the input.
Instances
(Ord c, Enum c, Bounded c) => Match c (RE c) Source # | |
(Ord c, Enum c) => Match c (ERE c) Source # | |
Ord c => Match c (DFA c) Source # | Run Because we have analysed a language, in some cases we can determine a result
without traversing all of the input.
That's not the cases with
Holds:
all (match (fromRE r)) $ take 10 $ RE.generate (curry QC.choose) 42 (r :: RE.RE Char) |
(Eq c, Enum c, Bounded c) => Match c (M c) Source # | |
Match c (r c) => Match c (Equiv r c) Source # | |
class Derivate c k => TransitionMap c k | k -> c where Source #
Transition map.
transitionMap :: k -> Map k (SF c k) Source #
Instances
(Ord c, Enum c, Bounded c) => TransitionMap c (RE c) Source # | |
Defined in Kleene.Internal.RE | |
(Ord c, Enum c, Bounded c) => TransitionMap c (ERE c) Source # | |
Defined in Kleene.ERE |
class Complement c k | k -> c where Source #
complement :: k -> k Source #
Instances
Complement c (ERE c) Source # | |
Defined in Kleene.ERE complement :: ERE c -> ERE c Source # | |
(Ord c, Enum c, Bounded c) => Complement c (RE c) Source # | |
Defined in Kleene.DFA complement :: RE c -> RE c Source # | |
Complement c (DFA c) Source # | Complement DFA. Complement of
|
Defined in Kleene.DFA complement :: DFA c -> DFA c Source # | |
Complement c (r c) => Complement c (Equiv r c) Source # | |
Defined in Kleene.Equiv complement :: Equiv r c -> Equiv r c Source # |
Functor
Only the type is exported so it can be referred to.
See Kleene.Functor for operations.
Applicative
Functor
regular expression.
Instances
Functor (K c) Source # | |
Applicative (K c) Source # | |
Alternative (K c) Source # | |
Alt (K c) Source # | |
Apply (K c) Source # | |
(c ~ Char, IsString a) => IsString (K c a) Source # | |
Defined in Kleene.Internal.Functor fromString :: String -> K c a # | |
c ~ Char => Pretty (K c a) Source # | Convert to non-matching JavaScript string which can be used
as an argument to
|