Data.Generic.Diff
Contents
Description
Use this library to get an efficient, optimal, type-safe diff and
patch function for your datatypes of choice by defining a simple GADT and
some class instances. The process is explained in the documentation of the
Family type class.
The result of diff is an optimal EditScript that contains the operations
that need to be performed to get from the source value to the destination
value. The edit script can be used by patch, inspected with show and used
for all kinds of interesting stuff you come up with.
The library has been extracted from Eelco Lempsink's Master's Thesis
Generic type-safe diff and patch for families of datatypes (available online:
http://eelco.lempsink.nl/thesis.pdf). See Appendix C for a large example.
Note that some types and functions have been renamed for the benefit of the API
(e.g., diff to diffL, diff1 to diff, Diff to EditScriptL).
- diff :: (Type f x, Type f y) => x -> y -> EditScript f x y
- patch :: EditScript f x y -> x -> y
- diffL :: (Family f, List f txs, List f tys) => txs -> tys -> EditScriptL f txs tys
- patchL :: forall f txs tys. EditScriptL f txs tys -> txs -> tys
- compress :: Family f => EditScriptL f txs tys -> EditScriptL f txs tys
- data EditScriptL where
- Ins :: (Type f t, List f ts, List f tys) => f t ts -> EditScriptL f txs (Append ts tys) -> EditScriptL f txs (Cons t tys)
- Del :: (Type f t, List f ts, List f txs) => f t ts -> EditScriptL f (Append ts txs) tys -> EditScriptL f (Cons t txs) tys
- Cpy :: (Type f t, List f ts, List f txs, List f tys) => f t ts -> EditScriptL f (Append ts txs) (Append ts tys) -> EditScriptL f (Cons t txs) (Cons t tys)
- CpyTree :: (Type f t, List f txs, List f tys) => EditScriptL f txs tys -> EditScriptL f (Cons t txs) (Cons t tys)
- End :: EditScriptL f Nil Nil
- type EditScript f x y = EditScriptL f (Cons x Nil) (Cons y Nil)
- class Family f where
- class Family f => Type f t where
- constructors :: [Con f t]
- data a :=: b where
- data Con where
- data Nil = CNil
- data Cons x xs = CCons x xs
Diffing and patchLing
diff :: (Type f x, Type f y) => x -> y -> EditScript f x ySource
Calculate the difference between two values in the form of an edit script.
See the documentation for Family for how to make your own data types work
with this function.
patch :: EditScript f x y -> x -> ySource
Apply the edit script to value.
For multiple datatypes
diffL :: (Family f, List f txs, List f tys) => txs -> tys -> EditScriptL f txs tysSource
Underlying implementation of diff, works with (heterogeneous) lists of
values.
patchL :: forall f txs tys. EditScriptL f txs tys -> txs -> tysSource
Underlying implementation of patch, works with (heterogeneous) lists of
values.
Patch compression
compress :: Family f => EditScriptL f txs tys -> EditScriptL f txs tysSource
Patch datatype
data EditScriptL whereSource
The EditScriptL datatype captures the result of diffL and can be used as the input
to patchL (and compress).
The diffL function use a depth-first preorder traversal to traverse the
expression trees. The edit script it outputs contains the operation that must
be applied to the constructor at that point: either keeping it (Cpy),
removing it (Del, which does not remove the complete subtree, but contracts
the edge of the removed node) or inserting a new constructor (Ins, which can
only be done if the available subtrees at that point correspond to the types
the constructor expects). (The CpyTree is only used when running compress
over an existing edit script.)
For more information about this datatype, you're advised to look at Eelco Lempsink's thesis at http://eelco.lempsink.nl/thesis.pdf.
Constructors
| Ins :: (Type f t, List f ts, List f tys) => f t ts -> EditScriptL f txs (Append ts tys) -> EditScriptL f txs (Cons t tys) | |
| Del :: (Type f t, List f ts, List f txs) => f t ts -> EditScriptL f (Append ts txs) tys -> EditScriptL f (Cons t txs) tys | |
| Cpy :: (Type f t, List f ts, List f txs, List f tys) => f t ts -> EditScriptL f (Append ts txs) (Append ts tys) -> EditScriptL f (Cons t txs) (Cons t tys) | |
| CpyTree :: (Type f t, List f txs, List f tys) => EditScriptL f txs tys -> EditScriptL f (Cons t txs) (Cons t tys) | |
| End :: EditScriptL f Nil Nil |
Instances
| Show (EditScriptL f txs tys) |
type EditScript f x y = EditScriptL f (Cons x Nil) (Cons y Nil)Source
Edit script type for two single values.
Type classes
To use diff and patch on your datatypes, you must create an instance of
Family.
There are four steps to set up everything for diff and patch.
- Define your datatypes. (Presumable, you already have done this.)
- Create a family datatype, grouping your datatypes together.
- Make the family datatype an instance of
Family - Create
Typeinstances for each member of the family.
Steps 1-3 are explained below, step 4 is explained in the documentation for
Type.
As a running example, we create a simple family of datatypes (step 1):
data Expr = Min Expr Term
data Term = Parens Expr
| Number Int
The second step is creating the family datatype. Each constructor in the datatypes above gets a constructor in a family GADT:
data ExprTermFamily :: * -> * -> * where
Min' :: ExprTermFamily Expr (Cons Expr (Cons Term Nil))
Parens' :: ExprTermFamily Term (Cons Expr Nil)
Number' :: ExprTermFamily Term (Cons Int Nil)
Int' :: Int -> ExprTermFamily Int Nil
The first type argument of the datatype must be the resulting type of the
constructor. The second argument captures the types of the arguments the
constructor expects. Note how to use Cons and Nil to create type level
lists.
The Int' constructor is special, in the sense that it captures the Int type
abstractly (listing all Int's constructors would be quite impossible).
Caveat emptor: polymorphic datatypes (like lists) are ill-supported and require family constructors for each instance. It might require another master thesis project to solve this.
Step 3 is to create the instance for the Family class. For each function you
will have to implement four functions. It's straightforward albeit a bit
boring.
instance Family ExprTermFamily where
decEq Min' Min' = Just (Refl, Refl)
decEq Parens' Parens' = Just (Refl, Refl)
decEq Number' Number' = Just (Refl, Refl)
decEq (Int' x) (Int' y) | x == y = Just (Refl, Refl)
| otherwise = Nothing
decEq _ _ = Nothing
fields Min' (Min e t) = Just (CCons e (CCons t CNil))
fields Parens' (Parens e) = Just (CCons e CNil)
fields Number' (Number i) = Just (CCons i CNil)
fields (Int' _) _ = Just CNil
fields _ _ = Nothing
apply Min' (CCons e (CCons t CNil)) = Min e t
apply Parens' (CCons e CNil) = Parens e
apply Number' (CCons i CNil) = Number i
apply (Int' i) CNil = i
string Min' = "Min"
string Parens' = "Parens"
string Number' = "Number"
string (Int' i) = show i
Methods
decEq :: f tx txs -> f ty tys -> Maybe (tx :=: ty, txs :=: tys)Source
Return an instance of the equality GADT (Refl) of the type and
constructor arguments are equal, else return Nothing.
fields :: f t ts -> t -> Maybe tsSource
When the constructor from the family matches the real constructor,
return the arguments in a heterogeneous list, else return Nothing.
apply :: f t ts -> ts -> tSource
When the constructor from the family and the heterogeneous list of
arguments match, apply the real constructor. For abstract
constructors, the list of arguments should be CNil, but you project
out the value saved with the family constructor.
string :: f t ts -> StringSource
For showing the constructors from the family.
class Family f => Type f t whereSource
For each type in the family, you need to create an instance of Type, listing
all the members of the family GADT which belong to one type.
This is step 4 of the four steps needed to be able to use diff and patch.
Steps 1-3 are explained in the documentation for Family.
Continuing the example from the Family documentation, the instances for
Type are:
instance Type ExprTermFamily Term where
constructors = [Concr Number', Concr Parens']
instance Type ExprTermFamily Expr where
constructors = [Concr Min']
instance Type ExprTermFamily Int where
constructors = [Abstr Int']
Methods
constructors :: [Con f t]Source
List of constructors from the family GADT for one type in your family
Supporting datatypes
Equality GADT, see the documentation for Family for an example of its use.
Con wraps both concrete and abstract constructors to a simple type so
constructors for a single type can be put together in a list, see Type for
more information and an example.
Use Concr for concrete constructors (e.g., for simple abstract datatypes).
Use Abstr for abstract constructors (e.g., for build-in types or types with many
(or infinite) constructors)