gdiff-1.1: Generic diff and patch

Safe HaskellNone
LanguageHaskell98

Data.Generic.Diff

Contents

Description

Use this library to get an efficient, optimal, type-safe diff and patch function for your datatypes of choice by defining a simple GADT and some class instances. The process is explained in the documentation of the Family type class.

The result of diff is an optimal EditScript that contains the operations that need to be performed to get from the source value to the destination value. The edit script can be used by patch, inspected with show and used for all kinds of interesting stuff you come up with.

The library has been extracted from Eelco Lempsink's Master's Thesis Generic type-safe diff and patch for families of datatypes (available online: http://eelco.lempsink.nl/thesis.pdf). See Appendix C for a large example. Note that some types and functions have been renamed for the benefit of the API (e.g., diff to diffL, diff1 to diff, Diff to EditScriptL).

Synopsis

Diffing and patching

diff :: (Type f x, Type f y) => x -> y -> EditScript f x y Source

Calculate the difference between two values in the form of an edit script.

See the documentation for Family for how to make your own data types work with this function.

patch :: EditScript f x y -> x -> y Source

Apply the edit script to a value.

For multiple datatypes

diffL :: (Family f, List f txs, List f tys) => txs -> tys -> EditScriptL f txs tys Source

Underlying implementation of diff, works with (heterogeneous) lists of values.

patchL :: forall f txs tys. EditScriptL f txs tys -> txs -> tys Source

Underlying implementation of patch, works with (heterogeneous) lists of values.

Patch compression

compress :: Family f => EditScriptL f txs tys -> EditScriptL f txs tys Source

Patch datatype

data EditScriptL :: (* -> * -> *) -> * -> * -> * where Source

The EditScriptL datatype captures the result of diffL and can be used as the input to patchL (and compress).

The diffL function use a depth-first preorder traversal to traverse the expression trees. The edit script it outputs contains the operation that must be applied to the constructor at that point: either keeping it (Cpy), removing it (Del, which does not remove the complete subtree, but contracts the edge of the removed node) or inserting a new constructor (Ins, which can only be done if the available subtrees at that point correspond to the types the constructor expects). (The CpyTree is only used when running compress over an existing edit script.)

For more information about this datatype, you're advised to look at Eelco Lempsink's thesis at http://eelco.lempsink.nl/thesis.pdf.

Constructors

Ins :: (Type f t, List f ts, List f tys) => f t ts -> EditScriptL f txs (Append ts tys) -> EditScriptL f txs (Cons t tys) 
Del :: (Type f t, List f ts, List f txs) => f t ts -> EditScriptL f (Append ts txs) tys -> EditScriptL f (Cons t txs) tys 
Cpy :: (Type f t, List f ts, List f txs, List f tys) => f t ts -> EditScriptL f (Append ts txs) (Append ts tys) -> EditScriptL f (Cons t txs) (Cons t tys) 
CpyTree :: (Type f t, List f txs, List f tys) => EditScriptL f txs tys -> EditScriptL f (Cons t txs) (Cons t tys) 
End :: EditScriptL f Nil Nil 

Instances

Show (EditScriptL f txs tys) 

type EditScript f x y = EditScriptL f (Cons x Nil) (Cons y Nil) Source

Edit script type for two single values.

Type classes

class Family f where Source

To use diff and patch on your datatypes, you must create an instance of Family.

There are four steps to set up everything for diff and patch.

  1. Define your datatypes. (Presumably, you already have done this.)
  2. Create a family datatype, grouping your datatypes together.
  3. Make the family datatype an instance of Family
  4. Create Type instances for each member of the family.

Steps 1-3 are explained below, step 4 is explained in the documentation for Type.

As a running example, we create a simple family of datatypes (step 1):

data Expr  =  Min Expr Term
data Term  =  Parens Expr
           |  Number Int

The second step is creating the family datatype. Each constructor in the datatypes above gets a constructor in a family GADT:

data ExprTermFamily :: * -> * -> * where
    Min'     ::          ExprTermFamily Expr (Cons Expr (Cons Term Nil))
    Parens'  ::          ExprTermFamily Term (Cons Expr Nil)
    Number'  ::          ExprTermFamily Term (Cons Int Nil)
    Int'     ::  Int ->  ExprTermFamily Int  Nil

The first type argument of the datatype must be the resulting type of the constructor. The second argument captures the types of the arguments the constructor expects. Note how to use Cons and Nil to create type level lists.

The Int' constructor is special, in the sense that it captures the Int type abstractly (listing all Int's constructors would be quite impossible).

Caveat emptor: polymorphic datatypes (like lists) are ill-supported and require family constructors for each instance. It might require another master thesis project to solve this.

Step 3 is to create the instance for the Family class. For each function you will have to implement four functions. It's straightforward albeit a bit boring.

instance Family ExprTermFamily where
    decEq Min'      Min'      =              Just (Refl, Refl)
    decEq Parens'   Parens'   =              Just (Refl, Refl)
    decEq Number'   Number'   =              Just (Refl, Refl)
    decEq (Int' x)  (Int' y)  | x == y     = Just (Refl, Refl)
                              | otherwise  = Nothing
    decEq _        _          = Nothing

    fields Min'      (Min e t)   = Just (CCons e (CCons t CNil))
    fields Parens'   (Parens e)  = Just (CCons e CNil)
    fields Number'   (Number i)  = Just (CCons i CNil)
    fields (Int' _)  _           = Just CNil
    fields _         _           = Nothing

    apply Min'      (CCons e (CCons t CNil))  = Min e t
    apply Parens'   (CCons e CNil)            = Parens e
    apply Number'   (CCons i CNil)            = Number i
    apply (Int' i)  CNil                      = i

    string Min'      = "Min"
    string Parens'   = "Parens"
    string Number'   = "Number"
    string (Int' i)  = show i

Methods

decEq :: f tx txs -> f ty tys -> Maybe (tx :~: ty, txs :~: tys) Source

Return an instance of the equality GADT (Refl) of the type and constructor arguments are equal, else return Nothing.

fields :: f t ts -> t -> Maybe ts Source

When the constructor from the family matches the real constructor, return the arguments in a heterogeneous list, else return Nothing.

apply :: f t ts -> ts -> t Source

When the constructor from the family and the heterogeneous list of arguments match, apply the real constructor. For abstract constructors, the list of arguments should be CNil, but you project out the value saved with the family constructor.

string :: f t ts -> String Source

For showing the constructors from the family.

class Family f => Type f t where Source

For each type in the family, you need to create an instance of Type, listing all the members of the family GADT which belong to one type.

This is step 4 of the four steps needed to be able to use diff and patch. Steps 1-3 are explained in the documentation for Family.

Continuing the example from the Family documentation, the instances for Type are:

instance Type ExprTermFamily Term where
    constructors = [Concr Number', Concr Parens']

instance Type ExprTermFamily Expr where
    constructors = [Concr Min']

instance Type ExprTermFamily Int where
    constructors = [Abstr Int']

Methods

constructors :: [Con f t] Source

List of constructors from the family GADT for one type in your family

Supporting datatypes

data a :~: b :: k -> k -> * where infix 4

Propositional equality. If a :~: b is inhabited by some terminating value, then the type a is the same as the type b. To use this equality in practice, pattern-match on the a :~: b to get out the Refl constructor; in the body of the pattern-match, the compiler knows that a ~ b.

Since: 4.7.0.0

Constructors

Refl :: (:~:) k a1 a1 

Instances

TestEquality k ((:~:) k a) 
(~) k a b => Bounded ((:~:) k a b) 
(~) k a b => Enum ((:~:) k a b) 
Eq ((:~:) k a b) 
Ord ((:~:) k a b) 
(~) k a b => Read ((:~:) k a b) 
Show ((:~:) k a b) 

data Con :: (* -> * -> *) -> * -> * where Source

Con wraps both concrete and abstract constructors to a simple type so constructors for a single type can be put together in a list, see Type for more information and an example.

Use Concr for concrete constructors (e.g., for simple algebraic datatypes).

Use Abstr for abstract constructors (e.g., for built-in types or types with many (or infinite) constructors)

Constructors

Concr :: List f ts => f t ts -> Con f t 
Abstr :: (Eq t, List f ts) => (t -> f t ts) -> Con f t 

data Nil Source

Datatype for type level lists, corresponding to '[]'. Use when creating your Family instance.

Constructors

CNil 

data Cons x xs Source

Datatype for type level lists, corresponding to '(:)'. Use when creating your Family instance.

Constructors

CCons x xs