Data.Derive: A User Manual

by Neil Mitchell

Data.Derive is a library and a tool for deriving instances for Haskell programs. It is designed to work with custom derivations, SYB and Template Haskell mechanisms. The tool requires GHC, but the generated code is portable to all compilers. We see this tool as a competitor to DrIFT.

This document proceeds as follows:

Obtaining and Installing Data.Derive
Supported Derivations
Using the Derive Program
Using Template Haskell Derivations
Writing a New Derivation

Acknowledgements

Thanks to everyone who has submitted patches and given assistance, including: Twan van Laarhoven, Spencer Janssen, Andrea Vezzosi, Samuel Bronson, Joel Raymont, Benedikt Huber, Stefan O'Rear, Robin Green.

Obtaining and Installing Data.Derive

Data.Derive is available using darcs:

darcs get --partial http://community.haskell.org/~ndm/darcs/derive

Install the program using the standard sequence of Cabal magic:

cabal update
cabal install derive

Supported Derivations

Data.Derive is not limited to any prebuilt set of derivations; see later for how to add your own. Out of the box, we provide derivations for the following classes, listed with the library each comes from.

Arbitrary - from the library QuickCheck

ArbitraryOld - from the library QuickCheck-1.2.0.0

Arities - from the library derive

Binary - from the library binary

BinaryDefer - from the library binarydefer

Bounded - from the library base

Data - from the library base

DataAbstract - from the library base

Default - from the library derive

Enum - from the library base

EnumCyclic - from the library base

Eq - from the library base

Fold

Foldable - from the library base

From

Functor - from the library base

Has

LazySet

Monoid - from the library base

NFData - from the library parallel

Ord - from the library base

Read - from the library base

Ref

Serial - from the library smallcheck

Serialize - from the library cereal

Set

Show - from the library base

Traversable - from the library base

Typeable - from the library base

UniplateDirect - from the library uniplate

UniplateTypeable - from the library uniplate

Update

Using the Derive Program

Let's imagine we've defined a data type:

data Color = RGB Int Int Int
           | CMYK Int Int Int Int
           deriving (Eq, Show)

Now we wish to additionally derive Binary, and to have Eq derived by our library instead of by the compiler. To do this we simply add to the deriving clause.

data Color = RGB Int Int Int
           | CMYK Int Int Int Int
           deriving (Show {-! Eq, Binary !-})

Or alternatively write:

{-!
deriving instance Eq Color
deriving instance Binary Color
!-}

Now running derive on the program containing this code will generate appropriate instances. How do you combine these instances back into the code? There are various mechanisms supported.
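Before looking at those mechanisms, it may help to see roughly what a generated instance amounts to. The following is a hand-written sketch of an Eq instance for Color, not the literal output of derive, whose generated code may be laid out differently. Binary is left out so that the example needs nothing beyond base.

```haskell
-- Hand-written sketch; the text derive actually emits may differ.
data Color = RGB Int Int Int
           | CMYK Int Int Int Int
           deriving (Show)

-- Two values are equal when they use the same constructor
-- and all corresponding fields are equal.
instance Eq Color where
  RGB r1 g1 b1 == RGB r2 g2 b2 =
    r1 == r2 && g1 == g2 && b1 == b2
  CMYK c1 m1 y1 k1 == CMYK c2 m2 y2 k2 =
    c1 == c2 && m1 == m2 && y1 == y2 && k1 == k2
  _ == _ = False  -- different constructors never compare equal
```

Whichever mechanism is used below, the goal is to get instance declarations of this kind into scope for the module that defines Color.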

Appending to the module

One way is to append the text to the bottom of the module by passing the --append flag. Derive will then generate the required instances and place them at the bottom of the file, along with a checksum. Do not modify these instances.

Using CPP

Another way is to use CPP. Ensure your compiler is set up to run the C preprocessor. For example:

{-# LANGUAGE CPP #-}
{-# OPTIONS_DERIVE --output=file.h #-}

module ModuleName where

#include "file.h"

Side-by-side Modules

A third way is to generate a separate module. If you have a module Colour.Type and wish to place the Binary instance in Colour.Binary, this can be done with:

{-# OPTIONS_DERIVE --output=Binary.hs --module=Colour.Binary --import #-}

Here you ask for the output to go to a particular file, give a specific module name and import this module. This will only work if the data structure is exported non-abstractly.

Using Template Haskell Derivations

One of Derive's advantages over DrIFT is support for Template Haskell (abbreviated TH). Derive can be invoked automatically during the compilation process, and transparently supports deriving across module boundaries. The main disadvantage of TH-based deriving is that it is only portable to compilers that support TH; currently that is GHC only.

To use the TH deriving system, with the same example as before:

{-# LANGUAGE TemplateHaskell #-}
import Data.DeriveTH
import Data.Binary

data Color = RGB Int Int Int
           | CMYK Int Int Int Int
           deriving (Show)

$( derive makeEq ''Color )
$( derive makeBinary ''Color )

We need to tell the compiler to insert the instance using the TH splice construct, $( ... ) (the spaces are optional). The splice causes the compiler to run the function derive (exported from Data.DeriveTH), passing it a derivation (here makeEq or makeBinary) and the name ''Color. The second argument deserves more explanation; it is a quoted name, somewhat like a quoted symbol in Lisp and with deliberately similar syntax. (Two apostrophes specify that the name is to be resolved as a type constructor; a single apostrophe, as in 'Color, would look for a data constructor named Color.)
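The difference between the two quoting forms can be tried in isolation. The sketch below uses only Name and nameBase from the template-haskell package and does not involve Data.DeriveTH; the bindings typeName and conName are names chosen for illustration. Because this Color has no data constructor called Color, the single-apostrophe form is shown on the RGB constructor instead.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH (Name, nameBase)

data Color = RGB Int Int Int
           | CMYK Int Int Int Int

-- Two apostrophes: resolve Color in the type namespace.
typeName :: Name
typeName = ''Color

-- One apostrophe: resolve RGB in the value namespace,
-- that is, as a data constructor.
conName :: Name
conName = 'RGB

-- nameBase drops any module qualifier, leaving the bare identifier.
typeLabel, conLabel :: String
typeLabel = nameBase typeName
conLabel  = nameBase conName
```

Here typeLabel is "Color" and conLabel is "RGB". Writing 'Color in this module would be rejected at compile time, since no data constructor of that name is in scope.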

Writing a New Derivation

There are two methods for writing a new derivation: guessing or coding. The guessing method is substantially easier if it will work for you, but is limited to derivations with the following properties:

Inductive - each derivation must be similar to the previous one. Binary does not have this property as a 1 item derivation does not have a tag, but a 2 item derivation does.
Not inductive on the type - it must be an instance for the constructors, not for the type. Typeable violates this property by inducting on the free variables in the data type.
Not type based - the derivation must not change based on the types of the fields. Play and Functor both behave differently given differently typed fields.
Not record based - the derivation must not change on record fields. Show outputs the fields, so this is not allowed.
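Two of these properties can be seen in ordinary instances. The sketch below hand-writes Binary instances for a one-constructor and a two-constructor type, to show the tag that makes Binary non-inductive, and uses compiler-derived Show to show how record syntax changes the output. The instances are written by hand for illustration and are not the code derive generates; the types OneCon, TwoCon, Plain and Rec and the helper encodedLength are hypothetical names. It assumes the binary and bytestring packages are available.

```haskell
import Data.Binary (Binary (..), encode, getWord8, putWord8)
import qualified Data.ByteString.Lazy as BL

-- One constructor: only the field is written, with no tag.
data OneCon = OneCon Int
  deriving (Eq, Show)

instance Binary OneCon where
  put (OneCon n) = put n
  get = OneCon <$> get

-- Two constructors: a one-byte tag records which one follows.
data TwoCon = TwoA Int | TwoB Int
  deriving (Eq, Show)

instance Binary TwoCon where
  put (TwoA n) = putWord8 0 >> put n
  put (TwoB n) = putWord8 1 >> put n
  get = do
    tag <- getWord8
    case tag of
      0 -> TwoA <$> get
      1 -> TwoB <$> get
      _ -> fail "TwoCon: unknown constructor tag"

-- Number of bytes a value occupies once encoded.
encodedLength :: Binary a => a -> Int
encodedLength = fromIntegral . BL.length . encode

-- Show depends on whether fields are declared with record syntax.
data Plain = Plain Int Int deriving (Show)
data Rec = Rec { recX :: Int, recY :: Int } deriving (Show)
```

With these definitions, encodedLength (TwoA 7) is one byte more than encodedLength (OneCon 7), which is the tag. Likewise show (Plain 1 2) is "Plain 1 2" while show (Rec 1 2) is "Rec {recX = 1, recY = 2}", so a guess made from a non-record example would not generalise to records.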

If your instance meets these properties, you can use derivation by guess. Many instances do; for examples see Eq, Ord, Data and Serial. If instead you need to code the derivation manually, see examples such as Update and Functor.