Safe Haskell | None |
---|

Explicit Closures, inspired by [1] and refined by [2].

References:

- Epstein, Black, Peyton-Jones.
*Towards Haskell in the Cloud*. Haskell Symposium 2011. - Maier, Trinder.
*Implementing a High-level Distributed-Memory**Parallel Haskell in Haskell*. IFL 2011.

- data Closure a
- unClosure :: Closure a -> a
- forceClosure :: (NFData a, ToClosure a) => Closure a -> Closure a
- idC :: Closure (a -> a)
- termC :: Closure (a -> ())
- compC :: Closure (b -> c) -> Closure (a -> b) -> Closure (a -> c)
- apC :: Closure (a -> b) -> Closure a -> Closure b
- toClosure :: ToClosure a => a -> Closure a
- class Serialize a => ToClosure a where
- locToClosure :: LocT a

- type StaticToClosure a = Static (Env -> a)
- staticToClosure :: ToClosure a => StaticToClosure a
- mkClosure :: ExpQ -> ExpQ
- mkClosureLoc :: ExpQ -> ExpQ
- data LocT a
- here :: ExpQ
- unsafeMkClosure :: a -> Static (Env -> a) -> Env -> Closure a
- data Static a
- data StaticDecl
- declare :: Static a -> StaticDecl
- register :: StaticDecl -> IO ()
- showStaticTable :: [String]
- static :: Name -> ExpQ
- static_ :: Name -> ExpQ
- staticLoc :: Name -> ExpQ
- staticLoc_ :: Name -> ExpQ
- data Env
- encodeEnv :: Serialize a => a -> Env
- decodeEnv :: Serialize a => Env -> a
- declareStatic :: StaticDecl

# Explicit Closures

## Key facts

An *explicit Closure* is a term of type `Closure t`

, wrapping a thunk
of type `t`

. Henceforth, we will write *Closure* (capitalised) to mean
an explicit Closure, and *closure* (in lower case) to mean an ordinary
Haskell closure (ie. a thunk).

The

type constructor is abstract (see reference [2] or
module `Closure`

`Internal`

for its internal *dual*
representation).

The function

returns the thunk wrapped in a Closure.
`unClosure`

The function

constructs Closures but is considered
`unsafeMkClosure`

*unsafe* because it exposes the internal dual representation, and
relies on the programmer to guarantee consistency.

Article [2] proposes a *safe* Closure construction `$(mkClosure [|...|])`

where the `...`

is an arbitrary thunk. Due to current limitations of the
GHC (namely missing support for the

type constructor),
this module provides only limited variants of `Static`

`$(mkClosure [|...|])`

.
The restrictions on the the thunk `...`

are

- either
`...`

is a toplevel variable (also called a*static closure*), - or
`...`

is a toplevel variable (also called a*closure abstraction*) applied to a tuple of local variables.

As a matter of nomenclature, functions operating on Closures will either
contain the string `Closure`

(like

and `unClosure`

) or will
end in the suffix `mkClosure`

`C`

(like the categorical operations

and
`idC`

).
`compC`

Some identities involving Closures:

unClosure $ unsafeMkClosure x fun env = x

unClosure $ toClosure x = x

unClosure $(mkClosure [| stat_clo |]) = stat_clo

unClosure $(mkClosure [| clo_abs free_vars |]) = clo_abs free_vars

## The Closure type constructor

An explicit Closure, ie. a term of type `Closure a`

, maintains a dual
representation of an actual closure (ie. a thunk) of type `a`

.

- One half of that representation is the actual closure, the
*thunk*of type`a`

. - The other half is a serialisable representation of the thunk,
consisting of a

environment deserialiser, of type`Static`

, plus a serialised environment of type`Static`

(`Env`

-> a)

.`Env`

Representation (1) is used for computing with Closures while representation (2) is used for serialising and communicating Closures across the network.

## Caveat: Deserialisation is not type safe

Deserialising a serialised value via the methods of class

(or `Binary`

) is not type safe.
For instance, the compiler will happily assign type `Serialize`

`Int -> Bool`

to

decodeBool . encodeInt where decodeBool = decode :: ByteString -> Bool encodeInt = encode :: Int -> ByteString

This type coercion will only fail at runtime, when (and if) the decoder for
`Bool`

stumbles over unexepected input values. Hence deserialising can be
viewed as an assertion that the serialised byte string represents a value
of the type `decode`

expects to see. A well-written decoder will check
this assertion on deserialisation, and will abort with an error message
if the assertion fails.

HdpH treats Closure deserialisation similarly as an assertion that the
serialised byte string represents a value of the expected Closure type.
However, HdpH does not check whether the assertion holds. Instead it
subverts the type system via `unsafeCoerce`

, with potentially disastrous
consequences (like seg faults) in cases where the assertion fails.

The up side of this design choice is a simpler Closure representation
without runtime type reflection.
The down side is that Closure deserialisation has to be treated with
*extreme care*.
HdpH, for instance, avoids the pitfalls of Closure deserialisation by
making sure only Closures of the fixed type `Closure (Par ())`

are
serialised and deserialised.

## Closure elimination

unClosure :: Closure a -> aSource

Eliminates a Closure by returning its thunk. This operation is cheap.

## Forcing Closures

forceClosure :: (NFData a, ToClosure a) => Closure a -> Closure aSource

`forceClosure`

fully forces its argument, ie. fully normalises the thunk.
Importantly, `forceClosure`

also updates the serialisable closure
represention in order to keep it in sync with the normalised thunk.
Note that only the thunk of the resulting Closure is in normal form;
for the serialisable representation to also be in normal form, the resulting
Closure must be forced by

.
`deepseq`

Note that `forceClosure clo`

does not have the same effect as

## Categorical operations on function Closures

A *function Closure* is a Closure of type

.
`Closure`

(a -> b)

compC :: Closure (b -> c) -> Closure (a -> b) -> Closure (a -> c)Source

Composition of function Closures.

apC :: Closure (a -> b) -> Closure a -> Closure bSource

Application of a function Closure to another Closure.
Behaves like the `ap`

operation of an applicative functor.

## Construction of value Closures

A *value Closure* is a Closure, which when eliminated (from its
serialisable representation, at least) yields an evaluated value
(rather than an unevaluated thunk).

toClosure :: ToClosure a => a -> Closure aSource

`toClosure x`

constructs a value Closure wrapping `x`

, provided the
type of `x`

is an instance of class

.
Note that the serialised representation of the resulting Closure stores a
serialised representation (as per class `ToClosure`

) of
`Serialize`

`x`

, so serialising the resulting Closure will force `x`

(hence could be
costly). However, Closure construction itself is cheap.

class Serialize a => ToClosure a whereSource

Indexing class, recording types which support the

operation;
see the tutorial below for a more thorough explanation.
Note that `toClosure`

`ToClosure`

is a subclass of

.
`Serialize`

locToClosure :: LocT aSource

Only method of class `ToClosure`

, recording the source location
where an instance of `ToClosure`

is declared.

type StaticToClosure a = Static (Env -> a)Source

staticToClosure :: ToClosure a => StaticToClosure aSource

## Safe Closure construction

mkClosure :: ExpQ -> ExpQSource

Template Haskell transformation constructing a Closure from a given thunk.
The thunk must either be a single toplevel closure (in which case the result
is a *static* Closure), or an application of a toplevel closure
abstraction to a tuple of local variables.
See the tutorial below for how to use `mkClosure`

.

mkClosureLoc :: ExpQ -> ExpQSource

Template Haskell transformation constructing a family of Closures from a
given thunk. The family is indexed by location (that's what the suffix `Loc`

stands for).
The thunk must either be a single toplevel closure (in which case the result
is a family of *static* Closures), or an application of a toplevel closure
abstraction to a tuple of local variables.
See the tutorial below for how to use `mkClosureLoc`

.

A value of type `'LocT a'`

is a representation of a Haskell source location
(or more precisely, the location of a Template Haskell slice, as produced by

). Additionally, this location is annotated with a phantom type `here`

`a`

,
which is used for mapping location indexing to type indexing.

## Unsafe Closure construction

unsafeMkClosure :: a -> Static (Env -> a) -> Env -> Closure aSource

`unsafeMkClosure thk fun env`

constructs a Closure that

- wraps the thunk
`thk`

and - whose serialised representation consists of the

deserialiser`Static`

`fun`

and the serialised environment`env`

.

This operation is cheap and does not require Template Haskell support,
but it is *unsafe* because it relies on the programmer to ensure that
both closure representations evaluate to the same term.

# Static terms

A term `t`

is called *static* if it could be declared at the toplevel.
That is, `t`

is static if all free variables occuring in `t`

are
toplevel.

## The Static type constructor

A term of type `Static a`

is a serialisable reference to a *static* term
of type `a`

.

## Static declaration and registration

data StaticDecl Source

declare :: Static a -> StaticDeclSource

register :: StaticDecl -> IO ()Source

showStaticTable :: [String]Source

Emits the contents of the global

table as a list of Strings,
one per entry. Useful for debugging; not to be called prior to registration.
`Static`

## Static deserialisers

A `Static`

*environment deserialiser* is a term of type

, for some type `Static`

(`Env`

-> a)`a`

, and is part
of the serialisable representation of an explicit Closure.

Template Haskell transformation converting a toplevel closure abstraction
(given by its name) into a

deserialiser.
See the tutorial below for how to use `Static`

`static`

.

Template Haskell transformation converting a static toplevel closure
(given by its name) into a

deserialiser.
Note that a static closure ignores its empty environment (which is
what the suffix `Static`

`_`

is meant to signify).
See the tutorial below for how to use `static_`

.

staticLoc :: Name -> ExpQSource

Template Haskell transformation converting a toplevel closure abstraction
(given by its name) into a family of

deserialisers indexed by
location (that's what the suffix `Static`

`Loc`

stands for).
See the tutorial below for how to use `staticLoc`

.

staticLoc_ :: Name -> ExpQSource

Template Haskell transformation converting a static toplevel closure
(given by its name) into a family of

deserialisers indexed by
location (that's what the suffix `Static`

`Loc`

stands for).
Note that a static closure ignores its empty environment (which is
what the suffix `_`

is meant to signify).
See the tutorial below for how to use `staticLoc_`

.

# Serialised environment

A *serialised enviroment* is part of the serialisable representation
of an explicit Closure.

Abstract type of serialised environments.

encodeEnv :: Serialize a => a -> EnvSource

Creates a serialised environment from a given value of type `a`

.

decodeEnv :: Serialize a => Env -> aSource

Deserialises a serialised environment producing a value of type `a`

.
Note that the programmer asserts that the environment can be deserialised
at type `a`

, a type mismatch may abort or crash the program.

# This module's Static declaration

# Tutorial on safe Closure construction

This guide will demonstrate the safe construction of explicit Closures in HdpH through a series of examples.

- Constructing plain Closures

The first example is the definition of the Closure transformation
`apC`

. It is defined in [2] (where it is called `mapClosure`

) by eliminating
the Closures of its arguments, and constructing a new Closure of the the
resulting application, as follows:

apC :: Closure (a -> b) -> Closure a -> Closure b apC clo_f clo_x = $(mkClosure [|unClosure clo_f $ unClosure clo_x|])

If

were fully supported by GHC then `Static`

would abstract
the free local variables (here `mkClosure`

`clo_f`

and `clo_x`

) in its quoted argument,
serialise a tuple of these variables, and construct a suitable

deserialiser. In short, expanding the above Template Haskell splice
would yield the following definition:
`Static`

apC clo_f clo_x = let thk = unClosure clo_f $ unClosure clo_x env = encodeEnv (clo_f, clo_x) fun = static $ \ env -> let (clo_f, clo_x) = decodeEnv env in unClosure clo_f $ unClosure clo_x in unsafeMkClosure thk fun env

However, the current implementation of

cannot do this because
there is no term former `mkClosure`

`static`

. Instead, the current implementation
of

expects its quoted argument to be in a one of two special
forms: either it is a single `mkClosure`

*toplevel* variable, or an application of a
*toplevel* variable to a tuple of free variables. To distinguish the two
forms, Closures constructed from single toplevel variables will be called
*static* (because like static terms they do not capture any variables).

Here is the definition of `apC`

as an example of how to construct a
general, non-static Closure.

apC clo_f clo_x = $(mkClosure [|apC_abs (clo_f, clo_x)|]) apC_abs :: (Closure (a -> b), Closure a) -> b apC_abs (clo_f, clo_x) = unClosure clo_f $ unClosure clo_x

First, the programmer must manually define a toplevel *closure abstraction*
`apC_abs`

which abstracts a tuple `(clo_f, clo_x)`

of free local variables
occuring in the expression to be converted into a Closure. (Usually, the
programmer would want the closure abstraction `apC_abs`

to be inlined.)

Second, the programmer constructs the explicit Closure via a Template
Haskell splice `$(mkClosure [|apC_abs (clo_f, clo_x)|])`

, where the quoted
expression `apC_abs (clo_f, clo_x)`

is an application of the toplevel closure
abstraction to a tuple of free local variables. It is important
that this actual tuple of variables matches *exactly* the formal tuple
in the definition of the closure abstraction. In fact, best practice is
for the quoted expression to textually match the left-hand side of the
definition of the closure abstraction.

Finally, a

deserialiser corresponding to the toplevel closure
abstraction must be declared. To this end, this module exports a Template
Haskell function `Static`

which converts the name of a toplevel closure
abstraction into a `static`

deserialiser, and a function `Static`

which
turns the `declare`

deserialiser into a `Static`

declaration. These
should be used as follows; see also the definition of `Static`

`declareStatic`

below.

declare $(static 'apC_abs)

The second example is the definition of the static Closure `idC`

, which
lifts the identity function to a function Closure.
The definition is a follows; see also code towards the end of this file.

idC :: Closure (a -> a) idC = $(mkClosure [|id|])

The expression to be converted into a Closure, the variable

, does not
contain any free local variables, so there is no need to abstract a tuple
of free variables. In fact, the expression is already a toplevel variable,
so there is also no need to define a new toplevel closure abstraction either.
Thus, the programmer constructs the static Closure via the the Template
Haskell splice `id`

`$(mkClosure [|id|])`

, where the quoted expression

is a toplevel variable.
`id`

The programmer must also declare a

deserialiser corresponding to
the toplevel variable of which the static Closure was created. This is done
as follows; see also the definition of `Static`

`declareStatic`

below.

declare $(static_ 'id)

Note the use of

instead of `static_`

to create a `static`

deserialiser for a `Static`

*static* Closure, ie. a

deserialiser that does
not actually deserialise any free variables. Accidentally mixing up
`Static`

and `static`

may lead to runtime errors!
`static_`

- Constructing families of Closures

A problem arises with Closures whose type is not only polymorphic (as eg.
the type of `idC`

above) but also constrained by type classes. The problem
here is that the class constraint has to be resolved statically, as there
is no way of attaching implicit dictionaries to a Closure (because these
dictionaries would have to be serialised as well). However, there is a
way of safely constructing such constrained Closures. The idea is to
actually construct a family of Closures, indexed by source locations.
The indices are produced by certain type class instances, effectively
turning the location-based into a type-based indexing. We illustrate
this method on two examples.

The first example is the function `toClosure`

which converts any suitable
value into a Closure, forcing the value upon serialisation. The *suitable*
values are those whose type is an instance of class

, so one would expect `Serialize`

`toClosure`

to have the
following implementation and

declaration:
`Static`

toClosure :: (Serialize a) => a -> Closure a toClosure val = $(mkClosure [| id val |]) declare $(static 'id)

However, this does not compile - the last line complains from an ambiguous
type variable `a`

in the constrait `Serialize a`

.

The solution is to define a new type class `ToClosure`

as a subclass of
the given

constraint, and use instances of
`Serialize`

`ToClosure`

to index a family of Closures. The indexing is done by
`locToClosure`

, the only member of class `ToClosure`

, as follows.

class (Serialize a) => ToClosure a where locToClosure :: LocT a toClosure :: (ToClosure a) => a -> Closure a toClosure val = $(mkClosureLoc [| id val |]) locToClosure

Note that the above splice `$(mkClosureLoc [|id val|])`

generates a family
of type

. Applying that family to the index
`LocT`

a -> `Closure`

a`locToClosure`

yields the actual Closure. On the face of it, indexing
appears to be location based, but actually the family generated by

never evaluates the index; thanks to the phantom type
argument of `mkClosureLoc`

the indexing is really done statically on the types.
(Though the location information is necessary to tag the associated
`LocT`

deserialisers, and will be evaluated when constructing the
`Static`

table.)
`Static`

What this achieves is reducing the constraint on `toClosure`

from

to `Serialize`

`ToClosure`

. Hence `toClosure`

is only
available for types for which the programmer explicitly instantiates
`ToClosure`

. These instances are very simple, see the following two samples.

instance ToClosure Int where locToClosure = $(here) instance ToClosure (Closure a) where locToClosure = $(here)

Note that the two instances are entirely identical, in fact all instances
of `ToClosure`

must be identical. In particular, instances must not
be recursive, so that the number of types instantiating `ToClosure`

matches exactly the number of instance declarations in the source code.
All an instance does is record its location in the source code via
`locToClosure`

, making `locToClosure`

a key for `ToClosure`

instances.

The programmer must declare the

deserialisers associated with
the above family of Closures. More precisely, she must declare one
deserialiser per `Static`

`ToClosure`

instance. This is done by defining a
family of

deserialisers, similar to the family of Closures.
`Static`

type StaticToClosure a = Static (Env -> a) staticToClosure :: (ToClosure a) => StaticToClosure a staticToClosure = $(staticLoc 'id) locToClosure

Note that the splice `$(staticLoc 'id)`

above generates a family of
type

. Applying that family to the index
`LocT`

a -> `Static`

(`Env`

-> a)`locToClosure`

yields an actual

deserialiser. The type synomyn
`Static`

`StaticToClosure`

is a mere convenience to improve readability, as will
become evident when actually declaring the

deserialisers
(particularly in the second example).
`Static`

It remains to actually declare the

deserialisers, one per instance
of `Static`

`ToClosure`

. Since

declarations form a monoid, the programmer
can simply concatenate all these declarations. So, given the above sample
`Static`

`ToClosure`

instances, the programmer would declare the associated
deserialisers as follows.

Data.Monoid.mconcat [declare (staticToClosure :: StaticToClosure Int), declare (staticToClosure :: forall a . StaticToClosure (Closure a))]

The second example of families of Closures is `forceCC`

, which wraps
`forceC`

into a Closure, where `forceC`

is defined as follows:

forceC :: (NFData a, ToClosure a) => Strategy (Closure a) forceC clo = return $! forceClosure clo

Note that `forceC`

lifts

to a `forceClosure`

*strategy* in the `Par`

monad; please see module `Control.Parallel.HdpH.Strategies`

for the
definition of the `Strategy`

type constructor.

Ideally, `forceCC`

would be implemented simply like this:

forceCC :: (NFData a, ToClosure a) => Closure (Strategy (Closure a)) forceCC = $(mkClosure [|forceC|])

However, the type class constraint demands that `forceCC`

generate a family
of Closures; in fact, it must generate a family of *static* Closures because
`forceC`

, the expression quoted in the Template Haskell splice, is a single
toplevel variable. The actual implementation of `forceCC`

is as follows:

class (NFData a, ToClosure a) => ForceCC a where locForceCC :: LocT (Strategy (Closure a)) forceCC :: (ForceCC a) => Closure (Strategy (Closure a)) forceCC = $(mkClosureLoc [| forceC |]) locForceCC

Above is the first part of the code, the definition of a new type class
`ForceCC`

and the defintion of `forceCC`

itself. Note the complex phantom
type argument of `locForceCC`

; this is dictated by the splice
`$(mkClosureLoc [|forceC|])`

which is of type

.
Instances of class `LocT`

(Strategy (`Closure`

a)) -> `Closure`

(Strategy (`Closure`

a))`ForceCC`

are very similar to instances of `ToClosure`

,
cf. the following two samples.

instance ForceCC Int where locForceCC = $(here) instance ForceCC (Closure a) where locForceCC = $(here)

Along with the family of Closures there is a family of

deserialisers, defined as follows. Note the use of `Static`

because
`staticLoc_`

generates a family of `forceCC`

*static* Closures.

type StaticForceCC a = Static (Env -> Strategy (Closure a)) staticForceCC :: (ForceCC a) => StaticForceCC a staticForceCC = $(staticLoc_ 'forceC) locForceCC

It remains to actually declare the

deserialisers, one per
`Static`

`ForceCC`

instance. Again, this is very similar to declaring the
deserialisers linked to `ToClosure`

instances. But note how the type
synonym `StaticForceCC`

improves readability of the declaration - just try
expanding the last line of the following declaration.

Data.Monoid.mconcat [declare (staticForceCC :: StaticForceCC Int), declare (staticForceCC :: forall a . StaticForceCC (Closure a))]

- Registering

deserialisers`Static`

All

deserialisers need to be declared, as shown above.
By convention, each module must concatenate all such `Static`

declarations
(including declarations in imported modules) into a single `Static`

declaration. The `Static`

`Main`

module, finally, must *register* its

declaration (which by convention includes all declarations in imported
modules) at the beginning of the `Static`

`main`

function, to create the

table. This section shows how to concatenate and register `Static`

declarations.
`Static`

First, concatenating

declarations. As shown above, a `Static`

deserialiser can be turned into a `Static`

declaration (of type
`Static`

) by applying `StaticDecl`

. The type `declare`

is an
instance of class `StaticDecl`

, so `Monoid`

declarations can be
concatenated via the `Static`

operation. In fact, `mconcat`

is not
just a monoid but a commutative and idempotent monoid, thanks to the
way `StaticDecl`

deserialisers are constructed. That means that `Static`

declarations can be concatenated repeatedly and in any order without
changing the result.
`Static`

As a matter of convention, every module which constructs explicit Closures
must export a term `declareStatic :: StaticDecl`

, a comprehensive declaration
of all the module's

deserialisers, including all `Static`

deserialisers of imported modules. The latter is required because the module
may call imported functions which in turn depend on their modules'
`Static`

deserialisers.
`Static`

declareStatic :: StaticDecl declareStatic = Data.Monoid.mconcat [declare $(static_ 'id), declare $(static_ 'constUnit), declare $(static 'compC_abs), declare $(static 'apC_abs), declare (staticToClosure :: forall a . StaticToClosure (Closure a))]

Above is a simple example of such a comprehensive Static declaration,
the declaration of this module. Note that the order of the list argument
to

does not matter because because `mconcat`

is a
commutative monoid. Below is a more complex example, taken from a HdpH
program.
`StaticDecl`

declareStatic :: StaticDecl declareStatic = Data.Monoid.mconcat [Control.Parallel.HdpH.declareStatic, Control.Parallel.HdpH.Strategies.declareStatic, declare (staticToClosure :: StaticToClosure Int), declare (staticToClosure :: StaticToClosure [Int]), declare (staticToClosure :: StaticToClosure Integer), declare (staticForceCC :: StaticForceCC Integer), declare $(static 'spark_sum_euler_abs), declare $(static_ 'sum_totient), declare $(static_ 'totient)]

This declaration does concatenate the Static declarations of two imported
modules, `Control.Parllel.HdpH`

and `Control.Parallel.HdpH.Strategies`

.
Actually, the latter also imports the former, so
`Control.Parallel.HdpH.Strategies.declareStatic`

already concatenates
`Control.Parallel.HdpH.declareStatic`

, and so the above call to
`Control.Parallel.HdpH.declareStatic`

could be omitted. However, that
omission is only justified in the knowledge of implementation details of
`Control.Parallel.HdpH.Strategies`

(namely its import graph). The convention
that

declarations of `Static`

*all* imported modules must be concatenated,
regardless of whether they are transitively concatenated by other imported
modules, is more robust; the resulting

declaration is the same
anyway, thanks to `Static`

being an idempotent monoid.
`StaticDecl`

A general rule how to deal with imports and

declarations:
If module `Static`

`M1`

imports another module `M2`

, and
if `M2`

exports a variable `declareStatic :: StaticDecl`

then `M1`

must export its own `declareStatic :: StaticDecl`

,
which needs to concatenate `M2.declareStatic`

(amongst other declarations).
Note that the rule applies even if `M1`

ostensibly imports nothing
from `M2`

, eg. `M1`

imports `M2`

with the clause `import M2 ()`

. For
this clause may still import instance declarations from `M2`

, and those
instances may depend on M2's

deserialisers.
`Static`

Finally, registering a

declaration. This must be done exactly once
in the `Static`

`Main`

module, right at the start of `main`

, by calling the function

on the `register`

`Main`

module's comprehensive

declaration.
This is shown in sample code below, taken from a HdpH program.
Note that any actions executed prior to `Static`

(ie. the `register`

`hSetBuffering`

calls) must not involve explicit Closures.

main :: IO () main = do hSetBuffering stdout LineBuffering hSetBuffering stderr LineBuffering register declareStatic ...