Safe Haskell | None |
---|
Explicit Closures, inspired by [1] and refined by [2].
References:
- Epstein, Black, Peyton-Jones. Towards Haskell in the Cloud. Haskell Symposium 2011.
- Maier, Trinder. Implementing a High-level Distributed-Memory Parallel Haskell in Haskell. IFL 2011.
- data Closure a
- unClosure :: Closure a -> a
- forceClosure :: (NFData a, ToClosure a) => Closure a -> Closure a
- idC :: Closure (a -> a)
- termC :: Closure (a -> ())
- compC :: Closure (b -> c) -> Closure (a -> b) -> Closure (a -> c)
- apC :: Closure (a -> b) -> Closure a -> Closure b
- toClosure :: ToClosure a => a -> Closure a
- class Serialize a => ToClosure a where
- locToClosure :: LocT a
- type StaticToClosure a = Static (Env -> a)
- staticToClosure :: ToClosure a => StaticToClosure a
- mkClosure :: ExpQ -> ExpQ
- mkClosureLoc :: ExpQ -> ExpQ
- data LocT a
- here :: ExpQ
- unsafeMkClosure :: a -> Static (Env -> a) -> Env -> Closure a
- data Static a
- data StaticDecl
- declare :: Static a -> StaticDecl
- register :: StaticDecl -> IO ()
- showStaticTable :: [String]
- static :: Name -> ExpQ
- static_ :: Name -> ExpQ
- staticLoc :: Name -> ExpQ
- staticLoc_ :: Name -> ExpQ
- data Env
- encodeEnv :: Serialize a => a -> Env
- decodeEnv :: Serialize a => Env -> a
- declareStatic :: StaticDecl
Explicit Closures
Key facts
An explicit Closure is a term of type Closure t
, wrapping a thunk
of type t
. Henceforth, we will write Closure (capitalised) to mean
an explicit Closure, and closure (in lower case) to mean an ordinary
Haskell closure (ie. a thunk).
The
type constructor is abstract (see reference [2] or
module Closure
Internal
for its internal dual
representation).
The function
returns the thunk wrapped in a Closure.
unClosure
The function
constructs Closures but is considered
unsafe because it exposes the internal dual representation, and
relies on the programmer to guarantee consistency.
unsafeMkClosure
Article [2] proposes a safe Closure construction $(mkClosure [|...|])
where the ...
is an arbitrary thunk. Due to current limitations of the
GHC (namely missing support for the
type constructor),
this module provides only limited variants of Static
$(mkClosure [|...|])
.
The restrictions on the the thunk ...
are
- either
...
is a toplevel variable (also called a static closure), - or
...
is a toplevel variable (also called a closure abstraction) applied to a tuple of local variables.
As a matter of nomenclature, functions operating on Closures will either
contain the string Closure
(like
and unClosure
) or will
end in the suffix mkClosure
C
(like the categorical operations
and
idC
).
compC
Some identities involving Closures:
unClosure $ unsafeMkClosure x fun env = x
unClosure $ toClosure x = x
unClosure $(mkClosure [| stat_clo |]) = stat_clo
unClosure $(mkClosure [| clo_abs free_vars |]) = clo_abs free_vars
The Closure type constructor
An explicit Closure, ie. a term of type Closure a
, maintains a dual
representation of an actual closure (ie. a thunk) of type a
.
- One half of that representation is the actual closure, the thunk
of type
a
. - The other half is a serialisable representation of the thunk,
consisting of a
environment deserialiser, of typeStatic
, plus a serialised environment of typeStatic
(Env
-> a)
.Env
Representation (1) is used for computing with Closures while representation (2) is used for serialising and communicating Closures across the network.
Caveat: Deserialisation is not type safe
Deserialising a serialised value via the methods of class
(or Binary
) is not type safe.
For instance, the compiler will happily assign type Serialize
Int -> Bool
to
decodeBool . encodeInt where decodeBool = decode :: ByteString -> Bool encodeInt = encode :: Int -> ByteString
This type coercion will only fail at runtime, when (and if) the decoder for
Bool
stumbles over unexepected input values. Hence deserialising can be
viewed as an assertion that the serialised byte string represents a value
of the type decode
expects to see. A well-written decoder will check
this assertion on deserialisation, and will abort with an error message
if the assertion fails.
HdpH treats Closure deserialisation similarly as an assertion that the
serialised byte string represents a value of the expected Closure type.
However, HdpH does not check whether the assertion holds. Instead it
subverts the type system via unsafeCoerce
, with potentially disastrous
consequences (like seg faults) in cases where the assertion fails.
The up side of this design choice is a simpler Closure representation
without runtime type reflection.
The down side is that Closure deserialisation has to be treated with
extreme care.
HdpH, for instance, avoids the pitfalls of Closure deserialisation by
making sure only Closures of the fixed type Closure (Par ())
are
serialised and deserialised.
Closure elimination
unClosure :: Closure a -> aSource
Eliminates a Closure by returning its thunk. This operation is cheap.
Forcing Closures
forceClosure :: (NFData a, ToClosure a) => Closure a -> Closure aSource
forceClosure
fully forces its argument, ie. fully normalises the thunk.
Importantly, forceClosure
also updates the serialisable closure
represention in order to keep it in sync with the normalised thunk.
Note that only the thunk of the resulting Closure is in normal form;
for the serialisable representation to also be in normal form, the resulting
Closure must be forced by
.
deepseq
Note that forceClosure clo
does not have the same effect as
Categorical operations on function Closures
A function Closure is a Closure of type
.
Closure
(a -> b)
compC :: Closure (b -> c) -> Closure (a -> b) -> Closure (a -> c)Source
Composition of function Closures.
apC :: Closure (a -> b) -> Closure a -> Closure bSource
Application of a function Closure to another Closure.
Behaves like the ap
operation of an applicative functor.
Construction of value Closures
A value Closure is a Closure, which when eliminated (from its serialisable representation, at least) yields an evaluated value (rather than an unevaluated thunk).
toClosure :: ToClosure a => a -> Closure aSource
toClosure x
constructs a value Closure wrapping x
, provided the
type of x
is an instance of class
.
Note that the serialised representation of the resulting Closure stores a
serialised representation (as per class ToClosure
) of
Serialize
x
, so serialising the resulting Closure will force x
(hence could be
costly). However, Closure construction itself is cheap.
class Serialize a => ToClosure a whereSource
Indexing class, recording types which support the
operation;
see the tutorial below for a more thorough explanation.
Note that toClosure
ToClosure
is a subclass of
.
Serialize
locToClosure :: LocT aSource
Only method of class ToClosure
, recording the source location
where an instance of ToClosure
is declared.
type StaticToClosure a = Static (Env -> a)Source
staticToClosure :: ToClosure a => StaticToClosure aSource
Safe Closure construction
mkClosure :: ExpQ -> ExpQSource
Template Haskell transformation constructing a Closure from a given thunk.
The thunk must either be a single toplevel closure (in which case the result
is a static Closure), or an application of a toplevel closure
abstraction to a tuple of local variables.
See the tutorial below for how to use mkClosure
.
mkClosureLoc :: ExpQ -> ExpQSource
Template Haskell transformation constructing a family of Closures from a
given thunk. The family is indexed by location (that's what the suffix Loc
stands for).
The thunk must either be a single toplevel closure (in which case the result
is a family of static Closures), or an application of a toplevel closure
abstraction to a tuple of local variables.
See the tutorial below for how to use mkClosureLoc
.
A value of type 'LocT a'
is a representation of a Haskell source location
(or more precisely, the location of a Template Haskell slice, as produced by
). Additionally, this location is annotated with a phantom type here
a
,
which is used for mapping location indexing to type indexing.
Unsafe Closure construction
unsafeMkClosure :: a -> Static (Env -> a) -> Env -> Closure aSource
unsafeMkClosure thk fun env
constructs a Closure that
- wraps the thunk
thk
and - whose serialised representation consists of the
deserialiserStatic
fun
and the serialised environmentenv
.
This operation is cheap and does not require Template Haskell support, but it is unsafe because it relies on the programmer to ensure that both closure representations evaluate to the same term.
Static terms
A term t
is called static if it could be declared at the toplevel.
That is, t
is static if all free variables occuring in t
are
toplevel.
The Static type constructor
A term of type Static a
is a serialisable reference to a static term
of type a
.
Static declaration and registration
data StaticDecl Source
declare :: Static a -> StaticDeclSource
register :: StaticDecl -> IO ()Source
showStaticTable :: [String]Source
Emits the contents of the global
table as a list of Strings,
one per entry. Useful for debugging; not to be called prior to registration.
Static
Static deserialisers
A
environment deserialiser is a term of type
Static
, for some type Static
(Env
-> a)a
, and is part
of the serialisable representation of an explicit Closure.
Template Haskell transformation converting a toplevel closure abstraction
(given by its name) into a
deserialiser.
See the tutorial below for how to use Static
static
.
Template Haskell transformation converting a static toplevel closure
(given by its name) into a
deserialiser.
Note that a static closure ignores its empty environment (which is
what the suffix Static
_
is meant to signify).
See the tutorial below for how to use static_
.
staticLoc :: Name -> ExpQSource
Template Haskell transformation converting a toplevel closure abstraction
(given by its name) into a family of
deserialisers indexed by
location (that's what the suffix Static
Loc
stands for).
See the tutorial below for how to use staticLoc
.
staticLoc_ :: Name -> ExpQSource
Template Haskell transformation converting a static toplevel closure
(given by its name) into a family of
deserialisers indexed by
location (that's what the suffix Static
Loc
stands for).
Note that a static closure ignores its empty environment (which is
what the suffix _
is meant to signify).
See the tutorial below for how to use staticLoc_
.
Serialised environment
A serialised enviroment is part of the serialisable representation of an explicit Closure.
Abstract type of serialised environments.
encodeEnv :: Serialize a => a -> EnvSource
Creates a serialised environment from a given value of type a
.
decodeEnv :: Serialize a => Env -> aSource
Deserialises a serialised environment producing a value of type a
.
Note that the programmer asserts that the environment can be deserialised
at type a
, a type mismatch may abort or crash the program.
This module's Static declaration
Tutorial on safe Closure construction
This guide will demonstrate the safe construction of explicit Closures in HdpH through a series of examples.
- Constructing plain Closures
The first example is the definition of the Closure transformation
apC
. It is defined in [2] (where it is called mapClosure
) by eliminating
the Closures of its arguments, and constructing a new Closure of the the
resulting application, as follows:
apC :: Closure (a -> b) -> Closure a -> Closure b apC clo_f clo_x = $(mkClosure [|unClosure clo_f $ unClosure clo_x|])
If
were fully supported by GHC then Static
would abstract
the free local variables (here mkClosure
clo_f
and clo_x
) in its quoted argument,
serialise a tuple of these variables, and construct a suitable
deserialiser. In short, expanding the above Template Haskell splice
would yield the following definition:
Static
apC clo_f clo_x = let thk = unClosure clo_f $ unClosure clo_x env = encodeEnv (clo_f, clo_x) fun = static $ \ env -> let (clo_f, clo_x) = decodeEnv env in unClosure clo_f $ unClosure clo_x in unsafeMkClosure thk fun env
However, the current implementation of
cannot do this because
there is no term former mkClosure
static
. Instead, the current implementation
of
expects its quoted argument to be in a one of two special
forms: either it is a single toplevel variable, or an application of a
toplevel variable to a tuple of free variables. To distinguish the two
forms, Closures constructed from single toplevel variables will be called
static (because like static terms they do not capture any variables).
mkClosure
Here is the definition of apC
as an example of how to construct a
general, non-static Closure.
apC clo_f clo_x = $(mkClosure [|apC_abs (clo_f, clo_x)|]) apC_abs :: (Closure (a -> b), Closure a) -> b apC_abs (clo_f, clo_x) = unClosure clo_f $ unClosure clo_x
First, the programmer must manually define a toplevel closure abstraction
apC_abs
which abstracts a tuple (clo_f, clo_x)
of free local variables
occuring in the expression to be converted into a Closure. (Usually, the
programmer would want the closure abstraction apC_abs
to be inlined.)
Second, the programmer constructs the explicit Closure via a Template
Haskell splice $(mkClosure [|apC_abs (clo_f, clo_x)|])
, where the quoted
expression apC_abs (clo_f, clo_x)
is an application of the toplevel closure
abstraction to a tuple of free local variables. It is important
that this actual tuple of variables matches exactly the formal tuple
in the definition of the closure abstraction. In fact, best practice is
for the quoted expression to textually match the left-hand side of the
definition of the closure abstraction.
Finally, a
deserialiser corresponding to the toplevel closure
abstraction must be declared. To this end, this module exports a Template
Haskell function Static
which converts the name of a toplevel closure
abstraction into a static
deserialiser, and a function Static
which
turns the declare
deserialiser into a Static
declaration. These
should be used as follows; see also the definition of Static
declareStatic
below.
declare $(static 'apC_abs)
The second example is the definition of the static Closure idC
, which
lifts the identity function to a function Closure.
The definition is a follows; see also code towards the end of this file.
idC :: Closure (a -> a) idC = $(mkClosure [|id|])
The expression to be converted into a Closure, the variable
, does not
contain any free local variables, so there is no need to abstract a tuple
of free variables. In fact, the expression is already a toplevel variable,
so there is also no need to define a new toplevel closure abstraction either.
Thus, the programmer constructs the static Closure via the the Template
Haskell splice id
$(mkClosure [|id|])
, where the quoted expression
is a toplevel variable.
id
The programmer must also declare a
deserialiser corresponding to
the toplevel variable of which the static Closure was created. This is done
as follows; see also the definition of Static
declareStatic
below.
declare $(static_ 'id)
Note the use of
instead of static_
to create a static
deserialiser for a static Closure, ie. a Static
deserialiser that does
not actually deserialise any free variables. Accidentally mixing up
Static
and static
may lead to runtime errors!
static_
- Constructing families of Closures
A problem arises with Closures whose type is not only polymorphic (as eg.
the type of idC
above) but also constrained by type classes. The problem
here is that the class constraint has to be resolved statically, as there
is no way of attaching implicit dictionaries to a Closure (because these
dictionaries would have to be serialised as well). However, there is a
way of safely constructing such constrained Closures. The idea is to
actually construct a family of Closures, indexed by source locations.
The indices are produced by certain type class instances, effectively
turning the location-based into a type-based indexing. We illustrate
this method on two examples.
The first example is the function toClosure
which converts any suitable
value into a Closure, forcing the value upon serialisation. The suitable
values are those whose type is an instance of class
, so one would expect Serialize
toClosure
to have the
following implementation and
declaration:
Static
toClosure :: (Serialize a) => a -> Closure a toClosure val = $(mkClosure [| id val |]) declare $(static 'id)
However, this does not compile - the last line complains from an ambiguous
type variable a
in the constrait Serialize a
.
The solution is to define a new type class ToClosure
as a subclass of
the given
constraint, and use instances of
Serialize
ToClosure
to index a family of Closures. The indexing is done by
locToClosure
, the only member of class ToClosure
, as follows.
class (Serialize a) => ToClosure a where locToClosure :: LocT a toClosure :: (ToClosure a) => a -> Closure a toClosure val = $(mkClosureLoc [| id val |]) locToClosure
Note that the above splice $(mkClosureLoc [|id val|])
generates a family
of type
. Applying that family to the index
LocT
a -> Closure
alocToClosure
yields the actual Closure. On the face of it, indexing
appears to be location based, but actually the family generated by
never evaluates the index; thanks to the phantom type
argument of mkClosureLoc
the indexing is really done statically on the types.
(Though the location information is necessary to tag the associated
LocT
deserialisers, and will be evaluated when constructing the
Static
table.)
Static
What this achieves is reducing the constraint on toClosure
from
to Serialize
ToClosure
. Hence toClosure
is only
available for types for which the programmer explicitly instantiates
ToClosure
. These instances are very simple, see the following two samples.
instance ToClosure Int where locToClosure = $(here) instance ToClosure (Closure a) where locToClosure = $(here)
Note that the two instances are entirely identical, in fact all instances
of ToClosure
must be identical. In particular, instances must not
be recursive, so that the number of types instantiating ToClosure
matches exactly the number of instance declarations in the source code.
All an instance does is record its location in the source code via
locToClosure
, making locToClosure
a key for ToClosure
instances.
The programmer must declare the
deserialisers associated with
the above family of Closures. More precisely, she must declare one
deserialiser per Static
ToClosure
instance. This is done by defining a
family of
deserialisers, similar to the family of Closures.
Static
type StaticToClosure a = Static (Env -> a) staticToClosure :: (ToClosure a) => StaticToClosure a staticToClosure = $(staticLoc 'id) locToClosure
Note that the splice $(staticLoc 'id)
above generates a family of
type
. Applying that family to the index
LocT
a -> Static
(Env
-> a)locToClosure
yields an actual
deserialiser. The type synomyn
Static
StaticToClosure
is a mere convenience to improve readability, as will
become evident when actually declaring the
deserialisers
(particularly in the second example).
Static
It remains to actually declare the
deserialisers, one per instance
of Static
ToClosure
. Since
declarations form a monoid, the programmer
can simply concatenate all these declarations. So, given the above sample
Static
ToClosure
instances, the programmer would declare the associated
deserialisers as follows.
Data.Monoid.mconcat [declare (staticToClosure :: StaticToClosure Int), declare (staticToClosure :: forall a . StaticToClosure (Closure a))]
The second example of families of Closures is forceCC
, which wraps
forceC
into a Closure, where forceC
is defined as follows:
forceC :: (NFData a, ToClosure a) => Strategy (Closure a) forceC clo = return $! forceClosure clo
Note that forceC
lifts
to a strategy in the forceClosure
Par
monad; please see module Control.Parallel.HdpH.Strategies
for the
definition of the Strategy
type constructor.
Ideally, forceCC
would be implemented simply like this:
forceCC :: (NFData a, ToClosure a) => Closure (Strategy (Closure a)) forceCC = $(mkClosure [|forceC|])
However, the type class constraint demands that forceCC
generate a family
of Closures; in fact, it must generate a family of static Closures because
forceC
, the expression quoted in the Template Haskell splice, is a single
toplevel variable. The actual implementation of forceCC
is as follows:
class (NFData a, ToClosure a) => ForceCC a where locForceCC :: LocT (Strategy (Closure a)) forceCC :: (ForceCC a) => Closure (Strategy (Closure a)) forceCC = $(mkClosureLoc [| forceC |]) locForceCC
Above is the first part of the code, the definition of a new type class
ForceCC
and the defintion of forceCC
itself. Note the complex phantom
type argument of locForceCC
; this is dictated by the splice
$(mkClosureLoc [|forceC|])
which is of type
.
Instances of class LocT
(Strategy (Closure
a)) -> Closure
(Strategy (Closure
a))ForceCC
are very similar to instances of ToClosure
,
cf. the following two samples.
instance ForceCC Int where locForceCC = $(here) instance ForceCC (Closure a) where locForceCC = $(here)
Along with the family of Closures there is a family of
deserialisers, defined as follows. Note the use of Static
because
staticLoc_
generates a family of static Closures.
forceCC
type StaticForceCC a = Static (Env -> Strategy (Closure a)) staticForceCC :: (ForceCC a) => StaticForceCC a staticForceCC = $(staticLoc_ 'forceC) locForceCC
It remains to actually declare the
deserialisers, one per
Static
ForceCC
instance. Again, this is very similar to declaring the
deserialisers linked to ToClosure
instances. But note how the type
synonym StaticForceCC
improves readability of the declaration - just try
expanding the last line of the following declaration.
Data.Monoid.mconcat [declare (staticForceCC :: StaticForceCC Int), declare (staticForceCC :: forall a . StaticForceCC (Closure a))]
- Registering
deserialisersStatic
All
deserialisers need to be declared, as shown above.
By convention, each module must concatenate all such Static
declarations
(including declarations in imported modules) into a single Static
declaration. The Static
Main
module, finally, must register its
declaration (which by convention includes all declarations in imported
modules) at the beginning of the Static
main
function, to create the
table. This section shows how to concatenate and register Static
declarations.
Static
First, concatenating
declarations. As shown above, a Static
deserialiser can be turned into a Static
declaration (of type
Static
) by applying StaticDecl
. The type declare
is an
instance of class StaticDecl
, so Monoid
declarations can be
concatenated via the Static
operation. In fact, mconcat
is not
just a monoid but a commutative and idempotent monoid, thanks to the
way StaticDecl
deserialisers are constructed. That means that Static
declarations can be concatenated repeatedly and in any order without
changing the result.
Static
As a matter of convention, every module which constructs explicit Closures
must export a term declareStatic :: StaticDecl
, a comprehensive declaration
of all the module's
deserialisers, including all Static
deserialisers of imported modules. The latter is required because the module
may call imported functions which in turn depend on their modules'
Static
deserialisers.
Static
declareStatic :: StaticDecl declareStatic = Data.Monoid.mconcat [declare $(static_ 'id), declare $(static_ 'constUnit), declare $(static 'compC_abs), declare $(static 'apC_abs), declare (staticToClosure :: forall a . StaticToClosure (Closure a))]
Above is a simple example of such a comprehensive Static declaration,
the declaration of this module. Note that the order of the list argument
to
does not matter because because mconcat
is a
commutative monoid. Below is a more complex example, taken from a HdpH
program.
StaticDecl
declareStatic :: StaticDecl declareStatic = Data.Monoid.mconcat [Control.Parallel.HdpH.declareStatic, Control.Parallel.HdpH.Strategies.declareStatic, declare (staticToClosure :: StaticToClosure Int), declare (staticToClosure :: StaticToClosure [Int]), declare (staticToClosure :: StaticToClosure Integer), declare (staticForceCC :: StaticForceCC Integer), declare $(static 'spark_sum_euler_abs), declare $(static_ 'sum_totient), declare $(static_ 'totient)]
This declaration does concatenate the Static declarations of two imported
modules, Control.Parllel.HdpH
and Control.Parallel.HdpH.Strategies
.
Actually, the latter also imports the former, so
Control.Parallel.HdpH.Strategies.declareStatic
already concatenates
Control.Parallel.HdpH.declareStatic
, and so the above call to
Control.Parallel.HdpH.declareStatic
could be omitted. However, that
omission is only justified in the knowledge of implementation details of
Control.Parallel.HdpH.Strategies
(namely its import graph). The convention
that
declarations of all imported modules must be concatenated,
regardless of whether they are transitively concatenated by other imported
modules, is more robust; the resulting Static
declaration is the same
anyway, thanks to Static
being an idempotent monoid.
StaticDecl
A general rule how to deal with imports and
declarations:
If module Static
M1
imports another module M2
, and
if M2
exports a variable declareStatic :: StaticDecl
then M1
must export its own declareStatic :: StaticDecl
,
which needs to concatenate M2.declareStatic
(amongst other declarations).
Note that the rule applies even if M1
ostensibly imports nothing
from M2
, eg. M1
imports M2
with the clause import M2 ()
. For
this clause may still import instance declarations from M2
, and those
instances may depend on M2's
deserialisers.
Static
Finally, registering a
declaration. This must be done exactly once
in the Static
Main
module, right at the start of main
, by calling the function
on the register
Main
module's comprehensive
declaration.
This is shown in sample code below, taken from a HdpH program.
Note that any actions executed prior to Static
(ie. the register
hSetBuffering
calls) must not involve explicit Closures.
main :: IO () main = do hSetBuffering stdout LineBuffering hSetBuffering stderr LineBuffering register declareStatic ...