A graph representation of core programs. A graph is a flat structure that can be viewed as a program with a global scope. For example, the Haskell program
main x = f 1 where f y = g 2 where g z = x + z
might be represented by the following flat graph:
graph = Graph { graphNodes = [ Node { nodeId = 0 , function = Input , input = Tup [] , inputType = Tup [] , outputType = intType } , Node { nodeId = 1 , function = Input , input = Tup [] , inputType = Tup [] , outputType = intType } , Node { nodeId = 2 , function = Input , input = Tup [] , inputType = Tup [] , outputType = intType } , Node { nodeId = 3 , function = Function "(+)" , input = Tup [One (Variable (0,[])), One (Variable (2,[]))] , inputType = intPairType , outputType = intType } , Node { nodeId = 4 , function = NoInline "f" (Interface 1 (One (Variable (5,[]))) intType intType) , input = One (Constant (IntData 1)) , inputType = intType , outputType = intType } , Node { nodeId = 5 , function = NoInline "g" (Interface 2 (One (Variable (3,[]))) intType intType) , input = One (Constant (IntData 2)) , inputType = intType , outputType = intType } ] , graphInterface = Interface { interfaceInput = 0 , interfaceOutput = One (Variable (4,[])) , interfaceInputType = intType , interfaceOutputType = intType } } where intType = result (typeOf :: Res [[[Int]]] (Tuple StorableType)) intPairType = result (typeOf :: Res (Int,Int) (Tuple StorableType))
which corresponds to the following flat program
main v0 = v4 f v1 = v5 g v2 = v3 v3 = v0 + v2 v4 = f 1 v5 = g 2
There are a few assumptions on graphs:
 All nodes have unique identifiers.
 There are no cycles.
 The
input
andinputType
tuples of each node should have the same shape.  Each
interfaceInput
(including the toplevel one) refers to anInput
node not referred to by any other interface.  All
Variable
references are valid (i.e. refer only to those variables implicitly defined by each node).  There should not be any cycles in the constraints introduced by
findLocalities
. (XXX Is this even possible?)  Subfunction interfaces should be "consistent" with the input/output type of the node. For example, the body of a while loop should have the same type as the whole loop.
In the original program, g
was defined locally to f
, and the addition was
done locally in g
. But in the flat program, this hierarchy (called
definition hierarchy) is not represented. The flat program is of course not
valid Haskell (v0
and v2
are used outside of their scopes). The function
makeHierarchical
turns a flat graph into a hierarchical one that
corresponds to syntactically valid Haskell.
makeHierarchical
requires some explanation. First a few definitions:
 Nodes that have associated interfaces (
NoInline
,IfThenElse
,While
andParallel
) are said to contain subfunctions. These nodes are called super nodes. In the above program, the super nodev4
contains the subfunctionf
, andv5
contains the subfunctiong
.  A definition
d
is local to a definitione
iff.d
is placed somewhere within the definition ofe
(i.e. inside an arbitrarily deeply nestedwhere
clause).  A definition
d
is owned by a definitione
iff.d
is placed immediately under the topmostwhere
clause ofe
. A definition may have at most one owner.
The definition hierarchy thus specifies ownership between the definitions in the program. There are two types of ownership:
 A super node is always the owner of its subfunctions.
 A subfunction may be the owner of some node definitions.
Assigning nodes to subfunctions in a useful way takes some work. It is done by first finding out for each node which subfunctions it must be local to. Each locality constraint gives an upper bound on where in the definition hierarchy the node may be placed. There is one principle for introducing a locality constraint:
 If node
v
depends on the input of subfunctionf
, thenv
must be local tof
.
The locality constraints for a graph can thus be found be tracing each
subfunction input in order to find the nodes that depend on it (see function
findLocalities
). In the above program, we have the subfunctions f
and
g
with the inputs v1
and v2
respectively. We can see immediately that
no node depends on v1
, so we get no locality constraints for f
. The only
node that depends on v2
is v3
, so the program has a single locality
constraint: v3
is local to g
. Nodes without constraints are simply taken
to be local to main
. With this information, we can now rewrite the flat
program as
main v0 = v4 where v4 = f 1 where f v1 = v5 v5 = g 2 where g v2 = v3 where v3 = v0 + v2
which is syntactically valid Haskell. Note that this program is slightly
different from the original which defined g
locally to f
. However, in
general, we want definitions to be as "global" as possible in order to
maximize sharing. For example, we don't want to put definitions in the body
of a while loop unless they really depend on the loop state, because then
they will (probably, depending on implementation) be recomputed in every
iteration. Also note that in this program, it is not strictly necessary to
have the subfunctions owned by their super nodes  f
and g
could have
been owned by main
instead. However, this would cause clashes if two
subfunctions have the same name. Having subfunctions owned by their super
nodes is also a way of keeping related definitions together in the program.
There is one caveat with the above method. Consider the following flat program:
main v0 = v4 f v1 = v5 g v2 = v3 v3 = v1 + 2 v4 = f 0 v5 = g 1
Here, we get the locality constraint: v3
is local to f
. However, to get a
valid definition hierarchy, we also need v5
to be local to f
. This is
because v5
is the owner of g
, and the output of g
is local to f
. So
when looking for dependencies, we should let each super node depend on its
subfunction output, except for the owner of the very subfunction that is
being traced (a function cannot be owned by itself).
 type NodeId = Int
 type Variable = (NodeId, [Int])
 data Source
 data Node = Node {}
 data Interface = Interface {}
 data Function
 data Graph = Graph {
 graphNodes :: [Node]
 graphInterface :: Interface
 data Hierarchy = Hierarchy [(Node, [Hierarchy])]
 data HierarchicalGraph = HierGraph {}
 type SuperNode = NodeId
 data SubFunction = SubFunction {}
 data Local = Local SubFunction NodeId
 sourceNodes :: Tuple Source > [NodeId]
 fanout :: Graph > Map NodeId [NodeId]
 nodeMap :: Graph > NodeId > Node
 subFunctions :: Graph > [SubFunction]
 findLocalities :: Graph > [Local]
 orderSuperNodes :: Graph > Map NodeId [SubFunction] > Map SuperNode Int
 minimalSubFun :: Map SuperNode Int > [SubFunction] > SubFunction
 sortNodes :: [Node] > [Node]
 makeHierarchical :: Graph > HierarchicalGraph
Documentation
type Variable = (NodeId, [Int])Source
Variable represented by a node id and a tuple path. For example, in a definition (given in Haskell syntax)
((a,b),c) = f x
the variable b
would be represented as (i,[0,1])
(where i
is the id of
the f
node).
The source of a value is either constant data or a variable.
A node in the program graph. The input is given as a Source
tuple. The
output is implicitly defined by the nodeId
and the outputType
. For
example, a node with id i
and output type
Tup [One ..., One ...]
has the implicit output
Tup [One (i,[0]), One (i,[1])]
Node  

The interface of a (sub)graph. The input is conceptually a
Tuple Variable
, but all these variables refer to the same Input
node, so
it is sufficient to track the node id (the tuple shape can be inferred from
the interfaceInputType
).
Node functionality
A graph is a list of unique nodes with an interface.
Graph  

A definition hierarchy. A hierarchy consists of number of toplevel nodes, each one associated with its subfunctions, represented as hierarchies. The nodes owned by a subfunction appear as the toplevel nodes in the corresponding hierarchy.
data HierarchicalGraph Source
A graph with a hierarchical ordering of the nodes. If the hierarchy is
flattened it should result in a valid Graph
.
data SubFunction Source
The branch is used to distinguish between different subfunctions of the
same super node. For example, the continue condition of a whileloop has
branch number 0, and the body has number 1 (see subFunctions
).
sourceNodes :: Tuple Source > [NodeId]Source
Returns the nodes in a source tuple.
fanout :: Graph > Map NodeId [NodeId]Source
The fanout of each node in a graph. Nodes that are not in the map are assumed to have no fanout.
subFunctions :: Graph > [SubFunction]Source
Lists all subfunctions in the graph.
findLocalities :: Graph > [Local]Source
Lists all locality constraints of the graph.
orderSuperNodes :: Graph > Map NodeId [SubFunction] > Map SuperNode IntSource
Returns a total ordering between all super nodes in a graph, such that if
node v
is local to subfunction f
, then v
maps to a lower number than
the owner of f
. The converse is not necessarily true. The second argument
gives the locality constraints for each node in the graph (toplevel nodes
may be left undefined).
minimalSubFun :: Map SuperNode Int > [SubFunction] > SubFunctionSource
Returns the minimal subfunction according to the given owner ordering.
makeHierarchical :: Graph > HierarchicalGraphSource
Makes a hierarchical graph from a flat one. The node lists in the hierarchy are always sorted according to node id.