Copyright  (c) Justin Le 2017 

License  BSD3 
Maintainer  justin@jle.im 
Stability  experimental 
Portability  nonportable 
Safe Haskell  None 
Language  Haskell2010 
Provides the Op (and OpM) type and combinators, which represent differentiable functions/operations on values, and are used by the library to perform backpropagation.

Note that Op is a subset or subtype of OpM, and so any function that expects an OpM m as a (or an OpB s as a) can be given an Op as a and it'll work just fine.
type Op as a = forall m. Monad m => OpM m as a
pattern Op :: forall as a. (Tuple as -> (a, Maybe a -> Tuple as)) -> Op as a
newtype OpM m as a = OpM (Tuple as -> m (a, Maybe a -> m (Tuple as)))
data Prod k f a :: forall k. (k -> *) -> [k] -> * where
type Tuple = Prod * I
newtype I a :: * -> * = I { getI :: a }
runOp :: Op as a -> Tuple as -> a
gradOp :: Op as a -> Tuple as -> Tuple as
gradOp' :: Op as a -> Tuple as -> (a, Tuple as)
gradOpWith :: Op as a -> Tuple as -> a -> Tuple as
gradOpWith' :: Op as a -> Tuple as -> Maybe a -> Tuple as
runOp' :: Op as a -> Tuple as -> (a, Maybe a -> Tuple as)
runOpM :: Functor m => OpM m as a -> Tuple as -> m a
gradOpM :: Monad m => OpM m as a -> Tuple as -> m (Tuple as)
gradOpM' :: Monad m => OpM m as a -> Tuple as -> m (a, Tuple as)
gradOpWithM :: Monad m => OpM m as a -> Tuple as -> a -> m (Tuple as)
gradOpWithM' :: Monad m => OpM m as a -> Tuple as -> Maybe a -> m (Tuple as)
runOpM' :: OpM m as a -> Tuple as -> m (a, Maybe a -> m (Tuple as))
composeOp :: (Monad m, Known Length as, Every Num as) => Prod (OpM m as) bs -> OpM m bs c -> OpM m as c
composeOp1 :: (Monad m, Known Length as, Every Num as) => OpM m as b -> OpM m '[b] c -> OpM m as c
(~.) :: (Monad m, Known Length as, Every Num as) => OpM m '[b] c -> OpM m as b -> OpM m as c
composeOp' :: Monad m => Prod Summer as -> Prod (OpM m as) bs -> OpM m bs c -> OpM m as c
composeOp1' :: Monad m => Prod Summer as -> OpM m as b -> OpM m '[b] c -> OpM m as c
op0 :: a -> Op '[] a
opConst :: (Every Num as, Known Length as) => a -> Op as a
opConst' :: Prod Summer as -> a -> Op as a
op1 :: Num a => (forall s. AD s (Forward a) -> AD s (Forward a)) -> Op '[a] a
op2 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a] a
op3 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a, a] a
opN :: (Num a, Known Nat n) => (forall s. Reifies s Tape => Vec n (Reverse s a) -> Reverse s a) -> Op (Replicate n a) a
type family Replicate (n :: N) (a :: k) = (as :: [k]) | as -> n where ...
op1' :: (a -> (b, Maybe b -> a)) -> Op '[a] b
op2' :: (a -> b -> (c, Maybe c -> (a, b))) -> Op '[a, b] c
op3' :: (a -> b -> c -> (d, Maybe d -> (a, b, c))) -> Op '[a, b, c] d
opCoerce :: (Coercible a b, Num a) => Op '[a] b
opTup :: (Every Num as, Known Length as) => Op as (Tuple as)
opIso :: Num a => Iso' a b -> Op '[a] b
opCoerce' :: Coercible a b => Unity a -> Op '[a] b
opTup' :: Prod Unity as -> Op as (Tuple as)
opIso' :: Unity a -> Iso' a b -> Op '[a] b
pattern (:>) :: forall k f a b. f a -> f b -> Prod k f ((:) k a ((:) k b ([] k)))
only :: f a -> Prod k f ((:) k a ([] k))
head' :: Prod k f ((:<) k a as) -> f a
pattern (::<) :: forall a as. a -> Tuple as -> Tuple ((:<) * a as)
only_ :: a -> Tuple ((:) * a ([] *))
Implementation
Ops contain information on a function as well as its gradient, but provide that information in a way that allows them to be "chained".
For example, for a function
\[ f : \mathbb{R}^n \rightarrow \mathbb{R} \]
We might want to apply a function \(g\) to the result we get, to get our "final" result:
\[ \eqalign{ y &= f(\mathbf{x})\cr z &= g(y) } \]
Now, we might want the gradient \(\nabla z\) with respect to \(\mathbf{x}\), or \(\nabla_\mathbf{x} z\). Explicitly, this is:
\[ \nabla_\mathbf{x} z = \left< \frac{\partial z}{\partial x_1}, \frac{\partial z}{\partial x_2}, \ldots \right> \]
We can compute that by multiplying the total derivative of \(z\) with respect to \(y\) (that is, \(\frac{dz}{dy}\)) with the gradient of \(f\)) itself:
\[ \eqalign{ \nabla_\mathbf{x} z &= \frac{dz}{dy} \left< \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots \right>\cr \nabla_\mathbf{x} z &= \frac{dz}{dy} \nabla_\mathbf{x} y } \]
So, to create an Op as a with the Op constructor (or an OpM with the OpM constructor), you give a function that returns a tuple containing:

- An a: the result of the function.
- A Maybe a -> Tuple as: a function that, when given \(\frac{dz}{dy}\) (in a Just), returns the total gradient \(\nabla_\mathbf{x} z\). If the function is given Nothing, then \(\frac{dz}{dy}\) should be taken to be 1; in other words, you would simply need to return \(\nabla_\mathbf{x} y\), unchanged. That is, an input of Nothing indicates that the "final result" is simply \(f(\mathbf{x})\), and not some \(g(f(\mathbf{x}))\).
This is done so that Ops can easily be "chained" together, one after the other. If you have an Op for \(f\) and an Op for \(g\), you can compute the gradient of \(f\) knowing that the result target is \(g \circ f\).

Note that end users should probably never be required to construct an Op or OpM explicitly this way. Instead, libraries should provide carefully pre-constructed ones, or provide ways to generate them automatically (like op1, op2, and op3 here).
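The construction above can be sketched with a self-contained, single-input simplification (the Simple1 type and all names here are illustrative only, not the library's actual Op/OpM types):

```haskell
import Data.Maybe (fromMaybe)

-- A single-input simplification of the Op idea: a function that returns
-- its result together with a continuation that, given dz/dy (or Nothing,
-- meaning dz/dy = 1), returns dz/dx.
newtype Simple1 a b = Simple1 { runSimple1 :: a -> (b, Maybe b -> a) }

-- f(x) = x^2; the continuation scales d/dx x^2 = 2x by dz/dy when one
-- is given, and treats Nothing as dz/dy = 1.
squareS :: Num a => Simple1 a a
squareS = Simple1 $ \x -> (x * x, \md -> fromMaybe 1 md * 2 * x)

-- "Chaining" two ops: the combined continuation pushes the downstream
-- derivative through g first, then through f, mirroring
-- dz/dx = (dz/dy)(dy/dx).
chainS :: Simple1 b c -> Simple1 a b -> Simple1 a c
chainS (Simple1 g) (Simple1 f) = Simple1 $ \x ->
  let (y, dydx) = f x
      (z, dzdy) = g y
  in  (z, \md -> dydx (Just (dzdy md)))
```

For instance, runSimple1 (chainS squareS squareS) 3 computes \(x^4\) at 3, and applying the returned continuation to Nothing yields \(4 x^3 = 108\).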
type Op as a = forall m. Monad m => OpM m as a Source #
An Op as a describes a differentiable function from as to a.

For example, a value of type

Op '[Int, Bool] Double

is a function that takes an Int and a Bool and returns a Double. It can be differentiated to give a gradient of an Int and a Bool if given a total derivative for the Double. If we call Bool \(2\), then, mathematically, it is akin to a:
\[ f : \mathbb{Z} \times 2 \rightarrow \mathbb{R} \]
See runOp, gradOp, and gradOpWith for examples on how to run it, and Op for instructions on creating it.

This type is abstracted over using the pattern synonym with constructor Op, so you can create one from scratch with it. However, it's simplest to create one using the op1', op2', and op3' helper smart constructors. And, if your function is a numeric function, they can even be created automatically using op1, op2, op3, and opN, with a little help from Numeric.AD from the ad library.
Note that this type is a subset or subtype of OpM (and also of OpB). So, if a function ever expects an OpM m as a (or an OpB), you can always provide an Op as a instead.

Many functions in this library will expect an OpM m as a (or an OpB s as a), and in all of these cases, you can provide an Op as a.
pattern Op :: forall as a. (Tuple as -> (a, Maybe a -> Tuple as)) -> Op as a Source #
Construct an Op by giving a function creating the result, and also a continuation on how to create the gradient, given the total derivative of a.

See the module documentation for Numeric.Backprop.Op for more details on the function that this constructor and OpM expect.
newtype OpM m as a Source #

An OpM m as a represents a differentiable (monadic) function from as to a, in the context of a Monad m.

For example, an

OpM IO '[Int, Bool] Double

would be a function that takes an Int and a Bool and returns a Double (in IO). It can be differentiated to give a gradient of an Int and a Bool (also in IO) if given the total derivative for the Double.
Note that an OpM is a superset or supertype of Op, so any function that expects an OpM m as a can also accept an Op as a.

See runOpM, gradOpM, and gradOpWithM for examples on how to run it.
OpM (Tuple as -> m (a, Maybe a -> m (Tuple as)))

Construct an OpM. See the module documentation for Numeric.Backprop.Op for more details on the function that this constructor and Op expect.
(Monad m, Known Length as, Every Floating as, Every Fractional as, Every Num as, Floating a) => Floating (OpM m as a) Source #
(Monad m, Known Length as, Every Fractional as, Every Num as, Fractional a) => Fractional (OpM m as a) Source #
(Monad m, Known Length as, Every Num as, Num a) => Num (OpM m as a) Source #
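The monadic variant can be sketched with a self-contained, single-input simplification (the SimpleM type and loggedSquare are illustrative only, not the library's actual OpM type): both the result and the gradient continuation live in a monad m.

```haskell
-- A single-input, monadic simplification of the OpM idea: running the
-- function and running the gradient continuation are both effects in m.
newtype SimpleM m a b = SimpleM { runSimpleM :: a -> m (b, Maybe b -> m a) }

-- An IO op that logs its input, computes x^2, and returns a gradient
-- continuation that also runs in IO.
loggedSquare :: SimpleM IO Double Double
loggedSquare = SimpleM $ \x -> do
  putStrLn ("loggedSquare run on " ++ show x)
  pure (x * x, \md -> pure (maybe 1 id md * 2 * x))
```

Running it on 3 logs once, returns 9, and the continuation yields 6 for Nothing (since \(\frac{d}{dx} x^2 = 2x\)) or scales by a supplied total derivative.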
Tuple Types
See Numeric.Backprop for a mini-tutorial on Prod and Tuple.

data Prod k f a :: forall k. (k -> *) -> [k] -> * where #
Witness ØC ØC (Prod k f (Ø k))  
IxFunctor1 k [k] (Index k) (Prod k)  
IxFoldable1 k [k] (Index k) (Prod k)  
IxTraversable1 k [k] (Index k) (Prod k)  
Functor1 [k] k (Prod k)  
Foldable1 [k] k (Prod k)  
Traversable1 [k] k (Prod k)  
TestEquality k f => TestEquality [k] (Prod k f)  
BoolEquality k f => BoolEquality [k] (Prod k f)  
Eq1 k f => Eq1 [k] (Prod k f)  
Ord1 k f => Ord1 [k] (Prod k f)  
Show1 k f => Show1 [k] (Prod k f)  
Read1 k f => Read1 [k] (Prod k f)  
(Known [k] (Length k) as, Every k (Known k f) as) => Known [k] (Prod k f) as  
(Witness p q (f a1), Witness s t (Prod a f as)) => Witness (p, s) (q, t) (Prod a f ((:<) a a1 as))  
ListC ((<$>) Constraint * Eq ((<$>) * k f as)) => Eq (Prod k f as)  
(ListC ((<$>) Constraint * Eq ((<$>) * k f as)), ListC ((<$>) Constraint * Ord ((<$>) * k f as))) => Ord (Prod k f as)  
ListC ((<$>) Constraint * Show ((<$>) * k f as)) => Show (Prod k f as)  
type WitnessC ØC ØC (Prod k f (Ø k))  
type KnownC [k] (Prod k f) as  
type WitnessC (p, s) (q, t) (Prod a f ((:<) a a1 as))  
Running
Pure
runOp :: Op as a -> Tuple as -> a Source #
Run the function that an Op
encodes, to get the result.
>>>
runOp (op2 (*)) (3 ::< 5 ::< Ø)
15
gradOp :: Op as a -> Tuple as -> Tuple as Source #
Run the function that an Op
encodes, and get the gradient of the
output with respect to the inputs.
>>>
gradOp (op2 (*)) (3 ::< 5 ::< Ø)
5 ::< 3 ::< Ø  -- the gradient of x*y is (y, x)
gradOp' :: Op as a -> Tuple as -> (a, Tuple as) Source #
Run the function that an Op
encodes, to get the resulting output and
also its gradient with respect to the inputs.
>>>
gradOp' (op2 (*)) (3 ::< 5 ::< Ø)
(15, 5 ::< 3 ::< Ø)
gradOpWith

:: Op as a
-> Tuple as  Inputs to run it with
-> a  The total derivative of the result
-> Tuple as  The gradient
Run the function that an Op encodes, and get the gradient of a "final result" with respect to the inputs, given the total derivative of the final result with respect to the output.

See gradOp and the module documentation for Numeric.Backprop.Op for more information.
gradOpWith'

:: Op as a
-> Tuple as  Inputs to run it with
-> Maybe a  If Nothing, the total derivative of the result is taken to be 1; if Just d, d is used as the total derivative
-> Tuple as  The gradient
A combination of gradOp and gradOpWith. The third argument is (optionally) the total derivative of the result. Give Nothing and it is assumed that the result is the final result (and the total derivative is 1), and this behaves the same as gradOp. Give Just d and it uses d as the total derivative of the result, and this behaves like gradOpWith.

See gradOp and the module documentation for Numeric.Backprop.Op for more information.
runOp'

:: Op as a
-> Tuple as  Inputs
-> (a, Maybe a -> Tuple as)  Result, and continuation to get the gradient
A combination of runOp and gradOpWith'. Given an Op and inputs, returns the result of the Op and a continuation that gives its gradient.

The continuation takes the total derivative of the result as input. See documentation for gradOpWith' and module documentation for Numeric.Backprop.Op for more information.
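The point of this interface is that the function runs once, while the continuation can be applied to many different total derivatives without recomputation. A self-contained sketch of the same shape (the cube' function is hypothetical, not the library API):

```haskell
-- Result plus gradient continuation, in runOp' style: f(x) = x^3,
-- with d/dx x^3 = 3x^2, scaled by the total derivative when one is given.
cube' :: Double -> (Double, Maybe Double -> Double)
cube' x = (x * x * x, \md -> maybe 1 id md * 3 * x * x)
```

With let (y, grad) = cube' 2, the result 8 is computed once; grad Nothing gives 12 and grad (Just 0.5) gives 6, reusing the same closure each time.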
Monadic
gradOpWithM

:: Monad m
=> OpM m as a
-> Tuple as  Inputs to run it with
-> a  The total derivative of the result
-> m (Tuple as)  The gradient

The monadic version of gradOpWith, for OpMs.
gradOpWithM'

:: Monad m
=> OpM m as a
-> Tuple as  Inputs to run it with
-> Maybe a  If Nothing, the total derivative of the result is taken to be 1; if Just d, d is used as the total derivative
-> m (Tuple as)  The gradient

The monadic version of gradOpWith', for OpMs.
runOpM'

:: OpM m as a
-> Tuple as  Inputs
-> m (a, Maybe a -> m (Tuple as))  Result, and continuation to get the gradient

A combination of runOpM and gradOpWithM'. Given an OpM and inputs, returns the result of the OpM and a continuation that gives its gradient.

The continuation takes the total derivative of the result as input. See documentation for gradOpWithM' and module documentation for Numeric.Backprop.Op for more information.
Manipulation
composeOp1 :: (Monad m, Known Length as, Every Num as) => OpM m as b -> OpM m '[b] c -> OpM m as c Source #

(~.) :: (Monad m, Known Length as, Every Num as) => OpM m '[b] c -> OpM m as b -> OpM m as c infixr 9 Source #

composeOp1' :: Monad m => Prod Summer as -> OpM m as b -> OpM m '[b] c -> OpM m as c Source #

composeOp1, but taking explicit Summers, for the situation where the as are not instances of Num.
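The Num (and Summer) constraints on composition exist because, when several ops consume the same inputs, each input's gradient receives a contribution from every branch, and those contributions must be summed. A self-contained single-input sketch of that bookkeeping (hypothetical names, not the library API):

```haskell
-- Two "branches" consume the same input x: f1(x) = x^2 and f2(x) = 3x,
-- then g(y1, y2) = y1 * y2 combines them, so z = 3x^3 and dz/dx = 9x^2.
-- The gradient of x is the SUM of the contributions through each branch,
-- which is why the inputs need a Num (or explicit Summer) instance.
branchGrad :: Double -> (Double, Double)  -- (z, dz/dx)
branchGrad x =
  let (y1, dy1) = (x * x, 2 * x)  -- branch 1 and dy1/dx
      (y2, dy2) = (3 * x, 3)      -- branch 2 and dy2/dx
      z         = y1 * y2
      -- chain rule, contributions summed across branches:
      -- dz/dx = (dz/dy1)(dy1/dx) + (dz/dy2)(dy2/dx)
      dzdx      = y2 * dy1 + y1 * dy2
  in  (z, dzdx)
```

At \(x = 2\): \(z = 3 \cdot 2^3 = 24\) and \(dz/dx = 9 \cdot 2^2 = 36\), matching the summed branch contributions \(6 \cdot 4 + 4 \cdot 3\).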
Creation
op0 :: a -> Op '[] a Source #

Create an Op that takes no inputs and always returns the given value.
There is no gradient, of course (using gradOp
will give you an empty
tuple), because there is no input to have a gradient of.
>>>
gradOp' (op0 10) Ø
(10, Ø)
For a constant Op
that takes input and ignores it, see opConst
and
opConst'
.
Note that because this returns an Op
, it can be used with any function
that expects an OpM
or OpB
, as well.
opConst :: (Every Num as, Known Length as) => a -> Op as a Source #
An Op
that ignores all of its inputs and returns a given constant
value.
>>>
gradOp' (opConst 10) (1 ::< 2 ::< 3 ::< Ø)
(10, 0 ::< 0 ::< 0 ::< Ø)
Automatic creation using the ad library
op1 :: Num a => (forall s. AD s (Forward a) -> AD s (Forward a)) -> Op '[a] a Source #

op2 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a] a Source #

op3 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a, a] a Source #

opN :: (Num a, Known Nat n) => (forall s. Reifies s Tape => Vec n (Reverse s a) -> Reverse s a) -> Op (Replicate n a) a Source #

type family Replicate (n :: N) (a :: k) = (as :: [k]) | as -> n where ... Source #
Replicate n a is a list of a's, repeated n times.
>>>
:kind! Replicate N3 Int
'[Int, Int, Int]

>>>
:kind! Replicate N5 Double
'[Double, Double, Double, Double, Double]
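A minimal standalone version of such a type family can be sketched as follows (this is an illustration, not the library's definition, and it omits the injectivity annotation | as -> n that the real Replicate carries):

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators, PolyKinds #-}

-- Peano naturals at the type level, mirroring the N kind used above.
data N = Z | S N

-- n copies of a, as a type-level list.
type family Rep (n :: N) (a :: k) :: [k] where
  Rep 'Z     a = '[]
  Rep ('S n) a = a ': Rep n a

-- This definition type-checks only if Rep ('S ('S ('S 'Z))) Int
-- actually reduces to '[Int, Int, Int].
check3 :: Rep ('S ('S ('S 'Z))) Int ~ '[Int, Int, Int] => ()
check3 = ()
```

The equality constraint on check3 acts as a compile-time test: if the family didn't reduce as described, the module would not compile.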
Giving gradients directly
op1' :: (a -> (b, Maybe b -> a)) -> Op '[a] b Source #
Create an Op of a function taking one input, by giving its explicit derivative. The function should return a tuple containing the result of the function, and also a function taking the derivative of the result and returning the derivative of the input.
If we have
\[ \eqalign{ f &: \mathbb{R} \rightarrow \mathbb{R}\cr y &= f(x)\cr z &= g(y) } \]
Then the derivative \( \frac{dz}{dx} \) would be:
\[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} \]
If our Op
represents \(f\), then the second item in the resulting
tuple should be a function that takes \(\frac{dz}{dy}\) and returns
\(\frac{dz}{dx}\).
If the input is Nothing
, then \(\frac{dz}{dy}\) should be taken to be
\(1\).
As an example, here is an Op that squares its input:

square :: Num a => Op '[a] a
square = op1' $ \x -> (x*x, \case Nothing -> 2 * x
                                  Just d  -> 2 * d * x
                      )
Remember that, generally, end users shouldn't directly construct Op
s;
they should be provided by libraries or generated automatically.
For numeric functions, single-input Ops can be generated automatically using op1.
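As a sanity check, the explicit gradient from the square example above can be compared against a central finite-difference approximation (a standalone sketch that does not use the library):

```haskell
-- The gradient continuation from the square example, written standalone:
-- Nothing means the total derivative is 1, Just d scales by d.
squareGrad :: Double -> Maybe Double -> Double
squareGrad x Nothing  = 2 * x
squareGrad x (Just d) = 2 * d * x

-- Central finite difference (f(x+h) - f(x-h)) / 2h, for checking an
-- analytic derivative numerically.
finiteDiff :: (Double -> Double) -> Double -> Double
finiteDiff f x = (f (x + h) - f (x - h)) / (2 * h)
  where h = 1e-6
```

At \(x = 3\), squareGrad 3 Nothing is 6, and finiteDiff (\v -> v * v) 3 agrees to within numerical error.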
op2' :: (a -> b -> (c, Maybe c -> (a, b))) -> Op '[a, b] c Source #
Create an Op of a function taking two inputs, by giving its explicit gradient. The function should return a tuple containing the result of the function, and also a function taking the derivative of the result and returning the derivative of the inputs.
If we have
\[ \eqalign{ f &: \mathbb{R}^2 \rightarrow \mathbb{R}\cr z &= f(x, y)\cr k &= g(z) } \]
Then the gradient \( \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> \) would be:
\[ \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> = \left< \frac{dk}{dz} \frac{\partial z}{\partial x}, \frac{dk}{dz} \frac{\partial z}{\partial y} \right> \]
If our Op represents \(f\), then the second item in the resulting tuple should be a function that takes \(\frac{dk}{dz}\) and returns \( \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> \).
If the input is Nothing
, then \(\frac{dk}{dz}\) should be taken to be
\(1\).
As an example, here is an Op that multiplies its inputs:

mul :: Num a => Op '[a, a] a
mul = op2' $ \x y -> (x*y, \case Nothing -> (y  , x  )
                                 Just d  -> (d*y, x*d)
                     )
Remember that, generally, end users shouldn't directly construct Op
s;
they should be provided by libraries or generated automatically.
For numeric functions, two-input Ops can be generated automatically using op2.
From Isomorphisms
opTup :: (Every Num as, Known Length as) => Op as (Tuple as) Source #
An Op
that takes as
and returns exactly the input tuple.
>>>
gradOp' opTup (1 ::< 2 ::< 3 ::< Ø)
(1 ::< 2 ::< 3 ::< Ø, 1 ::< 1 ::< 1 ::< Ø)
opIso :: Num a => Iso' a b -> Op '[a] b Source #
An Op
that runs the input value through the isomorphism encoded in
the Iso
. Requires the input to be an instance of Num
.
Warning: This is unsafe! It assumes that the isomorphisms themselves
have derivative 1, so will break for things like
exponentiating
. Basically, don't use this for any
"numeric" isomorphisms.