| Copyright | (c) Justin Le 2017 |
|---|---|
| License | BSD3 |
| Maintainer | justin@jle.im |
| Stability | experimental |
| Portability | non-portable |
| Safe Haskell | None |
| Language | Haskell2010 |
Numeric.Backprop.Op
Contents
Description
Provides the Op (and OpM) type and combinators, which represent
differentiable functions/operations on values, and are used by the
library to perform back-propagation.
Note that Op is a subset or subtype of OpM, and so any function
that expects an OpM m as a (or an OpB s as a)
can be given an Op as a and it will work just fine.
- type Op as a = forall m. Monad m => OpM m as a
- pattern Op :: forall as a. (Tuple as -> (a, Maybe a -> Tuple as)) -> Op as a
- newtype OpM m as a = OpM (Tuple as -> m (a, Maybe a -> m (Tuple as)))
- data Prod k f a :: forall k. (k -> *) -> [k] -> * where
- type Tuple = Prod * I
- newtype I a :: * -> * = I { getI :: a }
- runOp :: Op as a -> Tuple as -> a
- gradOp :: Op as a -> Tuple as -> Tuple as
- gradOp' :: Op as a -> Tuple as -> (a, Tuple as)
- gradOpWith :: Op as a -> Tuple as -> a -> Tuple as
- gradOpWith' :: Op as a -> Tuple as -> Maybe a -> Tuple as
- runOp' :: Op as a -> Tuple as -> (a, Maybe a -> Tuple as)
- runOpM :: Functor m => OpM m as a -> Tuple as -> m a
- gradOpM :: Monad m => OpM m as a -> Tuple as -> m (Tuple as)
- gradOpM' :: Monad m => OpM m as a -> Tuple as -> m (a, Tuple as)
- gradOpWithM :: Monad m => OpM m as a -> Tuple as -> a -> m (Tuple as)
- gradOpWithM' :: Monad m => OpM m as a -> Tuple as -> Maybe a -> m (Tuple as)
- runOpM' :: OpM m as a -> Tuple as -> m (a, Maybe a -> m (Tuple as))
- composeOp :: (Monad m, Known Length as, Every Num as) => Prod (OpM m as) bs -> OpM m bs c -> OpM m as c
- composeOp1 :: (Monad m, Known Length as, Every Num as) => OpM m as b -> OpM m '[b] c -> OpM m as c
- (~.) :: (Monad m, Known Length as, Every Num as) => OpM m '[b] c -> OpM m as b -> OpM m as c
- composeOp' :: Monad m => Prod Summer as -> Prod (OpM m as) bs -> OpM m bs c -> OpM m as c
- composeOp1' :: Monad m => Prod Summer as -> OpM m as b -> OpM m '[b] c -> OpM m as c
- op0 :: a -> Op '[] a
- opConst :: (Every Num as, Known Length as) => a -> Op as a
- opConst' :: Prod Summer as -> a -> Op as a
- op1 :: Num a => (forall s. AD s (Forward a) -> AD s (Forward a)) -> Op '[a] a
- op2 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a] a
- op3 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a, a] a
- opN :: (Num a, Known Nat n) => (forall s. Reifies s Tape => Vec n (Reverse s a) -> Reverse s a) -> Op (Replicate n a) a
- type family Replicate (n :: N) (a :: k) = (as :: [k]) | as -> n where ...
- op1' :: (a -> (b, Maybe b -> a)) -> Op '[a] b
- op2' :: (a -> b -> (c, Maybe c -> (a, b))) -> Op '[a, b] c
- op3' :: (a -> b -> c -> (d, Maybe d -> (a, b, c))) -> Op '[a, b, c] d
- opCoerce :: (Coercible a b, Num a) => Op '[a] b
- opTup :: (Every Num as, Known Length as) => Op as (Tuple as)
- opIso :: Num a => Iso' a b -> Op '[a] b
- opCoerce' :: Coercible a b => Unity a -> Op '[a] b
- opTup' :: Prod Unity as -> Op as (Tuple as)
- opIso' :: Unity a -> Iso' a b -> Op '[a] b
- pattern (:>) :: forall k f a b. f a -> f b -> Prod k f ((:) k a ((:) k b ([] k)))
- only :: f a -> Prod k f ((:) k a ([] k))
- head' :: Prod k f ((:<) k a as) -> f a
- pattern (::<) :: forall a as. a -> Tuple as -> Tuple ((:<) * a as)
- only_ :: a -> Tuple ((:) * a ([] *))
Implementation
Ops contain information on a function as well as its gradient, and
provide that information in a way that allows them to be "chained".
For example, for a function
\[ f : \mathbb{R}^n \rightarrow \mathbb{R} \]
We might want to apply a function \(g\) to the result we get, to get our "final" result:
\[ \eqalign{ y &= f(\mathbf{x})\cr z &= g(y) } \]
Now, we might want the gradient \(\nabla z\) with respect to \(\mathbf{x}\), or \(\nabla_\mathbf{x} z\). Explicitly, this is:
\[ \nabla_\mathbf{x} z = \left< \frac{\partial z}{\partial x_1}, \frac{\partial z}{\partial x_2}, \ldots \right> \]
We can compute that by multiplying the total derivative of \(z\) with respect to \(y\) (that is, \(\frac{dz}{dy}\)) with the gradient of \(f\) itself:
\[ \eqalign{ \nabla_\mathbf{x} z &= \frac{dz}{dy} \left< \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots \right>\cr \nabla_\mathbf{x} z &= \frac{dz}{dy} \nabla_\mathbf{x} y } \]
So, to create an Op as a with the Op constructor (or an OpM m as a with the
OpM constructor), you give a function that returns a tuple
containing:
- An a: the result of the function.
- A Maybe a -> Tuple as: a function that, when given \(\frac{dz}{dy}\) (in a Just), returns the total gradient \(\nabla_\mathbf{x} z\). If the function is given Nothing, then \(\frac{dz}{dy}\) should be taken to be 1; in other words, you would simply return \(\nabla_\mathbf{x} y\), unchanged. That is, an input of Nothing indicates that the "final result" is simply \(f(\mathbf{x})\), and not some \(g(f(\mathbf{x}))\).
This is done so that Ops can easily be "chained" together, one after
the other. If you have an Op for \(f\) and an Op for \(g\), you can
compute the gradient of \(f\) knowing that the eventual result is
\(g \circ f\) (see the sketch below).
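To make the chaining concrete, here is a minimal sketch (chainedGrad is a hypothetical name, not part of the library) that composes \(f(x) = x^2\) with \(g(y) = 3y\) by hand, using op1 and runOp' from further down this page. It assumes Ø, ::<, head', and getI are in scope from this module, as the examples below suggest.

{-# LANGUAGE DataKinds #-}

import Numeric.Backprop.Op

-- z = g(f(x)) with f(x) = x^2 and g(y) = 3*y, so dz/dx = dz/dy * dy/dx = 3 * 2x.
chainedGrad :: Double -> Tuple '[Double]
chainedGrad x =
  let (y, gradF) = runOp' (op1 (^ (2 :: Int))) (x ::< Ø)  -- forward pass of f
      (_, gradG) = runOp' (op1 (* 3))          (y ::< Ø)  -- forward pass of g
      dzdy       = getI (head' (gradG Nothing))           -- dz/dy, with dz/dz taken as 1
  in  gradF (Just dzdy)                                   -- dz/dx = dz/dy * dy/dx

For x = 3 this should give 18 ::< Ø, matching \(3 \cdot 2x\).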
Note that end users should probably never be required to construct an
Op or OpM explicitly this way. Instead, libraries should provide
carefully pre-constructed ones, or provide ways to generate them
automatically (like op1, op2, and op3 here).
type Op as a = forall m. Monad m => OpM m as a Source #
An Op as a describes a differentiable function from as to a.
For example, a value of type
Op '[Int, Bool] Double
is a function from an Int and a Bool, returning a Double. It can
be differentiated to give a gradient of an Int and a Bool if given
a total derivative for the Double. If we treat Bool as the two-element set \(2\), then,
mathematically, it is akin to a function:
\[ f : \mathbb{Z} \times 2 \rightarrow \mathbb{R} \]
See runOp, gradOp, and gradOpWith for examples on how to run it,
and Op for instructions on creating it.
This type is abstracted over using the pattern synonym Op, so you can
create one from scratch with it. However, it's simplest to create one
using the op1', op2', and op3' helper smart constructors. And, if your
function is a numeric function, they can even be created automatically
using op1, op2, op3, and opN, with a little help from Numeric.AD from
the ad library.
Note that this type is a subset or subtype of OpM (and also of
OpB). So, if a function ever expects an OpM m as a (or an
OpB s as a), you can always provide an Op as a instead.
Many functions in this library will expect an OpM m as a (or
an OpB s as a), and in all of these cases, you can
provide an Op as a.
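A small sketch of what that means in practice (gradInIO is a hypothetical helper, not from the library): a function written against OpM IO also accepts a plain Op, because the Op is polymorphic in its monad.

{-# LANGUAGE DataKinds #-}

import Numeric.Backprop.Op

-- A helper fixed to IO; any OpM IO '[Double, Double] Double can be passed in.
gradInIO :: OpM IO '[Double, Double] Double -> IO (Tuple '[Double, Double])
gradInIO o = gradOpM o (3 ::< 5 ::< Ø)

-- A pure Op slots right in, since Op as a = forall m. Monad m => OpM m as a.
example :: IO (Tuple '[Double, Double])
example = gradInIO (op2 (*))   -- expected to produce 5 ::< 3 ::< Ø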
pattern Op :: forall as a. (Tuple as -> (a, Maybe a -> Tuple as)) -> Op as a Source #
Construct an Op by giving a function creating the result, and also
a continuation on how to create the gradient, given the total derivative
of a.
See the module documentation for Numeric.Backprop.Op for more details
on the function that this constructor and OpM expect.
newtype OpM m as a Source #
An OpM m as a represents a differentiable (monadic) function
from as to a, in the context of a Monad m.
For example, an
OpM IO '[Int, Bool] Double
would be a function that takes an Int and a Bool and returns
a Double (in IO). It can be differentiated to give a gradient of
an Int and a Bool (also in IO) if given the total derivative for
the Double.
Note that OpM is a superset or supertype of Op, so any function that
expects an OpM m as a can also accept an Op as a.
See runOpM, gradOpM, and gradOpWithM for examples on how to run
it.
Constructors
| OpM (Tuple as -> m (a, Maybe a -> m (Tuple as))) | Construct an OpM by giving a (monadic) function creating the result, and also a continuation on how to create the gradient, given the total derivative of a. See the module documentation for Numeric.Backprop.Op for more details on the function that this constructor and Op expect. |
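As a hedged sketch of what the constructor expects (logSquare is a hypothetical name; it assumes ::< and Ø are in scope from this module): an OpM in IO that squares its input and logs during both the forward and backward passes.

{-# LANGUAGE DataKinds #-}

import Data.Maybe          (fromMaybe)
import Numeric.Backprop.Op

logSquare :: OpM IO '[Double] Double
logSquare = OpM $ \(x ::< Ø) -> do
    putStrLn ("forward: x = " ++ show x)     -- compute the result
    pure ( x * x
         , \dzdy -> do                       -- continuation for the gradient
             putStrLn "backward"
             let d = fromMaybe 1 dzdy        -- Nothing means dz/dy = 1
             pure ((2 * x * d) ::< Ø)
         )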
Instances
| (Monad m, Known [*] (Length *) as, Every * Floating as, Every * Fractional as, Every * Num as, Floating a) => Floating (OpM m as a) Source # | |
| (Monad m, Known [*] (Length *) as, Every * Fractional as, Every * Num as, Fractional a) => Fractional (OpM m as a) Source # | |
| (Monad m, Known [*] (Length *) as, Every * Num as, Num a) => Num (OpM m as a) Source # | |
Tuple Types
See Numeric.Backprop for a mini-tutorial on Prod and
Tuple
data Prod k f a :: forall k. (k -> *) -> [k] -> * where #
Instances
| Witness ØC ØC (Prod k f (Ø k)) | |
| IxFunctor1 k [k] (Index k) (Prod k) | |
| IxFoldable1 k [k] (Index k) (Prod k) | |
| IxTraversable1 k [k] (Index k) (Prod k) | |
| Functor1 [k] k (Prod k) | |
| Foldable1 [k] k (Prod k) | |
| Traversable1 [k] k (Prod k) | |
| TestEquality k f => TestEquality [k] (Prod k f) | |
| BoolEquality k f => BoolEquality [k] (Prod k f) | |
| Eq1 k f => Eq1 [k] (Prod k f) | |
| Ord1 k f => Ord1 [k] (Prod k f) | |
| Show1 k f => Show1 [k] (Prod k f) | |
| Read1 k f => Read1 [k] (Prod k f) | |
| (Known [k] (Length k) as, Every k (Known k f) as) => Known [k] (Prod k f) as | |
| (Witness p q (f a1), Witness s t (Prod a f as)) => Witness (p, s) (q, t) (Prod a f ((:<) a a1 as)) | |
| ListC ((<$>) Constraint * Eq ((<$>) * k f as)) => Eq (Prod k f as) | |
| (ListC ((<$>) Constraint * Eq ((<$>) * k f as)), ListC ((<$>) Constraint * Ord ((<$>) * k f as))) => Ord (Prod k f as) | |
| ListC ((<$>) Constraint * Show ((<$>) * k f as)) => Show (Prod k f as) | |
| type WitnessC ØC ØC (Prod k f (Ø k)) | |
| type KnownC [k] (Prod k f) as | |
| type WitnessC (p, s) (q, t) (Prod a f ((:<) a a1 as)) | |
Running
Pure
runOp :: Op as a -> Tuple as -> a Source #
Run the function that an Op encodes, to get the result.
>>> runOp (op2 (*)) (3 ::< 5 ::< Ø)
15
gradOp :: Op as a -> Tuple as -> Tuple as Source #
Run the function that an Op encodes, and get the gradient of the
output with respect to the inputs.
>>> gradOp (op2 (*)) (3 ::< 5 ::< Ø)
5 ::< 3 ::< Ø  -- the gradient of x*y is (y, x)
gradOp' :: Op as a -> Tuple as -> (a, Tuple as) Source #
Run the function that an Op encodes, to get the resulting output and
also its gradient with respect to the inputs.
>>> gradOp' (op2 (*)) (3 ::< 5 ::< Ø)
(15, 5 ::< 3 ::< Ø)
gradOpWith Source #
Arguments
| :: Op as a | |
| -> Tuple as | Inputs to run it with |
| -> a | The total derivative of the result |
| -> Tuple as | The gradient |
Run the function that an Op encodes, and get the gradient of
a "final result" with respect to the inputs, given the total derivative
of the final result with respect to the output.
See gradOp and the module documentation for Numeric.Backprop.Op for
more information.
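For instance (an illustrative session, not taken from the library's own doctests): feeding in a total derivative of 10 for the result of x*y scales the gradient (y, x) by 10.

>>> gradOpWith (op2 (*)) (3 ::< 5 ::< Ø) 10
50 ::< 30 ::< Ø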
gradOpWith' Source #
Arguments
| :: Op as a | |
| -> Tuple as | Inputs to run it with |
| -> Maybe a | If Just d, then d is used as the total derivative of the result; if Nothing, the result is taken to be the final result (total derivative 1) |
| -> Tuple as | The gradient |
A combination of gradOp and gradOpWith. The third argument is
(optionally) the total derivative of the result. Give Nothing and it is
assumed that the result is the final result (and the total derivative is
1), and this behaves the same as gradOp. Give Just d and it uses
d as the total derivative of the result, and this behaves like
gradOpWith.
See gradOp and the module documentation for Numeric.Backprop.Op for
more information.
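For example (again an illustrative session, following the semantics just described):

>>> gradOpWith' (op2 (*)) (3 ::< 5 ::< Ø) Nothing
5 ::< 3 ::< Ø
>>> gradOpWith' (op2 (*)) (3 ::< 5 ::< Ø) (Just 10)
50 ::< 30 ::< Ø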
runOp' Source #
Arguments
| :: Op as a | |
| -> Tuple as | Inputs |
| -> (a, Maybe a -> Tuple as) | Result, and continuation to get the gradient |
A combination of runOp and gradOpWith'. Given an Op and inputs,
returns the result of the Op and a continuation that gives its
gradient.
The continuation takes the total derivative of the result as input. See
the documentation for gradOpWith' and the module documentation for
Numeric.Backprop.Op for more information.
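A short sketch of why the continuation is useful (splitRun is a hypothetical name): the forward pass runs once, and the gradient can then be requested for several different total derivatives.

{-# LANGUAGE DataKinds #-}

import Numeric.Backprop.Op

splitRun :: Tuple '[Double, Double] -> (Double, Tuple '[Double, Double], Tuple '[Double, Double])
splitRun xs =
  let (r, gr) = runOp' (op2 (*)) xs
  in  (r, gr Nothing, gr (Just 2))  -- result, unscaled gradient, gradient scaled by 2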
Monadic
gradOpWithM Source #
Arguments
| :: Monad m | |
| => OpM m as a | |
| -> Tuple as | Inputs to run it with |
| -> a | The total derivative of the result |
| -> m (Tuple as) | The gradient |
The monadic version of gradOpWith, for OpMs.
gradOpWithM' Source #
Arguments
| :: Monad m | |
| => OpM m as a | |
| -> Tuple as | Inputs to run it with |
| -> Maybe a | If Just d, then d is used as the total derivative of the result; if Nothing, the result is taken to be the final result (total derivative 1) |
| -> m (Tuple as) | The gradient |
The monadic version of gradOpWith', for OpMs.
runOpM' Source #
Arguments
| :: OpM m as a | |
| -> Tuple as | Inputs |
| -> m (a, Maybe a -> m (Tuple as)) | Result, and continuation to get the gradient |
A combination of runOpM and gradOpWithM'. Given an OpM and
inputs, returns the result of the OpM and a continuation that gives
its gradient.
The continuation takes the total derivative of the result as input. See
the documentation for gradOpWithM' and the module documentation for
Numeric.Backprop.Op for more information.
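A hedged sketch of running things end to end in IO (runAndGrad is a hypothetical name; a pure Op also works here, since it is polymorphic in the monad):

{-# LANGUAGE DataKinds #-}

import Numeric.Backprop.Op

runAndGrad :: OpM IO '[Double, Double] Double -> Tuple '[Double, Double] -> IO ()
runAndGrad o xs = do
    (r, gr) <- runOpM' o xs   -- forward pass
    g       <- gr Nothing     -- backward pass, with the total derivative taken as 1
    print r
    print g                   -- assumes Show instances for the tuple, as the doctests above suggest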
Manipulation
composeOp1 :: (Monad m, Known Length as, Every Num as) => OpM m as b -> OpM m '[b] c -> OpM m as c Source #
(~.) :: (Monad m, Known Length as, Every Num as) => OpM m '[b] c -> OpM m as b -> OpM m as c infixr 9 Source #
composeOp1' :: Monad m => Prod Summer as -> OpM m as b -> OpM m '[b] c -> OpM m as c Source #
composeOp1, but taking explicit Summers, for the situation where
the as are not instances of Num.
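A quick sketch of composition in action (doubleThenSquare is a hypothetical name); the gradient of the composite is derived for you via the chain rule.

{-# LANGUAGE DataKinds, RankNTypes #-}

import Numeric.Backprop.Op

-- f(x) = (2x)^2, so f'(x) = 8x.
doubleThenSquare :: Op '[Double] Double
doubleThenSquare = op1 (^ (2 :: Int)) ~. op1 (* 2)

-- gradOp' doubleThenSquare (3 ::< Ø) should give (36.0, 24.0 ::< Ø).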
Creation
op0 :: a -> Op '[] a Source #
Create an Op that takes no inputs and always returns the given
value.
There is no gradient, of course (using gradOp will give you an empty
tuple), because there is no input to have a gradient of.
>>> gradOp' (op0 10) Ø
(10, Ø)
For a constant Op that takes input and ignores it, see opConst and
opConst'.
Note that because this returns an Op, it can be used with any function
that expects an OpM or OpB, as well.
opConst :: (Every Num as, Known Length as) => a -> Op as a Source #
An Op that ignores all of its inputs and returns a given constant
value.
>>> gradOp' (opConst 10) (1 ::< 2 ::< 3 ::< Ø)
(10, 0 ::< 0 ::< 0 ::< Ø)
Automatic creation using the ad library
op1 :: Num a => (forall s. AD s (Forward a) -> AD s (Forward a)) -> Op '[a] a Source #
op2 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a] a Source #
op3 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a, a] a Source #
opN :: (Num a, Known Nat n) => (forall s. Reifies s Tape => Vec n (Reverse s a) -> Reverse s a) -> Op (Replicate n a) a Source #
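A couple of hedged sketches of the ad-based helpers (logistic and scaleAdd are hypothetical names): the function is written against ad's numeric types, and the gradient comes out automatically.

{-# LANGUAGE DataKinds, RankNTypes #-}

import Numeric.Backprop.Op

-- One input, via op1 (forward-mode ad under the hood).
logistic :: Op '[Double] Double
logistic = op1 $ \x -> 1 / (1 + exp (-x))

-- Three inputs, via op3 (reverse-mode ad under the hood).
scaleAdd :: Op '[Double, Double, Double] Double
scaleAdd = op3 $ \a x b -> a * x + b

-- gradOp' scaleAdd (2 ::< 3 ::< 1 ::< Ø) should give (7.0, 3.0 ::< 2.0 ::< 1.0 ::< Ø).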
type family Replicate (n :: N) (a :: k) = (as :: [k]) | as -> n where ... Source #
Replicate n a is a list of a repeated n times.
>>> :kind! Replicate N3 Int
'[Int, Int, Int]
>>> :kind! Replicate N5 Double
'[Double, Double, Double, Double, Double]
Giving gradients directly
op1' :: (a -> (b, Maybe b -> a)) -> Op '[a] b Source #
Create an Op of a function taking one input, by giving its explicit
derivative. The function should return a tuple containing the result of
the function, and also a function taking the derivative of the result
and returning the derivative of the input.
If we have
\[ \eqalign{ f &: \mathbb{R} \rightarrow \mathbb{R}\cr y &= f(x)\cr z &= g(y) } \]
Then the derivative \( \frac{dz}{dx} \) would be:
\[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} \]
If our Op represents \(f\), then the second item in the resulting
tuple should be a function that takes \(\frac{dz}{dy}\) and returns
\(\frac{dz}{dx}\).
If the input is Nothing, then \(\frac{dz}{dy}\) should be taken to be
\(1\).
As an example, here is an Op that squares its input:
square :: Num a => Op '[a] a
square = op1' $ \x ->
  ( x*x
  , \case Nothing -> 2 * x
          Just d  -> 2 * d * x
  )
Remember that, generally, end users shouldn't directly construct Ops;
they should be provided by libraries or generated automatically.
For numeric functions, single-input Ops can be generated automatically
using op1.
op2' :: (a -> b -> (c, Maybe c -> (a, b))) -> Op '[a, b] c Source #
Create an Op of a function taking two inputs, by giving its explicit
gradient. The function should return a tuple containing the result of
the function, and also a function taking the derivative of the result
and returning the derivatives of the inputs.
If we have
\[ \eqalign{ f &: \mathbb{R}^2 \rightarrow \mathbb{R}\cr z &= f(x, y)\cr k &= g(z) } \]
Then the gradient \( \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> \) would be:
\[ \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> = \left< \frac{dk}{dz} \frac{\partial z}{\partial x}, \frac{dk}{dz} \frac{\partial z}{\partial y} \right> \]
If our Op represents \(f\), then the second item in the resulting
tuple should be a function that takes \(\frac{dk}{dz}\) and returns
\( \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> \).
If the input is Nothing, then \(\frac{dk}{dz}\) should be taken to be
\(1\).
As an example, here is an Op that multiplies its inputs:
mul :: Num a => Op '[a, a] a
mul = op2' $ \x y ->
  ( x*y
  , \case Nothing -> (y  , x  )
          Just d  -> (d*y, x*d)
  )
Remember that, generally, end users shouldn't directly construct Ops;
they should be provided by libraries or generated automatically.
For numeric functions, two-input Ops can be generated automatically
using op2.
From Isomorphisms
opTup :: (Every Num as, Known Length as) => Op as (Tuple as) Source #
An Op that takes as and returns exactly the input tuple.
>>> gradOp' opTup (1 ::< 2 ::< 3 ::< Ø)
(1 ::< 2 ::< 3 ::< Ø, 1 ::< 1 ::< 1 ::< Ø)
opIso :: Num a => Iso' a b -> Op '[a] b Source #
An Op that runs the input value through the isomorphism encoded in
the Iso. Requires the input to be an instance of Num.
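As a sketch of one use (assuming the Iso' here is the usual lens-style isomorphism, so one can be built with iso from the lens library; Meters is a hypothetical newtype): wrapping a value in a plain newtype really does have derivative 1, so this is a safe application.

{-# LANGUAGE DataKinds, RankNTypes #-}

import Control.Lens        (iso)
import Numeric.Backprop.Op

newtype Meters = Meters Double

toMeters :: Op '[Double] Meters
toMeters = opIso (iso Meters (\(Meters m) -> m))

For a Coercible newtype like this, opCoerce would work just as well.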
Warning: This is unsafe! It assumes that the isomorphism itself
has derivative 1, so it will break for things like
exponentiating. Basically, don't use this for any
"numeric" isomorphisms.