Copyright | (c) Justin Le 2017 |
---|---|
License | BSD3 |
Maintainer | justin@jle.im |
Stability | experimental |
Portability | non-portable |
Safe Haskell | None |
Language | Haskell2010 |
Provides the Op
(and OpM
) type and combinators, which represent
differentiable functions/operations on values, and are used by the
library to perform back-propagation.
Note that Op
is a subset or subtype of OpM
, and so, any function
that expects an
(or an OpM
m as a
)
can be given an OpB
s as a
and it'll work just fine.Op
as a
- type Op as a = forall m. Monad m => OpM m as a
- pattern Op :: forall as a. (Tuple as -> (a, Maybe a -> Tuple as)) -> Op as a
- newtype OpM m as a = OpM (Tuple as -> m (a, Maybe a -> m (Tuple as)))
- data Prod k f a :: forall k. (k -> *) -> [k] -> * where
- type Tuple = Prod * I
- newtype I a :: * -> * = I {
- getI :: a
- runOp :: Op as a -> Tuple as -> a
- gradOp :: Op as a -> Tuple as -> Tuple as
- gradOp' :: Op as a -> Tuple as -> (a, Tuple as)
- gradOpWith :: Op as a -> Tuple as -> a -> Tuple as
- gradOpWith' :: Op as a -> Tuple as -> Maybe a -> Tuple as
- runOp' :: Op as a -> Tuple as -> (a, Maybe a -> Tuple as)
- runOpM :: Functor m => OpM m as a -> Tuple as -> m a
- gradOpM :: Monad m => OpM m as a -> Tuple as -> m (Tuple as)
- gradOpM' :: Monad m => OpM m as a -> Tuple as -> m (a, Tuple as)
- gradOpWithM :: Monad m => OpM m as a -> Tuple as -> a -> m (Tuple as)
- gradOpWithM' :: Monad m => OpM m as a -> Tuple as -> Maybe a -> m (Tuple as)
- runOpM' :: OpM m as a -> Tuple as -> m (a, Maybe a -> m (Tuple as))
- composeOp :: (Monad m, Every Num as, Known Length as) => Prod (OpM m as) bs -> OpM m bs c -> OpM m as c
- composeOp1 :: (Monad m, Every Num as, Known Length as) => OpM m as b -> OpM m '[b] c -> OpM m as c
- (~.) :: (Monad m, Known Length as, Every Num as) => OpM m '[b] c -> OpM m as b -> OpM m as c
- composeOp' :: forall m as bs c. (Monad m, Every Num as) => Length as -> Prod (OpM m as) bs -> OpM m bs c -> OpM m as c
- composeOp1' :: (Monad m, Every Num as) => Length as -> OpM m as b -> OpM m '[b] c -> OpM m as c
- op0 :: a -> Op '[] a
- opConst :: forall as a. (Every Num as, Known Length as) => a -> Op as a
- opConst' :: forall as a. Every Num as => Length as -> a -> Op as a
- op1 :: Num a => (forall s. AD s (Forward a) -> AD s (Forward a)) -> Op '[a] a
- op2 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a] a
- op3 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a, a] a
- opN :: (Num a, Known Nat n) => (forall s. Reifies s Tape => Vec n (Reverse s a) -> Reverse s a) -> Op (Replicate n a) a
- type family Replicate (n :: N) (a :: k) = (as :: [k]) | as -> n where ...
- op1' :: (a -> (b, Maybe b -> a)) -> Op '[a] b
- op2' :: (a -> b -> (c, Maybe c -> (a, b))) -> Op '[a, b] c
- op3' :: (a -> b -> c -> (d, Maybe d -> (a, b, c))) -> Op '[a, b, c] d
- opCoerce :: Num a => Coercible a b => Op '[a] b
- opTup :: (Every Num as, Known Length as) => Length as -> Op as (Tuple as)
- opIso :: Num a => Iso' a b -> Op '[a] b
- opTup' :: Every Num as => Length as -> Op as (Tuple as)
- pattern (:>) :: forall k f a b. f a -> f b -> Prod k f ((:) k a ((:) k b ([] k)))
- only :: f a -> Prod k f ((:) k a ([] k))
- head' :: Prod k f ((:<) k a as) -> f a
- pattern (::<) :: forall a as. a -> Tuple as -> Tuple ((:<) * a as)
- only_ :: a -> Tuple ((:) * a ([] *))
- (+.) :: Num a => Op '[a, a] a
- (-.) :: Num a => Op '[a, a] a
- (*.) :: Num a => Op '[a, a] a
- negateOp :: Num a => Op '[a] a
- absOp :: Num a => Op '[a] a
- signumOp :: Num a => Op '[a] a
- (/.) :: Fractional a => Op '[a, a] a
- recipOp :: Fractional a => Op '[a] a
- expOp :: Floating a => Op '[a] a
- logOp :: Floating a => Op '[a] a
- sqrtOp :: Floating a => Op '[a] a
- (**.) :: Floating a => Op '[a, a] a
- logBaseOp :: Floating a => Op '[a, a] a
- sinOp :: Floating a => Op '[a] a
- cosOp :: Floating a => Op '[a] a
- tanOp :: Floating a => Op '[a] a
- asinOp :: Floating a => Op '[a] a
- acosOp :: Floating a => Op '[a] a
- atanOp :: Floating a => Op '[a] a
- sinhOp :: Floating a => Op '[a] a
- coshOp :: Floating a => Op '[a] a
- tanhOp :: Floating a => Op '[a] a
- asinhOp :: Floating a => Op '[a] a
- acoshOp :: Floating a => Op '[a] a
- atanhOp :: Floating a => Op '[a] a
Implementation
Op
s contain information on a function as well as its gradient, but
provides that information in a way that allows them to be "chained".
For example, for a function
\[ f : \mathbb{R}^n \rightarrow \mathbb{R} \]
We might want to apply a function \(g\) to the result we get, to get our "final" result:
\[ \eqalign{ y &= f(\mathbf{x})\cr z &= g(y) } \]
Now, we might want the gradient \(\nabla z\) with respect to \(\mathbf{x}\), or \(\nabla_\mathbf{x} z\). Explicitly, this is:
\[ \nabla_\mathbf{x} z = \left< \frac{\partial z}{\partial x_1}, \frac{\partial z}{\partial x_2}, \ldots \right> \]
We can compute that by multiplying the total derivative of \(z\) with respect to \(y\) (that is, \(\frac{dz}{dy}\)) with the gradient of \(f\)) itself:
\[ \eqalign{ \nabla_\mathbf{x} z &= \frac{dz}{dy} \left< \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots \right>\cr \nabla_\mathbf{x} z &= \frac{dz}{dy} \nabla_\mathbf{x} y } \]
So, to create an
with the Op
as aOp
constructor (or an OpM
with the
OpM
constructor), you give a function that returns a tuple,
containing:
- An
a
: The result of the function - An
Maybe a -> Tuple as
: A function that, when given \(\frac{dz}{dy}\) (in aJust
), returns the total gradient \(\nabla_z \mathbf{x}\). If the function is given is givenNothing
, then \(\frac{dz}{dy}\) should be taken to be 1. In other words, you would simply need to return \(\nabla_y \mathbf{x}\), unchanged. That is, an input ofNothing
indicates that the "final result" is just simply \(f(\mathbf{x})\), and not some \(g(f(\mathbf{x}))\).
This is done so that Op
s can easily be "chained" together, one after
the other. If you have an Op
for \(f\) and an Op
for \(g\), you can
compute the gradient of \(f\) knowing that the result target is
\(g \circ f\).
Note that end users should probably never be required to construct an
Op
or OpM
explicitly this way. Instead, libraries should provide
carefuly pre-constructed ones, or provide ways to generate them
automatically (like op1
, op2
, and op3
here).
For examples of Op
s implemented from scratch, see the implementations
of +.
, -.
, recipOp
, sinOp
, etc.
type Op as a = forall m. Monad m => OpM m as a Source #
An
describes a differentiable function from Op
as aas
to a
.
For example, a value of type
Op
'[Int, Bool] Double
is a function from an Int
and a Bool
, returning a Double
. It can
be differentiated to give a gradient of an Int
and a Bool
if given
a total derivative for the Double
. If we call Bool
\(2\), then,
mathematically, it is akin to a:
\[ f : \mathbb{Z} \times 2 \rightarrow \mathbb{R} \]
See runOp
, gradOp
, and gradOpWith
for examples on how to run it,
and Op
for instructions on creating it.
This type is abstracted over using the pattern synonym with constructor
Op
, so you can create one from scratch with it. However, it's
simplest to create it using op2'
, op1'
, op2'
, and op3'
helper
smart constructors And, if your function is a numeric function, they
can even be created automatically using op1
, op2
, op3
, and opN
with a little help from Numeric.AD from the ad library.
Note that this type is a subset or subtype of OpM
(and also of
OpB
). So, if a function ever expects an
(or a OpM
m as
aOpB
), you can always provide an
instead.Op
as a
Many functions in this library will expect an
(or
an OpM
m as a
), and in all of these cases, you can
provide an OpB
s as a
.Op
as a
pattern Op :: forall as a. (Tuple as -> (a, Maybe a -> Tuple as)) -> Op as a Source #
Construct an Op
by giving a function creating the result, and also
a continuation on how to create the gradient, given the total derivative
of a
.
See the module documentation for Numeric.Backprop.Op for more details
on the function that this constructor and OpM
expect.
An
represents a differentiable (monadic) function
from OpM
m as aas
to a
, in the context of a Monad
m
.
For example, an
OpM
IO '[Int, Bool] Double
would be a function that takes an Int
and a Bool
and returns
a Double
(in IO
). It can be differentiated to give a gradient of
an Int
and a Bool
(also in IO
) if given the total derivative for
the Double
.
Note that an OpM
is a superclass of Op
, so any function that
expects an
can also accept an OpM
m as a
.Op
as a
See runOpM
, gradOpM
, and gradOpWithM
for examples on how to run
it.
OpM (Tuple as -> m (a, Maybe a -> m (Tuple as))) | Construct an See the module documentation for Numeric.Backprop.Op for more
details on the function that this constructor and |
(Monad m, Known [*] (Length *) as, Every * Floating as, Every * Fractional as, Every * Num as, Floating a) => Floating (OpM m as a) Source # | |
(Monad m, Known [*] (Length *) as, Every * Fractional as, Every * Num as, Fractional a) => Fractional (OpM m as a) Source # | |
(Monad m, Known [*] (Length *) as, Every * Num as, Num a) => Num (OpM m as a) Source # | |
Tuple Types
See Numeric.Backprop for a mini-tutorial on Prod
and
Tuple
data Prod k f a :: forall k. (k -> *) -> [k] -> * where #
Witness ØC ØC (Prod k f (Ø k)) | |
IxFunctor1 k [k] (Index k) (Prod k) | |
IxFoldable1 k [k] (Index k) (Prod k) | |
IxTraversable1 k [k] (Index k) (Prod k) | |
Functor1 [k] k (Prod k) | |
Foldable1 [k] k (Prod k) | |
Traversable1 [k] k (Prod k) | |
TestEquality k f => TestEquality [k] (Prod k f) | |
BoolEquality k f => BoolEquality [k] (Prod k f) | |
Eq1 k f => Eq1 [k] (Prod k f) | |
Ord1 k f => Ord1 [k] (Prod k f) | |
Show1 k f => Show1 [k] (Prod k f) | |
Read1 k f => Read1 [k] (Prod k f) | |
(Known [k] (Length k) as, Every k (Known k f) as) => Known [k] (Prod k f) as | |
(Witness p q (f a1), Witness s t (Prod a f as)) => Witness (p, s) (q, t) (Prod a f ((:<) a a1 as)) | |
ListC ((<$>) Constraint * Eq ((<$>) * k f as)) => Eq (Prod k f as) | |
(ListC ((<$>) Constraint * Eq ((<$>) * k f as)), ListC ((<$>) Constraint * Ord ((<$>) * k f as))) => Ord (Prod k f as) | |
ListC ((<$>) Constraint * Show ((<$>) * k f as)) => Show (Prod k f as) | |
type WitnessC ØC ØC (Prod k f (Ø k)) | |
type KnownC [k] (Prod k f) as | |
type WitnessC (p, s) (q, t) (Prod a f ((:<) a a1 as)) | |
Running
Pure
runOp :: Op as a -> Tuple as -> a Source #
Run the function that an Op
encodes, to get the result.
>>>
runOp (op2 (*)) (3 ::< 5 ::< Ø)
15
gradOp :: Op as a -> Tuple as -> Tuple as Source #
Run the function that an Op
encodes, and get the gradient of the
output with respect to the inputs.
>>>
gradOp (op2 (*)) (3 ::< 5 ::< Ø)
5 ::< 3 ::< Ø -- the gradient of x*y is (y, x)
gradOp' :: Op as a -> Tuple as -> (a, Tuple as) Source #
Run the function that an Op
encodes, to get the resulting output and
also its gradient with respect to the inputs.
>>>
gradOpM' (op2 (*)) (3 ::< 5 ::< Ø) :: IO (Int, Tuple '[Int, Int])
(15, 5 ::< 3 ::< Ø)
:: Op as a |
|
-> Tuple as | Inputs to run it with |
-> a | The total derivative of the result |
-> Tuple as | The gradient |
Run the function that an Op
encodes, and get the gradient of
a "final result" with respect to the inputs, given the total derivative
of the output with the final result.
See gradOp
and the module documentaiton for Numeric.Backprop.Op for
more information.
:: Op as a |
|
-> Tuple as | Inputs to run it with |
-> Maybe a | If |
-> Tuple as | The gradient |
A combination of gradOp
and gradOpWith
. The third argument is
(optionally) the total derivative the result. Give Nothing
and it is
assumed that the result is the final result (and the total derivative is
1), and this behaves the same as gradOp
. Give
and it uses
the Just
dd
as the total derivative of the result, and this behaves like
gradOpWith
.
See gradOp
and the module documentaiton for Numeric.Backprop.Op for
more information.
:: Op as a |
|
-> Tuple as | Inputs |
-> (a, Maybe a -> Tuple as) | Result, and continuation to get the gradient |
A combination of runOp
and gradOpWith'
. Given an Op
and inputs,
returns the result of the Op
and a continuation that gives its
gradient.
The continuation takes the total derivative of the result as input. See
documenation for gradOpWith'
and module documentation for
Numeric.Backprop.Op for more information.
Monadic
:: Monad m | |
=> OpM m as a |
|
-> Tuple as | Inputs to run it with |
-> a | The total derivative of the result |
-> m (Tuple as) | the gradient |
The monadic version of gradOpWith
, for OpM
s.
:: Monad m | |
=> OpM m as a |
|
-> Tuple as | Inputs to run it with |
-> Maybe a | If |
-> m (Tuple as) | The gradient |
The monadic version of gradOpWith'
, for OpM
s.
:: OpM m as a |
|
-> Tuple as | Inputs |
-> m (a, Maybe a -> m (Tuple as)) | Result, and continuation to get the gradient |
A combination of runOpM
and gradOpWithM'
. Given an OpM
and
inputs, returns the result of the OpM
and a continuation that gives
its gradient.
The continuation takes the total derivative of the result as input. See
documenation for gradOpWithM'
and module documentation for
Numeric.Backprop.Op for more information.
Manipulation
composeOp1 :: (Monad m, Every Num as, Known Length as) => OpM m as b -> OpM m '[b] c -> OpM m as c Source #
(~.) :: (Monad m, Known Length as, Every Num as) => OpM m '[b] c -> OpM m as b -> OpM m as c infixr 9 Source #
A version of composeOp
taking explicit Length
, indicating the
number of inputs expected and their types.
Requiring an explicit Length
is mostly useful for rare "extremely
polymorphic" situations, where GHC can't infer the type and length of
the the expected input tuple. If you ever actually explicitly write
down as
as a list of types, you should be able to just use
composeOp
.
composeOp1' :: (Monad m, Every Num as) => Length as -> OpM m as b -> OpM m '[b] c -> OpM m as c Source #
A version of composeOp1
taking explicit Length
, indicating the
number of inputs expected and their types.
Requiring an explicit Length
is mostly useful for rare "extremely
polymorphic" situations, where GHC can't infer the type and length of
the the expected input tuple. If you ever actually explicitly write
down as
as a list of types, you should be able to just use
composeOp1
.
Creation
Create an Op
that takes no inputs and always returns the given
value.
There is no gradient, of course (using gradOp
will give you an empty
tuple), because there is no input to have a gradient of.
>>>
gradOp' (op0 10) Ø
(10, Ø)
For a constant Op
that takes input and ignores it, see opConst
and
opConst'
.
Note that because this returns an Op
, it can be used with any function
that expects an OpM
or OpB
, as well.
opConst :: forall as a. (Every Num as, Known Length as) => a -> Op as a Source #
An Op
that ignores all of its inputs and returns a given constant
value.
>>>
gradOp' (opConst 10) (1 ::< 2 ::< 3 ::< Ø)
(10, 0 ::< 0 ::< 0 ::< Ø)
opConst' :: forall as a. Every Num as => Length as -> a -> Op as a Source #
A version of opConst
taking explicit Length
, indicating the
number of inputs and their types.
Requiring an explicit Length
is mostly useful for rare "extremely
polymorphic" situations, where GHC can't infer the type and length of
the the expected input tuple. If you ever actually explicitly write
down as
as a list of types, you should be able to just use
opConst
.
Automatic creation using the ad library
op2 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a] a Source #
op3 :: Num a => (forall s. Reifies s Tape => Reverse s a -> Reverse s a -> Reverse s a -> Reverse s a) -> Op '[a, a, a] a Source #
opN :: (Num a, Known Nat n) => (forall s. Reifies s Tape => Vec n (Reverse s a) -> Reverse s a) -> Op (Replicate n a) a Source #
type family Replicate (n :: N) (a :: k) = (as :: [k]) | as -> n where ... Source #
is a list of Replicate
n aa
s repeated n
times.
>>>
:kind! Replicate N3 Int
'[Int, Int, Int]>>>
:kind! Replicate N5 Double
'[Double, Double, Double, Double, Double]
Giving gradients directly
op1' :: (a -> (b, Maybe b -> a)) -> Op '[a] b Source #
Create an Op
of a function taking one input, by giving its explicit
derivative. The function should return a tuple containing the result of
the function, and also a function taking the derivative of the result
and return the derivative of the input.
If we have
\[ \eqalign{ f &: \mathbb{R} \rightarrow \mathbb{R}\cr y &= f(x)\cr z &= g(y) } \]
Then the derivative \( \frac{dz}{dx} \), it would be:
\[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} \]
If our Op
represents \(f\), then the second item in the resulting
tuple should be a function that takes \(\frac{dz}{dy}\) and returns
\(\frac{dz}{dx}\).
If the input is Nothing
, then \(\frac{dz}{dy}\) should be taken to be
\(1\).
As an example, here is an Op
that squares its input:
square :: Num a =>Op
'[a] a square =op1'
$ \x -> (x*x, \case Nothing -> 2 * x Just d -> 2 * d * x )
Remember that, generally, end users shouldn't directly construct Op
s;
they should be provided by libraries or generated automatically.
For numeric functions, single-input Op
s can be generated automatically
using op1
.
op2' :: (a -> b -> (c, Maybe c -> (a, b))) -> Op '[a, b] c Source #
Create an Op
of a function taking two inputs, by giving its explicit
gradient. The function should return a tuple containing the result of
the function, and also a function taking the derivative of the result
and return the derivative of the input.
If we have
\[ \eqalign{ f &: \mathbb{R}^2 \rightarrow \mathbb{R}\cr z &= f(x, y)\cr k &= g(z) } \]
Then the gradient \( \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> \) would be:
\[ \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> = \left< \frac{dk}{dz} \frac{\partial z}{dx}, \frac{dk}{dz} \frac{\partial z}{dy} \right> \]
If our Op
represents \(f\), then the second item in the resulting
tuple should be a function that takes \(\frac{dk}{dz}\) and returns
\( \left< \frac{\partial k}{dx}, \frac{\partial k}{dx} \right> \).
If the input is Nothing
, then \(\frac{dk}{dz}\) should be taken to be
\(1\).
As an example, here is an Op
that multiplies its inputs:
mul :: Num a =>Op
'[a, a] a mul =op2'
$ \x y -> (x*y, \case Nothing -> (y , x ) Just d -> (d*y, x*d) )
Remember that, generally, end users shouldn't directly construct Op
s;
they should be provided by libraries or generated automatically.
For numeric functions, two-input Op
s can be generated automatically
using op2
.
From Isomorphisms
opTup :: (Every Num as, Known Length as) => Length as -> Op as (Tuple as) Source #
An Op
that takes as
and returns exactly the input tuple.
>>>
gradOp' opTup (1 ::< 2 ::< 3 ::< Ø)
(1 ::< 2 ::< 3 ::< Ø, 1 ::< 1 ::< 1 ::< Ø)
opIso :: Num a => Iso' a b -> Op '[a] b Source #
An Op
that runs the input value through the isomorphism encoded in
the Iso
. Requires the input to be an instance of Num
.
Warning: This is unsafe! It assumes that the isomorphisms themselves
have derivative 1, so will break for things like
exponentiating
. Basically, don't use this for any
"numeric" isomorphisms.
opTup' :: Every Num as => Length as -> Op as (Tuple as) Source #
A version of opTup
taking explicit Length
, indicating the
number of inputs expected and their types.
Requiring an explicit Length
is mostly useful for rare "extremely
polymorphic" situations, where GHC can't infer the type and length of
the the expected input tuple. If you ever actually explicitly write
down as
as a list of types, you should be able to just use
opTup
.
Utility
pattern (:>) :: forall k f a b. f a -> f b -> Prod k f ((:) k a ((:) k b ([] k))) infix 6 #
Construct a two element Prod. Since the precedence of (:>) is higher than (:<), we can conveniently write lists like:
>>>
a :< b :> c
Which is identical to:
>>>
a :< b :< c :< Ø