Copyright	(c) Justin Le 2018
License	BSD3
Maintainer	justin@jle.im
Stability	experimental
Portability	non-portable
Safe Haskell	None
Language	Haskell2010

Numeric.Backprop.Op

Contents

Implementation
- Tuple Types
Running
- Pure
Creation
- Giving gradients directly
- From Isomorphisms
Manipulation
Utility
- Numeric Ops

Description

Provides the Op type and combinators, which represent differentiable functions/operations on values, and are used internally by the library to perform back-propagation.

Users of the library can ignore this module for the most part. Library authors defining backpropagatable primitives for their functions are recommended to simply use op0, op1, op2, op3, which are re-exported in Numeric.Backprop. However, authors who want more options in defining their primitive functions might find some of these functions useful.

Note that if your entire function is a single non-branching composition of functions, Op and its utility functions alone are sufficient to differentiate/backprop. However, this happens rarely in practice.

Implementation

Ops contain information on a function as well as its gradient, but provide that information in a way that allows them to be "chained".

For example, for a function

\[ f : \mathbb{R}^n \rightarrow \mathbb{R} \]

We might want to apply a function $g$ to the result we get, to get our "final" result:

\[ \eqalign{ y &= f(\mathbf{x})\cr z &= g(y) } \]

Now, we might want the gradient $\nabla z$ with respect to $\mathbf{x}$, or $\nabla_\mathbf{x} z$. Explicitly, this is:

\[ \nabla_\mathbf{x} z = \left< \frac{\partial z}{\partial x_1}, \frac{\partial z}{\partial x_2}, \ldots \right> \]

We can compute that by multiplying the total derivative of $z$ with respect to $y$ (that is, $\frac{dz}{dy}$) with the gradient of $f$ itself:

\[ \eqalign{ \nabla_\mathbf{x} z &= \frac{dz}{dy} \left< \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots \right>\cr \nabla_\mathbf{x} z &= \frac{dz}{dy} \nabla_\mathbf{x} y } \]
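
As a concrete check of the formula above, take the choices $f(x_1, x_2) = x_1 x_2$ and $g(y) = y^2$. Both functions are picked here purely for illustration and do not come from the library:

\[ \eqalign{ y &= x_1 x_2\cr z &= y^2\cr \frac{dz}{dy} &= 2 y\cr \nabla_\mathbf{x} y &= \left< x_2, x_1 \right>\cr \nabla_\mathbf{x} z &= 2 y \left< x_2, x_1 \right> = \left< 2 x_1 x_2^2, 2 x_1^2 x_2 \right> } \]

Differentiating $z = x_1^2 x_2^2$ directly gives the same two components.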

So, to create an Op as a with the Op constructor, you give a function that returns a tuple, containing:

An a: The result of the function
An a -> Tuple as: A function that, when given $\frac{dz}{dy}$, returns the total gradient $\nabla_\mathbf{x} z$.

This is done so that Ops can easily be "chained" together, one after the other. If you have an Op for $f$ and an Op for $g$, you can compute the gradient of $f$ knowing that the result target is $g \circ f$.
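
The chaining idea can be sketched without the library at all. The block below is a simplified, single-input model: Op1, chain, squareOp1, sinOp1, and dSinSquare are hypothetical names invented for this sketch, and the real Op works on a Tuple of inputs rather than a single value.

```haskell
-- A simplified, single-input stand-in for Op (hypothetical; not the library type).
-- Running it yields the result plus a continuation mapping dz/dy to dz/dx.
newtype Op1 a b = Op1 { runOp1 :: a -> (b, b -> a) }

-- Chain f then g: run forward through both, then pull the derivative
-- back through g's continuation first and f's continuation second.
chain :: Op1 b c -> Op1 a b -> Op1 a c
chain (Op1 g) (Op1 f) = Op1 $ \x ->
  let (y, backF) = f x
      (z, backG) = g y
  in  (z, backF . backG)

-- y = x * x, with dy/dx = 2x
squareOp1 :: Num a => Op1 a a
squareOp1 = Op1 $ \x -> (x * x, \d -> 2 * x * d)

-- z = sin y, with dz/dy = cos y
sinOp1 :: Floating a => Op1 a a
sinOp1 = Op1 $ \y -> (sin y, \d -> cos y * d)

-- Derivative of sin (x * x) at x, seeded with dz/dz = 1
dSinSquare :: Double -> Double
dSinSquare x = snd (runOp1 (chain sinOp1 squareOp1) x) 1
```

Because each continuation only needs the derivative handed to it from downstream, neither piece has to know what it is composed with.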

Note that end users should probably never be required to construct an Op explicitly this way. Instead, libraries should provide carefully pre-constructed ones, or provide ways to generate them automatically (like op1, op2, and op3 here).

For examples of Ops implemented from scratch, see the implementations of +., -., recipOp, sinOp, etc.

See Numeric.Backprop.Op for a mini-tutorial on using Prod and Tuple.

newtype Op as a Source #

An Op as a describes a differentiable function from as to a.

For example, a value of type

Op '[Int, Bool] Double

is a function from an Int and a Bool, returning a Double. It can be differentiated to give a gradient of an Int and a Bool if given a total derivative for the Double. If we call Bool $2$, then, mathematically, it is akin to a:

\[ f : \mathbb{Z} \times 2 \rightarrow \mathbb{R} \]

See runOp, gradOp, and gradOpWith for examples on how to run it, and Op for instructions on creating it.

It is simpler to not use this type constructor directly, and instead use the op0, op1, op2, and op3 helper smart constructors.

See Numeric.Backprop.Op for a mini-tutorial on using Prod and Tuple.

Constructors

Op

Construct an Op by giving a function creating the result, and also a continuation on how to create the gradient, given the total derivative of a.

See the module documentation for Numeric.Backprop.Op for more details on the function that this constructor and Op expect.

Fields

runOpWith :: Tuple as -> (a, a -> Tuple as)
Run the function that the Op encodes, returning a continuation to compute the gradient, given the total derivative of a. See documentation for Numeric.Backprop.Op for more information.

Instances

(Known [] (Length ) as, Every * Floating as, Every * Fractional as, Every * Num as, Floating a) => Floating (Op as a) Source #
Methods pi :: Op as a # exp :: Op as a -> Op as a # log :: Op as a -> Op as a # sqrt :: Op as a -> Op as a # (**) :: Op as a -> Op as a -> Op as a # logBase :: Op as a -> Op as a -> Op as a # sin :: Op as a -> Op as a # cos :: Op as a -> Op as a # tan :: Op as a -> Op as a # asin :: Op as a -> Op as a # acos :: Op as a -> Op as a # atan :: Op as a -> Op as a # sinh :: Op as a -> Op as a # cosh :: Op as a -> Op as a # tanh :: Op as a -> Op as a # asinh :: Op as a -> Op as a # acosh :: Op as a -> Op as a # atanh :: Op as a -> Op as a # log1p :: Op as a -> Op as a # expm1 :: Op as a -> Op as a # log1pexp :: Op as a -> Op as a # log1mexp :: Op as a -> Op as a #
(Known [] (Length ) as, Every * Fractional as, Every * Num as, Fractional a) => Fractional (Op as a) Source #
Methods (/) :: Op as a -> Op as a -> Op as a # recip :: Op as a -> Op as a # fromRational :: Rational -> Op as a #
(Known [] (Length ) as, Every * Num as, Num a) => Num (Op as a) Source #
Methods (+) :: Op as a -> Op as a -> Op as a # (-) :: Op as a -> Op as a -> Op as a # (*) :: Op as a -> Op as a -> Op as a # negate :: Op as a -> Op as a # abs :: Op as a -> Op as a # signum :: Op as a -> Op as a # fromInteger :: Integer -> Op as a #

Tuple Types

Prod, from the type-combinators library (http://hackage.haskell.org/package/type-combinators), in Data.Type.Product, is a heterogeneous list/tuple type, which allows you to tuple together multiple values of different types and operate on them generically.

A Prod f '[a, b, c] contains an f a, an f b, and an f c, and is constructed by consing them together with :< (using 'Ø' as nil):

I "hello" :< I True :< I 7.8 :< Ø    :: Prod I '[String, Bool, Double]
C "hello" :< C "world" :< C "ok" :< Ø  :: Prod (C String) '[a, b, c]
Proxy :< Proxy :< Proxy :< Ø           :: Prod Proxy '[a, b, c]

(I is the identity functor, and C is the constant functor)

So, in general:

x :: f a
y :: f b
z :: f c
x :< y :< z :< Ø :: Prod f '[a, b, c]

If you're having problems typing 'Ø', you can use only:

only z           :: Prod f '[c]
x :< y :< only z :: Prod f '[a, b, c]

Tuple is provided as a convenient type synonym for Prod I, and has a convenient pattern synonym ::< (and only_), which can also be used for pattern matching:

x :: a
y :: b
z :: c

only_ z             :: Tuple '[c]
x ::< y ::< z ::< Ø :: Tuple '[a, b, c]
x ::< y ::< only_ z :: Tuple '[a, b, c]
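
For readers unfamiliar with type-indexed lists, the idea behind Tuple can be sketched in a few lines of plain GHC Haskell. HList, (:&), HNil, example, describe, and hLength below are hypothetical names made up for this sketch; the real Prod is additionally parameterised over a functor f.

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators  #-}

import Data.Kind (Type)

-- Hypothetical miniature of Tuple: the type-level list records the
-- type of every element, so each position can hold a different type.
data HList (as :: [Type]) where
  HNil :: HList '[]
  (:&) :: a -> HList as -> HList (a ': as)

infixr 5 :&

example :: HList '[String, Bool, Double]
example = "hello" :& True :& 7.8 :& HNil

-- Pattern matching recovers each element at its own type.
describe :: HList '[String, Bool, Double] -> String
describe (s :& b :& d :& HNil) = s ++ " " ++ show b ++ " " ++ show d

-- Generic operations recurse over the structure, ignoring element types.
hLength :: HList as -> Int
hLength HNil      = 0
hLength (_ :& xs) = 1 + hLength xs
```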

data Prod k (f :: k -> *) (a :: [k]) :: forall k. (k -> *) -> [k] -> * where #

Constructors

Ø :: Prod k f ([] k)
(:<) :: Prod k f ((:) k a1 as) infixr 5

Instances

Witness ØC ØC (Prod k f (Ø k))
Associated Types type WitnessC ØC ØC (Prod k f (Ø k)) :: Constraint # Methods (\\) :: ØC => (ØC -> r) -> Prod k f (Ø k) -> r #
Functor1 k [k] (Prod k)
Methods map1 :: (forall (a :: Prod k). f a -> g a) -> t f b -> t g b #
Foldable1 k [k] (Prod k)
Methods foldMap1 :: Monoid m => (forall (a :: Prod k). f a -> m) -> t f b -> m #
Traversable1 k [k] (Prod k)
Methods traverse1 :: Applicative h => (forall (a :: Prod k). f a -> h (g a)) -> t f b -> h (t g b) #
IxFunctor1 k [k] (Index k) (Prod k)
Methods imap1 :: (forall (a :: Index k). i b a -> f a -> g a) -> t f b -> t g b #
IxFoldable1 k [k] (Index k) (Prod k)
Methods ifoldMap1 :: Monoid m => (forall (a :: Index k). i b a -> f a -> m) -> t f b -> m #
IxTraversable1 k [k] (Index k) (Prod k)
Methods itraverse1 :: Applicative h => (forall (a :: Index k). i b a -> f a -> h (g a)) -> t f b -> h (t g b) #
TestEquality k f => TestEquality [k] (Prod k f)
Methods testEquality :: f a -> f b -> Maybe ((Prod k f :~: a) b) #
BoolEquality k f => BoolEquality [k] (Prod k f)
Methods boolEquality :: f a -> f b -> Boolean ((Prod k f == a) b) #
Eq1 k f => Eq1 [k] (Prod k f)
Methods eq1 :: f a -> f a -> Bool # neq1 :: f a -> f a -> Bool #
Ord1 k f => Ord1 [k] (Prod k f)
Methods compare1 :: f a -> f a -> Ordering # (<#) :: f a -> f a -> Bool # (>#) :: f a -> f a -> Bool # (<=#) :: f a -> f a -> Bool # (>=#) :: f a -> f a -> Bool #
Show1 k f => Show1 [k] (Prod k f)
Methods showsPrec1 :: Int -> f a -> ShowS # show1 :: f a -> String #
Read1 k f => Read1 [k] (Prod k f)
Methods readsPrec1 :: Int -> ReadS (Some (Prod k f) f) #
(Known [k] (Length k) as, Every k (Known k f) as) => Known [k] (Prod k f) as
Associated Types type KnownC (Prod k f) (as :: Prod k f -> *) (a :: Prod k f) :: Constraint # Methods known :: as a #
(Witness p q (f a2), Witness s t (Prod a1 f as)) => Witness (p, s) (q, t) (Prod a1 f ((:<) a1 a2 as))
Associated Types type WitnessC (p, s) (q, t) (Prod a1 f ((a1 :< a2) as)) :: Constraint # Methods (\\) :: (p, s) => ((q, t) -> r) -> Prod a1 f ((a1 :< a2) as) -> r #
ListC ((<$>) * Constraint Eq ((<$>) k * f as)) => Eq (Prod k f as)
Methods (==) :: Prod k f as -> Prod k f as -> Bool # (/=) :: Prod k f as -> Prod k f as -> Bool #
(ListC ((<$>) * Constraint Eq ((<$>) k * f as)), ListC ((<$>) * Constraint Ord ((<$>) k * f as))) => Ord (Prod k f as)
Methods compare :: Prod k f as -> Prod k f as -> Ordering # (<) :: Prod k f as -> Prod k f as -> Bool # (<=) :: Prod k f as -> Prod k f as -> Bool # (>) :: Prod k f as -> Prod k f as -> Bool # (>=) :: Prod k f as -> Prod k f as -> Bool # max :: Prod k f as -> Prod k f as -> Prod k f as # min :: Prod k f as -> Prod k f as -> Prod k f as #
ListC ((<$>) * Constraint Show ((<$>) k * f as)) => Show (Prod k f as)
Methods showsPrec :: Int -> Prod k f as -> ShowS # show :: Prod k f as -> String # showList :: [Prod k f as] -> ShowS #
type WitnessC ØC ØC (Prod k f (Ø k))
type WitnessC ØC ØC (Prod k f (Ø k)) = ØC
type KnownC [k] (Prod k f) as
type KnownC [k] (Prod k f) as = (Known [k] (Length k) as, Every k (Known k f) as)
type WitnessC (p, s) (q, t) (Prod a1 f ((:<) a1 a2 as))
type WitnessC (p, s) (q, t) (Prod a1 f ((:<) a1 a2 as)) = (Witness p q (f a2), Witness s t (Prod a1 f as))

type Tuple = Prod * I #

A Prod of simple Haskell types.

newtype I a :: * -> * #

Constructors

I
Fields getI :: a

Instances

Monad I
Methods (>>=) :: I a -> (a -> I b) -> I b # (>>) :: I a -> I b -> I b # return :: a -> I a # fail :: String -> I a #
Functor I
Methods fmap :: (a -> b) -> I a -> I b # (<$) :: a -> I b -> I a #
Applicative I
Methods pure :: a -> I a # (<*>) :: I (a -> b) -> I a -> I b # liftA2 :: (a -> b -> c) -> I a -> I b -> I c # (*>) :: I a -> I b -> I b # (<*) :: I a -> I b -> I a #
Foldable I
Methods fold :: Monoid m => I m -> m # foldMap :: Monoid m => (a -> m) -> I a -> m # foldr :: (a -> b -> b) -> b -> I a -> b # foldr' :: (a -> b -> b) -> b -> I a -> b # foldl :: (b -> a -> b) -> b -> I a -> b # foldl' :: (b -> a -> b) -> b -> I a -> b # foldr1 :: (a -> a -> a) -> I a -> a # foldl1 :: (a -> a -> a) -> I a -> a # toList :: I a -> [a] # null :: I a -> Bool # length :: I a -> Int # elem :: Eq a => a -> I a -> Bool # maximum :: Ord a => I a -> a # minimum :: Ord a => I a -> a # sum :: Num a => I a -> a # product :: Num a => I a -> a #
Traversable I
Methods traverse :: Applicative f => (a -> f b) -> I a -> f (I b) # sequenceA :: Applicative f => I (f a) -> f (I a) # mapM :: Monad m => (a -> m b) -> I a -> m (I b) # sequence :: Monad m => I (m a) -> m (I a) #
Witness p q a => Witness p q (I a)
Associated Types type WitnessC p q (I a) :: Constraint # Methods (\\) :: p => (q -> r) -> I a -> r #
Eq a => Eq (I a)
Methods (==) :: I a -> I a -> Bool # (/=) :: I a -> I a -> Bool #
Num a => Num (I a)
Methods (+) :: I a -> I a -> I a # (-) :: I a -> I a -> I a # (*) :: I a -> I a -> I a # negate :: I a -> I a # abs :: I a -> I a # signum :: I a -> I a # fromInteger :: Integer -> I a #
Ord a => Ord (I a)
Methods compare :: I a -> I a -> Ordering # (<) :: I a -> I a -> Bool # (<=) :: I a -> I a -> Bool # (>) :: I a -> I a -> Bool # (>=) :: I a -> I a -> Bool # max :: I a -> I a -> I a # min :: I a -> I a -> I a #
Show a => Show (I a)
Methods showsPrec :: Int -> I a -> ShowS # show :: I a -> String # showList :: [I a] -> ShowS #
type WitnessC p q (I a)
type WitnessC p q (I a) = Witness p q a

Running

Pure

runOp :: Num a => Op as a -> Tuple as -> (a, Tuple as) Source #

Run the function that an Op encodes, to get the resulting output and also its gradient with respect to the inputs.

>>> runOp (op2 (*)) (3 ::< 5 ::< Ø)
(15, 5 ::< 3 ::< Ø)

evalOp :: Op as a -> Tuple as -> a Source #

Run the function that an Op encodes, to get the result.

>>> evalOp (op2 (*)) (3 ::< 5 ::< Ø)
15

gradOp :: Num a => Op as a -> Tuple as -> Tuple as Source #

Run the function that an Op encodes, and get the gradient of the output with respect to the inputs.

>>> gradOp (op2 (*)) (3 ::< 5 ::< Ø)
5 ::< 3 ::< Ø
-- the gradient of x*y is (y, x)

gradOp o xs = gradOpWith o xs 1

gradOpWith Source #

Arguments

:: Op as a	`Op` to run
-> Tuple as	Inputs to run it with
-> a	The total derivative of the result.
-> Tuple as	The gradient

Get the gradient function that an Op encodes, with a third argument expecting the total derivative of the result.

See the module documentation for Numeric.Backprop.Op for more information.
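
How the four running functions relate to one another can be sketched with a library-free model. Op2 and the functions below are hypothetical two-input analogues written for this sketch. They use an ordinary pair for the gradient instead of a Tuple, and are not the library's definitions.

```haskell
-- Hypothetical two-input stand-in for Op '[a, b] c
type Op2 a b c = a -> b -> (c, c -> (a, b))

-- Multiplication: the result, plus a continuation scaling (y, x) by d
mulOp2 :: Num a => Op2 a a a
mulOp2 x y = (x * y, \d -> (d * y, x * d))

-- Analogue of evalOp: keep only the result
evalOp2 :: Op2 a b c -> a -> b -> c
evalOp2 o x y = fst (o x y)

-- Analogue of gradOpWith: feed an explicit total derivative to the continuation
gradOp2With :: Op2 a b c -> a -> b -> c -> (a, b)
gradOp2With o x y = snd (o x y)

-- Analogue of gradOp: seed the continuation with 1
gradOp2 :: Num c => Op2 a b c -> a -> b -> (a, b)
gradOp2 o x y = gradOp2With o x y 1

-- Analogue of runOp: result and gradient together
runOp2 :: Num c => Op2 a b c -> a -> b -> (c, (a, b))
runOp2 o x y = let (r, back) = o x y in (r, back 1)
```

In this model, the Num constraint on the gradient-producing functions exists only to supply the seed value 1.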

Creation

op0 :: a -> Op '[] a Source #

Create an Op that takes no inputs and always returns the given value.

There is no gradient, of course (using gradOp will give you an empty tuple), because there is no input to have a gradient of.

>>> runOp (op0 10) Ø
(10, Ø)

For a constant Op that takes input and ignores it, see opConst and opConst'.

opConst :: (Every Num as, Known Length as) => a -> Op as a Source #

An Op that ignores all of its inputs and returns a given constant value.

>>> runOp (opConst 10) (1 ::< 2 ::< 3 ::< Ø)
(10, 0 ::< 0 ::< 0 ::< Ø)

idOp :: Op '[a] a Source #

An Op that just returns whatever it receives. The identity function.

idOp = opIso id id

opConst' :: Every Num as => Length as -> a -> Op as a Source #

A version of opConst taking explicit Length, indicating the number of inputs and their types.

Requiring an explicit Length is mostly useful for rare "extremely polymorphic" situations, where GHC can't infer the type and length of the expected input tuple. If you ever actually explicitly write down as as a list of types, you should be able to just use opConst.

Giving gradients directly

op1 :: (a -> (b, b -> a)) -> Op '[a] b Source #

Create an Op of a function taking one input, by giving its explicit derivative. The function should return a tuple containing the result of the function, and also a function that takes the derivative of the result and returns the derivative of the input.

If we have

\[ \eqalign{ f &: \mathbb{R} \rightarrow \mathbb{R}\cr y &= f(x)\cr z &= g(y) } \]

Then the derivative $ \frac{dz}{dx} $ would be:

\[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} \]

If our Op represents $f$, then the second item in the resulting tuple should be a function that takes $\frac{dz}{dy}$ and returns $\frac{dz}{dx}$.

As an example, here is an Op that squares its input:

square :: Num a => Op '[a] a
square = op1 $ \x -> (x*x, \d -> 2 * d * x)

Remember that, generally, end users shouldn't directly construct Ops; they should be provided by libraries or generated automatically.
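
Hand-written derivatives are easy to get wrong, so one possible sanity check is to compare them against a finite-difference estimate. The sketch below works on the plain function that would be passed to op1 rather than on an Op. The names squareFn, numericDeriv, and derivAgrees, along with the step size, are assumptions of this sketch and not part of the library.

```haskell
-- Same shape as the argument to op1: result plus derivative continuation
squareFn :: Num a => a -> (a, a -> a)
squareFn x = (x * x, \d -> 2 * d * x)

-- Central finite-difference approximation of dy/dx (the step size is an
-- arbitrary choice for this sketch)
numericDeriv :: (Double -> Double) -> Double -> Double
numericDeriv f x = (f (x + h) - f (x - h)) / (2 * h)
  where h = 1e-5

-- True when the hand-written derivative (seeded with d = 1) agrees with
-- the numerical estimate to within the given tolerance
derivAgrees :: Double -> (Double -> (Double, Double -> Double)) -> Double -> Bool
derivAgrees tol f x = abs (analytic - numeric) < tol
  where
    analytic = snd (f x) 1
    numeric  = numericDeriv (fst . f) x
```

For instance, derivAgrees 1e-6 squareFn 3.0 should hold, while a continuation that forgets the factor of 2 would fail the check.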

op2 :: (a -> b -> (c, c -> (a, b))) -> Op '[a, b] c Source #

Create an Op of a function taking two inputs, by giving its explicit gradient. The function should return a tuple containing the result of the function, and also a function that takes the derivative of the result and returns the derivatives of the two inputs.

If we have

\[ \eqalign{ f &: \mathbb{R}^2 \rightarrow \mathbb{R}\cr z &= f(x, y)\cr k &= g(z) } \]

Then the gradient $ \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> $ would be:

\[ \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> = \left< \frac{dk}{dz} \frac{\partial z}{\partial x}, \frac{dk}{dz} \frac{\partial z}{\partial y} \right> \]

If our Op represents $f$, then the second item in the resulting tuple should be a function that takes $\frac{dk}{dz}$ and returns $ \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> $.

As an example, here is an Op that multiplies its inputs:

mul :: Num a => Op '[a, a] a
mul = op2 $ \x y -> (x*y, \d -> (d*y, x*d))

Remember that, generally, end users shouldn't directly construct Ops; they should be provided by libraries or generated automatically.

op3 :: (a -> b -> c -> (d, d -> (a, b, c))) -> Op '[a, b, c] d Source #

Create an Op of a function taking three inputs, by giving its explicit gradient. See documentation for op2 for more details.
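
Following the pattern given for op2, the three-input case can be written out as below. The symbols $w$, $k$, and $g$ are naming choices made here for illustration.

\[ \eqalign{ f &: \mathbb{R}^3 \rightarrow \mathbb{R}\cr w &= f(x, y, z)\cr k &= g(w) } \]

\[ \left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y}, \frac{\partial k}{\partial z} \right> = \left< \frac{dk}{dw} \frac{\partial w}{\partial x}, \frac{dk}{dw} \frac{\partial w}{\partial y}, \frac{dk}{dw} \frac{\partial w}{\partial z} \right> \]

So the continuation passed to op3 would take $\frac{dk}{dw}$ and return all three partial derivatives as a triple.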

From Isomorphisms

opCoerce :: Coercible a b => Op '[a] b Source #

An Op that coerces an item into another item whose type has the same runtime representation.

>>> runOp opCoerce (only_ (Identity 5)) :: (Int, Tuple '[Identity Int])
(5, Identity 1 ::< Ø)

opCoerce = opIso coerce coerce

opTup :: Op as (Tuple as) Source #

An Op that takes as and returns exactly the input tuple.

>>> runOp opTup (1 ::< 2 ::< 3 ::< Ø)
(1 ::< 2 ::< 3 ::< Ø, 1 ::< 1 ::< 1 ::< Ø)

opIso :: (a -> b) -> (b -> a) -> Op '[a] b Source #

An Op that runs the input value through an isomorphism.

Warning: This is unsafe! It assumes that the isomorphisms themselves have derivative 1, so will break for things like exponentiating. Basically, don't use this for any "numeric" isomorphisms.
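
The failure mode can be illustrated with a small library-free model. The sketch assumes that an isomorphism-based op simply applies the inverse map to the incoming derivative on the way back. That behaviour, and the names isoLike, Meters, wrapMeters, expViaIso, and expProper, are assumptions of this sketch rather than statements about how opIso is implemented.

```haskell
-- Hypothetical model of an isomorphism-based op: the backward pass just
-- applies the inverse map to the incoming derivative.
isoLike :: (a -> b) -> (b -> a) -> a -> (b, b -> a)
isoLike to from x = (to x, from)

-- Safe use: a newtype wrapper, whose "derivative" really is 1
newtype Meters = Meters Double deriving (Show, Eq)

wrapMeters :: Double -> (Meters, Meters -> Double)
wrapMeters = isoLike Meters (\(Meters m) -> m)

-- Unsafe use: exp and log are inverses, but d/dx exp x is exp x, not 1
expViaIso :: Double -> (Double, Double -> Double)
expViaIso = isoLike exp log

-- Correct version with the derivative written out
expProper :: Double -> (Double, Double -> Double)
expProper x = (exp x, \d -> d * exp x)
```

In this model, seeding expViaIso's continuation with 1 yields log 1, which is 0, whereas the true derivative at x is exp x. Purely structural isomorphisms such as wrapMeters do not have this problem.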

opLens :: Num a => Lens' a b -> Op '[a] b Source #

An Op that extracts a value from an input value using a Lens'.

Warning: This is unsafe! It assumes that it extracts a specific value unchanged, with derivative 1, so will break for things that numerically manipulate things before returning them.

Manipulation

composeOp Source #

Arguments

:: (Every Num as, Known Length as)
=> Prod (Op as) bs	`Prod` of `Op`s taking `as` and returning different `b` in `bs`
-> Op bs c	`Op` taking each of the `bs` from the input `Prod`.
-> Op as c	Composed `Op`

Compose Ops together, like sequence for functions, or liftAN.

That is, given an Op as b1, an Op as b2, and an Op as b3, it can compose them with an Op '[b1,b2,b3] c to create an Op as c.
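
A reduced version of this fan-out-and-combine shape, with one shared input and two intermediate results, might look like the sketch below. Diff1, Diff2, compose2, and squareTimesSin are hypothetical names for this illustration. The sketch adds the gradient contributions flowing back to the shared input, which is presumably why composeOp asks for Every Num as.

```haskell
-- Hypothetical single-input op: a result and a continuation from the
-- output derivative to the input derivative
type Diff1 a b = a -> (b, b -> a)

-- Hypothetical two-input op consuming both intermediate results
type Diff2 a b c = a -> b -> (c, c -> (a, b))

-- Fan the shared input x out to f1 and f2, feed both results to g, and
-- on the way back add the two contributions to the derivative of x.
compose2 :: Num a => Diff1 a b1 -> Diff1 a b2 -> Diff2 b1 b2 c -> Diff1 a c
compose2 f1 f2 g x =
  let (y1, back1) = f1 x
      (y2, back2) = f2 x
      (z, backG)  = g y1 y2
  in  (z, \d -> let (d1, d2) = backG d in back1 d1 + back2 d2)

-- z = (x * x) * sin x, built from three primitive pieces
squareTimesSin :: Diff1 Double Double
squareTimesSin = compose2
  (\x -> (x * x, \d -> 2 * x * d))
  (\x -> (sin x, \d -> cos x * d))
  (\a b -> (a * b, \d -> (d * b, a * d)))
```

Seeding the continuation with 1 at a point x gives 2 x sin x + x^2 cos x, matching the product rule.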

composeOp1 :: (Every Num as, Known Length as) => Op as b -> Op '[b] c -> Op as c Source #

Convenient wrapper over composeOp for the case where the second function only takes one input, so the two Ops can be directly piped together, like for (.).

(~.) :: (Known Length as, Every Num as) => Op '[b] c -> Op as b -> Op as c infixr 9 Source #

Convenient infix synonym for (flipped) composeOp1. Meant to be used just like (.):

f :: Op '[b]   c
g :: Op '[a,a] b

f ~. g :: Op '[a, a] c

composeOp' Source #

Arguments

:: Every Num as
=> Length as
-> Prod (Op as) bs	`Prod` of `Op`s taking `as` and returning different `b` in `bs`
-> Op bs c	`Op` taking each of the `bs` from the input `Prod`.
-> Op as c	Composed `Op`

A version of composeOp taking explicit Length, indicating the number of inputs expected and their types.

Requiring an explicit Length is mostly useful for rare "extremely polymorphic" situations, where GHC can't infer the type and length of the expected input tuple. If you ever actually explicitly write down as as a list of types, you should be able to just use composeOp.

composeOp1' :: Every Num as => Length as -> Op as b -> Op '[b] c -> Op as c Source #

A version of composeOp1 taking explicit Length, indicating the number of inputs expected and their types.

Requiring an explicit Length is mostly useful for rare "extremely polymorphic" situations, where GHC can't infer the type and length of the expected input tuple. If you ever actually explicitly write down as as a list of types, you should be able to just use composeOp1.