backprop-0.1.2.0: Heterogeneous automatic differentiation (backpropagation)

Numeric.Backprop.Op

Description

Provides the Op type and combinators, which represent differentiable functions/operations on values, and are used internally by the library to perform back-propagation.

Users of the library can ignore this module for the most part. Library authors defining backpropagatable primitives for their functions are recommended to simply use op0, op1, op2, op3, which are re-exported in Numeric.Backprop. However, authors who want more options in defining their primitive functions might find some of these functions useful.

Note that if your entire function is a single non-branching composition of functions, Op and its utility functions alone are sufficient to differentiate/backprop. However, this happens rarely in practice.

# Implementation

Ops contain information on a function as well as its gradient, but provide that information in a way that allows them to be "chained".

For example, for a function

$f : \mathbb{R}^n \rightarrow \mathbb{R}$

We might want to apply a function $$g$$ to the result we get, to get our "final" result:

\eqalign{ y &= f(\mathbf{x})\cr z &= g(y) }

Now, we might want the gradient $$\nabla z$$ with respect to $$\mathbf{x}$$, or $$\nabla_\mathbf{x} z$$. Explicitly, this is:

$\nabla_\mathbf{x} z = \left< \frac{\partial z}{\partial x_1}, \frac{\partial z}{\partial x_2}, \ldots \right>$

We can compute that by multiplying the total derivative of $$z$$ with respect to $$y$$ (that is, $$\frac{dz}{dy}$$) by the gradient of $$f$$ itself:

\eqalign{ \nabla_\mathbf{x} z &= \frac{dz}{dy} \left< \frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots \right>\cr \nabla_\mathbf{x} z &= \frac{dz}{dy} \nabla_\mathbf{x} y }
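As a small worked instance of the identity above, take the illustrative choices $$f(x_1, x_2) = x_1 x_2$$ and $$g(y) = y^2$$. These two functions are assumptions made only for this example and are not part of the library:

```latex
\eqalign{
y &= x_1 x_2, \qquad z = y^2 \cr
\frac{dz}{dy} &= 2y = 2 x_1 x_2 \cr
\nabla_\mathbf{x} y &= \left< x_2, x_1 \right> \cr
\nabla_\mathbf{x} z &= \frac{dz}{dy} \nabla_\mathbf{x} y = \left< 2 x_1 x_2^2, 2 x_1^2 x_2 \right>
}
```

Differentiating $$z = x_1^2 x_2^2$$ directly gives the same two components, which is what the chain rule promises.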

So, to create an Op as a with the Op constructor, you give a function that returns a tuple, containing:

1. An a: The result of the function
2. An a -> Tuple as: A function that, when given $$\frac{dz}{dy}$$, returns the total gradient $$\nabla_\mathbf{x} z$$.

This is done so that Ops can easily be "chained" together, one after the other. If you have an Op for $$f$$ and an Op for $$g$$, you can compute the gradient of $$f$$ knowing that the result target is $$g \circ f$$.
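The chaining idea can be sketched in a few lines using a simplified, single-input model. The names Op1, runOp1With, gradOp1, and chain below are hypothetical and exist only for this sketch; the library's Op works over a heterogeneous Tuple of inputs instead of a single value.

```haskell
-- A simplified, single-input model of the idea behind Op.
-- Op1, runOp1With, gradOp1 and chain are hypothetical names for this sketch.
newtype Op1 a b = Op1 { runOp1With :: a -> (b, b -> a) }

-- Gradient of the output with respect to the input: feed in dz/dy = 1.
gradOp1 :: Num b => Op1 a b -> a -> a
gradOp1 o x = snd (runOp1With o x) 1

-- Chain f and then g: run forwards through f then g, and run the
-- continuations backwards through g then f (the chain rule).
chain :: Op1 b c -> Op1 a b -> Op1 a c
chain g f = Op1 $ \x ->
  let (y, backF) = runOp1With f x
      (z, backG) = runOp1With g y
  in  (z, backF . backG)

squareOp :: Op1 Double Double
squareOp = Op1 $ \x -> (x * x, \d -> 2 * x * d)

sinOp1 :: Op1 Double Double
sinOp1 = Op1 $ \x -> (sin x, \d -> cos x * d)

-- d/dx sin(x^2) = 2 x cos(x^2), evaluated at x = 3
example :: Double
example = gradOp1 (sinOp1 `chain` squareOp) 3
```

Loading this in GHCi and evaluating example should agree with 2 * 3 * cos 9. Note that neither squareOp nor sinOp1 needs to know what it will be composed with; each only maps the derivative of its own output to the derivative of its own input.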

Note that end users should probably never be required to construct an Op explicitly this way. Instead, libraries should provide carefully pre-constructed ones, or provide ways to generate them automatically (like op1, op2, and op3 here).

For examples of Ops implemented from scratch, see the implementations of +., -., recipOp, sinOp, etc.

See Numeric.Backprop.Op for a mini-tutorial on using Prod and Tuple.

newtype Op as a Source #

An Op as a describes a differentiable function from as to a.

For example, a value of type

Op '[Int, Bool] Double

is a function from an Int and a Bool, returning a Double. It can be differentiated to give a gradient of an Int and a Bool if given a total derivative for the Double. If we call Bool $$2$$, then, mathematically, it is akin to a:

$f : \mathbb{Z} \times 2 \rightarrow \mathbb{R}$

See runOp, gradOp, and gradOpWith for examples on how to run it, and Op for instructions on creating it.

It is simpler to not use this type constructor directly, and instead use the op0, op1, op2, and op3 helper smart constructors.

See Numeric.Backprop.Op for a mini-tutorial on using Prod and Tuple.

Constructors

Op

Construct an Op by giving a function creating the result, and also a continuation on how to create the gradient, given the total derivative of a.

See the module documentation for Numeric.Backprop.Op for more details on the function that this constructor and Op expect.

Fields

runOpWith :: Tuple as -> (a, a -> Tuple as)

Run the function that the Op encodes, returning a continuation to compute the gradient, given the total derivative of a. See documentation for Numeric.Backprop.Op for more information.

Instances

(Known [*] (Length *) as, Every * Floating as, Every * Fractional as, Every * Num as, Floating a) => Floating (Op as a) Source #

Methods

pi :: Op as a #
exp :: Op as a -> Op as a #
log :: Op as a -> Op as a #
sqrt :: Op as a -> Op as a #
(**) :: Op as a -> Op as a -> Op as a #
logBase :: Op as a -> Op as a -> Op as a #
sin :: Op as a -> Op as a #
cos :: Op as a -> Op as a #
tan :: Op as a -> Op as a #
asin :: Op as a -> Op as a #
acos :: Op as a -> Op as a #
atan :: Op as a -> Op as a #
sinh :: Op as a -> Op as a #
cosh :: Op as a -> Op as a #
tanh :: Op as a -> Op as a #
asinh :: Op as a -> Op as a #
acosh :: Op as a -> Op as a #
atanh :: Op as a -> Op as a #
log1p :: Op as a -> Op as a #
expm1 :: Op as a -> Op as a #
log1pexp :: Op as a -> Op as a #
log1mexp :: Op as a -> Op as a #

(Known [*] (Length *) as, Every * Fractional as, Every * Num as, Fractional a) => Fractional (Op as a) Source #

Methods

(/) :: Op as a -> Op as a -> Op as a #
recip :: Op as a -> Op as a #
fromRational :: Rational -> Op as a #

(Known [*] (Length *) as, Every * Num as, Num a) => Num (Op as a) Source #

Methods

(+) :: Op as a -> Op as a -> Op as a #
(-) :: Op as a -> Op as a -> Op as a #
(*) :: Op as a -> Op as a -> Op as a #
negate :: Op as a -> Op as a #
abs :: Op as a -> Op as a #
signum :: Op as a -> Op as a #
fromInteger :: Integer -> Op as a #

## Tuple Types

Prod, from the <http://hackage.haskell.org/package/type-combinators type-combinators> library (in Data.Type.Product) is a heterogeneous list/tuple type, which allows you to tuple together multiple values of different types and operate on them generically.

A Prod f '[a, b, c] contains an f a, an f b, and an f c, and is constructed by consing them together with :< (using 'Ø' as nil):

I "hello" :< I True :< I 7.8 :< Ø    :: Prod I '[String, Bool, Double]
C "hello" :< C "world" :< C "ok" :< Ø  :: Prod (C String) '[a, b, c]
Proxy :< Proxy :< Proxy :< Ø           :: Prod Proxy '[a, b, c]

(I is the identity functor, and C is the constant functor)

So, in general:

x :: f a
y :: f b
z :: f c
x :< y :< z :< Ø :: Prod f '[a, b, c]

If you're having problems typing 'Ø', you can use only:

only z           :: Prod f '[c]
x :< y :< only z :: Prod f '[a, b, c]

Tuple is provided as a convenient type synonym for Prod I, and has a convenient pattern synonym ::< (and only_), which can also be used for pattern matching:

x :: a
y :: b
z :: c

only_ z             :: Tuple '[c]
x ::< y ::< z ::< Ø :: Tuple '[a, b, c]
x ::< y ::< only_ z :: Tuple '[a, b, c]
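To make the shape of these types concrete, here is a minimal re-implementation of the idea, for illustration only. It is not the type-combinators definition: Nil stands in for Ø, the element kind is fixed to ordinary types, and lengthProd, describe, and example are hypothetical names introduced for this sketch.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

import Data.Kind (Type)

-- A minimal stand-in for Prod: a list of values whose types are tracked
-- in a type-level list. Nil plays the role of the library's Ø.
data Prod (f :: Type -> Type) (as :: [Type]) where
  Nil  :: Prod f '[]
  (:<) :: f a -> Prod f as -> Prod f (a ': as)
infixr 5 :<

-- The identity functor, playing the role of I.
newtype I a = I { getI :: a }

-- Tuple is a Prod of plain values.
type Tuple = Prod I

-- Number of elements, ignoring their types.
lengthProd :: Prod f as -> Int
lengthProd Nil       = 0
lengthProd (_ :< xs) = 1 + lengthProd xs

-- Pattern matching recovers each element at its own type.
describe :: Tuple '[String, Bool, Double] -> String
describe (I s :< I b :< I d :< Nil) = s ++ " " ++ show b ++ " " ++ show d

example :: Tuple '[String, Bool, Double]
example = I "hello" :< I True :< I 7.8 :< Nil
```

In GHCi, lengthProd example counts the three elements and describe example renders them with their own Show instances. The type-level list is what lets an Op state the type of every input and, symmetrically, the type of every component of its gradient.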

data Prod k (f :: k -> *) (a :: [k]) :: forall k. (k -> *) -> [k] -> * where #

Constructors

Ø :: Prod k f ([] k)
(:<) :: f a1 -> Prod k f as -> Prod k f ((:) k a1 as) infixr 5

Instances

Witness ØC ØC (Prod k f (Ø k))

Associated Types

type WitnessC ØC ØC (Prod k f (Ø k)) :: Constraint #

Methods

(\\) :: ØC => (ØC -> r) -> Prod k f (Ø k) -> r #

Functor1 k [k] (Prod k)

Methods

map1 :: (forall (a :: Prod k). f a -> g a) -> t f b -> t g b #

Foldable1 k [k] (Prod k)

Methods

foldMap1 :: Monoid m => (forall (a :: Prod k). f a -> m) -> t f b -> m #

Traversable1 k [k] (Prod k)

Methods

traverse1 :: Applicative h => (forall (a :: Prod k). f a -> h (g a)) -> t f b -> h (t g b) #

IxFunctor1 k [k] (Index k) (Prod k)

Methods

imap1 :: (forall (a :: Index k). i b a -> f a -> g a) -> t f b -> t g b #

IxFoldable1 k [k] (Index k) (Prod k)

Methods

ifoldMap1 :: Monoid m => (forall (a :: Index k). i b a -> f a -> m) -> t f b -> m #

IxTraversable1 k [k] (Index k) (Prod k)

Methods

itraverse1 :: Applicative h => (forall (a :: Index k). i b a -> f a -> h (g a)) -> t f b -> h (t g b) #

TestEquality k f => TestEquality [k] (Prod k f)

Methods

testEquality :: f a -> f b -> Maybe ((Prod k f :~: a) b) #

BoolEquality k f => BoolEquality [k] (Prod k f)

Methods

boolEquality :: f a -> f b -> Boolean ((Prod k f == a) b) #

Eq1 k f => Eq1 [k] (Prod k f)

Methods

eq1 :: f a -> f a -> Bool #
neq1 :: f a -> f a -> Bool #

Ord1 k f => Ord1 [k] (Prod k f)

Methods

compare1 :: f a -> f a -> Ordering #
(<#) :: f a -> f a -> Bool #
(>#) :: f a -> f a -> Bool #
(<=#) :: f a -> f a -> Bool #
(>=#) :: f a -> f a -> Bool #

Show1 k f => Show1 [k] (Prod k f)

Methods

showsPrec1 :: Int -> f a -> ShowS #
show1 :: f a -> String #

Read1 k f => Read1 [k] (Prod k f)

Methods

readsPrec1 :: Int -> ReadS (Some (Prod k f) f) #

(Known [k] (Length k) as, Every k (Known k f) as) => Known [k] (Prod k f) as

Associated Types

type KnownC (Prod k f) (as :: Prod k f -> *) (a :: Prod k f) :: Constraint #

Methods

known :: as a #

(Witness p q (f a2), Witness s t (Prod a1 f as)) => Witness (p, s) (q, t) (Prod a1 f ((:<) a1 a2 as))

Associated Types

type WitnessC (p, s) (q, t) (Prod a1 f ((a1 :< a2) as)) :: Constraint #

Methods

(\\) :: (p, s) => ((q, t) -> r) -> Prod a1 f ((a1 :< a2) as) -> r #

ListC ((<$>) * Constraint Eq ((<$>) k * f as)) => Eq (Prod k f as)

Methods

(==) :: Prod k f as -> Prod k f as -> Bool #
(/=) :: Prod k f as -> Prod k f as -> Bool #

(ListC ((<$>) * Constraint Eq ((<$>) k * f as)), ListC ((<$>) * Constraint Ord ((<$>) k * f as))) => Ord (Prod k f as)

Methods

compare :: Prod k f as -> Prod k f as -> Ordering #
(<) :: Prod k f as -> Prod k f as -> Bool #
(<=) :: Prod k f as -> Prod k f as -> Bool #
(>) :: Prod k f as -> Prod k f as -> Bool #
(>=) :: Prod k f as -> Prod k f as -> Bool #
max :: Prod k f as -> Prod k f as -> Prod k f as #
min :: Prod k f as -> Prod k f as -> Prod k f as #

ListC ((<$>) * Constraint Show ((<$>) k * f as)) => Show (Prod k f as)

Methods

showsPrec :: Int -> Prod k f as -> ShowS #
show :: Prod k f as -> String #
showList :: [Prod k f as] -> ShowS #

type WitnessC ØC ØC (Prod k f (Ø k))

type WitnessC ØC ØC (Prod k f (Ø k)) = ØC

type KnownC [k] (Prod k f) as

type KnownC [k] (Prod k f) as = (Known [k] (Length k) as, Every k (Known k f) as)

type WitnessC (p, s) (q, t) (Prod a1 f ((:<) a1 a2 as))

type WitnessC (p, s) (q, t) (Prod a1 f ((:<) a1 a2 as)) = (Witness p q (f a2), Witness s t (Prod a1 f as))

type Tuple = Prod * I #

A Prod of simple Haskell types.

newtype I a :: * -> * #

Constructors

I

Fields

getI :: a

Instances

Monad I

Methods

(>>=) :: I a -> (a -> I b) -> I b #
(>>) :: I a -> I b -> I b #
return :: a -> I a #
fail :: String -> I a #

Functor I

Methods

fmap :: (a -> b) -> I a -> I b #
(<$) :: a -> I b -> I a #

Applicative I

Methods

pure :: a -> I a #
(<*>) :: I (a -> b) -> I a -> I b #
liftA2 :: (a -> b -> c) -> I a -> I b -> I c #
(*>) :: I a -> I b -> I b #
(<*) :: I a -> I b -> I a #

Foldable I

Methods

fold :: Monoid m => I m -> m #
foldMap :: Monoid m => (a -> m) -> I a -> m #
foldr :: (a -> b -> b) -> b -> I a -> b #
foldr' :: (a -> b -> b) -> b -> I a -> b #
foldl :: (b -> a -> b) -> b -> I a -> b #
foldl' :: (b -> a -> b) -> b -> I a -> b #
foldr1 :: (a -> a -> a) -> I a -> a #
foldl1 :: (a -> a -> a) -> I a -> a #
toList :: I a -> [a] #
null :: I a -> Bool #
length :: I a -> Int #
elem :: Eq a => a -> I a -> Bool #
maximum :: Ord a => I a -> a #
minimum :: Ord a => I a -> a #
sum :: Num a => I a -> a #
product :: Num a => I a -> a #

Traversable I

Methods

traverse :: Applicative f => (a -> f b) -> I a -> f (I b) #
sequenceA :: Applicative f => I (f a) -> f (I a) #
mapM :: Monad m => (a -> m b) -> I a -> m (I b) #
sequence :: Monad m => I (m a) -> m (I a) #

Witness p q a => Witness p q (I a)

Associated Types

type WitnessC p q (I a) :: Constraint #

Methods

(\\) :: p => (q -> r) -> I a -> r #

Eq a => Eq (I a)

Methods

(==) :: I a -> I a -> Bool #
(/=) :: I a -> I a -> Bool #

Num a => Num (I a)

Methods

(+) :: I a -> I a -> I a #
(-) :: I a -> I a -> I a #
(*) :: I a -> I a -> I a #
negate :: I a -> I a #
abs :: I a -> I a #
signum :: I a -> I a #
fromInteger :: Integer -> I a #

Ord a => Ord (I a)

Methods

compare :: I a -> I a -> Ordering #
(<) :: I a -> I a -> Bool #
(<=) :: I a -> I a -> Bool #
(>) :: I a -> I a -> Bool #
(>=) :: I a -> I a -> Bool #
max :: I a -> I a -> I a #
min :: I a -> I a -> I a #

Show a => Show (I a)

Methods

showsPrec :: Int -> I a -> ShowS #
show :: I a -> String #
showList :: [I a] -> ShowS #

type WitnessC p q (I a)

type WitnessC p q (I a) = Witness p q a

# Running

## Pure

runOp :: Num a => Op as a -> Tuple as -> (a, Tuple as) Source #

Run the function that an Op encodes, to get the resulting output and also its gradient with respect to the inputs.

>>> runOp (*.) (3 ::< 5 ::< Ø)
(15, 5 ::< 3 ::< Ø)

evalOp :: Op as a -> Tuple as -> a Source #

Run the function that an Op encodes, to get the result.

>>> evalOp (*.) (3 ::< 5 ::< Ø)
15

gradOp :: Num a => Op as a -> Tuple as -> Tuple as Source #

Run the function that an Op encodes, and get the gradient of the output with respect to the inputs.

>>> gradOp (*.) (3 ::< 5 ::< Ø)
5 ::< 3 ::< Ø -- the gradient of x*y is (y, x)
gradOp o xs = gradOpWith o xs 1

gradOpWith

Arguments

 :: Op as a     -- Op to run
 -> Tuple as    -- Inputs to run it with
 -> a           -- The total derivative of the result.
 -> Tuple as    -- The gradient

Get the gradient function that an Op encodes, with a third argument expecting the total derivative of the result.

See the module documentation for Numeric.Backprop.Op for more information.

# Creation

op0 :: a -> Op '[] a Source #

Create an Op that takes no inputs and always returns the given value.

There is no gradient, of course (using gradOp will give you an empty tuple), because there is no input to have a gradient of.

>>> runOp (op0 10) Ø
(10, Ø)

For a constant Op that takes input and ignores it, see opConst and opConst'.

opConst :: (Every Num as, Known Length as) => a -> Op as a Source #

An Op that ignores all of its inputs and returns a given constant value.

>>> runOp (opConst 10) (1 ::< 2 ::< 3 ::< Ø)
(10, 0 ::< 0 ::< 0 ::< Ø)

idOp :: Op '[a] a Source #

An Op that just returns whatever it receives. The identity function.

idOp = opIso id id

opConst' :: Every Num as => Length as -> a -> Op as a Source #

A version of opConst taking explicit Length, indicating the number of inputs and their types.

Requiring an explicit Length is mostly useful for rare "extremely polymorphic" situations, where GHC can't infer the type and length of the expected input tuple. If you ever actually explicitly write down as as a list of types, you should be able to just use opConst.

## Giving gradients directly

op1 :: (a -> (b, b -> a)) -> Op '[a] b Source #

Create an Op of a function taking one input, by giving its explicit derivative. The function should return a tuple containing the result of the function, and also a function taking the derivative of the result and returning the derivative of the input.

If we have

\eqalign{ f &: \mathbb{R} \rightarrow \mathbb{R}\cr y &= f(x)\cr z &= g(y) }

Then the derivative $$\frac{dz}{dx}$$ would be:

$\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}$

If our Op represents $$f$$, then the second item in the resulting tuple should be a function that takes $$\frac{dz}{dy}$$ and returns $$\frac{dz}{dx}$$.

As an example, here is an Op that squares its input:

square :: Num a => Op '[a] a
square = op1 $ \x -> (x*x, \d -> 2 * d * x)

Remember that, generally, end users shouldn't directly construct Ops; they should be provided by libraries or generated automatically.

op2 :: (a -> b -> (c, c -> (a, b))) -> Op '[a, b] c Source #

Create an Op of a function taking two inputs, by giving its explicit gradient. The function should return a tuple containing the result of the function, and also a function taking the derivative of the result and returning the derivatives of the inputs.

If we have

\eqalign{ f &: \mathbb{R}^2 \rightarrow \mathbb{R}\cr z &= f(x, y)\cr k &= g(z) }

Then the gradient $$\left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right>$$ would be:

$\left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right> = \left< \frac{dk}{dz} \frac{\partial z}{\partial x}, \frac{dk}{dz} \frac{\partial z}{\partial y} \right>$

If our Op represents $$f$$, then the second item in the resulting tuple should be a function that takes $$\frac{dk}{dz}$$ and returns $$\left< \frac{\partial k}{\partial x}, \frac{\partial k}{\partial y} \right>$$.

As an example, here is an Op that multiplies its inputs:

mul :: Num a => Op '[a, a] a
mul = op2 $ \x y -> (x*y, \d -> (d*y, x*d))

Remember that, generally, end users shouldn't directly construct Ops; they should be provided by libraries or generated automatically.
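When a gradient is written by hand like this, it is easy to swap two components or drop a factor, so a numerical check can be worthwhile. The sketch below uses a hypothetical two-input stand-in (Op2, runOp2With, gradOp2, mulOp, numericGrad are names made up for this example, not library functions) and compares the hand-written gradient against central finite differences.

```haskell
-- Hypothetical two-input analogue of Op for this sketch; not the library's type.
newtype Op2 a b c = Op2 { runOp2With :: a -> b -> (c, c -> (a, b)) }

-- Gradient with respect to both inputs: feed in dk/dz = 1.
gradOp2 :: Num c => Op2 a b c -> a -> b -> (a, b)
gradOp2 o x y = snd (runOp2With o x y) 1

-- Multiplication with its hand-written gradient, mirroring the example above.
mulOp :: Num a => Op2 a a a
mulOp = Op2 $ \x y -> (x * y, \d -> (d * y, x * d))

-- Central finite differences, used only to sanity-check the gradient.
numericGrad :: (Double -> Double -> Double) -> Double -> Double -> (Double, Double)
numericGrad f x y =
  ( (f (x + h) y - f (x - h) y) / (2 * h)
  , (f x (y + h) - f x (y - h)) / (2 * h) )
  where
    h = 1e-6

-- True when the two gradients agree up to a loose tolerance.
agrees :: Bool
agrees =
  let (gx, gy) = gradOp2 mulOp 3 5
      (nx, ny) = numericGrad (*) 3 5
  in  abs (gx - nx) < 1e-4 && abs (gy - ny) < 1e-4
```

The tolerance is deliberately loose, since finite differences are only approximate; the point is to catch structural mistakes, not to measure precision.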

op3 :: (a -> b -> c -> (d, d -> (a, b, c))) -> Op '[a, b, c] d Source #

Create an Op of a function taking three inputs, by giving its explicit gradient. See documentation for op2 for more details.

## From Isomorphisms

opCoerce :: Coercible a b => Op '[a] b Source #

An Op that coerces an item into another item whose type has the same runtime representation.

>>> runOp opCoerce (only_ (Identity 5)) :: (Int, Tuple '[Identity Int])
(5, Identity 1 ::< Ø)
opCoerce = opIso coerce coerce

opTup :: Op as (Tuple as) Source #

An Op that takes as and returns exactly the input tuple.

>>> runOp opTup (1 ::< 2 ::< 3 ::< Ø)
(1 ::< 2 ::< 3 ::< Ø, 1 ::< 1 ::< 1 ::< Ø)

opIso :: (a -> b) -> (b -> a) -> Op '[a] b Source #

An Op that runs the input value through an isomorphism.

Warning: This is unsafe! It assumes that the isomorphisms themselves have derivative 1, so will break for things like exponentiating. Basically, don't use this for any "numeric" isomorphisms.
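The warning can be illustrated with a simplified single-input model. The sketch assumes that the backward pass simply applies the inverse function to the incoming derivative, which is consistent with the opCoerce example above but is an assumption about behaviour rather than a quotation of the library source. Op1, gradOp1, isoOp, and Meters are hypothetical names for this sketch.

```haskell
-- Hypothetical single-input model for this sketch; not the library's Op.
newtype Op1 a b = Op1 { runOp1With :: a -> (b, b -> a) }

gradOp1 :: Num b => Op1 a b -> a -> a
gradOp1 o x = snd (runOp1With o x) 1

-- Assumed behaviour: the forward pass applies one direction of the
-- isomorphism, and the backward pass applies the other direction to the
-- incoming derivative, as if the mapping had derivative 1.
isoOp :: (a -> b) -> (b -> a) -> Op1 a b
isoOp to from = Op1 $ \x -> (to x, from)

newtype Meters = Meters Double deriving (Show, Eq)

-- Fine: unwrapping a newtype does not change the number, so treating the
-- derivative as 1 is correct.
wrapGrad :: Meters
wrapGrad = gradOp1 (isoOp (\(Meters m) -> m) Meters) (Meters 2)

-- Wrong: exp and log are mutual inverses, but d/dx exp x is exp x, not 1.
-- The backward pass computes log 1 instead.
badGrad :: Double
badGrad = gradOp1 (isoOp exp log) 2

-- What the gradient should have been.
trueGrad :: Double
trueGrad = exp 2
```

Here wrapGrad comes out as Meters 1.0, which is right, while badGrad is log 1, that is 0, instead of exp 2. This is the sense in which "numeric" isomorphisms break the assumption.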

opLens :: Num a => Lens' a b -> Op '[a] b Source #

An Op that extracts a value from an input value using a Lens'.

Warning: This is unsafe! It assumes that it extracts a specific value unchanged, with derivative 1, so will break for things that numerically manipulate things before returning them.

opIsoN :: (Tuple as -> b) -> (b -> Tuple as) -> Op as b Source #

An Op that runs the input value through an isomorphism between a tuple of values and a value.

Warning: This is unsafe! It assumes that the isomorphisms themselves have derivative 1, so will break for things like exponentiating. Basically, don't use this for any "numeric" isomorphisms.

Since: 0.1.2.0

# Manipulation

composeOp

Arguments

 :: (Every Num as, Known Length as)
 => Prod (Op as) bs    -- Prod of Ops taking as and returning different b in bs
 -> Op bs c            -- Op taking each of the bs from the input Prod.
 -> Op as c            -- Composed Op

Compose Ops together, like sequence for functions, or liftAN.

That is, given an Op as b1, an Op as b2, and an Op as b3, it can compose them with an Op '[b1,b2,b3] c to create an Op as c.
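The shape of this composition can be sketched in a homogeneous model where every input and output is a Double and inputs are passed as a list. OpL, runOpLWith, composeL, addL, and mulL are hypothetical names for this sketch; the library version is typed over Tuples. The sketch sums the gradient contributions that each inner Op makes to the shared inputs, which is presumably why the real signature asks for Every Num as.

```haskell
-- Hypothetical homogeneous model for this sketch: all values are Doubles
-- and inputs are passed as a list. The library uses typed Tuples instead.
newtype OpL = OpL { runOpLWith :: [Double] -> (Double, Double -> [Double]) }

-- Feed the outputs of several Ops over the same inputs into one final Op.
-- Each inner Op contributes a gradient for the shared inputs, and the
-- contributions are added together.
composeL :: [OpL] -> OpL -> OpL
composeL fs g = OpL $ \xs ->
  let (ys, backs) = unzip (map (`runOpLWith` xs) fs)
      (z, backG)  = runOpLWith g ys
      back d      = foldr (zipWith (+)) (map (const 0) xs)
                          (zipWith ($) backs (backG d))
  in  (z, back)

-- Two-input primitives; the list patterns are partial and only meant
-- for exactly two inputs.
addL, mulL :: OpL
addL = OpL $ \[x, y] -> (x + y, \d -> [d, d])
mulL = OpL $ \[x, y] -> (x * y, \d -> [d * y, x * d])

-- h(x, y) = (x + y) * (x * y), evaluated with its gradient at (3, 5)
example :: (Double, [Double])
example =
  let (z, back) = runOpLWith (composeL [addL, mulL] mulL) [3, 5]
  in  (z, back 1)
```

For h(x, y) = x²y + xy², the partial derivatives at (3, 5) are 2xy + y² = 55 and x² + 2xy = 39, and the value is 120, which is what example evaluates to.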

composeOp1 :: (Every Num as, Known Length as) => Op as b -> Op '[b] c -> Op as c Source #

Convenient wrapper over composeOp for the case where the second function only takes one input, so the two Ops can be directly piped together, like with (.).

(~.) :: (Known Length as, Every Num as) => Op '[b] c -> Op as b -> Op as c infixr 9 Source #

Convenient infix synonym for (flipped) composeOp1. Meant to be used just like (.):

f :: Op '[b]   c
g :: Op '[a,a] b

f ~. g :: Op '[a, a] c

composeOp'

Arguments

 :: Every Num as
 => Length as
 -> Prod (Op as) bs    -- Prod of Ops taking as and returning different b in bs
 -> Op bs c            -- Op taking each of the bs from the input Prod.
 -> Op as c            -- Composed Op

A version of composeOp taking explicit Length, indicating the number of inputs expected and their types.

Requiring an explicit Length is mostly useful for rare "extremely polymorphic" situations, where GHC can't infer the type and length of the expected input tuple. If you ever actually explicitly write down as as a list of types, you should be able to just use composeOp.

composeOp1' :: Every Num as => Length as -> Op as b -> Op '[b] c -> Op as c Source #

A version of composeOp1 taking explicit Length, indicating the number of inputs expected and their types.

Requiring an explicit Length is mostly useful for rare "extremely polymorphic" situations, where GHC can't infer the type and length of the expected input tuple. If you ever actually explicitly write down as as a list of types, you should be able to just use composeOp1.

# Utility

pattern (:>) :: forall k (f :: k -> *) (a :: k) (b :: k). f a -> f b -> Prod k f ((:) k a ((:) k b ([] k))) infix 6 #

Construct a two element Prod. Since the precedence of (:>) is higher than (:<), we can conveniently write lists like:

>>> a :< b :> c

Which is identical to:

>>> a :< b :< c :< Ø

only :: f a -> Prod k f ((:) k a ([] k)) #

Build a singleton Prod.

head' :: Prod k f ((:<) k a as) -> f a #

pattern (::<) :: forall a (as :: [*]). a -> Tuple as -> Tuple ((:<) * a as) infixr 5 #

Cons onto a Tuple.

only_ :: a -> Tuple ((:) * a ([] *)) #

Singleton Tuple.

## Numeric Ops

Built-in ops for common numeric operations.

Note that the operators (like +.) are meant to be used in prefix form, like:

liftOp2 (+.) v1 v2

(+.) :: Num a => Op '[a, a] a Source #

Op for addition

(-.) :: Num a => Op '[a, a] a Source #

Op for subtraction

(*.) :: Num a => Op '[a, a] a Source #

Op for multiplication

negateOp :: Num a => Op '[a] a Source #

Op for negation

absOp :: Num a => Op '[a] a Source #

Op for absolute value

signumOp :: Num a => Op '[a] a Source #

Op for signum

(/.) :: Fractional a => Op '[a, a] a Source #

Op for division

recipOp :: Fractional a => Op '[a] a Source #

Op for multiplicative inverse

expOp :: Floating a => Op '[a] a Source #

Op for exp

logOp :: Floating a => Op '[a] a Source #

Op for the natural logarithm

sqrtOp :: Floating a => Op '[a] a Source #

Op for square root

(**.) :: Floating a => Op '[a, a] a Source #

Op for exponentiation

logBaseOp :: Floating a => Op '[a, a] a Source #

Op for logBase

sinOp :: Floating a => Op '[a] a Source #

Op for sine

cosOp :: Floating a => Op '[a] a Source #

Op for cosine

tanOp :: Floating a => Op '[a] a Source #

Op for tangent

asinOp :: Floating a => Op '[a] a Source #

Op for arcsine

acosOp :: Floating a => Op '[a] a Source #

Op for arccosine

atanOp :: Floating a => Op '[a] a Source #

Op for arctangent

sinhOp :: Floating a => Op '[a] a Source #

Op for hyperbolic sine

coshOp :: Floating a => Op '[a] a Source #

Op for hyperbolic cosine

tanhOp :: Floating a => Op '[a] a Source #

Op for hyperbolic tangent

asinhOp :: Floating a => Op '[a] a Source #

Op for hyperbolic arcsine

acoshOp :: Floating a => Op '[a] a Source #

Op for hyperbolic arccosine

atanhOp :: Floating a => Op '[a] a Source #

Op for hyperbolic arctangent