| Portability | GHC only |
|---|---|
| Stability | experimental |
| Maintainer | ekmett@gmail.com |
| Safe Haskell | None |
Numeric.AD
Contents
- AD modes
- Gradients (Reverse Mode)
- Higher Order Gradients (Sparse-on-Reverse)
- Variadic Gradients (Sparse or Kahn)
- Jacobians (Sparse or Reverse)
- Higher Order Jacobian (Sparse-on-Reverse)
- Transposed Jacobians (Forward Mode)
- Hessian (Sparse-On-Reverse)
- Hessian Tensors (Sparse or Sparse-On-Reverse)
- Hessian Tensors (Sparse)
- Hessian Vector Products (Forward-On-Reverse)
- Derivatives (Forward Mode)
- Derivatives (Tower)
- Directional Derivatives (Forward Mode)
- Directional Derivatives (Tower)
- Taylor Series (Tower)
- Maclaurin Series (Tower)
- Gradient Descent
Description
Mixed-Mode Automatic Differentiation.
Each combinator exported from this module chooses an appropriate AD mode. The following basic operations are supported, modified as appropriate by the suffixes below:
-
gradcomputes the gradient (partial derivatives) of a function at a point -
jacobiancomputes the Jacobian matrix of a function at a point -
diffcomputes the derivative of a function at a point -
ducomputes a directional derivative of a function at a point -
hessiancompute the Hessian matrix (matrix of second partial derivatives) of a function at a point
The suffixes have the following meanings:
-
'-- also return the answer -
Withlets the user supply a function to blend the input with the output -
Fis a version of the base function lifted to return aTraversable(orFunctor) result -
smeans the function returns all higher derivatives in a list or f-branchingStream -
Tmeans the result is transposed with respect to the traditional formulation. -
0means that the resulting derivative list is padded with 0s at the end.
- class (Num t, Num (Scalar t)) => Mode t where
- type family Scalar t :: *
- grad :: (Traversable f, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> f a
- grad' :: (Traversable f, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> (a, f a)
- gradWith :: (Traversable f, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> f b
- gradWith' :: (Traversable f, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> (a, f b)
- grads :: (Traversable f, Num a) => (forall s. f (Sparse a s) -> Sparse a s) -> f a -> Cofree f a
- class Num a => Grad i o o' a | i -> a o o', o -> a i o', o' -> a i o
- vgrad :: Grad i o o' a => i -> o
- vgrad' :: Grad i o o' a => i -> o'
- class Num a => Grads i o a | i -> a o, o -> a i
- vgrads :: Grads i o a => i -> o
- jacobian :: (Traversable f, Functor g, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (f a)
- jacobian' :: (Traversable f, Functor g, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (a, f a)
- jacobianWith :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (f b)
- jacobianWith' :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (a, f b)
- jacobians :: (Traversable f, Functor g, Num a) => (forall s. f (Sparse a s) -> g (Sparse a s)) -> f a -> g (Cofree f a)
- jacobianT :: (Traversable f, Functor g, Num a) => (forall s. f (Forward a s) -> g (Forward a s)) -> f a -> f (g a)
- jacobianWithT :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (forall s. f (Forward a s) -> g (Forward a s)) -> f a -> f (g b)
- hessian :: (Traversable f, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Sparse a s') s)) -> On (Reverse (Sparse a s') s)) -> f a -> f (f a)
- hessian' :: (Traversable f, Num a) => (forall s. f (Sparse a s) -> Sparse a s) -> f a -> (a, f (a, f a))
- hessianF :: (Traversable f, Functor g, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Sparse a s') s)) -> g (On (Reverse (Sparse a s') s))) -> f a -> g (f (f a))
- hessianF' :: (Traversable f, Functor g, Num a) => (forall s. f (Sparse a s) -> g (Sparse a s)) -> f a -> g (a, f (a, f a))
- hessianProduct :: (Traversable f, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Forward a s') s)) -> On (Reverse (Forward a s') s)) -> f (a, a) -> f a
- hessianProduct' :: (Traversable f, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Forward a s') s)) -> On (Reverse (Forward a s') s)) -> f (a, a) -> f (a, a)
- diff :: Num a => (forall s. Forward a s -> Forward a s) -> a -> a
- diffF :: (Functor f, Num a) => (forall s. Forward a s -> f (Forward a s)) -> a -> f a
- diff' :: Num a => (forall s. Forward a s -> Forward a s) -> a -> (a, a)
- diffF' :: (Functor f, Num a) => (forall s. Forward a s -> f (Forward a s)) -> a -> f (a, a)
- diffs :: Num a => (forall s. Tower a s -> Tower a s) -> a -> [a]
- diffsF :: (Functor f, Num a) => (forall s. Tower a s -> f (Tower a s)) -> a -> f [a]
- diffs0 :: Num a => (forall s. Tower a s -> Tower a s) -> a -> [a]
- diffs0F :: (Functor f, Num a) => (forall s. Tower a s -> f (Tower a s)) -> a -> f [a]
- du :: (Functor f, Num a) => (forall s. f (Forward a s) -> Forward a s) -> f (a, a) -> a
- du' :: (Functor f, Num a) => (forall s. f (Forward a s) -> Forward a s) -> f (a, a) -> (a, a)
- duF :: (Functor f, Functor g, Num a) => (forall s. f (Forward a s) -> g (Forward a s)) -> f (a, a) -> g a
- duF' :: (Functor f, Functor g, Num a) => (forall s. f (Forward a s) -> g (Forward a s)) -> f (a, a) -> g (a, a)
- dus :: (Functor f, Num a) => (forall s. f (Tower a s) -> Tower a s) -> f [a] -> [a]
- dus0 :: (Functor f, Num a) => (forall s. f (Tower a s) -> Tower a s) -> f [a] -> [a]
- dusF :: (Functor f, Functor g, Num a) => (forall s. f (Tower a s) -> g (Tower a s)) -> f [a] -> g [a]
- dus0F :: (Functor f, Functor g, Num a) => (forall s. f (Tower a s) -> g (Tower a s)) -> f [a] -> g [a]
- taylor :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> a -> [a]
- taylor0 :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> a -> [a]
- maclaurin :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> [a]
- maclaurin0 :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> [a]
- gradientDescent :: (Traversable f, Fractional a, Ord a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> [f a]
- gradientAscent :: (Traversable f, Fractional a, Ord a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> [f a]
- conjugateGradientDescent :: (Traversable f, Ord a, Fractional a) => (forall s1 s2 s3 s4. Chosen s4 => f (Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> f a -> [f a]
- conjugateGradientAscent :: (Traversable f, Ord a, Fractional a) => (forall s1 s2 s3 s4. Chosen s4 => f (Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> f a -> [f a]
AD modes
class (Num t, Num (Scalar t)) => Mode t whereSource
Instances
| Mode (ForwardDouble s) | |
| (Mode t, Mode (Scalar t)) => Mode (On t) | |
| Num a => Mode (Tower a s) | |
| Num a => Mode (Id a s) | |
| Num a => Mode (Sparse a s) | |
| (Num a, Reifies * s Tape) => Mode (Reverse a s) | |
| Num a => Mode (Kahn a s) | |
| Num a => Mode (Forward a s) | |
| (Mode a, Mode b, Chosen s, ~ * (Scalar a) (Scalar b)) => Mode (Or a b s) | |
| (Num a, Traversable f) => Mode (Dense f a s) |
Gradients (Reverse Mode)
grad :: (Traversable f, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> f aSource
The grad function calculates the gradient of a non-scalar-to-scalar function with reverse-mode AD in a single pass.
>>>grad (\[x,y,z] -> x*y+z) [1,2,3][2,1,1]
grad' :: (Traversable f, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> (a, f a)Source
The grad' function calculates the result and gradient of a non-scalar-to-scalar function with reverse-mode AD in a single pass.
>>>grad' (\[x,y,z] -> x*y+z) [1,2,3](5,[2,1,1])
gradWith :: (Traversable f, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> f bSource
gradWith' :: (Traversable f, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> (a, f b)Source
Higher Order Gradients (Sparse-on-Reverse)
grads :: (Traversable f, Num a) => (forall s. f (Sparse a s) -> Sparse a s) -> f a -> Cofree f aSource
Variadic Gradients (Sparse or Kahn)
Variadic combinators for variadic mixed-mode automatic differentiation.
Unfortunately, variadicity comes at the expense of being able to use
quantification to avoid sensitivity confusion, so be careful when
counting the number of auto calls you use when taking the gradient
of a function that takes gradients!
Jacobians (Sparse or Reverse)
jacobian :: (Traversable f, Functor g, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (f a)Source
Calculate the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between sparse and Reverse mode AD.
If you know that you have relatively many outputs per input, consider using jacobian.
>>>jacobian (\[x,y] -> [y,x,x+y,x*y,exp x * sin y]) [pi,1][[0.0,1.0],[1.0,0.0],[1.0,1.0],[1.0,3.141592653589793],[19.472221418841606,12.502969588876512]]
jacobian' :: (Traversable f, Functor g, Num a) => (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (a, f a)Source
Calculate both the answer and Jacobian of a non-scalar-to-non-scalar function, using reverse-mode AD.
If you have relatively many outputs per input, consider using jacobian'.
jacobianWith :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (f b)Source
calculates the Jacobian of a non-scalar-to-non-scalar function, using Reverse mode AD.
jacobianWith g f
The resulting Jacobian matrix is then recombined element-wise with the input using g.
If you know that you have relatively many outputs per input, consider using jacobianWith.
jacobianWith' :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (forall s. Reifies s Tape => f (Reverse a s) -> g (Reverse a s)) -> f a -> g (a, f b)Source
calculates the answer and Jacobian of a non-scalar-to-non-scalar function, using Reverse mode AD.
jacobianWith' g f
The resulting Jacobian matrix is then recombined element-wise with the input using g.
If you know that you have relatively many outputs per input, consider using jacobianWith'.
Higher Order Jacobian (Sparse-on-Reverse)
jacobians :: (Traversable f, Functor g, Num a) => (forall s. f (Sparse a s) -> g (Sparse a s)) -> f a -> g (Cofree f a)Source
Transposed Jacobians (Forward Mode)
jacobianT :: (Traversable f, Functor g, Num a) => (forall s. f (Forward a s) -> g (Forward a s)) -> f a -> f (g a)Source
A fast, simple, transposed Jacobian computed with forward-mode AD.
jacobianWithT :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (forall s. f (Forward a s) -> g (Forward a s)) -> f a -> f (g b)Source
A fast, simple, transposed Jacobian computed with Forward mode AD that combines the output with the input.
Hessian (Sparse-On-Reverse)
hessian :: (Traversable f, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Sparse a s') s)) -> On (Reverse (Sparse a s') s)) -> f a -> f (f a)Source
Compute the Hessian via the Jacobian of the gradient. gradient is computed in reverse mode and then the Jacobian is computed in sparse (forward) mode.
>>>hessian (\[x,y] -> x*y) [1,2][[0,1],[1,0]]
hessian' :: (Traversable f, Num a) => (forall s. f (Sparse a s) -> Sparse a s) -> f a -> (a, f (a, f a))Source
Hessian Tensors (Sparse or Sparse-On-Reverse)
hessianF :: (Traversable f, Functor g, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Sparse a s') s)) -> g (On (Reverse (Sparse a s') s))) -> f a -> g (f (f a))Source
Compute the order 3 Hessian tensor on a non-scalar-to-non-scalar function using 'Sparse'-on-'Reverse'
>>>hessianF (\[x,y] -> [x*y,x+y,exp x*cos y]) [1,2][[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]],[[-1.1312043837568135,-2.4717266720048188],[-2.4717266720048188,1.1312043837568135]]]
Hessian Tensors (Sparse)
hessianF' :: (Traversable f, Functor g, Num a) => (forall s. f (Sparse a s) -> g (Sparse a s)) -> f a -> g (a, f (a, f a))Source
Hessian Vector Products (Forward-On-Reverse)
hessianProduct :: (Traversable f, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Forward a s') s)) -> On (Reverse (Forward a s') s)) -> f (a, a) -> f aSource
computes the product of the hessian hessianProduct f wvH of a non-scalar-to-scalar function f at w = with a vector fst $ wvv = snd $ wv using "Pearlmutter's method" from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.6143, which states:
H v = (d/dr) grad_w (w + r v) | r = 0
Or in other words, we take the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.
hessianProduct' :: (Traversable f, Num a) => (forall s s'. Reifies s Tape => f (On (Reverse (Forward a s') s)) -> On (Reverse (Forward a s') s)) -> f (a, a) -> f (a, a)Source
computes both the gradient of a non-scalar-to-scalar hessianProduct' f wvf at w = and the product of the hessian fst $ wvH at w with a vector v = snd $ wv using "Pearlmutter's method". The outputs are returned wrapped in the same functor.
H v = (d/dr) grad_w (w + r v) | r = 0
Or in other words, we return the gradient and the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.
Derivatives (Forward Mode)
diff :: Num a => (forall s. Forward a s -> Forward a s) -> a -> aSource
The diff function calculates the first derivative of a scalar-to-scalar function by forward-mode AD
>>>diff sin 01.0
Derivatives (Tower)
Directional Derivatives (Forward Mode)
du :: (Functor f, Num a) => (forall s. f (Forward a s) -> Forward a s) -> f (a, a) -> aSource
Compute the directional derivative of a function given a zipped up Functor of the input values and their derivatives
du' :: (Functor f, Num a) => (forall s. f (Forward a s) -> Forward a s) -> f (a, a) -> (a, a)Source
Compute the answer and directional derivative of a function given a zipped up Functor of the input values and their derivatives
duF :: (Functor f, Functor g, Num a) => (forall s. f (Forward a s) -> g (Forward a s)) -> f (a, a) -> g aSource
Compute a vector of directional derivatives for a function given a zipped up Functor of the input values and their derivatives.
duF' :: (Functor f, Functor g, Num a) => (forall s. f (Forward a s) -> g (Forward a s)) -> f (a, a) -> g (a, a)Source
Compute a vector of answers and directional derivatives for a function given a zipped up Functor of the input values and their derivatives.
Directional Derivatives (Tower)
dusF :: (Functor f, Functor g, Num a) => (forall s. f (Tower a s) -> g (Tower a s)) -> f [a] -> g [a]Source
dus0F :: (Functor f, Functor g, Num a) => (forall s. f (Tower a s) -> g (Tower a s)) -> f [a] -> g [a]Source
Taylor Series (Tower)
taylor :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> a -> [a]Source
taylor0 :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> a -> [a]Source
Maclaurin Series (Tower)
maclaurin :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> [a]Source
maclaurin0 :: Fractional a => (forall s. Tower a s -> Tower a s) -> a -> [a]Source
Gradient Descent
gradientDescent :: (Traversable f, Fractional a, Ord a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> [f a]Source
The gradientDescent function performs a multivariate
optimization, based on the naive-gradient-descent in the file
stalingrad/examples/flow-tests/pre-saddle-1a.vlad from the
VLAD compiler Stalingrad sources. Its output is a stream of
increasingly accurate results. (Modulo the usual caveats.)
It uses reverse mode automatic differentiation to compute the gradient.
gradientAscent :: (Traversable f, Fractional a, Ord a) => (forall s. Reifies s Tape => f (Reverse a s) -> Reverse a s) -> f a -> [f a]Source
Perform a gradient descent using reverse mode automatic differentiation to compute the gradient.
conjugateGradientDescent :: (Traversable f, Ord a, Fractional a) => (forall s1 s2 s3 s4. Chosen s4 => f (Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> f a -> [f a]Source
Perform a conjugate gradient descent using reverse mode automatic differentiation to compute the gradient, and using forward-on-forward mode for computing extrema.
>>>let sq x = x * x>>>let rosenbrock [x,y] = sq (1 - x) + 100 * sq (y - sq x)>>>rosenbrock [0,0]1>>>rosenbrock (conjugateGradientDescent rosenbrock [0, 0] !! 5) < 0.1True
conjugateGradientAscent :: (Traversable f, Ord a, Fractional a) => (forall s1 s2 s3 s4. Chosen s4 => f (Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> Or (On (Forward (Forward a s1) s2)) (Kahn a s3) s4) -> f a -> [f a]Source
Perform a conjugate gradient ascent using reverse mode automatic differentiation to compute the gradient.