ad-0.45.0: Automatic Differentiation

Portability: GHC only
Stability:   experimental
Maintainer:  ekmett@gmail.com

All modules below share this portability, stability and maintainer.

Numeric.AD.Internal.Comonad
---------------------------

While we cannot be a 'Comonad' without an 'fzip'-like operation, you can use
this comonad for @AD f a@ to manipulate a structure comonadically that you can
then turn back into 'AD'.

Numeric.AD.Internal.Classes
---------------------------

'primal' is used by 'deriveMode' but is not exposed via the 'Mode' class to
prevent its abuse by end users via the 'AD' data type. It provides direct
access to the result, stripped of its derivative information, but this is
unsafe in general, as @(lift . primal)@ would discard derivative information.
The end user is protected from accidentally using this function by the
universal quantification on the various combinators we expose.

The 'Mode' class provides:

  'lift'  -- Embed a constant
  '<+>'   -- Vector sum
  '*^'    -- Scalar-vector multiplication
  '^*'    -- Vector-scalar multiplication
  '^/'    -- Scalar division
  'zero'  -- 'zero' = 'lift' 0

'deriveLifted' t provides

> instance Lifted $t

given supplied instances for

> instance Lifted $t => Primal $t where ...
> instance Lifted $t => Jacobian $t where ...

The seemingly redundant @Lifted $t@ constraints are caused by Template Haskell
staging restrictions.

'liftedMembers': Find all the members defined in the 'Lifted' data type.

'deriveNumeric' f g provides the following instances:

> instance ('Lifted' $f, 'Num' a, 'Enum' a) => 'Enum' ($g a)
> instance ('Lifted' $f, 'Num' a, 'Eq' a) => 'Eq' ($g a)
> instance ('Lifted' $f, 'Num' a, 'Ord' a) => 'Ord' ($g a)
> instance ('Lifted' $f, 'Num' a, 'Bounded' a) => 'Bounded' ($g a)
> instance ('Lifted' $f, 'Show' a) => 'Show' ($g a)
> instance ('Lifted' $f, 'Num' a) => 'Num' ($g a)
> instance ('Lifted' $f, 'Fractional' a) => 'Fractional' ($g a)
> instance ('Lifted' $f, 'Floating' a) => 'Floating' ($g a)
> instance ('Lifted' $f, 'RealFloat' a) => 'RealFloat' ($g a)
> instance ('Lifted' $f, 'RealFrac' a) => 'RealFrac' ($g a)
> instance ('Lifted' $f, 'Real' a) => 'Real' ($g a)

Numeric.AD.Internal.Types
-------------------------

'AD' serves as a common wrapper for different 'Mode' instances, exposing a
traditional numerical tower. Universal quantification is used to limit the
actions in user code to machinery that will return the same answers under all
AD modes, allowing us to use modes interchangeably as both the type-level
"brand" and the dictionary, providing a common API.

'FF': A non-scalar-to-non-scalar automatically-differentiable function.
'FU': A non-scalar-to-scalar automatically-differentiable function.
'UF': A scalar-to-non-scalar automatically-differentiable function.
'UU': A scalar-to-scalar automatically-differentiable function.
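As a concrete, non-authoritative sketch of the protection described above
(assuming the top-level Numeric.AD API of this release): user code is written
polymorphically against the ordinary numerical tower, so the same function can
be run under any mode but can never reach 'primal'.

> import Numeric.AD
>
> -- A user-supplied function over the ordinary numerical tower. It can
> -- be instantiated at @AD s Double@ for any mode @s@, but cannot
> -- observe or strip derivative information itself.
> f :: Floating a => a -> a
> f x = exp (x * x) - sin x
>
> main :: IO ()
> main = print (diff f (2 :: Double)) -- first derivative of f at 2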
Numeric.AD.Internal.Composition
-------------------------------

'ComposeMode': The composition of two AD modes is an AD mode in its own right.

'ComposeFunctor': Functor composition, used to nest the use of 'jacobian' and
'grad'.

Numeric.AD.Mode.Forward
-----------------------

The 'diff' function calculates the first derivative of a scalar-to-scalar
function by forward-mode AD.

> diff sin == cos

The 'd'' function calculates the result and first derivative of a
scalar-to-scalar function by forward-mode AD.

> d' sin == sin &&& cos
> d' f = f &&& d f

The 'diffF' function calculates the first derivative of a
scalar-to-non-scalar function by forward-mode AD.

The 'diffF'' function calculates the result and first derivative of a
scalar-to-non-scalar function by forward-mode AD.

'jacobianT': A fast, simple transposed Jacobian computed with forward-mode AD.

'jacobianWithT': A fast, simple transposed Jacobian computed with
forward-mode AD.

'hessianProduct': Compute the product of a vector with the Hessian using
forward-on-forward-mode AD.

'hessianProduct'': Compute the gradient and Hessian product using
forward-on-forward-mode AD.

Numeric.AD.Internal.Tower
-------------------------

'Tower' is an AD 'Mode' that calculates a tangent tower by forward AD, and
provides fast 'diffsUU' and 'diffsUF'.

Numeric.AD.Internal.Reverse
---------------------------

'Reverse' is a 'Mode' using reverse-mode automatic differentiation that
provides fast 'diffFU', 'diff2FU', 'grad', 'grad2' and a fast 'jacobian' when
you have a significantly smaller number of outputs than inputs.

A 'Tape' records the information needed to back propagate from the output to
each input during reverse-mode AD.

'Var' is used to mark variables for inspection during the reverse pass.

'backPropagate': back propagate sensitivities along a tape.

'partials': This returns a list of contributions to the partials. The
variable ids returned in the list are likely not unique!

'partialArray': Return an 'Array' of partials given bounds for the variable
IDs.

'partialMap': Return an 'IntMap' of sparse partials.

Numeric.AD.Mode.Reverse
-----------------------

The 'grad' function calculates the gradient of a non-scalar-to-scalar
function with reverse-mode AD in a single pass.

The 'grad'' function calculates the result and gradient of a
non-scalar-to-scalar function with reverse-mode AD in a single pass.

'gradWith' g f calculates the gradient of a non-scalar-to-scalar function @f@
with reverse-mode AD in a single pass. The gradient is combined element-wise
with the argument using the function @g@.

> grad == gradWith (\_ dx -> dx)
> id == gradWith const

'gradWith'' g f calculates the result and gradient of a non-scalar-to-scalar
function @f@ with reverse-mode AD in a single pass. The gradient is combined
element-wise with the argument using the function @g@.

> grad' == gradWith' (\_ dx -> dx)

The 'jacobian' function calculates the Jacobian of a non-scalar-to-non-scalar
function with reverse-mode AD lazily in @m@ passes for @m@ outputs.

The 'jacobian'' function calculates both the result and the Jacobian of a
non-scalar-to-non-scalar function, using @m@ invocations of reverse-mode AD,
where @m@ is the output dimensionality. Applying @fmap snd@ to the result
will recover the result of 'jacobian'.

An alias for 'gradF''.

'jacobianWith' g f calculates the Jacobian of a non-scalar-to-non-scalar
function @f@ with reverse-mode AD lazily in @m@ passes for @m@ outputs.
Instead of returning the Jacobian matrix, the elements of the matrix are
combined with the input using @g@.

> jacobian == jacobianWith (\_ dx -> dx)
> jacobianWith const == (\f x -> const x <$> f x)

'jacobianWith'' g f calculates both the result and the Jacobian of a
non-scalar-to-non-scalar function @f@, using @m@ invocations of reverse-mode
AD, where @m@ is the output dimensionality. Applying @fmap snd@ to the result
will recover the result of 'jacobianWith'. Instead of returning the Jacobian
matrix, the elements of the matrix are combined with the input using @g@.

> jacobian' == jacobianWith' (\_ dx -> dx)

The 'd'' function calculates the value and derivative, as a pair, of a
scalar-to-scalar function.

'hessian': Compute the Hessian via the Jacobian of the gradient. The gradient
is computed in reverse mode and then the Jacobian is computed in reverse
mode. However, since @grad f :: f a -> f a@ is square this is not as fast as
using the forward-mode Jacobian of a reverse-mode gradient provided by
Numeric.AD.hessian.

'hessianF': Compute the order 3 Hessian tensor on a non-scalar-to-non-scalar
function via the reverse-mode Jacobian of the reverse-mode Jacobian of the
function. Less efficient than Numeric.AD.Mode.Mixed.hessianF.
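A short usage sketch of the gradient combinators documented above; the
function and inputs are illustrative, not taken from the package itself.

> import Numeric.AD.Mode.Reverse
>
> -- Gradient of a two-input scalar field in a single reverse pass:
> -- the partials are [y + cos x, x], evaluated at (1,2).
> gradExample :: [Double]
> gradExample = grad (\[x, y] -> x * y + sin x) [1, 2]
>
> -- Result and gradient together.
> grad'Example :: (Double, [Double])
> grad'Example = grad' (\[x, y] -> x * y + sin x) [1, 2]
>
> -- gradWith recombines each partial with its input; pairing them up
> -- shows the shape promised by grad == gradWith (\_ dx -> dx).
> gradWithExample :: [(Double, Double)]
> gradWithExample = gradWith (,) (\[x, y] -> x * y + sin x) [1, 2]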
Numeric.AD.Internal.Sparse
--------------------------

We only store partials in sorted order, so the map contained in a partial
will only contain partials with equal or greater keys to that of the map in
which it was found. This should be key for efficiently computing sparse
hessians: there are only (n + k - 1) choose k distinct nth partial
derivatives of a function with k inputs.

'dropMap': drop keys below a given value.

Numeric.AD.Halley
-----------------

The 'findZero' function finds a zero of a scalar function using Halley's
method; its output is a stream of increasingly accurate results. (Modulo the
usual caveats.)

Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1 -- converge to 2.0

> import Data.Complex
> take 10 $ findZero ((+1).(^2)) (1 :+ 1) -- converge to (0 :+ 1)

The 'inverse' function inverts a scalar function using Halley's method; its
output is a stream of increasingly accurate results. (Modulo the usual
caveats.)

Note: the @take 10 $ inverse sqrt 1 (sqrt 10)@ example that works for
Newton's method fails with Halley's method because the preconditions do not
hold.

The 'fixedPoint' function finds a fixed point of a scalar function using
Halley's method; its output is a stream of increasingly accurate results.
(Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1 -- converges to 0.7390851332151607

The 'extremum' function finds an extremum of a scalar function using Halley's
method; it produces a stream of increasingly accurate results. (Modulo the
usual caveats.)

> take 10 $ extremum cos 1 -- converges to 0
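The documented Halley examples above, collected into one runnable sketch
(imports and top-level names assumed):

> import Numeric.AD.Halley (extremum, findZero, fixedPoint)
> import Data.Complex (Complex (..))
>
> zeros :: [Double]
> zeros = take 10 $ findZero (\x -> x ^ 2 - 4) 1 -- converges to 2.0
>
> complexZeros :: [Complex Double]
> complexZeros = take 10 $ findZero ((+ 1) . (^ 2)) (1 :+ 1) -- to 0 :+ 1
>
> fixed :: [Double]
> fixed = take 10 $ fixedPoint cos 1 -- converges to 0.7390851332151607
>
> extrema :: [Double]
> extrema = take 10 $ extremum cos 1 -- converges to 0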
Numeric.AD.Newton
-----------------

The 'findZero' function finds a zero of a scalar function using Newton's
method; its output is a stream of increasingly accurate results. (Modulo the
usual caveats.)

Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1 -- converge to 2.0

> import Data.Complex
> take 10 $ findZero ((+1).(^2)) (1 :+ 1) -- converge to (0 :+ 1)

The 'inverseNewton' function inverts a scalar function using Newton's method;
its output is a stream of increasingly accurate results. (Modulo the usual
caveats.)

Example:

> take 10 $ inverseNewton sqrt 1 (sqrt 10) -- converges to 10

The 'fixedPoint' function finds a fixed point of a scalar function using
Newton's method; its output is a stream of increasingly accurate results.
(Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1 -- converges to 0.7390851332151607

The 'extremum' function finds an extremum of a scalar function using Newton's
method; it produces a stream of increasingly accurate results. (Modulo the
usual caveats.)

> take 10 $ extremum cos 1 -- converges to 0

The 'gradientDescent' function performs a multivariate optimization, based on
the naive-gradient-descent in the file
@stalingrad/examples/flow-tests/pre-saddle-1a.vlad@ from the VLAD compiler
Stalingrad sources. Its output is a stream of increasingly accurate results.
(Modulo the usual caveats.)

It uses reverse-mode automatic differentiation to compute the gradient.

Numeric.AD.Mode.Mixed
---------------------

'jacobian': Calculate the Jacobian of a non-scalar-to-non-scalar function,
automatically choosing between forward and reverse mode AD based on the
number of inputs and outputs.

If you know the relative number of inputs and outputs, consider
Numeric.AD.Reverse.jacobian or Numeric.AD.Sparse.jacobian.

'jacobian'': Calculate both the answer and Jacobian of a
non-scalar-to-non-scalar function, automatically choosing between forward and
reverse mode AD based on the relative number of inputs and outputs.

If you know the relative number of inputs and outputs, consider
Numeric.AD.Reverse.jacobian' or Numeric.AD.Sparse.jacobian'.

'jacobianWith' g f calculates the Jacobian of a non-scalar-to-non-scalar
function, automatically choosing between forward and reverse mode AD based on
the number of inputs and outputs. The resulting Jacobian matrix is then
recombined element-wise with the input using @g@.

If you know the relative number of inputs and outputs, consider
Numeric.AD.Reverse.jacobianWith or Numeric.AD.Sparse.jacobianWith.

'jacobianWith'' g f calculates the answer and Jacobian of a
non-scalar-to-non-scalar function, automatically choosing between sparse and
reverse mode AD based on the number of inputs and outputs. The resulting
Jacobian matrix is then recombined element-wise with the input using @g@.

If you know the relative number of inputs and outputs, consider
Numeric.AD.Reverse.jacobianWith' or Numeric.AD.Sparse.jacobianWith'.

'hessianProduct' f wv computes the product of the Hessian @H@ of a
non-scalar-to-scalar function @f@ at @w = fst <$> wv@ with a vector
@v = snd <$> wv@ using "Pearlmutter's method" from
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.6143, which
states:

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we take the directional derivative of the gradient. The
gradient is calculated in reverse mode, then the directional derivative is
calculated in forward mode.

'hessianProduct'' f wv computes both the gradient of a non-scalar-to-scalar
@f@ at @w = fst <$> wv@ and the product of the Hessian @H@ at @w@ with a
vector @v = snd <$> wv@ using "Pearlmutter's method". The outputs are
returned wrapped in the same functor.

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we return the gradient and the directional derivative of
the gradient. The gradient is calculated in reverse mode, then the
directional derivative is calculated in forward mode.

'hessian': Compute the Hessian via the Jacobian of the gradient. The gradient
is computed in reverse mode and then the Jacobian is computed in sparse
(forward) mode.

'hessianF': Compute the order 3 Hessian tensor on a non-scalar-to-non-scalar
function using Sparse or Sparse-on-Reverse.
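A worked sketch of 'hessianProduct' against the Pearlmutter identity above.
The example function and numbers are illustrative, and the import assumes
the top-level Numeric.AD re-export suggested by the index below.

> import Numeric.AD (hessianProduct)
>
> -- For f(x,y) = x^2 * y: grad f = (2xy, x^2), Hessian = [[2y,2x],[2x,0]].
> -- At w = (1,2) the Hessian is [[4,2],[2,0]], so with v = (3,4) we
> -- expect H v = (4*3 + 2*4, 2*3 + 0*4) = (20, 6).
> hv :: [Double]
> hv = hessianProduct (\[x, y] -> x * x * y) [(1, 3), (2, 4)]
> -- [20.0, 6.0], computed without forming the full Hessian: one
> -- reverse-mode gradient, differentiated once in forward mode.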
!"#$%&GHC only experimentalekmett@gmail.com '()*+,-./01 !"ghijklm'()*+,-./0101./,-mlkj'+*)( !"ghi '+*)(()*+,-./01GHC only experimentalekmett@gmail.comS  !"ghijklmnopqrstuvw !"#$%&^ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~                             !" # $%&'()*+,-. ad-0.45.0Numeric.AD.Internal.CombinatorsNumeric.AD.ClassesNumeric.AD.TypesNumeric.AD.Internal.ClassesNumeric.AD.Internal.CompositionNumeric.AD.Internal.ForwardNumeric.AD.Mode.ForwardNumeric.AD.Internal.TowerNumeric.AD.Internal.ReverseNumeric.AD.Mode.ReverseNumeric.AD.Internal.SparseNumeric.AD.Internal.DenseNumeric.AD.Mode.TowerNumeric.AD.HalleyNumeric.AD.Mode.SparseNumeric.AD.NewtonNumeric.AD.Mode.MixedNumeric.AD.Mode.DirectedNumeric.AD.Internal.ComonadNumeric.AD.Internal.StreamNumeric.AD.Internal.TensorsNumeric.AD.Internal.TypesNumeric.AD.Internal.Identity Numeric.ADonzipWithTzipWithDefaultTComonad duplicateextend CopointedextractStream:<headStailSunfoldSTensors:-tailTheadTtensorsJacobianDunarylift1lift1_binarylift2lift2_PrimalprimalModelift<+>*^^*^/zeroLifted showsPrec1==!compare1 fromInteger1+!*!-!negate1signum1abs1/!recip1 fromRational1 toRational1pi1exp1sqrt1log1**!logBase1sin1atan1acos1asin1tan1cos1sinh1atanh1acosh1asinh1tanh1cosh1properFraction1 truncate1floor1ceiling1round1 floatRadix1 floatDigits1 floatRange1 decodeFloat1 encodeFloat1 exponent1 significand1 scaleFloat1isNaN1isIEEE1isNegativeZero1isDenormalized1 isInfinite1atan21succ1pred1toEnum1 fromEnum1 enumFrom1 enumFromThen1 enumFromTo1enumFromThenTo1 minBound1 maxBound1Isoisoosione deriveLifted deriveNumericADrunADFFFUUFUUIdprobeunprobeprobedunprobedlowerUUlowerUFlowerFUlowerFF ComposeModerunComposeModeComposeFunctordecomposeFunctor composeMode decomposeModeForwardtangentunbundlebundleapplybindbind'bindWith bindWith' transposeWithdudu'duFduF'diffdiff'diffFdiffF' jacobianT jacobianWithTjacobian jacobianWith jacobian' jacobianWith'gradgrad'gradWith gradWith'hessianProducthessianProduct'TowergetTowerzeroPadzeroPadF transposePadFdd'tangentswithD getADTowertowerReverseTapeUnaryBinaryVarLiftGradpackunpackunpack'varvarId derivative derivative'partials partialArray partialMapunbind unbindWith unbindMapunbindMapWithDefaultvgradvgrad'hessianhessianFSparseIndex emptyIndex addToIndexindicesvarsskeletondspartialspartialGradspacksunpacksvgradsDenseds'diffsdiffs0diffsFdiffs0Ftaylortaylor0 maclaurin maclaurin0dusdus0dusFdus0FfindZeroinverse fixedPointextremumgrads jacobianshessian' hessianF'gradientDescentgradientAscent DirectionMixed streamTyCon consConstrstreamDataType tensorsTyCon$fCopointedTensorsnegOne withPrimalfromBy fromIntegral1square1 discrete1 discrete2 discrete3varA liftedMembers lowerInstanceadTyConadConstr adDataTypepidunpidcomposeFunctorTyConcomposeFunctorConstrcomposeFunctorDataTypecomposeModeTyConcomposeModeConstrcomposeModeDataTypeSrunS backPropagatebaseGHC.ArrArraycontainers-0.3.0.0 Data.IntMapIntMapdropMaptimessecondd2d2'NatZsizebig Data.Tuplefst