Numeric.AD.Internal.Classes
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

Primal is used by deriveMode but is not exposed via the Mode class, to prevent its abuse by end users via the AD data type. It provides direct access to the result, stripped of its derivative information, but this is unsafe in general, as (lift . primal) would discard derivative information. The end user is protected from accidentally using this function by the universal quantification on the various combinators we expose.

isKnownConstant: allowed to return False for items with a zero derivative, but then we give more NaNs than strictly necessary.

isKnownZero: allowed to return False for zero, but then we give more NaNs than strictly necessary.

lift: embed a constant.

(<+>): vector sum.

(*^): scalar-vector multiplication.

(^*): vector-scalar multiplication.

(^/): scalar division.

(<**>): exponentiation; this should be overloaded if you can figure out anything about what is constant!

zero:

> zero = lift 0

deriveLifted t provides

> instance Lifted $t

given supplied instances for

> instance Lifted $t => Primal $t where ...
> instance Lifted $t => Jacobian $t where ...

The seemingly redundant $t constraints are caused by Template Haskell staging restrictions.

deriveNumeric f g provides the following instances:

> instance (Lifted $f, Num a, Enum a) => Enum ($g a)
> instance (Lifted $f, Num a, Eq a) => Eq ($g a)
> instance (Lifted $f, Num a, Ord a) => Ord ($g a)
> instance (Lifted $f, Num a, Bounded a) => Bounded ($g a)
> instance (Lifted $f, Show a) => Show ($g a)
> instance (Lifted $f, Num a) => Num ($g a)
> instance (Lifted $f, Fractional a) => Fractional ($g a)
> instance (Lifted $f, Floating a) => Floating ($g a)
> instance (Lifted $f, RealFloat a) => RealFloat ($g a)
> instance (Lifted $f, RealFrac a) => RealFrac ($g a)
> instance (Lifted $f, Real a) => Real ($g a)

Numeric.AD.Types
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

AD serves as a common wrapper for different Mode instances, exposing a traditional numerical tower. Universal quantification is used to limit the actions in user code to machinery that will return the same answers under all AD modes, allowing us to use modes interchangeably, as both the type-level "brand" and the dictionary, providing a common API.

FF: a non-scalar-to-non-scalar automatically-differentiable function.

FU: a non-scalar-to-scalar automatically-differentiable function.

UF: a scalar-to-non-scalar automatically-differentiable function.

UU: a scalar-to-scalar automatically-differentiable function.
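As a usage sketch of what this universal quantification buys (hypothetical example code, not from the library itself; it assumes only that diff from Numeric.AD accepts a mode-polymorphic scalar-to-scalar function, as described above):

> import Numeric.AD (diff)
>
> -- A mode-polymorphic scalar-to-scalar function: because it must
> -- work under every AD mode, it can only use the shared numerical
> -- tower and has no way to strip derivative information with primal.
> f :: Floating a => a -> a
> f x = exp (sin x)
>
> main :: IO ()
> main = print (diff f 0)  -- f'(x) = cos x * exp (sin x), so f'(0) = 1.0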
Numeric.AD.Internal.Composition
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

ComposeMode: the composition of two AD modes is an AD mode in its own right.

ComposeFunctor: functor composition, used to nest the use of jacobian and grad.

Numeric.AD.Mode.Forward
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

The diff function calculates the first derivative of a scalar-to-scalar function by forward-mode AD.

> diff sin == cos

The diff' function calculates the result and first derivative of a scalar-to-scalar function by forward-mode AD.

> diff' sin == sin &&& cos
> diff' f = f &&& diff f

The diffF function calculates the first derivative of a scalar-to-non-scalar function by forward-mode AD.

The diffF' function calculates the result and first derivative of a scalar-to-non-scalar function by forward-mode AD.

jacobianT: a fast, simple transposed Jacobian computed with forward-mode AD.

jacobianWithT: a fast, simple transposed Jacobian computed with forward-mode AD.

hessianProduct: compute the product of a vector with the Hessian using forward-on-forward-mode AD.

hessianProduct': compute the gradient and Hessian product using forward-on-forward-mode AD.

Numeric.AD.Internal.Tower
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

Tower is an AD Mode that calculates a tangent tower by forward AD, and provides fast diffsUU and diffsUF.

Numeric.AD.Internal.Reverse
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

Reverse is a Mode using reverse-mode automatic differentiation that provides fast diffFU, diff2FU, grad and grad2, and a fast jacobian when you have a significantly smaller number of outputs than inputs.

A Tape records the information needed to back-propagate from the output to each input during reverse-mode AD.

Var: used to mark variables for inspection during the reverse pass.

partials: returns a list of contributions to the partials. The variable IDs returned in the list are likely not unique!

partialArray: return an Array of partials given bounds for the variable IDs.

partialMap: return an IntMap of sparse partials.

Numeric.AD.Mode.Reverse
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

The grad function calculates the gradient of a non-scalar-to-scalar function with reverse-mode AD in a single pass.

The grad' function calculates the result and gradient of a non-scalar-to-scalar function with reverse-mode AD in a single pass.

gradWith g f calculates the gradient of a non-scalar-to-scalar function f with reverse-mode AD in a single pass. The gradient is combined element-wise with the argument using the function g.

> grad == gradWith (\_ dx -> dx)
> id == gradWith const

gradWith' g f calculates the result and gradient of a non-scalar-to-scalar function f with reverse-mode AD in a single pass. The gradient is combined element-wise with the argument using the function g.

> grad' == gradWith' (\_ dx -> dx)

The jacobian function calculates the Jacobian of a non-scalar-to-non-scalar function with reverse-mode AD lazily in m passes for m outputs.

The jacobian' function calculates both the result and the Jacobian of a non-scalar-to-non-scalar function, using m invocations of reverse-mode AD, where m is the output dimensionality. Applying fmap snd to the result will recover the result of jacobian. (An alias for gradF'.)

jacobianWith g f calculates the Jacobian of a non-scalar-to-non-scalar function f with reverse-mode AD lazily in m passes for m outputs. Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using g.

> jacobian == jacobianWith (\_ dx -> dx)
> jacobianWith const == (\f x -> const x <$> f x)
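A runnable sketch of the reverse-mode combinators documented above (illustrative only; it assumes Numeric.AD.Mode.Reverse exports grad and jacobian as documented here, and the list lambdas use partial patterns for brevity):

> import Numeric.AD.Mode.Reverse (grad, jacobian)
>
> main :: IO ()
> main = do
>   -- gradient of f(x,y) = x*y + sin x at (1,2): [y + cos x, x]
>   print (grad (\[x, y] -> x * y + sin x) [1, 2])
>   -- Jacobian of g(x,y) = (x*y, x+y) at (2,3): [[3,2],[1,1]]
>   print (jacobian (\[x, y] -> [x * y, x + y]) [2, 3])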
jacobianWith' g f calculates both the result and the Jacobian of a non-scalar-to-non-scalar function f, using m invocations of reverse-mode AD, where m is the output dimensionality. Applying fmap snd to the result will recover the result of jacobianWith. Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using g.

> jacobian' == jacobianWith' (\_ dx -> dx)

The diff' function calculates the value and derivative, as a pair, of a scalar-to-scalar function.

hessian: compute the Hessian via the Jacobian of the gradient. The gradient is computed in reverse mode and then the Jacobian is computed in reverse mode. However, since grad f :: f a -> f a is square, this is not as fast as the forward-mode Jacobian of a reverse-mode gradient provided by the mixed-mode hessian.

hessianF: compute the order-3 Hessian tensor on a non-scalar-to-non-scalar function via the reverse-mode Jacobian of the reverse-mode Jacobian of the function. Less efficient than the mixed-mode hessianF.

Numeric.AD.Internal.Sparse
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

We only store partials in sorted order, so the map contained in a partial will only contain partials with keys equal to or greater than those of the map in which it was found. This should be key for efficiently computing sparse Hessians.

There are only (n + k - 1) `choose` n distinct nth partial derivatives of a function with k inputs.

Numeric.AD.Halley
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

The findZero function finds a zero of a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1  -- converges to 2.0

> import Data.Complex
> take 10 $ findZero ((+1) . (^2)) (1 :+ 1)  -- converges to (0 :+ 1)

The inverse function inverts a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Note: the take 10 $ inverse sqrt 1 (sqrt 10) example that works for Newton's method fails with Halley's method, because the preconditions do not hold.

The fixedPoint function finds a fixed point of a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1  -- converges to 0.7390851332151607

The extremum function finds an extremum of a scalar function using Halley's method; it produces a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ extremum cos 1  -- converges to 0

Numeric.AD.Newton
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

The findZero function finds a zero of a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1  -- converges to 2.0

> import Data.Complex
> take 10 $ findZero ((+1) . (^2)) (1 :+ 1)  -- converges to (0 :+ 1)

The inverse function inverts a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Example:

> take 10 $ inverse sqrt 1 (sqrt 10)  -- converges to 10

The fixedPoint function finds a fixed point of a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1  -- converges to 0.7390851332151607
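A small runnable sketch of these combinators (assuming Numeric.AD.Newton exports findZero and fixedPoint as documented above; the iterate counts are arbitrary):

> import Numeric.AD.Newton (findZero, fixedPoint)
>
> main :: IO ()
> main = do
>   -- Newton iterates for a root of x^2 - 4, starting from 1
>   print (take 10 (findZero (\x -> x ^ 2 - 4) 1))
>   -- iterates converging to the fixed point of cos
>   print (take 10 (fixedPoint cos 1))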
The extremum function finds an extremum of a scalar function using Newton's method; it produces a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ extremum cos 1  -- converges to 0

The gradientDescent function performs a multivariate optimization, based on the naive-gradient-descent in the file stalingrad/examples/flow-tests/pre-saddle-1a.vlad from the VLAD compiler Stalingrad sources. Its output is a stream of increasingly accurate results. (Modulo the usual caveats.) It uses reverse-mode automatic differentiation to compute the gradient.

Numeric.AD.Mode.Mixed
  (portability: GHC only; stability: experimental; maintainer: ekmett@gmail.com)

jacobian calculates the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the number of inputs and outputs. If you know the relative number of inputs and outputs, consider using forward- or reverse-mode AD directly.

jacobian' calculates both the answer and the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the relative number of inputs and outputs. If you know the relative number of inputs and outputs, consider using forward- or reverse-mode AD directly.

jacobianWith g f calculates the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the number of inputs and outputs. The resulting Jacobian matrix is then recombined element-wise with the input using g. If you know the relative number of inputs and outputs, consider using forward- or reverse-mode AD directly.

jacobianWith' g f calculates the answer and Jacobian of a non-scalar-to-non-scalar function, automatically choosing between sparse and reverse-mode AD based on the number of inputs and outputs. The resulting Jacobian matrix is then recombined element-wise with the input using g. If you know the relative number of inputs and outputs, consider using sparse or reverse-mode AD directly.

hessianProduct f wv computes the product of the Hessian H of a non-scalar-to-scalar function f at w = fst <$> wv with a vector v = snd <$> wv using "Pearlmutter's method" from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.6143, which states:

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we take the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.

hessianProduct' f wv computes both the gradient of a non-scalar-to-scalar function f at w = fst <$> wv and the product of the Hessian H at w with a vector v = snd <$> wv using "Pearlmutter's method". The outputs are returned wrapped in the same functor.

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we return the gradient and the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.

hessian: compute the Hessian via the Jacobian of the gradient. The gradient is computed in reverse mode, and then the Jacobian is computed in sparse (forward) mode.
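As a concreteness check on the entry above, a hedged sketch (it assumes hessian is exported from Numeric.AD, as the package index indicates; the list lambda is a partial pattern used for brevity):

> import Numeric.AD (hessian)
>
> -- Hessian of f(x,y) = x*y at (1,2): the mixed second partials
> -- are 1 and the pure ones are 0, so expect [[0,1],[1,0]].
> main :: IO ()
> main = print (hessian (\[x, y] -> x * y) [1, 2])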
hessianF: compute the order-3 Hessian tensor on a non-scalar-to-non-scalar function using Sparse or Sparse-on-Reverse AD.
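And a matching sketch for the tensor variant (again assuming hessianF is exported from Numeric.AD; illustrative only):

> import Numeric.AD (hessianF)
>
> -- One Hessian per output of g(x,y) = (x*y, x+y):
> -- expect [[[0,1],[1,0]],[[0,0],[0,0]]].
> main :: IO ()
> main = print (hessianF (\[x, y] -> [x * y, x + y]) [1, 2])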
Modules provided by ad-1.4:

Numeric.AD, Numeric.AD.Types, Numeric.AD.Halley, Numeric.AD.Newton, Numeric.AD.Mode.Forward, Numeric.AD.Mode.Reverse, Numeric.AD.Mode.Tower, Numeric.AD.Mode.Sparse, Numeric.AD.Mode.Mixed, Numeric.AD.Mode.Directed, Numeric.AD.Internal.Classes, Numeric.AD.Internal.Combinators, Numeric.AD.Internal.Composition, Numeric.AD.Internal.Dense, Numeric.AD.Internal.Forward, Numeric.AD.Internal.Identity, Numeric.AD.Internal.Reverse, Numeric.AD.Internal.Sparse, Numeric.AD.Internal.Tensors, Numeric.AD.Internal.Tower, Numeric.AD.Internal.Types