Package: ad-1.3.1

All modules share the same metadata: Portability: GHC only. Stability: experimental. Maintainer: ekmett@gmail.com. (Safe Haskell: Safe-Inferred or None, varying by module.)

Numeric.AD.Internal.Classes

Jacobian is used by deriveMode but is not exposed via Mode, to prevent its abuse by end users via the AD data type.

Primal is used by deriveMode but is not exposed via the Mode class, to prevent its abuse by end users via the AD data type. It provides direct access to the result, stripped of its derivative information, but this is unsafe in general, as (lift . primal) would discard derivative information. The end user is protected from accidentally using this function by the universal quantification on the various combinators we expose.

The Mode class provides the following combinators:

> lift  -- embed a constant
> <+>   -- vector sum
> *^    -- scalar-vector multiplication
> ^*    -- vector-scalar multiplication
> ^/    -- scalar division
> zero  -- 'zero' = 'lift' 0

deriveLifted t provides

> instance Lifted $t

given supplied instances for

> instance Lifted $t => Primal $t where ...
> instance Lifted $t => Jacobian $t where ...

The seemingly redundant Lifted $t constraints are caused by Template Haskell staging restrictions.

deriveNumeric f g provides the following instances:

> instance ('Lifted' $f, 'Num' a, 'Enum' a) => 'Enum' ($g a)
> instance ('Lifted' $f, 'Num' a, 'Eq' a) => 'Eq' ($g a)
> instance ('Lifted' $f, 'Num' a, 'Ord' a) => 'Ord' ($g a)
> instance ('Lifted' $f, 'Num' a, 'Bounded' a) => 'Bounded' ($g a)
> instance ('Lifted' $f, 'Show' a) => 'Show' ($g a)
> instance ('Lifted' $f, 'Num' a) => 'Num' ($g a)
> instance ('Lifted' $f, 'Fractional' a) => 'Fractional' ($g a)
> instance ('Lifted' $f, 'Floating' a) => 'Floating' ($g a)
> instance ('Lifted' $f, 'RealFloat' a) => 'RealFloat' ($g a)
> instance ('Lifted' $f, 'RealFrac' a) => 'RealFrac' ($g a)
> instance ('Lifted' $f, 'Real' a) => 'Real' ($g a)

Numeric.AD.Types

AD serves as a common wrapper for different Mode instances, exposing a traditional numerical tower. Universal quantification is used to limit the actions in user code to machinery that will return the same answers under all AD modes, allowing us to use modes interchangeably as both the type-level "brand" and the dictionary, providing a common API.

> FF -- A non-scalar-to-non-scalar automatically-differentiable function.
> FU -- A non-scalar-to-scalar automatically-differentiable function.
> UF -- A scalar-to-non-scalar automatically-differentiable function.
> UU -- A scalar-to-scalar automatically-differentiable function.
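To make the quantification argument concrete, here is a minimal sketch of how a scalar-to-scalar alias of this shape works and why user code cannot reach primal. The alias UU' and both import lines are assumptions made for illustration; only the class and type names above are taken from this interface:

> {-# LANGUAGE RankNTypes #-}
> import Numeric.AD.Types (AD)
> import Numeric.AD.Internal.Classes (Mode)
>
> -- Assumed shape of the scalar-to-scalar alias: the mode 's' is
> -- universally quantified, so callers can only use the overloaded
> -- numerical tower on 'AD s a'.
> type UU' a = forall s. Mode s => AD s a -> AD s a
>
> -- Fine: only the numerical tower is used, so it works in any mode.
> square :: Num a => UU' a
> square x = x * x
>
> -- By contrast, something like  \x -> lift (primal x)  would not
> -- type-check at this quantified type: 'primal' is deliberately kept
> -- out of 'Mode', so a user cannot accidentally strip derivative
> -- information.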
Numeric.AD.Internal.Composition

ComposeMode: the composition of two AD modes is an AD mode in its own right.

ComposeFunctor: functor composition, used to nest the use of jacobian and grad.

Numeric.AD.Mode.Forward

The diff function calculates the first derivative of a scalar-to-scalar function by forward-mode AD.

> diff sin == cos

The diff' function calculates the result and first derivative of a scalar-to-scalar function by forward-mode AD.

> diff' sin == sin &&& cos
> diff' f = f &&& diff f

The diffF function calculates the first derivative of a scalar-to-non-scalar function by forward-mode AD.

The diffF' function calculates the result and first derivative of a scalar-to-non-scalar function by forward-mode AD.

jacobianT: a fast, simple, transposed Jacobian computed with forward-mode AD.

jacobianWithT: a fast, simple, transposed Jacobian computed with forward-mode AD, recombined with the input using a supplied function.

hessianProduct: compute the product of a vector with the Hessian using forward-on-forward-mode AD.

hessianProduct': compute the gradient and Hessian product using forward-on-forward-mode AD.

(Usage sketches for these forward-mode combinators appear after the reverse-mode entries below.)

Numeric.AD.Internal.Tower

Tower is an AD Mode that calculates a tangent tower by forward-mode AD, and provides fast diffsUU and diffsUF.

Numeric.AD.Internal.Reverse

Reverse is a Mode using reverse-mode automatic differentiation. It provides fast diffFU, diff2FU, grad, grad2 and a fast jacobian when you have a significantly smaller number of outputs than inputs.

A Tape records the information needed to back-propagate from the output to each input during reverse-mode AD.

Var: used to mark variables for inspection during the reverse pass.

partials: returns a list of contributions to the partials. The variable ids returned in the list are likely not unique!

partialArray: return an Array of partials given bounds for the variable IDs.

partialMap: return an IntMap of sparse partials.

Numeric.AD.Mode.Reverse

The grad function calculates the gradient of a non-scalar-to-scalar function with reverse-mode AD in a single pass.

The grad' function calculates the result and gradient of a non-scalar-to-scalar function with reverse-mode AD in a single pass.

gradWith g f calculates the gradient of a non-scalar-to-scalar function f with reverse-mode AD in a single pass. The gradient is combined element-wise with the argument using the function g.

> grad == gradWith (\_ dx -> dx)
> id == gradWith const

gradWith' g f calculates the result and gradient of a non-scalar-to-scalar function f with reverse-mode AD in a single pass. The gradient is combined element-wise with the argument using the function g.

> grad' == gradWith' (\_ dx -> dx)

The jacobian function calculates the Jacobian of a non-scalar-to-non-scalar function with reverse-mode AD lazily in m passes for m outputs.

The jacobian' function calculates both the result and the Jacobian of a non-scalar-to-non-scalar function, using m invocations of reverse-mode AD, where m is the output dimensionality. Applying fmap snd to the result will recover the result of jacobian. (An alias for gradF'.)

jacobianWith g f calculates the Jacobian of a non-scalar-to-non-scalar function f with reverse-mode AD lazily in m passes for m outputs. Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using g.
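A brief usage sketch for the forward-mode combinators above. The results in comments follow from the documented identities (diff sin == cos, diff' f = f &&& diff f); the concrete Double and list types are illustrative assumptions:

> import Numeric.AD.Mode.Forward (diff, diff', diffF)
>
> -- First derivative of a scalar-to-scalar function: diff sin == cos.
> dSin :: Double -> Double
> dSin = diff sin                          -- dSin 0 == 1.0
>
> -- Result and first derivative in one pass: diff' f = f &&& diff f.
> valAndDeriv :: Double -> (Double, Double)
> valAndDeriv = diff' (\x -> x^2 + 3*x)    -- valAndDeriv 2 == (10.0, 7.0)
>
> -- Scalar-to-non-scalar: one derivative per output component.
> dBoth :: Double -> [Double]
> dBoth = diffF (\x -> [sin x, cos x])     -- dBoth 0 ~ [1.0, 0.0]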
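A matching sketch for the reverse-mode functions above, built on the quoted gradWith identities; again the concrete types are assumptions:

> import Numeric.AD.Mode.Reverse (grad, grad', gradWith, jacobian)
>
> -- Gradient of a non-scalar-to-scalar function in a single reverse pass.
> g :: [Double]
> g = grad (\[x, y] -> x * y) [2, 3]               -- == [3.0, 2.0]
>
> -- Result and gradient together.
> g' :: (Double, [Double])
> g' = grad' (\[x, y] -> x * y) [2, 3]             -- == (6.0, [3.0, 2.0])
>
> -- gradWith pairs each partial with its input:
> -- grad == gradWith (\_ dx -> dx), id == gradWith const.
> gw :: [(Double, Double)]
> gw = gradWith (,) (\[x, y] -> x * y) [2, 3]      -- == [(2,3),(3,2)]
>
> -- Jacobian: one reverse pass per output, computed lazily.
> j :: [[Double]]
> j = jacobian (\[x, y] -> [x * y, x + y]) [2, 3]  -- == [[3,2],[1,1]]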
> jacobian == jacobianWith (\_ dx -> dx)
> jacobianWith const == (\f x -> const x <$> f x)

jacobianWith' g f calculates both the result and the Jacobian of a non-scalar-to-non-scalar function f, using m invocations of reverse-mode AD, where m is the output dimensionality. Applying fmap snd to the result will recover the result of jacobianWith. Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using g.

> jacobian' == jacobianWith' (\_ dx -> dx)

The diff' function calculates the value and derivative, as a pair, of a scalar-to-scalar function.

hessian: compute the Hessian via the Jacobian of the gradient. The gradient is computed in reverse mode and then the Jacobian is computed in reverse mode. However, since grad f :: f a -> f a is square, this is not as fast as using the forward-mode Jacobian of a reverse-mode gradient, as provided by hessian in Numeric.AD.

hessianF: compute the order-3 Hessian tensor on a non-scalar-to-non-scalar function via the reverse-mode Jacobian of the reverse-mode Jacobian of the function. Less efficient than hessianF in Numeric.AD.

Numeric.AD.Internal.Sparse

We only store partials in sorted order, so the map contained in a partial will only contain partials with keys equal to or greater than those of the map in which it was found. This should be key for efficiently computing sparse Hessians: there are only ((n + k - 1) `choose` n) distinct nth partial derivatives of a function with k inputs. (A counting sketch appears after the Halley entries below.)

Numeric.AD.Halley

The findZero function finds a zero of a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1       -- converges to 2.0
> take 10 $ findZero ((+1).(^2)) (1 :+ 1)    -- converges to (0 :+ 1), with Data.Complex in scope

The inverse function inverts a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Note: the

> take 10 $ inverse sqrt 1 (sqrt 10)

example that works for Newton's method fails with Halley's method, because the preconditions do not hold.

The fixedPoint function finds a fixed point of a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1  -- converges to 0.7390851332151607

The extremum function finds an extremum of a scalar function using Halley's method; it produces a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ extremum cos 1  -- converges to 0
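A usage sketch for the Halley functions above, reusing the documented examples; only the import line is an assumption:

> import Numeric.AD.Halley (findZero, fixedPoint, extremum)
>
> -- Each function yields a lazy stream of increasingly accurate
> -- iterates; take as many as you need.
> zeros :: [Double]
> zeros = take 10 $ findZero (\x -> x^2 - 4) 1   -- converges to 2.0
>
> fixed :: [Double]
> fixed = take 10 $ fixedPoint cos 1             -- -> 0.7390851332151607
>
> extrema :: [Double]
> extrema = take 10 $ extremum cos 1             -- converges to 0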
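Returning to the counting claim in Numeric.AD.Internal.Sparse above: it can be checked with a throwaway sketch. The helper names here are hypothetical, purely for illustration:

> -- Binomial coefficient, n `choose` k.
> choose :: Integer -> Integer -> Integer
> n `choose` k = product [n - k + 1 .. n] `div` product [1 .. k]
>
> -- Distinct nth-order partial derivatives of a function of k inputs:
> -- multisets of size n drawn from k variables, i.e. (n + k - 1) `choose` n.
> distinctPartials :: Integer -> Integer -> Integer
> distinctPartials n k = (n + k - 1) `choose` n
>
> -- e.g. distinctPartials 2 3 == 6: the six second-order partials
> -- xx, xy, xz, yy, yz, zz of a three-variable function.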
Numeric.AD.Newton

The findZero function finds a zero of a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1       -- converges to 2.0
> take 10 $ findZero ((+1).(^2)) (1 :+ 1)    -- converges to (0 :+ 1), with Data.Complex in scope

The inverse function inverts a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Example:

> take 10 $ inverse sqrt 1 (sqrt 10)  -- converges to 10

The fixedPoint function finds a fixed point of a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1  -- converges to 0.7390851332151607

The extremum function finds an extremum of a scalar function using Newton's method; it produces a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ extremum cos 1  -- converges to 0

The gradientDescent function performs a multivariate optimization, based on the naive-gradient-descent in the file stalingrad/examples/flow-tests/pre-saddle-1a.vlad from the VLAD compiler Stalingrad sources. Its output is a stream of increasingly accurate results. (Modulo the usual caveats.) It uses reverse-mode automatic differentiation to compute the gradient. (A usage sketch for this module appears after the Numeric.AD entries below.)

Numeric.AD

jacobian calculates the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the number of inputs and outputs. If you know the relative number of inputs and outputs, consider the corresponding mode-specific variant instead.

jacobian' calculates both the answer and the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the number of inputs and outputs. If you know the relative number of inputs and outputs, consider the corresponding mode-specific variant instead.

jacobianWith g f calculates the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the number of inputs and outputs. The resulting Jacobian matrix is then recombined element-wise with the input using g. If you know the relative number of inputs and outputs, consider the corresponding mode-specific variant instead.

jacobianWith' g f calculates the answer and Jacobian of a non-scalar-to-non-scalar function, automatically choosing between sparse and reverse-mode AD based on the number of inputs and outputs. The resulting Jacobian matrix is then recombined element-wise with the input using g. If you know the relative number of inputs and outputs, consider the corresponding mode-specific variant instead.

hessianProduct f wv computes the product of the Hessian H of a non-scalar-to-scalar function f at w = fst <$> wv with a vector v = snd <$> wv using "Pearlmutter's method" from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.6143, which states:

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we take the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.

hessianProduct' f wv computes both the gradient of a non-scalar-to-scalar f at w = fst <$> wv and the product of the Hessian H at w with a vector v = snd <$> wv using "Pearlmutter's method". The outputs are returned wrapped in the same functor.

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we return the gradient and the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.

hessian: compute the Hessian via the Jacobian of the gradient. The gradient is computed in reverse mode and then the Jacobian is computed in sparse (forward) mode.

hessianF: compute the order-3 Hessian tensor on a non-scalar-to-non-scalar function using Sparse or Sparse-on-Reverse AD.
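For hessianProduct above, here is a hand-checkable sketch. The function f, the point, and the vector are invented for illustration, and the import assumes hessianProduct is re-exported from Numeric.AD as the mixed-mode entries suggest:

> import Numeric.AD (hessianProduct)
>
> -- For f [x, y] = x^2 * y the Hessian is [[2y, 2x], [2x, 0]].
> -- At w = (1, 2) with v = (1, 0):  H v = [4*1 + 2*0, 2*1 + 0*0] = [4, 2].
> -- Each input is paired with its vector component, per
> -- w = fst <$> wv and v = snd <$> wv.
> hv :: [Double]
> hv = hessianProduct (\[x, y] -> x * x * y) [(1, 1), (2, 0)]
> -- == [4.0, 2.0]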
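And, returning to the Numeric.AD.Newton entries above, a usage sketch reusing the documented examples; the quadratic bowl fed to gradientDescent is an invented illustration:

> import Numeric.AD.Newton (findZero, inverse, gradientDescent)
>
> newtonZeros :: [Double]
> newtonZeros = take 10 $ findZero (\x -> x^2 - 4) 1  -- converges to 2.0
>
> -- Invert sqrt near the initial guess 1: find x with sqrt x == sqrt 10.
> inverted :: [Double]
> inverted = take 10 $ inverse sqrt 1 (sqrt 10)       -- converges to 10
>
> -- Naive gradient descent on a simple quadratic bowl, starting from
> -- [0, 0]; the iterates approach the minimum at [3, -2].
> descent :: [[Double]]
> descent = take 20 $ gradientDescent
>             (\[x, y] -> (x - 3)^2 + (y + 2)^2)
>             [0, 0]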
Modules in ad-1.3.1:

> Numeric.AD
> Numeric.AD.Halley
> Numeric.AD.Newton
> Numeric.AD.Types
> Numeric.AD.Mode.Directed
> Numeric.AD.Mode.Forward
> Numeric.AD.Mode.Mixed
> Numeric.AD.Mode.Reverse
> Numeric.AD.Mode.Sparse
> Numeric.AD.Mode.Tower
> Numeric.AD.Internal.Classes
> Numeric.AD.Internal.Combinators
> Numeric.AD.Internal.Composition
> Numeric.AD.Internal.Dense
> Numeric.AD.Internal.Forward
> Numeric.AD.Internal.Identity
> Numeric.AD.Internal.Reverse
> Numeric.AD.Internal.Sparse
> Numeric.AD.Internal.Tensors
> Numeric.AD.Internal.Tower
> Numeric.AD.Internal.Types