ad-1.5: Automatic Differentiation
=================================

The modules below are experimental and maintained by ekmett@gmail.com; all but a few are GHC-only.

Numeric.AD.Internal.Classes
---------------------------

'Jacobian' is used by 'deriveMode' but is not exposed via the 'Mode' class, to prevent its abuse by end users via the 'AD' data type.

'Primal' is used by 'deriveMode' but is not exposed via the 'Mode' class, to prevent its abuse by end users via the 'AD' data type. It provides direct access to the result, stripped of its derivative information, but this is unsafe in general, as ('lift' . 'primal') would discard derivative information. The end user is protected from accidentally using this function by the universal quantification on the various combinators we expose.

'isKnownConstant' is allowed to return 'False' for items with a zero derivative, but we'll give more NaNs than strictly necessary.

'isKnownZero' is allowed to return 'False' for zero, but we give more NaNs than strictly necessary.

Operations of the 'Mode' class:

'lift'   -- Embed a constant.
'<+>'    -- Vector sum.
'*^'     -- Scalar-vector multiplication.
'^*'     -- Vector-scalar multiplication.
'^/'     -- Scalar division.
'<**>'   -- Exponentiation; this should be overloaded if you can figure out anything about what is constant!
'zero'   -- 'zero' = 'lift' 0.

'deriveLifted' $t provides

> instance Lifted $t

given supplied instances for

> instance Lifted $t => Primal $t where ...
> instance Lifted $t => Jacobian $t where ...

The seemingly redundant 'Lifted' $t constraints are caused by Template Haskell staging restrictions.

'deriveNumeric' $f $g provides the following instances:

> instance (Lifted $f, Num a, Enum a) => Enum ($g a)
> instance (Lifted $f, Num a, Eq a) => Eq ($g a)
> instance (Lifted $f, Num a, Ord a) => Ord ($g a)
> instance (Lifted $f, Num a, Bounded a) => Bounded ($g a)
> instance (Lifted $f, Show a) => Show ($g a)
> instance (Lifted $f, Num a) => Num ($g a)
> instance (Lifted $f, Fractional a) => Fractional ($g a)
> instance (Lifted $f, Floating a) => Floating ($g a)
> instance (Lifted $f, RealFloat a) => RealFloat ($g a)
> instance (Lifted $f, RealFrac a) => RealFrac ($g a)
> instance (Lifted $f, Real a) => Real ($g a)

Numeric.AD.Internal.Jet
-----------------------

A 'Jet' is a tower of all (higher order) partial derivatives of a function.

Numeric.AD.Internal.Types
-------------------------

'AD' serves as a common wrapper for different 'Mode' instances, exposing a traditional numerical tower. Universal quantification is used to limit the actions in user code to machinery that will return the same answers under all AD modes, allowing us to use modes interchangeably as both the type-level "brand" and the dictionary, providing a common API.

Numeric.AD.Internal.Tower
-------------------------

'Tower' is an AD 'Mode' that calculates a tangent tower by forward AD, and provides fast 'diffsUU' and 'diffsUF'.

Numeric.AD.Internal.Sparse
--------------------------

We only store partials in sorted order, so the map contained in a partial will only contain partials with equal or greater keys to that of the map in which it was found. This should be key for efficiently computing sparse hessians: there are only ((n + k - 1) choose n) distinct nth-order partial derivatives of a function with k inputs.
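As a usage sketch for the 'Tower' mode just described (not part of the original module text; it assumes the 'diffs' entry point of Numeric.AD.Mode.Tower keeps its released signature), the tangent tower yields the value of a function at a point followed by all of its successive derivatives:

> import Numeric.AD.Mode.Tower (diffs)
>
> -- Derivatives of sin at 0: sin, cos, -sin, -cos, sin, ...
> derivativesOfSin :: [Double]
> derivativesOfSin = take 5 (diffs sin 0)  -- ~ [0.0, 1.0, 0.0, -1.0, 0.0]
>
> main :: IO ()
> main = print derivativesOfSin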
"qrstuvwxyz{|}~56789:;<qrstuvwxyz{|}~tuvwxqsrzy|}~{qsrtuvwxyz{|}~56789:;< non-portable experimentalekmett@gmail.comNoneGHC only experimentalekmett@gmail.comNone=>?@ABCDEFGHIJK=>?@ABC =>?@ABCDEFGHIJKGHC only experimentalekmett@gmail.comNoneLMNO LMNOGHC only experimentalekmett@gmail.comNoneReverse is a  A using reverse-mode automatic differentiation that provides fast diffFU, diff2FU, grad, grad2 and a fast jacobianF when you have a significantly smaller number of outputs than inputs. A TapeT records the information needed back propagate from the output to each input during    AD. >Used to mark variables for inspection during the reverse pass 6This returns a list of contributions to the partials. 2 The variable ids returned in the list are likely not unique!  Return an P of $ given bounds for the variable IDs.  Return an Q of sparse partials %RSTUVWXYZ[RSTUVWXYZ[ non-portable experimentalekmett@gmail.comNone non-portable experimentalekmett@gmail.comNoneGHC only experimentalekmett@gmail.comNone \]^_` \]^_` GHC only experimentalekmett@gmail.comNone?The composition of two AD modes is an AD mode in its own right ?Functor composition, used to nest the use of jacobian and grad abcdefghijkabcdefghijkGHC only experimentalekmett@gmail.com Safe-Infered [\]^_`ab `ab[\^]_ GHC only experimentalekmett@gmail.com Safe-InferedThe Y function calculates the first derivative of a scalar-to-scalar function by forward-mode `  diff sin == cos The d'V function calculates the result and first derivative of scalar-to-scalar function by Forward `  d' sin == sin &&& cos  d' f = f &&& d f The N function calculates the first derivative of scalar-to-nonscalar function by Forward ` The \ function calculates the result and first derivative of a scalar-to-non-scalar function by Forward ` BA fast, simple transposed Jacobian computed with forward-mode AD. BA fast, simple transposed Jacobian computed with forward-mode AD. TCompute the product of a vector with the Hessian using forward-on-forward-mode AD. LCompute the gradient and hessian product using forward-on-forward-mode AD.  GHC only experimentalekmett@gmail.com Safe-Infered GHC only experimentalekmett@gmail.comNone The J function calculates the gradient of a non-scalar-to-scalar function with  AD in a single pass. The U function calculates the result and gradient of a non-scalar-to-scalar function with  AD in a single pass.  g fE function calculates the gradient of a non-scalar-to-scalar function f( with reverse-mode AD in a single pass. L The gradient is combined element-wise with the argument using the function g.  grad == gradWith (\_ dx -> dx)  id == gradWith const  g fG calculates the result and gradient of a non-scalar-to-scalar function f with  AD in a single pass L the gradient is combined element-wise with the argument using the function g. " grad' == gradWith' (\_ dx -> dx) The c function calculates the jacobian of a non-scalar-to-non-scalar function with reverse AD lazily in m passes for m outputs. The b function calculates both the result and the Jacobian of a nonscalar-to-nonscalar function, using m invocations of reverse AD,  where m( is the output dimensionality. Applying fmap snd* to the result will recover the result of   | An alias for gradF' 'jacobianWith g f'@ calculates the Jacobian of a non-scalar-to-non-scalar function f with reverse AD lazily in m passes for m outputs. kInstead of returning the Jacobian matrix, the elements of the matrix are combined with the input using the g. 
'jacobianWith' g f calculates the Jacobian of a non-scalar-to-non-scalar function f with reverse-mode AD lazily in m passes for m outputs. Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using the function g.

> jacobian == jacobianWith (\_ dx -> dx)
> jacobianWith const == (\f x -> const x <$> f x)

'jacobianWith'' g f calculates both the result and the Jacobian of a non-scalar-to-non-scalar function f, using m invocations of reverse-mode AD, where m is the output dimensionality. Applying 'fmap snd' to the result will recover the result of 'jacobianWith'. Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using the function g.

> jacobian' == jacobianWith' (\_ dx -> dx)

The 'diff'' function calculates the value and derivative, as a pair, of a scalar-to-scalar function.

'hessian' computes the Hessian via the Jacobian of the gradient. The gradient is computed in reverse mode and then the Jacobian is computed in reverse mode. However, since 'grad f :: f a -> f a' is square, this is not as fast as using the forward-mode Jacobian of a reverse-mode gradient, as provided by 'hessian' from Numeric.AD.Mode.Mixed.

'hessianF' computes the order-3 Hessian tensor on a non-scalar-to-non-scalar function via the reverse-mode Jacobian of the reverse-mode Jacobian of the function. Less efficient than 'hessianF' from Numeric.AD.Mode.Mixed.

Numeric.AD.Newton
-----------------

The 'findZero' function finds a zero of a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1  -- converges to 2.0

> import Data.Complex
> take 10 $ findZero ((+1) . (^2)) (1 :+ 1)  -- converges to (0 :+ 1)

The 'inverse' function inverts a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Example:

> take 10 $ inverse sqrt 1 (sqrt 10)  -- converges to 10

The 'fixedPoint' function finds a fixed point of a scalar function using Newton's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1  -- converges to 0.7390851332151607

The 'extremum' function finds an extremum of a scalar function using Newton's method; it produces a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ extremum cos 1  -- converges to 0

The 'gradientDescent' function performs a multivariate optimization, based on the naive gradient descent in the file stalingrad/examples/flow-tests/pre-saddle-1a.vlad from the VLAD compiler Stalingrad sources. Its output is a stream of increasingly accurate results. (Modulo the usual caveats.) It uses reverse-mode automatic differentiation to compute the gradient.
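A self-contained version of the 'findZero' examples above, as a sketch with the needed imports added (signatures assumed as released):

> import Data.Complex (Complex (..))
> import Numeric.AD.Newton (findZero)
>
> main :: IO ()
> main = do
>   -- Real root of x^2 - 4 starting from 1; the stream converges to 2.0.
>   print (take 10 (findZero (\x -> x ^ 2 - 4) 1) :: [Double])
>   -- Complex root of x^2 + 1 starting from 1 :+ 1; converges to 0 :+ 1.
>   print (take 10 (findZero ((+ 1) . (^ 2)) (1 :+ 1)) :: [Complex Double])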
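The 'gradientDescent' description has no worked example in the original text; here is a minimal sketch, assuming the released signature. The output stream may be finite once step sizes underflow, so we guard with 'take' before sampling it:

> import Numeric.AD.Newton (gradientDescent)
>
> -- Minimize the convex bowl (x-1)^2 + (y-2)^2; the unique minimum is [1,2].
> descent :: [[Double]]
> descent = gradientDescent (\[x, y] -> (x - 1) ^ 2 + (y - 2) ^ 2) [0, 0]
>
> main :: IO ()
> main = print (last (take 100 descent))  -- approaches [1.0, 2.0]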
Numeric.AD.Halley
-----------------

The 'findZero' function finds a zero of a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Examples:

> take 10 $ findZero (\x -> x^2 - 4) 1  -- converges to 2.0

> import Data.Complex
> take 10 $ findZero ((+1) . (^2)) (1 :+ 1)  -- converges to (0 :+ 1)

The 'inverse' function inverts a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.) Note: the

> take 10 $ inverse sqrt 1 (sqrt 10)

example that works for Newton's method fails with Halley's method, because the preconditions do not hold.

The 'fixedPoint' function finds a fixed point of a scalar function using Halley's method; its output is a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ fixedPoint cos 1  -- converges to 0.7390851332151607

The 'extremum' function finds an extremum of a scalar function using Halley's method; it produces a stream of increasingly accurate results. (Modulo the usual caveats.)

> take 10 $ extremum cos 1  -- converges to 0

Numeric.AD.Mode.Mixed
---------------------

'jacobian' calculates the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the number of inputs and outputs. If you know the relative number of inputs and outputs, consider the mode-specific variants in Numeric.AD.Mode.Forward or Numeric.AD.Mode.Reverse.

'jacobian'' calculates both the answer and the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the relative number of inputs and outputs. If you know the relative number of inputs and outputs, consider the mode-specific variants.

'jacobianWith' g f calculates the Jacobian of a non-scalar-to-non-scalar function, automatically choosing between forward- and reverse-mode AD based on the number of inputs and outputs. The resulting Jacobian matrix is then recombined element-wise with the input using g. If you know the relative number of inputs and outputs, consider the mode-specific variants.

'jacobianWith'' g f calculates the answer and Jacobian of a non-scalar-to-non-scalar function, automatically choosing between sparse and reverse-mode AD based on the number of inputs and outputs. The resulting Jacobian matrix is then recombined element-wise with the input using g. If you know the relative number of inputs and outputs, consider the mode-specific variants.

'hessianProduct' f wv computes the product of the Hessian H of a non-scalar-to-scalar function f at w = fst <$> wv with a vector v = snd <$> wv, using "Pearlmutter's method" from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.6143, which states:

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we take the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.

'hessianProduct'' f wv computes both the gradient of a non-scalar-to-scalar function f at w = fst <$> wv and the product of the Hessian H at w with a vector v = snd <$> wv, using "Pearlmutter's method". The outputs are returned wrapped in the same functor.

> H v = (d/dr) grad_w (w + r v) | r = 0

Or in other words, we return the gradient and the directional derivative of the gradient. The gradient is calculated in reverse mode, then the directional derivative is calculated in forward mode.

'hessian' computes the Hessian via the Jacobian of the gradient. The gradient is computed in reverse mode and then the Jacobian is computed in sparse (forward) mode.

'hessianF' computes the order-3 Hessian tensor on a non-scalar-to-non-scalar function using Sparse or Sparse-on-Reverse AD.
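A sketch of the Hessian-vector product described above (not from the original text; it assumes the released signature, in which the argument functor pairs each coordinate of w with the matching component of v, as documented):

> import Numeric.AD (hessianProduct)
>
> -- f(x,y) = x^2 * y has Hessian [[2y, 2x], [2x, 0]].
> -- At w = (1,2) with v = (1,0): H v = [4, 2].
> hv :: [Double]
> hv = hessianProduct (\[x, y] -> x ^ 2 * y) [(1, 1), (2, 0)]
>
> main :: IO ()
> main = print hv  -- [4.0, 2.0]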
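Similarly, a minimal sketch of the full mixed-mode 'hessian' (assuming the released signature):

> import Numeric.AD (hessian)
>
> -- f(x,y) = x*y: the Hessian is the constant matrix [[0,1],[1,0]].
> main :: IO ()
> main = print (hessian (\[x, y] -> x * y) [1, 2 :: Double])
> -- [[0.0,1.0],[1.0,0.0]]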
Package modules (ad-1.5)
------------------------

Numeric.AD
Numeric.AD.Halley
Numeric.AD.Internal.Classes
Numeric.AD.Internal.Combinators
Numeric.AD.Internal.Composition
Numeric.AD.Internal.Dense
Numeric.AD.Internal.Forward
Numeric.AD.Internal.Identity
Numeric.AD.Internal.Jet
Numeric.AD.Internal.Reverse
Numeric.AD.Internal.Sparse
Numeric.AD.Internal.Tower
Numeric.AD.Internal.Types
Numeric.AD.Mode.Directed
Numeric.AD.Mode.Forward
Numeric.AD.Mode.Mixed
Numeric.AD.Mode.Reverse
Numeric.AD.Mode.Sparse
Numeric.AD.Mode.Tower
Numeric.AD.Newton
Numeric.AD.Types
Numeric.AD.Variadic
Numeric.AD.Variadic.Reverse
Numeric.AD.Variadic.Sparse