sgd-0.8.0.0 -- stochastic gradient descent library, module documentation


Numeric.SGD.DataSet
-------------------

  DataSet
    Dataset stored on a disk.

  size
    The size of the dataset; the individual indices are [0, 1, ..., size - 1].

  elemAt
    Get the dataset element with the given identifier.

  loadData
    Lazily load the entire dataset from a disk.

  shuffle
    Shuffle the dataset.

  randomSample
    Random dataset sample with a specified number of elements (loaded
    eagerly).

  withVect
    Construct a dataset from a list of elements, store it as a vector, and
    run the given handler.

  withDisk
    Construct a dataset from a list of elements, store it on a disk, and run
    the given handler. Training elements must have a Binary instance for this
    function to work.

  lazySequence
    Lazily evaluate each action in the sequence from left to right, and
    collect the results.

  lazyMapM
    lazyMapM f is equivalent to lazySequence . map f.


Numeric.SGD.ParamSet
--------------------

  GAdd, GSub, GMul, GDiv, GNorm2, GPMap
    Helper classes for automatically deriving the corresponding ParamSet
    operations using GHC Generics.

  ParamSet
    Class of types that can be treated as parameter sets. It provides basic
    element-wise operations (addition, multiplication, mapping) which are
    required to perform stochastic gradient descent. Many of the operations
    (add, sub, mul, div, etc.) have the same interpretation and follow the
    same laws (e.g. associativity) as the corresponding operations in Num and
    Fractional.

    zero takes a parameter set as argument and zeroes out all its elements
    (as in the backprop library). This allows instances for Map, Maybe, etc.,
    where the structure of the parameter set is dynamic. This leads to the
    following property:

        add (zero x) x = x

    However, zero does not have to obey (add (zero x) y = y).

    A ParamSet can also be seen as a (structured) vector, hence pmap and
    norm_2. The latter is not strictly necessary to perform SGD, but it is
    useful to control the training process. pmap should obey the following
    law:

        pmap id x = x

    If you leave the body of an instance declaration blank, GHC Generics will
    be used to derive the instance, provided that the type has a single
    constructor and each field is an instance of ParamSet (a short example of
    this is sketched below).

  pmap
    Element-wise mapping.

  zero
    Zero out all elements.

  add
    Element-wise addition.

  sub
    Element-wise subtraction.

  mul
    Element-wise multiplication.

  div
    Element-wise division.

  norm_2
    L2 norm.

  genericPMap, genericAdd, genericSub, genericMul, genericDiv, genericNorm2
    The corresponding ParamSet operation implemented using GHC Generics;
    works if all fields are instances of ParamSet, but only for values with
    single constructors.
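As a minimal illustration of the blank-instance deriving described above, the
following sketch defines a hypothetical two-field parameter set (the type and
its field names are not part of the library):

    {-# LANGUAGE DeriveGeneric #-}

    import GHC.Generics (Generic)
    import Numeric.SGD.ParamSet (ParamSet)

    -- A hypothetical parameter set: a single constructor whose fields are
    -- themselves ParamSet instances (Double is one), so the instance body
    -- can be left blank and the methods are derived via GHC Generics.
    data Params = Params
      { weight :: Double
      , bias   :: Double
      } deriving (Generic)

    instance ParamSet Params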
  Instance for Map
    A map with different parameter sets (of the same type) assigned to the
    individual keys. When combining two maps with different sets of keys,
    only their intersection is preserved.

  Instance for Maybe
    Nothing represents a deactivated parameter set component. If Nothing is
    given as an argument to one of the ParamSet operations, the result is
    Nothing as well. This differs from the corresponding instance in the
    backprop library, where Nothing is equivalent to `Just 0`. However, the
    implementation here corresponds closely enough to the notion that a
    particular component is either active or not in both the parameter set
    and the gradient, hence it does not make sense to combine Just with
    Nothing.


Numeric.SGD.Sparse.Dataset
--------------------------

  Dataset
    A dataset with elements of type a.

  size
    The size of the dataset.

  elemAt
    Get the dataset element with a given index. The set of indices is of the
    form {0, 1, ..., size - 1}.

  loadData
    Lazily load the dataset from a disk.

  sample
    A dataset sample of the given size.

  withVect
    Construct a dataset from a vector of elements and run the given handler.

  withDisk
    Construct a dataset from a list of elements, store it on a disk, and run
    the given handler.

  withData
    Use the disk or the vector dataset representation depending on the first
    argument: when True, use withDisk, otherwise use withVect.

  lazySequence
    Lazily evaluate each action in the sequence from left to right, and
    collect the results.

  lazyMapM
    lazyMapM f is equivalent to lazySequence . map f.


Numeric.SGD.Sparse.LogSigned
----------------------------

  LogSigned
    Signed real value in the logarithmic domain.

  pos
    Positive component.

  neg
    Negative component.

  logSigned
    Smart LogSigned constructor.

  fromPos
    Make a LogSigned from a positive, log-domain number.

  fromNeg
    Make a LogSigned from a negative, log-domain number.

  toNorm
    Shift a LogSigned to the normal domain.

  toLogFloat
    Change the LogSigned to either a negative (Left) or a positive (Right)
    LogFloat.


Numeric.SGD.Sparse.Grad
-----------------------

  Grad
    Gradient with nonzero values stored in a logarithmic domain. Since values
    equal to zero have no impact on the update phase of the SGD method, it is
    more efficient not to store those components in the gradient.

  add
    Add a normal-domain double to the gradient at the given position.

  addL
    Add a log-domain, signed number to the gradient at the given position.

  fromList
    Construct a gradient from a list of (index, value) pairs. All values from
    the list are added at the respective gradient positions.

  fromLogList
    Construct a gradient from a list of (index, signed log-domain number)
    pairs. All values from the list are added at the respective gradient
    positions.

  toList
    Collect gradient components with values in the normal domain.

  empty
    Empty gradient, i.e. with all elements set to 0.

  parUnions
    Perform the parallel unions operation on a gradient list. Experimental
    version.

  parUnionsP
    Parallel unions in the Par monad.


Numeric.SGD.Sparse
------------------

  MVect
    Type synonym for a mutable vector with Double values.

  Para
    Vector of parameters.

  SgdArgs
    SGD parameters controlling the learning process.

  batchSize
    Size of the batch.

  regVar
    Regularization variance.

  iterNum
    Number of iterations.

  gain0
    Initial gain parameter.

  tau
    After how many iterations over the entire dataset the gain parameter is
    halved.

  sgdArgsDefault
    Default SGD parameter values (individual fields can be overridden, as
    sketched below).
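A small sketch of overriding the default sparse-SGD parameters via record
update; the concrete values are arbitrary, and the only assumption is that
the fields accept plain numeric literals:

    import Numeric.SGD.Sparse (SgdArgs (..), sgdArgsDefault)

    -- Start from the defaults and override selected fields; the numbers
    -- below are purely illustrative.
    myArgs :: SgdArgs
    myArgs = sgdArgsDefault
      { batchSize = 50  -- size of the batch
      , iterNum   = 20  -- number of iterations
      }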
  sgd
    A stochastic gradient descent method. A notification function can be used
    to provide the user with information about the progress of the learning.
    Arguments: the SGD parameter values; a notification run on every update;
    the gradient for a dataset element; the dataset; the starting point.
    Returns the SGD result.

  Internal helpers: addUp (add up all gradients and store the results in the
  normal domain), scale (scale the vector by the given value), addTo (apply
  the gradient to the parameter vector, i.e. add the first vector to the
  second one).


Numeric.SGD.Sparse.Momentum
---------------------------

  MVect
    Type synonym for a mutable vector with Double values.

  Para
    Vector of parameters.

  SgdArgs
    SGD parameters controlling the learning process, with the same fields as
    in Numeric.SGD.Sparse: batchSize (size of the batch), regVar
    (regularization variance), iterNum (number of iterations), gain0 (initial
    gain parameter), and tau (after how many iterations over the entire
    dataset the gain parameter is halved).

  sgdArgsDefault
    Default SGD parameter values.

  gamma
    The gamma parameter which drives the momentum. TODO: put in SgdArgs.

  sgd
    A stochastic gradient descent method. A notification function can be used
    to provide the user with information about the progress of the learning.
    Arguments: the SGD parameter values; a notification run on every update;
    the gradient for a dataset element; the dataset; the starting point.
    Returns the SGD result.

  Internal helpers: two functions that compute the new momentum (gradient)
  vector (one parameterized by the regularization parameter, the parameters,
  and the current gradient; the other by the gamma parameter, the previous
  momentum, and the scaled current gradient), as well as addUp, scale, and
  addTo as in Numeric.SGD.Sparse.


Numeric.SGD.Type
----------------

  SGD
    SGD is a pipe which, given the initial parameter values, consumes
    training elements of type e and outputs the subsequently calculated
    parameter sets of type p.


Numeric.SGD.Momentum
--------------------

  Config
    Momentum configuration.

  alpha0
    Initial step size, used to scale the gradient.

  tau
    The step size after k * tau iterations is alpha0 / (k + 1).

  gamma
    Momentum term.

  scaleTau
    Scale the tau parameter. Useful e.g. to account for the size of the
    training dataset.

  momentum
    Stochastic gradient descent with momentum. See Numeric.SGD.Momentum for
    more information. Arguments: the momentum configuration; the gradient on
    a training element.

  Internal helper: scaling.


Numeric.SGD.Adam
----------------

  Config
    Adam configuration.

  alpha0
    Initial step size.

  tau
    The step size after k * tau iterations is alpha0 / (k + 1).

  beta1
    1st exponential moment decay.

  beta2
    2nd exponential moment decay.

  eps
    Epsilon.

  scaleTau
    Scale the tau parameter. Useful e.g. to account for the size of the
    training dataset.

  adam
    Perform gradient descent using the Adam algorithm. See Numeric.SGD.Adam
    for more information. Arguments: the Adam configuration; the gradient on
    a training element.


Numeric.SGD.AdaDelta
--------------------

  Config
    AdaDelta configuration.

  decay
    Exponential decay parameter.

  eps
    Epsilon value.

  adaDelta
    Perform gradient descent using the AdaDelta algorithm. See
    Numeric.SGD.AdaDelta for more information. Arguments: the AdaDelta
    configuration; the gradient on a training element.

  Internal helpers: scaling, square root, square.


Numeric.SGD
-----------

  Config
    High-level IO-based SGD configuration (an example of overriding the
    defaults follows the field descriptions below).

  iterNum
    Number of iterations over the entire training dataset.

  batchSize
    Mini-batch size.

  batchOverlap
    The number of overlapping elements in subsequent mini-batches.

  batchRandom
    Should the mini-batch be selected at random? If not, the subsequent
    training elements will be picked sequentially. Random selection gives no
    guarantee of seeing each training sample in every epoch.

  reportEvery
    How often the value of the objective function should be reported (with
    1.0 meaning once per pass over the training data).
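The Config type appears to come with a Default instance (def from the
data-default-class package, on which the library depends), so a configuration
can be sketched by overriding a few fields. The concrete values below are
illustrative and the field types are assumed from the descriptions above:

    import Data.Default.Class (def)
    import qualified Numeric.SGD as SGD

    cfg :: SGD.Config
    cfg = def
      { SGD.iterNum     = 10     -- passes over the training data
      , SGD.batchSize   = 32     -- mini-batch size
      , SGD.batchRandom = False  -- pick training elements sequentially
      , SGD.reportEvery = 1.0    -- report the objective once per pass
      }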
  run
    Traverse all the elements in the training data stream in one pass,
    calculate the subsequent gradients, and apply them progressively,
    starting from the initial parameter values. Consider using runIO if your
    training dataset is large. Arguments: the selected SGD method; the
    training data stream; the initial parameters.

  iterNumPerEpoch
    Calculate the effective number of SGD iterations (and gradient
    calculations) performed per epoch. Argument: the dataset size.

  reportObjective
    Report the total objective value on stdout. Arguments: the value of the
    objective function on a dataset element; the training dataset.

  objectiveWith
    Value of the objective function over the entire dataset (i.e. the sum of
    the objectives on all dataset elements). Arguments: the value of the
    objective function on a dataset element; the training dataset.

  runIO
    Perform SGD in the IO monad, regularly reporting the value of the
    objective function on the entire dataset. A higher-level wrapper which
    should be convenient to use when the training dataset is large. An
    alternative is to use the simpler function run, or to build a custom SGD
    pipeline based on lower-level combinators (pipeSeq, batch, adam,
    keepEvery, result, etc.). Arguments: the SGD configuration; an SGD pipe
    consuming mini-batches of dataset elements; the quality reporting
    function (the reporting frequency is specified via reportEvery); the
    training dataset; the initial parameter values. An end-to-end sketch is
    given at the end of this section.

  pipeSeq
    Pipe all the elements in the dataset sequentially.

  pipeRan
    Pipe all the elements in the dataset in a random order.

  batch
    Group dataset elements into (mini-)batches of the given size.

  batchGradPar
    Adapt the gradient function to handle (mini-)batches. Relies on the
    parameter set's NFData instance to efficiently calculate gradients in
    parallel.

  batchGradPar'
    A version of batchGradPar with no NFData constraint. Evaluates the
    sub-gradients calculated in parallel to weak head normal form.

  batchGradSeq
    Adapt the gradient function to handle (mini-)batches. The function
    calculates the individual sub-gradients sequentially.

  result
    Extract the result of the SGD calculation (the last parameter set flowing
    downstream). Arguments: a default value (in case the stream is empty);
    the stream of parameter sets.

  keepEvery
    Keep every k-th element flowing downstream and discard all the others.

  decreasingBy
    Make the stream decreasing in the given (monadic) function by discarding
    elements with values higher than those already seen.

  Internal helpers: the number of new elements in each new batch, and a
  variant of the batch-gradient adapters in which the sub-gradients of the
  individual batch elements are evaluated in parallel based on the given
  Strategy.
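To tie the pieces together, here is a hypothetical end-to-end sketch that
minimizes the per-element objective (x - target)^2, using a plain Double as
the parameter set. All names come from the modules above, but the exact
signatures (the argument order of the gradient and quality functions, the
shape of withVect, the Default instances used via def) are assumed rather
than quoted, so treat it as an illustration:

    import Data.Default.Class (def)

    import qualified Numeric.SGD as SGD
    import qualified Numeric.SGD.DataSet as D
    import qualified Numeric.SGD.Momentum as Mom

    -- Per-element gradient of (x - target)^2 with respect to the parameter
    -- x (a plain Double, which is a ParamSet instance).
    grad :: Double -> Double -> Double
    grad target x = 2 * (x - target)

    -- Per-element objective value, used only for progress reporting.
    quality :: Double -> Double -> Double
    quality target x = (x - target) * (x - target)

    main :: IO ()
    main = do
      -- A dummy training set: 1000 copies of the target value 5.
      let elems = replicate 1000 (5 :: Double)
      D.withVect elems $ \dataSet -> do
        -- runIO consumes mini-batches, so the per-element gradient is
        -- adapted with batchGradSeq; the defaults are tweaked via def.
        x <- SGD.runIO
          (def {SGD.iterNum = 10, SGD.batchSize = 10})
          (Mom.momentum def (SGD.batchGradSeq grad))
          quality
          dataSet
          (0 :: Double)  -- starting point
        print x

Under these assumptions, the reported objective should decrease over the
epochs and the printed parameter should approach 5.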