sgd-0.6.0.0 -- Stochastic gradient descent

Numeric.SGD.DataSet

  DataSet
    Dataset stored on disk.
  size
    The size of the dataset; the individual indices are [0, 1, ..., size - 1].
  elemAt
    Get the dataset element with the given identifier.
  loadData
    Lazily load the entire dataset from disk.
  randomSample
    Random dataset sample with a specified number of elements (loaded eagerly).
  withVect
    Construct a dataset from a list of elements, store it as a vector, and run
    the given handler.
  withDisk
    Construct a dataset from a list of elements, store it on disk, and run the
    given handler.  Training elements must have a Binary instance for this
    function to work.
  lazySequence
    Lazily evaluate each action in the sequence from left to right, and
    collect the results.
  lazyMapM
    lazyMapM f is equivalent to lazySequence . map f.

Numeric.SGD.ParamSet

  GAdd, GSub, GMul, GDiv, GNorm2, GPMap
    Helper classes for automatically deriving ParamSet using GHC Generics.

  ParamSet
    Class of types that can be treated as parameter sets.  It provides the
    basic element-wise operations (addition, multiplication, mapping) required
    to perform stochastic gradient descent.  Many of the operations (add, sub,
    mul, div, etc.) have the same interpretation and follow the same laws
    (e.g. associativity) as the corresponding operations in Num and
    Fractional.

    zero takes a parameter set as argument and zeroes out all its elements (as
    in the backprop library).  This allows instances for Maybe, Map, etc.,
    where the structure of the parameter set is dynamic, and leads to the
    following property:

      add (zero x) x = x

    However, zero does not have to obey (add (zero x) y = y).

    A ParamSet can also be seen as a (structured) vector, hence pmap and
    norm_2.  The latter is not strictly necessary to perform SGD, but it is
    useful for controlling the training process.  pmap should obey the
    following law:

      pmap id x = x

    If you leave the body of an instance declaration blank, GHC Generics will
    be used to derive the instance, provided that the type has a single
    constructor and each of its fields is an instance of ParamSet.

  pmap
    Element-wise mapping.
  zero
    Zero out all elements.
  add
    Element-wise addition.
  sub
    Element-wise subtraction.
  mul
    Element-wise multiplication.
  div
    Element-wise division.
  norm_2
    L2 norm.

  genericAdd, genericSub, genericMul, genericDiv, genericNorm2, genericPMap
    Generic defaults derived with GHC Generics; they work if all fields are
    instances of ParamSet, but only for values with a single constructor.

  Map instance
    A map with different parameter sets (of the same type) assigned to the
    individual keys.  When combining two maps with different sets of keys,
    only their intersection is preserved.

  Maybe instance
    Nothing represents a deactivated parameter set component.  If Nothing is
    given as an argument to one of the ParamSet operations, the result is
    Nothing as well.  This differs from the corresponding instance in the
    backprop library, where Nothing is equivalent to `Just 0`.  However, this
    implementation corresponds closely enough to the notion that a particular
    component is either active or not, in both the parameter set and the
    gradient, so it does not make sense to combine Just with Nothing.
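  As a minimal sketch of the generic-deriving route described above: the
  Params type and its fields below are hypothetical, but Double is itself a
  ParamSet instance, so leaving the instance body blank is enough.

    {-# LANGUAGE DeriveGeneric #-}
    module ParamsExample where

    import GHC.Generics (Generic)
    import Numeric.SGD.ParamSet (ParamSet)

    -- A hypothetical parameter set: a single constructor whose fields are
    -- all ParamSet instances, so the blank instance declaration below is
    -- filled in via GHC Generics.
    data Params = Params
      { weight :: Double
      , bias   :: Double
      } deriving (Show, Generic)

    instance ParamSet Params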
Numeric.SGD.Sparse.LogSigned

  LogSigned
    Signed real value in the logarithmic domain.
  pos
    Positive component.
  neg
    Negative component.
  logSigned
    Smart LogSigned constructor.
  fromPos
    Make a LogSigned from a positive, log-domain number.
  fromNeg
    Make a LogSigned from a negative, log-domain number.
  toNorm
    Shift a LogSigned to the normal domain.
  toLogFloat
    Change the LogSigned to either a negative (Left) or a positive (Right)
    log-domain LogFloat.

Numeric.SGD.Sparse.Grad

  Grad
    Gradient with non-zero values stored in the logarithmic domain.  Since
    values equal to zero have no impact on the update phase of the SGD method,
    it is more efficient not to store those components in the gradient.
  add
    Add a normal-domain Double to the gradient at the given position.
  addL
    Add a log-domain, signed number to the gradient at the given position.
  fromList
    Construct a gradient from a list of (index, value) pairs.  All values from
    the list are added at the respective gradient positions.
  fromLogList
    Construct a gradient from a list of (index, signed log-domain number)
    pairs.  All values from the list are added at the respective gradient
    positions.
  toList
    Collect the gradient components with values in the normal domain.
  empty
    Empty gradient, i.e. with all elements set to 0.
  parUnions
    Perform the parallel unions operation on a list of gradients (implemented
    in the Par monad).  Experimental version.
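  A small sketch of the sparse-gradient construction described above, assuming
  fromList takes (index, value) pairs in the normal domain as documented; the
  concrete indices and values are made up for illustration.

    import qualified Numeric.SGD.Sparse.Grad as Grad

    main :: IO ()
    main = do
      -- Values given for the same index are added together; components equal
      -- to zero are simply never stored.
      let grad = Grad.fromList [(0, 1.5), (3, -0.25), (0, 0.5)]
      -- Collect the non-zero components back in the normal domain.
      print (Grad.toList grad)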
Numeric.SGD.Sparse

  MVect
    Type synonym for a mutable vector of Double values.
  Para
    Vector of parameters.
  SgdArgs
    SGD parameters controlling the learning process, with fields:
      batchSize: size of the batch
      regVar: regularization variance
      iterNum: number of iterations
      gain0: initial gain parameter
      tau: after how many iterations over the entire dataset the gain
        parameter is halved
  sgdArgsDefault
    Default SGD parameter values.
  sgd
    A stochastic gradient descent method.  A notification function can be used
    to provide the user with information about the progress of the learning.
    Arguments: the SGD parameter values, a notification run after every
    update, the gradient for a dataset element, the dataset, and the starting
    point; the result is the parameter vector found by SGD.
  Internal helpers: addUp (add up all gradients and store the results in the
    normal domain), scale (scale the vector by the given value), addTo (apply
    the gradient to the parameter vector, i.e. add the first vector to the
    second one).

Numeric.SGD.Sparse.Momentum

  A variant of Numeric.SGD.Sparse with momentum.  It exposes the same Para,
  SgdArgs, and sgdArgsDefault interface, plus a hard-coded gamma parameter
  which drives the momentum (TODO in the source: put it in SgdArgs).

  sgd
    A stochastic gradient descent method with momentum.  A notification
    function can be used to provide the user with information about the
    progress of the learning.  Arguments: the SGD parameter values, a
    notification run after every update, the gradient for a dataset element,
    the dataset, and the starting point; the result is the parameter vector
    found by SGD.  Internally, the new momentum (gradient) vector is computed
    from the gamma parameter, the previous momentum, and the scaled current
    gradient, and regularization is applied to the parameters via the
    regularization parameter.

Numeric.SGD.Type

  SGD
    An SGD method is a pipe which, given the initial parameter values,
    consumes training elements of type e and outputs the subsequently
    calculated parameter sets of type p.

Numeric.SGD.Momentum

  Config
    Momentum configuration, with fields:
      gain0: initial gain parameter, used to scale the gradient
      tau: after how many gradient calculations the gain parameter is halved
      gamma: momentum term
  momentum
    Stochastic gradient descent with momentum.  Arguments: the momentum
    configuration and the gradient on a training element.

Numeric.SGD.Adam

  Config
    Adam configuration, with fields:
      alpha: step size
      beta1: 1st exponential moment decay
      beta2: 2nd exponential moment decay
      eps: epsilon
  adam
    Perform gradient descent using the Adam algorithm.  Arguments: the Adam
    configuration and the gradient on a training element.

Numeric.SGD.AdaDelta

  Config
    AdaDelta configuration, with fields:
      decay: exponential decay parameter
      eps: epsilon value
  adaDelta
    Perform gradient descent using the AdaDelta algorithm.  Arguments: the
    AdaDelta configuration and the gradient on a training element.
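  As a small illustration of how these configurations are meant to be used,
  the sketch below overrides selected fields of the default configurations.
  The concrete values are made up, and the Default instances are assumed to be
  the ones provided via the data-default-class dependency (def).

    module SGDConfigs where

    import Data.Default.Class (def)
    import qualified Numeric.SGD.Adam as Adam
    import qualified Numeric.SGD.Momentum as Mom

    -- Adam with a smaller step size than the default (illustrative value).
    adamCfg :: Adam.Config
    adamCfg = def { Adam.alpha = 0.0005 }

    -- Momentum with a stronger momentum term (illustrative value).
    momCfg :: Mom.Config
    momCfg = def { Mom.gamma = 0.95 }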
Numeric.SGD

  The top-level module re-exports the momentum, adam, and adaDelta methods
  (see Numeric.SGD.Momentum, Numeric.SGD.Adam, and Numeric.SGD.AdaDelta for
  more information) together with the high-level driver functions below.

  Config
    High-level, IO-based SGD configuration, with fields:
      iterNum: number of iterations over the entire training dataset
      batchRandom: should the mini-batch be selected at random?  If not, the
        subsequent training elements are picked sequentially.  Random
        selection gives no guarantee of seeing each training sample in every
        epoch.
      reportEvery: how often the value of the objective function should be
        reported (with 1.0 meaning once per pass over the training data)
  run
    Traverse all the elements in the training data stream in one pass,
    calculate the subsequent gradients, and apply them progressively, starting
    from the initial parameter values.  Arguments: the selected SGD method,
    the training data stream, and the initial parameters.  Consider using
    runIO if your training dataset is large.
  runIO
    Perform SGD in the IO monad, regularly reporting the value of the
    objective function on the entire dataset.  A higher-level wrapper which
    should be convenient to use when the training dataset is large.
    Arguments: the SGD configuration, the selected SGD method, the value of
    the objective function on a sample element (needed for model quality
    reporting), the training dataset, and the initial parameter values.  An
    alternative is to use the simpler function run, or to build a custom SGD
    pipeline based on the lower-level combinators (pipeSeq, pipeRan, result,
    every, etc.).
  pipeSeq
    Pipe the dataset sequentially in a loop.
  pipeRan
    Pipe the dataset randomly in a loop.
  result
    Extract the result of the SGD calculation (the last parameter set flowing
    downstream).  Arguments: a default value (in case the stream is empty) and
    the stream of parameter sets.
  every
    Apply the given function every k parameter sets flowing downstream.
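  To close, a toy sketch of the high-level interface under a few stated
  assumptions: that run consumes a plain list of training elements (its
  "training data stream"), that the gradient function takes the training
  element first and the parameters second, and that the method minimises the
  objective by following the negative gradient.  The data and the use of a
  single Double as the parameter set are made up for illustration.

    import Data.Default.Class (def)
    import qualified Numeric.SGD as SGD
    import qualified Numeric.SGD.Momentum as Mom

    -- Gradient of the toy objective (x - y)^2 with respect to the parameter
    -- x, on a training element y.
    grad :: Double -> Double -> Double
    grad y x = 2 * (x - y)

    main :: IO ()
    main = do
      -- Stream of target values; the objective is minimised around x = 3.
      let ys = take 10000 (cycle [2.0, 4.0])
          -- One pass of momentum SGD over the stream, starting from x = 0.
          x  = SGD.run (Mom.momentum def grad) ys 0.0
      print x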