nonlinear-optimization- Various iterative algorithms for optimization of nonlinear functions.





This module implements the algorithms described by Hager and Zhang [1]. We use bindings to CG_DESCENT library by the same authors, version 3.0 from 18/05/2008 [2]. The library code is also licensed under the terms of the GPL.


Main function

Please pay close attention to the types of Vectors and MVetors being used below. They may come from Data.Vector.Generic/Data.Vector.Generic.Mutable or from Data.Vector.Storable/Data.Vector.Storable.Mutable. The rule of thumb is that input pure vectors are Generic and everything else is Storable.



:: Vector v Double 
=> Parameters

How should we optimize.

-> Double

grad_tol, see stopRules.

-> v Double

Initial guess.

-> Function t1

Function to be minimized.

-> Gradient t2

Gradient of the function.

-> Maybe (Combined t3)

(Optional) Combined function computing both the function and its gradient.

-> IO (Vector Double, Result, Statistics) 

Run the CG_DESCENT optimizer and try to minimize the function.

User-defined function types

data Function t whereSource

Function calculating the value of the objective function f at a point x.


VFunction :: Vector v Double => (v Double -> Double) -> Function Simple 
MFunction :: (forall m. (PrimMonad m, Functor m) => PointMVector m -> m Double) -> Function Mutable 

data Gradient t whereSource

Function calculating the value of the gradient of the objective function f at a point x.

The MGradient constructor uses a function receiving as parameters the point x being evaluated (should not be modified) and the vector where the gradient should be written.


VGradient :: Vector v Double => (v Double -> v Double) -> Gradient Simple 
MGradient :: (forall m. (PrimMonad m, Functor m) => PointMVector m -> GradientMVector m -> m ()) -> Gradient Mutable 

data Combined t whereSource

Function calculating both the value of the objective function f and its gradient at a point x.


VCombined :: Vector v Double => (v Double -> (Double, v Double)) -> Combined Simple 
MCombined :: (forall m. (PrimMonad m, Functor m) => PointMVector m -> GradientMVector m -> m Double) -> Combined Mutable 

type PointMVector m = MVector (PrimState m) DoubleSource

Mutable vector representing the point where the function/gradient is begin evaluated. This vector should not be modified.

type GradientMVector m = MVector (PrimState m) DoubleSource

Mutable vector representing where the gradient should be written.

Kinds of function types

data Simple Source

Phantom type for simple pure functions.

data Mutable Source

Phantom type for functions using mutable data.

Result and statistics

data Result Source



Convergence tolerance was satisfied.


Change in function value was less than funcEpsilon * |f|.


Total iterations exceeded maxItersFac * n.


Slope was always negative in line search.


Number of secant iterations exceed nsecant.


Search direction not a descent direction.


Line search fails in initial interval.


Line search fails during bisection.


Line search fails during interval update.


Debug tolerance was on and the test failed (see debugTol).


Function value became NaN.


Initial function value was NaN.

data Statistics Source

Statistics given after the process finishes.




finalValue :: Double

Value of the function at the solution.

gradNorm :: Double

Maximum absolute component of the gradient at the solution.

totalIters :: CInt

Total number of iterations.

funcEvals :: CInt

Total number of function evaluations.

gradEvals :: CInt

Total number of gradient evaluations.


defaultParameters :: ParametersSource

Default parameters. See the documentation for Parameters and TechParameters to see what are the defaults.

data Parameters Source

Parameters given to the optimizer.




printFinal :: Bool

Print final statistics to stdout. Defaults to True.

printParams :: Bool

Print parameters to stdout before starting. Defaults to False

verbose :: Verbose

How verbose we should be while computing. Everything is printed to stdout. Defaults to Quiet.

lineSearch :: LineSearch

What kind of line search should be used. Defaults to AutoSwitch 1e-3.

qdecay :: Double

Factor in [0, 1] used to compute average cost magnitude C_k as follows:

 Q_k = 1 + (qdecay)Q_{k-1},   Q_0 = 0
 C_k = C_{k-1} + (|f_k| - C_{k-1})/Q_k

Defaults to 0.7.

stopRules :: StopRules

Stop rules that define when the iterations should end. Defaults to DefaultStopRule 0.

estimateError :: EstimateError

How to calculate the estimated error in the function value. Defaults to RelativeEpsilon 1e-6.

quadraticStep :: Maybe Double

When to attempt quadratic interpolation in line search. If Nothing then never try a quadratic interpolation step. If Just cutoff, then attemp quadratic interpolation in line search when |f_{k+1} - f_k| / f_k <= cutoff. Defaults to Just 1e-12.

debugTol :: Maybe Double

If Just tol, then always check that f_{k+1} - f_k <= tol * C_k. Otherwise, if Nothing then no checking of function values is done. Defaults to Nothing.

initialStep :: Maybe Double

If Just step, then use step as the initial step of the line search. Otherwise, if Nothing then the initial step is programatically calculated. Defaults to Nothing.

maxItersFac :: Double

Defines the maximum number of iterations. The process is aborted when maxItersFac * n iterations are done, where n is the number of dimensions. Defaults to infinity.

nexpand :: CInt

Maximum number of times the bracketing interval grows or shrinks in the line search. Defaults to 50.

nsecant :: CInt

Maximum number of secant iterations in line search. Defaults to 50.

restartFac :: Double

Restart the conjugate gradient method after restartFac * n iterations. Defaults to 1.

funcEpsilon :: Double

Stop when -alpha * dphi0, the estimated change in function value, is less than funcEpsilon * |f|. Defaults to 0.

nanRho :: Double

After encountering NaN while calculating the step length, growth factor when searching for a bracketing interval. Defaults to 1.3.

techParameters :: TechParameters

Technical parameters which you probably should not touch.

data Verbose Source

How verbose we should be.



Do not output anything to stdout, which most of the time is good.


Print what work is being done on each iteraction.


Print information about every step, may be useful for troubleshooting.

data LineSearch Source

Line search methods that may be used.



Use approximate Wolfe line search.

AutoSwitch Double

Use ordinary Wolfe line search, switch to approximate Wolfe when

 |f_{k+1} - f_k| < AWolfeFac * C_k

where C_k is the average size of cost and AWolfeFac is the parameter to this constructor.

data StopRules Source

Stop rules used to decided when to stop iterating.


DefaultStopRule Double

DefaultStopRule stop_fac stops when

 |g_k|_infty <= max(grad_tol, |g_0|_infty * stop_fac)

where |g_i|_infty is the maximum absolute component of the gradient at the i-th step.


AlternativeStopRule stops when

 |g_k|_infty <= grad_tol * (1 + |f_k|)

data EstimateError Source

How to calculate the estimated error in the function value.


AbsoluteEpsilon Double

AbsoluteEpsilon eps estimates the error as eps.

RelativeEpsilon Double

RelativeEpsilon eps estimates the error as eps * C_k.

Technical parameters

data TechParameters Source

Technical parameters which you probably should not touch. You should read the papers of CG_DESCENT to understand how you can tune these parameters.




techDelta :: Double

Wolfe line search parameter. Defaults to 0.1.

techSigma :: Double

Wolfe line search parameter. Defaults to 0.9.

techGamma :: Double

Decay factor for bracket interval width. Defaults to 0.66.

techRho :: Double

Growth factor when searching for initial bracketing interval. Defaults to 5.

techEta :: Double

Lower bound for the conjugate gradient update parameter beta_k is techEta * ||d||_2. Defaults to 0.01.

techPsi0 :: Double

Factor used in starting guess for iteration 1. Defaults to 0.01.

techPsi1 :: Double

In performing a QuadStep, we evaluate the function at psi1 * previous step. Defaults to 0.1.

techPsi2 :: Double

When starting a new CG iteration, our initial guess for the line search stepsize is psi2 * previous step. Defaults to 2.