# Nesterov's Optimal Gradient method

| Signature | Description |
| --- | --- |
| `optimalGradient :: (Additive f, Functor f, Ord a, Floating a, Epsilon a)` | |
| `=> a` | condition number |
| `-> a` | Lipschitz constant |
| `-> (f a -> f a)` | gradient of function |
| `-> a` | initial step size |
| `-> f a` | starting point |
| `-> [f a]` | iterates |

`optimalGradient kappa l df alpha0 x0` is Nesterov's optimal gradient method, first described by Nesterov in 1983. This method requires knowledge of the Lipschitz constant `l` of the gradient, the condition number `kappa`, and an initial step size `alpha0` in `(0,1)`.
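To show how these parameters fit together, here is a minimal sketch of the textbook constant-step form of Nesterov's method, written on plain lists of `Double` rather than an `Additive` functor. It is not the library's implementation. The names `nesterovSketch`, `Vec`, and `gradQuad` are hypothetical, and the sketch assumes that `kappa` is the ratio of the Lipschitz constant to the strong-convexity constant, so that the scheme's parameter is `q = 1 / kappa`.

```haskell
module Main where

-- Hypothetical stand-in for the library's 'f a' vector type.
type Vec = [Double]

-- Sketch of Nesterov's constant-step scheme (not the library's code).
-- Assumption: kappa = L / mu, so q = 1 / kappa.
nesterovSketch
  :: Double        -- condition number (kappa)
  -> Double        -- Lipschitz constant of the gradient (l)
  -> (Vec -> Vec)  -- gradient of the function
  -> Double        -- initial step size alpha0 in (0,1)
  -> Vec           -- starting point
  -> [Vec]         -- infinite list of iterates, starting with x0
nesterovSketch kappa l df alpha0 x0 = go alpha0 x0 x0
  where
    q = 1 / kappa
    go alpha x y = x : go alpha' x' y'
      where
        -- Gradient step of length 1/l taken from the extrapolated point y.
        x' = zipWith (\yi gi -> yi - gi / l) y (df y)
        -- Positive root of a^2 = (1 - a) * alpha^2 + q * a.
        b = alpha * alpha - q
        alpha' = (sqrt (b * b + 4 * alpha * alpha) - b) / 2
        -- Momentum coefficient, then extrapolation past the new iterate.
        beta = alpha * (1 - alpha) / (alpha * alpha + alpha')
        y' = zipWith (\xn xo -> xn + beta * (xn - xo)) x' x

-- Gradient of the example objective f(x, y) = x^2 + 5 * y^2,
-- for which mu = 2, L = 10, and hence kappa = 5.
gradQuad :: Vec -> Vec
gradQuad = zipWith (*) [2, 10]

main :: IO ()
main = print (nesterovSketch 5 10 gradQuad 0.5 [1, 1] !! 50)
```

Because the result is a lazy infinite list, the caller decides when to stop, for example by indexing with `!!` as above or by using `takeWhile` on the gradient norm. The example picks `alpha0 = 0.5`, which lies in `(0,1)` and is at least `sqrt (1 / kappa)` for this objective.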