Gradient descent is an optimization algorithm used in machine learning to find the parameter values (coefficients) of a function that minimize a cost function. You start by initializing the parameters, and from there gradient descent uses calculus to iteratively adjust them in the direction that reduces the cost function.
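The update loop described above can be sketched in a few lines. This is a minimal illustration, not a production optimizer; the example cost function f(x) = (x - 3)^2, the learning rate, and the step count are all assumptions chosen for demonstration.

```python
# Minimal gradient descent sketch: repeatedly step against the gradient.
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Iteratively adjust x to minimize a cost whose gradient is `grad`."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # move opposite the gradient
    return x

# Example (assumed for illustration): minimize f(x) = (x - 3)^2,
# whose gradient is 2 * (x - 3); the minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))
```

With a learning rate of 0.1 the iterate contracts toward the minimizer at each step, so after 100 steps the result is very close to 3.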

We can interpret this as saying that the gradient, ∇f(a), has enough information to find the derivative in any direction.

Steepest ascent. The gradient ∇f(a) is a vector in a certain direction. Let u be any direction, that is, any unit vector, and let θ be the angle between the vectors ∇f(a) and u. Now, we may conclude that the directional derivative is

    D_u f(a) = ∇f(a) · u = |∇f(a)| cos θ,

which is largest when θ = 0, that is, when u points in the same direction as ∇f(a).
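The identity D_u f(a) = ∇f(a) · u = |∇f(a)| cos θ can be checked numerically. In this sketch the function f(x, y) = x^2 + y^2 and the point a = (1, 2) are assumptions chosen for illustration.

```python
import math

# Check that the directional derivative along the gradient direction
# equals |∇f(a)| (theta = 0), and is larger than in any other direction.
def grad_f(x, y):
    # Gradient of the assumed example f(x, y) = x^2 + y^2.
    return (2 * x, 2 * y)

a = (1.0, 2.0)
g = grad_f(*a)
g_norm = math.hypot(*g)

# Unit vector in the direction of steepest ascent (aligned with ∇f(a)).
u_steepest = (g[0] / g_norm, g[1] / g_norm)
d_steepest = g[0] * u_steepest[0] + g[1] * u_steepest[1]
print(round(d_steepest, 6) == round(g_norm, 6))  # cos(0) = 1

# A different unit vector gives a strictly smaller directional derivative.
u_other = (1.0, 0.0)
d_other = g[0] * u_other[0] + g[1] * u_other[1]
print(d_other < d_steepest)
```

Both checks print True: the dot product with the normalized gradient recovers |∇f(a)|, and any other unit direction yields a smaller value, which is exactly why the gradient is the direction of steepest ascent.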