
I'm looking for some advice on how to go about implementing Gradient (steepest) Descent in C. I am finding the minimum of f(x)=||Ax-y||^2, with A(n,n) and y(n) given.

This is difficult in C (I think) because computing the gradient, ∇f(x) = [df/dx(1), ..., df/dx(n)], requires calculating partial derivatives.

I just wanted to throw this at SO to get some direction on going about programming this, e.g.:

1) What dimensionality would be best to start with (1,2,...)

2) Advice on how to go about doing the partial derivatives

3) Whether I should implement in an easier language, like python, first -- then translate over to C

4) Etc.

Let me know your thoughts! Thanks in advance

+2  A: 

1) Start in 2D; that way you can plot the path of the descent and actually see your algorithm working.

2) df/dx = (f(x+h)-f(x-h))/(2*h) if evaluating f is cheap, (f(x+h)-f(x))/h if it is expensive. The choice of h should balance truncation error (dominant for large h) and roundoff error (dominant for small h). Typical values are h ~ pow(DBL_EPSILON, 1./3), but the right exponent depends on the difference formula, and ideally there should be a prefactor that depends on f. You can plot the numerical derivative as a function of h on a log scale, for a few sample points in the parameter space; you will then clearly see the range of h that is optimal for the points you are sampling.

3) Yes, whatever you find easier.

4) The hard part is finding a good step size. You may want to use an inner loop here to search for the optimal step (a rough C sketch covering points 2 and 4 follows this list).
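
To make points 2 and 4 concrete, here is a rough C sketch of a central-difference gradient plus a crude inner loop that halves a trial step until the objective actually decreases. The function names (obj, grad_fd, descend_step), the row-major storage of A, and the fixed halving strategy are illustrative assumptions, not anything prescribed by the question:

#include <stdlib.h>
#include <math.h>
#include <float.h>

/* f(x) = ||Ax - y||^2, with A stored row-major as an n*n array */
double obj(const double *A, const double *y, const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        double r = -y[i];
        for (int j = 0; j < n; j++) r += A[i*n + j] * x[j];
        s += r * r;
    }
    return s;
}

/* central difference: df/dx_j ~ (f(x + h e_j) - f(x - h e_j)) / (2h) */
void grad_fd(const double *A, const double *y, double *x, int n, double *g) {
    double h = pow(DBL_EPSILON, 1. / 3);   /* see the discussion of h above */
    for (int j = 0; j < n; j++) {
        double xj = x[j];
        x[j] = xj + h; double fp = obj(A, y, x, n);
        x[j] = xj - h; double fm = obj(A, y, x, n);
        x[j] = xj;
        g[j] = (fp - fm) / (2.0 * h);
    }
}

/* one descent step: halve the trial step until f actually decreases */
void descend_step(const double *A, const double *y, double *x, int n) {
    double *g  = malloc(n * sizeof *g);
    double *xt = malloc(n * sizeof *xt);
    grad_fd(A, y, x, n, g);
    double f0 = obj(A, y, x, n);
    double t = 1.0;
    for (int k = 0; k < 50; k++) {
        for (int j = 0; j < n; j++) xt[j] = x[j] - t * g[j];
        if (obj(A, y, xt, n) < f0) break;
        t *= 0.5;
    }
    for (int j = 0; j < n; j++) x[j] -= t * g[j];
    free(g); free(xt);
}

Plotting f after each call to descend_step (in 2D, as suggested in point 1) is an easy way to check that the iterates really are heading downhill.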

Edgar Bonet
A: 

1) I'd start with a simple 1D example, and then go to 2D once I'm sure that works.

2) As you know the objective function beforehand, maybe you can supply an analytical gradient as well. If possible, that is (almost) always better than resorting to numerical derivatives (see the sketch after this list).

3) By all means.

4) Presumably steepest descent is only the first step; after that, maybe look into something like CG or BFGS.
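
To make point 2 concrete for this particular objective: for f(x) = ||Ax - y||^2 the gradient works out to 2*A'*(Ax - y) (derived in the answer below), which is cheap to code directly. A minimal sketch, again assuming A is stored row-major and with names of my own choosing:

#include <stdlib.h>

/* gradient of f(x) = ||Ax - y||^2:  g = 2 * A' * (A x - y) */
void grad_analytic(const double *A, const double *y, const double *x,
                   int n, double *g) {
    double *r = malloc(n * sizeof *r);
    for (int i = 0; i < n; i++) {            /* residual r = A x - y */
        r[i] = -y[i];
        for (int j = 0; j < n; j++) r[i] += A[i*n + j] * x[j];
    }
    for (int j = 0; j < n; j++) {            /* g = 2 * A' * r */
        g[j] = 0.0;
        for (int i = 0; i < n; i++) g[j] += 2.0 * A[i*n + j] * r[i];
    }
    free(r);
}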

janneb
A: 

I am finding the minimum of f(x)=||Ax-y||^2, with A(n,n) and y(n) given.

This problem is known as least squares, and you are doing unconstrained optimization. Writing a finite-difference gradient descent solver in C is not the right approach at all. First of all, you can easily calculate the derivative analytically, so there is no reason to use finite differences. Also, the problem is convex, which makes it even easier.

(Let A' denote the transpose of A)

d/dx ||Ax - y||^2 = 2*A'*(Ax - y)

Since this is a convex problem, we know the global minimum occurs where the derivative is 0:

0 = 2*A'(Ax - y)
A'y = A'Ax
inverse(A'A)*A'y = x

A'A is invertible provided A has full rank (it is then positive definite), so the problem reduces to computing this inverse, which is O(n^3). That said, there are libraries to do least squares in both C and Python, so you should probably just use them instead of writing your own code.
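
If you do go the library route in C, one possibility (assuming the LAPACKE C interface to LAPACK is available on your system) is the dgels driver, which solves the least-squares problem via a QR factorization without ever forming A'A. A minimal sketch; on success the first n entries of y are overwritten with the solution x:

#include <lapacke.h>

/* Solve min_x ||A x - y||^2 for a dense n-by-n A stored row-major.
 * A and y are overwritten; on return y[0..n-1] holds x. */
int solve_least_squares(double *A, double *y, int n) {
    lapack_int info = LAPACKE_dgels(LAPACK_ROW_MAJOR, 'N',
                                    n, n, 1,   /* m, n, nrhs  */
                                    A, n,      /* lda         */
                                    y, 1);     /* ldb = nrhs  */
    return (int)info;                          /* 0 on success */
}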

anonymous_21321
Oh yeah, I thought it was clear from my question that this was a homework assignment. Gotta do my own in C, and at each iteration I have to minimize along the descent direction given by the gradient. Thanks, though!
dougvk
OK, I see; well, in that case use the analytic gradient I showed rather than a finite-difference estimate.
anonymous_21321