
I'm looking for some advice on how to go about implementing Gradient (steepest) Descent in C. I am finding the minimum of f(x)=||Ax-y||^2, with A(n,n) and y(n) given.

This is difficult in C (I think) because computing the gradient, ∇f(x) = [df/dx(1), ..., df/dx(n)], requires calculating partial derivatives.

I just wanted to throw this at SO to get some direction on going about programming this, e.g.:

1) What dimensionality would be best to start with (1,2,...)

2) Advice on how to go about doing the partial derivatives

3) Whether I should implement in an easier language, like python, first -- then translate over to C

4) Etc.

Let me know your thoughts! Thanks in advance

+2  A: 

1) Start in 2D; that way you can plot the path of the descent and actually see your algorithm working.

2) df/dx = (f(x+h)-f(x-h))/(2*h) if evaluating f is cheap, (f(x+h)-f(x))/h if it is expensive. The choice of h should balance truncation error (dominant for large h) and roundoff error (dominant for small h). Typical values are h ~ pow(DBL_EPSILON, 1./3), but the right exponent depends on the difference formula, and ideally there should be a prefactor that depends on f. You can plot the numerical derivative as a function of h on a log scale, for a few sample points in the parameter space; you will then clearly see the range of h that is optimal for the points you are sampling.

3) Yes, whatever you find easier.

4) The hard part is finding a good step size. You may want to use an inner loop here to search for the optimal step (a rough C sketch covering points 2 and 4 follows this list).
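
To make points 2 and 4 concrete, here is a rough C sketch of a central-difference gradient plus a crude inner loop that halves a trial step until the objective actually decreases. The function names (obj, grad_fd, descend_step), the row-major storage of A, and the fixed halving strategy are illustrative assumptions, not anything prescribed by the question:

#include <stdlib.h>
#include <math.h>
#include <float.h>

/* f(x) = ||Ax - y||^2, with A stored row-major as an n*n array */
double obj(const double *A, const double *y, const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        double r = -y[i];
        for (int j = 0; j < n; j++) r += A[i*n + j] * x[j];
        s += r * r;
    }
    return s;
}

/* central difference: df/dx_j ~ (f(x + h e_j) - f(x - h e_j)) / (2h) */
void grad_fd(const double *A, const double *y, double *x, int n, double *g) {
    double h = pow(DBL_EPSILON, 1. / 3);   /* see the discussion of h above */
    for (int j = 0; j < n; j++) {
        double xj = x[j];
        x[j] = xj + h; double fp = obj(A, y, x, n);
        x[j] = xj - h; double fm = obj(A, y, x, n);
        x[j] = xj;
        g[j] = (fp - fm) / (2.0 * h);
    }
}

/* one descent step: halve the trial step until f actually decreases */
void descend_step(const double *A, const double *y, double *x, int n) {
    double *g  = malloc(n * sizeof *g);
    double *xt = malloc(n * sizeof *xt);
    grad_fd(A, y, x, n, g);
    double f0 = obj(A, y, x, n);
    double t = 1.0;
    for (int k = 0; k < 50; k++) {
        for (int j = 0; j < n; j++) xt[j] = x[j] - t * g[j];
        if (obj(A, y, xt, n) < f0) break;
        t *= 0.5;
    }
    for (int j = 0; j < n; j++) x[j] -= t * g[j];
    free(g); free(xt);
}

Plotting f after each call to descend_step (in 2D, as suggested in point 1) is an easy way to check that the iterates really are heading downhill.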

Edgar Bonet
A: 

1) I'd start with a simple 1D example, and then go to 2D once I'm sure that works.

2) As you know the objective function beforehand, maybe you can supply an analytical gradient as well. If possible, that is (almost) always better than resorting to numerical derivatives (see the sketch after this list).

3) By all means.

4) Presumably steepest descent is only the first step; after that, maybe look into something like CG or BFGS.
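
To make point 2 concrete for this particular objective: for f(x) = ||Ax - y||^2 the gradient works out to 2*A'*(Ax - y) (derived in the answer below), which is cheap to code directly. A minimal sketch, again assuming A is stored row-major and with names of my own choosing:

#include <stdlib.h>

/* gradient of f(x) = ||Ax - y||^2:  g = 2 * A' * (A x - y) */
void grad_analytic(const double *A, const double *y, const double *x,
                   int n, double *g) {
    double *r = malloc(n * sizeof *r);
    for (int i = 0; i < n; i++) {            /* residual r = A x - y */
        r[i] = -y[i];
        for (int j = 0; j < n; j++) r[i] += A[i*n + j] * x[j];
    }
    for (int j = 0; j < n; j++) {            /* g = 2 * A' * r */
        g[j] = 0.0;
        for (int i = 0; i < n; i++) g[j] += 2.0 * A[i*n + j] * r[i];
    }
    free(r);
}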

janneb
A: 

I am finding the minimum of f(x)=||Ax-y||^2, with A(n,n) and y(n) given.

This problem is known as least squares, and you are doing unconstrained optimization. Writing a finite-difference gradient descent solver in C is not the right approach at all. First of all, you can easily calculate the derivative analytically, so there is no reason to use finite differences. Also, the problem is convex, which makes it even easier.

(Let A' denote the transpose of A)

d/dx ||Ax - y||^2 = 2*A'*(Ax - y)

Since this is a convex problem, we know the global minimum occurs where the derivative is 0:

0 = 2*A'(Ax - y)
A'y = A'Ax
inverse(A'A)*A'y = x

A'A is invertible provided A has full rank (it is then positive definite), so the problem reduces to computing this inverse, which is O(n^3). That said, there are libraries to do least squares in both C and Python, so you should probably just use them instead of writing your own code.
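
If you do go the library route in C, one possibility (assuming the LAPACKE C interface to LAPACK is available on your system) is the dgels driver, which solves the least-squares problem via a QR factorization without ever forming A'A. A minimal sketch; on success the first n entries of y are overwritten with the solution x:

#include <lapacke.h>

/* Solve min_x ||A x - y||^2 for a dense n-by-n A stored row-major.
 * A and y are overwritten; on return y[0..n-1] holds x. */
int solve_least_squares(double *A, double *y, int n) {
    lapack_int info = LAPACKE_dgels(LAPACK_ROW_MAJOR, 'N',
                                    n, n, 1,   /* m, n, nrhs  */
                                    A, n,      /* lda         */
                                    y, 1);     /* ldb = nrhs  */
    return (int)info;                          /* 0 on success */
}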

anonymous_21321
Oh yeah, I thought it was clear from my question that this was a homework assignment. Gotta do my own in C, and at each iteration I have to minimize along the descent direction given by the gradient. Thanks, though!
dougvk
OK, I see; well, in that case use the analytic gradient I showed rather than a finite-difference estimate.
anonymous_21321