I am trying to fit a step function using scipy.optimize.leastsq. Consider the following example:

import numpy as np
from scipy.optimize import leastsq

def fitfunc(p, x):
    y = np.zeros(x.shape)
    y[x < p[0]] = p[1]
    y[p[0] < x] = p[2]
    return y

errfunc = lambda p, x, y: fitfunc(p, x) - y # Distance to the target function

x = np.arange(1000)
y = np.random.random(1000)

y[x < 250.] -= 10

p0 = [500.,0.,0.]
p1, success = leastsq(errfunc, p0, args=(x, y))

print p1

The parameters are the location of the step and the levels on either side. What's strange is that the first free parameter never varies; if you run this, SciPy will give

[  5.00000000e+02  -4.49410173e+00   4.88624449e-01]

whereas the first parameter would be optimal at 250 and the second at -10.

Does anyone have any insight as to why this might not be working and how to get it to work?

If I run

print np.sum(errfunc(p1, x, y)**2.)
print np.sum(errfunc([250.,-10.,0.], x, y)**2.)

I find:

12547.1054663
320.679545235

where the first number is the residual for the solution leastsq finds, and the second is the residual for the optimal solution it should be finding.

A: 

Most probably your optimization is stuck in a local minimum. I don't know exactly how leastsq works, but if you give it an initial estimate of (0, 0, 0), it gets stuck there too.

You can check the gradient at the initial estimate numerically (evaluate the cost at +/- epsilon for a very small epsilon, take the difference, and divide by 2*epsilon), and I bet it will be something around 0.
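
A minimal sketch of that check, assuming the errfunc, x and y from your question; cost() and numerical_gradient() are names made up for this example:

def cost(p):
    # Sum of squared residuals for the question's errfunc
    return np.sum(errfunc(p, x, y)**2.)

def numerical_gradient(p, eps=1e-6):
    # Central differences: (cost(p + eps*e_i) - cost(p - eps*e_i)) / (2*eps)
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for i in range(len(p)):
        step = np.zeros_like(p)
        step[i] = eps
        grad[i] = (cost(p + step) - cost(p - step)) / (2. * eps)
    return grad

print(numerical_gradient([500., 0., 0.]))

The first component (for the step location) comes out as exactly 0, because moving it by eps does not change which samples fall on either side of the step; the other components are non-zero.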

bayer
A: 

It turns out that the fitting is much better if I add the epsfcn= argument to leastsq:

p1, success = leastsq(errfunc, p0, args=(x, y), epsfcn=10.)

and the result is

[ 248.00000146   -8.8273455     0.40818216]

My basic understanding is that the first free parameter has to be moved by more than the spacing between neighboring points before it affects the sum of squared residuals, and epsfcn has something to do with how big a step to use when estimating the gradient.
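
For what it's worth, here is a small sketch of that flatness, reusing the errfunc, x and y defined in the question (cost_at_step() is just a throwaway name):

def cost_at_step(x0):
    # Sum of squared residuals as a function of the step location only,
    # with the two levels held fixed
    return np.sum(errfunc([x0, -10., 0.], x, y)**2.)

# Moving the step by less than the sample spacing (1 here) does not change
# which samples fall on which side, so the cost is exactly flat there and a
# small finite-difference step sees no change in that direction.
print(cost_at_step(500.3) == cost_at_step(500.3 + 1e-6))   # True
print(cost_at_step(500.3) == cost_at_step(400.3))          # False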

astrofrog
+1  A: 

I don't think that least-squares fitting is the right way to go about approximating a step. I don't believe it will give you a satisfactory description of the discontinuity; least squares would not be my first thought when attacking this problem.

Why wouldn't you use a Fourier series approximation instead? You'll always be stuck with Gibbs' phenomenon at the discontinuity, but the rest of the function can be approximated as well as you and your CPU can afford.

What exactly are you going to use this for? Some context might help.

duffymo
I have data that suffers from linear drift as a function of time. At some time t0, the drift suddenly jumps and takes on a different slope, and it happens a second time too. So what I really want to fit is three lines over three different ranges. The problem is that the jump times are not known in advance, and I need to do this for thousands of datasets, so I want the times of the jumps and the slopes and intercepts of the lines to be free parameters. I just thought I'd start with a simpler case.
astrofrog
If it's time-dependent data, all the more reason to use Fourier transforms. Maybe an FFT would be more helpful.
duffymo
SciPy has FFT capabilities: http://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.fft.html
duffymo
+1  A: 

I propose approximating the step function. Instead of an infinite slope at the "change point", make it linear over one x unit (1.0 in the example). E.g., if the x parameter, xp, for the function is defined as the midpoint of this ramp, then the value at xp-0.5 is the lower y value, the value at xp+0.5 is the higher y value, and intermediate values of the function in the interval [xp-0.5, xp+0.5] are a linear interpolation between these two points.

If it can be assumed that the step function (or its approximation) goes from a lower value to a higher value, then I think the initial guesses for the last two parameters should be the lowest y value and the highest y value, respectively, instead of 0.0 and 0.0.
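
A rough sketch of what that ramp could look like, reusing the data setup from the question; fitfunc_ramp and errfunc_ramp are made-up names, and np.interp does the linear interpolation over [xp-0.5, xp+0.5]:

import numpy as np
from scipy.optimize import leastsq

def fitfunc_ramp(p, x):
    # p[0]: midpoint of the ramp, p[1]: lower level, p[2]: higher level;
    # constant outside [p[0]-0.5, p[0]+0.5], linear in between
    return np.interp(x, [p[0] - 0.5, p[0] + 0.5], [p[1], p[2]])

errfunc_ramp = lambda p, x, y: fitfunc_ramp(p, x) - y

x = np.arange(1000)
y = np.random.random(1000)
y[x < 250.] -= 10

p0 = [500., y.min(), y.max()]   # levels taken from the data, as suggested above
p1, success = leastsq(errfunc_ramp, p0, args=(x, y))
print(p1)

This gives leastsq a direction in which moving the step location actually changes the residuals, though it may still need a reasonable starting guess (or a wider ramp) to find the jump.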


I have 2 corrections:

1) np.random.random() returns random numbers in the range 0.0 to 1.0. Thus the mean is +0.5, which is also the optimal value of the third parameter (instead of 0.0), and the second parameter is then -9.5 (+0.5 - 10.0) instead of -10.0.

Thus

print np.sum(errfunc([250.,-10.,0.], x, y)**2.)

should be

print np.sum(errfunc([250.,-9.5,0.5], x, y)**2.)

2) In the original fitfunc(), one value of y stays 0.0 if x is exactly equal to p[0]. Thus it is not a step function in that case (it is more like a sum of two step functions). E.g., this happens when the starting value of the first parameter is 500.
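
A small sketch of how the question's fitfunc could avoid that, by making the second assignment inclusive so that every sample gets one of the two levels:

def fitfunc(p, x):
    y = np.zeros(x.shape)
    y[x < p[0]] = p[1]
    y[x >= p[0]] = p[2]   # inclusive, so x == p[0] is no longer left at 0.0
    return y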

Peter Mortensen