Find the "peak" of a set of data.

tags:

normal-distribution

+1 Q:

I have a set of data for which I'd like to find an average peak. I've done some testing in Numbers.app to see what I'm after: if I make a chart of the dataset, it has a feature called "polynomial trendline" which draws a curve through the data, and the peak of that curve looks exactly like the point/value I'm after.

So how could I programmatically calculate that curve and find that tangent on the curve?

I've been looking around on Wikipedia and found topics like "Normal distribution" and "Polynomial regression" which seem very much related, but I've always found it hard to follow the equations on Wikipedia, so I'm hoping someone here could give me a programmatic example.

Here's a couple of charts to illustrate what I'm after. The green dots are the data points and the blue line is the "polynomial trendline" (of order 6). ~~The "peak" of that trendline is what I'm after.~~

[Figure: example with even dataset] [Figure: example with uneven dataset]

Updated question:

After some answers I realize my question needs to be rephrased, as the problem is not really how to find the peak of the curve but rather how to generate that blue curve from the green points so I can find where in the dataset the "weight" lies. The goal is to get a sort of 'average maximum'.

I guess another question would be "what is this particular problem actually called?" ;)
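
For reference, here is a minimal sketch of how a curve like the blue trendline could be generated with least-squares polynomial regression. How Numbers.app computes its trendline is not stated in this thread, so this is an assumption, and the names `polyFit`, `polyEval` and `curvePeak` are made up for illustration. It solves the normal equations with Gaussian elimination and then samples the fitted curve to find its highest point. It assumes there are more distinct x values than the polynomial order.

```javascript
// Least-squares polynomial fit: returns coefficients [c0, c1, ..., cn]
// so that y is approximately c0 + c1*x + ... + cn*x^n.
function polyFit(xs, ys, degree) {
  const n = degree + 1;
  // Build the normal equations (A^T A) c = A^T y as an augmented matrix.
  const m = [];
  for (let r = 0; r < n; r++) {
    m.push(new Array(n + 1).fill(0));
  }
  for (let i = 0; i < xs.length; i++) {
    for (let r = 0; r < n; r++) {
      for (let c = 0; c < n; c++) {
        m[r][c] += Math.pow(xs[i], r + c);
      }
      m[r][n] += ys[i] * Math.pow(xs[i], r);
    }
  }
  // Gaussian elimination with partial pivoting.
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
    }
    const tmp = m[col];
    m[col] = m[pivot];
    m[pivot] = tmp;
    for (let r = col + 1; r < n; r++) {
      const f = m[r][col] / m[col][col];
      for (let c = col; c <= n; c++) m[r][c] -= f * m[col][c];
    }
  }
  // Back substitution.
  const coeffs = new Array(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let sum = m[r][n];
    for (let c = r + 1; c < n; c++) sum -= m[r][c] * coeffs[c];
    coeffs[r] = sum / m[r][r];
  }
  return coeffs;
}

// Evaluate the polynomial at x using Horner's method.
function polyEval(coeffs, x) {
  let y = 0;
  for (let i = coeffs.length - 1; i >= 0; i--) y = y * x + coeffs[i];
  return y;
}

// Sample the fitted curve at steps + 1 evenly spaced x values between
// xMin and xMax and return the sampled point with the largest y.
function curvePeak(coeffs, xMin, xMax, steps) {
  let best = { x: xMin, y: polyEval(coeffs, xMin) };
  for (let i = 1; i <= steps; i++) {
    const x = xMin + ((xMax - xMin) * i) / steps;
    const y = polyEval(coeffs, x);
    if (y > best.y) best = { x: x, y: y };
  }
  return best;
}

// Example with made-up data and an order-2 fit.
const exampleCoeffs = polyFit([0, 1, 2, 3, 4, 5, 6], [0, 5, 8, 9, 8, 5, 0], 2);
console.log(curvePeak(exampleCoeffs, 0, 6, 600));
```

For higher orders such as the 6 used in the charts, the normal equations can become numerically ill-conditioned, so rescaling x to a small range (for example 0 to 1) before fitting is probably a good idea.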

Derivative is equal to zero at peaks.

Andrey 2010-08-18 12:39:13

Ah right, that's another term I've forgotten since school. It's also zero at 'valleys' if I recall. But running a max(d1,d2,d3) would find me the perfect point. But now I just need to figure out how to make that curve to find the derivative on. ;)

Robert Sköld 2010-08-18 12:45:23

@Robert Sköld you should refresh your math (calculus, actually). Numerically, the derivative at point x can be approximated as f(x + 1) - f(x), so if you have the points 1 2 3 3 2 1, the derivatives will be 1 1 0 -1 -1. Then yes, find the maximum.

Andrey 2010-08-18 14:35:31
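
A minimal sketch of the forward difference described in the comment above, assuming the points are evenly spaced so that f(x + 1) - f(x) is simply the difference between neighbouring values. The function names are made up for illustration. `peakIndices` reports the positions where the values stop rising, which is where the difference goes from positive to zero or negative.

```javascript
// Forward difference: d[i] = ys[i + 1] - ys[i], assuming evenly spaced x.
function forwardDifferences(ys) {
  const d = [];
  for (let i = 0; i + 1 < ys.length; i++) d.push(ys[i + 1] - ys[i]);
  return d;
}

// Indices i in ys where the slope before i is positive and the slope
// after i is zero or negative, i.e. where the values stop rising.
function peakIndices(ys) {
  const d = forwardDifferences(ys);
  const peaks = [];
  for (let i = 1; i < d.length; i++) {
    if (d[i - 1] > 0 && d[i] <= 0) peaks.push(i);
  }
  return peaks;
}

console.log(forwardDifferences([1, 2, 3, 3, 2, 1])); // [1, 1, 0, -1, -1]
```

On noisy data this will report every small bump as a peak, so it probably only makes sense on an already smoothed curve.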

Let's say you are plotting Y vs X. You already have the values of Y corresponding to each X. Let Y(X1) mean the value of Y when X=X1.

Set a variable max to the first value of Y (starting from 0 would fail if every Y is negative). Then go through the value of Y at each X. If Y(X) > max, set max = Y(X). Once you have gone through all the Ys, max will hold the peak value of Y.

E.g., in your example, just go through all the green dots and find the maximum of them. That would be the peak, right? Let me know if that's what you wanted. Which programming language are you using? You don't need to go into distributions and stuff just to get the peak.

Raze2dust 2010-08-18 12:45:09
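
The scan described above could look roughly like this in JavaScript. `maxPoint` is a made-up name, and the data is assumed to be a non-empty array of Y values in X order.

```javascript
// Return the index and value of the largest y.
// Starts from the first element so that all-negative data works too.
function maxPoint(ys) {
  let best = 0;
  for (let i = 1; i < ys.length; i++) {
    if (ys[i] > ys[best]) best = i;
  }
  return { index: best, value: ys[best] };
}
```

If several points share the maximum value, this returns the first one.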

Updated my question a bit, but as you can see in the second image, the target would be right in between two "maximums", which is why I'd like the peak to be more 'weighted' (or whatever the correct term is) and why that trendline seems proper. And the programming language will be JavaScript in the end...

Robert Sköld 2010-08-18 13:04:32

As you speak of normal distributions, and seem to be able to fit data to a function, you should fit to a normal distribution, which has parameters µ and σ, respectively the mean and standard deviation of the distribution (see the first formula on the wiki page).

Fit this function to your data, and the peak will be at the mean value, given by µ.

rubenvb 2010-08-18 12:45:47
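
One simple way to estimate µ and σ, sketched here under the assumption that each Y value can be treated as a non-negative weight (like a count) for its X value. This is a weighted mean and weighted standard deviation rather than a full curve fit, and `estimateNormal` is a made-up name.

```javascript
// Treat each y as the weight of its x and compute the weighted mean (mu)
// and weighted standard deviation (sigma). Assumes all ys are >= 0 and
// that their sum is > 0.
function estimateNormal(xs, ys) {
  let total = 0;
  let weightedSum = 0;
  for (let i = 0; i < xs.length; i++) {
    total += ys[i];
    weightedSum += xs[i] * ys[i];
  }
  const mu = weightedSum / total;
  let variance = 0;
  for (let i = 0; i < xs.length; i++) {
    variance += ys[i] * (xs[i] - mu) * (xs[i] - mu);
  }
  return { mu: mu, sigma: Math.sqrt(variance / total) };
}
```

With two equal maxima on either side of the centre, µ lands between them, which seems close to the 'weighted' peak asked for in the question.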

+2 A:

Although the data looks like one, you're not necessarily after a normal distribution.

The topic of distribution fitting is quite complex and, unless you have some clear a priori assumptions about what your data distribution is, I would not venture there. If you do have assumptions about the type of distribution, have a look at least squares or maximum likelihood estimation methods.

However, I would suggest you rather use a Bézier spline or LOESS to "smooth" your data and then just find the maximum of the computed curve.

I doubt that an approach using the derivative would work here.

nico 2010-08-18 12:48:31
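
A rough sketch of the smoothing idea above: local linear regression with tricube weights, which is the core step of LOESS without the robustness iterations a full implementation would add. The name `loessSmooth` and the `span` parameter (the fraction of points used around each x) are assumptions made for illustration.

```javascript
// Local linear regression with tricube weights, evaluated at every x.
// `span` is the fraction of the points used for each local fit, e.g. 0.5.
function loessSmooth(xs, ys, span) {
  const n = xs.length;
  const k = Math.min(n, Math.max(2, Math.ceil(span * n)));
  const smoothed = [];
  for (let i = 0; i < n; i++) {
    const x0 = xs[i];
    // The distance to the k-th nearest point sets the window width h.
    const dists = xs.map(function (x) { return Math.abs(x - x0); });
    const sorted = dists.slice().sort(function (a, b) { return a - b; });
    const h = sorted[k - 1] || 1; // avoid dividing by zero
    let sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
    for (let j = 0; j < n; j++) {
      const u = dists[j] / h;
      if (u >= 1) continue; // outside the window: weight is zero
      const w = Math.pow(1 - u * u * u, 3); // tricube weight
      sw += w;
      swx += w * xs[j];
      swy += w * ys[j];
      swxx += w * xs[j] * xs[j];
      swxy += w * xs[j] * ys[j];
    }
    const denom = sw * swxx - swx * swx;
    if (Math.abs(denom) < 1e-12) {
      // Too few distinct points for a line: fall back to the weighted mean.
      smoothed.push(swy / sw);
    } else {
      // Weighted least-squares line, evaluated at x0.
      const slope = (sw * swxy - swx * swy) / denom;
      const intercept = (swy - slope * swx) / sw;
      smoothed.push(intercept + slope * x0);
    }
  }
  return smoothed;
}
```

The peak would then be the largest of the smoothed values. A larger `span` gives a smoother curve and merges nearby maxima; which value suits the data is something to experiment with.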

Also, have a look at this: http://stats.stackexchange.com/questions/1315/how-do-i-figure-out-what-kind-of-distribution-this-is

nico 2010-08-18 12:51:20

Thanks, looks like interesting links!

Robert Sköld 2010-08-18 13:12:18

You could start with calculating the mean and standard deviation/variance. This would tell you some information about the distribution.

I don't think you'll be able to solve the problem for an arbitrary data set. So you would need to have some common characteristic behavior.

After all, fitting a curve can be somewhat arbitrary depending upon the method - it needs to be chosen appropriately for your problem domain - perhaps there needs to be some weighting or data cleansing to throw out outlying values first.

Cade Roux 2010-08-18 13:04:25

True, I will update my question with some example datasets (and maybe more of an explanation of my problem).

Robert Sköld 2010-08-18 13:13:16
