ansaurus

Question

Smoothing values over time: moving average or something better?

Answer 1

+1 A:

I think that Wikipedia explains these quite well.

High Performance Mark 2010-09-21 13:16:14

Answer 2

+2 A:

Moving average I can get down with ... but it strikes me that it's probably quite inefficient.

There's really no reason a moving average should be inefficient. You keep the number of data points you want in some buffer (like a circular queue). On each new data point, you pop the oldest value and subtract it from a sum, and push the newest and add it to the sum. So every new data point really only entails a pop/push, an addition and a subtraction. Your moving average is always this shifting sum divided by the number of values in your buffer.

It gets a little trickier if you're receiving data concurrently from multiple threads, but since your data is coming from a hardware device that seems highly doubtful to me.

Oh and also: awful self-taught programmers unite! ;)

Dan Tao 2010-09-21 13:20:21

Henry Cooke 2010-09-22 00:54:59

@Henry: For a straight-up moving average you do need the buffer simply so that you know what value gets popped when the next value get pushed. That said, the "fixed-length queue )

Dan Tao 2010-09-22 01:58:36

@Dan, no, I thought you meant a actual high-level array.push() / array.unshift() or something, like an AS3 or Java programmer would do. Forgive me. Arts graduate, remember?

Henry Cooke 2010-09-22 03:09:08

@Henry: I don't know about AS3, but a Java programmer's got collections like [`CircularQueue`](http://www.java2s.com/Code/Java/Collections-Data-Structure/CircularQueue.htm) at his/her disposal (I'm not a Java developer so I'm sure there are better examples out there; that's just what I found from a quick Google search), which implements precisely the functionality we're talking about. I'm fairly confident the majority of medium- and low-level languages with standard libraries have something similar (e.g., in .NET there's `Queue<T>`). Anyway, I was philosophy myself, so... all is forgiven.

Dan Tao 2010-09-22 12:44:17

Answer 3

+6 A:

If you are trying to remove the occasional odd value, a low-pass filter is the best of the three options that you have identified. Low-pass filters allow low-speed changes such as the ones caused by rotating a compass by hand, while rejecting high-speed changes such as the ones caused by bumps on the road, for example.

A moving average will probably not be sufficient, since the effects of a single "blip" in your data will affect several subsequent values, depending on the size of your moving average window.

If the odd values are easily detected, you may even be better off with a glitch-removal algorithm that completely ignores them:

if (abs(thisValue - averageOfLast10Values) > someThreshold)
{
    thisValue = averageOfLast10Values;
}

Here is a guick graph to illustrate:

graph comparison

The first graph is the input signal, with one unpleasant glitch. The second graph shows the effect of a 10-sample moving average. The final graph is a combination of the 10-sample average and the simple glitch detection algorithm shown above. When the glitch is detected, the 10-sample average is used instead of the actual value.

e.James 2010-09-21 13:38:14

Nicely explained, and bonus points for the graph ;)

Henry Cooke 2010-09-22 00:50:42

Answer 4

+5 A:

If your moving average has to be long in order to achieve the required smoothing, and you don't really need any particular shape of kernel, then you're better off if you use an exponentially decaying moving average:

a(i+1) = tiny*data(i+1) + (1.0-tiny)*a(i)

where you choose tiny to be an appropriate constant (e.g. if you choose tiny = 1- 1/N, it will have the same amount of averaging as a window of size N, but distributed differently over older points).

Anyway, since the next value of the moving average depends only on the previous one and your data, you don't have to keep a queue or anything. And you can think of this as doing something like, "Well, I've got a new point, but I don't really trust it, so I'm going to keep 80% of my old estimate of the measurement, and only trust this new data point 20%". That's pretty much the same as saying, "Well, I only trust this new point 20%, and I'll use 4 other points that I trust the same amount", except that instead of explicitly taking the 4 other points, you're assuming that the averaging you did last time was sensible so you can use your previous work.

Rex Kerr 2010-09-21 14:27:32

Good explanation, thanks Rex. Would I be right in thinking that another way of expressing the same thing would be: workingAverage = (newValue*smoothingFactor) + ( workingAverage * ( 1.0 - smoothingFactor) )?

Henry Cooke 2010-09-22 01:05:07

@Henry - Right, that's the way to do it without using any extra storage.

Rex Kerr 2010-09-22 02:49:27

Answer 5

+1 A:

An exponentially decaying moving average can be calculated "by hand" with only the trend if you use the proper values. See http://www.fourmilab.ch/hackdiet/e4/ for an idea on how to do this quickly with a pen and paper if you are looking for “exponentially smoothed moving average with 10% smoothing”. But since you have a computer, you probably want to be doing binary shifting as opposed to decimal shifting ;)

This way, all you need is a variable for your current value and one for the average. The next average can then be calculated from that.

Daren Thomas 2010-09-21 14:39:03

Also, The Hacker's Diet is a good read for anyone looking for common engineering sense in the dieting world of crazies...

Daren Thomas 2010-09-21 14:41:46

Good explanation of EMA, thanks.

Henry Cooke 2010-09-22 00:59:12

ansaurus

tags:

views:

answers:

Smoothing values over time: moving average or something better?

related questions