tags:

views:

591

answers:

6

I needed a Low-Pass-Filter in one of my Scala projects and came up with this:

def filter(numbers: Seq[Double], filterSize: Int): Seq[Double] = {
  assert(filterSize > 0)
  val ringBuffer = new Array[Double](filterSize)
  var ringBufferIndex = 0

  numbers.map(x => {
    // update ring buffer
    ringBuffer(ringBufferIndex) = x

    // increase ring index
    ringBufferIndex += 1 
    if (ringBufferIndex == filterSize) {
      ringBufferIndex = 0
    }

    // get avarage
    ringBuffer.foldLeft(0.0)(_ + _) / filterSize
  })
}

However, there are some things I don't like about it:

  • It uses map (nicely functional) but needs a mutable variable (ringBufferIndex - BAD).
  • It work's on Seq[Double] (which is fine), but returns Seq[Double], which is bad cause it requires the caller to call ".toList" or whatever he actually uses. I tried using Generics here like this:

    def filter[T <% Seq[Double]](numbers: T, filterSize: Int): T

but that would not compile

Does anyone have suggestionts how to improve those two issues?

A: 

Here is a shorter version that I have come up with to tackle the first problem:

  def filter(numbers: Seq[Double], filterSize: Int): Seq[Double] = {
    assert(filterSize > 0)
    (0 until numbers.length).map(x => {
      (((x - filterSize) max 0) to x).foldLeft(0.0)((sum, index) => sum + numbers(index)) / filterSize
    })
  }

It's downside is the index lookup which could be very bad for things like "List".

Lemmy
+1  A: 

If the input can be a List rather than Seq, you can clean it up a bit with zipWithIndex:

def filter(numbers: List[Double], filterSize: Int): List[Double] = {
  require(filterSize > 0)
  val ringBuffer = new Array[Double](filterSize)
  numbers.zipWithIndex.map(pair => {
    // update ring buffer
    ringBuffer(pair._2 % filterSize) = pair._1
    // get avarage
    ringBuffer.foldLeft(0.0)(_ + _) / filterSize
  })
}

Note that the return value is also now List and I replaced assert with require.

Dave Ray
didn't know about zipWithIndex, thanks.
Lemmy
A: 

OK, so I cleaned those up a little. There are three functions for three possible datatypes (which solves problem #2 automatically). I took those all from above (one for Array, one for List, one for Seq.):

def filter(numbers: Seq[Double], filterSize: Int): Seq[Double] = {
  require(filterSize > 0)
  val ringBuffer = new Array[Double](filterSize)
  var ringBufferIndex = 0

  numbers.map(x => {
    // update ring buffer
    ringBuffer(ringBufferIndex) = x

    // increase ring index
    ringBufferIndex += 1 
    if (ringBufferIndex == filterSize) {
      ringBufferIndex = 0
    }

    // get avarage
    ringBuffer.foldLeft(0.0)(_ + _) / filterSize
  })
}

def filter(numbers: Array[Double], filterSize: Int): Array[Double] = {
  require(filterSize > 0)
  (0 until numbers.length).map(x => {
    (((x - filterSize) max 0) to x).foldLeft(0.0)((sum, index) => sum + numbers(index)) / filterSize
  }).toArray
}

def filter(numbers: List[Double], filterSize: Int): List[Double] = {
  require(filterSize > 0)
  val ringBuffer = new Array[Double](filterSize)
  numbers.zipWithIndex.map(pair => {
    val (value, index) = pair
    // update ring buffer
    ringBuffer(index % filterSize) = value
    // get avarage
    ringBuffer.foldLeft(0.0)(_ + _) / filterSize
  })
}
Lemmy
FWIW, Array also supports zipWithIndex
Dave Ray
+1  A: 

While I don't know Scala, I wouldn't use a ring buffer here. As I understand it, you want to take the average, at each array position, of the preceding filterSize elements. So go through the array left to right, keeping an accumulator holding the sum of the previous filterSize elements (at each step adding the rightmost and subtracting the leftmost) and yielding accumulator/filterSize as the value at that position. A factor of filterSize fewer additions, and in principle purely functional. Is that inconvenient to code in Scala?

(If overflow is not a concern, I'd do something a bit simpler and more parallelizable: compute the running sum of the whole array (scanl (+) 0 numbers in Haskell) and produce the running sum minus the running sum displaced filterSize positions.)

Darius Bacon
that was my first idea but I feared you might lose precision that way. In my code I have an error of filterSize*addition for each number, but when adding and subtracting I have the error 2*index which for big indices could be quite a lot.
Lemmy
Oh, fair point, I was neglecting that you're using floating-point. You could still use scanl to extract the sublists to add at each position, though I'm afraid that'd be considerably less efficient as sequential code. If speed is paramount, add the sublists directly instead of via the ringbuffer.
Darius Bacon
+2  A: 

If index lookups are a problem (O(n) with List), you could use a persistent vector. This gives you O(1) indexing as well as O(1) update. It's also purely functional (immutable), so life is still happy in that regard.

With a little bit of imagination, we can convert your code directly into a pure-functional version using Vector:

def filter(numbers: List[Double], size: Int) = {
  def walk(numbers: List[Double], buffer: Vector[Double], i: Int): List[Double] = {
    numbers match {
      case x :: tail => {
        val nextBuffer = buffer(i) = x
        val nextI = if (i == size) 0 else i + 1

        val avg = buffer.foldLeft(0.0) { _ + _ } / size
        avg :: walk(tail, nextBuffer, nextI)
      }

      case Nil => Nil
    }
  }

  walk(numbers, Vector.empty, 0)
}

Note that this isn't tail-recursive, so it'll break down when numbers is too large. It's pretty easy to fix this problem, but I'm being lazy right now.

Daniel Spiewak
I never used Scala Vector. I assume the line val nextBuffer = buffer(i) = x creates a copy of the vector with the element at element[i]=value? Isn't that like ultra slow? ;)
Lemmy
+2  A: 

To have your method take a generic collection type and return the same type, I believe you would need Higher Kinded types, as described in the Generics of a Higher Kind paper. Unfortunately the current collections library predates this feature in Scala, but this will be corrected for 2.8.

Rafael de F. Ferreira
exactly what I wanted to know, thank you!
Lemmy