Given the following simple data set, what is the best way to average the values for each of the sets 0, 25, 53, and 80?

  [["0", "148.5"],
   ["0", "146.5"],
   ["0", "148.6"],
   ["0", "202.3"],
   ["25", "145.7"],
   ["25", "145.5"],
   ["25", "147.4"],
   ["25", "147.3"],
   ["53", "150.4"],
   ["53", "147.6"],
   ["53", "147.8"],
   ["53", "215.4"],
   ["80", "150.4"],
   ["80", "149.4"],
   ["80", "148.0"],
   ["80", "149.9"]]
+4  A: 

It's simple enough with inject. I often implement a general group_by method in projects to help with stuff like this.

If the data set is large and performance matters, consider using a numeric library or a database.

data = [ ... ]

groups = data.inject({}) do |hash, pair| 
  hash[pair.first] ||= []
  hash[pair.first] << pair.last.to_f
  hash
end

groups.inject({}) do |hash, pair| 
  hash[pair.first] = pair.last.inject(0,&:+) / pair.last.size
  hash
end
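
As a rough illustration of the "general group_by method" idea mentioned above, the two inject steps could be wrapped in small helpers. The names `group_values` and `averages` are made up for this sketch and are not part of Ruby or of the original answer; the data is a four-row subset of the question's set.

```ruby
# Hypothetical helper: collect each pair's value (as a Float) under its key.
def group_values(pairs)
  pairs.inject({}) do |hash, (key, value)|
    (hash[key] ||= []) << value.to_f
    hash
  end
end

# Hypothetical helper: map each key to the mean of its grouped values.
# Assumes every group is non-empty, which group_values guarantees.
def averages(groups)
  groups.inject({}) do |hash, (key, values)|
    hash[key] = values.inject(0.0) { |sum, v| sum + v } / values.size
    hash
  end
end

# Subset of the question's data, kept as strings like the original.
data = [["0", "148.5"], ["0", "146.5"], ["25", "145.7"], ["25", "145.5"]]
result = averages(group_values(data))
```

Here `result` is a hash keyed by the original string keys ("0" and "25"), each mapped to a Float average.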
Jason Watkins
Nick Hildebrant
Jason Watkins
Doh, apologies for the markdown clutter; I didn't realize comments weren't filtered the same way as answers.
Jason Watkins
Cool trick! Thanks for the help!
Nick Hildebrant
+1  A: 

Using inject with a hash will yield poor performance (you're re-assigning the memo var at every iteration). If you're on Ruby 1.9, Enumerable implements group_by, which can be used to make the code a little more obvious:

result = array.map{ |row| [row.first.to_i, row.last.to_f] }.group_by(&:first)
result.each_pair do |key, rows|
  # Each group holds [key, value] pairs, so average only the values.
  result[key] = rows.map(&:last).average
end

Array#average is easily implemented as

class Array
  def average
    inject(0.0) { |sum, e| sum + e } / length
  end
end
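
If reopening Array is undesirable, the same calculation can be sketched with a plain method instead. The `mean` helper below is an illustrative name, not a Ruby built-in, and it assumes a non-empty list; the rows are a subset of the question's data.

```ruby
# Illustrative helper (not a built-in); assumes numbers is non-empty.
def mean(numbers)
  numbers.inject(0.0) { |sum, n| sum + n } / numbers.length
end

rows = [["53", "150.4"], ["53", "147.6"], ["80", "150.4"], ["80", "149.4"]]

# Convert the strings once, group the pairs by key, then average the values.
grouped = rows.map { |key, value| [key.to_i, value.to_f] }.group_by(&:first)

means = {}
grouped.each do |key, pairs|
  means[key] = mean(pairs.map(&:last))
end
```

Building a separate `means` hash avoids mutating the grouped hash while iterating over it, and the keys end up as integers (53 and 80) because of the `to_i` conversion.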

The fact that your data is stored as strings is quite inconvenient; I recommend avoiding that whenever possible.

Mikoangelo
Can you explain what you mean by "re-assigning the memo var at every iteration"? The calls are done by reference, so assigning the variable will involve the same overhead no matter the type of the value.
Jason Watkins