I have an array of hashes, and I want the unique values out of it. Calling Array.uniq doesn't give me what I expect.

a = [{:a => 1},{:a => 2}, {:a => 1}]
a.uniq # => [{:a => 1}, {:a => 2}, {:a => 1}]

Where I expected:

[{:a => 1}, {:a => 2}]

In searching around on the net, I didn't come up with a solution that I was happy with. Folks recommended redefining Hash.eql? and Hash.hash, since that is what Array.uniq is querying.

Edit: Where I ran into this in the real world, the hashes were slightly more complex. They were the result of parsed JSON with multiple fields, some of whose values were hashes as well. I had an array of those results and wanted to filter it down to the unique values.

I don't like the solution of redefining Hash.eql? and Hash.hash, because I would have to redefine them either globally on Hash or on each entry in my array. Changing the definition for each entry would be cumbersome, especially since there may be nested hashes inside each entry.

Changing Hash globally has some potential, especially if it were done temporarily. I'd want to build another class or helper function that wrapped saving off the old definitions, and restoring them, but I think this adds more complexity than is really needed.

Using inject seems like a good alternative to redefining Hash.

+10  A: 

I can get what I want by calling inject:

a = [{:a => 1},{:a => 2}, {:a => 1}]
a.inject([]) { |result,h| result << h unless result.include?(h); result }

This will return:

[{:a=>1}, {:a=>2}]
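
For reuse, the same inject call can be wrapped in a small helper. This is only a sketch: the name `unique_hashes` is made up for illustration and is not part of Ruby. Because `include?` compares with `==`, nested hashes are compared by content, at the cost of scanning the result array once per element.

```ruby
# Hypothetical helper (not a built-in): keeps the first occurrence of each
# hash, using == for comparison via Array#include?.
def unique_hashes(array)
  array.inject([]) do |result, h|
    result << h unless result.include?(h)
    result
  end
end

a = [{:a => 1}, {:a => 2}, {:a => 1}]
nested = [{:a => {:b => 1}}, {:a => {:b => 1}}, {:a => {:b => 2}}]

flat_unique = unique_hashes(a)         # two entries remain
nested_unique = unique_hashes(nested)  # two entries remain

p flat_unique
p nested_unique
```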
Aaron Hinni
Much better, I think, than the one in the link I posted above.
Ed
A: 

The answer you give is similar to the one discussed here. It overrides the hash and eql? methods on the hashes that are to appear in the array which then makes uniq behave correctly.
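
The linked discussion is not reproduced here. As a rough illustration of the mechanism only (uniq consulting `hash` and `eql?`), here is a sketch that uses a wrapper class instead of overriding anything on Hash itself. The class name `Entry` and its `data` attribute are hypothetical, and the sketch assumes the keys can be ordered by their string form.

```ruby
# Hypothetical wrapper class, for illustration only.
class Entry
  attr_reader :data

  def initialize(data)
    @data = data
  end

  # uniq first groups candidates by #hash. Sorting the pairs makes the
  # value independent of key order.
  def hash
    data.to_a.sort_by { |k, _v| k.to_s }.hash
  end

  # uniq then confirms equality with #eql?.
  def eql?(other)
    other.is_a?(Entry) && data == other.data
  end
end

entries = [{:a => 1}, {:a => 2}, {:a => 1}].map { |h| Entry.new(h) }

# Unwrap again after uniq has removed the duplicates.
unique_data = entries.uniq.map { |e| e.data }
p unique_data
```

This leaves Hash untouched, but every entry has to be wrapped and unwrapped.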

Mark Reid
That is one of the solutions I found on the net. I didn't like that I needed to redefine hash, just to call uniq.
Aaron Hinni
If the vanilla Hash and Array classes don't do what you need, you should really consider defining your own classes that implement the required behaviour. Can you describe what it is you are trying to model with arrays of hashes?
Mark Reid
A: 

Assuming your hashes are always single key-value pairs, this will work:

a.map {|h| h.to_a[0]}.uniq.map {|k,v| {k => v}}

Hash.to_a creates an array of key-value arrays, so the first map gets you:

[[:a, 1], [:a, 2], [:a, 1]]

uniq on Arrays does what you want, giving you:

[[:a, 1], [:a, 2]]

and then the second map puts them back together as hashes again.
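
If the hashes have more than one key, the same idea can be stretched a little. A rough sketch, assuming the keys can be ordered by their string form and that `Hash[]` accepts an array of pairs (which, as far as I know, holds from Ruby 1.8.7 on). Values that are themselves hashes are still compared however uniq compares them, so this may not help with nested data.

```ruby
a = [{:a => 1, :b => 2}, {:b => 2, :a => 1}, {:a => 3, :b => 2}]

# Turn each hash into an array of [key, value] pairs, sorted by key,
# so that key order does not affect the comparison.
pairs = a.map { |h| h.to_a.sort_by { |k, _v| k.to_s } }

# uniq compares the pair arrays by content; Hash[] rebuilds each hash.
result = pairs.uniq.map { |kv| Hash[kv] }

p result
```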

glenn mcdonald
The real world problem that I came across used more complex hashes.
Aaron Hinni
Not sure why this was down voted, so I put it back up.
Aaron Hinni
+2  A: 

I've had a similar situation, but my hashes had keys I could compare on. I used a sorting approach.

What I mean:

you have an array:

[{:x=>1},{:x=>2},{:x=>3},{:x=>2},{:x=>1}]

you sort it (#sort_by {|t| t[:x]}) and get this:

[{:x=>1}, {:x=>1}, {:x=>2}, {:x=>2}, {:x=>3}]

Now apply a slightly modified version of the answer by Aaron Hinni:

your_array.inject([]) do |result, item|
  result << item if !result.last || result.last[:x] != item[:x]
  result
end
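
Putting the two steps together on the sample array, as a minimal end-to-end sketch. The variable names `sorted` and `unique` are just for illustration, and it assumes every hash has an `:x` key whose values can be compared with each other.

```ruby
your_array = [{:x=>1},{:x=>2},{:x=>3},{:x=>2},{:x=>1}]

# Step 1: sort so that hashes with equal :x values end up next to each other.
sorted = your_array.sort_by { |t| t[:x] }

# Step 2: keep an item only if its :x differs from the last item kept.
unique = sorted.inject([]) do |result, item|
  result << item if result.empty? || result.last[:x] != item[:x]
  result
end

p unique
```

Note that the result comes back sorted by `:x`, not in the original order.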

I've also tried:

test.inject([]) {|r,h| r<<h unless r.find {|t| t[:x]==h[:x]}; r}.sort_by {|t| t[:x]}

but it's very slow. Here is my benchmark:

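require 'benchmark'

# Build 1000 single-key hashes with random values.
test = []
1000.times { test << {:x => rand} }

Benchmark.bmbm do |bm|
  # Sort first, then drop adjacent duplicates.
  bm.report("sorting: ") do
    test.sort_by {|t| t[:x]}.inject([]) {|r,h| r<<h if !r.last||r.last[:x]!=h[:x]; r}
  end
  # Scan the result array for every element, then sort.
  bm.report("inject: ") {test.inject([]) {|r,h| r<<h unless r.find {|t| t[:x]==h[:x]}; r}.sort_by {|t| t[:x]} }
end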

results:

Rehearsal ---------------------------------------------
sorting:    0.010000   0.000000   0.010000 (  0.005633)
inject:     0.470000   0.140000   0.610000 (  0.621973)
------------------------------------ total: 0.620000sec

                user     system      total        real
sorting:    0.010000   0.000000   0.010000 (  0.003839)
inject:     0.480000   0.130000   0.610000 (  0.612438)