As far as I know, the result of

["a", "A"].uniq

is

["a", "A"]

My question is:

How do I make ["a", "A"].uniq give me either ["a"] or ["A"]?

+5  A: 

Just make the case consistent first.

e.g.:

["a","A"].map{|i| i.downcase}.uniq

Edit: If, as mikej suggests, the elements returned must be exactly the same as in the original array, then this will do that for you (where a is the original array):

a.inject([]) { |result,h| result << h unless result.map{|i| i.downcase}.include?(h.downcase); result }

Edit 2: A solution which should satisfy mikej :-)

downcased = []
a.inject([]) { |result, h|
  unless downcased.include?(h.downcase)
    result << h
    downcased << h.downcase
  end
  result
}
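
A minimal sketch of the same first-occurrence idea, assuming the standard set library is acceptable: a Set makes the membership check a lookup rather than an array scan with include?. The method name uniq_ci_first is made up for illustration.

```ruby
require 'set'

# Hypothetical helper (name chosen for illustration): keeps the first
# occurrence of each string, comparing case-insensitively.
def uniq_ci_first(strings)
  seen = Set.new
  # Set#add? returns nil when the key is already present, so select
  # keeps only the elements whose downcased form has not been seen yet.
  strings.select { |s| seen.add?(s.downcase) }
end

uniq_ci_first(["Hello", "HELLO", "b"])  # => ["Hello", "b"]
```

The returned strings are the original elements, not their downcased copies.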
DanSingerman
Whilst this would work for the example given, if the list were something like ["Hello", "HELLO"], then ["Hello", "HELLO"].map { |i| i.downcase }.uniq would return ["hello"], which doesn't match either of the strings in the original list.
mikej
The edited solution is good, except that it rebuilds the downcased list with result.map{|i| i.downcase} once for each element in the original list. If the list is large, it may be better to build that list once as a separate statement and store it in a temporary variable.
mikej
+3  A: 
["a", "A"].map{|x| x.downcase}.uniq
=> ["a"]

or

["a", "A"].map{|x| x.upcase}.uniq
=> ["A"]
Mr. Matt
Ack! Beaten to it!
Mr. Matt
A: 

A more general solution (though not the most efficient):

class EqualityWrapper
  attr_reader :obj

  def initialize(obj, eq, hash)
    @obj = obj
    @eq = eq
    @hash = hash
  end

  def ==(other)
    @eq[@obj, other.obj]
  end

  alias :eql? :==

  def hash
    @hash[@obj]
  end
end

class Array
  def uniq_by(eq, hash = lambda{|x| 0 })
    map {|x| EqualityWrapper.new(x, eq, hash) }.
    uniq.
    map {|x| x.obj }
  end

  def uniq_ci
    eq = lambda{|x, y| x.casecmp(y) == 0 }
    hash = lambda{|x| x.downcase.hash }
    uniq_by(eq, hash)
  end
end

The uniq_by method takes a lambda that checks equality and a lambda that returns a hash value, and removes duplicate objects as defined by those two lambdas.

Implemented on top of that, the uniq_ci method removes duplicate strings using case-insensitive comparison.
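
For comparison, and assuming Ruby 1.9.2 or later, Array#uniq itself accepts a block whose return value is used as the comparison key. A small sketch of the case-insensitive case without a wrapper class (the sample data is arbitrary):

```ruby
# Array#uniq with a block compares elements by the block's return value
# and keeps the first element seen for each key.
words = ["Hello", "HELLO", "b", "B"]
result = words.uniq { |w| w.downcase }
# result => ["Hello", "b"]
```

This only covers key-based equality; the lambda-based uniq_by is still the more general tool when equality cannot be expressed as a single key.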

Paolo Capriotti
+5  A: 

You can build a mapping (a Hash) from the case-normalized (e.g. downcased) values to the actual values, and then take just the values from the hash:

["a", "b", "A", "C"]\
.inject(Hash.new){ |h,element| h[element.downcase] = element ; h }\
.values

This selects the last occurrence of each word (case-insensitive):

["A", "b", "C"]

If you want the first occurrence:

["a", "b", "A", "C"]\
.inject(Hash.new){ |h,element| h[element.downcase] = element  unless h[element.downcase]  ; h }\
.values
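
The two variants above can be folded into one helper. This is only a sketch: the method name uniq_ci_by_hash and its keep: option are invented for illustration, and keyword arguments assume Ruby 2.0 or later.

```ruby
# Hypothetical helper (not part of Ruby): keep: :first or keep: :last
# chooses which occurrence wins for each case-insensitive key.
def uniq_ci_by_hash(strings, keep: :first)
  strings.each_with_object({}) { |s, h|
    key = s.downcase
    # Overwrite on every match for :last; store only once for :first.
    h[key] = s if keep == :last || !h.key?(key)
  }.values
end

uniq_ci_by_hash(["a", "b", "A", "C"])               # => ["a", "b", "C"]
uniq_ci_by_hash(["a", "b", "A", "C"], keep: :last)  # => ["A", "b", "C"]
```

Note that with keep: :last the winning value still appears at the position where its key was first inserted, since hashes in Ruby 1.9 and later preserve insertion order.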
LucaM
+1 Very clever.
DanSingerman
+2  A: 

A somewhat more efficient way is to make use of the fact that hash keys are unique:

["a", "A"].inject(Hash.new){ |hash,j| hash[j.upcase] = j; hash}.values

This will return the last element, in this case

["A"]

whereas using ||= as the assignment operator:

["a", "A"].inject(Hash.new){ |hash,j| hash[j.upcase] ||= j; hash}.values

will return the first element, in this case

["a"]

Especially for big arrays this should be faster, as we don't search the array with include? on every step.
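
One rough way to check the speed claim is the standard benchmark library. This is a sketch only: the array size and contents are arbitrary, and no particular timings are claimed.

```ruby
require 'benchmark'

# Arbitrary test data: 2,000 strings, each word present in both cases.
words = (1..1000).flat_map { |i| ["word#{i}", "WORD#{i}"] }

# Array-based version: include? scans the seen list on every step.
with_include = lambda {
  seen = []
  words.select { |w| !seen.include?(w.downcase) && seen << w.downcase }
}

# Hash-based version: a key lookup replaces the array scan.
with_hash = lambda {
  words.inject({}) { |h, w| h[w.downcase] ||= w; h }.values
}

Benchmark.bm(12) do |x|
  x.report("include?:") { with_include.call }
  x.report("hash:")     { with_hash.call }
end
```

Both lambdas keep the first occurrence of each word, so their results can be compared directly before trusting the timings.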

cheers...

RngTng