My Product model contains some items:

 Product.first
 => #<Product id: 10, name: "Blue jeans" >

I'm now importing some product parameters from another dataset, but there are inconsistencies in the spelling of the names. For instance, in the other dataset, Blue jeans could be spelled Blue Jeans.

I wanted to use Product.find_or_create_by_name("Blue Jeans"), but this will create a new product, almost identical to the first. What are my options if I want to find by, and compare against, the lowercased name?

Performance is not really important here: there are only 100-200 products, and I want to run this as a migration that imports the data.

Any ideas?

+9  A: 

You'll probably have to be more verbose here:

name = "Blue Jeans"
model = Product.find(:first, :conditions => [ "lower(name) = ?", name.downcase ]) || Product.create(:name => name)
neutrino
This works well. Thanks a lot for your reply :)
Jesper Rønn-Jensen
A: 

Assuming that you use MySQL, you could use columns with a case-insensitive collation: http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html

marcgg
I'll give it a look, but for now, my app is actually running SQLite :)
Jesper Rønn-Jensen
A: 

So far, I made a solution using Ruby. Place this inside the Product model:

  # Return the first product whose name matches, ignoring case.
  # Only id and name are loaded to minimize memory consumption.
  def self.custom_find_by_name(product_name)
    @@product_names ||= Product.all(:select => 'id, name')
    @@product_names.detect { |p| p.name.downcase == product_name.downcase }
  end

  # Flush the finder cache, e.g. after creating products from the console.
  def self.flush_custom_finder_cache!
    @@product_names = nil
  end

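This returns the first product whose name matches, or nil. Note that the cached list goes stale when a product is created, so it has to be flushed before the new product can be found: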

>> Product.create(:name => "Blue jeans")
=> #<Product id: 303, name: "Blue jeans">

>> Product.custom_find_by_name("Blue Jeans")
=> nil

>> Product.flush_custom_finder_cache!
=> nil

>> Product.custom_find_by_name("Blue Jeans")
=> #<Product id: 303, name: "Blue jeans">
>>
>> #SUCCESS! I found you :)
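
The stale-cache step in the session above could be avoided by keeping an index that is updated whenever a product is added. The following is only a minimal plain-Ruby sketch of that idea, not part of the solution above: `ProductRow` and `NameIndex` are hypothetical names, and `ProductRow` stands in for the ActiveRecord model so the example runs without Rails.

```ruby
# Hypothetical stand-in for a Product record (assumption: only id and name matter here).
ProductRow = Struct.new(:id, :name)

# Hypothetical lookup table keyed by the downcased product name.
class NameIndex
  def initialize(rows = [])
    @by_name = {}
    rows.each { |row| add(row) }
  end

  # Register a row; the first row seen for a given downcased name is kept.
  def add(row)
    @by_name[row.name.downcase] ||= row
  end

  # Return the row whose name matches regardless of case, or nil.
  def find(name)
    @by_name[name.downcase]
  end
end

index = NameIndex.new([ProductRow.new(303, "Blue jeans")])
index.find("Blue Jeans")                 # finds the row with id 303
index.add(ProductRow.new(304, "Red Shirt"))
index.find("red shirt")                  # found without flushing anything
```

A hash lookup also avoids scanning the whole list on every call, although with 100-200 products that hardly matters.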
Jesper Rønn-Jensen
+1  A: 

Quoting from the SQLite documentation:

Any other character matches itself or its lower/upper case equivalent (i.e. case-insensitive matching)

...which I didn't know. But it works:

sqlite> create table products (name string);
sqlite> insert into products values ("Blue jeans");
sqlite> select * from products where name = 'Blue Jeans';
sqlite> select * from products where name like 'Blue Jeans';
Blue jeans

So you could do something like this:

name = 'Blue jeans'
if prod = Product.find(:first, :conditions => ['name LIKE ?', name])
    # update product or whatever
else
    prod = Product.create(:name => name)
end

Not #find_or_create, I know, and it may not be very cross-database friendly, but worth looking at?
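
One caveat with LIKE: `%` and `_` are wildcards, so an imported name that contains them would match more rows than intended. Also, by default SQLite's LIKE only folds case for ASCII characters. The sketch below shows one way to escape the wildcards before binding the value; `escape_like` is a hypothetical helper written for this example, not a Rails method.

```ruby
# Hypothetical helper: prefix LIKE wildcards (and the escape character itself)
# with the escape character so the value is matched literally.
def escape_like(value, escape_char = "\\")
  pattern = /[#{Regexp.escape(escape_char)}%_]/
  value.gsub(pattern) { |ch| escape_char + ch }
end

escape_like("Blue Jeans")    # unchanged, nothing to escape
escape_like("100%_cotton")   # each wildcard now has a backslash in front of it

# Possible use with the finder above (assumes SQLite, where the escape
# character has to be declared with an ESCAPE clause):
#   Product.find(:first,
#                :conditions => ["name LIKE ? ESCAPE '\\'", escape_like(name)])
```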

Mike Woodhouse
+1  A: 

You might want to use the following:

validates_uniqueness_of :name, :case_sensitive => false

Please note that the default is :case_sensitive => true, so you do need to pass this option explicitly. Also note that the validation only rejects duplicates on save; it does not find the existing record for you.

Find more at: http://ar.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#M000086

Sohan
A: 

Another approach that no one has mentioned is to add case-insensitive finders to ActiveRecord::Base. Details can be found here. The advantage of this approach is that you don't have to modify every model, and you don't have to add the lower() clause to all your case-insensitive queries; you just use a different finder method instead.

Alex - Aotea Studios
+2  A: 

In ASCII, upper- and lowercase letters differ by only a single bit, so the most efficient way to compare them is to ignore this bit rather than to convert to lower or upper case. Let the database do this through a case-insensitive collation: see the COLLATE keyword for MS SQL, NLS_SORT=BINARY_CI if using Oracle, etc.
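
To make the single-bit claim concrete, here is a small plain-Ruby sketch. It only illustrates the ASCII property and is not how any particular database implements its collations; `CASE_BIT`, `ascii_letter?` and `ascii_case_equal?` are names made up for this example.

```ruby
# In ASCII, 'A' is 0x41 and 'a' is 0x61: the two differ only in bit 0x20.
CASE_BIT = 0x20

# True if the byte is an ASCII letter (checked by forcing the case bit on).
def ascii_letter?(byte)
  (byte | CASE_BIT).between?(0x61, 0x7A)
end

# Compare two strings byte by byte, ignoring the case bit of ASCII letters.
def ascii_case_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).all? do |x, y|
    x == y || (ascii_letter?(x) && (x ^ y) == CASE_BIT)
  end
end

ascii_case_equal?("Blue jeans", "Blue Jeans")  # true
ascii_case_equal?("@", "`")                    # false: same bit differs, but not letters
```

In application code, Ruby's built-in `String#casecmp` (which returns 0 when two strings are equal ignoring ASCII case) does the same job without any hand-written bit handling.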

Dean Radcliffe