I'm trying to write a relatively simple algorithm to search for a string across several attributes.

Given some data:

1: name: 'Josh', location: 'los angeles'
2: name: 'Josh', location: 'york'

search string: "josh york"

The results should be [2, 1] because that query string hits the 2nd record twice, and the 1st record once.

It's safe to assume case-insensitivity here.

So here's what I have so far, in Ruby/ActiveRecord:

query_string = "josh new york"
some_attributes = [:name, :location]

results = Hash.new(0)  # missing keys start at 0
query_string.downcase.split.each do |query_part|
  some_attributes.each do |attribute|
    find(:all, :conditions => ["#{attribute} like ?", "%#{query_part}%"]).each do |result|
      results[result] += 1
    end
  end
end

# highest hit count first
results.sort_by { |_, count| -count }

The issue I have with this method is that it produces a large number of queries (query_string.split.length * some_attributes.length).

Can I make this more efficient somehow by reducing the number of queries?

I'm okay with sorting within Ruby, although if the sorting can somehow be pushed into the SQL, that'd be nice too.

A: 

Why aren't you using something like Ferret? Ferret is a Ruby + C extension for building a full-text index. Since you seem to be using ActiveRecord, there's also acts_as_ferret.

François Beausoleil
I'd think any "full-featured" text search "engine" is overkill for searching two columns in one table. Going for simple/light-weight here.
crankharder
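
For the lightweight route, one possible sketch (not from the thread, and not tested against a real database) is to OR every term/attribute pair into a single condition, fetch the candidates with one query, and count the hits in Ruby. The names build_conditions, rank, RECORDS and ATTRIBUTES are hypothetical, and the hashes stand in for ActiveRecord rows so the snippet runs on its own:

```ruby
# Hypothetical in-memory stand-ins for the two records in the question.
RECORDS = [
  { id: 1, name: 'Josh', location: 'los angeles' },
  { id: 2, name: 'Josh', location: 'york' }
].freeze

ATTRIBUTES = [:name, :location].freeze

# Build one conditions array covering every (term, attribute) pair, so a
# single query can fetch all candidates instead of terms * attributes queries.
# Attribute names are interpolated, so they must come from a trusted list.
def build_conditions(query_string, attributes)
  clauses = []
  binds = []
  query_string.downcase.split.each do |term|
    attributes.each do |attribute|
      clauses << "LOWER(#{attribute}) LIKE ?"
      binds << "%#{term}%"
    end
  end
  [clauses.join(' OR '), *binds]
end

# Score candidates in Ruby: one point per (term, attribute) substring hit,
# then return the ids ordered by descending score.
def rank(records, query_string, attributes)
  terms = query_string.downcase.split
  scored = records.map do |record|
    hits = terms.sum do |term|
      attributes.count { |attribute| record[attribute].to_s.downcase.include?(term) }
    end
    [record[:id], hits]
  end
  scored.reject { |_, hits| hits.zero? }
        .sort_by { |_, hits| -hits }
        .map(&:first)
end
```

With the question's Rails 2 style finder, the array from build_conditions would presumably be passed as find(:all, :conditions => build_conditions(query_string, some_attributes)), with the returned records handed to a scoring step like rank. Terms containing % or _ act as LIKE wildcards here; escaping them is left out of the sketch.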