views:

213

answers:

3

I have a piece of Ruby on Rails code that has a complex SQL query (well, not that complex, but as far as I know beyond the ORM capabilities) and for my taste it has too many strings and harcoded values. I'd like to improve it as much as possible, so my question is open ended, what else can I do to improve it?

Some particular issues I have

  • Is there a way to get a table name to use it in a query in the same escaped way as the ORM does? I think this is database independent, being `items` for MySQL but not for other databases.
  • In the same vein, is there a way to get a field name the same way Rail's ORM would put it in a SQL query?
  • Maybe there's a way to get both, the table name and the field name in one operation. I'm imaging something like Item.voteable_id.for_query => "`items`.`voteable`".
  • How do I escape code to avoid SQL injection when not in conditions? I'm using the user_id variable directly in a query and although it's impossible for a user to put anything in it, I'd rather escape it properly. In a condition I would do ['user_id = ?', user_id], but in a join or a select, how do I do it?
  • Does my use of class constants here make sense?
  • Is there any chance at all of using the ORM and less string?
  • Any other thing to do to it?

It is not my intention to have anybody else do my pet project for me. I remember a long discussion in some mailing list about writing portable SQL in Rails by calling methods to get table names and things like that; but I can't find it anymore. My goal here is obviously to learn.

The code is this one

class Item < ActiveRecord::Base  
  has_many :votes, :as => :voteable

  def self.ranking(user_id)   
    Item.find(:all,
      # items.* for all the Item attributes, score being the sum of votes, user_vote is the vote of user_id (0 if no vote) and voter_id is just user_id for latter reference.
      :select => "items.*,
                  IFNULL(sum(all_votes.value), 0) as score,
                  user_votes.value as user_vote,
                  \"#{user_id}\" as voter_id",
      # The first join gets all the votes for a single item (to be summed latter).
      # The second join gets the vote for a single user for a single item.
      :joins => ["LEFT JOIN votes as all_votes ON
                    all_votes.voteable_id = items.id and
                    all_votes.voteable_type = \"Item\"",
                 "LEFT JOIN votes as user_votes ON
                    user_votes.voteable_id = items.id and
                    user_votes.user_id = \"#{user_id}\" and
                    user_votes.voteable_type = \"Item\""
                ],
      :group => :id,
      :order => "score DESC")

    # This is the query it should generate (for user_id = 2)
    # SELECT items.*,
    #        user_votes.value as user_vote,
    #        IFNULL(sum(all_votes.value),0) as score,
    #        "2" as voter_id
    # FROM items
    # LEFT JOIN votes as all_votes ON
    #     all_votes.voteable_id = items.id and
    #     all_votes.voteable_type = "Item"
    # LEFT JOIN votes as user_votes ON
    #     user_votes.voteable_id = items.id and
    #     user_votes.user_id = "2" and
    #     user_votes.voteable_type = "Item"
    # GROUP BY items.id
    # ORDER BY score DESC
  end

  def score
    s = read_attribute("score")
    if s == nil
      votes.sum :value
    else
      Integer(s)
    end
  end

  def user_vote(user_id)
    if Integer(read_attribute("voter_id")) == user_id
      Integer(read_attribute("user_vote"))
    else
      vote = votes.find(:first, :conditions => ["user_id = ?", user_id])
      if vote
        vote.value
      else
        0
      end
    end
  end
end

UPDATE: Simplified the code to get to the point faster and be able to discuss the core issues. Document the code so people don't have to waste time deciphering it.

A: 

I'm not familiar with Ruby or Rails, so I can't give specific examples, but I think a good way would be to move the SQL from the method bodies into properties of the class. To insert variables into them, you could use the Ruby-equivalent of printf syntax.

This would make it more obvious where all the SQL is located, and easier to modify them and the methods too I think.

Jani Hartikainen
Using printf to insert variables in to an sql statement is a really bad idea. Almost every db library I've seen has had some functionality for handling parameterized sql statements built in, and they'll typically get the escaping and such right. You really don't want to roll that sort of thing yourself (unless you're writing a db library).
John Hyland
Correct. Feel free to point out how to do it properly in Rails, since I don't know.
Jani Hartikainen
I'm already using the Ruby equivalent of printf, and that's what I don't like.
J. Pablo Fernández
+2  A: 

There doesn't appear to be anything in this query that can't be done using the regular ActiveRecord ORM capabilies. I'm not going to solve the problem, but I'll point you in the right direction with these:

  1. Instead of doing string manipulation in the joins, use objects :joins => :votes

  2. Restricting by user_id is a condition (don't use string manipulation): cond = user_id ? {:votes => {:user_id => user_id}} : nil ... Item.find(..., :condition => cond, ...)

  3. You :select should be determined in an if/else statement

  4. Read the ActiveRecord::Base documentation thoroughly

  5. Read the has_many documentation thoroughly
BigCanOfTuna
Ok, just taking a look at point 1, how do I use the :joins => :votes syntax and specify an "as" so that I can join it twice and even then, how do I provide different ONs for each one?
J. Pablo Fernández
2: I don't understad what you mean (I mean the pieces of code you provide). I'm restricting the joined table by user_id, not the main table, so it cannot go into the where clause (that is, :conditions).
J. Pablo Fernández
3: do you mean so that I don't have to do the concatenation? Then I would have repetition, or concatenation in a different way and creating yet another variable.
J. Pablo Fernández
4, 5: I've already have, if there's anything in particular you have in mind, feel free to let me know about it.
J. Pablo Fernández
A: 

Start by breaking down the query and trying to use ORM for as much as possible. You want to get rid of as much of the hand-crafted sql as you can. Use find as you are, and only move on to find_by_sql when you get stuck doing that.

has_many :user_votes, :conditions => ["user_id = 2 and voteable_type = 'Item'"], :class_name => 'Vote'
has_many :all_votes, :conditions => ["voteable_type = 'Item'"], :class_name => 'Vote'

You can use the "user_id = ?" notation in find_by_sql to avoid sql-injection. For multiple substitutions, name the vars using symbols (user_id = :user_id) and then pass a hash as the 2nd param.

There's so much refactoring to be done here! Before you even start, write some tests and specs!

EDIT: add :class_name

Jonathan Julian
The code is very well tested, 100% code coverage and every possible case I can think of. Agreed about getting rid of as much SQL as possible.Your has_many :all_votes is equivalent to the has_many :votes, :as => voteable I already have. What's your point there?
J. Pablo Fernández
To be able to use proper SQL escaping on the joins, then I should use find_by_sql instead of find?
J. Pablo Fernández
I was trying to show how every relationship can be mapped using AR associations - then they can be turned into `nested_scope`s and reused to help solve your problem.Either. To do proper escaping in any `find`, always use conditions as described in the docs for `ActiveRecord::Base`.
Jonathan Julian