views:

313

answers:

2

PROBLEM

Hello,

I am having no luck trying to break down this SQL statement into ActiveRecord/Rails friendly code and I'd like to learn how I can avoid a find_by_sql statement in this situation.

Scenario I have users that create audits when they perform an action. Each audit is of a specific audit_activity. Each audit_activity is worth a certain number of points, based on score_weight. I need to find the total scores of each user, based on their total accumulated audit_activity score_weights. Eventually I'll need to rank them which means adding a sort to this as well.

My Code Here is my sql and simplified versions of the tables in question. Any thoughts?

SQL with full column names (for clarity)

SELECT users.id, u.email, SUM(audit_activity.score_weight) 
FROM users 
JOIN audits ON users.id = audits.user_id 
JOIN audit_activities ON audit_activities.id = audits.audit_activity_id 
GROUP BY users.id;

Models: User, Audit, AuditActivity

User fields: id, email

class User < ActiveRecord::Base
 include Clearance::User
 has_many :audits
end

Audit fields: id, user_id, audit_activity_id

class Audit < ActiveRecord::Base
  belongs_to :user
  belongs_to :audit_activity
end

AuditActivity fields: id, score_weight

class AuditActivity < ActiveRecord::Base
  has_many :audits
end

Example Data

Here is a set of SQL statements so you can play with similar data I'm working with and see what comes up when the concerned query is run. You should just be able to copy/paste the whole thing into a database query browser.

CREATE TABLE users(
 id INTEGER NOT NULL,
 email TEXT (25),
 PRIMARY KEY (id)
);

CREATE TABLE audits(
 id INTEGER NOT NULL,
 user_id INTEGER,
 audit_activity_id INTEGER,
 PRIMARY KEY (id)
); 

CREATE TABLE audit_activities(
 id INTEGER NOT NULL,
 score_weight INTEGER,
 PRIMARY KEY (id)
);

INSERT INTO users(id, email)
VALUES(1, "[email protected]");
INSERT INTO users(id, email)
VALUES(2, "[email protected]");
INSERT INTO users(id, email)
VALUES(3, "[email protected]");

INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(1, 1, 1);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(2, 1, 2);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(3, 1, 1);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(4, 1, 3);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(5, 1, 1);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(6, 1, 4);

INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(7, 2, 4);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(8, 2, 4);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(9, 2, 4);

INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(10, 3, 3);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(11, 3, 2);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(12, 3, 2);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(13, 3, 2);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(14, 3, 3);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(15, 3, 1);
INSERT INTO audits(id, user_id, audit_activity_id)
VALUES(16, 3, 1);

INSERT INTO audit_activities(id, score_weight)
VALUES(1, 1);
INSERT INTO audit_activities(id, score_weight)
VALUES(2, 2);
INSERT INTO audit_activities(id, score_weight)
VALUES(3, 7);
INSERT INTO audit_activities(id, score_weight)
VALUES(4, 11);

The Query Again, here is the query.

SELECT u.id, u.email, SUM(aa.score_weight) 
FROM users u 
JOIN audits a ON u.id = a.user_id 
JOIN audit_activities aa ON aa.id = a.audit_activity_id 
GROUP BY u.id;

Many Thanks, Chad

A: 

It's easy enough to get your users, and iterate through the audits for each of them, summing up the values as you go. So it would be something like this:

users = User.find(:all)
users.each do |user|
  puts "user: #{user.email}"
  score = 0
  user.audits.each do |audit|
       puts "  audit: #{audit.audit_activity.id}  score: #{audit.audit_activity.score_weight}"
       score += audit.audit_activity.score_weight
  end
  puts "total score for this user: #{score"
end

That will generate many separate queries, however, but that's not always a bad thing.

If the data volumes are going to be large, and as you say, you will want to sort by user score, then I think the answer will be to have a field with the current score on the user record, which gets updated every time an audit activity record is written. This can be done automatically with an association callback (i.e. an after_add method on the audit association on the User record). See http://guides.rubyonrails.org/association_basics.html#association-callbacks.

stephenr
Thanks stephenr,I definitely could do the iteration. My worry was that iterating like that may not be ideal practice (I'm not sure, I just started with RoR) because, as you guessed it, there data volumes will be very large. On callbacks, these scores are valid only for the current day so I would need to clear them out at midnight each night. Maybe an automated rake task?
Chad M
Iterating over the dataset would be the normal way of doing it, but will be a problem if you have many tens of thousands of records, and retrieve them all at once. By having the score on the user record, you could do the find ordered by score, and use a pagination plugin or a simple limit clause to limit the number of records returned. Rails will be quite happy with that.As for the cron job to clear out the old scores, a script executed by [RAILS_ROOT]/script/runner is probably better than rake - it give you access to the full Rails environement, models, etc.
stephenr
A: 
User.sum( :score_weight, :include => {:audits => :audit_activity}, :group => 'users.id' )
zed_0xff
This is very close to what I have so far. Seeing as you came up with something very similar, I'm starting to lean more toward this.Yours, I believe, is much cleaner though.What I had:User.calculate(:sum, 'audit_activities.score_weight', :joins => ['INNER JOIN audits ON audits.user_id = users.id', 'INNER JOIN audit_activities ON audit_activities.id = audits.audit_activity_id'], :order => :sum_id, :group=>'users.id')
Chad M