views:

1622

answers:

3

Hi, Just started learning active record and am wondering how to best retrieve data from multiple tables where an SQL aggregate query is involved.

In the following example (from a medical app) I'm looking for the most recent events of various types for each patient (e.g. last visit, last labtest etc). As you can see from the sql query below I'm looking for the max(date) value from a grouped query. I resorted to find_by_sql to do this - however I'd like to see how to do this without using find_by_sql.

IOW - how would you get the required data here using a pure ActiveRecord approach. Below are the Table and Class defs I'm testing with:

Find by Sql to retrieve most recent entries for each type - note the 'max(event_date)' here

strsql = "select  p.lname, e.patient_id, e.event_type, max(e.event_date) as event_date 
     from events e   
         inner join patients p on e.patient_id = p.id
         group by p.lname, e.patient_id, e.event_type"

Here's the sample sql query result:

lname, patient_id, event_type, latest
'Hunt', 3, 'Labtest', '2003-05-01 00:00:00'
'Hunt', 3, 'Visit', '2003-03-01 00:00:00'
'Seifer', 2, 'Labtest', '2002-05-01 00:00:00'
'Seifer', 2, 'Visit', '2002-03-01 00:00:00'

Table Relationships are:
Tables ---> Patients --> Events
                               --> visits
                               --> labtests
                               --> ... other
patients
      t.string :lname
      t.date :dob

events
      t.column :patient_id, :integer
      t.column :event_date, :datetime
      t.column :event_type, :string

visits 
      t.column :event_id, :integer
      t.column :visittype, :string

labtests
      t.column :event_id, :integer
      t.column :testtype, :string
      t.column :testvalue, :string

Classes

class Patient < ActiveRecord::Base
  has_many :events
  has_many :visits, :through =>:events
  has_many :labtests, :through => :events
end

class Event < ActiveRecord::Base
  has_many :visits
  has_many :labtests
  belongs_to :patient
end

class Visit < ActiveRecord::Base
  belongs_to :event
end

class Labtest < ActiveRecord::Base
    belongs_to :event
end
A: 

In the current version of AR (2.3) I think your only option is to use find_by_sql. Why? Your custom SELECT attributes. ActiveRecord allows :select and :include arguments to the find call. Unfortunately they can't be used together. If you specify both the :select will be ignored. In your example you need the select to limit your columns and to get your MAX function.

I've heard there is a patch/hack to get them to work together, but I am not sure where it is.

Peer

Peer Allan
Is there a way to use an explicit :join instead of and include to get around this limitiation? Would that be an option here?
BrendanC
A: 

An include is just a clever left join, you can use that in conjuction with select and group_by in your .find

Omar Qureshi
+3  A: 

As Pallan pointed out, the :select option cannot be used with the :include option. However the :joins option can. And this is what you want here. In fact, it can take the same arguments as :include or use your own SQL. Here's some rough, untested code, may need some minor fiddling.

Event.all(:select => "events.id, patients.lname, events.patient_id, events.event_type, max(events.event_date) as max_date", :joins => :patient, :group => "patients.lname, events.patient_id, events.event_type")

Note I modified things slightly. I renamed the event_date alias to max_date so there's no confusion over which attribute you are referring to. The attributes used in your :select query are available in the models returned. For example, in this you can call event.max_date. I also added the event id column because you can sometimes get some nasty errors without an id attribute (depending on how you use the returned models).

The primary difference between :include and :joins is that the former performs eager loading of the associated models. In other words, it will automatically fetch the associated patient object for each event. This requires control of the select statement because it needs to select the patient attributes at the same time. With :joins the patient objects are not instantiated.

ryanb
Quick follow up here from OP. With 1 minor tweak :joins (vs :join) Ryan's code works and replaces the Find_By_Sql query in my original post. Very cool. However in passing I'll just note that (AFAIK) not all SQL queries can be implemented via ORM calls (e.g. Union Query that combines Min and Max values from a table). I suspect this will require 2 separate calls via the ORM, while a Find_By_Sql can do it in 1 call e.g. find_by_sql("Select Min(t.Event_Date) from table t Union Select Max(t.Event_Date) from table t")
BrendanC
Updated the original post to use :joins. You are correct though, not every query can be done with the ORM, and that is not the goal of Active Record. It was purposefully designed to fit the majority of needs well and leave the rest possible with find_by_sql. It does not try to do everything, so don't hesitate to use find_by_sql if it fits the job better.
ryanb