One solution is to introduce a counter_cache:
    # add a machine_updates_count integer column (default 0) to the machines table
    # and declare the counter cache on the association in your MachineUpdate model:
    belongs_to :machine, :counter_cache => true
and then add OR machine_updates_count = 0 to your SQL conditions, so that machines without any updates are matched as well.
However, you can also solve the problem without a counter cache by using a LEFT JOIN:
    named_scope :needs_updates, lambda {
      {
        :select => "machines.*, MAX(machine_updates.date) AS last_update",
        :joins  => "LEFT JOIN machine_updates ON machine_updates.machine_id = machines.id",
        :group  => "machines.id",
        # built inside the lambda so the cutoff is recomputed on every call
        :having => ["last_update IS NULL OR last_update < ?", UPDATE_THRESHOLD.seconds.ago]
      }
    }
The LEFT JOIN is necessary so that machines without any MachineUpdate rows are still returned (their last_update is NULL). Taking MAX(machine_updates.date) makes sure you compare against the most recent MachineUpdate.
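To make the logic of the query easier to follow, here is a rough plain-Ruby sketch of what the LEFT JOIN, GROUP BY and HAVING compute together. It is only an illustration: the `MachineRow` and `UpdateRow` structs, the `needs_updates` helper and the sample data are all hypothetical stand-ins, not part of the models in the question.

```ruby
# Hypothetical in-memory rows standing in for the machines and machine_updates tables.
MachineRow = Struct.new(:id, :name)
UpdateRow  = Struct.new(:machine_id, :date)

# Returns the machines whose most recent update is missing or older than cutoff.
def needs_updates(machines, updates, cutoff)
  # Collect update dates per machine. A machine without updates ends up with an
  # empty list, which mirrors what the LEFT JOIN preserves.
  dates_by_machine = Hash.new { |hash, key| hash[key] = [] }
  updates.each { |u| dates_by_machine[u.machine_id] << u.date }

  machines.select do |machine|
    last_update = dates_by_machine[machine.id].max # nil when there are no updates
    last_update.nil? || last_update < cutoff       # same test as the HAVING clause
  end
end

now = Time.now
machines = [
  MachineRow.new(1, "fresh"),
  MachineRow.new(2, "stale"),
  MachineRow.new(3, "never updated")
]
updates = [
  UpdateRow.new(1, now - 60),
  UpdateRow.new(1, now - 86_400),
  UpdateRow.new(2, now - 86_400)
]

puts needs_updates(machines, updates, now - 3600).map(&:name).inspect
# prints ["stale", "never updated"]
```

Machine 1 has an old update too, but only its newest one counts, which is the role MAX plays in the SQL.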
Note also that the time-dependent condition has to be built inside a lambda so that it is evaluated every time the query is run. Otherwise it is evaluated only once (when your model is loaded on application boot-up), and the scope will not find Machines that have come to need updates since your app started.
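The difference between the two evaluation times can be seen in plain Ruby, without Rails. This is just a minimal sketch; the `UPDATE_THRESHOLD` value of 3600 seconds and the `EAGER_CUTOFF` / `LAZY_CUTOFF` names are assumptions made up for the illustration.

```ruby
# Hypothetical threshold in seconds; the real UPDATE_THRESHOLD is not shown in the question.
UPDATE_THRESHOLD = 3600

# Computed once, when this line is executed (comparable to class-load time).
EAGER_CUTOFF = Time.now - UPDATE_THRESHOLD

# Computed on every call, like options built inside a named_scope lambda.
LAZY_CUTOFF = lambda { Time.now - UPDATE_THRESHOLD }

first_call = LAZY_CUTOFF.call
sleep 0.05
second_call = LAZY_CUTOFF.call

puts "lazy cutoff moved forward: #{second_call > first_call}"
puts "eager cutoff is still the boot-time value: #{EAGER_CUTOFF <= first_call}"
```

In a long-running app the eager value keeps drifting further from the real cutoff, which is why the scope would miss machines that went stale after boot.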
UPDATE:
This solution works in MySQL and SQLite, but not in PostgreSQL. PostgreSQL does not allow columns in the SELECT clause that do not also appear in the GROUP BY clause, unless they are wrapped in an aggregate function (see discussion). I'm not very familiar with PostgreSQL, but I did get this to work as expected:
    named_scope :needs_updates, lambda {
      # list every machines column explicitly so it can appear in both SELECT and GROUP BY
      cols = Machine.column_names.collect { |c| "\"machines\".\"#{c}\"" }.join(",")
      {
        :select => cols,
        :group  => cols,
        :joins  => 'LEFT JOIN "machine_updates" ON "machine_updates"."machine_id" = "machines"."id"',
        # use the unit that matches UPDATE_THRESHOLD (the MySQL version above uses .seconds)
        :having => ['MAX("machine_updates"."date") IS NULL OR MAX("machine_updates"."date") < ?', UPDATE_THRESHOLD.days.ago]
      }
    }
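If it is unclear what ends up in `cols`, the string building can be tried on its own. This is only a sketch: the `quoted_columns` helper and the column names are hypothetical, and in the scope the names come from `Machine.column_names`.

```ruby
# Builds a comma-separated list of double-quoted, table-qualified column names.
def quoted_columns(table, column_names)
  column_names.map { |c| %("#{table}"."#{c}") }.join(",")
end

# Hypothetical column names, assumed for illustration.
puts quoted_columns("machines", %w[id name created_at])
# prints "machines"."id","machines"."name","machines"."created_at"
```

The same string is passed as both :select and :group, so every selected column is also grouped.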