



I have a list of hashes, as such:

incoming_links = [
 {:title => 'blah1', :url => ""},
 {:title => 'blah2', :url => ""},
 {:title => 'blah3', :url => ""}]

And an ActiveRecord model whose table has some matching rows, say:

Link.all => 
[<Link#2 @title='blah2' @url=''>,
 <Link#3 @title='blah3' @url=''>,
 <Link#4 @title='blah4' @url=''>]

I'd like to do set operations on Link.all with incoming_links so that I can figure out that <Link#4 ...> is not in the set of incoming_links, and {:title => 'blah1', :url =>''} is not in the Link.all set, like so:

#incoming_links =  as above
links = Link.all
expired_links = links - incoming_links
missing_links = incoming_links - links
missing_links.each{|link| Link.create(link)}

Crappy solution a):

I'd rather not rewrite Array#- and such, and I'm okay with converting incoming_links to a set of unsaved Link objects, so I've tried overriding hash, eql? and so on in Link so that it ignores the id equality that AR::Base provides by default. But this is the only place in the application where this sort of equality should apply; everywhere else the default Link#id identity is required. Is there some way I could subclass Link and override hash, eql?, etc. there?

Crappy solution b):

The other route I've tried is to pull out the attributes hash for each Link and do a .slice('id',...etc) to prune the hashes down. But this requires writing separate - methods to keep track of the Link objects while doing set operations on the hashes, and separate Proxy classes to wrap the incoming_links hashes and Links, which seems like overkill. Nonetheless, this is my current solution.

Can you think of a better way to design this interaction? Extra credit for cleanliness.

+1  A: 

Try this:

incoming_links = [
 {:title => 'blah1', :url => ""},
 {:title => 'blah2', :url => ""},
 {:title => 'blah3', :url => ""}]

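# attributes returns string keys, so keep only the compared columns
# and symbolize them to match incoming_links
ar_links = Link.all(:select => 'title, url').map { |l| l.attributes.slice('title', 'url').symbolize_keys }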

# which incoming links are not in ar_links
incoming_links - ar_links

# and vice versa
ar_links - incoming_links
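
One thing to watch with this approach: Array#- only removes elements that are eql?, and a hash keyed by strings (which is what ActiveRecord's attributes returns) is never eql? to one keyed by symbols. A minimal plain-Ruby sketch of the difference, where `symbolize` and `link_diff` are hypothetical helpers and the string-keyed hashes are stand-ins for real attributes hashes rather than database rows:

```ruby
# Hypothetical helper: plain-Ruby version of ActiveSupport's symbolize_keys.
def symbolize(hash)
  hash.each_with_object({}) { |(key, value), out| out[key.to_sym] = value }
end

# Hypothetical helper: diffs symbol-keyed incoming hashes against
# string-keyed attribute hashes after normalizing the keys.
def link_diff(incoming_links, stored_attrs)
  ar_links = stored_attrs.map { |attrs| symbolize(attrs) }
  { :missing => incoming_links - ar_links,   # only in incoming_links
    :expired => ar_links - incoming_links }  # only in the stored rows
end

incoming_links = [
  { :title => 'blah1', :url => '' },
  { :title => 'blah2', :url => '' }
]
# Assumed shape of Link.all.map(&:attributes), trimmed to title and url.
stored_attrs = [
  { 'title' => 'blah2', 'url' => '' },
  { 'title' => 'blah4', 'url' => '' }
]

# Without normalizing, nothing matches, so nothing is removed.
p incoming_links - stored_attrs == incoming_links  # prints true

diff = link_diff(incoming_links, stored_attrs)
p diff[:missing].map { |h| h[:title] }  # prints ["blah1"]
p diff[:expired].map { |h| h[:title] }  # prints ["blah4"]
```

Note that this still gives back bare hashes, not Link objects.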


For your Link model:

def self.not_in_array(array)
  keys = array.first.keys
  all.reject do |item|
    # build a hash of this record's values for the same keys
    hash = {}
    keys.each { |k| hash[k] = item.send(k) }
    array.include? hash
  end
end

def self.not_in_class(array)
  keys = array.first.keys
  class_array = []
  all.each do |item|
    hash = {}
    keys.each { |k| hash[k] = item.send(k) }
    class_array << hash
  end
  array - class_array
end

ar = [{:title => 'blah1', :url => ''}]
Link.not_in_array ar
#=> all links from the Link model which are not in `ar`
Link.not_in_class ar
#=> all links from `ar` which are not in your Link model
One of the key points is to keep the Link objects around, and know which have been excluded. To do so with your `:select` suggestion, I'd have to do the second thing I spoke of: write a class or method which does the subtraction while keeping the Link object ids around. Only loading the fields that I need is an interesting idea though, for when I need to speed this whole process up.
Tim Snowhite
I've updated my answer

If you rewrite the equality method, will ActiveRecord complain still?

Can't you do something similar to this (as in a regular ruby class):

class Link
  attr_reader :title, :url

  def initialize(title, url)
    @title = title
    @url = url
  end

  # Array#- compares elements using hash and eql?
  def eql?(another_link)
    title == another_link.title && url == another_link.url
  end

  def hash
    [title, url].hash
  end
end

aa = [Link.new('a', 'url1'), Link.new('b', 'url2')]
bb = [Link.new('a', 'url1'), Link.new('d', 'url4')]

(aa - bb).each{|x| puts x.title}
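
If changing eql? and hash on the model itself is a worry, the same idea could be confined to a small wrapper instead. This is only a sketch: `LinkKey`, `FakeLink` and `diff_by_key` are made-up names, and `FakeLink` is a Struct standing in for the ActiveRecord model so the example runs without Rails.

```ruby
# Stand-in for the ActiveRecord model (assumption for this sketch).
FakeLink = Struct.new(:id, :title, :url)

# Hypothetical wrapper: equal when title and url match, while the wrapped
# object (a record or an incoming hash) stays reachable through #source.
class LinkKey
  attr_reader :source, :title, :url

  def initialize(source, title, url)
    @source = source
    @title = title
    @url = url
  end

  def eql?(other)
    other.is_a?(LinkKey) && title == other.title && url == other.url
  end
  alias_method :==, :eql?

  def hash
    [title, url].hash
  end
end

# Returns [expired records, missing incoming hashes].
def diff_by_key(stored, incoming)
  stored_keys   = stored.map   { |l| LinkKey.new(l, l.title, l.url) }
  incoming_keys = incoming.map { |h| LinkKey.new(h, h[:title], h[:url]) }
  [(stored_keys - incoming_keys).map(&:source),
   (incoming_keys - stored_keys).map(&:source)]
end

stored = [FakeLink.new(2, 'blah2', ''), FakeLink.new(4, 'blah4', '')]
incoming = [{ :title => 'blah1', :url => '' }, { :title => 'blah2', :url => '' }]

expired, missing = diff_by_key(stored, incoming)
p expired.map(&:id)                  # prints [4]
p missing.map { |h| h[:title] }      # prints ["blah1"]
```

The model's own equality is left alone; only the wrappers compare by title and url.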
Do you think this would work as a subclass of Link? That way I could alter eql? and hash, but not do so globally. I fear the ActiveRecord association internals, and what they'd do with overridden `hash` and `eql?` methods, but if this was a subclass (call it, say, LinkComp) I might be able to apply #becomes(LinkComp) to all the `links` and `incoming_links.collect{|il|}`. I might try this.
Tim Snowhite

The requirements are:

#  Keep track of original link objects when 
#   comparing against a set of incomplete `attributes` hashes.
#  Don't alter the `hash` and `eql?` methods of Link permanently, 
#   or globally, throughout the application.
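
For comparison, here is a rough sketch of another way both requirements might be met without touching `hash` or `eql?` anywhere: index each side by its compared columns and diff the keys. `StoredLink` and `compare_links` are hypothetical names, and the Struct is only a stand-in for the real model so the example runs on its own.

```ruby
# Stand-in for the ActiveRecord model (assumption for this sketch).
StoredLink = Struct.new(:id, :title, :url)

# Hypothetical helper: indexes both sides by the compared columns, then
# diffs the keys. Returns [expired_links, missing_links].
def compare_links(links, incoming_links, cols = [:title, :url])
  links_by_key = links.each_with_object({}) do |link, index|
    index[cols.map { |c| link.send(c) }] = link
  end
  incoming_by_key = incoming_links.each_with_object({}) do |attrs, index|
    index[attrs.values_at(*cols)] = attrs
  end

  expired = (links_by_key.keys - incoming_by_key.keys).map { |k| links_by_key[k] }
  missing = (incoming_by_key.keys - links_by_key.keys).map { |k| incoming_by_key[k] }
  [expired, missing]
end

links = [
  StoredLink.new(2, 'blah2', ''),
  StoredLink.new(3, 'blah3', ''),
  StoredLink.new(4, 'blah4', '')
]
incoming_links = [
  { :title => 'blah1', :url => '' },
  { :title => 'blah2', :url => '' },
  { :title => 'blah3', :url => '' }
]

expired_links, missing_links = compare_links(links, incoming_links)
p expired_links.map(&:id)                 # prints [4]
p missing_links.map { |h| h[:title] }     # prints ["blah1"]
```

The original objects come back on both sides, at the cost of one lookup table per side.
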

The current solution is in effect using Hash's eql? method, and annotating the hashes with the original objects:

class LinkComp < Hash
  LINK_COLS = [:title, :url]
  attr_accessor :link
  def self.[](args)
    if args.first.is_a?(Link) #not necessary for the algorithm, 
                              #but nice for finding typos and logic errors
      args.collect do |lnk|
        lk = super(lnk.attributes.slice(*LINK_COLS.collect(&:to_s)))
        lk.link = lnk #remember the original Link
        lk
      end
    elsif args.blank?
      []
    #else #raise error for finding typos
    end
  end
end

incoming_links = [
 {:title => 'blah1', :url => ""},
 {:title => 'blah2', :url => ""},
 {:title => 'blah3', :url => ""}]

#Link.all => 
#[<Link#2 @title='blah2' @url=''>,
# <Link#3 @title='blah3' @url=''>,
# <Link#4 @title='blah4' @url=''>]

#wrap each hash in an unsaved Link so that LinkComp[] accepts it
incoming_links = LinkComp[incoming_links.collect{|i| Link.new(i)}]
links = LinkComp[Link.all] #As per fl00r's suggestion 
                           #this could be :select'd down somewhat, w.l.o.g.

missing_links =  (incoming_links - links).collect(&:link)
expired_links = (links - incoming_links).collect(&:link)
Tim Snowhite