I have a collection of Post objects and I want to be able to sort them based on these conditions:

  • First, by category (news, events, labs, portfolio, etc.)
  • Then by date, if it has one, or by position, if a specific index was set for it

Some posts will have dates (news and events), others will have explicit positions (labs, and portfolio).

I want to be able to call posts.sort!, so I've overridden <=>, but am looking for the most effective way of sorting by these conditions. Below is a pseudo method:

def <=>(other)
  # first, everything is sorted into 
  # smaller chunks by category
  self.category <=> other.category

  # then, per category, by date or position
  if self.date and other.date
    self.date <=> other.date
  else
    self.position <=> other.position
  end
end

It seems like I'd have to actually sort two separate times, rather than cramming everything into that one method. Something like sort_by_category, then sort!. What is the most Ruby way to do this?

+1  A: 

You should always sort by the same criteria to ensure a meaningful order. When comparing two nil dates, it is fine for the position to decide the order, but when comparing one nil date with a set date, you have to decide which goes first irrespective of the position (for example, by mapping nil to a day way in the past).

Otherwise imagine the following:

a.date = nil                   ; a.position = 1
b.date = Time.now - 1.day      ; b.position = 2
c.date = Time.now              ; c.position = 0

By your original criteria, you would have: a < b < c < a. So which one is the smallest?
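
The cycle can be checked directly. This is only a minimal sketch: `NaivePost` is a hypothetical stand-in for Post that keeps just the date-or-position rule from the question, and plain seconds replace `1.day` so that ActiveSupport is not required.

```ruby
# Hypothetical stand-in for Post; only date and position matter here.
NaivePost = Struct.new(:label, :date, :position) do
  # The rule from the question: compare dates when both are set,
  # otherwise fall back to position.
  def <=>(other)
    if date && other.date
      date <=> other.date
    else
      position <=> other.position
    end
  end
end

now = Time.now
a = NaivePost.new("a", nil,          1)
b = NaivePost.new("b", now - 86_400, 2) # one day earlier
c = NaivePost.new("c", now,          0)

ab = a <=> b # compared by position (a has no date)
bc = b <=> c # compared by date
ca = c <=> a # compared by position (a has no date)

puts [ab, bc, ca].inspect # [-1, -1, -1], i.e. a < b, b < c and c < a
```

Since every pair claims "less than", no arrangement of the three satisfies all comparisons, and the result of sorting depends on which pairs the algorithm happens to compare.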

You also want to do the sort in a single pass. For your <=> implementation, use #nonzero?:

def <=>(other)
  return nil unless other.is_a?(Post)
  (self.category <=> other.category).nonzero? ||
  ((self.date || AGES_AGO) <=> (other.date || AGES_AGO)).nonzero? ||
  (self.position <=> other.position).nonzero? ||
  0
end
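
For reference, here is a self-contained sketch of the same method that can be run as is. Several things are assumptions made for illustration only: `AGES_AGO` is taken to be the Unix epoch, the Post class is reduced to three attributes, and position defaults to 0 so that it is never nil (a nil position compared with an integer would make `<=>` return nil, and `nil.nonzero?` would then raise).

```ruby
# Assumption: any Time earlier than every real post date will do.
AGES_AGO = Time.at(0)

# Hypothetical minimal Post; a real class would carry more fields.
class Post
  attr_reader :category, :date, :position

  def initialize(category, date: nil, position: 0)
    @category = category
    @date = date
    @position = position
  end

  def <=>(other)
    return nil unless other.is_a?(Post)
    # nonzero? returns nil for 0, so || falls through to the next criterion.
    (category <=> other.category).nonzero? ||
      ((date || AGES_AGO) <=> (other.date || AGES_AGO)).nonzero? ||
      (position <=> other.position).nonzero? ||
      0
  end
end

posts = [
  Post.new("news", date: Time.at(200)),
  Post.new("labs", position: 2),
  Post.new("news", date: Time.at(100)),
  Post.new("labs", position: 1)
]

summary = posts.sort.map { |p| [p.category, p.date && p.date.to_i, p.position] }
puts summary.inspect
# [["labs", nil, 1], ["labs", nil, 2], ["news", 100, 0], ["news", 200, 0]]
```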

sort also accepts a comparison block, so if you only need these comparison criteria once and don't want to define <=>, you could:

post_ary.sort{|a, b| (a.category <=> ...) }
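
When the criteria can be written as a list of keys, `sort_by` with an array key is another option: arrays compare element by element, so the order of the keys gives the priority. The following is a sketch under stated assumptions, not the definitive approach: `PostRow` is a hypothetical struct standing in for Post, `AGES_AGO` is taken to be the Unix epoch, and every position is a non-nil integer.

```ruby
# Assumption: any Time earlier than every real post date will do.
AGES_AGO = Time.at(0)

# Hypothetical stand-in for Post.
PostRow = Struct.new(:category, :date, :position)

post_ary = [
  PostRow.new("portfolio", nil,          2),
  PostRow.new("events",    Time.at(500), 0),
  PostRow.new("portfolio", nil,          1),
  PostRow.new("events",    Time.at(300), 0)
]

# The key is built once per element: category, then date, then position.
sorted = post_ary.sort_by { |p| [p.category, p.date || AGES_AGO, p.position] }

order = sorted.map { |p| [p.category, p.date && p.date.to_i, p.position] }
puts order.inspect
# [["events", 300, 0], ["events", 500, 0], ["portfolio", nil, 1], ["portfolio", nil, 2]]
```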

Notes:

  • sort_by! will be introduced in Ruby 1.9.2. You can require "backports" to use it now though.
  • I'm assuming that Post is not a subclass of ActiveRecord (in which case you'd want the sort to be done by the db server).
Marc-André Lafortune
Thanks, I wasn't aware of `Numeric#nonzero?`. Isn't it a bit odd for a `?`-method to return non-boolean values?
Mladen Jablanović
@Mladen: It is, but quite useful. Another example where you could expect a `true/false`: `String < Fixnum` returns `nil`, not `false`.
Marc-André Lafortune
A: 

Alternatively, you could do the sort in one fell swoop by comparing arrays. The only gotcha is handling the case where one of the attributes is nil, although that can still be handled with an appropriate nil guard if you know the data set. Also, it's not clear from your pseudocode whether the date and position comparisons are listed in priority order or are either/or (i.e. use date if it exists for both, else use position). The first solution assumes category, followed by date, followed by position:

def <=>(other)
    [self.category, self.date, self.position] <=> [other.category, other.date, other.position]
end

The second assumes it's date or position:

def <=>(other)
    if self.date && other.date
        [self.category, self.date] <=> [other.category, other.date]
    else
        [self.category, self.position] <=> [other.category, other.position]
    end
end
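
The nil gotcha mentioned above can be seen with bare arrays. This is a small illustrative sketch; the sample values and the epoch used as a nil guard are assumptions, not part of the original data.

```ruby
with_date    = ["news", Time.at(100), 0]
without_date = ["news", nil,          0]

# Time and nil are not comparable, so the array comparison gives nil.
unguarded = with_date <=> without_date

# sort cannot work with a nil comparison result and raises.
begin
  [with_date, without_date].sort
  raised = false
rescue ArgumentError
  raised = true
end

# Substituting a very old Time for nil makes the pair comparable again.
guarded = with_date <=> ["news", nil || Time.at(0), 0]

puts unguarded.inspect # nil
puts raised            # true
puts guarded           # 1
```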
naven87
Ah, had forgotten about the `nil` of dates. This sort order is not well ordered (see my updated answer).
Marc-André Lafortune
For my learning, what do you mean by it not being well ordered?
naven87
Marc-André Lafortune
But then you could get that with the following tweak to the first case: `[self.category, self.date || AGES_AGO, self.position] <=> [other.category, other.date || AGES_AGO, other.position]`, to use your nomenclature, right? Not as elegant, but the same result without your extra checks.
naven87
Yes, using an array this way is fine. It will be slower though, especially if some fields need to be calculated (say `#position` is a method making some calculations), but will yield the same result.
Marc-André Lafortune