I'm following instructions on http://www.rubyrss.com/ to parse a feed from craigslist.org:

http://seattle.craigslist.org/sof/index.rss

Everything seems to run fine, but I can't get any dates from the parsed object:

irb(main):010:0> rss.date
NoMethodError: undefined method `date' for #<RSS::RDF:0x2c412b8>
        from (irb):10
irb(main):011:0> rss.channel.date
NoMethodError: undefined method `date' for #<RSS::RDF::Channel:0x2c406ec>
        from (irb):11
        from :0
irb(main):012:0> rss.items[0].date
NoMethodError: undefined method `date' for #<RSS::RDF::Item:0x2cdc290>
        from (irb):12
        from :0
irb(main):013:0> rss.items[1].date
NoMethodError: undefined method `date' for #<RSS::RDF::Item:0x2cd04a4>
        from (irb):13
        from :0

What am I doing wrong here?

+2  A: 

Take a look at the RSS feed in Firefox so you can easily see its structure. The dates are stored in "Dublin Core" <dc:date> elements rather than in a plain date field, which is why the date method is undefined.

Try this:

require 'rss/dublincore'
rss.items[3].dc_date  #=>  Sat Apr 18 01:02:11 -0400 2009

More details are in the Ruby RSS parser reference and the Dublin Core documentation.
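To try this without hitting the network, here is a minimal sketch that parses a small inline RSS 1.0 (RDF) document and reads the Dublin Core date. The sample feed, its example.com URLs, and the SAMPLE_FEED constant are made up for illustration and are not the real craigslist feed. On recent Ruby versions the rss library ships as a separate gem, so it may need to be installed first.

```ruby
require 'rss'
require 'rss/dublincore' # adds the dc_* accessors (already loaded by 'rss' on most versions)

# Hypothetical RSS 1.0 feed, shaped like one that uses <dc:date> for its dates.
SAMPLE_FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns="http://purl.org/rss/1.0/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="http://example.com/index.rss">
      <title>Example feed</title>
      <link>http://example.com/</link>
      <description>Hypothetical sample feed</description>
      <items>
        <rdf:Seq>
          <rdf:li rdf:resource="http://example.com/1.html"/>
        </rdf:Seq>
      </items>
    </channel>
    <item rdf:about="http://example.com/1.html">
      <title>First post</title>
      <link>http://example.com/1.html</link>
      <dc:date>2009-04-18T01:02:11-04:00</dc:date>
    </item>
  </rdf:RDF>
XML

# The second argument turns off strict validation, which is more forgiving
# with real-world feeds.
rss = RSS::Parser.parse(SAMPLE_FEED, false)
item = rss.items.first

puts item.title   # the item's <title>
puts item.dc_date # a Time built from the <dc:date> element
```

Other Dublin Core elements follow the same naming pattern, so a <dc:creator> element would be read with dc_creator.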

slothbear
Cool! I see other tags that look like <foo:bar> too. I'm not that good with XML, but I just remembered that "dc" is a tag namespace (at least I think that's what they're called), and the top of the document has some references where I can find out more about them.
allyourcode
+3  A: 

You really should switch libraries. I'd recommend using Feedzirra. It's way, way, way faster, and it's actively maintained.

require "feedzirra"
feed = Feedzirra::Feed.fetch_and_parse("http://seattle.craigslist.org/sof/index.rss")
feed.entries.first.published
# => Fri Apr 24 18:27:28 UTC 2009
Bob Aman
Thanks sporkmonger! I will certainly take a look.
allyourcode