I ultimately want to get data from this page:

http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?trackingNumber=0656887000494793

But that page forwards to:

http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?execution=eXs1

So when I use open (from open-uri) to fetch the data, it raises a RuntimeError saying "HTTP redirection loop".

So I'm not really sure how to get that data after it redirects and throws that error.

A: 

The site seems to be doing some of the redirection logic with sessions. If you don't send back the session cookies it sets in the first response, you will end up in a redirect loop. IMHO it's a crappy implementation on their part.

However, I tried to pass the cookies back to them, but I didn't get it to work, so I can't be completely sure that that is all that's going on here.
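For reference, here is a rough sketch of what "passing the cookies back" could look like with plain Net::HTTP, following the redirects by hand. The helper names (store_cookies, cookie_header, fetch_with_cookies) are made up for illustration, the cookie handling ignores attributes like Path and Expires, and as noted above this may well not be enough for this particular site.

```ruby
require 'net/http'
require 'uri'

# Merge the name=value pairs from Set-Cookie headers into a plain hash "jar".
# Attributes such as Path or Expires are ignored in this sketch.
def store_cookies(jar, set_cookie_headers)
  Array(set_cookie_headers).each do |header|
    pair = header.to_s.split(';', 2).first.to_s.strip
    name, value = pair.split('=', 2)
    jar[name] = value.to_s unless name.nil? || name.empty?
  end
  jar
end

# Build the value of the Cookie request header from the jar.
def cookie_header(jar)
  jar.map { |name, value| "#{name}=#{value}" }.join('; ')
end

# Follow redirects manually, sending back every cookie received so far.
# Gives up after `limit` requests instead of looping forever.
def fetch_with_cookies(url, limit = 10)
  jar = {}
  uri = URI.parse(url)
  limit.times do
    request = Net::HTTP::Get.new(uri.request_uri)
    request['Cookie'] = cookie_header(jar) unless jar.empty?
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
      http.request(request)
    end
    store_cookies(jar, response.get_fields('Set-Cookie'))
    return response.body unless response.is_a?(Net::HTTPRedirection)
    uri = uri.merge(response['Location']) # Location may be a relative URL
  end
  raise "gave up after #{limit} redirects"
end
```

You would call it as fetch_with_cookies("http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?trackingNumber=0656887000494793") and print the returned body. open-uri does not resend cookies between redirects, which is why a hand-rolled loop (or a library that keeps a cookie jar) is needed for this approach.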

Theo
Right, that's what I'm asking...because it's a redirect, how do I get the data from the page it's redirecting to?
Shpigford
I've rephrased my answer to make my point more clear. I wasn't just saying that it was a redirect, I also explained why you ended up in a loop, hopefully that should be plain now.
Theo
+6  A: 

You need a tool like Mechanize. From its description:

The Mechanize library is used for automating interaction with websites. Mechanize automatically stores and sends cookies, follows redirects, can follow links, and submit forms. Form fields can be populated and submitted. Mechanize also keeps track of the sites that you have visited as a history.

which is exactly what you need. So,

sudo gem install mechanize

then

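require 'mechanize'
# Current Mechanize releases expose the class as Mechanize; very old ones used WWW::Mechanize.
agent = Mechanize.new
# The query string must start with "?".
page = agent.get "http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?trackingNumber=0656887000494793"
puts page.body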

and you're ready to rock 'n' roll.

Vlad Zloteanu