This is a really specialized case and I feel awkward asking about it; however, I'm at my wits' end working on it.

I need to follow a tracking number through a form to a results page, so I've been using mechanize in Python. The link after form submission is embedded in JavaScript, so I can't simply use follow_link. What I want to do is regex out the URL and then call open() on it, but when I do, I run into some problems.

I can call br.geturl() and br.title() just fine on the target page, but when I try to read the source of that page, it throws

AttributeError: mechanize._mechanize.Browser instance has no attribute read (perhaps you forgot to .select_form()?)

Is there some way to do this, or am I monkey-patching it too much? Any advice would be terrific.

edit [more code {really ugly just trying to get it to work}]:

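import re
from mechanize import Browser
from BeautifulSoup import BeautifulSoup as Soup  # assumed: the post does not show where Soup comes from

cosn="########"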
baseurl="http://aaa.com/"
search="thing.do"

br=Browser()
br.open(baseurl+search)
br.select_form('traceForm')
br['consignments']=cosn
req=br.submit()
pars=Soup(req.read())
found_url=re.match(r"javascript:window.location.href = '(?P<url>[\w\d=&?\.]+)", pars.find('td', attrs={'class':'select'})['onclick']).group('url')

br.open(baseurl+found_url)
print br.title()  # works
print br.geturl()  # works
print br.read()  # throws exception
A:

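Your first .read() call (req.read()) is not made on the Browser instance; it is made on the response returned by br.submit(). The Browser itself has no read method, which is why br.read() fails. The object returned by Browser.response() does have a read method, so if you want the body of the current response you'd need to do: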

response = br.response()
response.read()

In the future, you can use dir(obj) to list the attributes and methods of any object obj, be it a browser or anything else.
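As a rough illustration of the dir() tip, here is a minimal sketch that runs without mechanize or a network connection. FakeBrowser, FakeResponse and public_names are hypothetical stand-ins invented for this example, not part of mechanize; they only mimic the one relevant detail, namely that the browser has no read() while the object returned by response() does.

```python
class FakeResponse(object):
    """Hypothetical stand-in for a mechanize response object."""

    def __init__(self, body):
        self._body = body

    def read(self):
        # Return the stored page body, as a real response's read() would.
        return self._body


class FakeBrowser(object):
    """Hypothetical stand-in for mechanize.Browser: it has no read(),
    but response() returns an object that does."""

    def __init__(self, body):
        self._response = FakeResponse(body)

    def response(self):
        return self._response


def public_names(obj):
    """Return the names reported by dir() that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


br = FakeBrowser("<html>tracking result</html>")

browser_names = public_names(br)
response_names = public_names(br.response())

print(browser_names)
print(response_names)
print("read" in browser_names, "read" in response_names)
```

Running the same kind of check against a real Browser instance and its response() would show where read lives before you call it.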

SilentGhost
Thank you so much. I knew what I was doing was wrong, but I was feeling helpless with the mechanize documentation.
dagoof