ansaurus

Question

Python: is BeautifulSoup misreporting my HTML?

Answer 1

+2 A:

There are documented problems with version 3.1 of BeautifulSoup.

You might want to double-check whether that is in fact the version you are using and, if so, downgrade.
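A quick way to check which version is installed, as a rough sketch. The 3.x module exposes `BeautifulSoup.__version__`; the fallback to the newer `bs4` package is my own addition in case that is what is installed, and the helper names are made up for illustration.

```python
def installed_beautifulsoup_version():
    """Return the installed BeautifulSoup version string, or None if not found."""
    try:
        import BeautifulSoup  # the 3.x series ships as a single module
        return BeautifulSoup.__version__
    except ImportError:
        pass
    try:
        import bs4  # assumption: the later 4.x package may be installed instead
        return bs4.__version__
    except ImportError:
        return None


def is_3_1_series(version):
    """True if the version string belongs to the 3.1.x series."""
    return version.split(".")[:2] == ["3", "1"]


if __name__ == "__main__":
    version = installed_beautifulsoup_version()
    if version is None:
        print("BeautifulSoup does not appear to be installed")
    elif is_3_1_series(version):
        print("Running %s, consider downgrading" % version)
    else:
        print("Running %s" % version)
```

Run it on both machines; if they report different versions, that alone could explain the different parse results.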

Paolo Bergantino 2009-05-01 05:13:48

Answer 2

+1 A:

I suspect the problem is in the urllib2 request, not BeautifulSoup.

It might help if you show us the same section of the raw data as returned by this command on both machines:

urllib2.urlopen(base_url)
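To make that comparison easier, something like the following could be run on each machine. This is only a sketch: `summarize_raw` and `fetch_and_summarize` are hypothetical helper names, and the import fallback is there so it also runs on Python 3, where `urlopen` lives in `urllib.request`.

```python
import hashlib

try:
    from urllib2 import urlopen  # Python 2, as used in the question
except ImportError:
    from urllib.request import urlopen  # Python 3 equivalent


def summarize_raw(raw, start=0, length=200):
    """Describe a raw response body: its size, a digest, and one slice of it."""
    return {
        "length": len(raw),
        "sha256": hashlib.sha256(raw).hexdigest(),
        # repr() makes whitespace and odd bytes visible in the output
        "slice": repr(raw[start:start + length]),
    }


def fetch_and_summarize(url, start=0, length=200):
    """Fetch the URL and summarize the untouched bytes, before any parsing."""
    raw = urlopen(url).read()
    return summarize_raw(raw, start, length)
```

If the digests differ between the two machines, the server is sending different markup to each and the parser is not at fault; if they match, the difference is on the parsing side.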

This page looks like it might help: http://bytes.com/groups/python/635923-building-browser-like-get-request

The simplest solution is probably just to detect which environment the script is running in and change the parsing logic accordingly.

>>> import os
>>> os.uname() 
('Darwin', 'skom.local', '9.6.0', 'Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386', 'i386')
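A minimal sketch of that kind of switch. It uses `platform.system()` rather than `os.uname()`, since `os.uname()` is not available on Windows. The parser functions are placeholders I made up; the real per-environment logic would go in their place.

```python
import platform


def parse_default(html):
    # Placeholder: parsing logic for the markup most machines receive.
    return ("default", html)


def parse_darwin(html):
    # Placeholder: parsing logic for the markup seen on the Mac.
    return ("darwin", html)


# Map platform.system() values to the parser that should handle them.
PARSERS = {"Darwin": parse_darwin}


def pick_parser(system=None):
    """Return the parser for the given system name, or for the current one."""
    if system is None:
        system = platform.system()
    return PARSERS.get(system, parse_default)


if __name__ == "__main__":
    parser = pick_parser()
    print(parser("<html></html>"))
```

This only works around the symptom. If the two machines are actually receiving different HTML, it is still worth finding out why.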

Or get Microsoft to use web standards :)

Also, didn't you use mechanize to fetch the pages? If so, the problem may be there.

Pete Skomoroch 2009-05-01 18:51:24
