I have the following robots.txt:

User-agent: *
Disallow: /images/
Sitemap: http://www.example.com/sitemap.xml

and the following robot parser code:

import robotparser
import urlparse

def init_robot_parser(URL):
    # Build the robots.txt URL relative to the site root and parse it
    robot_parser = robotparser.RobotFileParser()
    robot_parser.set_url(urlparse.urljoin(URL, "robots.txt"))
    robot_parser.read()

    return robot_parser

But when I put a print robot_parser just above the return robot_parser line, all I get is

User-agent: *
Disallow: /images/

Why is it ignoring the Sitemap line? Am I missing something?

+2  A: 

Sitemap is an extension to the standard, and robotparser doesn't support it. You can see in the source that it only processes "user-agent", "disallow", and "allow". For its current functionality (telling you whether a particular URL is allowed), understanding Sitemap isn't necessary.
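If you need the Sitemap entries, one workaround is to fetch robots.txt yourself and pick them out. A minimal sketch, assuming Python 2 to match your code (get_sitemaps is just an illustrative name, not a library function):

import urllib2
import urlparse

def get_sitemaps(URL):
    # Fetch robots.txt directly, since robotparser discards Sitemap lines
    robots_url = urlparse.urljoin(URL, "robots.txt")
    sitemaps = []
    for line in urllib2.urlopen(robots_url).read().splitlines():
        line = line.strip()
        # Field names in robots.txt are case-insensitive
        if line.lower().startswith("sitemap:"):
            sitemaps.append(line.split(":", 1)[1].strip())
    return sitemaps

For the robots.txt above, get_sitemaps("http://www.example.com/") would return ["http://www.example.com/sitemap.xml"].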

Matthew Flaschen
True, but I need to see if there are sitemaps specified in order to parse them. I guess I'll just have to open the robots.txt through urlopen. Thanks.
Ben