I currently have a spider written in Java, using HtmlUnit, that logs into a supplier website and crawls it.
It keeps the session (cookies) across requests and even lets me enable or disable JavaScript.
I also use htmlparser (Java) to parse the HTML and extract the relevant information.
Does Python have something similar?
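
To make the requirement concrete, here is a rough sketch of the workflow using only Python's standard library: `http.cookiejar` to keep the session cookie, `urllib.request` to submit the login form, and `html.parser` to pull links out of a page. The helper names (`make_session`, `login`, `extract_links`) and the form field names `username` and `password` are assumptions for illustration only, not anything taken from the real supplier site. The standard library also does not execute JavaScript, so this covers only the session and parsing parts of what HtmlUnit does.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def make_session():
    """Build an opener that stores cookies and resends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, login_url, username, password):
    """POST the login form. The field names are hypothetical; check the real form."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    with opener.open(login_url, form.encode("utf-8")) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```

Any later `opener.open(...)` call made with the same opener would send back whatever cookies the login response set, which is the session-keeping behaviour I get from HtmlUnit today.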