I am writing a crawler. After the crawler logs into a website, I want it to stay logged in permanently. How can I do that? Can a client (a browser, a crawler, etc.) make a server obey such a rule? This scenario matters when the server only allows a limited number of logins per day.
"Logged-in state" is usually represented by cookies. So what your have to do is to store the cookie information sent by that server on login, then send that cookie with each of your subsequent requests (as noted by Aiden Bell in his message, thx).
See also this question:
http://stackoverflow.com/questions/1016765/how-to-use-cookielib-with-httplib-in-python
A more comprehensive article on how to implement it:
http://www.voidspace.org.uk/python/articles/cookielib.shtml
The simplest examples are at the bottom of this manual page:
http://www.python.org/doc/2.6.4/library/cookielib.html
You can also use a regular browser (like Firefox) to log in manually. Then you can save the cookie from that browser and use it in your crawler. But such cookies are usually only valid for a limited time, so this is not a long-term, fully automated solution. It can be quite handy for downloading content from a website once, however.
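In that case you just attach the cookie header yourself; a sketch, where the cookie name and value are placeholders for whatever your browser actually shows:

```python
import urllib2

# Copy the real cookie name/value pair from the browser's cookie viewer.
req = urllib2.Request('http://example.com/members-only')
req.add_header('Cookie', 'sessionid=abc123')
page = urllib2.urlopen(req).read()
```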
UPDATE:
I've just found another interesting tool in a recent question: Scrapy, a crawling framework that can also do this kind of cookie-based login:
http://doc.scrapy.org/topics/request-response.html#topics-request-response-ref-request-userlogin
The question I mentioned is here:
http://stackoverflow.com/questions/1804694/scrapy-domainname-for-spider
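A rough sketch of what such a login spider looks like, based on the FormRequest pattern from the linked docs; the spider name, URLs, and form fields are placeholders:

```python
from scrapy.spider import BaseSpider
from scrapy.http import FormRequest

class LoginSpider(BaseSpider):
    name = 'example'
    start_urls = ['http://example.com/login']

    def parse(self, response):
        # Submit the login form; Scrapy keeps the session cookie
        # for all requests made by this spider afterwards.
        return FormRequest.from_response(
            response,
            formdata={'username': 'me', 'password': 'secret'},
            callback=self.after_login)

    def after_login(self, response):
        # From here on, crawl pages that require the logged-in session.
        pass
```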
Hope this helps.