+3  Q: 

Screen Scraping

Hi friends, I am learning cURL and have run into one difficulty: how do I log in to a page directly with a username and password?

+1  A: 

For standard HTTP authentication, you could try:

curl http://username:password@url

It should work!
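
If the password contains characters such as @ or :, putting it in the URL gets awkward. curl's -u/--user option passes the credentials separately. A minimal sketch, assuming the page uses HTTP Basic authentication (fetch_basic is a made-up helper name, and the URL in the usage comment is a placeholder):

```shell
# Hypothetical helper: fetch a page protected by HTTP Basic authentication.
# usage: fetch_basic USER PASSWORD URL
fetch_basic() {
    # --fail makes curl exit non-zero on HTTP errors such as 401,
    # --user sends the credentials without embedding them in the URL.
    curl --fail --silent --show-error --user "$1:$2" "$3"
}

# Example call (placeholder values):
#   fetch_basic username password http://example.com/private/
```

This only helps when the server answers with an HTTP authentication challenge. A login form inside an HTML page needs a different approach.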

MatthieuP
+1  A: 

The method you need to use will depend on exactly how the web page's username/password checking is implemented, but this might help you:
http://curl.haxx.se/mail/archive-2008-05/0113.html

gkrogers
A: 

I assume you want to fetch pages hidden behind a login page, and this page is not CAPTCHA-protected. To do it, you have to

  1. send a POST request with the login form data to the submit URL of the login form (see the HTML source)
  2. save cookies
  3. send these cookies with all subsequent requests (update if necessary)

I do it with wget. curl should be similar (see its manual).

1, 2:

wget --keep-session-cookies --save-cookies "mycookies" \
     --post-data "login=mylogin&password=mypass" submit_URL

3:

wget --load-cookies "mycookies" --keep-session-cookies --save-cookies "mycookies" \
     another_URL_behind_login_form

From what I see in man curl, steps 1–2 should be something like this (not tested; -d posts URL-encoded form data, like wget's --post-data):

curl -d "login=mylogin&password=mypass" -c "mycookies" submit_URL

and 3:

curl -b "mycookies" -c "mycookies" another_URL

But I didn't try it with curl.
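
The three steps can be wrapped in two small shell functions. This is only a sketch: site_login and site_fetch are made-up names, and the field names "login" and "password" are assumptions that have to match the name attributes in the real form's HTML. --data-urlencode is used instead of a hand-written string so that special characters in the password are encoded for you.

```shell
# Cookie file shared by both helpers (same file name as in the commands above).
COOKIE_JAR="mycookies"

# Steps 1 and 2: POST the form fields and save the cookies the server sets.
# usage: site_login SUBMIT_URL LOGIN PASSWORD
site_login() {
    curl --silent --show-error --location \
         --cookie-jar "$COOKIE_JAR" \
         --data-urlencode "login=$2" \
         --data-urlencode "password=$3" \
         "$1"
}

# Step 3: send the saved cookies and write back any updated ones.
# usage: site_fetch URL
site_fetch() {
    curl --silent --show-error --location \
         --cookie "$COOKIE_JAR" --cookie-jar "$COOKIE_JAR" \
         "$1"
}

# Example calls (placeholder values):
#   site_login submit_URL mylogin mypass > /dev/null
#   site_fetch another_URL_behind_login_form
```

Some login forms also carry hidden fields (for example a CSRF token) that must be sent along with the credentials. This sketch does not handle that case.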

jetxee