I need to make a proxy script that can access a page hidden behind a login screen. I do not need the proxy to "simulate" logging in. Instead, the login page HTML should be displayed to the user normally, and all the cookies and HTTP GET/POST data should flow through the proxy to the server, so the login is authentic.

I don't want the login/password, I only need access to the HTML source code of the pages generated after logging in.

Does anybody here know how this can be accomplished? Is it easy?

If not, where do I begin? (I'm currently using PHP.)

+1  A: 

What you are talking about is accessing pages for which you need to authenticate yourself.

Here are a few things that must be laid down:

  • you can't view those pages without authenticating yourself.
  • if the website (whose HTML code you want to see) only supports web login as an authentication method, you will need to simulate login by sending a (username, password) pair via POST or GET, as the case may be
  • if the website will let you authenticate yourself in other ways (like LDAP, Kerberos etc), then you should do that

The key point is that you cannot gain access without authenticating yourself first.

As for language, it is pretty doable in PHP. And as the tags on the question suggest, you are using the right tools to do that job already.

One thing I would like to know: why are you calling it a "proxy"? Do you want to serve the content to other users?

EDIT: [update after comment]

In that case, use PHProxy. It does what you want, along with a host of other features.

Here Be Wolves
I think you're a little confused. My proxy script will NOT log in automatically by feeding GET/POST data to the login page. Instead, (1) I will transparently show the original login page's HTML to the proxy's visitor, (2) the visitor will log in using his own username/password, and (3) I will pass that request data on to the original login server without interfering with it.
Jenko
Oh. Answer updated.
Here Be Wolves
A: 

I would recommend using cURL (a PHP library that you might need to enable in your php.ini). It's used to talk to remote websites, and it handles cookies and all the HTTP parameters you need. You'll have to write your proxy around the web pages you're hitting, but it'll do the job.
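To make that a bit more concrete, here is a minimal sketch of relaying one visitor request with cURL. The helper names build_curl_options() and forward_request() are made up for illustration, and it assumes the visitor's raw Cookie header, POST fields, and an optional Referer are simply passed upstream unchanged. It is a starting point under those assumptions, not a complete proxy.

```php
<?php
// Sketch only: build_curl_options() and forward_request() are hypothetical helpers.

/** Collect the cURL options for one relayed request. */
function build_curl_options(string $method, array $post, string $cookieHeader, ?string $referer = null): array
{
    $options = [
        CURLOPT_RETURNTRANSFER => true,   // return the response instead of printing it
        CURLOPT_HEADER         => true,   // keep response headers (e.g. Set-Cookie) in the result
        CURLOPT_FOLLOWLOCATION => false,  // do not follow redirects silently
    ];
    if ($cookieHeader !== '') {
        $options[CURLOPT_COOKIE] = $cookieHeader;    // visitor's cookies, passed through as-is
    }
    if ($referer !== null) {
        $options[CURLOPT_REFERER] = $referer;        // some login pages check the referring page
    }
    if (strtoupper($method) === 'POST') {
        $options[CURLOPT_POST] = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($post);  // form fields, URL-encoded
    }
    return $options;
}

/** Send the request upstream; returns the raw response (headers + body) or false on failure. */
function forward_request(string $url, array $options)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, $options);
    return curl_exec($ch);
}

// Possible usage inside the proxy script (assumes a validated "url" parameter):
// $options = build_curl_options($_SERVER['REQUEST_METHOD'], $_POST, $_SERVER['HTTP_COOKIE'] ?? '');
// $raw = forward_request($_GET['url'], $options);
```

Keep in mind that cookies set by the remote site belong to its domain, so the proxy would probably have to re-issue them under its own domain before the visitor's browser sends them back.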

Rodolphe
You may run into issues authenticating if the site expects the login request to come from a specific referring page, so make sure you take that into account too when using cURL.
bumperbox
A: 

If you are unsure how to do it, here is some reference code. As far as I understand, your problem can be solved with cURL.

http://prad.tutorboy.com/utube.zip

The above file is an example that accepts a YouTube URL as POST data and pushes the video to the user as a download. I think you can use the same approach to authenticate your users with the username and password as POST data. (The file won't work anymore because YouTube has updated their pages, but it was working fine before.)

I hope it helps. :-)

pMan
+1  A: 

Have your PHP script request the URL you want, and rewrite all links and form actions to point back to your php script. When receiving requests to the script that have a URL parameter, forward that to the remote server and repeat.

You won't be able to catch all JavaScript requests (unless you implement a JavaScript portion of your "proxy").

E.g., the user types http://example.com/login.php into your proxy form.

Send the user to http://yoursite.com/proxy.php?url=http://example.com/login.php

Make sure to urlencode the parameter "http://example.com/login.php".

In http://yoursite.com/proxy.php, you make an HTTP request to http://example.com/login.php

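// make the HTTP request to the requested URL
$url = isset($_REQUEST['url']) ? $_REQUEST['url'] : '';
// only fetch http(s) URLs, so the script cannot be used to read local files
if (!preg_match('#^https?://#i', $url)) {
    die('Invalid URL');
}
$content = file_get_contents($url);

// parse all links and form actions and redirect back to this script
$content = preg_replace("/some-smart-regex-here/i", "$1 or $2 smart replaces", $content);

echo $content;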

Note that /some-smart-regex-here/i is a placeholder for a regular expression you should write yourself to match links, form actions, and such.
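If writing that regex turns out to be painful, one alternative is to rewrite the links with PHP's DOMDocument instead. That is a different technique from the regex above. A rough sketch follows; rewrite_links() and to_absolute() are hypothetical helper names, and the relative URL handling is deliberately simplified (it ignores "../" segments, for instance).

```php
<?php
// Sketch only: rewrite_links() and to_absolute() are hypothetical helpers.

/** Resolve a link against the page it came from (simplified; ignores "../" segments). */
function to_absolute(string $link, string $pageUrl): string
{
    if (preg_match('#^https?://#i', $link)) {
        return $link;                                    // already absolute
    }
    $parts  = parse_url($pageUrl);
    $origin = $parts['scheme'] . '://' . $parts['host'] . (isset($parts['port']) ? ':' . $parts['port'] : '');
    if (strpos($link, '//') === 0) {
        return $parts['scheme'] . ':' . $link;           // protocol-relative
    }
    if (strpos($link, '/') === 0) {
        return $origin . $link;                          // root-relative
    }
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);   // directory of the current page
    return $origin . $dir . $link;
}

/** Point every <a href> and <form action> back at the proxy script. */
function rewrite_links(string $html, string $pageUrl, string $proxyUrl): string
{
    libxml_use_internal_errors(true);                    // tolerate sloppy real-world markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    foreach (['a' => 'href', 'form' => 'action'] as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $link = $node->getAttribute($attr);
            if ($link === '') {
                if ($tag !== 'form') {
                    continue;                            // nothing to rewrite
                }
                $link = $pageUrl;                        // an empty action posts back to the same page
            }
            // Leave fragments and non-HTTP schemes (mailto:, javascript:) alone.
            if ($link[0] === '#' || preg_match('#^(?!https?:)[a-z][a-z0-9+.-]*:#i', $link)) {
                continue;
            }
            $node->setAttribute($attr, $proxyUrl . '?url=' . rawurlencode(to_absolute($link, $pageUrl)));
        }
    }
    return $doc->saveHTML();
}
```

One caveat: for forms submitted with GET, the browser replaces the query string of the action, so the url parameter would be lost. Those forms would probably need a hidden "url" field instead. Also, loadHTML() guesses the character set, so non-ASCII pages may need extra care.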

The example only proxies the HTTP body. You may also want to proxy the HTTP headers (cookies travel in them). For that you can use fsockopen() or the PHP stream functions in PHP5+ (stream_socket_client() etc.).
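As a rough illustration of the socket approach, the sketch below sends a bare HTTP/1.0 GET with stream_socket_client() and splits the reply into header lines and body. The function names build_raw_request(), split_response() and fetch_raw() are invented for this example, and it assumes plain HTTP on port 80 (no HTTPS, no POST), so treat it as a starting point only.

```php
<?php
// Sketch only: build_raw_request(), split_response() and fetch_raw() are hypothetical helpers.

/** Build a minimal HTTP/1.0 GET request, passing along selected visitor headers. */
function build_raw_request(string $url, array $headers = []): string
{
    $parts = parse_url($url);
    $path  = ($parts['path'] ?? '/') . (isset($parts['query']) ? '?' . $parts['query'] : '');
    $lines = ["GET $path HTTP/1.0", 'Host: ' . $parts['host']];
    foreach ($headers as $name => $value) {
        // Strip line breaks so a header value cannot inject extra headers.
        $lines[] = $name . ': ' . str_replace(["\r", "\n"], '', $value);
    }
    return implode("\r\n", $lines) . "\r\n\r\n";
}

/** Split a raw response into [header lines, body]. */
function split_response(string $raw): array
{
    $pieces = explode("\r\n\r\n", $raw, 2);
    return [explode("\r\n", $pieces[0]), $pieces[1] ?? ''];
}

/** Send the request over a plain TCP socket and return [header lines, body]. */
function fetch_raw(string $url, array $headers = []): array
{
    $parts  = parse_url($url);
    $target = 'tcp://' . $parts['host'] . ':' . ($parts['port'] ?? 80);
    $socket = stream_socket_client($target, $errno, $errstr, 10);
    if ($socket === false) {
        throw new RuntimeException("Connection failed: $errstr");
    }
    fwrite($socket, build_raw_request($url, $headers));
    $raw = stream_get_contents($socket);   // HTTP/1.0: the server closes the connection when done
    fclose($socket);
    return split_response($raw);
}

// Possible usage inside the proxy script:
// [$headerLines, $body] = fetch_raw($url, ['Cookie' => $_SERVER['HTTP_COOKIE'] ?? '']);
// foreach ($headerLines as $line) {
//     if (stripos($line, 'Set-Cookie:') === 0) { header($line, false); }
// }
// echo $body;
```

Passing Set-Cookie lines through unchanged will likely only work when they carry no Domain attribute, since the browser sees the proxy's domain rather than the remote one.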

bucabay