I have this very simple script that allows the user to specify the URL of any site. The script then replaces the URL in the "data" attribute of an object tag to display the site of the user's choice inside the object on the HTML page.

How can I validate the input so that the user can't load a page from my own site inside the object? I have noticed that doing so displays my code.

The code:

 <?php
 $url = 'http://www.google.com';
 if (array_key_exists('_check', $_POST)) {
    $url = $_POST['url'];
 }
 //gets the title from the selected page
 $file = @ fopen(($url),"r") or die ("Can't read input stream");
 $text = fread($file,16384);
 if (preg_match('/<title>(.*?)<\/title>/is',$text,$found)) {
         $title = $found[1];
 } else {
         $title = "Untitled Document";
 }
 ?>

Edit: (more details) This is NOT meant to be a proxy. I am letting the users decide which website is loaded into an object tag (similar to an iframe). The only thing PHP is going to read is the title tag from the input URL, so it can be loaded into the title of my site. (Don't worry, it's not to trick the user.) Although it may display the title of any site, it will not bypass any filters in any other way.

I am also aware of the vulnerabilities involved in what I am doing; that's why I'm looking into validation.

+2  A: 

Are you aware that you are creating an open HTTP proxy, which can be a really bad idea?

Do you even need to fetch the contents of the URL? Why don't you let your user's browser do that by supplying it with the URL?

Assuming you do need to fetch the URL, consider validating against a known "whitelist" of URLs. If you can't restrict it to a known list, then you are back to the open proxy again...

Use a regular expression (preg) to ensure it is a well-formed HTTP URL, and then use the cURL extension to do the actual request.

Mixing the fopen() family of functions with user-supplied parameters is a recipe for potential disaster.
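
A minimal sketch of the whitelist-plus-cURL approach described above. The function names (`is_allowed_url`, `extract_title`, `fetch_title`) and the hosts in `ALLOWED_HOSTS` are made up for illustration, and the timeout and 16 KB limit are arbitrary choices carried over from the question's code, so treat it as a starting point rather than a finished implementation.

```php
<?php
// Hypothetical whitelist; replace with the hosts you actually want to allow.
const ALLOWED_HOSTS = ['www.google.com', 'www.example.org'];

// Returns true only for http(s) URLs whose host is on the whitelist.
function is_allowed_url(string $url, array $allowedHosts = ALLOWED_HOSTS): bool
{
    // Require an explicit http:// or https:// prefix, so a bare
    // filename such as "index.php" is rejected.
    if (!preg_match('#^https?://#i', $url)) {
        return false;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // malformed URL or no host component
    }
    return in_array(strtolower($host), $allowedHosts, true);
}

// Pulls the <title> text out of an HTML fragment (same regex idea as the question).
function extract_title(string $html): string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $found)) {
        return trim($found[1]);
    }
    return 'Untitled Document';
}

// Fetches the page with cURL instead of fopen() and returns its title.
// Not called here; it needs the cURL extension and network access.
function fetch_title(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        // Restrict cURL itself to HTTP and HTTPS.
        CURLOPT_PROTOCOLS      => CURLPROTO_HTTP | CURLPROTO_HTTPS,
        CURLOPT_FOLLOWLOCATION => false,
        CURLOPT_TIMEOUT        => 5,
    ]);
    $body = curl_exec($ch);
    if (!is_string($body)) {
        return 'Untitled Document';
    }
    // Only inspect the first 16 KB, as the original script did.
    return extract_title(substr($body, 0, 16384));
}
```

Usage would look something like `$title = is_allowed_url($url) ? fetch_title($url) : 'Untitled Document';`. The title comes from a remote page, so it should still be escaped (for example with `htmlspecialchars`) before it is printed.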

gahooa
This is NOT a proxy. I am letting the users decide which website is loaded into an object tag (similar to an iframe). The only thing PHP is going to read is the title tag from the input URL, so it can be loaded into the title of my site. (Don't worry, it's not to trick the user.)
teh_noob
+2  A: 

As gahooa said, I think you need to be very careful with what you're doing here, because you're playing with fire. It's possible to do safely, but be very cautious with what you do with the data from the URL the user gives you.

For the specific problem you're having though, I assume it happens if you get an input of a filename, so for example if someone types "index.php" into the box. All you need to do is make sure that their URL starts with "http://" so that fopen uses the network method, instead of opening a local file. Something like this before the fopen line should do the trick:

if (!preg_match('/^https?:\/\//i', $url)) {
    $url = 'http://'.$url;
}
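
The prefix check above can also be wrapped in a small helper. As a sketch, the hypothetical function `normalize_url` below (the name is not from the original answer) leaves `http://` and `https://` URLs unchanged, prepends `http://` to bare input such as `index.php`, and returns null for any other explicit scheme such as `file://` or `php://` instead of rewriting it.

```php
<?php
// Hypothetical helper: returns a URL that is safe to treat as a web
// address, or null when the input names a non-HTTP scheme.
function normalize_url(string $url): ?string
{
    $url = trim($url);
    if ($url === '') {
        return null;
    }
    // Does the input already start with "scheme://"?
    if (preg_match('#^([a-z][a-z0-9+.-]*)://#i', $url, $m)) {
        $scheme = strtolower($m[1]);
        // Keep http and https; refuse everything else.
        return ($scheme === 'http' || $scheme === 'https') ? $url : null;
    }
    // No scheme given: force the network (HTTP) interpretation so a
    // bare filename is never opened as a local file.
    return 'http://' . $url;
}
```

With this in place, the script would call `normalize_url($_POST['url'])` and stop with an error when the result is null.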
Chad Birch
+3  A: 

parse_url: http://us3.php.net/parse_url

You can check the scheme and the host.

If the scheme is http, make sure the host is not your website. I would suggest using preg_match to grab the part between the dots: for www.google.com or google.com, use preg_match to extract the word google.

If the host is an IP address, I am not sure what you want to do in that situation. By default, the preg_match would only capture the middle two numbers and the dot (assuming you use preg_match to get the site name before the .com).
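
One way the scheme and host check could look, as a rough sketch. It compares the whole host name instead of extracting one word with preg_match, and it treats IP-literal hosts by accepting only public addresses, which is just one possible policy for the case left open above. `OWN_HOST` and `is_external_http_url` are made-up names, and `example.com` stands in for your own domain.

```php
<?php
// Hypothetical: the domain of the site running this script.
const OWN_HOST = 'example.com';

// Returns true for http(s) URLs that do not point back at OWN_HOST.
function is_external_http_url(string $url, string $ownHost = OWN_HOST): bool
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false; // no scheme or host, e.g. "index.php"
    }
    if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return false;
    }
    // parse_url keeps the brackets around IPv6 literals; strip them.
    $host = strtolower(trim($parts['host'], '[]'));

    // IP literal: accept only addresses outside private/reserved ranges.
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return filter_var(
            $host,
            FILTER_VALIDATE_IP,
            FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE
        ) !== false;
    }

    // Host name: reject the site itself, its subdomains, and localhost.
    $ownHost = strtolower($ownHost);
    if ($host === $ownHost || $host === 'localhost') {
        return false;
    }
    $suffix = '.' . $ownHost;
    if (substr($host, -strlen($suffix)) === $suffix) {
        return false;
    }
    return true;
}
```

This only inspects the text of the URL. A host name that resolves to the server's own address would still pass, so it narrows the problem without fully closing it.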