views:

157

answers:

9
+1  Q: 

Dynamic Include

What would be the safest way to include pages with $_GET without puttingt allowed pages in an array/use switch etc. I have many pages so no thank you.

$content = addslashes($_GET['content']);

if (file_exists(PAGE_PATH."$content.html")) { 
include(PAGE_PATH."$content.html");
}

How safe is that?

Thanks.

A: 

You should use at least something like that to prevent XSS attacks.

$content = htmlentities($_GET['page'], ENT_QUOTES, 'UTF-8');

And addslashes won't protect you from SQL Injections.

FA
He's not injecting the page name into an SQL statement, nor is he emitting it to HTML; he's including a file based on user input, which requires different handling and logic to protect against path traversal attacks.
Rob
I know, but it's still very common to use addslashes for trying to prevent SQL injections.
FA
But not for including files.
Sebastian Hoitz
Yes, but he's quite clearly *not* trying to use it to prevent SQL injection, and your comment about cross-site-scripting was quite clearly out of the ball park.
Rob
A: 

Assuming the "allowed" set of pages all exist under PAGE_PATH, then I might suggest something like the following:

  • Trim page name
  • Reject page names which start with a slash (could be an attempt at an absolute path)
  • Reject page names which contain .. (could be an attempt at path traversal)
  • Explicitly prefix PAGE_PATH and include the (hopefully) safe path

If your page names all follow some consistent rules, e.g. alphanumeric characters, then you could in theory use a regular expression to validate, rejecting "bad" page names.

There's some more discussion of these issues on the PHP web site.

Rob
+1  A: 

Match it against a regex that only accepts "a-zA-Z-".

edit: I don't think that blocking specific patterns is a good idea. I'd rather do, like I said, a regex that only accepts chars that we know that won't cause exploits.

Tiago
+3  A: 

You'll sleep safer if you check the input for a valid pattern. e.g. suppose you know the included files never have a subdirectory and are always alphanumeric

if (preg_match('/^[a-z0-9]+$/', $_GET['page']))
{
    $file=PAGE_PATH.$_GET['page'].".html";
    if (file_exists($file))
    {
         readfile($file);
    }
}

I've used readfile, as if the .html files are just static, there's no need to use include.

The possible flaw with your approach is that you can engineer a path to any HTML file in the system, and have it executed as PHP. If you could find some way to get an HTML file of your own devising on the filesystem, you can execute it through your script.

Paul Dixon
And of course "getting a file of [the attacker's] own devising on the file system" is what script kiddies excel at. file_exists() really doesn't buy the OP anything here but an illusion of security.
Ben Dunlap
Problem is that your approach (Paul Dixon's) also suffers from the flaw you identified in the OP's approach -- except that in your case, the execution context is the end-user's browser. Seems the uber-downvoted answer below about XSS wasn't totally out of the ballpark after all...
Ben Dunlap
that's true, but at least it limits the attack possibility to one directory rather than one of the attackers choosing.
Paul Dixon
A: 

It looks generally safe as in you are checking that the page actually exists before displaying it. However, you may wish to create a blacklist of pages that people should not be able to view with the valid $_SESSION credentials. This can be done either with an array/switch or you can simply have all special pages in a certain directory, and check for that.

Mike
A: 

You could scan the directory containing all HTML templates first and cache all template names in an array, that you can validate the GET parameter against. But even if you cache this array it still creates some kind of overhead.

Sebastian Hoitz
+3  A: 

This is very bad practice. You should setup a controller to handle dispatching to the code that needs to be executed or retrieved rather than trying to directly include it from a variable supplied by a user. You shouldn't trust user input when including files, ever. You have nothing to prevent them from including things you do not want included.

Syntax
+1 for giving the generic version of my answer, which for some reason got downvoted. ;-)(file_exists() == TRUE) means *nothing*. Unless you are perfectly certain that your file system is beyond compromise.
Ben Dunlap
A: 

Don't. You'll never anticipate all possible attacks and you'll get hacked.

If you want to keep your code free of arrays and such, use a database with two columns, ID and path. Request the page by numeric ID. Ignore all requests for ids that are not purely numeric and not in your range of valid IDs. If you're concerned about SEO you can add arbitrary page names after the numeric id in your links, just like Stack Overflow does.

The database need not be heavy-duty. You can use SQLite, for example.

Ben Dunlap
Hey, at least explain why you downvoted!
Ben Dunlap
A: 

The safest method involves cleaning up the request a bit.

  1. Strip out any ../
  2. Strip out ^\/

From there, make sure that you check to see if the file they're requesting exists, and can be read. Then, just include it.

gms8994