Colleagues often ask me: “bobince”, they say*, “I want to learn PHP, but I know you're always ranting on about poor code which is full of errors and security holes. That's why I normally don't like talking to you really. But, I'm looking to learn PHP now and I'd like to be able to write good code. Where's a tutorial that will teach me how to do it properly, so my site won't get all hacked up and you won't get all cross at me?”
“Hmm...” I reply, then point and call out “Look over there! A lovely pony!”, and run away.
I do keep looking for a PHP tutorial that isn't awful, full of disastrously terrible practices, but I've yet to find anything I can recommend.
Dear reader, can you perchance help me out? Unlike previous questions, I'm not after a security-specific tutorial for people who can already code in PHP; I'm looking for a good tutorial (sites or books) for those new to the language, that happens to be solid on security and writing readable code. I want the reader to be able to learn PHP properly from the start, not to have to say “well you can learn it here... but then afterwards you'll need to go to this other tutorial to find out about the bad habits and fundamental misunderstandings you've just picked up”.
(*: they don't say that. It is not my real name.)
I want a tutorial that:
Uses HTML-escaping consistently from the start. The very first “Hello, your_inputted_name!” example should be using htmlspecialchars() correctly. This should not be introduced as an afterthought in a separate security chapter. There should be no HTML-injection hole anywhere in the given example code.
Either uses SQL-escaping consistently from the start, or parameterised queries. If SQL-escaping is used it should be correct escaping such as mysql_real_escape_string() if the database is MySQL. I do not want to see addslashes(). There should be no SQL-injection hole anywhere in the given example code.
More generally, the tutorial should understand the problems to do with putting a string inside another string, and treat escaping as a matter of correctness, not merely of security. That is to say there is no reason a user called N'gwale O'Reilly should be prevented from having an apostrophe in their name, or from talking about the HTML <script> tag in their message like what I'm doing now; they merely need the right form of encoding when they're output.
The tutorial should explain that when a string goes into a new context it needs an encoding process appropriate for that context, like htmlspecialchars() in HTML output; it should not regard less-than symbols as ‘evil’ and attempt to ‘sanitise’ them away. I don't want to see strip_tags. I absolutely don't want to see misguided ‘security’ measures like looping over the $_GET array removing punctuation characters, or blanket-applying haphazard output-stage escaping to an input stage.
There is so much bad code and bad examples like this out there, even in learning materials that are supposed to be explicitly about security. As questions on SO have proved, it is difficult to fix people's misunderstanding of how and when string escaping needs to happen once they've learned a quick hack ‘solution’ from some misconceived ‘PHP Security’ site or book.
I don't want to see eval(). I don't want to see system(). Nothing good comes of having these in a tutorial!
There should be proper separation of active logic and page markup. I don't mean they have to be kept religiously in different files or using a specialised templating language, but I do at least want the actions up at the top and the page down at the bottom with only display logic inside it. I don't want to see an echo or print hidden inside the guts of some program logic.
Actually I don't really want to see echo/print used at all except as the only thing in an output block. PHP is a templating language; there is no reason to go about creating complex strings of HTML then echoing them. This only encourages the use of unescaped "<p>$var</p>"-style interpolation, and makes the code difficult to follow. All markup should be literal markup in the PHP file, not hidden within a string.
Proper use of indentation is essential, both in the HTML tag hierarchy and in the PHP code. Ideally there should be a single hierarchy, using structures like <?php if () { ?> ... <?php } ?> (or the alternative syntax) as if they were well-formed XML. The HTML should itself be in ‘well-formed’ style even if it is not actually XHTML.
Some mention of XSRF would be a nice bonus.
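On the XSRF point, one common approach is a per-session token that every state-changing form has to send back. What follows is only a minimal sketch, assuming PHP 7 or later; csrf_token(), csrf_check() and the 'csrf_token' session key are hypothetical names of my own, not anything built into PHP.

```php
<?php
// Hypothetical helpers; the names and the 'csrf_token' key are assumptions.

// Return the token stored in the session array, creating it on first use.
function csrf_token(array &$session): string {
    if (empty($session['csrf_token'])) {
        // 32 random bytes, hex-encoded to 64 characters.
        $session['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $session['csrf_token'];
}

// Compare a submitted token with the stored one in constant time.
function csrf_check(array $session, $submitted): bool {
    return isset($session['csrf_token'])
        && is_string($submitted)
        && hash_equals($session['csrf_token'], $submitted);
}
```

In a page, the value of csrf_token($_SESSION) would go, HTML-escaped like any other string, into a hidden form field, and the handler would refuse the POST unless csrf_check($_SESSION, $_POST['token'] ?? null) returns true.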
In short, I want a tutorial that teaches one to code something like [assuming predefined output-escaping function shortcuts]:
<?php
    $result= mysql_query('SELECT * FROM items WHERE category='.m($_POST['category']));
?>
<table>
    <?php while($row= mysql_fetch_assoc($result)) { ?>
        <tr id="row-<?php h($row['id']); ?>">
            <td>
                <?php h($row['title']); ?>
            </td>
            <td class="thing">
                <a href="/view.php?id=<?php u($row['id']); ?>">View</a>
            </td>
        </tr>
    <?php } ?>
</table>
and not like:
<?php $category=$_POST["category"];
$result = mysql_query("SELECT * FROM items WHERE category=$category");
echo "<table>";
while($row=mysql_fetch_assoc($result))
{$html="<TR id=row{$row[id]}><td class=\"thing\">".$row[title];
$html.="</td><td><a href=\"/view.php?id={$row[id]}\">View</a></td></TR>";
print $html; }
print "</table>"; ?>
which is the sort of Other People's Code I'm fed up of fixing. Is there anything out there I can recommend, in the hope that when I end up looking after the code it doesn't look like this kind of mess?
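For what it's worth, the h() and u() shortcuts assumed in the first listing could be defined roughly as below. This is a sketch under my own assumptions (PHP 7.1 or later, UTF-8 pages), and render_greeting() is a hypothetical function that exists only to show the helpers inside literal markup. I've left out m() because it depends on the database connection in use.

```php
<?php
// Hypothetical definitions of the output-escaping shortcuts.

// Echo a value escaped for HTML text content or a quoted attribute value.
function h($text): void {
    echo htmlspecialchars((string) $text, ENT_QUOTES, 'UTF-8');
}

// Echo a value encoded as a URL component, then escaped for the
// HTML attribute that holds the URL.
function u($text): void {
    echo htmlspecialchars(rawurlencode((string) $text), ENT_QUOTES, 'UTF-8');
}

// Display logic only: literal markup with the helpers as the sole
// content of each output block.
function render_greeting(string $name): void { ?>
<p>Hello, <?php h($name); ?>!</p>
<p><a href="/view.php?name=<?php u($name); ?>">View</a></p>
<?php }
```

Calling render_greeting("N'gwale O'Reilly <script>") keeps the apostrophes and the tag name as visible text: the browser shows them as typed, but neither can break out of the markup or the link.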
I know I'm being too prescriptive and everything, but is there anything even close? So far I've yet to find a tutorial without SQL-injection in it, which is just so depressing.
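To be concrete about the parameterised-query option, here is a minimal sketch of the kind of thing I mean. It assumes PDO with the SQLite driver is installed, purely so it can run against a throwaway in-memory database; find_items() and the items table layout are hypothetical.

```php
<?php
// Hypothetical helper: fetch the rows for one category using a bound
// parameter, so the value never becomes part of the SQL string.
function find_items(PDO $db, string $category): array {
    $stmt = $db->prepare('SELECT id, title FROM items WHERE category = :category');
    $stmt->execute([':category' => $category]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Throwaway in-memory database (assumes the pdo_sqlite extension).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, category TEXT)');

// Apostrophes in the data are stored as-is; no manual escaping needed.
$insert = $db->prepare('INSERT INTO items (title, category) VALUES (?, ?)');
$insert->execute(["O'Reilly's guide", 'books']);
$insert->execute(['Teapot', 'kitchen']);

$rows = find_items($db, 'books');
```

A category value such as x' OR '1'='1 is treated as an ordinary string that matches no rows, rather than as a piece of SQL.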