I keep hearing about how often PHP results in lousy code and practices. Given these criticisms and the popularity of the language, what are the best practices (in general) to ensure that these poor habits are avoided? Following frameworks? Does it matter?
Yes, it does matter. A developer carries the same programming practices along to every new language they learn and makes the same mistakes all over again without knowing why they are wrong. In other programming languages those mistakes can be costly or cause maintenance nightmares.
PHP has a very low barrier to entry: it does not take much to get your first web script up and running, and in general the language is very forgiving. Developers don't have to worry about handling variables correctly, there is no type checking, and there is no real structure required.
There is also very little thought built into the PHP programming language concerning security. Developers can easily take values supplied by users and place them straight into their output. There is very little regard for type checking, and most developers don't worry about passing by value or by reference.
As to how to avoid such programming practices? I don't have a clear-cut answer. Training and better tutorials on the web, with more warnings and clearer guidelines, would help. Checking out how the big guys do it and learning from them is considered good practice. Using an existing framework forces one to follow its guidelines and to use its various pieces of infrastructure, thereby making it harder to make "rookie" mistakes.
One last note: I don't think that PHP is the only offender out there. It just happens to be the worst offender because it is so ubiquitous. Any user is able to buy hosting that offers PHP, and there are even free hosts that offer PHP. It is also one of those programming languages that can be embedded into HTML extremely easily, so there are many tutorials and guides out there offering people simple ways to add a guest book to their web site, or a counter, or an email form, etc.
Picking up PHP as a first programming language is a major issue as PHP doesn't really lend itself to learning good programming practices, and often results in poor code quality and a lot of unlearning that must be done before taking on another language.
That being said, I think PHP as a web development technology is a good choice - I'd just be careful who I teach it to. You can mitigate most of these concerns by doing things like:
- Learning a good general programming language as a first language, such as C#.
- Reading a good general programming book, such as Design Patterns, just to get some understanding of how good programmers write code.
- Working with more experienced programmers on an existing project and making sure they do code review on your code. Open source projects let you get both for free.
Using a framework is not that good an idea (for picking up good skills), unless you pay a lot of attention to how the framework itself is written and you play with some framework code yourself. But please consider one of the above options as well.
The most important thing is not to use register_globals. Complementary to that, make sure you always declare and initialize your variables before you use them.
This is generally bad:
<?php echo $foo;
and this is generally good:
<?php $foo = 0; echo $foo;
Always remember to sanitize your inputs when building an SQL query. To help with that, consider using PDO.
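A minimal sketch of what that looks like with a PDO prepared statement. The in-memory SQLite database, the users table and the :name placeholder are assumptions made up for illustration (and the pdo_sqlite extension needs to be available); swap in your own DSN and schema.

```php
<?php
// Illustrative only: an in-memory SQLite database stands in for a real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice')");
$pdo->exec("INSERT INTO users (name) VALUES ('bob')");

// Hostile input of the kind that breaks SQL built by string concatenation.
$name = "alice' OR '1'='1";

// The placeholder keeps the value out of the SQL text entirely.
$stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
$stmt->execute([':name' => $name]);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

echo count($rows), " row(s) matched\n";
```

Because the value is bound rather than spliced into the query string, the quote characters in it are treated as data, so no row matches.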
Don't let the simplicity of the language fool you into thinking you can make poor design choices. Probably the main reason why C or C++ code is so much better is that programmers must think critically about every decision. In PHP, it is easy not to think at all, resulting in horrible code.
I would suggest that you read and follow coding conventions from well-known sources.
Whatever coding guidelines you choose to follow, the most important aspect is consistency.
I would also recommend that you use a templating engine, to avoid making unmaintainable tag-soup files.
Take a look at Smarty.
I'm going to answer this question from the perspective of what not to do more than what to do. This is because I think that at the end of the day it's more important to stamp out bad practices than it is to encourage good ones (which can be highly subjective anyway). Bad practices, however, are most often universally agreed upon.
The other thing I'll add is that in some ways this answer won't help you as much as it could because part of doing anything the right way--particularly something as loosely typed and structured as PHP--comes from experience so any answer given may help guide you but it'll only take you so far.
The first thing I'll say about PHP is that the advantage of PHP is that it has a low barrier to entry and is highly flexible. The disadvantage of PHP is that it has a low barrier to entry and is highly flexible.
I've seen some terrible PHP code. That's because anyone can (and does) pick it up and start programming with it. This of course leads them to do things the easiest way they can, which is usually the first way they find that does what they want (which is human nature in general).
The biggest problems I've seen in PHP are:
- Not sanitizing input to SQL queries, creating vulnerabilities to SQL injection;
- Not sanitizing user input and cookie data creating XSS (cross site scripting) vulnerabilities;
- Interspersing database queries with other code, which just creates a mess;
- Including files based on user input (eg include "$inc.php");
- Cutting and pasting code rather than using functional decomposition; and
- Not reusing code at all.
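To make the XSS item concrete, here is a small sketch of escaping untrusted text at the point where it is written into HTML. The helper name e() and the sample comment are made up for illustration; htmlspecialchars() is the standard PHP function doing the work.

```php
<?php
// Hypothetical helper: escape a value for use in HTML text or attribute content.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Stand-in for something like $_GET['comment'] or a cookie value.
$comment = '<script>alert("xss")</script>';

// The markup characters are turned into entities, so the browser shows text
// instead of running a script.
echo '<p>', e($comment), "</p>\n";
```

Having one short helper also speaks to the reuse items in the list: the escaping rules live in one place instead of being pasted around.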
The advantage of PHP being highly flexible is that if you do know what you're doing PHP can be great because it just doesn't get in your way. By this I mean that I am primarily a Java developer of some 10+ years experience and the trend in the last 5+ years has been to go for highly layered approach to solving any problem.
So if you want to add an extra field on your Web page you have to add it to:
- Your JSP page;
- Your model object;
- Your validation;
- Your business object;
- Your DAO;
- Your persistence object;
- Your query; and
- Your table.
So you have to add it to eight different places even in the best case scenario where you only have three layers (presentation, business, persistence). Some people insist on having more. This quote:
One more layer of abstraction and this problem should go away...
cracked me up because it's so true: the trend has become to pile on extra layers as a knee-jerk response to any problem. Now this is all well-intentioned but, as they say, the road to hell is paved with good intentions. All these layers seem aimed at avoiding mistakes and increasing quality but more lines of code = more errors. Period. So at best you're robbing Peter to pay Paul.
Anyway, to bring this back to PHP: it doesn't suffer from this many-layered paranoia, which is a good thing... so long as you know what you're doing. So I'd never advocate that someone learn PHP as a first programming language--or even a first Web language--because you need a sound basis in the fundamentals or you're just inviting disaster.
Programmers that are methodical, defensive and organized can write excellent PHP code. Amateurs can write disastrously bad PHP code.
General
- PHP is a templating language, not a general purpose or even an object oriented language. Embrace it. Don't fight it;
- Don't introduce frameworks just for the sake of them. This goes back to the fundamental principle that you shouldn't solve a problem until you have a problem. Developers these days I find are somewhat "framework trigger happy" ("I'm sick of writing if (s == null || s.equals("")), let's add Apache Commons Validator to the project!");
- Always reuse code rather than cutting and pasting it, unless the code is trivial (say under 10 lines) or would require so much parametrization (say 5+ parameters) that reusing it isn't practical;
- Strive to minimize lines of code as best you can;
- Don't try and do things the Java, C# or Ruby way in PHP. Do things the PHP way. Those other languages each have fundamentally different models;
- Never, ever, ever use require_once or include_once. Some will no doubt disagree with this but I see those as being symptomatic of laziness and poor organization. They're also slower than require and include and arguably anathema to caching;
- Always use opcode caching. This is usually as simple as turning APC on in your php.ini;
- Use output buffering (ie ob_start(), etc);
- Set up common constants, database connections and so on in a file that you include at the top of every one of your scripts; and
- Be consistent. I can't stress how important that is.
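As a rough illustration of the output buffering point, the sketch below captures everything a piece of code echoes into a string before anything is sent. The capture() helper and the sample markup are hypothetical; ob_start() and ob_get_clean() are the built-in functions.

```php
<?php
// Hypothetical helper: run a callback that echoes markup and return what it printed.
function capture(callable $render): string
{
    ob_start();
    try {
        $render();
    } finally {
        // Always close the buffer, even if the callback throws.
        $output = (string) ob_get_clean();
    }
    return $output;
}

$html = capture(function () {
    echo "<h1>Orders</h1>\n";
    echo "<p>No orders yet.</p>\n";
});

// Nothing has been sent up to this point, so headers could still be changed here.
echo $html;
```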
Web Best Practices
These aren't necessarily PHP specific but should be done in any Web project.
- Treat all user input (including cookie data) as if it had the plague. Quarantine it, sanitize it and behave as if all of your users are out to get to you. PHP has an excellent and comprehensive array of filtering functions. Use them;
- Never rely on Javascript for input validation. It can be useful from a user experience point of view but every input should be tested on the server in addition to whatever is done on the client without exception;
- GZip everything. With Apache you can use mod_deflate or, as I prefer to do most of the time, do it from the PHP level, which can simply be done by putting
ob_start("ob_gzhandler")
at the top of your scripts;
- Mirror your directory structure with your menu structure. If you have an Account menu with View Orders in it, if I can't find a script at /account/vieworders.php (from your document root) you'd better have a pretty darn good reason why not;
- Always version static content. There are multiple ways of doing this but this one is pretty good;
- Always minify Javascript;
- Combine all your CSS and Javascript files into one of each and version/minify as above (more here). Note: this doesn't mean you can't develop them in multiple files. In fact I would encourage you to do this from an organisational point of view. Just combine them at runtime. I have a js.php script that combines, minifies, caches and serves my Javascript content, for example;
- For Javascript, CSS and image files use far-future Expires headers and ETags either directly with PHP or with a Web Server module;
- Never store sensitive information or information of any kind that you trust or rely upon in a cookie. A cookie is just a session identifier, that's it;
- Abstract your security so you can do things in your code like "if (is_admin()) { ... }" rather than putting direct session checks in your code; and
- In a shared hosting environment store nothing sensitive in the session. Other sites on the same box can read your session information if they're so inclined.
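For the filtering functions mentioned in the first item, a small sketch using filter_var(). The $input array is a made-up stand-in for request data; in a real script you would more likely read from $_GET, $_POST or $_COOKIE, or use filter_input().

```php
<?php
// Stand-ins for values that would normally come from the request.
$input = [
    'email' => 'someone@example.com',
    'age'   => '27',
    'page'  => '12; DROP TABLE users',
];

// filter_var() returns the filtered value, or false when validation fails.
$email = filter_var($input['email'], FILTER_VALIDATE_EMAIL);
$age   = filter_var($input['age'], FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 150],
]);
$page  = filter_var($input['page'], FILTER_VALIDATE_INT);

var_dump($email); // the address, unchanged
var_dump($age);   // an integer, not the original string
var_dump($page);  // false: the value is not an integer
```

Compare against false with === so that a legitimate 0 is not mistaken for a validation failure.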
Database
- All query inputs should be escaped or parametrized without exception;
- Separate your database logic into required files; and
- Refer to Database Development Mistakes Made by AppDevelopers for more general database advice.
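One way the first two items could fit together, as a sketch only: the queries sit in a handful of functions (which would live in their own required file) and every value is bound as a parameter. The function names, the orders table and the in-memory SQLite connection are all assumptions made up for this example.

```php
<?php
// Sketch of a db.php-style file; everything here is illustrative.
function db(): PDO
{
    static $pdo = null;
    if ($pdo === null) {
        $pdo = new PDO('sqlite::memory:');
        $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)');
    }
    return $pdo;
}

function add_order(string $customer, float $total): int
{
    $stmt = db()->prepare('INSERT INTO orders (customer, total) VALUES (?, ?)');
    $stmt->execute([$customer, $total]);
    return (int) db()->lastInsertId();
}

function find_orders_by_customer(string $customer): array
{
    $stmt = db()->prepare('SELECT id, total FROM orders WHERE customer = ? ORDER BY id');
    $stmt->execute([$customer]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Page code only calls functions; no SQL appears here.
add_order('alice', 19.99);
add_order('alice', 5.00);
echo count(find_orders_by_customer('alice')), " order(s)\n";
```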
That's about all I can think of off the top of my head.
Lastly it's worth reading 7 reasons I switched back to PHP after 2 years on Rails.
I'd recommend programming with error_reporting(E_ALL);
in each application, or effectively doing the same by modifying/inserting the following into the php.ini configuration file:
display_errors = On
error_reporting = E_ALL
This will display all applicable warnings and errors in your code, including ones that would normally be ignored. If you do not initialise a variable, say, PHP would not typically display a warning. With this configuration, however, you will be notified.
This is not purely for bug-finding, this also helps enforce good programming practice. If you're getting warnings for problems such as uninitialised variables, you're doing it wrong and need to amend your practices.
Of course, don't enable these warnings on a production machine, just your development one. :)
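If you would rather do this from code than from php.ini, something along these lines could work. The APP_ENV variable is an assumption (use whatever your setup provides to tell machines apart), and sending errors to the log on production is my own suggestion rather than a requirement.

```php
<?php
// Assumption: the environment name comes from an APP_ENV variable.
$env = getenv('APP_ENV') ?: 'development';

// Report everything in every environment...
error_reporting(E_ALL);

if ($env === 'production') {
    // ...but on production write errors to the log instead of the page.
    ini_set('display_errors', '0');
    ini_set('log_errors', '1');
} else {
    ini_set('display_errors', '1');
}
```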
I find the biggest problem that I run into when trying to debug, edit or even understand PHP code is a lack of comments and documentation. This is the same problem that I find with HTML. This may be because web developers never got into the habit of commenting HTML. When is the last time you saw a comment in HTML source code? I asked one of my faculty members who teaches web development how much time is spent discussing comments. He said that the only coverage was a mention in the book.
I rarely find comments about functions, variable declarations, or anything else. It takes more time to find out what is being done than to fix the problem.
I would also echo other comments about learning good programming techniques. Again, this may be because many PHP programmers are web developers without programming experience. This is not meant to disparage the web developers. It is a different skill set.
I see a lot of code that does this:
include "$page.php";
where $page is a GET parameter, sometimes not even validated or sanitized. I am not a fan of this technique in the first place, but if you must do it, sanitize your data.
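One hedged way to do that is to never build the file name from the input at all and instead look the requested page up in a fixed list. The page names, the file paths and the resolve_page() function below are hypothetical.

```php
<?php
// Hypothetical allowlist: the only page names the script accepts, mapped to files.
const PAGES = [
    'home'    => 'pages/home.php',
    'about'   => 'pages/about.php',
    'contact' => 'pages/contact.php',
];

function resolve_page(string $requested): string
{
    // Anything not in the list falls back to a known page.
    return PAGES[$requested] ?? PAGES['home'];
}

// Stand-ins for $_GET['page'].
echo resolve_page('about'), "\n";
echo resolve_page('../../etc/passwd'), "\n";

// In a real script: include __DIR__ . '/' . resolve_page($_GET['page'] ?? 'home');
```

The user-supplied string is only ever used as an array key, so a path such as ../../etc/passwd can never reach include.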
Proper code structuring and commenting will make even garbage code readable to others and will make your life easier in the long run. Even if you're writing poo for code, so long as others can follow the structure and comments, you'll be ahead of the game.