Apache mod_rewrite
What you're looking for is mod_rewrite.
Description: Provides a rule-based rewriting engine to rewrite requested URLs on the fly.
Generally speaking, mod_rewrite works by matching the requested document against specified regular expressions and then performing URL rewrites either internally (within the Apache process) or externally (by redirecting the client's browser). These rewrites can be as simple as internally translating site.com/foo into a request for site.com/foo/bar.
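To make the internal/external distinction concrete, here is a minimal sketch. The paths /old-page and /new-page are hypothetical names chosen for illustration, and the leading `/?` is only there so the patterns match in both server and .htaccess context.

```apache
RewriteEngine On

# Internal rewrite: the browser still shows /foo,
# but Apache serves the content of /foo/bar
RewriteRule ^/?foo$ /foo/bar [L]

# External redirect: Apache answers with a 302 and the browser
# makes a second request for /new-page (hypothetical path)
RewriteRule ^/?old-page$ /new-page [R=302,L]
```

The only difference between the two rules is the R flag: without it the substitution is handled inside the server, with it the client is told to go somewhere else.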
The Apache docs include a mod_rewrite guide, and I think some of the things you want to do are covered in it. There is also a more detailed mod_rewrite guide.
Force the www subdomain
I would like it to force "www" before every url, so its not domain.com but www.domain.com/page
The rewrite guide includes instructions for this under the Canonical Hostname example.
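As a rough sketch of that idea, the variant below does not hard-code the domain. It assumes plain HTTP and that every hostname served by this configuration should get the www prefix; adjust it if that is not true for your setup.

```apache
RewriteEngine On

# If the Host header does not already start with "www." ...
RewriteCond %{HTTP_HOST} !^www\. [NC]
# ... and the Host header is not empty (HTTP/1.0 clients may omit it)
RewriteCond %{HTTP_HOST} !^$
# ... redirect to the same path on the www host
RewriteRule ^/?(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L,NE]
```

With this in place a request for domain.com/page should be answered with a redirect to www.domain.com/page.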
Remove trailing slashes (Part 1)
I would like to remove all trailing slashes from pages
I'm not sure why you would want to do this as the rewrite guide includes an example for the exact opposite, i.e., always including a trailing slash. The docs suggest that removing the trailing slash has great potential for causing issues:
Trailing Slash Problem
Description:
Every webmaster can sing a song about the problem of the trailing slash on URLs referencing
directories. If they are missing, the server dumps an error, because if you say /~quux/foo
instead of /~quux/foo/ then the server searches for a file named foo. And because this file is
a directory it complains. Actually it tries to fix it itself in most of the cases, but sometimes
this mechanism need to be emulated by you. For instance after you have done a lot of complicated
URL rewritings to CGI scripts etc.
Perhaps you could expand on why you want to remove the trailing slash all the time?
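For reference, the opposite behaviour (adding the missing slash to directory requests) could be sketched as follows. This assumes the rules live in an .htaccess file, where %{REQUEST_FILENAME} has already been mapped to a filesystem path.

```apache
RewriteEngine On

# Only act when the request maps to an existing directory
RewriteCond %{REQUEST_FILENAME} -d
# If the URL does not end in a slash, redirect to the same URL with one
RewriteRule [^/]$ %{REQUEST_URI}/ [R=301,L]
```

In most installations you do not need this at all, because mod_dir's DirectorySlash directive is on by default and issues the same kind of redirect.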
Remove .php extension
I need it to remove the .php
The closest thing I can think of is to internally rewrite every requested document with a .php extension, i.e., mysite.com/somepage is instead processed as a request for mysite.com/somepage.php. Note that proceeding in this manner would require that each somepage actually exists as somepage.php on the filesystem.
With the right combination of regular expressions this should be possible to some extent. However, I can foresee some possible issues with index pages not being requested correctly and not matching directories correctly.
For example, this will correctly rewrite mysite.com/test as a request for mysite.com/test.php:
RewriteEngine on
RewriteRule ^(.*)$ $1.php
But it will make mysite.com fail to load, because there is no mysite.com/.php.
I'm going to guess that if you're removing all trailing slashes, then distinguishing a request for a directory index from a request for a filename in the parent directory will become almost impossible. How do you tell a request for the directory 'foobar':
mysite.com/foobar
from a request for a file called foobar (which is actually foobar.php):
mysite.com/foobar
It might be possible if you used the RewriteBase directive, but then the problem gets considerably more complicated: you would need RewriteCond directives that check at the filesystem level whether the request maps to a directory or a file.
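Such filesystem checks could look roughly like the sketch below. It assumes the rules are placed in an .htaccess file (in server context %{REQUEST_FILENAME} has not yet been mapped to a filesystem path), and it is an illustration rather than a tested drop-in.

```apache
RewriteEngine On

# Do nothing if the request maps to an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# Only continue if "<request>.php" exists as a regular file
RewriteCond %{REQUEST_FILENAME}.php -f
# Internally serve foobar.php for a request for foobar
RewriteRule ^(.+[^/])$ $1.php [L]
```

When both a directory foobar and a file foobar.php exist, the directory wins here, because the first condition stops the rewrite before the file check is reached.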
That said, if you drop the requirement to remove all trailing slashes and instead force-add them, the "no .php extension" problem becomes a bit more reasonable:
# Turn on the rewrite engine
RewriteEngine on
# If the request doesn't end in .php (Case insensitive) continue processing rules
RewriteCond %{REQUEST_URI} !\.php$ [NC]
# If the request doesn't end in a slash continue processing the rules
RewriteCond %{REQUEST_URI} [^/]$
# Rewrite the request with a .php extension. L means this is the 'Last' rule
RewriteRule ^(.*)$ $1.php [L]
This still isn't perfect -- every request for a file still has .php appended to the request internally. A request for 'hi.txt' will put this in your error logs:
[Tue Oct 26 18:12:52 2010] [error] [client 71.61.190.56] script '/var/www/test.peopleareducks.com/rewrite/hi.txt.php' not found or unable to stat
But there is another option: set the DefaultType and DirectoryIndex directives like this:
DefaultType application/x-httpd-php
DirectoryIndex index index.html
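Now requests for hi.txt (and anything else) are successful, requests to mysite.com/test will return the processed version of test.php, and index.php files will work again. Note that in Apache 2.3 and later DefaultType is disabled for any value other than none, so this approach only applies to Apache 2.2 and earlier.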
I must give credit where credit is due for this solution: I found it on Michael J. Radwin's blog by searching Google for "php no extension apache".
Remove trailing slashes (Part 2)
Some searching for "apache remove trailing slashes" brought me to some Search Engine Optimization pages. Apparently some Content Management Systems (Drupal in this case) make content available both with and without a trailing slash in URLs, which in the SEO world will cause your site to incur a duplicate content penalty. Source
The solution seems fairly trivial: using mod_rewrite, we match requests that end in a "/" and redirect the client to the URL without it by sending back a 301 Permanent Redirect HTTP header.
Here's his example, which assumes your domain is blamcast.net and allows the request to optionally be prefixed with 'www.':
#get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www.)?blamcast\.net$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]
Now we're getting somewhere. Let's put it all together and see what it looks like.
Mandatory www., no .php, and no trailing slashes
This assumes the domain is foobar.com and it is running on the standard port 80.
# Turn on the rewrite engine
RewriteEngine on
# Process all files as PHP by default
DefaultType application/x-httpd-php
# Fix sub-directory requests by allowing 'index' as a DirectoryIndex value
DirectoryIndex index index.html
# Force the domain to load with the www subdomain prefix
# If the request doesn't start with www...
RewriteCond %{HTTP_HOST} !^www\.foobar\.com [NC]
# And the site name isn't empty
RewriteCond %{HTTP_HOST} !^$
# Finally rewrite the request: end of rules, don't escape the output, and force a 301 redirect
RewriteRule ^/?(.*) http://www.foobar.com/$1 [L,R=301,NE]
# Get rid of trailing slashes (the dot after www is escaped so it only matches a literal dot)
RewriteCond %{HTTP_HOST} ^(www\.)?foobar\.com$ [NC]
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]
The 'R' flag is described in the RewriteRule directive section. Snippet:
'redirect|R [=code]' (force redirect) Prefix Substitution with http://thishost[:thisport]/
(which makes the new URL a URI) to force a external redirection. If no code is given, a
HTTP response of 302 (MOVED TEMPORARILY) will be returned.
Final Note
I wasn't able to get the slash removal to work successfully. The redirect ended up giving me infinite redirect loops. After reading the original solution more closely, I get the impression that the example above works for its author because of how his Drupal installation is configured. He mentions specifically:
On a normal Drupal site, with clean URLs enabled, these two addresses are basically interchangeable
This is in reference to URLs ending with and without a slash. Furthermore:
Drupal uses a file called .htaccess to tell your web server how to handle URLs. This is the same file that enables Drupal's clean URL magic. By adding a simple redirect command to the beginning of your .htaccess file, you can force the server to automatically remove any trailing slashes.
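Returning to the redirect loop mentioned in this final note: one common cause of a loop like that, though not confirmed for the setup described here, is mod_dir adding the slash back to directory URLs right after the rewrite rule has removed it. A hedged sketch that leaves real directories alone, assuming the rules sit in an .htaccess file in the document root:

```apache
RewriteEngine On

# Skip real directories, since mod_dir redirects /dir to /dir/
RewriteCond %{REQUEST_FILENAME} !-d
# Redirect /page/ to /page with a permanent redirect
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

Because browsers cache 301 responses, it may be easier to experiment with R=302 first and switch to 301 once the behaviour is what you expect.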