I want to have "pretty" and SEO-oriented URLs on my site.

I've built my own tiny framework for this site, and almost everything is complete now.

One thing I'm still puzzled about is the pretty/SEO URL system that I will use. I know there are many ways to achieve this, and I'm looking to balance best practices and ease of implementation on this one.

So far I'm thinking of having all URLs of the site point to a specific PHP file (let's say index.php) that will contain a file/URL dictionary directing traffic to the correct file.

I'm really not sure if it's a good approach... Does anyone have a better way to do this? The only thing I really want to avoid is doing this only in an .htaccess...

A: 

You are probably looking for something like mod_rewrite. That way you would have a link that looks like http://my.server.com/my_file_of_this_name
and mod_rewrite would turn it into http://my.server.com/index.php?path=my_file_of_this_name
where index.php knows that, if path was passed, it should look in the DB for the content or page matching 'my_file_of_this_name' and either display it or redirect to it with header("Location: ???").

FatherStorm
OP wants to avoid too many specific rules in the .htaccess.
webbiedave
+3  A: 

You'll need an .htaccess file (but you won't need to change it each time you add a page):

RewriteEngine On 
#RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /index.php?url=$1 [QSA,L]

Now in your index.php you can access the requested URL in $_GET['url'], map it to the correct file, and then include it.

Note: The commented-out RewriteBase line is there in case you need to uncomment it, as some configurations require it.
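As a rough illustration of the "map it to the correct file" step, here is a minimal sketch. The route table, the file names under pages/ and the resolveRoute() helper are all hypothetical, not part of the answer above. Because only paths listed in the table can be resolved, the value from the URL never reaches include() directly.

```php
<?php
// Hypothetical route table: pretty URL path => file to include.
$routes = [
    ''         => 'pages/home.php',
    'about-us' => 'pages/about-us.php',
    'contact'  => 'pages/contact.php',
];

/**
 * Resolve a requested path to a file from the route table.
 * Returns null when the path is unknown (the caller should send a 404).
 */
function resolveRoute(array $routes, $url)
{
    // Normalise: treat "about-us/" and "/about-us" like "about-us".
    $key = trim((string) $url, '/');

    return isset($routes[$key]) ? $routes[$key] : null;
}

// In index.php; $_GET['url'] is filled in by the RewriteRule.
$file = resolveRoute($routes, isset($_GET['url']) ? $_GET['url'] : '');

if ($file === null) {
    http_response_code(404);
    // include 'pages/404.php';   // hypothetical error page
} else {
    // include $file;
}
```

With 2000 pages the table could just as well live in a database or a cached array; the lookup step stays the same.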

webbiedave
@webbiedave Thanks for the tips! My question is more about whether this is a good solution than how to do it. So is this a proper way to build this kind of system? Will it impact caching in any "bad" ways? Also, can you elaborate on the RewriteBase comment? Thanks!
AlexV
WordPress does it exactly the same way, so there's no problem with that.
SorinV
+1 for remembering the `RewriteCond` lines. You might want to explain why they are there, though: only if the requested file or directory doesn't exist is the request silently rewritten to `index.php`.
Kristoffer S Hansen
Be very careful when include'ing using a variable from the URL. You can't be too careful with user input, ESPECIALLY when it's a matter of including a file. (Think of what would happen if you include($url) and $url was something like "../../../etc/passwd".)
Geoffrey Bachelet
@Geoffrey Bachelet Yeah I'm aware of this really dangerous security hole :)
AlexV
You don't need to add that `url` parameter. The requested URL is already available through `$_SERVER['REQUEST_URI']`. And to add to Sorin's example, Zend Framework does this too.
mercator
@mercator: `$_SERVER['REQUEST_URI']` will contain the entire request, including query string. `url` will contain only the requested path, which in most cases is more desirable.
webbiedave
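To illustrate the difference discussed in the two comments above: if you skip the `url` parameter and read `$_SERVER['REQUEST_URI']` instead, the query string has to be stripped by hand. A minimal sketch, where requestPath() is a hypothetical helper name:

```php
<?php
/**
 * Return only the path part of a request URI, without the query string
 * and without the leading or trailing slash.
 */
function requestPath($requestUri)
{
    // Everything before the first "?" is the path.
    $parts = explode('?', $requestUri, 2);

    // Decode %xx escapes and drop the surrounding slashes.
    return trim(rawurldecode($parts[0]), '/');
}

// Usage in index.php:
// $path = requestPath($_SERVER['REQUEST_URI']);
```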
@AlexV: Since you are mapping to static pages, I believe your general approach is very appropriate. I've implemented something very similar before (`.htaccess` to map in `index.php`) and it works fine. Cache handling is an issue for all PHP scripts regardless of any `.htaccess` redirects. You'll want to ensure that your scripts handle `Last-Modified` and `If-Modified-Since` headers (plenty of tutorials out there already on this). The `RewriteBase` directive sets the base URL-path; only uncomment it if you are receiving 500 internal errors, to see if it helps.
webbiedave
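A minimal sketch of the `Last-Modified` / `If-Modified-Since` handling mentioned above, assuming each page's content comes from a file whose modification time can be read. The isNotModified() helper and the $file variable are illustrative names only.

```php
<?php
/**
 * Decide whether a 304 Not Modified response can be sent.
 * $lastModified is a Unix timestamp; $ifModifiedSince is the raw
 * If-Modified-Since header value, or null when the client sent none.
 */
function isNotModified($lastModified, $ifModifiedSince)
{
    if ($ifModifiedSince === null || $ifModifiedSince === '') {
        return false;
    }

    $since = strtotime($ifModifiedSince);

    // An unparsable date is ignored, as if the header were absent.
    return $since !== false && $lastModified <= $since;
}

// Usage in a page script (assumes the content comes from $file):
// $mtime = filemtime($file);
// header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
// $since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])
//     ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : null;
// if (isNotModified($mtime, $since)) { http_response_code(304); exit; }
```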
+2  A: 

Well, there are some ways to do this.

The widely adopted way is using .htaccess, which you said you don't want to use. OK.

You still have some options:

  • Using a database table to map everything is one.
  • Using a routine to check existing files.
  • Using an extended XML sitemap.
  • Using a caching system.

Well, everything above can be mixed in your implementation if you want.

You can find a good article on this at A List Apart.

http://www.alistapart.com/articles/succeed/

In this article the solution is a mix:

First, check if a file exists (for example "about-us.php"). If it does, include the file contents.

If not, check the DB for this request (as a tip, you can have a field named "friendlyURL" on your main content tables). If a match exists, extract and display it.

Otherwise, show a 404 page. As a tip for this one, with SEO in mind, I would recommend having an XML sitemap. If a page is not found, you can check the sitemap to see whether the request is just a broken URL, like:

http://yourdomain.net/shoes/male/bro

you can check whether a similar URL exists, such as http://yourdomain.net/shoes/male/brown

and suggest it to your customers/visitors, along with:

http://yourdomain.net/shoes/male/

http://yourdomain.net/shoes/

Also include a link to your HTML sitemap and, if you have a search feature on your site, use it too: display a link that takes the user to the search page with that query.

http://yourdomain.net/search?q=shoes+male+bro

OR

<input type="text" name="q" value="shoes male bro">
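The 404 suggestions described above could be sketched roughly like this. It assumes the known paths have already been loaded from the sitemap into an array, and it uses edit distance as one possible way to find a "similar" URL; the function names and the distance limit are hypothetical.

```php
<?php
/**
 * List the parent paths of a missing URL, nearest first.
 */
function parentPaths($path)
{
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    $parents = [];
    while (count($segments) > 1) {
        array_pop($segments);
        $parents[] = implode('/', $segments);
    }
    return $parents;
}

/**
 * Pick the known path closest to the requested one by edit distance,
 * or null when nothing is within $maxDistance edits.
 */
function closestPath($path, array $knownPaths, $maxDistance = 3)
{
    $best = null;
    $bestDistance = $maxDistance + 1;
    foreach ($knownPaths as $candidate) {
        $distance = levenshtein($path, $candidate);
        if ($distance < $bestDistance) {
            $best = $candidate;
            $bestDistance = $distance;
        }
    }
    return $best;
}

/**
 * Build a search link from the words in the missing path.
 */
function searchUrl($path)
{
    return '/search?q=' . urlencode(str_replace('/', ' ', trim($path, '/')));
}
```

For the request shoes/male/bro this would offer shoes/male/brown as the closest match, shoes/male and shoes as parents, and a search link for the same words.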

And another extra tech tip: make use of the full-text search feature of your DB if available.

An interesting read comes from Rasmus Lerdorf, PHP's creator: http://lerdorf.com/lca04.pdf (check page 34, about 404 redirects).

Dave
@Dave: I can use .htaccess, I just don't want 2000 entries in it...
AlexV
@AlexV Got your point. Well, a few lines in .htaccess can make your life much easier then.
Dave
Also note that you can define rewrite rules in your web server's configuration (if you have access to it). Still, using only those 4 lines is not that bloated, and that's enough for a rewrite / speaking-URL system.
Kissaki
+2  A: 

Directing everything to index.php and then routing from there is a good, clean way to do it. I have something very similar, I:

  1. Route everything to index.php in .htaccess.
  2. In index.php I split the URL by '/' to get an array.
  3. The first element of the array is the name of the class to call.
  4. The second element is the function of the class to call.
  5. If needed, remaining elements are parameters.

For example, browsing to:

www.blah.com/shop/browse/cakes

That would call index.php, which would include shop.php and instantiate a class called Shop. It would then try to call a function on Shop called browse, passing it a parameter of "cakes".

This is a simplified example but you get the idea. Convention over configuration makes the URLs and the code clean.
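A minimal sketch of that convention, under a few assumptions that are not in the answer: the Shop class is defined inline here instead of being included from shop.php, the dispatch() helper and its whitelist of allowed classes are hypothetical, and method names starting with an underscore are rejected so magic methods cannot be reached from the URL.

```php
<?php
// Hypothetical controller used only for this illustration.
class Shop
{
    public function browse($category)
    {
        return "Browsing $category";
    }
}

/**
 * Split a path such as "shop/browse/cakes" into class, method and
 * parameters, then call it. Returns null when nothing matches.
 */
function dispatch($path, array $allowedClasses)
{
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    if (count($segments) < 2) {
        return null;
    }

    $class  = ucfirst(strtolower(array_shift($segments)));
    $method = array_shift($segments);

    // Only instantiate classes that were explicitly allowed.
    if (!in_array($class, $allowedClasses, true)) {
        return null;
    }

    $controller = new $class();

    // Only call public, non-magic methods that really exist.
    if (strpos($method, '_') === 0 || !is_callable([$controller, $method])) {
        return null;
    }

    // Remaining segments become the method's parameters.
    return call_user_func_array([$controller, $method], $segments);
}
```

Calling dispatch('shop/browse/cakes', ['Shop']) runs Shop::browse('cakes'). A real version would also need to handle a URL that supplies too few parameters for the method.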

Steve Claridge
+1  A: 

Just to add another alternative, you might want to check out the RewriteMap directive.

bazmegakapa
+2  A: 

Hey,

I think you said that the pages could be static, right? Well, a solution that I use sometimes is:

Options +FollowSymLinks
RewriteEngine On

RewriteRule ^([a-zA-Z0-9_-]+)/?$ files/$1.php [NC,L]
RewriteRule ^$ files/index.php [NC,L]

This way, you can't run into issues like people accessing files such as .htpasswd or other sensitive data. The page /about would lead to /files/about.php

Create your files directory and add new PHP files (or HTML, XML, whatever; just change or add the extension in the .htaccess as well). Nothing more than that.

Let me know if this is good enough for you :)

P.S: If the pages have content from the database (like articles on a blog) it could be even easier to use an extra parameter and stick with .htaccess:

Options +FollowSymLinks
RewriteEngine On

RewriteRule ^([0-9_-]+)/([a-zA-Z0-9_-]+)/?$ files/article.php?id=$1 [NC,L]
RewriteRule ^([a-zA-Z0-9_-]+)/?$ files/$1.php [NC,L]
RewriteRule ^$ files/index.php [NC,L]

This way, the rules from the first example still apply, but a URL like /3/my_new_article would point to /files/article.php?id=3

You would also have to process the article title with PHP so that it only contains the characters accepted by the regex, looks nice, and still contains only valid URL characters if you plan to change the regex.
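One way to process the title, as a minimal sketch: slugify() is a hypothetical helper that keeps only a-z, 0-9 and hyphens, all of which the character class in the rules above accepts. Non-ASCII letters are simply dropped here; transliterating them is left out.

```php
<?php
/**
 * Turn an article title into a URL segment made only of a-z, 0-9 and "-".
 */
function slugify($title)
{
    $slug = strtolower($title);

    // Replace every run of other characters with a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);

    // Drop hyphens left over at either end.
    return trim($slug, '-');
}

// Usage when building a link to an article ($id and $title are assumed
// to come from the database):
// $href = '/' . (int) $id . '/' . slugify($title);
```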

Cheers!

Claudiu
+1  A: 

An alternative to routing through your index.php page with the rewrite rules people have been posting is MultiViews. It is by far the easiest way to do what you want: it lets you convert your site to pretty/SEO URLs easily and intuitively, without complex .htaccess rules and without having to maintain routings (in .htaccess or a DB) or a complex site structure.

To enable MultiViews, edit your httpd.conf or vhost file for the directory you want to enable it for. (If you're on shared hosting, it should still be possible, provided your host allows a local override in your .htaccess file: Options +Multiviews)

<Directory /var/www/>
    Options +Multiviews
</Directory>

... And then you're done! Now take a look at an example of it in action:

http://{yoursite}/page.php can now be accessed as http://{yoursite}/page or even http://{yoursite}/page/variable or even http://{yoursite}/page/variable/variable/variable/variable !

What Apache is doing is looking for the closest file match for the path sent to it and calling that file; the rest of the path is passed to the script in $_SERVER['PATH_INFO']. All that is required to make these URLs usable is a function within your framework to extract data from the URL. Here's a basic function I have used before which could be adapted to your needs. It extracts the variable in the URL at $position:

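function getWebParam($position) {
    static $params = null;

    if ($params === null) {
        // PATH_INFO is unset when nothing follows the script name.
        $pathInfo = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
        // PATH_INFO starts with '/', so index 0 is an empty string and
        // the first real segment is at position 1.
        $params = explode('/', $pathInfo);
    }

    // Return null instead of raising a notice for a missing segment.
    return isset($params[$position]) ? $params[$position] : null;
}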

You could adapt it to parse URLs however you like (you don't have to separate variables based on a /, page-variable.html can also route to page.php), as long as the target file is the closest matching file for the URL entered.

Of course, there are downsides to using MultiViews, and it goes against the philosophy of directing all requests through your index.php file. But it does answer your request for simplicity without having to maintain an .htaccess file, and it is good to know about and consider.

JoeR
A: 

Nette Framework has very pretty routing; see nette.org. Like your idea, every URL goes to index.php, and there are routers that decide what happens next: http://api.nette.org/2.0/Nette.Application.SimpleRouter.html

Jaroslav Moravec