I see many, many sites that have URLs for individual pages such as

http://www.mysite.com/articles/this-is-article-1
http://www.mysite.com/galleries/575

And they don't redirect, they don't run slowly...

I know how to parse URLs, that's easy enough. But in my mind, that seems slow and cumbersome on a dynamic site. As well, if the pages are all statically built (hence the custom URL), then that means all components of the page are static as well... (which would be bad)

I'd love to hear some ideas about how this is typically accomplished.

+1  A: 

It's usually done via a rewrite engine, either in the server (via something like mod_rewrite in Apache) or in the web application (all requests are routed to the web application, which looks for a route for the path specified).

ceejayoz
+3  A: 

There are many ways you can handle the above. Generally speaking, there is always at least some form of redirection involved, although that could be an internal rewrite at the .htaccess level rather than anything done in PHP. Here's a scenario:

  1. Use .htaccess to redirect to your PHP processing script.

  2. Parse the URI ($_SERVER['REQUEST_URI']) and ascertain the type of content (for instance, articles or galleries as per your examples).

  3. Use the provided id (generally appended to the end of the URI, again as in your examples) to obtain the correct data, be that by serving a static file or querying a database for the requested content.
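Steps 2 and 3 can be sketched in a few lines. The example below is written in Python purely for illustration (the same logic carries over to PHP), and the set of content types and the function name are assumptions, not part of any particular framework:

```python
from urllib.parse import urlsplit

# Hypothetical content types; the names mirror the URLs in the question.
KNOWN_TYPES = {"articles", "galleries"}

def parse_request_uri(request_uri):
    """Split a request URI such as '/galleries/575?page=2' into (type, id)."""
    path = urlsplit(request_uri).path            # drop any query string
    segments = [s for s in path.split("/") if s]
    if len(segments) != 2 or segments[0] not in KNOWN_TYPES:
        return None                              # caller would answer with a 404
    content_type, identifier = segments
    return content_type, identifier

print(parse_request_uri("/articles/this-is-article-1"))
print(parse_request_uri("/galleries/575?page=2"))
```

The returned pair is what you would then use to pick a template and fetch the record, whether from a static file or a database.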

This method is a very popular way of improving SEO, but as you rightly highlight there can be difficulties in taking this approach. These are not typically performance problems, but the approach can make development or administration more troublesome (the latter if your implementation is not well thought out and scalable).

BrynJ
+1 for the explanation, but what do you see as the admin problems? I went this route for my personal website as a way to reduce admin: all pages are run through a single template that loads static content
kdgregory
I guess the administration difficulties I have faced with this method are more to do with the implementation I inherited on a specific project - I have clarified my answer on this point.
BrynJ
+1  A: 

A rewrite engine is the best approach, as they are fast and optimised, and it allows your server-side script to work with plain local variables.

Apache's mod_rewrite is the most common.

stuartloxton
That's interesting. I'll need to read up on that one
jerebear
A: 

In my case, I stick to a web framework that has this feature already built in (CodeIgniter).

... As well, if the pages are all statically built (hence the custom URL), then that means all components of the page are static as well... (which would be bad)

... yes, this is very bad indeed. :o

andyk
A: 

It is possible to rewrite at

  1. The server level, in either the .htaccess file or the httpd.conf or vhosts.conf file. This is typically faster than the next level of rewriting, which is done at the application level.

  2. The application level (in this instance with PHP). You can write custom redirects that analyse the URL and redirect in some way based on that. Modern web frameworks such as the Zend Framework (ZF) use routes to control URL rewriting. The following is an example of a static route with ZF:

$route = new Zend_Controller_Router_Route_Static('latest/news/this/week', array('controller' => 'news'));

Which would route any request for http://somedomain.com/latest/news/this/week to the news controller.

An example of a dynamic route would be

$route = new Zend_Controller_Router_Route('galleries/:id', array('controller' => 'gallery'));

Where the id parameter would be available to that controller as a request parameter (and using our example above would be 575).

These are very useful tools that allow you to develop an application and retrospectively change the URL to anything you want.
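To show roughly what a router does with a pattern like galleries/:id, here is a minimal sketch in Python. It is not how Zend Framework implements its routes; compile_route, match_route and the route table are hypothetical names used only for this illustration:

```python
import re

def compile_route(pattern):
    """Turn 'galleries/:id' into a regex with a named group for each ':name'."""
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment.startswith(":"):
            parts.append("(?P<%s>[^/]+)" % segment[1:])   # dynamic segment
        else:
            parts.append(re.escape(segment))              # literal segment
    return re.compile("^/" + "/".join(parts) + "/?$")

# Hypothetical route table: (compiled pattern, controller name).
ROUTES = [
    (compile_route("latest/news/this/week"), "news"),
    (compile_route("galleries/:id"), "gallery"),
]

def match_route(path):
    """Return (controller, params) for the first matching route, else None."""
    for regex, controller in ROUTES:
        m = regex.match(path)
        if m:
            return controller, m.groupdict()
    return None

print(match_route("/galleries/575"))   # ('gallery', {'id': '575'})
```

Because the URL patterns live in one table, changing a public URL later means editing a route, not the controller behind it.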

m3clov3n
+2  A: 

Firstly, when comparing plain URL rewriting at the application level, or plain CGI (CGI here can be PHP, ISAPI, ASP.NET, etc.), with serving static pages, serving static files will always, always win. There is simply less work. For example, in Windows and Linux (that I know of) there are even enhancements in the kernel for serving static files on a local drive via HTTP. To further make the point, I even found a benchmark using several servers and OSs: http://www.litespeedtech.com/web-server-performance-comparison-litespeed-2.0-vs.html#RESULT Note that serving static files is dramatically faster than using any type of CGI.

However, there can potentially be performance and scalability gains from using rewritten URLs effectively, and that is done with caching. If you return proper cache headers (see the Cache-Control directive in the HTTP documentation), downstream servers can cache the data, so you won't even get hits on your site. However, I guess you could get the same benefit with static pages :) I just happened to read an article on this very topic a day or two ago at the High Scalability blog: http://highscalability.com/strategy-understanding-your-data-leads-best-scalability-solutions
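As a rough illustration of "proper cache headers", the Python sketch below builds the two headers a dynamically generated page might send so that browsers and intermediate caches can reuse it. The function name and the five-minute lifetime are assumptions chosen for the example:

```python
from email.utils import formatdate

def cache_headers(max_age_seconds, last_modified_epoch):
    """Build response headers that let browsers and proxies reuse the page."""
    return {
        # "public" allows shared caches to store the response as well.
        "Cache-Control": "public, max-age=%d" % max_age_seconds,
        # HTTP dates are always expressed in GMT.
        "Last-Modified": formatdate(last_modified_epoch, usegmt=True),
    }

for name, value in cache_headers(300, 0).items():
    print("%s: %s" % (name, value))
```

How long a page may be cached depends entirely on how often its content changes, so the lifetime is something to tune per content type.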

scott
Great articles! Thank you.
jerebear
A: 

A very simple way is to have a CGI parse the PATH_INFO portion of the URL. In your example:

http://www.example.com/articles/12345 (where "articles" is a CGI script)
                       ^CGI^   ^^^^^^PATH_INFO

Everything after the script name is passed to the script in the PATH_INFO CGI environment variable.

Then you can do a database lookup or whatever you wish to generate the page.

Use caution when accessing this value as the IIS server and Apache server put different portions of the URL in PATH_INFO. (IIRC: IIS incorrectly uses the entire URL and Apache prunes it as stated above.)
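A minimal sketch of such a script, in Python and assuming the Apache-style behaviour where PATH_INFO holds only the part after the script name. The function names and the response text are made up for the example:

```python
import os

def path_info_segments(environ=os.environ):
    """Return the non-empty segments of PATH_INFO, e.g. '/12345' -> ['12345']."""
    path_info = environ.get("PATH_INFO", "")
    return [segment for segment in path_info.split("/") if segment]

def handle_request(environ=os.environ):
    """Produce a minimal CGI response for a hypothetical 'articles' script."""
    segments = path_info_segments(environ)
    if len(segments) != 1:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\nNo such article\n")
    article_id = segments[0]
    # A real script would look the article up in a database here.
    return "Content-Type: text/plain\r\n\r\nArticle %s\n" % article_id

# Simulate the request from the example URL without a web server.
print(handle_request({"PATH_INFO": "/12345"}), end="")
```

Passing the environment in as a dictionary makes the handler easy to exercise without a server; under a real CGI setup it would simply read os.environ.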

Chris Nava
A: 

On Apache servers, mod_rewrite is the most common tool for this. It is an Apache module that lets you rewrite request URLs to other URLs using regular expressions, so for your example something like this would be used:

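RewriteEngine On
# Anything after articles/ becomes the "article" parameter
RewriteRule ^articles/(.+)$ articles.php?article=$1 [L]
# Only numeric gallery ids are rewritten
RewriteRule ^galleries/(\d+)$ galleries.php?gallerie=$1 [L]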

This costs hardly any time, and in practice it is just as fast as using the URL
www.mysite.com/galleries.php?gallerie=575 directly, but it looks far better.
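To see what the rules do to a request, here is a small Python stand-in for the matching step. It only imitates the pattern-and-substitution behaviour and is not how mod_rewrite is implemented; the patterns are anchored variants of the rules above and the function name is hypothetical:

```python
import re

# Python stand-ins for the two RewriteRule lines (same target scripts).
RULES = [
    (re.compile(r"^articles/(.+)$"), r"articles.php?article=\1"),
    (re.compile(r"^galleries/(\d+)$"), r"galleries.php?gallerie=\1"),
]

def rewrite(path):
    """Return the internal target for a request path, or the path unchanged."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            # Stop at the first matching rule, roughly like the [L] flag.
            return match.expand(target)
    return path

print(rewrite("articles/this-is-article-1"))  # articles.php?article=this-is-article-1
print(rewrite("galleries/575"))               # galleries.php?gallerie=575
```

The visitor only ever sees the clean URL; the script receives the same query parameter it would have received from the longer form.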

Pim Jager
A: 

I have used this method previously. You just need to add the file extensions that should not be redirected to the regex, and then everything else is handled by PHP, so you don't need to keep going into your .htaccess file.

suceed with urls

A: 

I love this community!

These are all really good options and avenues I need to explore further. Thanks for all of your input.

jerebear