Our company is changing web platforms, and we would like to preserve our Google results, so we are planning on putting 301 redirects into our .htaccess file.

My concern is that if I put in all these redirects (probably 3,000-5,000 total) it will slow the server down as it makes all those checks.

Does anyone know if having a .htaccess file this large will cause any problems? I have a pretty fast server for the site (8 cores), so I have a fair amount of horsepower available.
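For context, each rule would be a simple one-to-one mapping, something along these lines (the paths here are just placeholders, not our real URLs):

Redirect 301 /cms/products/old-widget.htm http://www.domain.com/products/widget
Redirect 301 /cms/about/history.htm http://www.domain.com/company/history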

A: 

Hmm. I don't have any hard numbers on whether Apache has performance problems with that many redirects, but I would feel uneasy having such a huge .htaccess file that gets parsed on every request, regardless of whether it's a new or old URL.

If at all possible, I would tend to handle the matching of "old" URLs to new ones using a server-side language and a database table for lookup, if only for easier maintenance.

Whether and how that is possible depends on your old and new URL structure. If, for example, all old URLs had a common structure like

www.domain.com/cms/folder/pagename.htm

that can be separated from the new structure, I would redirect all "old" traffic into a central script file (whatever your server platform is: ASP, PHP, ...) and do a simple lookup and header redirect there.
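As a rough sketch only (the handler path, table name, and column names below are made up), the Apache side could funnel everything under the old prefix to one handler:

# in the server config: send everything under the old /cms/ prefix to one handler
RewriteEngine On
RewriteRule ^/cms/ /redirect-handler [PT]

and the handler itself only needs a lookup and a 301. In Python/WSGI it could look roughly like this; ASP or PHP would be equally simple:

import sqlite3

def application(environ, start_response):
    old_path = environ.get("PATH_INFO", "")
    # hypothetical table mapping old CMS paths to new URLs
    conn = sqlite3.connect("/var/www/redirects.db")
    row = conn.execute(
        "SELECT new_url FROM redirects WHERE old_path = ?", (old_path,)
    ).fetchone()
    conn.close()
    if row is not None:
        start_response("301 Moved Permanently", [("Location", row[0])])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]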

Pekka
+2  A: 

I doubt it would noticeably slow down the server. But check it out first.

Create a 5,000-line .htaccess file in a www/temp folder with rewrite rules like the ones you will be using. Then see how long it takes to access a page with and without the .htaccess file in place.
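If it helps, a throwaway snippet like this (the path and URL patterns are arbitrary) will generate a test file of that size:

# generate a 5,000-line .htaccess full of dummy redirects for benchmarking
with open("/var/www/temp/.htaccess", "w") as f:
    for i in range(5000):
        f.write("Redirect 301 /old-page-%d.htm http://www.domain.com/new-page-%d\n" % (i, i))

Then hit a page in that folder a few hundred times with a benchmarking tool such as ab, with and without the file present, and compare the timings.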

Byron Whitlock
I am going to give this a try. I'll post what I experience when I have some real data.
Josh Pennington
A: 

According to the reference docs (http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html), I suppose it would be a drag to process all 3k x 2 of them on every request.

Moreover, I suppose it would be a nightmare to manage those rules when they change - manually, that is...

f13o
A: 

The other answers have some good suggestions, but if you don't wind up using some alternative to rewrite rules, I would strongly suggest putting these lines in the main server configuration file instead of a .htaccess file. That way Apache will only parse them once, when it starts up, and it can simply reference an internal data structure rather than having to check for the .htaccess file on every request. In fact, the Apache developers recommend against using .htaccess files at all, unless you don't have access to the main server configuration. If you don't use .htaccess files, you can set

AllowOverride None

in the main configuration, and then Apache doesn't even have to spend time looking for the files at all. On a busy server this could be a useful optimization.
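For example (server name, paths, and URLs here are only illustrative), the redirects can live inside the relevant virtual host in the main configuration, where they are read once at startup:

<VirtualHost *:80>
    ServerName www.domain.com
    DocumentRoot /var/www/html

    <Directory /var/www/html>
        AllowOverride None
    </Directory>

    # old -> new mappings, parsed once when Apache starts
    Redirect 301 /cms/products/old-widget.htm http://www.domain.com/products/widget
    Redirect 301 /cms/about/history.htm http://www.domain.com/company/history
</VirtualHost>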

Another thing you could consider doing (in conjunction with the above) is using the RewriteMap directive to "outsource" the URL rewriting to an external program. You could write this external program to, say, store the old URLs in a hashtable, or whatever sort of optimization is appropriate.
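A minimal sketch of that approach (the map name, script path, and URL pattern are all placeholders), in the main configuration, since RewriteMap is not allowed in .htaccess files:

RewriteEngine On
RewriteMap oldurls prg:/usr/local/bin/lookup-redirects.py
RewriteRule ^/cms/(.*)$ ${oldurls:$1|/} [R=301,L]

The external program just reads lookup keys on stdin and answers on stdout, keeping the whole mapping in a hashtable:

#!/usr/bin/env python
import sys

# load the old-path -> new-URL mapping once at startup
redirects = {}
with open("/etc/apache2/old-to-new.map") as f:
    for line in f:
        parts = line.split()
        if len(parts) == 2:
            redirects[parts[0]] = parts[1]

# mod_rewrite sends one key per line and expects one line back;
# the string "NULL" tells it the key was not found
for line in sys.stdin:
    key = line.strip()
    sys.stdout.write(redirects.get(key, "NULL") + "\n")
    sys.stdout.flush()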

David Zaslavsky