views:

468

answers:

7

We have a site that has always been deployed on a Windows server, where case sensitivity is not an issue. We now need to deploy it to Linux, and we know the site has lots of incorrectly cased URLs and references.

Are there any applications that could scan the site and fix casing issues? It would need to fix HTML files, CSS files and, if possible, JavaScript files.

I was thinking about writing an application that, for each file in the site, searches all the other files for references to it and corrects any casing errors. On the off chance this has already been done, though, I would rather just download a ready-made solution.
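For reference, the approach described above could be sketched roughly like this in Python. This is only a sketch under some assumptions: the names `build_index`, `fix_references` and `fix_site` are made up for illustration, the regular expression only catches `href`/`src` attributes and CSS `url(...)` references (paths assembled in JavaScript will be missed), and the site is assumed to have no two files whose paths differ only by case.

```python
import os
import posixpath
import re

# Matches href="...", src="..." and CSS url(...) references. Deliberately
# narrow: URLs built by string concatenation in JavaScript are not caught.
REF_RE = re.compile(
    r"""(?P<pre>(?:href|src)\s*=\s*["']|url\(\s*["']?)(?P<ref>[^"')\s?#]+)""",
    re.IGNORECASE,
)

def build_index(root):
    """Map lowercased site-relative paths to their real on-disk casing."""
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")
            index[rel.lower()] = rel  # assumption: no case-only duplicates
    return index

def fix_references(text, file_rel, index):
    """Return text with local references rewritten to match on-disk casing."""
    base_dir = posixpath.dirname(file_rel)

    def repl(match):
        ref = match.group("ref")
        if "://" in ref or ref.startswith(("//", "mailto:", "data:", "javascript:")):
            return match.group(0)  # external reference: leave alone
        if ref.startswith("/"):
            target = posixpath.normpath(ref.lstrip("/"))
        else:
            target = posixpath.normpath(posixpath.join(base_dir, ref))
        real = index.get(target.lower())
        if real is None or real == target:
            return match.group(0)  # unknown file, or casing already correct
        # Keep the original style: root-relative stays root-relative
        if ref.startswith("/"):
            fixed = "/" + real
        else:
            fixed = posixpath.relpath(real, base_dir or ".")
        return match.group("pre") + fixed

    return REF_RE.sub(repl, text)

def fix_site(root, extensions=(".html", ".htm", ".css", ".js")):
    """Rewrite references in every text file under root, in place."""
    index = build_index(root)
    for rel in index.values():
        if not rel.lower().endswith(extensions):
            continue
        path = os.path.join(root, *rel.split("/"))
        with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
            original = f.read()
        updated = fix_references(original, rel, index)
        if updated != original:
            with open(path, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
                f.write(updated)
```

Running `fix_site` on a copy of the site and diffing the result before deploying would be the cautious way to use something like this.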

Thanks, Gavin

+1  A: 

Depending on your leeway for error and your timelines, you could solve this problem by monitoring the webserver logs obsessively for 404 errors as you visit the site. That would involve the fewest changes to the codebase.
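If reading the logs by eye gets tedious, a small script can tally the 404s for you. The sketch below assumes the access log is in the Apache Common or Combined Log Format; `count_404s` is a hypothetical helper name, and the log path in the usage note is a placeholder.

```python
import re
from collections import Counter

# Request and status fields as they appear in Common/Combined Log Format,
# e.g.  "GET /Images/Logo.png HTTP/1.1" 404 209
LOG_RE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def count_404s(lines):
    """Count how often each requested path produced a 404."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m and m.group("status") == "404":
            counts[m.group("path")] += 1
    return counts
```

You would call it with an open file, for example `count_404s(open("access.log"))`, and work through the most frequent paths first.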

Alternatively, you could require all files to be all lower-case, and then run a checker over the codebase looking for upper-case characters in URLs.
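A checker along those lines could be as simple as the following sketch. It assumes references live in `href`/`src` attributes or CSS `url(...)`, skips absolute URLs and `mailto:` links, and ignores query strings and fragments; `uppercase_refs` is a name invented for this example.

```python
import re

# href="...", src="..." (group 1) or CSS url(...) (group 2)
REF_RE = re.compile(
    r"""(?:href|src)\s*=\s*["']([^"']+)["']|url\(\s*["']?([^"')]+)""",
    re.IGNORECASE,
)

def uppercase_refs(text):
    """Return local references whose path contains an upper-case letter."""
    found = []
    for m in REF_RE.finditer(text):
        ref = m.group(1) or m.group(2)
        if "://" in ref or ref.startswith(("//", "mailto:")):
            continue  # external link: not our filesystem's problem
        path = re.split(r"[?#]", ref, maxsplit=1)[0]  # drop query/fragment
        if any(c.isupper() for c in path):
            found.append(ref)
    return found
```

Run it over every HTML and CSS file and print the file name next to each hit; anything it reports is a candidate for manual review.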

Either way, you're going to have to do some manual work to get all the kinks worked out.

DDaviesBrackett
+4  A: 

Is this Apache? You can use mod_speling to have your server ignore case.

http://httpd.apache.org/docs/1.3/mod/mod_speling.html

Rich
I wish we could but the server is out of our control.
Gavin Draper
A: 

The best way is to fix your URLs. Alternatively, you can add the following rewrite rules to your .htaccess:

RewriteEngine on
# Send any .html request whose path contains an uppercase letter to lowercase.php
RewriteRule ^.*[A-Z].*\.html$ lowercase.php [L]

Then, in lowercase.php (adapt this to your own server-side technology):

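<?php
// lowercase only the path; leave any query string untouched
$parts = explode('?', $_SERVER['REQUEST_URI'], 2);
$uri_lc = strtolower($parts[0]) . (isset($parts[1]) ? '?' . $parts[1] : '');
// redirect (permanent)
header("Location: http://" . $_SERVER['HTTP_HOST'] . $uri_lc, true, 301);
exit;
?>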

Ensure that all your filenames on disk are lowercase; otherwise the redirect target will itself return a 404.
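Lowercasing the files by hand is error-prone on a large site, so here is a rough Python sketch of how it could be scripted. `lowercase_tree` is a hypothetical helper, not part of any library. It walks the tree bottom-up, skips any rename that would clash with an existing sibling, and only touches the disk when `dry_run` is false.

```python
import os

def lowercase_tree(root, dry_run=True):
    """Rename every file and directory under root to lowercase.

    Returns (renames, collisions), each a list of (old_path, new_path).
    """
    renames, collisions = [], []
    # Bottom-up, so a directory's contents are handled before the directory
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        taken = set(dirnames) | set(filenames)  # names already in use here
        for name in dirnames + filenames:
            lower = name.lower()
            if lower == name:
                continue
            old = os.path.join(dirpath, name)
            new = os.path.join(dirpath, lower)
            if lower in taken:
                collisions.append((old, new))  # would overwrite a sibling
                continue
            taken.add(lower)
            renames.append((old, new))
            if not dry_run:
                os.rename(old, new)
    return renames, collisions
```

Call it once with the default `dry_run=True` to review the plan, resolve anything listed under collisions by hand, then run it again with `dry_run=False`.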

+1  A: 

What development environment are you using?

For example, in Dreamweaver you can check and correct links site-wide.

Edit: To answer your question: you can download a trial version of Dreamweaver, set up your website as a project and use the link checker to check and correct the links.

As said in the comments, I would definitely correct the problem rather than work around it with an "ignore-case" solution. That way your website is portable and you will avoid problems in the future. A consistent file-naming convention is always a good idea (no upper case, no spaces, no exotic characters, etc.).

jeroen
We're using Web Developer Express.
Gavin Draper
Don't know that one, so I'm afraid I can't tell you if it has that option...
jeroen
Either way, I'd recommend fixing the site instead of looking for an ignore option.
jeroen
I also recommend choosing a convention and sticking to it. One option: use only lowercase in filenames and URLs, and separate words with '_' or '-'.
Mercer Traieste
+1  A: 

Here’s a super fast, simple way of doing it: load the site onto the target environment, then point Xenu's Link Sleuth (free download) at the root and let it run wild. It will report all the 404s that are generated, and you can then run through and resolve each of them. Easy.

Troy Hunt
A: 

I have used ActiveState Komodo Edit's find and replace function in regular expression mode to do something similar.

A: 

One way is to place all the files on the Linux server, perhaps under a test config/URL, then run LinkChecker against the root URL (or any other appropriate URLs):

http://linkchecker.sourceforge.net/

and see if it reports any broken links.

ars