There are several links pointing to a site I manage in which the webmaster mistakenly included a space between the domain name and the page name:

    www.domain.com/ page.html

When the user clicks, this gives

    www.domain.com/%20page.html

I'd like to use mod_rewrite to redirect hits to the incorrect address to the correct address, but my rewrite rule is not working. I have tried the following without success:

    rewriterule ^\%20page.html$ /page.html [R=301,L]
    rewriterule ^.20page.html$ /page.html [R=301,L]

How can I write a rule to catch this address? I'd like to keep the PageRank and not be penalized for a broken link, and I can't get the webmaster to fix his links.

+2  A: 

Put a literal space in your RewriteRule pattern. By the time mod_rewrite sees the URL, the %20 has probably already been decoded.

chaos
rewriterule ^.page.html$ /page.html [R=301,L] worked, thanks!
Andrew Swift
I hope you don't have apage.html otherwise it'll end up in the wrong place...
Greg
Yeah, you really ought to embed the space. RoBorg specified how you actually do that, which I should've.
chaos
+7  A: 

Use a literal space, escaped with a backslash so it doesn't end the regular expression:

    RewriteRule ^\ page\.html$ /page.html [R=301,L]
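
An equivalent form, sketched here on the assumption that the rule lives in a .htaccess file at the document root and that mod_rewrite is enabled, is to quote the pattern instead of escaping the space:

    RewriteEngine On
    # mod_rewrite matches against the %-decoded path, so %20 arrives as a literal space.
    # Quoting the pattern keeps the space from being read as an argument separator.
    RewriteRule "^ page\.html$" /page.html [R=301,L]

In server or virtual-host context the matched path starts with a slash, so the pattern there would begin with `^/ page` instead.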
Greg
A: 

You could use something like this to remove all control characters:

    RewriteRule ^([^\x00-\x1F\x7F]*)[\x00-\x1F\x7F]+(.*) /$1$2 [L,R=301]

To strip the space character (\x20) as well:

    RewriteRule ^([^\x00-\x20\x7F]*)[\x00-\x20\x7F]+(.*) /$1$2 [L,R=301]
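
As a rough walk-through of what that second rule does with the URL from the question, assuming a .htaccess context where the matched path is ` page.html` (leading space, no leading slash):

    # Input path:      " page.html"
    # $1             = ""           (nothing precedes the stray space)
    # [\x00-\x20\x7F]+ matches " "  (the stray space itself)
    # $2             = "page.html"
    # Redirect target: /page.html
    RewriteRule ^([^\x00-\x20\x7F]*)[\x00-\x20\x7F]+(.*) /$1$2 [L,R=301]

Each redirect removes only the first run of matching characters, so a path containing several separate runs would go through one redirect per run.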
Gumbo