I'd like to use mod_rewrite to convert web page addresses like /directory to /directory/index.html, in a standard LAMP hosting situation. What I have works for addresses that end in a slash. I can't find a way to handle addresses that don't end in a slash.

What seems like it should work is:

rewriterule ^(.*)/$ $1/index.html [L] /* addresses ending in / */
rewriterule ^(.*(?!html))$ $1/index.html [L] /* where the problem is */

But the second line causes a 500 server error. If I add a single letter x to the second line:

rewriterule ^(.*)/$ $1/index.html [L]
rewriterule ^(.*x(?!html))$ $1/index.html [L]

It starts to work, but only for directory names that end in an x. I have tried replacing the x with many different things. Anything more complicated than literal characters (like [^x] or .+) gives a 500 server error.

And, to satisfy my own curiosity, does anyone know why the addition of a single real letter makes the difference between a server error and a perfectly functioning rule?

[Accepted Answer] Thanks to Gumbo I was able to approximate a solution using rewritecond:

rewritecond %{REQUEST_URI} !\.[^/]+$
rewriterule (.+) $1/index.html [L]

This works, but filters more than just .html -- it could block other pages. Unfortunately,

rewritecond %{REQUEST_URI} !\.html$

results in a server error:

Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary.
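
To make the "filters more than just .html" concern concrete, here is a small sketch that tests the first condition's pattern (\.[^/]+$) with Python's re module rather than inside Apache. The request paths are hypothetical, chosen only for illustration, and the helper name would_rewrite is made up for this sketch.

```python
import re

# Pattern from the first RewriteCond; the "!" in the directive negates it.
has_extension = re.compile(r"\.[^/]+$")

def would_rewrite(path):
    """Return True when the negated condition passes, so the rule would fire."""
    return has_extension.search(path) is None

# Hypothetical request paths, for illustration only.
for path in ["/directory", "/directory/page.html",
             "/directory/style.css", "/release-1.2"]:
    print(path, would_rewrite(path))
```

Only /directory passes. The .css file is skipped as intended, but so is /release-1.2, a directory-style address that merely contains a dot in its last segment.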

I'd still like to know why:

rewriterule ^(.*(?!html))$ $1/index.html [L]

results in a loop. The first half is supposed to check if it doesn't end in .html. Since the second half adds .html, it seems like the functional equivalent of:

while (substr($address, -4) != 'html') $address .= 'html';

Obviously I'm missing something.

A: 

Well, for actually making it work, you could just use a negative lookbehind instead of a lookahead:

RewriteRule ^(.*)(?<!html)$ $1/index.html [L]

I'm not sure offhand why adding the 'x' makes it work. I'll edit if I figure it out.

Chad Birch
This suggestion created the same 500 server error as my original proposal -- and again it only worked if I added a random letter before the (?.
Andrew Swift
A: 

For why adding the x makes it work: If the replacement will match the regex, the RewriteRule will be applied again. As an example, this causes an error:

RewriteRule ^(.*)$ $1.rb

because it would replace script with script.rb. That matches the regex, so it replaces script.rb with script.rb.rb, again and again...

This is hinted at in the error log:

Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary.

In your example, you add index.html to the end. When there is an x at the end of the regex, then it won't match your replacement, which ends in an l.
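
The re-application described above can be sketched outside Apache. This is a rough simulation in Python, assuming the rule is simply run again on its own output after each rewrite. The function name simulate and the sample paths are made up for illustration, and Python's re stands in for Apache's regex engine.

```python
import re

LIMIT = 10  # mirrors the "10 internal redirects" limit quoted in the error message

def simulate(pattern, template, path):
    """Re-apply one rule until it stops matching or the limit is reached.

    Returns (final_path, rounds); rounds == LIMIT means it never settled.
    """
    regex = re.compile(pattern)
    for rounds in range(LIMIT):
        match = regex.search(path)
        if match is None:
            return path, rounds
        # \1 in the template plays the role of $1 in the RewriteRule.
        path = match.expand(template)
    return path, LIMIT

print(simulate(r"^(.*)$", r"\1.rb", "script")[1])                     # 10
print(simulate(r"^(.*(?!html))$", r"\1/index.html", "directory")[1])  # 10
print(simulate(r"^(.*x(?!html))$", r"\1/index.html", "box")[1])       # 1
```

In this model the first two rules keep matching their own output until the limit is hit, while the variant with the x stops after one round because box/index.html no longer ends in x.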

Jordan Miner
It shouldn't do this in the sample provided by Andrew, as the [L] at the end of the line should prevent it.
VirtualBlackFox
Sorry, somehow I didn't see the [L] there. I've edited my answer so it doesn't state anything wrong. I'll look at it more to see if I can figure out the actual reason. Thanks.
Jordan Miner
I see how this example creates a loop, but I don't see how it applies to my question -- what I'm trying to do is "if the address doesn't end in .html, add .html to the address". How does the [L] affect that?
Andrew Swift
@VirtualBlackFox: I've been testing, and it seems [L] doesn't stop the loop. [L] must just prevent *other* RewriteRules from being applied. The docs state "Stop the rewriting process here and don't apply any more rewrite rules."
Jordan Miner
@Andrew: You were wondering why adding x to this line -- rewriterule ^(.*(?!html))$ $1/index.html [L] -- made it work. Adding the x prevents it from looping. The negative lookahead doesn't do anything because .* already matches the whole thing, so the lookahead is tested against emptiness.
Jordan Miner
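
The point about the lookahead being tested against nothing can be checked directly. The sketch below compares the lookahead pattern from the question with the lookbehind pattern suggested in an earlier answer, using Python's re module and two hypothetical paths. It only shows how the patterns themselves behave; it does not explain why the lookbehind rule was reported to still fail on the asker's server.

```python
import re

lookahead = re.compile(r"^(.*(?!html))$")    # pattern from the question
lookbehind = re.compile(r"^(.*)(?<!html)$")  # pattern from the lookbehind suggestion

def matches(regex, path):
    """Return True when the pattern matches the path."""
    return regex.search(path) is not None

# Hypothetical paths: one before rewriting, one after.
for path in ["directory", "directory/index.html"]:
    print(path, matches(lookahead, path), matches(lookbehind, path))
```

The lookahead pattern matches both paths, because after .* has consumed everything there is no text left for (?!html) to reject. The lookbehind pattern matches directory but not directory/index.html.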
+1  A: 

Use a RewriteCond directive to check that the URL path does not end with .html:

RewriteCond %{REQUEST_URI} !\.html$
RewriteRule ^(.*[^/])?/?$ $1/index.html [L]


Edit   You’re using a look-ahead assertion ((?!…)). But there isn’t anything after .* (only a $). So try a look-behind assertion instead:

RewriteRule ^.*$(?<!html) $0/index.html [L]

But note that you probably need Apache 2.2 to use these assertions.

Gumbo
This suggestion created a 500 server error, and I'm not expert enough to know why. I'm going to look into it in more detail and edit this comment later.
Andrew Swift
And what does the server error log say?
Gumbo