I'm trying to write one general rewrite rule, since the values passed differ from page to page. Right now I have:

RewriteRule ^forum/([^/]{1,255})/([\+]{1})/((([a-z]+)([_]{1})([a-zA-Z0-9]+)([/]?))+)$   forum.php?name=$1&$5=$7 [L]

For an address such as:

Nome+del+Forum/+/page_1/action_do

it should return:

forum.php?name=Nome+del+Forum&page=1&action=do

Instead, it takes only the last parameter (in this case action=do):

forum.php?name=Nome+del+Forum&action=do

How can I fix this? Thanks in advance!

A: 

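The easiest way is to put a RewriteMap prg:foo in the site configuration. Note that the RewriteMap directive itself can't be declared in a .htaccess file; it has to go in the server or virtual host configuration. Once declared there, the map can be used from rules in a .htaccess.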

If you have access to the server configuration file, add something like this, either globally or in your <VirtualHost>:

RewriteMap forum prg:/path/to/forum-rewriter.pl

Then in your chosen .htaccess try:

RewriteRule ^forum/.* ${forum:$0} [QSA]

(In your attempted RewriteRule it looked like you wanted to match on paths that start with forum/ so that's what I'm doing here.)
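As for why the attempted rule keeps only action=do: a capture group inside a repetition holds only the text from its last iteration, so $5 and $7 are overwritten each time the outer group repeats. Here is a minimal sketch of that behaviour using Python's re module, which behaves the same way on this point. The pattern is a simplified stand-in for the original rule, and the helper names last_pair and all_pairs are made up for illustration.

```python
import re

# Simplified stand-in for the attempted rule: one or more key_value segments.
ATTEMPT = re.compile(r"^forum/([^/]+)/\+/(?:([a-z]+)_([a-zA-Z0-9]+)/?)+$")

def last_pair(path):
    """Return the forum name plus the only key/value pair the groups retain."""
    m = ATTEMPT.match(path)
    return (m.group(1), m.group(2), m.group(3)) if m else None

def all_pairs(tail):
    """Collect every key_value pair by scanning the tail repeatedly."""
    return re.findall(r"([a-z]+)_([a-zA-Z0-9]+)", tail)

print(last_pair("forum/Nome+del+Forum/+/page_1/action_do"))
print(all_pairs("page_1/action_do"))
```

The first call reports only action and do, while the second finds both pairs. Getting all of them therefore needs something that iterates, which is what the approaches in this answer do.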

Next you need to make forum-rewriter.pl:

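#!/usr/bin/perl

use strict;
use warnings;

# Unbuffered output, so Apache sees each reply immediately.
$| = 1;

# Apache writes one lookup key (the matched path) per line on STDIN.
while (<STDIN>) {
    chomp;
    # Turn "forum/<name>[/+][/rest]" into "forum.php?name=<name>";
    # $3 holds the optional "/key_value/..." tail.
    if (s,^forum/([^/]+)(/\+)?(/.*)?,forum.php?name=$1, && $3) {
        my $query_string = $3;
        # Convert every "/key_value" segment into "&key=value".
        $query_string =~ s,/([^/_]+)_([^/_]+),&$1=$2,g;
        $_ .= $query_string;
    }
    # Reply with exactly one line; unmatched input is echoed unchanged.
    print "$_\n";
}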

Make sure to chmod 755 /path/to/forum-rewriter.pl.

The RewriteMap will cause Apache to start forum-rewriter.pl once, when the server starts, and keep it running. Every time the RewriteRule's pattern matches, Apache writes the path to the stdin of forum-rewriter.pl. forum-rewriter.pl does its work (if any) on the path and prints the result back to Apache on stdout, one line per request.

FYI it's very easy to test forum-rewriter.pl. Just run it and start sending it paths:

$ /tmp/forum-rewriter.pl
forum/Nome+del+Forum/+/page_1/action_do
forum.php?name=Nome+del+Forum&page=1&action=do
forum/Nome+del+Forum/page_1/action_do
forum.php?name=Nome+del+Forum&page=1&action=do
forum/Nome+del+Forum
forum.php?name=Nome+del+Forum
foo
foo
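If you would rather check the transformation with automated tests than by typing paths in, the same logic can be sketched as a pure function. This is a Python port written for illustration only, not something Apache runs; the function name rewrite is an assumption, and the expected values are the ones from the session above.

```python
import re

def rewrite(path: str) -> str:
    """Python port of the substitution done by forum-rewriter.pl (illustrative)."""
    m = re.match(r"^forum/([^/]+)(/\+)?(/.*)?", path)
    if not m:
        # Unmatched input is echoed unchanged, like the Perl script does.
        return path
    # Replace the matched prefix, keeping any unmatched remainder.
    result = "forum.php?name=" + m.group(1) + path[m.end():]
    tail = m.group(3)
    if tail:
        # Every "/key_value" segment becomes "&key=value".
        result += re.sub(r"/([^/_]+)_([^/_]+)", r"&\1=\2", tail)
    return result

print(rewrite("forum/Nome+del+Forum/+/page_1/action_do"))
```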

If you can't use RewriteMap, it's possible with mod_rewrite alone, but I think this is dangerous:

RewriteCond %{ENV:SetForumName} =""
RewriteRule ^forum/([^/]+)(/\+)?(/.*)? /forum.php$3?name=$1 [QSA,E=SetForumName:1]
RewriteCond %{ENV:SetForumName} !=""
RewriteRule ^forum.php/([^/_]+)_([^/_]+)(/.*)? /forum.php$3?$1=$2 [N,QSA]

It's dangerous because the [N] flag makes Apache restart processing from the first rewrite rule every time that second RewriteRule matches. If you're not careful you'll create an infinite loop, and Apache will either return a 500 for the request or crash. Also note the side effect: it sets the SetForumName environment variable. However, this is the only way I could find to tell mod_rewrite "replace all foo_bar in the path with foo=bar in the query string."
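To see what the loop is meant to do, here is a rough Python model of the two rules. It is a sketch under simplifying assumptions, not a faithful emulation of Apache: leading slashes are ignored, the environment variable is replaced by plain control flow, and the name simulate and the max_passes cap are invented for the example. Each pass peels one key_value segment off the path, and [QSA] is modelled by putting the new pair in front of the existing query string.

```python
import re

FIRST = re.compile(r"^forum/([^/]+)(/\+)?(/.*)?")
PEEL = re.compile(r"^forum\.php/([^/_]+)_([^/_]+)(/.*)?")

def simulate(path, max_passes=10):
    """Model the first rule once, then the [N] rule until it stops matching."""
    m = FIRST.match(path)
    if not m:
        return path
    path = "forum.php" + (m.group(3) or "")
    query = "name=" + m.group(1)
    for _ in range(max_passes):
        m = PEEL.match(path)
        if not m:
            return path + "?" + query
        # Move one key_value segment from the path into the query string.
        path = "forum.php" + (m.group(3) or "")
        query = f"{m.group(1)}={m.group(2)}&{query}"
    # A real server has no such guard unless you configure one.
    raise RuntimeError("rewrite did not converge")

print(simulate("forum/Nome+del+Forum/+/page_1/action_do"))
```

In this model the parameters come out in reverse order (action, then page, then name), which should not matter to forum.php. The explicit cap stands in for the safety net you lose when a rule with [N] never stops matching.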

draebek
A: 

Something like this?

And see also this for some additional info.

The basic idea: these rules

RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^(.*/)([^/]+)/([^/]+) $1?$2=$3&%1 [L]
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^([^/]+)/ $1?%1 [L]

will rewrite the following

/mypage/param1/val1/param2/val2/param3/val3/...     --->
/mypage?param1=val1&param2=val2&param3=val3&...

It takes the first parameter and calls that page. The rest of the parameters are turned into the query string.

EDIT with more about how the code works.

The first line (and the third) merely capture everything after the "?" so that it can be added back to the end of the rewritten URL. The same thing can be accomplished with the [QSA] flag, but I prefer to do it explicitly so it's visible.

The string is saved in the %1 variable.

The second line uses a regular expression to split the URL into three parts.

$2 and $3 are the last two parts of the path. $1 is everything before that.

So this:

/mypage/param1/val1/param2/val2/param3/val3/param4/val4/param5/val5/

Becomes this:

$1                                                        $2      $3
/mypage/param1/val1/param2/val2/param3/val3/param4/val4/  param5  val5

and is rewritten as this (spaces added for readability):

/mypage/param1/val1/param2/val2/param3/val3/param4/val4/ ? param5 = val5

If your URL uses separators other than slashes, you can modify the regular expression accordingly.

The [L] ("last") flag prevents the rest of the rules from running in the current pass. HOWEVER, and this is the key, in a per-directory (.htaccess) context mod_rewrite hands the new URL back to the web server as an internal redirect, so the new URL is sent through mod_rewrite again! This is sometimes called mod_rewrite recursion and is exactly what you want for this to work.

The next time through, the next two parts are stripped off the end and added to the query string.

This:

$1                                           $2      $3     %1
/mypage/param1/val1/param2/val2/param3/val3  param4  val4 ? param5=val5

Becomes this:

/mypage/param1/val1/param2/val2/param3/val3 ? param4 = val4 & param5=val5

This keeps on happening until all the parts of the URL are converted into parts of the query string. When that happens, the second line doesn't match but the fourth line does. That is what gives you your final URL to call the program you want. The links above show answers to slightly different variations that allow for different rules for that.
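The whole process can be modelled in a few lines of Python. This is only a sketch of the idea, not how Apache is implemented: the names rewrite_once and resolve are made up, the path is given without its leading slash (as a .htaccess rule would see it), and the limit parameter is a stand-in for the server's recursion limit.

```python
import re

PAIR = re.compile(r"^(.*/)([^/]+)/([^/]+)")   # second line of the rules
PAGE = re.compile(r"^([^/]+)/")               # fourth line of the rules

def rewrite_once(path, query):
    """Apply the first matching rule, or return None if neither matches."""
    m = PAIR.match(path)
    if m:
        # Last two path parts become "key=value&" in front of the old query.
        return m.group(1), f"{m.group(2)}={m.group(3)}&{query}"
    m = PAGE.match(path)
    if m:
        # Only the page name is left; drop the trailing slash.
        return m.group(1), query
    return None

def resolve(path, query="", limit=10):
    """Feed the result back through the rules until nothing matches."""
    for _ in range(limit):
        step = rewrite_once(path, query)
        if step is None:
            return path, query
        path, query = step
    raise RuntimeError("recursion limit exceeded")

print(resolve("mypage/param1/val1/param2/val2"))
```

Read literally, the rules leave a trailing "&" on the query string when the request had no query string of its own. That is visible in this model's output and should be harmless to most applications.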

Also note that there is a limit on the number of times mod_rewrite can recurse. See the MaxRedirects setting of RewriteOptions (older Apache versions) and the LimitInternalRecursion directive in your Apache conf file.

bmb
FedericoBiccheddu
FedericoBiccheddu, I added more of an explanation. I hope this helps. I know your URL uses `_` and not `/` but the concept should work just as well if you use a slightly different regex.
bmb