I'm trying to write some .htaccess rules that replace certain characters in the REQUEST_URI parameter. Specifically, I want to replace the following:

  • "<" = &lt;
  • ">" = &gt;
  • "'" = &apos;
  • '"' = &#x22;
  • ")" = &#x29;
  • "(" = &#x28;

An example URL could be http://www.example.com/?<script>alert(1)</script>&q=")("<script')

I've tried a whole bunch of methods with no success. Can someone point me in the right direction? Thanks.

+1  A: 

You can do this replacement with mod_rewrite. Here is an example for <:

RewriteCond %{QUERY_STRING} ^([^<]*)<([^<]*)<(.*)
RewriteRule ^ %{REQUEST_URI}?%1&lt;%2&lt;%3 [N]
RewriteCond %{QUERY_STRING} ^([^<]*)<([^<]*)$
RewriteRule ^ %{REQUEST_URI}?%1&lt;%2 [L]

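The first rule replaces two < characters per pass, and its N flag restarts rule processing so that it runs again for as long as at least two < remain. The second rule replaces the last remaining < and ends the loop. The other characters can be replaced in the same way (just swap < and &lt; for the other pairs).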
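
To make the looping easier to follow, here is a small Python sketch that imitates the intended behavior of the two rules on a query string. It is only a simulation, not something Apache runs: the names TWO_OR_MORE, EXACTLY_ONE, rewrite_query and the max_rounds safety cap are made up for illustration, and the sketch assumes the remainder after the second < is carried forward into the next pass.

```python
import re

# Python stand-ins for the two RewriteCond patterns (hypothetical names).
TWO_OR_MORE = re.compile(r'^([^<]*)<([^<]*)<(.*)')
EXACTLY_ONE = re.compile(r'^([^<]*)<([^<]*)$')


def rewrite_query(query, max_rounds=100):
    """Return (rewritten query, number of passes) for the '<' rule pair.

    max_rounds is an illustrative safety cap, not an Apache setting.
    """
    rounds = 0
    while rounds < max_rounds:
        m = TWO_OR_MORE.match(query)
        if m:
            # First rule: replace two '<' and start over, like the N flag.
            query = f"{m.group(1)}&lt;{m.group(2)}&lt;{m.group(3)}"
            rounds += 1
            continue
        m = EXACTLY_ONE.match(query)
        if m:
            # Second rule: replace the last '<' and stop, like the L flag.
            query = f"{m.group(1)}&lt;{m.group(2)}"
            rounds += 1
        break
    return query, rounds


if __name__ == "__main__":
    print(rewrite_query("a<b<c<d"))
```

With three < characters the sketch needs two passes: one for the first pair and one for the leftover character. A query string without any < is returned unchanged after zero passes.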

But mod_rewrite is not well suited to this kind of work because

  1. mod_rewrite can replace only a fixed number of occurrences per pass, and
  2. the number of replacements is limited by the internal redirection counter that guards against infinite recursion.

Although the second limitation does not apply here because the N flag is used, I would still not recommend mod_rewrite for this kind of work.

I would rather recommend doing this in the web application, ideally just before the data is written into an HTML document, and not prophylactically for every input regardless of how that data is processed.
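
As a rough illustration of escaping at output time, here is a minimal Python sketch. It assumes a Python web application, which the question does not state; render_search_heading and PAREN_ENTITIES are hypothetical names. Note that the standard library's html.escape emits &quot; and &#x27; for the two quote characters rather than the &#x22; and &apos; forms listed in the question, and it leaves parentheses alone, so the sketch maps those separately.

```python
from html import escape

# Hypothetical extra mapping for the parentheses listed in the question;
# html.escape does not touch them.
PAREN_ENTITIES = str.maketrans({"(": "&#x28;", ")": "&#x29;"})


def render_search_heading(query):
    """Escape untrusted text at the point where it is written into HTML."""
    # escape() handles &, <, > and both quote characters; the parentheses
    # are translated afterwards so the added '&' is not escaped again.
    safe = escape(query, quote=True).translate(PAREN_ENTITIES)
    return f"<h1>Results for {safe}</h1>"


if __name__ == "__main__":
    print(render_search_heading("<script>alert(1)</script>"))
```

The raw value stays untouched everywhere else in the application, so the same input can still be used safely in other contexts (a database query, a log line) with the escaping appropriate to each.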

Gumbo