views:

431

answers:

2

[SOLVED: See solution below.]

I'm having a problem writing a RewriteMap program (using Python). I have a RewriteMap directive pointing to a Python script which determines if the requested URL needs to be redirected elsewhere.

When the script outputs a string terminated by a linebreak, Apache redirects accordingly. However, when the script outputs NULL (with no linebreak), Apache hangs and subsequent HTTP requests are effectively ignored.

The error log shows no errors. The rewrite log only shows a pass through followed by a redirect when successful, then only pass through when NULL is returned by the script. Subsequent requests also only show pass through.

Additionally, replacing stdout with os.fdopen(sys.stdout.fileno(), 'w', 0) to set buffer length to zero did not help.

Any help would be greatly appreciated. Thank you in advance.

/etc/apache2/httpd.conf

[...]
RewriteLock /tmp/apache_rewrite.lock

/etc/apache2/sites-available/default

<VirtualHost *:80>
  [...]
  RewriteEngine on
  RewriteLogLevel 1
  RewriteLog /var/www/logs/rewrite.log
  RewriteMap remap prg:/var/www/remap.py
  [...]
</VirtualHost>

/var/www/webroot/.htaccess

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*_.*) /${remap:$1} [R=301]

/var/www/remap.py

#!/usr/bin/python

import sys

def getRedirect(str):
  new_url = None
  # if url needs to be redirected, put this value in new_url
  # otherwise new_url remains None
  return new_url

while True:
  request = sys.stdin.readline().strip()
  response = getRedirect(request)
  if response:
    sys.stdout.write(response + '\n')
  else:
    sys.stdout.write('NULL')
  sys.stdout.flush()
+2  A: 

You have to return a single newline, not 'NULL'.

Apache waits for a newline to know when the URL to be rewrite to ends. If your script sends no newline, Apache waits forever.

So just change return ('NULL') to return ('NULL\n'), this will then redirect to /. If you don't want this to happen, have the program to return the URL you want when there's no match in the map.

If you want not to redirect when there's no match I would:

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond (${remap:$1}) !NULL
RewriteRule (.*_.*) /%1 [R=301]

Use a match in the RewriteCond (this would work with NULL as well, of course). But given your problem, this looks like the proper solution.

Vinko Vrsalovic
Thanks for the speedy response. I tried this and it redirects to /. In the Apache documentation, it states: '[The rewrite map program] then has to give back the looked-up value as a newline-terminated string on stdout or the four-character string ``NULL'' if it fails (i.e., there is no corresponding value for the given key).'Source: [http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewritemap]
Andrew Ashbacher
Bah, I don't remember reading that ever. I've always returned an URL.Have you tested newline terminating the string NULL?
Vinko Vrsalovic
Yes. I've tried that and it redirects to /. I've set it up now to return '_NULL_\n' and have RewriteCond ${remap:$1} !^_NULL_ before it. When _NULL_ is returned, the program runs once. However, when the program returns a value, it runs twice. At least, this works. I need to find a better solution though.
Andrew Ashbacher
"... the map MapName is consulted and the key LookupKey is looked-up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified."From there it looks like that NULL\n is doing the expected, this is getting replaced by the empty string.
Vinko Vrsalovic
http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#InternalBackRefsThis section specifies that the pattern (!NULL) not the test string (${remap:$1}) are backreferenced with %N. Therefore, this solution will redirect to NULL or /. I'm trying to think of combinations of RewriteCond and RewriteRules to give me the proper backreference so I can query to script at most once per request. Again, thanks for your help.
Andrew Ashbacher
+1  A: 

The best solution I've come up with thus far is to have the RewriteMap script return the new url or '__NULL__\n' if no redirect is desired and store this value in a ENV variable. Then, check the ENV variable for !__NULL__ and redirect. See .htaccess file below.

Also, if anyone is planning on doing something similar to this, inside of the Python script I wrapped a fair amount of it in try/except blocks to prevent the script from dying (in my case, due to failed file/database reads) and subsequent queries being ignored.

/var/www/webroot/.htaccess

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+)$ - [E=REMAP_RESULT:${remap:$1},NS]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{ENV:REMAP_RESULT} !^__NULL__$
RewriteRule ^(.+)$ /%{ENV:REMAP_RESULT} [R=301,L]

/var/www/remap.py

#!/usr/bin/python

import sys

def getRedirect(str):
  try:  # to prevent the script from dying on any errors
    new_url = str
    # if url needs to be redirected, put this value in new_url
    # otherwise new_url remains None
    if new_url == str: new_url = '__NULL__'
    return new_url
  except:
    return '__NULL__'

while True:
  request = sys.stdin.readline().strip()
  response = getRedirect(request)
  sys.stdout.write(response + '\n')
  sys.stdout.flush()

Vinko, you definitely helped me figure this one out. If I had more experience with stackoverflow, you would have received ^ from me. Thank you.

I hope this post helps someone dealing with a similar problem in the future.

Cheers, Andrew

Andrew Ashbacher