I'm working on a Google App Engine project and I've run into a quandary. The following should (if the regexes behave normally) route every path that does not contain the word "test" to the MainPage class, and the rest to the TestPage class.

application = webapp.WSGIApplication(
                                     [
                                        ('[^(test)]*', MainPage),
                                        ('.+', TestPage)
                                     ],
                                     debug=True)

Instead, I find that the regular expression is being interpreted as:

('[^tes]*', MainPage)

This means that any path containing a t, e, or s will NOT be routed to MainPage (in this case, it falls through to TestPage). The obvious workaround is to rewrite the TestPage regex, but I would rather not need a workaround. This should work without being rewritten.

Am I missing some library somewhere? Is this a configuration issue? I would not mind calling a function or setting a property before running run_wsgi_app, but as it stands this looks inconsistent.


UPDATE

It turns out that there were two culprits. First, I made a mistake in the syntax (mea culpa). Second, the tool I had used to confirm the regular expression said that the expression would not match "test " but would match "t est ".

+3  A: 

Why not just invert it? Checking for 'test' is simpler than checking for [not 'test']. Route that match to TestPage and the rest to MainPage.

The difference is between (not working):

'[^(test)]*'

and

'test'

Unless I'm completely mistaken.

Erik Noren
Oh, I knew how to reverse it, but I didn't like that I had to.
Christopher W. Allen-Poole
+7  A: 

Actually, it's being interpreted as identical to any rearrangement of the characters aside from the leading caret within the square brackets, such as [^est()]. Standard regular expression syntax includes no straightforward way to specify the complement of the language matched by a particular regex.

In this case, you don't need to worry about that. Follow Erik Noren's advice and change the order of the matching expressions like so:

application = webapp.WSGIApplication(
                                     [
                                        ('test', TestPage),
                                        ('.+', MainPage)
                                     ],
                                     debug=True)

This straightforwardly accomplishes the same result.
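To see why ordering the routes does the job, here is a minimal sketch of first-match routing using only Python's re module. It is not the webapp implementation: the ROUTES list, the route function, and the string handler names are hypothetical stand-ins. It also assumes the router requires a pattern to match the whole path, which is why the sketch uses '.*test.*' rather than a bare 'test'. If your framework anchors patterns that way, the same adjustment may be needed there.

```python
import re

# Hypothetical route table: strings stand in for the handler classes.
# Order matters: the more specific pattern must come first.
ROUTES = [
    (r'.*test.*', 'TestPage'),  # any path containing "test"
    (r'.*', 'MainPage'),        # everything else
]

def route(path):
    """Return the handler name of the first pattern that matches the whole path."""
    for pattern, handler in ROUTES:
        if re.fullmatch(pattern, path):
            return handler
    return None

if __name__ == '__main__':
    for path in ['/', '/about', '/test', '/my/test/page']:
        print(path, '->', route(path))
```

Because the catch-all pattern comes last, it only sees paths that the "test" pattern has already declined, so no negated expression is needed.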

Jeremy W. Sherman
It looks like a combination of issues. I happened to be using a RegEx parser for something else which was behaving as I expected it to (as described above) and (obviously) I thought that my syntax was correct (the thought: since it worked in that alternate environment, why wouldn't it here?).
Christopher W. Allen-Poole
+3  A: 

The square-bracket notation in regex is a set of characters, and parens have no special meaning within them. So [^(test)] matches any single character other than 't', 'e', 's', '(', or ')'.
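This can be checked quickly with Python's re module. The snippet below is only an illustration: the helper name fully_matches and the sample paths are made up, and re.fullmatch is used to mimic a router that requires the whole path to match (an assumption about the framework).

```python
import re

# Inside square brackets the parentheses are literal members of the set,
# so this is a negated class over the characters ( t e s ).
original = re.compile(r'[^(test)]*')
# Any rearrangement of the same characters describes the same set.
rearranged = re.compile(r'[^est()]*')

def fully_matches(pattern, path):
    """Return True if every character of path is accepted by the pattern."""
    return pattern.fullmatch(path) is not None

if __name__ == '__main__':
    for path in ['/main', '/home', '/about', '/test']:
        print(path, fully_matches(original, path), fully_matches(rearranged, path))
```

Only '/main' is accepted here, since each of the other sample paths contains at least one of 't', 'e', or 's', even though only '/test' contains the word "test".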

Ned Batchelder
Just to supplement Ned's answer, here is a link to the docs that explain this: http://docs.python.org/library/re.html
Robert Kluin