Regular expressions are usually expressed as strings, but they also have properties (e.g. single-line, multi-line, ignore case). How would you store them? And how would you store a compiled regular expression?

Please note that we can write custom property classes: http://googleappengine.blogspot.com/2009/07/writing-custom-property-classes.html

As I don't know Python well enough, my first attempt at writing a custom property that stores a compiled regular expression failed.

+3  A: 

I'm not sure if Python supports it, but in .NET regex you can specify these options within the regex itself:

(?si)^a.*z$

would specify single-line, ignore case.

Indeed, the Python docs describe such a mechanism here: http://docs.python.org/library/re.html

To recap (copied from the link above):

(?iLmsux)

(One or more letters from the set 'i', 'L', 'm', 's', 'u', 'x'.) The group matches the empty string; the letters set the corresponding flags: re.I (ignore case), re.L (locale dependent), re.M (multi-line), re.S (dot matches all), re.U (Unicode dependent), and re.X (verbose), for the entire regular expression. (The flags are described in Module Contents.) This is useful if you wish to include the flags as part of the regular expression, instead of passing a flag argument to the compile() function.

Note that the (?x) flag changes how the expression is parsed. It should be used first in the expression string, or after one or more whitespace characters. If there are non-whitespace characters before the flag, the results are undefined.
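
As a quick check of the inline form in Python, here is a minimal sketch comparing the pattern from above with the same pattern compiled using explicit flags. The sample text is made up for illustration; only the s and i flags are exercised.

```python
import re

# Flags embedded in the pattern string itself: one plain string to store.
inline = re.compile(r"(?si)^a.*z$")

# The same pattern with the flags passed separately to compile().
explicit = re.compile(r"^a.*z$", re.S | re.I)

# Made-up sample: starts with an upper-case A, spans two lines, ends with Z.
text = "A first line\nsecond line ends with Z"

inline_hit = inline.match(text) is not None
explicit_hit = explicit.match(text) is not None

# Without either flag the match fails (case differs, and '.' stops at the newline).
plain_hit = re.match(r"^a.*z$", text) is not None
```

With the flags inside the string, a single text field holds everything needed to rebuild the regex.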

spender
+3  A: 

I wouldn't try to store the compiled regex. The data in a compiled regex is not designed to be stored, and is not guaranteed to be picklable or serializable. Just store the string and re-compile (the re module compiles and caches patterns for you behind the scenes anyway).
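
A minimal sketch of that approach in plain Python, independent of any datastore. The helper names to_storable and from_storable are hypothetical, not part of the re module or App Engine; they only rely on the pattern and flags attributes that compiled patterns expose.

```python
import re


def to_storable(compiled):
    """Return the pattern text and integer flags of a compiled regex.

    Hypothetical helper: both values are plain data (a string and an int),
    so they can go into ordinary string and integer fields.
    """
    return compiled.pattern, compiled.flags


def from_storable(pattern, flags=0):
    """Rebuild a compiled regex from the stored pattern text and flags."""
    return re.compile(pattern, flags)


original = re.compile(r"^a.*z$", re.S | re.I)
stored = to_storable(original)      # what would be written to storage
restored = from_storable(*stored)   # what would be rebuilt after loading
```

If the flags are written inline, as in the (?si) form, the pattern string alone is enough and the integer can be dropped.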

Ned Batchelder
+2  A: 

You can either store the text, as suggested above, or you can pickle and unpickle the compiled RE. For example, see PickledProperty in the cookbook.

Because pickle is slow, particularly on App Engine where cPickle is unavailable, you'll probably find that storing the text of the regex is the faster option. In fact, it appears that when pickled, a compiled RE simply stores the original text anyway.
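
That last observation is easy to check with the standard pickle module. This is a small sketch run against CPython's re and pickle, not against App Engine; the variable names are made up for illustration.

```python
import pickle
import re

compiled = re.compile(r"^a.*z$", re.S | re.I)

# Pickle the compiled pattern and load it back.
payload = pickle.dumps(compiled)
restored = pickle.loads(payload)

# The pickled bytes contain the original pattern text, which suggests the
# regex is re-compiled from its source on load rather than stored compiled.
contains_source = compiled.pattern.encode("utf-8") in payload
```

So pickling buys nothing over storing the string and flags directly, and it costs the pickle overhead on every read and write.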

Nick Johnson