views:

517

answers:

6

Is there a way to debug a regular expression in Python? And I'm not referring to the process of trying and trying till they work :)

EDIT: here is how regexes can be debugged in perl :


use re 'debug';

my $str = "GET http://some-site.com HTTP/1.1";
if($str =~/get\s+(\S+)/i) {
    print "MATCH:$1\n";
}

The code above produces the following output on my computer when ran :


Compiling REx "get\s+(\S+)"
Final program:
   1: EXACTF  (3)
   3: PLUS (5)
   4:   SPACE (0)
   5: OPEN1 (7)
   7:   PLUS (9)
   8:     NSPACE (0)
   9: CLOSE1 (11)
  11: END (0)
stclass EXACTF  minlen 5
Matching REx "get\s+(\S+)" against "GET http://some-site.com HTTP/1.1"
Matching stclass EXACTF  against "GET http://some-site.com HTTP/1.1" (33 chars)
   0           |  1:EXACTF (3)
   3        |  3:PLUS(5)
                                  SPACE can match 1 times out of 2147483647...
   4       |  5:  OPEN1(7)
   4       |  7:  PLUS(9)
                                    NSPACE can match 20 times out of 2147483647...
  24       |  9:    CLOSE1(11)
  24       | 11:    END(0)
Match successful!
MATCH:http://some-site.com
Freeing REx: "get\s+(\S+)"

+3  A: 

Why don't you use some regEx tool (i usually use Regulator) and test the regex-expression there and when you are satisfied, just copy/paste it into your code.

Because using a regex tool won't tell me why my regex isn't working.
Geo
@Geo - what exactly do you mean by "isn't working". Isn't working at all, isn't matching the things you want to match or ... ?
ldigas
At the risk of stating the obvious, a regex tool can't tell you why it isn't giving you the right matches. A regex is going to do exactly what you tell it, and the best any tool can do is step you through so that you can figure out yourself which bit is wrong.
Noldorin
@Noldorin - in which case I'd reccommend a book, "Learning ..." by O'Reilly, wonderful for this kinda stuff.
ldigas
@Idigas: Not quite sure what you mean. There's a "Mastering Regular Expressions" book by O'Reilly... are you suggesting the OP reads this to understand RegEx better?
Noldorin
@Noldorin - yes, it's a nice book, very user friendly. Once he grasps the basics he'll have a easier time on.
ldigas
@Noldorin - "Mastering" is also a nice book, but learning is better for starters (it's not just basics)
ldigas
+1  A: 

Not sure about doing such a thing directly in Python, but I could definitely suggest using a RegEx editor tool. That's likely to be your best bet anyway. Personally, I've used The Regulator and found it to very helpful. Some others are listed in this SO thread.

Noldorin
+6  A: 

>>> p = re.compile('.*', re.DEBUG)
max_repeat 0 65535
  any None
>>>                         

http://stackoverflow.com/questions/580993/regex-operator-vs-separate-runs-for-each-sub-expression/582227#582227

Mykola Kharechko
Actually, it's re.compile('.*', re.DEBUG) -- which just happens to be equal to 128.
DzinX
+1  A: 

Similar to the already mentioned, there is also Regexbuddy

ldigas
A: 

I quite often use RegexPal for quick checks (an online regular expression prototyper). It has a lot of the common expressions listed along with a simple expression. Very handy when you don't have a dedicated tool and just need a quick way to work out a somple regex.

Jon Cage
A: 

What RegexBuddy has that the other tools don't have is a built-in debugger that shows you the entire matching process of both successful and failed match attempts. The other tools only show the final result (which RegexBuddy can show too).

Jan Goyvaerts