Please help me work out whether this is a bug in Python (2.6.5), a flaw in my regex, or a gap in my understanding of pattern matching.
(I accept that a possible answer is "Upgrade your Python".)
I'm trying to parse a Yubikey token, allowing for the optional extras.
When I use this regex to match a token without any optional extras (that is, containing only the stuff that matches the two capture groups), the match fails:
r'^\t?[^a-z0-9]?([cbdefghijklnrtuv1-8]{0,32})\t?([cbdefghijklnrtuv1-8]{32})\t?\r?\n?$'
However, if I make the first group non-greedy:
r'^\t?[^a-z0-9]?([cbdefghijklnrtuv1-8]{0,32}?)\t?([cbdefghijklnrtuv1-8]{32})\t?\r?\n?$'
it succeeds.
So, OK, it's working, but I would have thought that the only difference in end result between these two regexes would be performance.
Both Expresso and Regex Coach like both patterns.
What have I missed?
Here are two of the strings I'm testing with.
No optional extras (the ones that can fail):
"vvbrentlnccnhgfgrtetilbvckjcegblehfvbihrdcui"
With optional extras (haven't failed so far; actual tabs are shown here as "_"):
"_!_8R5Gkruvfgheufhcnhllchgrfiutujfh_"
"_!1U4Knivdgvkfthrd_brvejhudrdnbunellrjjkkccfnggbdng_"
I've tried to reproduce it using the suggestion from Alex Martelli, and it doesn't fail in the raw Python environment, so I'm going to revisit my code (I'm actually hacking on yubikey-python); I'll report back in a day or so.
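For anyone who wants to repeat that check, here is a minimal sketch of a standalone test. It is not the exact code I ran; the helper name split_token and the constant names are made up for illustration. It applies both patterns from above to the no-extras string and prints the captured groups, which should be identical if greediness only affects how the engine backtracks.

```python
import re

# Both patterns are copied verbatim from the question.
GREEDY = r'^\t?[^a-z0-9]?([cbdefghijklnrtuv1-8]{0,32})\t?([cbdefghijklnrtuv1-8]{32})\t?\r?\n?$'
LAZY = r'^\t?[^a-z0-9]?([cbdefghijklnrtuv1-8]{0,32}?)\t?([cbdefghijklnrtuv1-8]{32})\t?\r?\n?$'

# The "no optional extras" test string from the question (44 characters).
TOKEN = "vvbrentlnccnhgfgrtetilbvckjcegblehfvbihrdcui"


def split_token(pattern, token):
    """Return the two captured groups, or None if the pattern does not match."""
    m = re.match(pattern, token)
    return m.groups() if m else None


for name, pattern in (("greedy", GREEDY), ("non-greedy", LAZY)):
    print("%s: %r" % (name, split_token(pattern, TOKEN)))
```

In both cases the engine has to settle on 12 characters for the first group, because the second group needs exactly 32 and the pattern is anchored at both ends; the greedy version gets there by giving characters back, the non-greedy one by taking more.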
My apologies to everyone. I cannot reproduce the problem. When it occurred, I was reading input via getpass; I suspect that an accidental foreign keystroke got in the way.
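A rough sketch of how a stray character like that could be exposed before the regex ever runs. The helper unexpected_chars and the sample string with an embedded escape character are hypothetical, and the allowed set is simply the character class from the patterns above plus tab, CR and LF, so a legitimate prefix character such as "!" would also be reported.

```python
# Characters the patterns above accept inside a token, plus the
# optional tab / CR / LF separators. This set is an assumption
# derived from the regex, not from the Yubikey documentation.
ALLOWED = set("cbdefghijklnrtuv12345678\t\r\n")


def unexpected_chars(raw):
    """List (index, character) pairs for anything outside ALLOWED."""
    return [(i, ch) for i, ch in enumerate(raw) if ch not in ALLOWED]


# Hypothetical input: part of a token with an accidental Escape keystroke.
sample = "vvbrentlnccn\x1bhgfg"

# repr() makes invisible characters visible in the output.
print(repr(sample))
print(unexpected_chars(sample))
```

Printing repr() of whatever getpass returned would have shown the extra character immediately, whereas printing the string itself may hide it.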
I am going to close the question. If whoever upvoted the question wishes to remove their vote, that is fair.
Very sorry.