views:

48

answers:

3

Given a regexp, I would like to generate random matching data x number of times to test something.

e.g.

>>> print generate_date('\d{2,3}')
13
>>> print generate_date('\d{2,3}')
422

Of course, the objective is to do something a bit more complicated than that, such as phone numbers and email addresses.

Does something like this exist? If it does, does it exist for Python? If not, any clue/theory I could use to do that?

+1  A: 

There is a post on the Python mailing list about a module that generates all permutations of a regex. I'm not so sure how you might go about randomising it though. I'll keep checking.
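
One possible way to randomise a generator like that without building the whole list first is reservoir sampling. This is only a sketch: `pick_random` and `all_matches` are names made up for illustration, and `all_matches` is a hand-written stand-in for whatever generator such a module would return.

```python
import random

def pick_random(iterable, rng=random):
    """Return one item chosen uniformly at random from an iterable of unknown length."""
    chosen = None
    seen = 0
    for seen, item in enumerate(iterable, start=1):
        # Keep the new item with probability 1/seen (reservoir of size one).
        if rng.randrange(seen) == 0:
            chosen = item
    if seen == 0:
        raise ValueError("iterable is empty")
    return chosen

def all_matches():
    # Hypothetical stand-in: yields every string matching \d{2}.
    for n in range(100):
        yield "%02d" % n

print(pick_random(all_matches()))
```

Each call walks the whole generator once, so this trades time for memory; if you need many samples, building the list once is probably cheaper.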

detly
+1 cause you searched for it.
e-satis
@e-satis - I actually found it somewhere else on SO :)
detly
+1  A: 

I will probably be flogged for suggesting this, but Perl has a module that does exactly this. You might want to take a look at its code to see how to implement it in Python:

http://p3rl.org/String::Random

nicomen
SO doesn't have a [whip] button yet, so you're safe.
detly
Interesting to know it exists, at least for perl. +1
e-satis
+2  A: 

Pyparsing includes this regex inverter, which returns a generator of all permutations for simple regexes. Here are some of the test cases from that module:

[A-C]{2}\d{2}
@|TH[12]
@(@|TH[12])?
@(@|TH[12]|AL[12]|SP[123]|TB(1[0-9]?|20?|[3-9]))?
@(@|TH[12]|AL[12]|SP[123]|TB(1[0-9]?|20?|[3-9])|OH(1[0-9]?|2[0-9]?|30?|[4-9]))?
(([ECMP]|HA|AK)[SD]|HS)T
[A-CV]{2}
A[cglmrstu]|B[aehikr]?|C[adeflmorsu]?|D[bsy]|E[rsu]|F[emr]?|G[ade]|H[efgos]?|I[nr]?|Kr?|L[airu]|M[dgnot]|N[abdeiop]?|Os?|P[abdmortu]?|R[abefghnu]|S[bcegimnr]?|T[abcehilm]|Uu[bhopqst]|U|V|W|Xe|Yb?|Z[nr]
(a|b)|(x|y)

Edit:

To do your random selection, create a list (once!) of your permutations, and then call random.choice on the list each time you want a random string that matches the regex, something like this (untested):

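import random
import invRegex  # the regex inverter module mentioned above; must be importable

class RandomString(object):
    def __init__(self, regex):
        # Enumerate every matching string once, up front.
        self.possible_strings = list(invRegex.invert(regex))

    def random_string(self):
        return random.choice(self.possible_strings)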
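
To see the "build the list once, then pick" idea run without the inverter installed, here is a rough sketch for the first test case, `[A-C]{2}\d{2}`. The `possible_strings` list is built by hand with `itertools.product` and is only an assumed stand-in for what the inverter would return for that pattern.

```python
import itertools
import random
import re
import string

# Hand-built stand-in for the inverter's output on [A-C]{2}\d{2}:
# two letters from A-C followed by two ASCII digits.
possible_strings = [
    "".join(chars)
    for chars in itertools.product("ABC", "ABC", string.digits, string.digits)
]

pattern = re.compile(r"[A-C]{2}\d{2}")

# The list is built once; each sample is then a cheap random.choice call.
samples = [random.choice(possible_strings) for _ in range(5)]
for s in samples:
    assert pattern.fullmatch(s)
    print(s)
```

The list size is the product of the choices at each position (3 * 3 * 10 * 10 here), so this only stays practical for regexes with a small, finite set of matches.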
Paul McGuire
+1 That's awesome!
katrielalex
Almost what I'm looking for. +1
e-satis
I've also packaged this module up as a utility on UtilityMill: http://utilitymill.com/utility/Regex_inverter. All UM utilities expose XML and JSON APIs, so you can call this remotely from your own code, and UtilityMill does the regex inversion processing.
Paul McGuire