views:

149

answers:

3

Specifically, is there a library that, when given 2 (or more) regular expressions, can tell if exists an input that both would match? Bonus points if it's easily accessible via Java or .NET, but command-line would be fine as well.

Asker's log, supplemental:

The regular expressions that would be fed to this algorithm are fairly simple. While I believe there are a couple with lookaheads, they are all fairly simple combinations of literals or character classes with fixed minimum and maximum lengths.

+1  A: 

If there were it would not run in a useful amount of time. Comparing regex is a PSPACE problem

http://en.wikipedia.org/wiki/PSPACE-complete

You may have some luck if you can allow extra restrictions on your regex

gnibbler
If I can allow restrictions, or limit to a subset of regex syntax, is there a library that has implemented this?
Chris Marasti-Georg
+2  A: 

If, I understand you correctly, you would like to know if the intersection of 2 regular expressions is the empty set or not? I believe that's hard, but I wouldn't be surprised if the complexity was exponential in the length of the regex (though some regexes would obivously be easier than others)

Regardless, here's a Haskell implementation: http://sulzmann.blogspot.com/2008/11/playing-with-regular-expressions.html

And a prolog implementation http://www.let.rug.nl/vannoord/Fsa/

Falaina
Thanks, I'm checking out the prolog library now.
Chris Marasti-Georg
It looks like those libraries would require me to do some pre-processing on my expressions to convert them to a form the library would understand. At that point, I'd probably be better off trying to find the intersection myself.
Chris Marasti-Georg
+1  A: 

I found a python library that lets me do what I need to do.

>>> import reCompiler
>>> fsa1 = reCompiler.compileRE('\d\d\d?\d?a')
>>> fsa2 = reCompiler.compileRE('123a')
>>> fsa3 = reCompiler.compileRE('a23a')
>>> print len(FSA.intersection(fsa1, fsa2).finalStates)
1
>>> print len(FSA.intersection(fsa1, fsa3).finalStates)
0

The library is called pyFSA. I will need to implement some preparsing to turn statements like \d{2,4} into \d\d\d?\d?, but other than that it should suit my needs nicely. Thanks for the input, and if people find libraries that implement this in other languages, by all means include them.

Chris Marasti-Georg
Even though this answer is your own, you should still accept it so it is marked as the best one.
steveha