I am porting some functionality from a C++ application to Java. This involves reading non-modifiable data files that contain regular expressions.

A lot of the data files contain regular expressions that look similar to the following:

(?<=id="VIEWSTATE".*?value=").*?(?=")

These regular expressions produce the following error:

"Look-behind group does not have an obvious maximum length near index XX"

The regex engine used by the C++ application supported these expressions. Is there an equivalent form of regular expression that Java accepts, that produces the same result, and that can be generated from expressions like my example?

+1  A: 

The only workaround seems to be to replace the star with {0,ALMOST_INTEGER_MAX_VALUE}. The upper limit can be large, but it must be small enough that the maximum length of the whole lookbehind group does not exceed Integer.MAX_VALUE.

See also http://stackoverflow.com/questions/1536915/regex-look-behind-without-obvious-maximum-length-in-java
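
As a rough illustration of the bounded-quantifier workaround, the sketch below hand-rewrites the example expression so that the `.*?` inside the lookbehind becomes `.{0,1000}?`. The bound of 1000, the class name `BoundedLookbehind`, and the sample input are assumptions chosen for the example, not values from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundedLookbehind {
    // Assumed upper bound on the gap between id="VIEWSTATE" and value=" (illustrative only).
    static final int MAX_GAP = 1000;

    // The expression from the question, with the lookbehind's ".*?" replaced by ".{0,MAX_GAP}?".
    static final Pattern BOUNDED = Pattern.compile(
            "(?<=id=\"VIEWSTATE\".{0," + MAX_GAP + "}?value=\").*?(?=\")");

    // Returns the first match, or null if the input does not match.
    static String extract(String input) {
        Matcher m = BOUNDED.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<input type=\"hidden\" id=\"VIEWSTATE\" value=\"abc123\" />";
        System.out.println(extract(html)); // prints abc123
    }
}
```

A modest bound like this keeps the lookbehind's maximum length obvious to the engine; inputs where the gap exceeds the bound would simply not match.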

Christian Semrau
+1  A: 

As far as I know, only .NET and JGSoft, among all the current regex flavors, support unbounded quantifiers in lookbehind expressions. If you can't change the regex, you can't do what you want in Java.

But lookbehind is the wrong way to do that job in the first place. It would have been much easier, as well as more efficient, to use a capturing group:

id="VIEWSTATE".*?value="([^"]*)"

...then you retrieve the value from group #1. Are you sure you can't change the regexes?
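
A minimal Java sketch of the capturing-group approach might look like the following. The class name `CaptureGroupExtract` and the sample input are hypothetical; only the pattern itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureGroupExtract {
    // Pattern from the answer: the wanted text is captured by group #1.
    static final Pattern VIEWSTATE =
            Pattern.compile("id=\"VIEWSTATE\".*?value=\"([^\"]*)\"");

    // Returns the captured value, or null if the input does not match.
    static String extract(String input) {
        Matcher m = VIEWSTATE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<input type=\"hidden\" id=\"VIEWSTATE\" value=\"abc123\" />";
        System.out.println(extract(html)); // prints abc123
    }
}
```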

Alan Moore
I really appreciate the response, and I completely understand the reasoning behind it, but unfortunately I cannot change the regex values. They are fed to the application externally by a process I have no control over, and they come from a number of 3rd party sources (many of which implement them as shown).
CDSO1
I implemented it by converting each incoming regex to the format you specified, using another regex to match against the pieces of the original (if that makes sense). Thank you.
CDSO1
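
The conversion described in the comment above was not posted, so the following is only a guess at how it could be sketched. It assumes every incoming expression has exactly the shape `(?<=PREFIX)BODY(?=SUFFIX)` with no parentheses inside PREFIX; the class `LookaroundRewriter`, its methods, and the group name `extracted` are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundRewriter {
    // Recognizes only the assumed shape (?<=PREFIX)BODY(?=SUFFIX).
    private static final Pattern SHAPE =
            Pattern.compile("\\(\\?<=(.*?)\\)(.*)\\(\\?=(.*)\\)");

    // Moves PREFIX out of the lookbehind and wraps BODY in a named group.
    static String rewrite(String regex) {
        Matcher m = SHAPE.matcher(regex);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported regex shape: " + regex);
        }
        return m.group(1) + "(?<extracted>" + m.group(2) + ")(?=" + m.group(3) + ")";
    }

    // Applies the rewritten regex and returns the named group, or null if there is no match.
    static String extract(String originalRegex, String input) {
        Matcher m = Pattern.compile(rewrite(originalRegex)).matcher(input);
        return m.find() ? m.group("extracted") : null;
    }

    public static void main(String[] args) {
        String original = "(?<=id=\"VIEWSTATE\".*?value=\").*?(?=\")";
        String html = "<input type=\"hidden\" id=\"VIEWSTATE\" value=\"abc123\" />";
        System.out.println(rewrite(original));
        System.out.println(extract(original, html)); // prints abc123
    }
}
```

A named group is used here so that capturing groups already present in the original expression do not shift the index of the extracted value. The rewritten expression consumes the prefix instead of looking behind for it, so results may differ from the original engine on unusual inputs, for example when matches overlap.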
That sounds like fun! ;) Glad I could help.
Alan Moore