Hey, I've been trying to figure out why this regular expression isn't matching correctly.


    List<String> l_operators = Arrays.asList(Pattern.compile("(\\d+)").split(rtString.trim()));
    
The input string is "12+22+3"

The output I get is -- [,+,+]

There's an empty element at the beginning of the list which shouldn't be there. I really can't see where it comes from and I could use some insight. Thanks.

+1  A: 

That's the behavior of split in Java. You just have to take it (and deal with it) or use another library to split the string. I personally try to avoid Java's split.

An example of one alternative is to look at Splitter from Google Guava.

nanda
+2  A: 

Well, technically, there is an empty string in front of the first delimiter (the first sequence of digits). If you had, say, a line of CSV such as abc,def,ghi and another such as ,jkl,mno, you would clearly want to know that the first value in the second line was the empty string. Thus the behaviour is desirable in most cases.
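To make the point concrete, here is a small sketch that runs the split from the question alongside the CSV case. The class and method names (LeadingEmptyDemo, splitOnDigits, splitOnCommas) are made up for illustration and are not from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class LeadingEmptyDemo {
    // Splits on runs of digits; a leading empty string is kept when the
    // input starts with a delimiter, trailing empty strings are dropped.
    static List<String> splitOnDigits(String input) {
        return Arrays.asList(Pattern.compile("\\d+").split(input));
    }

    // Splits a CSV-style line on commas (no quoting or escaping handled).
    static List<String> splitOnCommas(String line) {
        return Arrays.asList(line.split(","));
    }

    public static void main(String[] args) {
        System.out.println(splitOnDigits("12+22+3"));     // [, +, +]
        System.out.println(splitOnCommas("abc,def,ghi")); // [abc, def, ghi]
        System.out.println(splitOnCommas(",jkl,mno"));    // [, jkl, mno]
    }
}
```

In the question's input, the digits "12" are the first delimiter, so the piece before it is the empty string, exactly like the missing first field in ,jkl,mno.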

For your particular case, you need to deal with it manually, or refine your regular expression somehow. For instance, you can skip past the first number before splitting:

    Pattern p = Pattern.compile("\\d+");
    Matcher m = p.matcher(rtString);
    // Skip past the first number so the remainder starts with an operator.
    if (m.find()) {
        List<String> l_operators = Arrays.asList(p.split(rtString.substring(m.end()).trim()));
        // ...
    }
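Another way to sidestep the empty leading element is to match the operators themselves instead of splitting on the numbers. The sketch below assumes the operators are single characters from the set + - * /; that set and the names OperatorFinder and findOperators are assumptions, not something stated in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorFinder {
    // Assumption: every operator is one of these four characters.
    private static final Pattern OPERATOR = Pattern.compile("[-+*/]");

    // Collects each operator in the order it appears in the expression.
    static List<String> findOperators(String expression) {
        List<String> operators = new ArrayList<>();
        Matcher m = OPERATOR.matcher(expression);
        while (m.find()) {
            operators.add(m.group());
        }
        return operators;
    }

    public static void main(String[] args) {
        System.out.println(findOperators("12+22+3")); // [+, +]
    }
}
```

Note that this treats every minus sign as an operator, so a leading negative number such as -5+3 would yield two operators.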

Ideally, however, you should be using a parser for this type of string. You can't, for instance, deal with parentheses in expressions using just regular expressions.
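As a rough idea of what such a parser could look like, here is a minimal recursive-descent sketch. It assumes non-negative integer operands, the four operators + - * /, and parentheses; the grammar and the names ExprParser and evaluate are illustrative choices, not part of the question.

```java
class ExprParser {
    private final String src;
    private int pos = 0;

    private ExprParser(String src) {
        // Whitespace is stripped up front to keep the parsing methods simple.
        this.src = src.replaceAll("\\s+", "");
    }

    static long evaluate(String expression) {
        ExprParser p = new ExprParser(expression);
        long value = p.parseExpression();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // expression := term (('+' | '-') term)*
    private long parseExpression() {
        long value = parseTerm();
        while (pos < src.length() && (peek() == '+' || peek() == '-')) {
            char op = src.charAt(pos++);
            long rhs = parseTerm();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // term := factor (('*' | '/') factor)*
    private long parseTerm() {
        long value = parseFactor();
        while (pos < src.length() && (peek() == '*' || peek() == '/')) {
            char op = src.charAt(pos++);
            long rhs = parseFactor();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // factor := digits | '(' expression ')'
    private long parseFactor() {
        if (pos < src.length() && peek() == '(') {
            pos++; // consume '('
            long value = parseExpression();
            if (pos >= src.length() || peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++; // consume ')'
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(peek())) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Long.parseLong(src.substring(start, pos));
    }

    private char peek() {
        return src.charAt(pos);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("12+22+3"));   // 37
        System.out.println(evaluate("2*(3+4)-5")); // 9
    }
}
```

Each grammar rule maps to one method, and the nested call from parseFactor back into parseExpression is what handles arbitrarily deep parentheses, which a single regular expression cannot do.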

aioobe
Of course! That makes perfect sense, thanks.
bala singareddy
A: 

Try Guava's Splitter.

    Splitter.onPattern("\\d+").omitEmptyStrings().split(rtString)
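If adding Guava is not an option, a similar effect can probably be had with the standard library alone. This sketch assumes Java 8 or later (for Pattern.splitAsStream); the names OmitEmptySplit and splitOmittingEmpty are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class OmitEmptySplit {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Splits on runs of digits and drops any empty pieces,
    // roughly what omitEmptyStrings() does in the Guava version.
    static List<String> splitOmittingEmpty(String input) {
        return DIGITS.splitAsStream(input)
                .filter(piece -> !piece.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(splitOmittingEmpty("12+22+3")); // [+, +]
    }
}
```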
Emil