I want to match on a regex and modify the match. Here is my function. Right now, my method doesn't change the input at all. What is wrong? Thanks.

    Matcher abbrev_matcher = abbrev_p.matcher(buffer);
    StringBuffer result = new StringBuffer();//must use stringbuffer here!
    while (abbrev_matcher.find()){
        //System.out.println("match found");
        abbrev_matcher.appendReplacement(result, getReplacement(abbrev_matcher));
    }
    abbrev_matcher.appendTail(result);

    private static String getReplacement(Matcher aMatcher){
        StringBuilder temp = new StringBuilder(aMatcher.group(0));
        for (int i = 0; i < temp.length(); i++){
            if (temp.charAt(i) == '.'){
                temp.deleteCharAt(i);
            }
        }
        return temp.toString();
    }
+1  A: 

You just want to remove all the dots from the matched text? Here:

    StringBuffer result = new StringBuffer();
    while (abbrev_matcher.find()) {
        abbrev_matcher.appendReplacement(result, "");
        result.append(abbrev_matcher.group().replaceAll("\\.", ""));
    }
    abbrev_matcher.appendTail(result);

The reason for calling appendReplacement(result, "") is that appendReplacement scans the replacement string for $1, $2, etc., so it can substitute the corresponding capture groups. Unless you are passing a string literal or other string constant to that method, it's best to skip that processing step and use StringBuffer's append method instead. Otherwise it will tend to blow up whenever the replacement string contains a dollar sign or a backslash.
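An alternative to the empty-replacement trick is to escape the dynamic string with Matcher.quoteReplacement before handing it to appendReplacement. Here is a minimal sketch; the class name, the pattern, and the "N dollars" example are made up for illustration and are not from the question:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarReplacementDemo {
    // Hypothetical pattern for illustration: "<digits> dollars".
    private static final Pattern AMOUNT = Pattern.compile("(\\d+) dollars");

    static String toDollarSigns(String input) {
        Matcher m = AMOUNT.matcher(input);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            // The replacement is computed at run time and contains a '$'.
            String replacement = "$" + m.group(1);
            // quoteReplacement escapes '$' and '\' so appendReplacement
            // treats them as literal text instead of group references.
            m.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(result);
        return result.toString();
    }
}
```

Without the quoteReplacement call, the "$5" in this example would be read as a reference to capture group 5, which does not exist in the pattern, so appendReplacement would throw instead of inserting the text.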

As for your getReplacement method, in my tests it does change the matched string, but it doesn't do it correctly. For example, if the string is "...blah...", it returns ".blah." instead of "blah". That's because every time you call deleteCharAt(i) on the StringBuilder, every subsequent character shifts down by one index, so the loop skips the character that moved into position i. You would have to iterate through the string backward to make that approach work, but it's not worth it; just start with an empty StringBuilder and build the string by appending instead of deleting. It's much more efficient as well as easier to manage.
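A minimal sketch of the append-based approach described above (the class and method names are placeholders, not from the question):

```java
class DotStripper {
    // Copies every character except '.' into a fresh StringBuilder.
    // Nothing is deleted, so no index ever shifts under the loop.
    static String removeDots(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != '.') {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this version, removeDots("...blah...") returns "blah", because the loop index only ever walks the unmodified input.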

Now that I think about it some more, the reason you aren't seeing any change may be that your code is throwing a StringIndexOutOfBoundsException, which you aren't seeing because the code runs in a try block and the corresponding catch block is empty (the classic Empty Catch Block anti-pattern). Is that the case?

Alan Moore
Thanks for your help, Alan, but that's not quite what I'm trying to do. The "dot delete" was just an example. I want to do more sophisticated dynamic replacement, like matching different forms of dates ("Oct 10, 2005", "10/10/2005", "10/10/05") and normalizing them. Perhaps this isn't possible with these helper methods :-/
Of course it's possible, and the way you're doing it is just right. You just have an error somewhere that's not apparent in the code you posted. Try using Elliott Hughes's Rewriter class instead, so you only have to write the `replacement` method: http://elliotth.blogspot.com/2004/07/java-implementation-of-rubys-gsub.html
Alan Moore
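For what it's worth, here is a rough sketch of how the same find/appendReplacement loop could handle the numeric date forms mentioned in the comment. Everything in it is an assumption made for illustration: the class name, the pattern, month-first ordering, the yyyy-MM-dd output format, and mapping two-digit years to 20xx. It does not cover the "Oct 10, 2005" form, which would need its own pattern and a month-name lookup, and it is not the Rewriter class linked above.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateNormalizer {
    // Assumed input shape: M/D/YYYY or M/D/YY, month first.
    private static final Pattern NUMERIC_DATE =
            Pattern.compile("\\b(\\d{1,2})/(\\d{1,2})/(\\d{4}|\\d{2})\\b");

    static String normalize(String input) {
        Matcher m = NUMERIC_DATE.matcher(input);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            // The replacement is computed per match; quoteReplacement
            // keeps any '$' or '\' in it from being interpreted.
            m.appendReplacement(result, Matcher.quoteReplacement(getReplacement(m)));
        }
        m.appendTail(result);
        return result.toString();
    }

    private static String getReplacement(Matcher m) {
        int month = Integer.parseInt(m.group(1));
        int day = Integer.parseInt(m.group(2));
        int year = Integer.parseInt(m.group(3));
        if (m.group(3).length() == 2) {
            year += 2000; // assumption: two-digit years mean 20xx
        }
        return String.format(Locale.ROOT, "%04d-%02d-%02d", year, month, day);
    }
}
```

The structure is the same as in the question: the loop stays fixed, and all of the match-specific logic lives in getReplacement.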