hi,

i'm searching forward in an array of strings with a regex, like this:

for (int j = line; j < lines.length; j++) {  
    if (lines[j] == null || lines[j].isEmpty()) {
        continue;
    }
    matcher = pattern.matcher(lines[j]);
    if (matcher.find(offset)) {
        offset = matcher.end();
        line = j;
        System.out.println("found \""+matcher.group()+"\" at line "+line+" ["+matcher.start()+","+offset+"]");
        return true;
    }
    offset = 0;
}
return false;

note that in my implementation above i save the line and offset for continuous searches.
anyway, now i want to search backwards from that [line,offset].

my question: is there a way to search backwards with a regex efficiently? if not, what could be an alternative?

10x, asaf :-)

clarification: by backwards i mean finding the previous match.
for example, say that i'm searching for "dana" in "dana nama? dana kama! lama dana kama?" and got to the 2nd match. if i do matcher.find() again, i'll search forward and get the 3rd match. but i want to search backwards and get to the 1st match.
the code above should then output something like:

found "dana" at line 0 [0,4] // fwd
found "dana" at line 0 [11,15] // fwd
found "dana" at line 0 [0,4] // bwd
A: 

Is the search string strictly a regex (full, rich syntax)? Because if not: for (int j = line; j >= 0; j--), reverse the line, reverse the search string, and search forward ;)

SF.
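For a plain literal (no regex syntax), the reverse-and-search-forward idea might look like the sketch below. The class name `ReverseLiteralSearch`, the method `findBackward` and the `endLimit` parameter are made up for illustration; `endLimit` is assumed to be the index at or before which the hit must end (for example, the start of the current match).

```java
final class ReverseLiteralSearch {
    /**
     * Returns the start index (in the original line) of the last occurrence of
     * needle that ends at or before endLimit, or -1 if there is none.
     */
    static int findBackward(String line, String needle, int endLimit) {
        String reversedLine = new StringBuilder(line).reverse().toString();
        String reversedNeedle = new StringBuilder(needle).reverse().toString();
        // Everything before endLimit in the original sits at or after this index when reversed.
        int from = line.length() - endLimit;
        int hit = reversedLine.indexOf(reversedNeedle, from);
        if (hit < 0) {
            return -1;
        }
        // Map the reversed hit [hit, hit + len) back to an original start index.
        return line.length() - (hit + needle.length());
    }

    public static void main(String[] args) {
        String line = "dana nama? dana kama! lama dana kama?";
        // The second "dana" starts at index 11, so ask for a hit ending at or before 11.
        System.out.println(findBackward(line, "dana", 11)); // prints 0
    }
}
```

If the search string really is a literal, `line.lastIndexOf(needle, endLimit - needle.length())` should give the same index without building the reversed copies.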
thanks for answering. clearly i'll need to search backwards in the lines. so, your solution is actually: reverse the line + reverse the pattern + map back the match indices (start/end). (a) i'm wondering if there's a more efficient way... (b) reversing a regex pattern? i don't know...
Asaf
Yes, as long as the pattern is `/some text/` it's okay, but if you search for `/^[0-9]+\s(\w+)/` or the like, this will obviously break. Another approach (that would not break in the same way) would be to prepend a greedy `.*` to the pattern and truncate the searched line at each found match. Finding the actual offset becomes more problematic, though (the overall match will always start at 0), so you'd have to subtract the length of your real match from matcher.end(). Not very efficient again...
SF.
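A rough Java sketch of the greedy `.*` idea, offered as one possible reading rather than a definitive implementation. The names `GreedyPrefixSearch`, `findLastBefore` and `endLimit` are hypothetical. Instead of subtracting lengths, the sketch wraps the user's pattern in capturing group 1 and uses `Matcher.region` to truncate the line. Two caveats under these assumptions: numbered backreferences inside the user's pattern shift by one, and the result is the match that *starts* furthest to the right, which is not always the match a forward scan would find last (for `[0-9]+` on "ab 123 cd" it yields "3", not "123").

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyPrefixSearch {
    /**
     * Returns {start, end} of the right-most-starting match of regex that ends
     * at or before endLimit, or null if there is none.
     */
    static int[] findLastBefore(String regex, String line, int endLimit) {
        // The greedy prefix swallows as much as it can; backtracking then stops
        // at the right-most position where the wrapped pattern still matches.
        Pattern p = Pattern.compile("(?s:.*)(" + regex + ")");
        Matcher m = p.matcher(line).region(0, Math.min(endLimit, line.length()));
        if (!m.lookingAt()) {
            return null;
        }
        // Group 1 is the user's pattern, so no length arithmetic is needed.
        return new int[] { m.start(1), m.end(1) };
    }

    public static void main(String[] args) {
        String line = "dana nama? dana kama! lama dana kama?";
        int[] hit = findLastBefore("dana", line, 11);
        System.out.println(hit[0] + "," + hit[1]); // prints 0,4
    }
}
```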
+1  A: 

Java's regular expression engine cannot search backwards. In fact, the only regex engine that I know that can do that is the one in .NET.

Instead of searching backwards, iterate over all the matches in a loop (searching forward). If the match is prior to the position you want, remember it. If the match is after the position you want, exit from the loop. In pseudo code (my Java is a little rusty):

storedmatch = ""
while matcher.find {
  if matcher.end < offset {
    storedmatch = matcher.group()
  } else {
    return storedmatch
  }
}
// no match at or after offset: the last one seen is the answer
return storedmatch
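One possible Java rendering of the loop above, adapted to the `lines` array and the saved [line, offset] from the question. This is a sketch under assumptions: `PreviousMatchFinder` and `findPrevious` are invented names, and `fromOffset` is taken to be the `matcher.end()` of the current match, as in the question's code. Each line is still scanned forward from its beginning, so the cost grows with the number of matches on that line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PreviousMatchFinder {
    /**
     * Returns {line, start, end} of the last match that ends before
     * (fromLine, fromOffset), or null if there is no earlier match.
     */
    static int[] findPrevious(Pattern pattern, String[] lines, int fromLine, int fromOffset) {
        for (int j = fromLine; j >= 0; j--) {
            String text = lines[j];
            if (text == null || text.isEmpty()) {
                continue;
            }
            // On the starting line stop before fromOffset; earlier lines are scanned in full.
            int limit = (j == fromLine) ? fromOffset : text.length() + 1;
            int[] last = null;
            Matcher m = pattern.matcher(text);
            // Remember every match that ends before the limit; the final one wins.
            while (m.find() && m.end() < limit) {
                last = new int[] { j, m.start(), m.end() };
            }
            if (last != null) {
                return last;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] lines = { "dana nama? dana kama! lama dana kama?" };
        // The second "dana" ends at offset 15, so search backwards from there.
        int[] prev = findPrevious(Pattern.compile("dana"), lines, 0, 15);
        System.out.println("line " + prev[0] + " [" + prev[1] + "," + prev[2] + "]"); // prints line 0 [0,4]
    }
}
```

Storing the returned end index as the new `offset` lets a later forward `matcher.find(offset)` or another backward step continue from the match just found.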
Jan Goyvaerts
yeh, that's more or less what i did. thanks.
Asaf