Does anyone know how to split a string on a character while respecting escaped occurrences of that character?

For example, if the character is ':', "a:b" is split into two parts ("a" and "b"), whereas "a\:b" is not split at all.

I think this is hard (impossible?) to do with regular expressions.
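For comparison, the behaviour described above can be had without regular expressions by scanning the string once and tracking whether the previous character was an active escape. This is only a minimal sketch: the class name EscapeAwareSplit and its split method are made up for illustration, it assumes backslash-style single-character escapes, and it assumes the escape characters should stay in the output (so "a\:b" comes back unchanged).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from any library.
class EscapeAwareSplit {
    // Splits input on delimiter, except where the delimiter directly follows
    // an active escape character. Escape characters are kept in the pieces
    // (an assumption; the question does not say whether to strip them).
    static List<String> split(String input, char delimiter, char escape) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false; // true when the previous char was an active escape

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                // Whatever follows an escape is taken literally.
                current.append(c);
                escaped = false;
            } else if (c == escape) {
                current.append(c);
                escaped = true;
            } else if (c == delimiter) {
                // Unescaped delimiter: close the current piece.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }
}
```

Called with ':' as the delimiter and '\\' as the escape, this splits "a:b" into two pieces, leaves "a\:b" as one piece, and splits after an escaped backslash (two backslashes followed by ':').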

Thank you in advance,

Kedar

+2  A: 

(?<=^|[^\\]): gets you close, but doesn't handle escaped backslashes. (That's the literal regex; of course you have to escape the backslashes in it to get it into a Java string.)
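As a rough illustration of that first pattern in Java, here is how it might look once the backslashes are doubled for the string literal. The class name LookbehindSplit is invented for this sketch; String.split is the standard library call.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper, only here to show the escaping.
class LookbehindSplit {
    // The literal regex (?<=^|[^\\]): as a Java string literal:
    // every backslash in the regex is written twice.
    static final String REGEX = "(?<=^|[^\\\\]):";

    static List<String> split(String input) {
        // Splits on ':' only when it is at the start of the input
        // or preceded by a character other than a backslash.
        return Arrays.asList(input.split(REGEX));
    }
}
```

With this, "a:b" is split in two and "a\:b" is left alone. The limitation shows up when the backslash is itself escaped: a ':' preceded by two backslashes is still treated as escaped and is not split on.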

(?<=(^|[^\\])(\\\\)*): How about that? I think that should match any ':' that is preceded by an even number of backslashes.

Edit: don't vote this up. MizardX's solution is better :)

Jeremy Huiskamp
The key is the (?<=foo) construct, positive look-behind. You need to check what precedes the ':' without matching it.
Jeremy Huiskamp
MizardX points out that look-behind needs to have a finite length. Mine doesn't, so I guess it wouldn't work (I have not tested it). I believe our solutions are otherwise similar. His is probably better in that it uses negative look-behind to check for a non-backslash character, whereas I use "^|[^\\]", which may or may not act differently in multi-line scenarios (not sure).
Jeremy Huiskamp
(^|[^\\]) should work. ^ could possibly match the start of a line instead of the start of the string. That's fine, since it still ensures that there is no backslash at that position. [^\\] also matches newlines, so there is no problem when multi-line mode is not used either.
MizardX
Don't tell me who to vote up! I am voting you both up :)
willcodejavaforfood
+2  A: 
MizardX