Hi all,

I've been using the basic split for a while - where I just parse out a string into an array based on a simple token like " " or ",".

So of course a customer tries this as the delimiter: "\\.br\\", which fails miserably.

I need to parse to an array of lines. The string for example looks like this:

"LINE 1\\.br\\LINE 2\\.br\\LINE 3\\.br\\LINE 4\\.br\\"

and this fails with java.util.regex.PatternSyntaxException: Unexpected internal error.

Any ideas?

+5  A: 

You can use Pattern.quote() to escape a string so that it is treated as a literal when used as a regex:

... = s.split(Pattern.quote("\\.br\\"));
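For anyone who wants to try this end to end, here is a minimal sketch using the sample string from the question. The class and method names (QuoteSplitDemo, splitLines) are made up for illustration. The unquoted version fails because the regex ends in a lone backslash, which is an incomplete escape.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteSplitDemo {
    // The literal delimiter \.br\ ; each backslash is doubled in Java source.
    static final String DELIMITER = "\\.br\\";

    // Hypothetical helper: splits on the delimiter treated as plain text.
    static String[] splitLines(String input) {
        // Pattern.quote makes every character of the delimiter literal,
        // so neither the backslashes nor the dot are interpreted as regex syntax.
        return input.split(Pattern.quote(DELIMITER));
    }

    public static void main(String[] args) {
        String input = "LINE 1\\.br\\LINE 2\\.br\\LINE 3\\.br\\LINE 4\\.br\\";
        // split() drops trailing empty strings, so the final delimiter adds no extra element.
        System.out.println(Arrays.toString(splitLines(input)));
        // prints [LINE 1, LINE 2, LINE 3, LINE 4]
    }
}
```

If you need to keep trailing empty elements, split(regex, -1) is the overload to look at.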
axtavt
this requires Java 5 by the way
newacct
So does using regexp-based splitting anyway, IIRC.
Donal Fellows
@Donal: Regex support, including `split()`, has been around since JDK 1.4. `quote()` was added in JDK 1.5.
Alan Moore
@Alan: OK, so I was wrong. (I'm working with webservices mostly, and have for quite a while now; they tend to force 1.6 as a minimum.)
Donal Fellows
+2  A: 

Don't forget that String.split takes a regex, and most people pass in a string literal, so you have to escape the \ once for the regex and once more for the string literal. You end up with "\\\\" in your String argument to represent a single \ character.

e.g.

s.split("\\\\")

so in your case:

s.split("\\\\\\.br\\\\") // six backslashes up front: four for the backslash, two to escape the dot; four at the end for the trailing backslash
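To double-check the backslash counting, here is a small sketch that splits with the hand-escaped pattern. The names (ManualEscapeDemo, MANUAL_REGEX, splitLines) are only illustrative. After the compiler processes the literal, the regex engine sees eight characters: two for the leading backslash, two for the escaped dot, "br", and two for the trailing backslash.

```java
import java.util.Arrays;

public class ManualEscapeDemo {
    // Java source "\\\\\\.br\\\\" becomes the regex  \\\.br\\
    //   \\  matches one literal backslash
    //   \.  matches one literal dot
    //   br  matches itself
    //   \\  matches the trailing literal backslash
    static final String MANUAL_REGEX = "\\\\\\.br\\\\";

    // Hypothetical helper: splits on the hand-escaped delimiter regex.
    static String[] splitLines(String input) {
        return input.split(MANUAL_REGEX);
    }

    public static void main(String[] args) {
        String input = "LINE 1\\.br\\LINE 2\\.br\\LINE 3\\.br\\LINE 4\\.br\\";
        System.out.println(MANUAL_REGEX.length());                  // prints 8
        System.out.println(Arrays.toString(splitLines(input)));
        // prints [LINE 1, LINE 2, LINE 3, LINE 4]
    }
}
```

This should give the same result as the Pattern.quote() approach; the hand-escaped form is just easier to get wrong.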
jayshao
-1 - you need four slashes, not three. Three slashes will give you a Java compilation error.
Stephen C
Actually, he needs 5 slashes at the beginning because "." should be escaped too.
Ha
Actually, to match a backslash followed by a dot you need *six* backslashes: four to match the backslash and two to escape the dot.
Alan Moore
Good point - that's what I get for typing w/o the computer in front of me - updating - should only be 5 though, not 6, since the String above is also escaped in the literal.
jayshao
No, it has to be six: `"\\\\"` to match the backslash and `"\\."` to match the dot. And another four at the end to match the trailing backslash. The way you've got it now--five before and three after--won't even compile.
Alan Moore
Alan - you're correct, updated - and you're right, the 3 slashes will escape the end of the literal
jayshao