I am completely new to regular expressions so I'm looking for a bit of help here.

I am compiling under JDK 1.5.

Take this line as an example that I read from standard input:

ab:Some string po:bubblegum

What I would like to do is split the line wherever a new two-character prefix followed by a colon begins. That is, once the line is split and put into a string array, these should be the terms:

ab:Some string
po:bubblegum

I have this regex right now:

String[] split = input.split("[..:]");

This splits at the colon; what I would like is for it to match two characters and a colon, but split at the space before that match starts. Is this even possible?

Here is the output from the string array:

ab
Some string po
bubblegum

I've read about Pattern.compile() as well. Is this something I should be considering?

+3  A: 
input.split(" (?=[A-Za-z]{2}:)")

The ?= creates a positive lookahead. This means the engine looks ahead to see whether the text that follows matches, without consuming it. If it does match, the split happens on the space character alone, so the two letters and the colon stay in the next term. [A-Za-z] means an upper- or lower-case letter, while {2} specifies that we want exactly two characters matching that class.
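As a rough sketch of the lookahead in action, the snippet below applies the same regex to the example line from the question. The class name LookaheadSplit and the helper splitTerms are made up for illustration and are not part of any library.

```java
class LookaheadSplit {
    // Hypothetical helper: split on a space only when that space is
    // immediately followed by two letters and a colon. The lookahead
    // checks for the prefix but does not consume it.
    static String[] splitTerms(String input) {
        return input.split(" (?=[A-Za-z]{2}:)");
    }

    public static void main(String[] args) {
        String input = "ab:Some string po:bubblegum";
        for (String term : splitTerms(input)) {
            System.out.println(term);
        }
        // prints:
        // ab:Some string
        // po:bubblegum
    }
}
```

The space inside "Some string" is left alone because it is followed by "st" and then "r", not by a colon.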

Matthew Flaschen
Man, that was fast, and it works great. Thanks a bunch. Would it be possible for you to explain how it works? I can understand that it is looking for two upper- or lower-case characters and then a colon, but that's it. Excellent, thanks again.
slikz
+1  A: 

You asked about Pattern#compile(String regex). You should consider using it if you are going to apply the same regex many times: it compiles the regex once into a Pattern object that is fast to execute, whereas calling String#split(String regex) directly recompiles the regex on every call.
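A minimal sketch of that approach, assuming the lookahead regex from the other answer is the one being reused. The class name PrecompiledSplit, the constant TERM_BOUNDARY, and the second sample line are invented for illustration.

```java
import java.util.regex.Pattern;

class PrecompiledSplit {
    // Compiled once and reused for every line. Pattern objects are
    // immutable, so sharing one instance like this is safe.
    private static final Pattern TERM_BOUNDARY =
            Pattern.compile(" (?=[A-Za-z]{2}:)");

    // Hypothetical helper: same result as line.split(regex), but
    // without recompiling the regex on each call.
    static String[] splitTerms(String line) {
        return TERM_BOUNDARY.split(line);
    }

    public static void main(String[] args) {
        // Stand-in for lines read from standard input.
        String[] lines = {
            "ab:Some string po:bubblegum",
            "cd:another line ef:more text"
        };
        for (String line : lines) {
            for (String term : splitTerms(line)) {
                System.out.println(term);
            }
        }
    }
}
```

For a handful of lines the difference is unlikely to be noticeable; the precompiled Pattern mainly pays off when the split runs inside a loop over a lot of input.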

Esko
Yes, thanks. I just implemented it with the regex from Matthew and it still works.
slikz