Hey again,

I just asked a question about Regex, and received a great answer: http://stackoverflow.com/questions/3047201/javascript-split-without-losing-character

Now, I have another question.

My current Regex looks like this:

var split = text.split(/(?=\w*\d*\d:\d\d)/);

Basically, I'm trying to split on the timestamps (e.g. 9:30 or 10:30; the only difference between them is the extra digit in the latter). How do I go about this?

Currently, if I have these two:

9:30 pm
The user did action A.

10:30 pm
Welcome, user John Doe.

The splits are:

9:30 pm
The user did action A.
----
1
----
0:30 pm
Welcome, user John Doe.

How do I add an optional check for the first character in the timestamp?

Thanks!

+1  A: 
var split = text.split(/(?=\w*[\d]{1,2}:[\d]{2})/);

RegexPal is helpful for these tasks.

Andy
Thanks, but it doesn't seem to work.
Rohan
Which doesn't work: the pattern or the site? If you are using the site, you need to enter just the pattern, i.e. `(\w*\d*\d:\d\d)` will match the input you gave.
Andy
+1 for the great link
chapluck
+1  A: 

I'm not clear on what you're trying to do to the text, but I do have a regex that hopefully can help match the times only.

\d{1,2}:\d{1,2} (am|pm)

The problem with your regex and Andy's regex is that the * is greedy. It means zero or more matches, as many times as possible. Using {min,max} with the exact numbers you need will be more accurate and avoids the greedy *.

edit: Andy's does in fact work on that site he linked. And the * doesn't seem to be greedy. Does either pattern work for you?
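
In case it helps, here is a minimal sketch of matching only the times with the pattern above. The sample text is copied from the question, and the names `text` and `times` are just placeholders for illustration.

```javascript
// Sample input from the question; `text` and `times` are illustrative names.
const text = "9:30 pm\nThe user did action A.\n\n10:30 pm\nWelcome, user John Doe.";

// The g flag makes match() return every full match rather than only the first.
const times = text.match(/\d{1,2}:\d{1,2} (am|pm)/g);

console.log(times); // [ '9:30 pm', '10:30 pm' ]
```

This only extracts the timestamps; it does not split the text into entries.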

robertpateii
@robert: `{min, max}` was the first thing I tried. The real problem is the lookahead: it also matches at `0:30` and splits there, not just at `10:30`. Adding a required word boundary forces the lookahead to find where the "word" begins. +1 and welcome to Stack Overflow :-)
Andy E
+1  A: 

See my answer to your other question. I fixed this problem in the regex by adding a word boundary:

var split = journals.split(/\s*(?=\b\d+:)/);

I updated it with \s* to strip the whitespace immediately before each timestamp, including the blank line between entries; line breaks inside an entry are kept. Result:

["9:30 pm    
The user did action A.", "10:30 pm  
Welcome, user John Doe.", "11:30 am
Messaged user John Doe"]
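
To see the effect of the word boundary in isolation, here is a small sketch that runs the split with and without `\b`. It assumes the two-entry sample from the question as input; the names `withoutBoundary` and `withBoundary` are made up for this comparison.

```javascript
// Sample input taken from the question (two entries separated by a blank line).
const journals = "9:30 pm\nThe user did action A.\n\n10:30 pm\nWelcome, user John Doe.";

// Without \b the lookahead also succeeds between "1" and "0",
// so "10:30" is cut into "1" and "0:30 ...".
const withoutBoundary = journals.split(/(?=\d+:)/);

// With \b the lookahead can only succeed where the run of digits begins.
// \s* consumes the whitespace immediately before each timestamp.
const withBoundary = journals.split(/\s*(?=\b\d+:)/);

console.log(withoutBoundary);
// [ '9:30 pm\nThe user did action A.\n\n', '1', '0:30 pm\nWelcome, user John Doe.' ]
console.log(withBoundary);
// [ '9:30 pm\nThe user did action A.', '10:30 pm\nWelcome, user John Doe.' ]
```

A lookahead is zero-width, so the timestamp itself is never consumed and stays at the start of each piece.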
Andy E
Perfect, thanks so much!!
Rohan