I'm creating a calendar where users can enter an event and its time on a single line, for example:

"6pm supper" - event with start time only

"8:00 - 16:00 work" - event with time period

The regex I'm currently using to extract times:

[\d]{1,2}[.|:]?[\d]{0,2}[\s]?[am|pm|AM|PM]{0,2}

It works fine, but I can't figure out how to filter out unwanted occurrences of a time when they happen, for example:

"6pm supper at '8pm' restaurant"

In this example '8pm' is the name of a restaurant, but it will be interpreted as the end of a time period, which it is not. I suppose I need a regex that matches a time pattern only at the beginning of the line, plus a second time pattern only if it follows directly with no words in between, but I have had no success composing such a regex so far.

Any suggestions?

A: 

Would ^[\d]{1,2}[.|:]?[\d]{0,2}[\s]?[am|pm|AM|PM]{0,2} fix the problem of matching the '8pm' in your example?

The ^ matches the start of a line. $ matches the end of a line (in case you need that later).
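To illustrate the effect of the anchor, here is a minimal sketch in Python (chosen only for illustration; the thread itself targets PHP or Perl). The pattern is copied unchanged from above, and `leading_time` is a hypothetical helper name. Note that the pattern's optional whitespace can leave a trailing space on the match.

```python
import re

# The pattern from this answer, unchanged, with the leading ^ anchor.
TIME_AT_START = re.compile(r"^[\d]{1,2}[.|:]?[\d]{0,2}[\s]?[am|pm|AM|PM]{0,2}")

def leading_time(line):
    """Return the time found at the very start of the line, or None."""
    m = TIME_AT_START.search(line)
    return m.group(0) if m else None

# Only a time at the start is picked up; the quoted '8pm' is never reached.
print(leading_time("6pm supper at '8pm' restaurant"))
# A line that does not start with a time gives no match at all.
print(leading_time("supper at 8pm"))
```

Because the anchor allows at most one match per line, this finds only the start time; an end time such as the one in "8:00 - 16:00 work" still needs separate handling.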

UPDATE:

This one's a bit ugly but it seems to work:

[^'"][\d]{1,2}[.|:]?[\d]{0,2}[\s]?[am|pm|AM|PM]{0,2}[^'"]|^[\d]{1,2}[.|:]?[\d]{0,2}[\s]?[am|pm|AM|PM]{0,2}

The first option ensures that if a time appears in the middle of a string, it can't be surrounded by any kind of quote character. The second option allows for times that are at the start of a string. This is ugly looking and can probably be improved somewhat... but it seems to work for me.

UPDATE:

I think this version's a little easier to read:

([^'"]|^)[\d]{1,2}[.|:]?[\d]{0,2}[\s]?[am|pm|AM|PM]{0,2}[^'"]

FrustratedWithFormsDesigner
It will find the first occurrence of time only. I need two occurrences for events with time period.
Anton
@Anton: I'm confused: you say you want both times, but in your example, you only want the first one. Why should the second time ("8pm") be excluded in your example?
FrustratedWithFormsDesigner
I'm sorry for not making this clear: '8pm' in the last example is the name of a restaurant, not an event-related time. I just wanted it to look like a real-world example to illustrate my problem.
Anton
@Anton: Ah, those sneaky restaurateurs and their oh-so-clever names! In this example, it looks like you want to exclude a time that is surrounded by quotes. So maybe that needs to be a part of the expression.
FrustratedWithFormsDesigner
The quotes are not the point here. If someone enters such a name without quotes, this regex will fail.
Anton
@Anton: Ah so really it was a problem of finding the first Time string, and ignoring the others?
FrustratedWithFormsDesigner
A: 

You could try using a lookbehind construct to select only times that are not preceded by letters other than "a", "p", and "m". Something along the lines of

(?<![letters other than apm].*)

According to http://www.regular-expressions.info/lookaround.html, not all regex implementations support this to the extent needed, though. Most do not seem to allow .* in a lookbehind.
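As one concrete illustration of that limitation, Python's standard `re` module (not one of the languages discussed here, used only as an example) refuses to compile a lookbehind containing `.*`. The sketch below is an assumption-laden illustration: `compiles` is a hypothetical helper, and the character class `[b-lnoq-z]` is just one way to spell "letters other than a, p, m".

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Variable-length lookbehind (contains .*): rejected by Python's re.
variable_width = r"(?<![b-lnoq-z].*)\d{1,2}"
# Fixed-length lookbehind (a single character): accepted.
fixed_width = r"(?<![b-lnoq-z])\d{1,2}"

print(compiles(variable_width))
print(compiles(fixed_width))
```

The fixed-width form only looks at the single character immediately before the digits, so it cannot express "no other letters anywhere earlier in the line".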

Jens
A good idea, but it's dependent on having a compatible regex dialect.
FrustratedWithFormsDesigner
The idea is great, but neither PHP nor Perl seems to support variable-length lookbehind, and they are the only languages I can use on this project. Any ideas about how this can be accomplished without lookbehind?
Anton
+1  A: 

What if you used the following regex

([\d]{1,2}[.|:]?[\d]{0,2}[\s]?[apm|APM]{0,2})( - )?([\d]{1,2}[.|:]?[\d]{0,2}[\s]?[apm|APM]{0,2})?(.*)

This would allow you to access the different sections. For example, 6pm supper at '8pm' restaurant would be:

(6pm)()()( supper at '8pm' restaurant)
 $1  $2$3 $4
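The same capture-group idea can be sketched in Python (used here only for illustration; the syntax should carry over to PCRE-style engines, though that is untested). This is a variation under stated assumptions, not the exact regex above: the line is anchored at its start, the separator and end time are grouped so that they are optional together, and the suffix is written as `(?:am|pm)?` because inside square brackets `|` is a literal character rather than alternation. `TIME`, `EVENT`, and `parse_event` are hypothetical names.

```python
import re

# One time token: 1-2 digits, optional ".mm" or ":mm", optional am/pm.
TIME = r"\d{1,2}(?:[.:]\d{2})?\s?(?:am|pm)?"

# Start time, then optionally "- end time", then the rest of the line.
EVENT = re.compile(
    r"^(" + TIME + r")(?:\s*-\s*(" + TIME + r"))?\s*(.*)$",
    re.IGNORECASE,
)

def parse_event(line):
    """Return (start, end_or_None, description), or None if no leading time."""
    m = EVENT.match(line.strip())
    if m is None:
        return None
    start, end, description = m.groups()
    return (start.strip(), end.strip() if end else None, description)

print(parse_event("6pm supper at '8pm' restaurant"))
print(parse_event("8:00 - 16:00 work"))
```

Since a second time is accepted only when it directly follows the first one after a dash, a time-like word later in the description stays in the description whether or not it is quoted.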
Nalum
Good solution, thanks.
Anton
Glad to be able to help. Not sure if that can be cleaned up, but it works.
Nalum