Is there an efficient way to search a message for substrings which might represent a time?

For example, this message:

let's meet tomorrow at 14:30 or do you prefer 2:30pm?

should return ('14:30', '2:30pm'). Finding hh:mm times can be easily achieved using a simple regex, but I'm wondering if there are existing solutions to find more than the simple cases.

+2  A: 

Here's a regex I came up with:

^((\d{1,2}:\d{2}\s?([ap]m?)?)|(\d{1,2}\s?[ap]m?))$

It matches:

2:10
14:20
10:00am
3:49p
4pm
10a

But not:

12
22:342
14:0
20rpm

As seen on rubular
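
If you'd rather check the two lists above locally than on rubular, here is a minimal sketch. It assumes Python's `re` module rather than Ruby (the pattern only uses syntax common to both), and the helper name `is_time` is made up for illustration:

```python
import re

# The pattern from the answer, copied verbatim.
TIME_RE = re.compile(r"^((\d{1,2}:\d{2}\s?([ap]m?)?)|(\d{1,2}\s?[ap]m?))$")

should_match = ["2:10", "14:20", "10:00am", "3:49p", "4pm", "10a"]
should_not_match = ["12", "22:342", "14:0", "20rpm"]


def is_time(text):
    """Return True if the whole string looks like a time (hypothetical helper)."""
    return TIME_RE.match(text) is not None


for sample in should_match + should_not_match:
    print(f"{sample!r}: {is_time(sample)}")
```

Because of the `^` and `$` anchors, this only tests a string that consists of nothing but the candidate time.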

I think it would be too difficult to make it much smarter than this. For example, in "I have 2 classes after 2 tomorrow", you can't expect a program to correctly identify which numbers can be interpreted as a time unless it understands semantics, but that's a whole different story.
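
To search inside a longer message, as the question asks, the `^`/`$` anchors have to go. A rough sketch of one way to do that in Python: the word boundaries (`\b`), the `re.IGNORECASE` flag and the name `find_times` are my assumptions, not part of the regex above. The optional space is moved inside the am/pm group so that a match never ends in a stray space.

```python
import re

# Unanchored variant of the answer's pattern (an adaptation, not the original).
FIND_TIME_RE = re.compile(
    r"\b\d{1,2}:\d{2}(?:\s?[ap]m?)?\b"  # 14:30, 2:30pm, 3:49 p
    r"|\b\d{1,2}\s?[ap]m?\b",           # 4pm, 10 a
    re.IGNORECASE,
)


def find_times(message):
    """Return every substring of message that looks like a time."""
    # No capturing groups, so findall() returns the full matches.
    return FIND_TIME_RE.findall(message)


print(find_times("let's meet tomorrow at 14:30 or do you prefer 2:30pm?"))
# ['14:30', '2:30pm']
print(find_times("I have 2 classes after 2 tomorrow"))
# []
```

Bare numbers are skipped, so the "2 classes after 2" case returns nothing at all. It still over-matches phrases such as "2 a day", which is the semantics problem again.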

PS: The regex also matches strings like 99:99 am. That can be fixed, but it would make the regex even more confusing, and IMO it's not worth fixing.
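
If the 99:99 case does matter, one option that keeps the regex readable is to check the ranges in code after matching. This is only a sketch: the named groups, the function name `is_valid_time` and the chosen limits (hours 0-23 without a suffix, 1-12 with one) are my assumptions.

```python
import re

# Same shapes as the answer's regex, but with named groups for post-checks.
SHAPE_RE = re.compile(
    r"^(?P<hour>\d{1,2})(?::(?P<minute>\d{2}))?\s?(?P<suffix>[ap]m?)?$"
)


def is_valid_time(text):
    """Match the shape with a regex, then range-check hours and minutes."""
    m = SHAPE_RE.match(text)
    if m is None:
        return False
    if m.group("minute") is None and m.group("suffix") is None:
        return False  # a bare number such as "12" is rejected, as before
    hour = int(m.group("hour"))
    minute = int(m.group("minute") or 0)
    if minute > 59:
        return False
    if m.group("suffix"):
        return 1 <= hour <= 12  # 12-hour clock when a/am/p/pm is present
    return hour <= 23           # 24-hour clock otherwise


print(is_valid_time("99:99 am"))  # False
print(is_valid_time("10:00am"))   # True
```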

NullUserException
a "little bit" more complex than my own regex but definitely a good start.
knittl
Simplified regex: `^\d{1,2}:\d{2}\s*(?:[ap]m?)?|\d{1,2}\s*[ap]m?$` I removed all those capturing groups. This yields a cleaner result array.
nikic
@niki You broke some of the functionality (now it matches 11:111), but I like it. I actually had something cleaner, but I figured the OP would want to match the whole thing anyway, so the groups could be ignored.
NullUserException
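
The 11:111 match comes from operator precedence: `|` binds more loosely than the anchors, so in the simplified regex `^` applies only to the first alternative and `$` only to the second. A small Python sketch of the effect; the `grouped` variant is my own suggestion for keeping both anchors without capturing groups, not something posted in the comments:

```python
import re

# Simplified regex from the comment: each anchor covers one alternative only.
simplified = re.compile(r"^\d{1,2}:\d{2}\s*(?:[ap]m?)?|\d{1,2}\s*[ap]m?$")

# Suggested fix: a non-capturing group puts both alternatives inside ^...$.
grouped = re.compile(r"^(?:\d{1,2}:\d{2}\s*(?:[ap]m?)?|\d{1,2}\s*[ap]m?)$")

print(simplified.search("11:111").group())          # 11:11
print(simplified.search("call me at 4pm").group())  # 4pm
print(grouped.search("11:111"))                     # None
print(grouped.search("10:00am").group())            # 10:00am
```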