I have a regular expression in PHP that looks for a date in the format YYYY-MM-DD.

What I have is: [\d]{4}-[\d]{2}-[\d]{2}

I'm using preg_match to test the date. The problem is that 2009-11-10 matches, but 2009-11-1033434 matches as well. It's been a while since I've done regex; how do I ensure that it stops at the correct spot? I've tried /([\d]{4}-[\d]{2}-[\d]{2}){1}/, but it returns the same result.

Any help would be greatly appreciated.

+1  A: 

You probably want to put anchors on the expression, i.e.

^[\d]{4}-[\d]{2}-[\d]{2}$

Note the caret and dollar sign.

theraccoonbear
+3  A: 

How do you expect your date to be terminated?

If an end-of-line, then a following $ should do the trick.

If by a non-digit character, then a following negative assertion (?!\d) will similarly work.
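
As a rough sketch of the second case, assuming the date can appear anywhere inside a longer string (the sample strings are made up for illustration), the negative assertion rejects a date that runs straight into more digits while still accepting one at the very end of the input:

```php
<?php
// (?!\d) asserts "no digit follows" without consuming a character,
// so it also succeeds when the date is the last thing in the string.
$pattern = '/\d{4}-\d{2}-\d{2}(?!\d)/';

var_dump(preg_match($pattern, 'due 2009-11-10.'));  // int(1)
var_dump(preg_match($pattern, '2009-11-10'));       // int(1)
var_dump(preg_match($pattern, '2009-11-1033434'));  // int(0)
```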

Brian Agnew
Using [^\d] will capture the non-digit character. Using a negative lookahead assertion would be better, as in `(?!\d)`
Kevin Ballard
Ah. Of course. Corrected. Thanks.
Brian Agnew
+7  A: 

What you need is anchors, specifically ^ and $. The former matches the beginning of the string, the latter matches the end.

The other point I would make is that the [] are unnecessary. \d retains its meaning outside of character classes.

So your regex should look like this: /^\d{4}-\d{2}-\d{2}$/.
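
A minimal sketch of the difference, assuming the input is a plain string that should contain nothing but the date (the helper name `isDateShaped` is made up for illustration):

```php
<?php
// Hypothetical helper: true when the string is shaped like YYYY-MM-DD
// from start to end. It checks the shape only, not whether the date exists.
// Note that $ still accepts a single trailing newline.
function isDateShaped(string $s): bool
{
    return preg_match('/^\d{4}-\d{2}-\d{2}$/', $s) === 1;
}

var_dump(isDateShaped('2009-11-10'));      // bool(true)
var_dump(isDateShaped('2009-11-1033434')); // bool(false)

// Without anchors, the first ten characters are enough to satisfy the pattern.
var_dump(preg_match('/\d{4}-\d{2}-\d{2}/', '2009-11-1033434')); // int(1)
```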

Kevin Ballard
A: 

You could try putting a '^' at the start and a '$' at the end of your expression:

/^[\d]{4}-[\d]{2}-[\d]{2}$/

which match the start and the end of the string respectively.

richsage
+1  A: 

[\d]{4}-[\d]{2}-[\d]{2}?

where the question mark means "non-greedy"

Gerd Klima
`\d{2}?` and `\d{2}` are identical. `{2}` already explicitly says match 2 of these atoms, so the greediness of this group has no effect.
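
A quick way to check this, as a sketch using a made-up subject string (the variable names are only for illustration), is to capture what each pattern matches:

```php
<?php
$subject = 'abcdef2009-11-1033434abcdef';

// {2} is an exact count, so the lazy marker has nothing to shorten.
preg_match('/\d{4}-\d{2}-\d{2}/', $subject, $greedy);
preg_match('/\d{4}-\d{2}-\d{2}?/', $subject, $lazy);

var_dump($greedy[0]); // string(10) "2009-11-10"
var_dump($lazy[0]);   // string(10) "2009-11-10"
```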
Kevin Ballard
@Kevin B: thx. But why would $ as EOL work in the original `2009-11-1033434` example and not the `{2}`?
Gerd Klima
Because the $ matches the end of the string. `\d{2}` matches two digits, but doesn't say that the string has to stop there. `\d{2}$` says "match two digits, then ensure that we've reached the end of the string". Well, to be more specific, ^ and $ actually match line beginning/end in a multiline context, but in a non-multiline context they match string beginning/end. There's also `\A` and `\z` (or `\Z`, which additionally allows a final newline), which match string beginning/end regardless of multiline context.
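
The anchor behaviour can be sketched in PHP like this (the sample strings are invented for illustration). One detail worth knowing is that, by default, `$` also matches just before a newline that ends the string:

```php
<?php
$withNewline = "2009-11-10\n";

// By default, $ also matches just before a newline that ends the string.
var_dump(preg_match('/^\d{4}-\d{2}-\d{2}$/', $withNewline));   // int(1)

// \A and \z match only at the very start and very end of the string.
var_dump(preg_match('/\A\d{4}-\d{2}-\d{2}\z/', $withNewline)); // int(0)

// With the m modifier, ^ and $ match at line boundaries inside the string.
$lines = "log\n2009-11-10\nend";
var_dump(preg_match('/^\d{4}-\d{2}-\d{2}$/m', $lines)); // int(1)
var_dump(preg_match('/^\d{4}-\d{2}-\d{2}$/', $lines));  // int(0)
```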
Kevin Ballard
OK, I haven't gone completely insane, only a little ;-) I know about the EOL concept and understand it in the way you described. But I thought: he obviously has a string like `abcdef2009-11-1033434abcdef`. With the regexp he matches `2009-11-1033434`. So why would the EOL (or end-of-string in this context) trigger?
Gerd Klima
+1  A: 

You probably want lookahead assertions (assuming your engine supports them; PHP's preg functions, which use PCRE, do).

Lookahead assertions (or positive assertions) allow you to say "and it should be followed by X, but X shouldn't be a part of the match". Try the following syntax:

\d{4}-\d{2}-\d{2}(?=[^0-9])

The assertion is this part

(?=[^0-9])

It's saying "after my regex, the next character must be something other than a digit".

If that doesn't get you what you want/need, post an example of your input and your PHP code that's not working. Those two items can be hugely useful in debugging these kinds of problems.

Alan Storm
A positive lookahead assertion will fail if it encounters the end of the string. Using a negative lookahead assertion like `(?!\d)` is better.
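
A small comparison, sketched with made-up inputs (the variable names are only for illustration), shows where the two assertions differ:

```php
<?php
$positive = '/\d{4}-\d{2}-\d{2}(?=[^0-9])/';
$negative = '/\d{4}-\d{2}-\d{2}(?!\d)/';

// Date at the very end of the string: there is no next character,
// so the positive assertion has nothing to match against.
var_dump(preg_match($positive, '2009-11-10')); // int(0)
var_dump(preg_match($negative, '2009-11-10')); // int(1)

// Date followed by a non-digit: both succeed.
var_dump(preg_match($positive, '2009-11-10 ')); // int(1)
var_dump(preg_match($negative, '2009-11-10 ')); // int(1)
```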
Kevin Ballard