What is the correct regular expression to validate a date like 2009-10-22 or 2009-01-01? Platform: PHP.

A: 

In .NET Regex:

\d{4}\-\d{2}\-\d{2}
Nestor
Note that in .NET (and other Unicode-aware regex engines), \d matches all Unicode Nd (decimal digit) characters (there are 370 of them), not only 0-9, unless the ECMAScript option is used.
Lucero
A: 
[0-9]{4}-[0-9]{2}-[0-9]{2}

or

\d\d\d\d-\d\d-\d\d

or

...

Simply read any introductory regex tutorial.

Michał Niklas
A: 
 ^\d{4}-\d{2}-\d{2}$

but no regular expression can prevent someone from entering "9867-39-56"

stereofrog
Yes it can ... but it will be a fairly complex regexp.
Pop Catalin
Yes, fairly complex, and quite a stupid waste of time.
stereofrog
+1  A: 

As you have to deal with accepting 2009-02-28 but not 2009-02-29, while still accepting 2008-02-29, you need more logic than I think a regex can give. (But if someone can show it, I would be impressed.)

I would try to convert it to a date and report if the conversion failed, or, if your language has a date-checking function, use that.
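
Since the question is about PHP, here is a minimal sketch of that idea using PHP's built-in checkdate(). The helper name isRealDate() is made up for illustration; the regex only checks the shape and pulls out the three parts.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the answer above).
function isRealDate(string $date): bool
{
    // Check the shape first and capture year, month and day.
    // The D modifier stops "$" from matching before a trailing newline.
    if (preg_match('/^(\d{4})-(\d{2})-(\d{2})$/D', $date, $m) !== 1) {
        return false;
    }
    // checkdate() takes month, day, year (in that order) and
    // accounts for days-per-month and leap years.
    return checkdate((int)$m[2], (int)$m[3], (int)$m[1]);
}
```

With this, isRealDate('2008-02-29') should be accepted and isRealDate('2009-02-29') rejected.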

Mark
+7  A: 

This (from regexplib.com) will match what you want, and perform checks for leap years, days-per-month etc. It's a little more tolerant of separators than you want, but that can be easily fixed. As you can see, it's rather hideous.

Alternatively (and preferably, in my opinion) you may want to simply check for figures in the correct places, and then perform the leap-year and days-per-month checks in code. A single large regexp is often hard to understand, and there's greater clarity in performing the checks in code explicitly, since you can report precisely what's wrong ("only 30 days in November") rather than a "doesn't match pattern" message, which is next to useless.
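
A rough sketch of what "checks in code" could look like in PHP. The function name describeDateError() and the exact message wording are assumptions for illustration; the point is that each failure gets its own message.

```php
<?php
// Hypothetical helper: returns null for a valid date,
// otherwise a human-readable reason.
function describeDateError(string $date): ?string
{
    // Step 1: figures in the correct places.
    if (preg_match('/^(\d{4})-(\d{2})-(\d{2})$/D', $date, $m) !== 1) {
        return 'expected the form YYYY-MM-DD';
    }
    $year  = (int)$m[1];
    $month = (int)$m[2];
    $day   = (int)$m[3];

    // Step 2: range checks, each with a specific message.
    if ($month < 1 || $month > 12) {
        return "month must be between 01 and 12, got {$m[2]}";
    }
    if ($day < 1) {
        return 'day must be at least 01';
    }

    // Gregorian leap-year rule: divisible by 4, except centuries
    // that are not divisible by 400.
    $isLeap = ($year % 4 === 0 && $year % 100 !== 0) || $year % 400 === 0;
    $daysInMonth = [1 => 31, $isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
    $monthNames = [1 => 'January', 'February', 'March', 'April', 'May', 'June',
        'July', 'August', 'September', 'October', 'November', 'December'];

    if ($day > $daysInMonth[$month]) {
        return "only {$daysInMonth[$month]} days in {$monthNames[$month]} $year";
    }
    return null;
}
```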

Brian Agnew
Part of me wants to downvote this, not because it's wrong, but because that is an abuse of regular expressions. The other part of me wants to upvote for finding a solution that meets all of the OP's requirements.
Thomas Owens
Now that's a wild regex... I'd be tempted to call it an abuse but hey, if someone is desperate to do this using a regex... Nevertheless, I'm impressed that someone worked the problem through.
Lazarus
I see your dilemma Thomas..
Cloud
Yes. It's hideous. Note my reservations above. My approach is to give someone what they want, and hopefully demonstrate why it's *not* ideal :-)
Brian Agnew
@Brian Okay, +1 from my part
Cloud
With that second paragraph, I can +1 you and rest easily tonight. I like it.
Thomas Owens
@Cloud - thx. Following your and Thomas' comments I decided to be more explicit in my motivation :-)
Brian Agnew
+1 for the humor associated with this. I DO like your explanation paragraph - agree validation is a tricky thing and precise explanations are better than wiping the date and just saying it was bad :)
Mark Schultheiss
+1  A: 

\d{4}-\d{2}-\d{2} would match a string in that form, but to check whether the date is valid, you'd have to break that string into year, month and day parts (you can use this regexp for that) and check each of them.

You can additionally make sure that the year starts with 1 or 2: [12]\d{3}-\d{2}-\d{2}, and you can do the same for month and day: [12]\d{3}-[01]\d-[0123]\d (but I would go with the first regexp and compare the parts "manually").

krcko
+3  A: 

If you want something simple that does a little more than just validate the format, but doesn't go as far as validating how many days are in the entered month, or leap years, you can use this:

^(19|20)[0-9]{2}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$

This example allows years 19xx and 20xx
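
In PHP the pattern needs delimiters before it can be passed to preg_match(). A small sketch, assuming a wrapper named looksLikeDate() (the name and the added D modifier are mine, not part of the answer):

```php
<?php
// Hypothetical wrapper around the pattern above.
function looksLikeDate(string $date): bool
{
    // Same pattern, wrapped in "/" delimiters. The D modifier makes "$"
    // match only at the very end, not before a trailing newline.
    $pattern = '/^(19|20)[0-9]{2}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/D';
    return preg_match($pattern, $date) === 1;
}
```

As stated, this only checks ranges: looksLikeDate('2009-02-31') still passes, while looksLikeDate('3141-01-01') and looksLikeDate('2009-13-01') do not.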

reko_t
+1 for not trying to accept 3141 as a valid year. I think the chances are nearly zero that any code written today will be used 100 years from now, let alone in 1000 years.
Dan
A: 

For a complete validation (which would include verifying that the day, month and year parts are valid) a Regex is not the tool of choice. Apart from month issues you'd get into trouble with leap years...

So, if you just want to check if the rough format is correct, or isolate the different parts (year-month-day), a regex is fine.

([0-9]{1,4})-(1[012]|0?[1-9])-([12][0-9]|3[01]|0?[1-9])

This is already pretty exact and captures the year (0..9999), month and day into capture groups, ready for parsing...
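
A short PHP sketch of reading those capture groups. The function name splitDate() is hypothetical, and the ^...$ anchors are an addition so that the whole string has to match:

```php
<?php
// Hypothetical helper: returns the parts as integers, or null if the
// string does not have the expected rough format.
function splitDate(string $date): ?array
{
    // The pattern from above, with anchors and "/" delimiters added.
    $pattern = '/^([0-9]{1,4})-(1[012]|0?[1-9])-([12][0-9]|3[01]|0?[1-9])$/D';
    if (preg_match($pattern, $date, $m) !== 1) {
        return null;
    }
    // $m[1], $m[2], $m[3] hold the year, month and day groups.
    return ['year' => (int)$m[1], 'month' => (int)$m[2], 'day' => (int)$m[3]];
}
```

The returned parts can then be handed to a real date check, since the regex alone still accepts things like 2009-02-31.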

Lucero
+1  A: 

Found this on the web; I tested it with a few dates and it looks stable for dates between 1900 and 2000:

(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])
Luke Duddridge
The year range is 1900-2099 :-)
Hans Kesting
Thanks for the correction :)
Luke Duddridge
+1  A: 

OK, a regex that will validate month and day ranges could be

[0-9]{4}-(?:1[0-2]|0?[1-9])-(?:3[01]|[12][0-9]|0?[1-9])

If you want to restrict the years, say, from 1900 to 2050, you could end up with

(?:2050|20[0-4][0-9]|19[0-9]{2})-(?:1[0-2]|0?[1-9])-(?:3[01]|[12][0-9]|0?[1-9])

They will not catch "subtly wrong" dates like February 31st, so it's really quite clear that a sanity check needs to be performed outside of the regex.

Tim Pietzcker
A: 

If you can rely on more than a regular expression, a hybrid solution using PHP's built-in date() and strtotime() functions could look like this:

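<?php
date_default_timezone_set("GMT");

function validateDate($date)
{
    // Check the shape first: exactly YYYY-MM-DD, nothing before or after.
    if (preg_match('/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/D', $date))
    {
        // Round-trip through strtotime(): an impossible date either fails
        // to parse or is normalised to another day, so it no longer
        // equals the input.
        return date('Y-m-d', strtotime($date)) === $date;
    }
    return false;
}

// Some tests
$dates = array(
    '2009-09-09', '2009-09-32', '2009-31-00', '2035-01-02',
);
foreach ($dates as $date)
{
    echo $date .': '. (validateDate($date) ? 'OK' : 'FAILS') ."\n";
}
?>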

It's not elegant, and on 32-bit builds of PHP you'll be limited by the range of Unix timestamps (which ends at January 19 2038 03:14:07 GMT), but it's reliable and well supported in PHP.
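
If the timestamp range is a concern, the same round-trip idea can be sketched with PHP's DateTime class, which as far as I know is not tied to 32-bit timestamps. This is a swapped-in variant, not the code above, and the function name validateDateWithDateTime() is made up for illustration:

```php
<?php
// Hypothetical variant of the round-trip check using DateTime.
function validateDateWithDateTime(string $date): bool
{
    // Same shape check as before.
    if (preg_match('/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/D', $date) !== 1) {
        return false;
    }
    // The leading "!" resets all fields not in the format (the time of day)
    // to zero instead of filling them from the current time.
    $parsed = DateTime::createFromFormat('!Y-m-d', $date, new DateTimeZone('UTC'));

    // An impossible date either fails to parse or rolls over to another
    // day, so formatting it back no longer reproduces the input.
    return $parsed !== false && $parsed->format('Y-m-d') === $date;
}
```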

JP