I can't quite get my head around what the ^ is doing in my preg_match.

 if (preg_match('~^(\d\d\d\d)(-(\d{1,2})(-(\d{1,2}))?)?$~', trim($date), $dateParts)) {
   // Matched: show the captured parts and do some magic with them
   print_r($dateParts);
 } else {
   // No match: report that the date is formatted wrong
   echo 'The date is formatted wrong';
 }

As I see it, this checks whether $date matches the format, which I read as 4 decimals - 1 or 2 decimals - 1 or 2 decimals.

If it does match, the if statement displays the date; if it doesn't, it gives an error about incorrect date formatting.

However, passing it just the year ($date = '1977', with no month or day) still evaluates to true and displays the date parts. I would have thought it would throw an error?

Can someone point out what I'm missing in the regular expression? I'm guessing it's the ^, or possibly the ?$ at the end means it only matches part of it?

+2  A: 

Try this:

'~^(\d\d\d\d)-(\d{1,2})-(\d{1,2})$~'

The problem was that the regex made the month and day optional: each '?' after a group means "zero or one of this group".
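
As a rough illustration of the stricter pattern, the sketch below runs it against two sample inputs. The inputs and the `$pattern` variable are assumptions made for the example; only the pattern itself comes from the answer.

```php
<?php
// Strict pattern: year, month and day are all required.
$pattern = '~^(\d\d\d\d)-(\d{1,2})-(\d{1,2})$~';

// Hypothetical sample inputs, chosen only for illustration.
foreach (['1977-05-12', '1977'] as $date) {
    if (preg_match($pattern, trim($date), $dateParts)) {
        // $dateParts[0] holds the whole match; [1], [2], [3] hold year, month, day.
        echo "year={$dateParts[1]} month={$dateParts[2]} day={$dateParts[3]}", PHP_EOL;
    } else {
        echo "'$date' is formatted wrong", PHP_EOL;
    }
}
```

With this pattern the full date is accepted and the bare year '1977' falls through to the else branch.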

karim79
Thanks for the quick and clear response.
Paul M
+2  A: 

^ and $ anchor your pattern to the beginning and end respectively of the string passed in. The ? is a quantifier, matching 0 or 1 of the preceding pattern (in this case, the parenthesised bit).

Your pattern matches a year, or a year and a month, or a year and a month and a date; if you follow the parentheses, you'll see the final ? is operating on the parens surrounding the whole of the pattern after the year.

^    # beginning of string
    (\d\d\d\d)   #year
    (
        -(\d{1,2})   #month after a dash
        (
            -(\d{1,2}) #date after a dash
        )? #date optional
    )?   # month and date optional
$   # end of string
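
PCRE's x (extended) modifier allows this annotated layout to be used almost verbatim in PHP: whitespace inside the pattern is ignored and # starts a comment. The following is a minimal sketch; the `$pattern` variable and the sample inputs are assumptions made for illustration.

```php
<?php
// The pattern from the question, written with the x modifier so that
// whitespace is ignored and "#" starts a comment inside the pattern.
$pattern = '~
    ^                       # beginning of string
    (\d\d\d\d)              # year
    (
        -(\d{1,2})          # month after a dash
        (
            -(\d{1,2})      # day after a dash
        )?                  # day optional
    )?                      # month and day optional
    $                       # end of string
~x';

// Hypothetical sample inputs, not taken from the original post.
foreach (['1977', '1977-5', '1977-5-12', '77-5-12'] as $date) {
    $ok = preg_match($pattern, trim($date), $dateParts) === 1;
    echo $date, ' => ', $ok ? 'match' : 'no match', PHP_EOL;
}
```

The first three inputs match because both optional groups may be absent; only '77-5-12' fails, since the year needs four digits.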
Jeremy Smyth
Thanks for breaking it down, also helped me a lot.
Paul M
+3  A: 

There is no need to group absolutely everything. This looks nicer and matches exactly the same strings (note that it captures fewer groups, so the contents of $dateParts change):

preg_match('~^\d{4}(-\d{1,2}(-\d{1,2})?)?$~', trim($date), $dateParts)

This also explains why "1977" is accepted - the month and day parts are both optional (the question mark makes something optional).

To do what you say ("4 decimals - 1 or 2 decimals - 1 or 2 decimals"), you need to remove both the optional groups:

preg_match('~^\d{4}-\d{1,2}-\d{1,2}$~', trim($date), $dateParts)

The "^" and "$" have nothing to do with the issue you are seeing. They are just start-of-string and end-of-string anchors, making sure that nothing other than what the pattern describes is in the checked string. Leave them off and "blah 1977-01-01 blah" will start to match.
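
A small sketch of the effect of the anchors, using the strict pattern with and without them. The variable names are assumptions made for the example.

```php
<?php
// Sample input with extra text around the date.
$input = 'blah 1977-01-01 blah';

$anchored   = '~^\d{4}-\d{1,2}-\d{1,2}$~';
$unanchored = '~\d{4}-\d{1,2}-\d{1,2}~';

var_dump(preg_match($anchored, $input));    // int(0): the surrounding text is rejected
var_dump(preg_match($unanchored, $input));  // int(1): the date is found inside the string
var_dump(preg_match($anchored, '1977'));    // int(0): month and day are required here
```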

Tomalak
This answers my question well and points to the specific problem I was having.
Paul M
You just scraped in with being awarded the right answer; there were 4 very good other responses, but I felt yours helped me the most in what I wanted to do. I've never had such a good response, and not being sure who to award the 'winning' answer to is a nice problem to have :)
Paul M
+2  A: 

Ok, let's break this up for you:

  • '~^(\d\d\d\d)(-(\d{1,2})(-(\d{1,2}))?)?$~'
  • ~ - in the beginning and the end are RegExp-delimiters, so they are not really part of the regular expression.
  • ^ - Means "This is the beginning of the string"
    • Avoids matches in the middle of the string, and anchors it so that the start of the string must match
  • (\d\d\d\d) - Matches (and captures) four digits, and is not optional
    • This could also be written as \d{4}
  • (-(\d{1,2})(-(\d{1,2}))?)? - Matches (and captures) an optional group.
    • It says that if this group exists, it must be a dash followed by one or two digits (day or month), optionally followed by another dash and one or two digits (day or month). The inner group has its own ?, so the second dash-and-digits part may be left out.
  • $ - Means end of string, so this, together with ^ in the beginning of the string means that the whole string must match the Regexp.

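Some examples of what this Regex will match:

  • 1982-08-11
  • 1982-30-01
  • 8127-99-52
  • 2009-10 (the last part is optional)
  • 1977 (both parts after the year are optional)

Some examples that will NOT match:

  • 82-08-11
  • 2009-10- (a trailing dash with no digits after it)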

As you can see, this regex will accept some "dates" that are not really valid dates, so I would probably run it through some sort of date-handling function too, such as strtotime.
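
One way to sketch that extra check is shown below. Note that it uses PHP's built-in checkdate() rather than strtotime, and it assumes a full year-month-day string is required; the helper name `isRealDate` is hypothetical.

```php
<?php
// Hypothetical helper: true only for a YYYY-M-D string that is a real calendar date.
function isRealDate(string $date): bool
{
    // Assumption: all three parts are required, so the strict pattern is used.
    if (!preg_match('~^(\d{4})-(\d{1,2})-(\d{1,2})$~', trim($date), $m)) {
        return false; // wrong shape
    }
    // checkdate() takes month, day, year (in that order) as integers.
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]);
}

var_dump(isRealDate('1982-08-11')); // bool(true)
var_dump(isRealDate('1982-30-01')); // bool(false): there is no month 30
var_dump(isRealDate('8127-99-52')); // bool(false)
```

The regex only checks the shape of the string; checkdate() then rejects values such as month 30 or day 52.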

PatrikAkerstrand
Thanks for the walkthrough, it clarified some of the issues I was having.
Paul M