I can't believe I've never come across this one before.

Basically, I'm parsing the text in human-created text documents and one of the fields I need to parse is a date and time. Because I'm in Australia, dates are formatted like dd/mm/yyyy but strtotime only wants to parse it as a US formatted date. Also, exploding by / isn't going to work because, as I mentioned, these documents are hand-typed and some of them take the form of d M yy.

I've tried multiple combinations of setlocale but no matter what I try, the language is always set to US English.

I'm fairly sure setlocale is the key here, but I don't seem to be able to strike upon the right code. Tried these:

  • au
  • au-en
  • en_AU
  • australia
  • aus

Anything else I can try?

For clarity: I'm running on IIS with a Windows box.

Thanks so much :)

Iain

Example:

$mydatetime = strtotime("9/02/10 2.00PM");
echo date('j F Y H:i', $mydatetime);

Produces

2 September 2010 14:00

I want it to produce:

9 February 2010 14:00

My solution

I'm giving the tick to one of the answers here as it is a much easier-to-read solution than mine, but here's what I've come up with:

$DateTime = "9/02/10 2.00PM";
$USDateTime = preg_replace('%([0-3]?[0-9]{1})\s*?[\./ ]\s*?((?:1[0-2])|0?[0-9])\s*?[./ ]\s*?(\d{4}|\d{2})%', '${2}/${1}/${3}', $DateTime);  
echo date('j F Y H:i',strtotime($USDateTime));

Because I can't rely on users to be consistent with their date entry, I've made my regex a bit more complex:

  • 0 or 1 digit between 0 and 3
  • 1 digit between 0 and 9 -- yes this will match 37 as a valid date but I think the regex is already big enough!
  • Could be some whitespace
  • Delimiting character (a '.', a '/' or a ' ')
  • Could be some whitespace
  • Either:
    • A number between 10 and 12 OR
    • A number between 1 and 9 with an optional leading 0
  • Could be some whitespace
  • Delimiting character (a '.', a '/' or a ' ')
  • Could be some whitespace
  • Either:
    • A number 2 digits long OR
    • A number 4 digits long

Hopefully this will match most styles of date writing...
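As a rough sketch of how the pattern described above behaves, it can be wrapped in a small function and run against a few hand-typed variants. The function name `auToUsDate` and the sample inputs are assumptions for illustration only; the pattern and replacement string are the ones from the snippet above.

```php
<?php
// Hypothetical wrapper (the name auToUsDate is an assumption, not part of the original code).
// It swaps the day and month so that strtotime() reads the result as a US-style date.
function auToUsDate($dateTime) {
    $pattern = '%([0-3]?[0-9]{1})\s*?[\./ ]\s*?((?:1[0-2])|0?[0-9])\s*?[./ ]\s*?(\d{4}|\d{2})%';
    // ${1} = day, ${2} = month, ${3} = year; the delimiter is normalised to "/"
    return preg_replace($pattern, '${2}/${1}/${3}', $dateTime);
}

echo auToUsDate("9/02/10 2.00PM"), "\n";  // 02/9/10 2.00PM
echo auToUsDate("09 / 02 / 2010"), "\n";  // 02/09/2010
echo auToUsDate("9.2.2010"), "\n";        // 2/9/2010
```

Anything after the date, such as the time, is left untouched, so the converted string can be passed straight to `strtotime()` as in the snippet above.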

+2  A: 

I think a good place to start would be here.

Paulo Santos
Given you the big tick of approval because I got the idea for my solution here
Iain Fraser
+1  A: 

setlocale() sucks for exactly the reason you describe: You never know what you're going to get. Best to process the string manually.

Zend Framework's Zend_Date is one alternative promising more exact and consistent date handling. I don't have experience with it myself yet, just beginning to work with it, but so far, I like it.

Pekka
Yeah, that's kind of what I did in the end. Well, actually I just reformatted the string so UK dates become US dates and so far it has worked on all my test cases - yay!
Iain Fraser
+2  A: 

The problem is that strtotime doesn't take a format argument. What about strptime?

kiwicptn
Dang, would have loved to have tried this but I'm in a WIMP environment :(
Iain Fraser
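Since `strptime` is not implemented on Windows, one alternative that does take a format argument is `DateTime::createFromFormat` (PHP 5.3 and later). This is a different technique from the one suggested in the answer, shown only as a minimal sketch: the helper name `parseAuDateTime` and the list of formats are assumptions about what the hand-typed documents might contain.

```php
<?php
// Hypothetical helper; the format list is an assumption about likely inputs.
// Two-digit-year formats come first because "Y" would also accept "10" as the year 10.
function parseAuDateTime($str) {
    $formats = array('!j/n/y g.iA', '!j/n/Y g.iA', '!j/n/y', '!j/n/Y');
    foreach ($formats as $format) {
        // The leading "!" resets unparsed fields instead of using the current time
        $dt = DateTime::createFromFormat($format, trim($str));
        $errors = DateTime::getLastErrors(); // false on PHP 8.2+ when there is nothing to report
        $clean = !$errors || ($errors['warning_count'] === 0 && $errors['error_count'] === 0);
        if ($dt !== false && $clean) {
            return $dt;
        }
    }
    return null; // no format matched, or the date overflowed (e.g. day 37)
}

$dt = parseAuDateTime("9/02/10 2.00PM");
echo $dt->format('j F Y H:i'), "\n"; // 9 February 2010 14:00
```

Checking `getLastErrors()` is a design choice here: without it, an out-of-range day would silently roll over into the next month.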
A: 

Ah, the old problem us lucky Australians get.

What I've done in the past is something like this

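// Helper method (lives inside a class); expects dd/mm/yyyy, e.g. 3/12/2008
public static function getTime($str) {

       // A single preg_match() is enough: the groups are day, month, year
       if (!preg_match('/^(\d{1,2})\/(\d{1,2})\/(\d{4})$/', $str, $matches)) {
           return NULL;
       }

       // Rebuild as ISO yyyy-mm-dd, which strtotime() reads unambiguously
       return strtotime($matches[3] . '-' . $matches[2] . '-' . $matches[1]);

    }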

Though this relies on the dates being in the format dd/mm/yyyy.

You can probably use another regex or so to convert from d M yy or use a modified one. I don't know if this would be correct but it may be a start:

/^(\d{1,2})(?:\/|\s)(\d{1,2})(?:\/|\s)(\d{2,4})$/
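To make the behaviour of that looser pattern concrete, here is a minimal sketch built around it. The function name `auDateToTime`, the rejection of 3-digit years, and the rule that a 2-digit year means 20xx are all assumptions added for illustration. Note that the pattern only accepts numeric months, so a month written as a name (such as "Feb") would not match.

```php
<?php
// Hypothetical helper built on the pattern above; returns a Unix timestamp or null.
function auDateToTime($str) {
    // "/" or a single whitespace character is accepted as the delimiter
    if (!preg_match('/^(\d{1,2})(?:\/|\s)(\d{1,2})(?:\/|\s)(\d{2,4})$/', trim($str), $m)) {
        return null;
    }
    // \d{2,4} also lets 3-digit years through; rejecting them is an assumption
    if (strlen($m[3]) === 3) {
        return null;
    }
    $day = (int) $m[1];
    $month = (int) $m[2];
    $year = (int) $m[3];
    if (strlen($m[3]) === 2) {
        $year += 2000; // assumption: two-digit years belong to the 2000s
    }
    // checkdate() rejects impossible dates such as day 37
    if (!checkdate($month, $day, $year)) {
        return null;
    }
    return mktime(0, 0, 0, $month, $day, $year);
}

echo date('j F Y', auDateToTime('3/12/2008')), "\n"; // 3 December 2008
```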

alex
Thanks for the effort, mate. Not sure why you got down-voted though. I guess my only criticism of this method is that it's very specific given the nature of possible inputs - but that's my problem; I'm sure this is some generic code you have lying around in your toolbox for problems with more consistent inputs. My answer is similar to yours in that I try to match using regex, but rather than return a formatted match, I just convert my Australian string into US format.
Iain Fraser
Also, to the person that downvoted... I'd be interested to know why? Not trolling, I'd just be interested to know - I might learn something :)
Iain Fraser
Yeah, it does rely on one format, but it came from a helper function I wrote that is only ever given that one format. In the example above, I'm not returning a formatted match, but a Unix time from the date once passed as an ISO standard date (the same format MySQL uses, and one PHP can easily use in `strtotime()`). I'm not sure about the downvote; maybe it's because I used a regex when some people would have preferred `explode()`.
alex
Which is silly because when you're dealing with user input, you can't rely on there being a consistent delimiter to `explode` on... hence, regex!
Iain Fraser