In my application, I have some records which I need to filter based on some time parameters. To do the date comparisons, I need to convert the dates in my records (strings in the format YYYY-MM-DD) to Unix timestamps (seconds since 1970). Since I have many thousands of records, I really want to find the most efficient way to do it.

Answers to another question suggested doing it in the database (not an option here, sadly), using strtotime() or strptime(), but these don't feel like the most memory- and time-efficient methods, you know?

Given that I know the exact format of my input, would perhaps using some form of string manipulation (substr, or explode) combined with mktime be better?

If you believe I'm being premature in my optimisation, then just humour me ok?

A: 

As the string is nicely structured, I'd build the timestamp by walking the string from left to right. You can pre-build tables of years -> seconds, months -> seconds and days -> seconds, and correct for leap years. Multiplication is evil :)

Or with less effort, build a hashtable to eliminate duplicates.

But you'll need a lot of records for this to be faster. And it sounds like you are using a database. That's incompatible with fast...
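
A minimal sketch of the table idea, assuming PHP 7+ (for `intdiv`) and timestamps at midnight UTC. The names `ymdToUtcTimestamp` and `DAYS_BEFORE_MONTH` are made up for illustration. It uses a single month table plus a little arithmetic rather than fully pre-built year/month/day tables, so treat it as a simplified variant, not a tuned implementation:

```php
<?php
// Hypothetical lookup table: days elapsed before each month in a
// non-leap year (index 1..12; index 0 is unused padding).
const DAYS_BEFORE_MONTH = [0, 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334];

// Hypothetical helper: converts "YYYY-MM-DD" to seconds since
// 1970-01-01 00:00:00 UTC. Assumes a well-formed date; no validation.
function ymdToUtcTimestamp(string $date): int
{
    $y = (int) substr($date, 0, 4);
    $m = (int) substr($date, 5, 2);
    $d = (int) substr($date, 8, 2);

    // Leap days between 1970-01-01 and January 1 of year $y.
    // 477 is the number of leap years in the range 1..1969.
    $py = $y - 1;
    $leapDays = intdiv($py, 4) - intdiv($py, 100) + intdiv($py, 400) - 477;

    $days = ($y - 1970) * 365 + $leapDays + DAYS_BEFORE_MONTH[$m] + ($d - 1);

    // Add the leap day of the current year once February has passed.
    $isLeap = ($y % 4 === 0 && $y % 100 !== 0) || $y % 400 === 0;
    if ($isLeap && $m > 2) {
        $days++;
    }

    return $days * 86400;
}

echo ymdToUtcTimestamp('2009-03-31'), "\n"; // 1238457600
```

Note that this always yields UTC midnight, whereas mktime() and strtotime() interpret the date in the script's default timezone, so the results only match when that timezone is UTC.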

Stephan Eggermont
+1  A: 

It seems to me that if you really want the most efficient way to do it, you will have to do it all the ways that you have suggested, plus any others you find, do some benchmarking, and select the winner.

Eli
+5  A: 

Why do you think strtotime() is not an efficient function? Have you done any benchmarks? Keep in mind that strtotime() is a built-in library function which is written directly in C, whereas anything you come up with will run through the PHP interpreter.

Also keep in mind that date calculations are inherently tricky, due to issues like leap years and timezones.

As for an alternate, faster method, I think using substr() to parse will be faster than a regex. Feed the results into mktime() and you're set.
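
To illustrate the timezone point: mktime() interprets its arguments in the script's default timezone, so the same date string maps to different timestamps depending on that setting. A small sketch, where `midnightTimestampIn` is a hypothetical helper name and the two timezone identifiers are just examples:

```php
<?php
// Hypothetical helper: timestamp of midnight on a "YYYY-MM-DD" date,
// interpreted in the given timezone.
function midnightTimestampIn(string $date, string $timezone): int
{
    $previous = date_default_timezone_get();
    date_default_timezone_set($timezone);   // mktime() uses the default timezone

    $y = (int) substr($date, 0, 4);
    $m = (int) substr($date, 5, 2);
    $d = (int) substr($date, 8, 2);
    $timestamp = mktime(0, 0, 0, $m, $d, $y);

    date_default_timezone_set($previous);   // restore the caller's setting
    return $timestamp;
}

$utc    = midnightTimestampIn('2009-03-31', 'UTC');
$sydney = midnightTimestampIn('2009-03-31', 'Australia/Sydney');

var_dump($utc === $sydney); // bool(false)
```

If all records are compared against timestamps produced the same way, the offset cancels out; it matters once timestamps from different sources or servers are mixed.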

Are all built-in functions written in C? Can anyone confirm?
alex
@alex - Pretty much. I just downloaded the 5.2.9 tarball and took a peek inside there. All of the PHP files in it seemed to be either tests or build scripts.
Sean McSomething
+2  A: 

strtotime is way (and I mean WAAAY) faster than anything you can come up with.

How likely is it that dates are repeated in your records? If they aren't completely unique, you can do some sort of caching. Assuming you're in a class:

protected $_timestamps = array();

public function toTimestamp($date)
{
    if (!array_key_exists($date, $this->_timestamps)) {
        $this->_timestamps[$date] = strtotime($date);
    }
    return $this->_timestamps[$date];
}

Now you just have to do:

foreach ($records as $record) {
    $timestamp = $this->toTimestamp($record->date);
}
Luiz Damim
+1  A: 

Ok, I've benchmarked 4 different methods:

function _strtotime($date) {
    return strtotime($date);
}

function _preg($date) {
    preg_match("@^(\\d+)-(\\d\\d)-(\\d\\d)$@", $date, $matches);
    return mktime(0, 0, 0, $matches[2], $matches[3], $matches[1]);
}

function _explode($date) {
    list($y,$m,$d) = explode("-", $date);
    return mktime(0, 0, 0, $m, $d, $y);
}

function _substr($date) {
    $y = substr($date, 0, 4);
    $m = substr($date, 5, 2);
    $d = substr($date, 8);
    return mktime(0, 0, 0, $m, $d, $y);
}

// I also added this one as sort of a "control"
function _hardCoded() {
    return mktime(0, 0, 0, 3, 31, 2009);
}

And here are the results of running each function 100,000 times with the input "2009-03-31":

_hardCoded: 2.02547
_strtotime: 2.35341
_preg:      2.70448
_explode:   2.31173
_substr:    2.27883
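
The timing loop itself isn't shown above. A minimal sketch of what such a harness might look like, assuming wall-clock timing via microtime(); the `timeCalls` name and its signature are hypothetical and not necessarily how these numbers were produced:

```php
<?php
// Hypothetical harness: calls $fn with $input $iterations times and
// returns the elapsed wall-clock time in seconds.
function timeCalls(callable $fn, string $input, int $iterations = 100000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

// Any callable taking the date string can be timed the same way.
printf("strtotime: %.5f\n", timeCalls('strtotime', '2009-03-31'));
```

Absolute numbers will vary with the PHP version and machine, so only the relative ordering on the same setup is meaningful.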
nickf
Is that small benefit going to warrant you using the substr() over the strtotime() function?
alex
nope! I think I'll definitely go with strtotime in the future. The only problem I have with it is that it assumes the US style MM/DD/YYYY format for dates which look like that.
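
For slash-separated dates that are actually DD/MM/YYYY, one way around that guess is to state the format explicitly with DateTime::createFromFormat() (PHP 5.3+). A small sketch, where `dmyToTimestamp` is a hypothetical helper name and UTC is an assumed timezone:

```php
<?php
// Hypothetical helper: parses "DD/MM/YYYY" explicitly and returns the
// timestamp of midnight UTC on that date, or null if parsing fails.
function dmyToTimestamp(string $date): ?int
{
    // The leading "!" resets fields not present in the format (the time
    // of day) to zero instead of filling them in from the current time.
    $dt = DateTime::createFromFormat('!d/m/Y', $date, new DateTimeZone('UTC'));

    return $dt === false ? null : $dt->getTimestamp();
}

echo dmyToTimestamp('31/03/2009'), "\n"; // 1238457600
```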
nickf
Ah, to be an Australian developer! :P
alex