views:

282

answers:

6

I was reading over an article that shows some really good information and benchmarks about how well the three different MySQL date/time storage options perform.

MySQL DATETIME vs TIMESTAMP vs INT performance and benchmarking with MyISAM

While reading the article you start to get the idea that using ints are just a waste and you should instead go with MySQL Datetime or Timestamp column types.

However, towards the end of the article he does one more test not using MySQL functions and you suddenly see that straight INT's are 2x as fast as the two MySQL options when searching by unix timestamps.

So it suddenly dawned on me - duh, what do PHP apps all use? time()! Almost every php application bases their logic off of the Unix Epoch. Which means that most queries for results in a certain time start off based on time() and then are converted to work with MySQL's fields.

This leaves me with the following:

  1. Unix Timestamps stored as INT's are faster, take less space, and work natively with PHP's time() based calculations.

  2. MySQL Date types are more suited to operations and logic from the MySQL side.

  3. For the time being both Unix And MySQL Timestamps only work until 2037 which means that you must use a datetime field for larger dates in the future.

  4. MySQL commands like date = NOW() can lag when using replication causing data inconsistencies.

So applying this to real life we see that answer that these results given that most really DBA's would use a better engine like PostgreSQL - is there arny

However, most apps that would be to the level of using DB logic would probably go with PostgreSQL. Which means that all the rest of us programmers only use MySQL for a storage tank for our data (you know it's true) which makes keeping the fields as small, fast, UNIX INT's seem like it is actually the best option.

So what do you guys think?

Are timestamps really more suited to PHP apps than the MySQL date fields?

A: 

I always prefer to store dates in mySQL format as it makes comparisons simpler in your queries. mySQL has some great date formatting options too: http://www.dan.co.uk/mysql-date-format/

Sorry, I should add, I really don't know about which is more efficient speed-wise which was an important part of your question.

direct00
+1  A: 

I like to keep all of the logic in a single, high-level domain (that being the app written in php). MySQL is a storage tank--as it should stay. I prefer to use a class such as http://www.symfony-project.org/plugins/sfDateTime2Plugin and then ->dump() or ->get() to the appropriate format anyways. It's much faster and easier to write (and extend) high level manipulations in the application domain than it is using the static mysql interface.

PostgreSQL's interface cleans up on MySQL. But we're still talking about MySQL here because it's popular. Which brings up an important consideration. When writing code or designing systems, often it makes sense to observe the convention, even if it is less computationally efficient than other less known options. This is important because it favours a different kind of efficiency--readability for others. Often readability and understandability inefficiencies account for larger business expenses (and time) than do computational inefficiencies.

I'm all for trying INTs though. Please give it a shot and write about your findings.

Cheers

Homer6
+5  A: 

MySQL's date format has no year 2038 problem.

MySQL's dates are reliable from the year 1000 to the year 9999, whereas Unix timestamps can screw up after 2038 or before 1902 unless everything in your system is 64-bit.

If you're using PHP, however, this can be moot: PHP uses unix timestamps for dates and times throughout most of its date and time functions and unless you are using a 64-bit build it will have the same limitation.

You'd be using the field type that was intended for this purpose.

If you care. Putting date into an INT field as a unix timestamp is not as self-describing; you can't look at the data without converting it in the appropriate way. But that may make no difference to you.

The flip side of this, given that you're using PHP, is that once you get the time into PHP you'd have to convert it back to a Unix timestamp anyway to do anything useful with it, because to PHP, Unix timestamps are native.

thomasrutter
Modern PHP's [DateTime](http://us2.php.net/manual/en/book.datetime.php) class internally uses a [64-bit timestamp](http://us2.php.net/manual/en/intro.datetime.php) and does not suffer from the Y2038 bug. The constructor uses [strtotime](http://us2.php.net/manual/en/function.strtotime.php) to parse the passed timestamp, and understands MySQL's datetime format without any coercion. New, modern PHP applications should be using DateTime instead of integer timestamps.
Charles
Now, that comment deserves to be an answer. I really need to move to the DateTime class (and others) but I just haven't done it yet.
Xeoncross
+1  A: 

Using MySQL's various time and date formats allows for queries that would be difficult using Unix timestamps.

An example would be filtering data based on a particular week (week number), or using a value in the database after adding or removing a certain timeframe from it.

MySQL has some great functions for time and date manipulation which work well with the date,datetime, and time formats.

We use PHP/MySQL for most of our sites, and we automate database to PHP object creation, the code to change from PHP to MySQL formats is very simple:

if($parameter->Type() == DatabaseType::DATETIME)
    $parameterValueArray[] = date('Y-m-d H:i:s', $parameter->Value());
elseif($parameter->Type() == DatabaseType::DATE)
    $parameterValueArray[] = date('Y-m-d', $parameter->Value());
elseif($parameter->Type() == DatabaseType::TIME)
    $parameterValueArray[] = date('H:i:s', $parameter->Value());

MySQL to PHP:

strtotime() for datetime mktime() for time and date

Farid
True, using a unix timestamp to do a week search like `date > time + week OR date < time - week` would be more trouble than `date = CURTIME(w)` (*all pseudo code guys*). However, see the extra logic you have to add when saving and retrieving values because it is no longer native to PHP?
Xeoncross
Absolutely, there is no perfect solution though.PHP works really well with unix time, MySQL works really well with its own formats. Having a bridge between the two is the best solution in my opinion.I've worked on projects that relied purely on unix time, it became harder to debug database issues, and didn't support things like having a birthday before 1970!
Farid
A: 

If you use INT instead of DATETIME you lose flexibility in GROUP by date, hour or time, making different manipulation with intervals.

You can make it with INT using function FROM_UNIXTIME, but your query's will be unreadable.

Using INT instead of DATE makes your programing cost x3 then your work with DATE. You safe executing time not enough to cover programming cost. Hardware is more cheaper then complex programming.

Once we have made this mistake and holds date in INT. After half year we decide to refractory about 30 sites for easy maintenance.

Liutas
+1  A: 

Nice and open question. I see you are a perfeccionist. So am I.

But as almost everything in programming and life, it depends on how it would fit to your problem.

If performance is really critical, you should use UNIX timestamps.

But I really don't think it is the case. I tell you why. It is because I share the same point of view as Rasmus Lerdorf. PHP is a scripting language, which brings many facilities to small to medium business.

For really really critical/big apps, where scalability and performance really matters, you should not use PHP + MySQL at all.

Java or C++ are the better solutions. I think most guys here will ask "What is wrong with PHP, your bastard?!". Well, many things actually. I worked as a performance tester for a while, and I say that developers should keep in mind that your favorite language is not always the best solution for every problem.

Let me give you and example. A critical math/physics application. Just need a number for a phenomenon analysis. You can do it on Shell Script and C. C will perform far way better. See, choosing the most appropriate language and tools to fit your problem is the answer to your correct answer.

Let's get back on MySQL, PHP and the data types. If you are using those, I suppose the application is not so big, nor full of business rules (if it is that big, you would consider some compiled languages, and if it is so critical, you should consider using PostgreSQL or Oracle).

And in this case, what matters is the speed to build up the application. If you do so, I think a good way to start is base your form fields on database metadata. This can help you to automate the form building. And in this case, I recommend using native database types.

Dave
True, I guess when you get large enough you might move to the lower level languages that offer more speed. However, I've never actually seen a company do this. Look at Facebook, Twitter, or Digg. When web apps get larger they don't move for some reason - they state it's because PHP/Ruby "allows them to iterate quicker". Facebook for example just built HipHop so that they wouldn't have to move to C++ *even if it is faster* - the parser will do that for them automatically.
Xeoncross
At any rate, this answer hits right at the heart of the matter and that is that having more options with datetime is worth the cost of the tiny, tiny loss of speed since most apps have bigger problems else where (like using bloated frameworks). As others have also stated, not being able to see clear dates when looking over database rows is also a downfall while developing.
Xeoncross
However, I still don't buy the line that using MySQL datetime fields allows better sorting with MySQL's built in time functions - just use `time() - ###`! Much faster and all your logic is placed in one part of the codebase.
Xeoncross
I knew someone would cite some big PHP apps. Notably, Facebook is our major sample.Well, I need to speak about some points you posted here. First, I was part of optimization task force at IBM Brazil, and yes, there are some big business that came to us asking for some help. One of them, the major brazilian furniture store, made a e-commerce website, and on the launch it was just bad. All application were built on PHP, but no tests were made. They came to IBM in order to get rid of users limit. 350 users at same time and the server got down.
Dave
After some planning, tests, and refactoring, the site end up with 68000 users on concurrency limit. Far way better.Another point I must mention is: despite Facebook keeps their code on PHP, I think it is reasonable enough to assume theirs backend is not as simple as most websites, for both infrastructure and coding.You can bet they have a nice CDN, forking threads, databases redundancy, coherent data design, a powerful caching, well designed classes making an intelligent use of resources, and so forth.
Dave
Your last comment however, I think was made upon my last paragraph, and I think we had a miscommunication on that spot.I was saying about form build automation process, i.e., on most frameworks/ORM we see over the market, or even for a high-quality hand made solution. And I think you were talking about the speed on some specific statements.If that is the case, I can't get pass by without mentioning the long tail theory, and its appliances on websites optimization, like using image sprites (one single image file and css positioning) instead of multiple image files.
Dave
Some polishing bytes benchmarks are not always the most important to do, when there are other most worth the effort spots and techniques, in the overall perspective, of course.Well, I agree with you that you can wrap and simplify all logic into just one place, and this is even better and true if this method gives you a better performance.Cheers.
Dave
Yes, facebook has stated that 98% of their site is run off cached data.
Xeoncross