I am currently part of a team designing a site that will potentially have thousands of users doing a number of date-related searches. During the design phase we have been trying to determine which approach makes more sense for performance.

Should we store the datetime field as a MySQL DATETIME, or should we break it up into a number of fields (year, month, day, hour, minute, ...)?

The question is: with a large data set and a potentially large number of users, would we gain performance by breaking the datetime into multiple fields and no longer relying on MySQL's date functions? Or is MySQL already optimized for this?

+1  A: 

Have a look at the MySQL Date & Time Functions documentation: you can pull specific parts out of a date with existing functions like YEAR, MONTH, etc. But if you have an index on the date column(s), wrapping the column in one of these functions in a WHERE clause means the index can't be used for that condition.
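To make the index point concrete, here is a rough sketch. The logins table and its column names are made up for illustration and are not from the question. The usual workaround is to compare the bare column against a range instead of calling a function on it:

```sql
-- Hypothetical table, for illustration only
CREATE TABLE logins (
  id INT AUTO_INCREMENT PRIMARY KEY,
  logged_in_at DATETIME NOT NULL,
  INDEX idx_logged_in_at (logged_in_at)
);

-- The column is wrapped in a function, so the index on
-- logged_in_at generally can't be used for this condition.
SELECT id FROM logins WHERE YEAR(logged_in_at) = 2010;

-- Same rows, written as a half-open range on the bare column,
-- which lets the optimizer consider idx_logged_in_at.
SELECT id FROM logins
WHERE logged_in_at >= '2010-01-01'
  AND logged_in_at <  '2011-01-01';
```

Running EXPLAIN on both queries against your own data is the quickest way to check what the optimizer actually does.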

The problem with storing a date as separate components is the work needed to reconstruct them into a date when you want to do range comparisons or date operations.

Ultimately, choose what works best with your application. If there's seldom need for the date to be split out, consider using a VIEW to expose the date components without writing possibly redundant information into your tables.
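A minimal sketch of the VIEW idea, assuming a hypothetical orders table (the table, view, and column names are invented for the example):

```sql
-- Hypothetical base table that stores a single DATETIME
CREATE TABLE orders (
  id INT AUTO_INCREMENT PRIMARY KEY,
  placed_at DATETIME NOT NULL
);

-- The view exposes the components without storing them twice
CREATE VIEW order_date_parts AS
SELECT id,
       placed_at,
       YEAR(placed_at)  AS placed_year,
       MONTH(placed_at) AS placed_month,
       DAY(placed_at)   AS placed_day,
       HOUR(placed_at)  AS placed_hour
FROM orders;

-- Queries can then read the parts as if they were columns
SELECT id FROM order_date_parts
WHERE placed_year = 2010 AND placed_month = 6;
```

Treat this as a convenience rather than a speed-up: as far as I know the components are still computed with the date functions at query time, so filtering on them does not use an index on placed_at.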

OMG Ponies
A: 

Use a regular datetime field. You can always switch over to the separated components down the line if performance becomes an issue. Try to avoid premature optimization - in many cases, YAGNI. You may wind up employing both the datetime field and the separated component methodology, since they both have their strengths.
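If it does come to that, adding a component column later is not a big change. A sketch, assuming a hypothetical articles table (all names here are illustrative):

```sql
-- Start with just the DATETIME column
CREATE TABLE articles (
  id INT AUTO_INCREMENT PRIMARY KEY,
  published_at DATETIME NOT NULL
);

-- Later, only if profiling shows it is needed:
-- add an indexed year column alongside the DATETIME
ALTER TABLE articles
  ADD COLUMN published_year SMALLINT NULL,
  ADD INDEX idx_published_year (published_year);

-- Backfill the new column from the existing data
UPDATE articles SET published_year = YEAR(published_at);
```

The catch is that the application (or a trigger) then has to keep published_year in sync with published_at on every insert and update.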

James Jones
A: 

If you know ahead of time some key criteria that all searches will have, MySQL (>= v5.1) table partitioning might help.

For example, if you have a table like this:

create table Books(pubDate dateTime, title varchar(50));

And you know all searches must at least include a year, you could partition it on the date field, along these lines:

create table Books(pubDate dateTime, title varchar(50))
partition by hash(year(pubDate)) partitions 10;

Then, when you run a select against the table, if your where clause includes criteria that limit which partitions the results can be in, MySQL will scan only those partitions rather than the whole table. You can see this in action with:

-- scans entire table
explain partitions select * from Books where title like '%title%';

versus something like:

-- scans just one partition
explain partitions select * from Books
where year(pubDate)=2010
and title like '%title%';

The MySQL documentation on this is quite good, and you can choose from multiple partitioning algorithms.

Even if you opt to break up the date, a table partition on, say, year (int) (assuming searches will always specify a year) could help.
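A sketch of that variant, assuming a hypothetical BooksByYear table with a separate integer pubYear column (both names are made up for the example):

```sql
-- Hypothetical table: the year is stored as its own INT column
-- and used directly as the partitioning key
CREATE TABLE BooksByYear (
  pubYear INT NOT NULL,
  pubDate DATETIME,
  title VARCHAR(50)
)
PARTITION BY HASH(pubYear) PARTITIONS 10;

-- An equality on the partitioning column itself is the kind of
-- condition the optimizer can prune on; the partitions column of
-- the output lists which partitions would be read.
-- (EXPLAIN PARTITIONS is the 5.1-era syntax; later versions show
-- the partitions column in plain EXPLAIN.)
EXPLAIN PARTITIONS
SELECT * FROM BooksByYear
WHERE pubYear = 2010
  AND title LIKE '%title%';
```

Comparing on the plain column keeps the condition in the simplest form for the optimizer to match against the partitioning key.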

Jeffrey Knight