views:

32

answers:

2

Hi community.

I have a bunch of data ordered by date and each table holds only one month of data. (The reason for this is to cut down query time, I'm talking about millions of rows in each month table)

For ex.

data_01_2010 holds data from 2010-01-01 to 2010-01-31
data_02_2010 holds data from 2010-02-01 to 2010-02-28

Sometimes I have to query these tables according to a specific date range. Now if the range is across multiple months for ex. 2010-01-01 to 2010-02-28 then I need to query both tables.

Can this be achieved with a single query? Like for example:

SELECT * 
FROM data_01_2010, data_02_2010 
WHERE date BETWEEN '2010-01-01' AND '2010-02-28'

The problem with the above query is that it says the column date is ambiguous which it is, because the column is present in both table. (tables have the same structure)

So is this achievable with a single query or do I have to query it for each table separately?

A: 

If you do the logic elsewhere to figure out what tables you need to query, and each month has the same schema, you could just union the tables:

SELECT *
FROM data_01_2010
WHERE date BETWEEN '2010-01-01' AND '2010-02-28'
UNION ALL
SELECT *
FROM data_02_2010
WHERE date BETWEEN '2010-01-01' AND '2010-02-28'

But if you need the query to calculate which tables to union, you are venturing into realm of stored procedures.

adharris
I have the logic written in PHP.I never though of UNIONS. How UNION will hit query performance?
feketegy
Using UNION is a bit slower, because it diffs the results and removes duplicates, however UNION ALL will ignore dup checking. Both require MySQL to create a temporary table, which is a little overhead, but not much. See http://www.mysqlperformanceblog.com/2007/10/05/union-vs-union-all-performance/
adharris
So maybe I would be better off to use multiple queries then?
feketegy
Try both, see what is faster. I think that it will probably be negligible. Also keep in mind that if you execute multiple queries, you will need PHP to loop through multiple result sets, which may be a pain depending on your existing code. The advantage to a union is that the format of the result set stays the same, regardless how many months the dates span.
adharris
+2  A: 

This is a perfect example of why partitioning is so powerful. Partitioning allows you to logically store all of your records in the same table without sacrificing query performance.

In this example, you would have one table called data (hopefully you would name it better than this) and range partition it based on the value of the date column (again hopefully you would name this column better). This means that you could meet your requirement by a simple select:

SELECT * FROM data WHERE date BETWEEN '2010-01-01' AND '2010-02-28';

Under the covers, the database will only access the partitions required based on the where clause.

Reference: http://dev.mysql.com/doc/refman/5.5/en/partitioning.html

ar
I don't have a database with partitioning feature.I think partitioning is still in beta. Am I wrong?
feketegy
Partitioning was introduced in MySQL 5.1.
ar