tags:

views:

56

answers:

3

I am working in MySQL and having a bit of trouble with building a query that will summarize columns from two tables. I want to compare the quantity of requests per day, for a table containing hourly records and a table containing daily aggregation, per day. Ideally the sums of each would be identical.

Here is the schema:

Hourly Table:

CREATE TABLE requests_hourly (
 customer_id INT,
 date DATETIME,
 requests BIGINT,
 req_type SMALLINT );

Daily Table

CREATE TABLE requests_daily (
 customer_id INT,
 date DATE,
 requests BIGINT,
 req_type SMALLINT );

Not working SQL to get me all the requests, by req_type across both tables for June 2010

SELECT
 SUM(h.requests),
 SUM(d.requests),
 h.req_type
FROM requests_hourly h
LEFT OUTER JOIN requests_daily d ON d.req_type = h.req_type
WHERE h.date >= '2010-06-01 00:00:00'
 AND h.date < '2010-07-01 00:00:00'
 AND d.date >= '2010-06-01 00:00:00'
 AND d.date < '2010-07-01 00:00:00'
GROUP BY h.req_type;

I have a feeling the error is in the JOIN. Thank you in advance for your help!

Answer

I gave credit to Peter for for the answer, but it did require a little modification. So here is the MySQL SQL code:

SELECT *
FROM
  (SELECT SUM(requests) AS 'Daily Request Sum', req_type
   FROM requests_daily
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type, date) d
  INNER JOIN
  (SELECT SUM(requests) AS 'Hourly Request Sum', req_type
   FROM requests_hourly
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type, DATE(date)) h
USING (req_type, date)
A: 

I'd use a subquery.

SELECT `req_type`
     , `date`
     , COUNT(0) AS `daily_data_count`
     , (
           SELECT COUNT(0)
             FROM `requests_hourly`
            WHERE CAST(`date` AS DATE)
                = `requests_daily`.`date`
       ) AS `hourly_data_count`
  FROM `requests_daily`
 WHERE `date` BETWEEN '2010-06-01' AND '2010-06-30'
 GROUP BY `req_type`, `date`
Don
A: 

Hi,

Have you considered approaching the problem with a union all.

select
  sum(requests) total,
  req_type
from
  requests_daily
where
 requests_daily.date >= '2010-06-01 00:00:00' AND requests_daily.date < '2010-07-01 00:00:00' 
group by requests_daily.req_type
union all
select
  sum(requests) total,
  req_type
from
  requests_hourly
where
 requests_hourly.date >= '2010-06-01 00:00:00' AND requests_hourly.date < '2010-07-01 00:00:00' 
group by requests_hourly.req_type

Enjoy!

Doug
I have, but does not fit the requirement. The UNION delivers two rows for each date+req_type. :-( I need it in one row. Thanks though!
randy melder
A: 

According to my query analyzer this is the fastest / most efficient method presented thus far:

SELECT *
FROM
  (SELECT SUM(requests) AS 'Daily Request Sum', req_type
   FROM requests_daily
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type)
  INNER JOIN
  (SELECT SUM(requests) AS 'Hourly Request Sum', req_type
   FROM requests_hourly
   WHERE date BETWEEN '2010-06-01 00:00:00' AND '2010-07-01 00:00:00'
   GROUP BY req_type)
USING (req_type, date)

It fallows simple logic that by restricting the size and prearranging the indexes in the tables being joined before the operation takes place significantly reduces the relatively "expensive" overheard incurring by matching the indexes of the two tables (req_type) against each other especially since you do not have unique indexes (req_type) which will force the database to perform searches rather than seeks.

Hope this was helpful.

Peter Hanneman
I tried this approach, it does not work. First I got an error saying that each table required an alias. Then the data was strange as all the 2nd SELECT subjects data was the same for each req_type for every day. Finally, after adding the date to the USING(req_type, date) it worked. I appreciate your help. I will credit you with the answer.
randy melder
I apologize I was thinking with an Oracle 10g mindset not MySQL.
Peter Hanneman
It's all good here. If you wouldn't mind, append the corrections to the bottom of your answer so that others don't get confused/misled. Thanks again!
randy melder