tags:

views:

149

answers:

5

I have 3 tables

bl_main (bl_id UNIQUE, bl_area)
bl_details (bl_id UNIQUE, name) 
bl_data(bl_id, month, paper_tons, bottles_tons)

bl_id is not unique in the last table. There will be multiple rows of same bl_id.

I am trying to retrieve data in the following way

bl_id | name | bl_area | sum(paper_tons) | sum (bottles_tons) | paper_tons | bottles_tons

sum(paper_tons) should return the sum of all the paper tons for the same bl_id like Jan to December.

Using the below query i am able to retrieve all the data correctly except in the result, there are multiple occurances of bl_ids(From bl_data table).

SELECT bl_main.bl_id,name,bl_area,sums.SummedPaper, sums.SummedBottles,paper_tons,bottles_tons
FROM bl_main
JOIN bl_details ON 
    bl_main.bl_id= bl_details.bl_id
left outer JOIN bl_data ON
    bl_data.bl_id= bl_main.bl_id
left outer JOIN (
    SELECT bl_id, SUM(Paper_tons) As SummedPaper, SUM(bottle_tons) As SummedBottles 
FROM bl_data

GROUP by bl_id) sums ON sums.bl_id = bl_main.bl_id

I wanto retrieve only the unique values of bl_ids without repetition and it should contain the bl_id which has the max month and not all the months for the same bl_id.

For ex:

INCORRECT


**0601**    University Hall  75.76  17051  1356  4040  1154  **11**     
**0601**    University Hall  75.76  17051  1356  9190  101  **12**  
**0605**    UIC Student  22.86  3331  14799  0  356   **8**   

CORRECT   
**0601**    University Hall  75.76  17051  1356  9190  101  **12**  
**0605**    UIC Student  22.86  3331  14799  0  356   **8**

I know I can get the max value using

WHERE Month = (SELECT MAX(Month)

but where exactlt should i add this in the query and should i change the join definition. Any help is highly appreciated as i am new to sql. Thanks in advance.

A: 

You can join the table containing the month against itself, using a subquery of the form:

Select *
From mytable m
    Inner Join (Select max(Month) as Month, myId
                From mytable
                Group By myId) mnth
        On mnth.myId = m.myId and mnth.Month = m.Month
marr75
A: 

Your JOIN clause

left outer JOIN bl_data ON
  bl_data.bl_id= bl_main.bl_id

does not specify which month to select for the data you are displaying with paper_tons and bottles_tons.

You could update that JOIN to only contain the max month, and this should limit the entries, like so:

left outer JOIN (SELECT bl_id, MAX(Month) as Month from bl_data GROUP BY bl_id) as Month
  ON Month.bl_id = bl_main.bl_id
left outer JOIN bl_data ON
  bl_data.bl_id = bl_main.bl_id AND bl_data.Month = Month.bl_Month
Marshall
A: 

I think this query is what you are looking for

SELECT bl_main.bl_id,name, bl_area, sums.SummedPaper, sums.SummedBottles, paper_tons, bottles_tons
FROM bl_main
    JOIN bl_details ON bl_main.bl_id= bl_details.bl_id
    left outer JOIN bl_data ON    bl_data.bl_id= bl_main.bl_id
    left outer JOIN
    (
        SELECT bl_id, month, SUM(Paper_tons) As SummedPaper, SUM(bottle_tons) As SummedBottles 
        FROM bl_data
        WHERE month in
        (SELECT MAX(month) FROM bl_data GROUP BY bl_id)
        GROUP BY bl_id, month
    ) sums ON sums.bl_id = bl_main.bl_id
David
This doesnot give the sum correctly nor the unique bl_ids.
+1  A: 

You have two tables that probably should be combined into one (bl_main and bl_details). But putting that aside, what you need is a self-join subquery to select the row with the max month. Something like the following (untested):

SELECT bl_main.bl_id, bl_details.name, bl_main.bl_area, sums.sum_paper_tons,
       sums.sum_bottles_tons, maxmonth.paper_tons, maxmonth.bottles_tons
FROM bl_main
INNER JOIN bl_details ON bl_main.bl_id = bl_details.bl_id
LEFT OUTER JOIN (SELECT bl_id, SUM(paper_tons) AS sum_paper_tons, 
                        SUM(bottles_tons) AS sum_bottles_tons
                 FROM bl_data
                 GROUP BY bl_id) sums ON bl_main.bl_id = sums.bl_id
LEFT OUTER JOIN (SELECT bl_id, paper_tons, bottles_tons
                 FROM bl_data data2
                 INNER JOIN (SELECT bl_id, MAX(month) AS max_month
                             FROM bl_data
                             GROUP BY bl_id) m
                    ON m.bl_id = data2.bl_id
                    AND m.max_month = data2.month) maxmonth
    ON bl_main.bl_id = maxmonth.bl_id
lc
Thanks Ic. I am trying really hard but i am unable to run the query. It gives an error - "column ambiguously defined". Any help is appreciated.
Could be the column aliases having the same name as the column they aggregate? I thought it could differentiate, but maybe not...
lc
nope. Dint help. It still gives the error "column ambiguously defined". Thanks once again for your time. What do you think could be wrong?
This is what is breaking the query("column ambiguously defined") SELECT bl_id, paper_tons, bottles_tons FROM bl_data data2 INNER JOIN (SELECT bl_id, MAX(month) AS max_month FROM bl_data GROUP BY bl_id Any help is appreciated
Got it. Thank a lot. It should be actually SELECT data2.bl_id, data2 .paper_tons, data2 .bottles_tons from .... and it works fine
@faffy Glad it worked. Too bad it didn't give a more specific error message but I guess that makes sense.
lc
A: 

I wanted to just add a comment to the answer lc gave, but I don't have 50 reputation points yet. It is a link to an article that I believe explains this question and adds the why the solution that lc gave is correct.

http://www.sqlteam.com/article/how-to-use-group-by-with-distinct-aggregates-and-derived-tables

David Glass