ansaurus

Question

Sql query with joins between four tables with millions of rows

Answer 1

+7 A:

I'm not sure the query you posted will yield the results you're expecting.

It will effectively cross join the transaction tables (MoneyTransactions etc.) for each employee, multiplying the row counts and inflating every sum.

Try this:

SELECT  E.EmployeeName,
        (
        SELECT  SUM(amount)
        FROM    MoneyTransactions m
        WHERE   m.EmployeeID = E.EmployeeID
        ) AS TotalAmount,
        (
        SELECT  SUM(amount)
        FROM    BudgetTransactions m
        WHERE   m.EmployeeID = E.EmployeeID
        ) AS BudgetAmount,
        (
        SELECT  SUM(hours)
        FROM    TimeTransactions m
        WHERE   m.EmployeeID = E.EmployeeID
        ) AS TotalHours,
        (
        SELECT  SUM(hours)
        FROM    TimeBudgetTransactions m
        WHERE   m.EmployeeID = E.EmployeeID
        ) AS BudgetHours
FROM    Employees E
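
To see the difference on a small scale, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the thread does not name the engine). The sample rows and the reduction to two transaction tables are made up for illustration. It compares a joined query with the correlated-subquery form:

```python
import sqlite3

# In-memory database with made-up sample data (one employee, a few rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, EmployeeName TEXT);
    CREATE TABLE MoneyTransactions (EmployeeID INTEGER, amount INTEGER);
    CREATE TABLE TimeTransactions (EmployeeID INTEGER, hours INTEGER);
    INSERT INTO Employees VALUES (1, 'Alice');
    INSERT INTO MoneyTransactions VALUES (1, 10), (1, 20), (1, 30);
    INSERT INTO TimeTransactions VALUES (1, 1), (1, 2);
""")

# Joining both transaction tables pairs every money row with every time row
# for the employee: 3 x 2 = 6 rows, so both sums come out inflated.
joined = conn.execute("""
    SELECT COUNT(*), SUM(m.amount), SUM(t.hours)
    FROM Employees e
    JOIN MoneyTransactions m ON m.EmployeeID = e.EmployeeID
    JOIN TimeTransactions t ON t.EmployeeID = e.EmployeeID
    GROUP BY e.EmployeeID
""").fetchone()

# Correlated subqueries aggregate each table on its own, so nothing multiplies.
subq = conn.execute("""
    SELECT e.EmployeeName,
           (SELECT SUM(amount) FROM MoneyTransactions m
             WHERE m.EmployeeID = e.EmployeeID),
           (SELECT SUM(hours) FROM TimeTransactions t
             WHERE t.EmployeeID = e.EmployeeID)
    FROM Employees e
""").fetchone()

print(joined)  # (6, 120, 9)
print(subq)    # ('Alice', 60, 3)
```

The joined version reports 120 and 9 where the true totals are 60 and 3; with four transaction tables and millions of rows the same effect is far larger.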

Quassnoi 2009-04-07 10:04:50

Hmmm... Isn't that SELECT EmployeeID, EmployeeName, SUM(...), SUM(...) FROM Employees GROUP BY EmployeeID, EmployeeName?

Tomalak 2009-04-07 10:43:55

Why group on primary key?

Quassnoi 2009-04-07 10:52:39

That's a mistake (sloppy reading). Never mind, my bad. +1

Tomalak 2009-04-07 10:57:32

@Quassnoi: Thanks. I thought nested queries (SELECTs inside SELECT) would be slower than JOINs... Have not tried your suggestion yet...

Ole Lynge 2009-04-07 12:41:53

No, they won't. Actually, the query you posted is slow because it makes lots of unnecessary joins and produces incorrect results. If there are 100 rows per employee in each of the four transaction tables, you'll get 100 x 100 x 100 x 100 = 100,000,000 rows for each employee in the result, which is most probably not what you want.

Quassnoi 2009-04-07 12:53:33

Thanks. Of course. I must be bombed after lunch...

Ole Lynge 2009-04-07 13:27:40

Answer 2

+1 A:

I don't know whether your tables have all the indexes that would speed things up, but large tables alone can have this impact on query time. I would recommend partitioning the tables if possible. It is more work, but otherwise whatever you do to speed up the query now won't be enough after a few million more records.
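
As a rough illustration of the indexing point, the sketch below again uses Python's sqlite3 module as a stand-in (an assumption; the actual engine and its planner will differ). The index name ix_money_employee is hypothetical. It creates an index led by EmployeeID and asks the planner how it would run a per-employee sum. Partitioning is engine-specific and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MoneyTransactions (EmployeeID INTEGER, amount INTEGER);
    -- Hypothetical index name; the key point is that EmployeeID leads the index.
    CREATE INDEX ix_money_employee ON MoneyTransactions (EmployeeID, amount);
""")

# Ask SQLite how it would execute the per-employee lookup.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT SUM(amount) FROM MoneyTransactions WHERE EmployeeID = ?
""", (1,)).fetchall()

# The last column of each plan row is the human-readable detail string.
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

If the detail string mentions the index, the lookup is an index search instead of a full table scan, which is what keeps each per-employee subquery cheap on large tables.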

Bojan Milenkoski 2009-04-07 10:17:27
