views:

96

answers:

3

My supervisor told me this query will 'crash the server' (because it has millions of tables i believe). Can anyone tell me WHY? maybe show me the way i am suppose to do it? I am currently reading manuals to figure this out.

What i was told is these joins are to slow. What is slow about it?

"JOIN    A AS o ON a.A =aa.A " .
"JOIN    B AS i ON ... " .
"JOIN    C AS p ON ... " .
"JOIN    D as t ON ... " 
"JOIN DB.E as a ON ... " .
"JOIN DB.F as d ON ... " .
"JOIN DB.G as g ON ... " .
"JOIN DB.H AS h ON ... " .
+3  A: 

A couple of thoughts:

  1. There is an argument for a database being too normalized...one should not have to join so many tables together to get what they need. It may make sense to keep a column in more than one table and/or combine tables if data is used together frequently.

  2. If you're generating a report (can't think of many other outputs where you'd want so many columns), how often does this query need to run? Consider using an ETL process to build your data incrementally from the various source tables and load it into a single table for your big query.

  3. Make sure you have the proper indexes in place.

Also, assuming you're using MySQL (the question didn't really specify), you can always use the 'EXPLAIN' statement to tell you how bad this query is...

AJ
+1 mainly for databases being too normalized but good answer overall. that use my biggest misnomer coming out of college.
Berek Bryan
The EXPLAIN documentation (http://dev.mysql.com/doc/refman/5.1/en/explain.html) also has good references for optimizing your queries.
Jason
Yes report. Yes i tagged it with MySql. AFAIK the problem is the query and not the DB. I checked out explain and i dont understand it well enough to figure out the problem.
An employee
Here's an example of using EXPLAIN on a SELECT statement that joins across three tables:mysql> explain select u.user_name, m.location from users u inner join meetings_users mu on u.id=mu.user_id inner join meetings m on mu.meeting_id=m.id;All you're doing is adding the keyword EXPLAIN as the first word before SELECT.
AJ
I mean i did look at the explain and explain extended but i cant use the info its showing me to figure out what is slow about it
An employee
Rather that being **too** normalized, a proliferation of many (millions?!?) tables is almost certainly a sign that the database isn't normalized enough! If you really can usefully join many tables together with a simple inner join like that, then you probably need to consolidate more data into one table.
Daniel Pryden
A: 

You could parse the query (run the plan but not the query) to see the estimated number of rows returned. Such a query could well cripple a database if large number of rows would be returned as a result of the joins (I don't think the number of JOINS is the problem - other than when one exceeds any limit given in the docs), but the fact that the JOIN term could return a vast amount of data, unless limited by index usage based on the WHERE clause. Another approach could be to use in-line SELECTs in the main SELECT part, and see what query plan the optimizer comes up with, particularly if one or more tables have significantly more rows than other from which you are selecting.

i.e.

SELECT *
, (select col1 from b where b.A = A.A) as x
...
from A
...
davek
A: 
JOIN    A AS o ON a.A =aa.A

This immediately raises a red flag, as the join condition doesn't mention table A (called 'o'). That will give you the cross product of all rows, which isn't what you want. Your join condition (after the ON) should always mention the table you just joined (the x in AS 'x').

Keith Randall