I have 2 tables:

book ( id, title, age ) ----> 100 million rows

author ( id, book_id, name, born ) ----> 10 million rows

Now, suppose I have the id of some book. I need to print this page:

Title: mybook

authors: Tom, Graham, Luis, Clarke, George

So... what is the best way to do this?

1) Simple join like this:

Select book.title, author.name 
From book, author 
WHERE ( author.book_id = book.id ) AND ( book.id = 342 )

2) To avoid the join, I could run 2 simple queries:

Select title FROM book WHERE id = 342

Select name FROM author WHERE book_id = 342 

Which is the most efficient way?

+3  A: 

The first one. It's only a single round trip. It requires a little processing to collapse the rows of authors into the comma-separated list you want, but that's basically boilerplate code.

Separate related queries are a bad habit that will kill your performance faster than most things.
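
A minimal sketch of that boilerplate, assuming Python with the standard sqlite3 module and a throwaway in-memory copy of the question's tables. The helper names make_demo_db and book_page are made up for illustration, and the age and born columns are left empty:

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the question's tables (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
        CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                             name TEXT, born INTEGER);
        INSERT INTO book (id, title) VALUES (342, 'mybook');
        INSERT INTO author (book_id, name) VALUES
            (342, 'Tom'), (342, 'Graham'), (342, 'Luis'),
            (342, 'Clarke'), (342, 'George');
        """
    )
    return conn


def book_page(conn, book_id):
    """Run the single join and collapse the author rows on the client side."""
    rows = conn.execute(
        """
        SELECT book.title, author.name
        FROM book
        JOIN author ON author.book_id = book.id
        WHERE book.id = ?
        ORDER BY author.id
        """,
        (book_id,),
    ).fetchall()
    if not rows:
        return None  # unknown book, or a book with no authors on file
    title = rows[0][0]  # the title repeats on every row; take it once
    names = ", ".join(name for _title, name in rows)
    return f"Title: {title}\nauthors: {names}"


print(book_page(make_demo_db(), 342))
```

The only client-side work is taking the title from the first row and joining the names.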

cletus
The DB doesn't collapse the rows into a comma-separated list; presumably the client will, with some scripting language.
Evan Carroll
@Evan - It can, with GROUP_CONCAT or similar.
Martin Smith
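
A rough sketch of the database-side variant, assuming SQLite through Python's sqlite3 module, where the aggregate is spelled group_concat(X, Y) (MySQL's GROUP_CONCAT uses a SEPARATOR clause instead). The demo rows and the function name authors_line are made up, and SQLite does not promise the order of the concatenated names:

```python
import sqlite3

# Throwaway in-memory copy of the question's tables (demo data only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    INSERT INTO book (id, title) VALUES (342, 'mybook');
    INSERT INTO author (book_id, name) VALUES
        (342, 'Tom'), (342, 'Graham'), (342, 'Luis'),
        (342, 'Clarke'), (342, 'George');
    """
)


def authors_line(conn, book_id):
    """Let the database build the comma-separated author list."""
    return conn.execute(
        """
        SELECT book.title, group_concat(author.name, ', ')
        FROM book
        JOIN author ON author.book_id = book.id
        WHERE book.id = ?
        GROUP BY book.id, book.title
        """,
        (book_id,),
    ).fetchone()  # (title, "name, name, ...") or None when nothing matches


print(authors_line(conn, 342))
```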
I didn't say it couldn't be done. I just meant to imply that it isn't being done and it is external to the question.
Evan Carroll
+2  A: 

The best option is to run speed tests on your own server. Depending on how often the different tables are accessed together and apart, either one could be faster.

This has been answered in depth before: http://stackoverflow.com/questions/376375/left-join-vs-multiple-select-statements
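
A minimal timing sketch, assuming Python's sqlite3 and time modules and a tiny in-memory database. The names make_demo_db and time_both are hypothetical. An in-memory database has no network round trip, so the numbers only mean something once the connection points at your real server and data:

```python
import sqlite3
import time


def make_demo_db():
    """Create an in-memory stand-in for the question's tables (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
        CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                             name TEXT, born INTEGER);
        INSERT INTO book (id, title) VALUES (342, 'mybook');
        INSERT INTO author (book_id, name) VALUES
            (342, 'Tom'), (342, 'Graham'), (342, 'Luis');
        """
    )
    return conn


JOIN_SQL = """
    SELECT book.title, author.name
    FROM book, author
    WHERE author.book_id = book.id AND book.id = ?
"""
TITLE_SQL = "SELECT title FROM book WHERE id = ?"
NAMES_SQL = "SELECT name FROM author WHERE book_id = ?"


def time_both(conn, book_id, repeats=1000):
    """Return elapsed seconds for each approach over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(JOIN_SQL, (book_id,)).fetchall()
    join_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(TITLE_SQL, (book_id,)).fetchall()
        conn.execute(NAMES_SQL, (book_id,)).fetchall()
    split_seconds = time.perf_counter() - start

    return {"join": join_seconds, "two_queries": split_seconds}


timings = time_both(make_demo_db(), 342)
print(timings)
```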

Aaron Harun
+1  A: 

The first one, especially if you have an index on author.book_id. A clustered index would be best if you have many authors per book and it's possible; otherwise a non-clustered index would also help you a lot.
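
One way to check that such an index is actually picked up, sketched with SQLite through Python's sqlite3 module. The index name idx_author_book_id and the helper plan are made up; this covers only a plain (non-clustered) index, and other databases have their own EXPLAIN syntax:

```python
import sqlite3

# Empty in-memory copy of the question's tables; only the plan matters here.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    """
)

QUERY = "SELECT name FROM author WHERE book_id = 342"


def plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(conn, QUERY)  # no index yet, so a scan of author is expected
conn.execute("CREATE INDEX idx_author_book_id ON author (book_id)")
plan_after = plan(conn, QUERY)   # should now mention the new index

print(plan_before)
print(plan_after)
```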

jweber
A: 

I know it shouldn't be a consideration, but the first query will return you a result set like this:

title     name
-----------------
mybook    Tom
mybook    Graham
mybook    Luis
mybook    Clarke
mybook    George

whereas the second pair will return you a pair of result sets like this:

title
-------
mybook

and

name
--------
Tom
Graham
Luis
Clarke
George

so each approach returns the data in a different way. In this simple example the repetition of the book's title isn't going to be significant, but if instead of the title you were returning the first chapter (say), then this would be less efficient, as there would be a lot of repeated data. So while the second might take longer in the database, it might be quicker and more efficient when sending that data across the network.

You need to test your actual results and see which one performs best.
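
To make the repeated-data point concrete, here is a small sketch assuming Python's sqlite3 module, with a 10,000-character title standing in for a large column such as a first chapter. The helper payload_chars is made up, and counting characters is only a rough proxy for what actually crosses the network:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    """
)

long_title = "x" * 10_000  # stand-in for a large column (demo data only)
conn.execute("INSERT INTO book (id, title) VALUES (?, ?)", (342, long_title))
conn.executemany(
    "INSERT INTO author (book_id, name) VALUES (342, ?)",
    [(name,) for name in ["Tom", "Graham", "Luis", "Clarke", "George"]],
)


def payload_chars(rows):
    """Count characters across all text values in a result set."""
    return sum(len(value) for row in rows for value in row)


join_rows = conn.execute(
    "SELECT book.title, author.name FROM book, author "
    "WHERE author.book_id = book.id AND book.id = 342"
).fetchall()
title_rows = conn.execute("SELECT title FROM book WHERE id = 342").fetchall()
name_rows = conn.execute("SELECT name FROM author WHERE book_id = 342").fetchall()

join_chars = payload_chars(join_rows)  # large column repeated once per author
split_chars = payload_chars(title_rows) + payload_chars(name_rows)
print(join_chars, split_chars)
```

With five authors the join carries the large column five times, so it returns four extra copies compared with the two separate queries.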

ChrisF
yes I know... but I want to know what is the most efficient way :)
xRobot
You might as well ask if vi or Emacs is better. There is no "correct" answer for all situations. Run a hundred or a thousand test queries using loops and see which one takes longer. Then use the other one.
Aaron Harun
@Aaron, good analogy.
ChrisF
@xRobot - I agree with @cletus and @jweber that the `join` is probably the most efficient, but like @Aaron says you should verify this yourself by running some tests. I'm merely pointing out that you are returning different data with each.
ChrisF
+1  A: 

Round trip minimization and promotion of sane execution plans are the most salient items on my performance list.

If static dependencies between fields in a query prevent the optimizer from using an index, then breaking them out into separate queries may provide huge performance gains, because the indexes can then be used, especially as the row count of the dataset increases. For most database transport protocols, additional result sets mean additional round trips. This can have performance implications if data is regularly accessed over a WAN. Fortunately there are ways to have your cake and eat it too:

Select title,NULL AS name FROM book WHERE id = 342 
UNION ALL
Select NULL,name FROM author WHERE book_id = 342 

In your specific example I would choose #1 with a warning to consider what would happen if there were no authors on file for a given book.
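
A small sketch of that no-authors case, assuming Python's sqlite3 module and made-up demo rows (book 343, titled 'orphan', has no authors). LEFT JOIN is not mentioned above; it is shown here only as one common way to keep the book row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, age INTEGER);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER,
                         name TEXT, born INTEGER);
    INSERT INTO book (id, title) VALUES (342, 'mybook'), (343, 'orphan');
    INSERT INTO author (book_id, name) VALUES (342, 'Tom'), (342, 'Graham');
    """
)

INNER_SQL = """
    SELECT book.title, author.name
    FROM book JOIN author ON author.book_id = book.id
    WHERE book.id = ?
"""
LEFT_SQL = """
    SELECT book.title, author.name
    FROM book LEFT JOIN author ON author.book_id = book.id
    WHERE book.id = ?
"""
UNION_SQL = """
    SELECT title, NULL AS name FROM book WHERE id = ?
    UNION ALL
    SELECT NULL, name FROM author WHERE book_id = ?
"""


def fetch(sql, *params):
    """Run a query against the demo database and return all rows."""
    return conn.execute(sql, params).fetchall()


inner_rows = fetch(INNER_SQL, 343)       # the book disappears entirely
left_rows = fetch(LEFT_SQL, 343)         # title kept, name is NULL
union_rows = fetch(UNION_SQL, 343, 343)  # title row only, no author rows

print(inner_rows, left_rows, union_rows)
```

With the plain join the page for book 343 would have no title at all, which is the situation the warning is about.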

Einstein