views:

96

answers:

3

I found this paragraph in Oracle documentation

if you want to select the name of each department along with the name of its manager, you can write the query in one of two ways. In the first example which follows, the hint /++ordered++/ says to do the join in the order the tables appear in the FROM clause with attempting to optimize the join order.

SELECT /*+ordered*/ d.NAME, e.NAME
FROM DEPT d, EMP e WHERE d.MGR = e.SS#

or:

SELECT /*+ordered*/ d.NAME, e.NAME 
FROM EMP e, DEPT d WHERE d.MGR = e.SS# 

Suppose that there are 10 departments and 1000 employees, and that the inner table in each query has an index on the join column. In the first query, the first table produces 10 qualifying rows (in this case, the whole table). In the second query, the first table produces 1000 qualifying rows. The first query will access the EMP table 10 times and scan the DEPT table once. The second query will scan the EMP table once but will access the DEPT table 1000 times. Therefore the first query will perform much better. As a rule of thumb, tables should be arranged from smallest effective number rows to largest effective number of rows. The effective row size of a table in a query is obtained by applying the logical conditions that are resolved entirely on that table.

But I don't correctly understand this. If there are m rows in table t1 and n rows in table t2, wouldn't the sql engine go through m x n rows in both cases?

Update: Thanks for all the replies. I won't be overriding the optimizer, just wanted to confirm my thought.

+2  A: 

That depends on the WHERE statement.

SELECT /++ordered++/ d.NAME, e.NAME FROM DEPT d, EMP e WHERE d.MGR = e.SS#

Will select all managers for each department. As there are 10 departments, this results in 10 records being fetched.

SELECT /++ordered++/ d.NAME, e.NAME FROM EMP e, DEPT d

This will select all employees with the name of the department they're working in. As there are 1000 employees, your resultset will have 1000 rows.

A JOIN will never cause your engine to loop over m x n rows, you're resultset of an inner join will always be m if m < n

Anzeo
I don't understand what you mean by "resultset of an inner join will always be m x m if m < n". The result set of joining EMP and DEPT in this example will be 1000 (i.e. n)
Tony Andrews
Is this true even if there is no index on the joining keys? (Again, this is just for knowledge. I understand joins on non indexes is not recommended.)
Jerry
@Tony, It depends on your WHERE/ON clause. In the first case the resultset will contain 10 rows, in the second 1000.@Jerry What do you mean by index on the joining keys?
Anzeo
@Tony - Since this is inner join isn't it 10, not 100?
Jerry
@Anzeo, thanks for the correction
Tony Andrews
@Anzeo - I meant if there is an index on e.SS# and d.MGR. If there is no index on both the columns, wouldn't the engine go for a full scan for each records?
Jerry
@Jerry, yes I was thinking of the "show all employees and their departments" query rather than the "show all departments and their managers" query that you are actually doing in the example. But the point is it is m or n, not m x m as Anzeo said.
Tony Andrews
@Tony You are correct, I've edited my answer. @Jerry If I understand you correctly, in the case no indices are defined, The search for your rows matching d.MGR = e.SS# will increase greatly, but it wouldn't necessarily search the complete table on each run.
Anzeo
+3  A: 

Well, in the first case the number of logical reads is 10+10, in the second 1000+1000, with each department being read on average 100 times.

However, writing queries with the ORDERED hitn like this is not normal practice. It's best to leave optimisation to the optimiser most of the time.

I'm not sure exactly which documentation you got that quote from, but where I have seen it it is preceded by this very important paragraph that you omitted. I quote it here for the benefit of others who may otherwise think this method of writing queries is standard:

Normally optimizer picks the best execution plan, an optimal order of tables to be joined. In case the optimizer is not producing a good execution plan you can control the order of execution using the HINTS feature SQL. For more information see the Oracle Database Lite SQL Reference.

-- Oracle® Database Lite Developer's Guide

Tony Andrews
+1, altough I think it assumes a NESTED LOOP JOIN (a HASH JOIN would probably be chosen if there is no filter condition). The fact that the doc assumes a nested loop join (RULE optimizer?) and advocates the use of hints makes me think that this advice is from a very old doc (Oracle 7?). Anyway, good advice on letting the optimizer do its work.
Vincent Malgrat
+1  A: 

You really found that in oracle docs?

You should not use the ORDERED hint and let oracle do the decision for you--that is most of the time working very well nowadays.

However, the join order makes an difference performance wise.

The example seems to discuss the NESTED LOOPS join:

Case 1:
 -> 1 lookup to find 10 rows in table A
 -> 10 index lookups in table B

Case 2:
 -> 1 lookup to find 1000 rows in table B 
 -> 1000 index lookups in table A
Markus Winand