views:

561

answers:

2

This may be a silly question, but it may shed some light on how joins work internally.

Let's say I have a large table L and a small table S (100K rows vs. 100 rows).

Would there be any difference in terms of speed between the following two options?:

OPTION 1:                 OPTION 2:
---------                 ---------
SELECT *                  SELECT *
FROM L INNER JOIN S       FROM S INNER JOIN L
ON L.id = S.id;           ON L.id = S.id;

Notice that the only difference is the order in which the tables are joined.

I realize performance may vary between different SQL languages. If so, how would MySQL compare to Access?

+4  A: 

Almost all RDBMS's (such MS Access, MySQL, SQL Server, ORACLE etc) use a cost based optimiser based upon column statistics. In most situations, the optimiser will choose a correct plan. In the example you gave, the order should not matter (provided statistics are up to date).

To decide what query strategy to use, the Jet Engine optimizer uses statistics. The following factors are some of the factors that these statistics are based on:

  • The number of records in a table
  • The number of data pages in a table
  • The location of the table
  • Whether indexes are present
  • How unique the indexes are

Note: You cannot view Jet database engine optimization schemes, and you cannot specify how to optimize a query. However, you can use the Database Documenter to determine whether indexes are present and how unique an index is.

Based on these statistics, the Optimizer then selects the best internal query strategy for dealing with a particular query.

The statistics are updated whenever a query is compiled. A query is flagged for compiling when you save any changes to the query (or its underlying tables) and when the database is compacted. If a query is flagged for compiling, the compiling and the updating of statistics occurs the next time that the query is run. Compiling typically takes from one second to four seconds.

If you add a significant number of records to your database, you must open and then save your queries to recompile the queries. For example, if you design and then test a query by using a small set of sample data, you must re-compile the query after additional records are added to the database. When you do this, you want to make sure that optimal query performance is achieved when your application is in use.

Ref.

Might be of interest: ACC: How to Optimize Queries in Microsoft Access 2.0, Microsoft Access 95, and Microsoft Access 97

Tony Toews's Microsoft Access Performance FAQ is worth reading.

Mitch Wheat
So, given that both tables have unique indexes, performance will vary on a case-by-case basis?
Zaid
@Zaid: if the stats are up to date (and query is recompiled as noted above) then the order of the join won't matter; the optimizer will pick the right way.
Mitch Wheat
In more complex queries (many joins) I have seen behaviour that was order specific, but I'm talking quite a few years ago....so I may be wrong!
Mitch Wheat
It is a real pity that the JET Engine does not have the ability to display execution plans....
Mitch Wheat
Yes, maybe I should've included multiple and nested joins in the OP...
Zaid
+1  A: 

I know Oracle's not on your list, but I think that most modern databases will behave that way.

You can see in the following execution plan, that there is no difference between the two statements.

It's a full access to each of the two tables (no index in my case), and then a HASH JOIN. Since you want everything from both tables, both tables need to be read and joined, the sequence does not have an impact.

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100 |   700 |    42  (12)| 00:00:01 |
|*  1 |  HASH JOIN         |      |   100 |   700 |    42  (12)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| S    |   100 |   300 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| L    |   100K|   390K|    38   (8)| 00:00:01 |
---------------------------------------------------------------------------
Peter Lang