tags:

views:

86

answers:

6

Here's my query, it is fairly straightforward:

SELECT
    INVOICE_ITEMS.II_IVNUM, INVOICE_ITEMS.IIQSHP
FROM
    INVOICE_ITEMS
LEFT JOIN
    INVOICES
ON 
    INVOICES.INNUM = INVOICE_ITEMS.II_INNUM
WHERE
    INVOICES.IN_DATE
BETWEEN
    '2010-08-29' AND '2010-08-30'
;

I have very limited knowledge of SQL, but I'm trying to understand some of the concepts like subqueries and the like. I'm not looking for a redesign of this code, but rather an explanation of why it is so slow (600+ seconds on my test database) and how I can make it faster.

From my understanding, the left join is creating a virtual table and populating it with every result row from the join, meaning that it is processing every row. How would I stop the query from reading the table completely and just finding the WHERE/BETWEEN clause first, then creating a virtual table after that (if it is possible)?

How is my logic? Are there any consistently recommended resources to get me to SQL ninja status?

Edit: Thanks everyone for the quick and polite responses. Currently, I'm connecting over ODBC to a proprietary database that is used in the rapid application development framework called OMNIS. Therefore, I really have no idea what sort of optimization is being run, but I believe it is based loosely on MSSQL.

+3  A: 

I would rewrite it like this, and make sure you have an index on i.INNUM, ii.INNUM, and i.IN_DATE. The LEFT JOIN is being turned into an INNER JOIN by your WHERE clause, so I rewrote it as such:

SELECT ii.II_IVNUM, ii.IIQSHP 
FROM INVOICE_ITEMS ii
INNER JOIN INVOICES i ON i.INNUM = ii.II_INNUM 
WHERE i.IN_DATE BETWEEN '2010-08-29' AND '2010-08-30' 

Depending on what database you are using, what may be happening is all of the records from INVOICE_ITEMS are being joined (due to the LEFT JOIN), regardless of whether there is a match with INVOICE or not, and then the WHERE clause is filtering down to the ones that matched that had a date within range. By switching to an INNER JOIN, you may make the query more efficient, by only needing to apply the WHERE clause to INVOICES records that have a matching INVOICE_ITEMS record.

RedFilter
That would give differnt results, is there a reason why he needed the left join?
HLGEM
@HLGEM, see my additional info above.
RedFilter
Thanks for the clear and comprehensive answer. Between yours and HLGEM's, I've had a very productive day so far! (Upvoted both as well)
melee
+3  A: 

SInce that is a very basic query the optimizer should do fine with it, likely your problem would be incorrect indexing. DO you haveindexes on the In_date field and INVOICE_ITEMS.II_INNUM field? If you have properly set up PK Fk relationships, INVOICES.INNUM should already be indexed but FKs are not indexed automatically.

HLGEM
So is the query then structurally correct? To be honest, I have no idea what kind of optimization/DBMS is running, as I'm connecting over ODBC to an OMNIS database. It's a temporary thing.
melee
@melee: Yes, the query looks OK. Try creating a combined index on INVOICES on the columns (IN_DATE, INNUM).
FrustratedWithFormsDesigner
+3  A: 

The experts will chime in with a lot more useful answers, but in the meantime you should read up on indexes.

Jay
+1  A: 

well there is no reason why this query is slow... the only thing that comes to mind is, do you have indexes on INVOICES.INNUM = INVOICE_ITEMS.II_INNUM? if you add them it could speed up the select but it would slow down updates/inserts...

Petoj
+1  A: 

A join doesn't create a "virtual table" on anything more than just a conceptual level.

The performance issue with your query most likely lies in poor or insufficient indexing. You should have indexes on:

INVOICE_ITEMS.II_INNUM
INVOICES.IN_DATE

You should also have an index on INVOICES.INNUM, but if that's the primary key of the table then it already has one.

Also, don't use a left join here. If there's a foreign key between INVOICE_ITEMS.II_INNUM and INVOICES.INNUM (and INVOICE_ITEMS.II_INNUM is not nullable), then you'll never encounter a record in INVOICE_ITEMS that won't match up to a record in INVOICES. Even if there were, your WHERE condition is using a value from INVOICES, so you'd eliminate any unmatched rows anyway. Just use a regular JOIN.

Adam Robinson
+2  A: 

Your query is fine, it's the indexes you have to look at.

Are INVOICES.INNUM and INVOICE_ITEMS.II_INNUM indexed?

If not SQL has to do something called a 'scan' - it searches every single record.

You can think of indexes as like the tabs on the side of a phone book - you know where to start looking for people based on the first letters of their surname. Without an index (say you want to look for names that end in '...son') you have to search the entire book.

There are different types of index - they can be ordered (like the phone book index - all ordered by surname) or not (like the index at the back of a book - there's an overhead in finding the index and then the actual page).

You should also be able to view the query plan - this is how the server executes the SQL statement. That can tell you all sorts of more advanced stuff - for instance there are multiple ways to do the job: a merge join is possible if both tables are sorted by the join field or a nested join will loop through the smaller table for every record in the larger table.

Keith
Thanks for the information, much appreciated.
melee