views:

274

answers:

10

I have a database which is 6GB in size, with a multitude of tables however smaller queries seem to have the most problems, and want to know what can be done to optimise them for example there is a Stock, Items and Order Table.
The Stock table is the items in stock this has around 100,000 records within with 25 fields storing ProductCode, Price and other stock specific data.
The Items table stores the information about the items there are over 2,000,000 of these with over 50 fields storing Item Names and other details about the item or product in question.
The Orders table stores the Orders of Stock Items, which is the when the order was placed plus the price sold for and has around 50,000 records.

Here is a query from this Database:

SELECT Stock.SKU, Items.Name, Stock.ProductCode FROM Stock
INNER JOIN Order ON Order.OrderID = Stock.OrderID
INNER JOIN Items ON Stock.ProductCode = Items.ProductCode
WHERE (Stock.Status = 1 OR Stock.Status = 2) AND Order.Customer = 12345
ORDER BY Order.OrderDate DESC;

Given the information here what could be done to improve this query, there are others like this, what alternatives are there. The nature of the data and the database cannot be detailed further however, so if general optmisation tricks and methods are given these will be fine, or anything which applies generally to databases.
The Database is MS SQL 2000 on Windows Server 2003 with the latest service packs for each. DB Upgrade / OS Upgrade are not options for now.


Edit

Indices are Stock.SKU, Items.ProductCode and Orders.OrderID on the tables mentioned.
Execution plan is 13-16 seconds for a Query like this 75% time spent in Stock


Thanks for all the responses so far - Indexing seems to be the problem, all the different examples given have been helpful - dispite a few mistakes with the query, but this has helped me a lot some of these queries have run quicker but combined with the index suggestions I think I might be on the right path now - thanks for the quick responses - has really helped me and made me consider things I did not think or know about before!


Indexes ARE my problem added one to the Foriegn Key with Orders (Customer) and this has improved performance by halfing execution time!
Looks like I got tunnel vision and focused on the query - I have been working with DBs for a couple of years now, but this has been very helpful. However thanks for all the query examples they are combinations and features I had not considered may be useful too!

+2  A: 

The most important (if not already done): define your primary keys for the tables (if not already defined) and add indexes for the foreign keys and for the columns you are using in the joins.

Cătălin Pitiș
Thanks for suggesting the Indexing thing first - this has made the biggest difference to performance, however all the queries mentioned have also helped too!
RoguePlanetoid
A: 

Some general pointers

  • Are all of the fields that you are joining on indexed?

  • Is the ORDER BY necessary?

  • What does the execution plan look like?

BTW, you don't seem to be referencing the Order table in the question query example.

Russ Cam
Sorry was a couple of mistakes in the query, but all suggestions have been helpful non-the less a few things I can try and use, dispite the mistakes in transcibing the query - responses have been helpful, thanks!
RoguePlanetoid
+3  A: 

is your code correct??? I'm sure you're missing something

INNER JOIN Batch ON Order.OrderID = Orders.OrderID

and you have a ) in the code ...


you can always test some variants against the execution plan tool, like

SELECT 
    s.SKU, i.Name, s.ProductCode 
FROM 
    Stock s, Orders o, Batch b, Items i
WHERE 
    b.OrderID = o.OrderID AND
    s.ProductCode = i.ProductCode AND
    s.Status IN (1, 2) AND 
    o.Customer = 12345
ORDER BY 
    o.OrderDate DESC;

and you should return just a fraction, like TOP 10... it will take some milliseconds to just choose the TOP 10 but you will save plenty of time when binding it to your application.

balexandre
Saw it also, but posted my answer before.
Stefan Steinegger
RoguePlanetoid
not just that, your saying that the Order.OrderID has to be equal than the same Order.OrderID in the INNER JOIN...
balexandre
+2  A: 

Did you specify indexes? On

  • Items.ProductCode
  • Stock.ProductCode
  • Orders.OrderID
  • Orders.Customer

Sometimes, IN could be faster than OR, but this is not as important as having indexes.

See balexandre answer, you query looks wrong.

Stefan Steinegger
I have indices but only the Primary Key on each table as each of these has a unique ID (SKU, Product Code and Order ID)
RoguePlanetoid
Here is your problem - you need to have indices on your foreign keys too. Whether the key is unique or not has no bearing on the requirement of an index (obviously, this won't be a UNIQUE index).
BrynJ
A: 

Table index will certainly help as Cătălin Pitiș suggested.

Another trick is to reduce the size of the join rows by either use sub select or to be more extreme use temp tables. For example rather than join on the whole Orders table, join on

(SELECT * FROM Orders WHERE Customer = 12345)

also, don't join directly on Stock table join on

(SELECT * FROM Stock WHERE Status = 1 OR Status = 2)
oykuo
The database server should be clever enough to perform this optimisation (pushing the filter predicates as far down as possible) by itself, though.
Thilo
trust me it doesn't. In one of my project I did exactly what RoguePlanetoid, the query looks more elegant but it takes almost a minute to run. After I start using sub selects and temp tables it now comes down to 5 seconds.
oykuo
A: 

Setting the correct indexes on the tables is usually what makes the biggest difference for performance.

In Management Studio (or Query Analyzer for earlier versions) you can choose to view the execution plan of the query when you run it. In the execution plan you can see what the database is really doing to get the result, and what parts takes the most work. There are some things to look for there, like table scans, that usually is the most costly part of a query.

The primary key of a table normally has an index, but you should verify that it's actually so. Then you probably need indexes on the fields that you use to look up records, and fields that you use for sorting.

Once you have added an index, you can rerun the query and see in the execution plan if it's actually using the index. (You may need to wait a while after creating the index for the database to build the index before it can use it.)

Guffa
Thanks for the suggestion - never considered a Primary Key might not have an Index associated with it - in this case they do but that could have been an issue I was unaware of!
RoguePlanetoid
A: 

Put your code into Query Analyser and then show the Execution Plan. You can use the execution plan to idenfity potential bottlenecks/problems in your query. Look for the likes of table scans showing up in there, as thats where most of the time could be getting lost in your query.

Also it would be useful to attach the Execution Plan to this question for us to take a look at. I've attached a link below going over it.

Query Execution Plan

kevchadders
A: 

Could you give it a go?

SELECT Stock.SKU, Items.Name, Stock.ProductCode FROM Stock
INNER JOIN Order ON Order.OrderID = Stock.OrderID AND (Order.Customer = 12345) AND (Stock.Status = 1 OR Stock.Status = 2))
INNER JOIN Items ON Stock.ProductCode = Items.ProductCode
ORDER BY Order.OrderDate DESC;
Tudor Olariu
Interesting, this seems to execute the same way - but all the examples people have gave have showed things I have not tried before will apply these ideas to my other queries. I think Indexing is the issue as this appears to cause to most slow down especially my Stock table.
RoguePlanetoid
A: 

Elaborating on what Cătălin Pitiș said already: in your query

SELECT Stock.SKU, Items.Name, Stock.ProductCode
    FROM Stock
      INNER JOIN Order ON Order.OrderID = Stock.OrderID
      INNER JOIN Items ON Stock.ProductCode = Items.ProductCode
  WHERE (Stock.Status = 1 OR Stock.Status = 2) AND Order.Customer = 12345
  ORDER BY Order.OrderDate DESC;

the criterion Order.Customer = 12345 looks very specific, whereas (Stock.Status = 1 OR Stock.Status = 2) sounds unspecific. If this is correct, an efficient query consists of

1) first finding the orders belonging to a specific customer,

2) then finding the corresponding rows of Stock (with same OrderID) filtering out those with Status in (1, 2),

3) and finally finding the items with the same ProductCode as the rows of Stock in 2)

For 1) you need an index on Customer for the table Order, for 2) an index on OrderID for the table Stock and for 3) an index on ProductCode for the table Items.

As long your query does not become much more complicated (like being a subquery in a bigger query, or that Stock, Order and Items are only views, not tables), the query optimizer should be able to find this plan already from your query. Otherwise, you'll have to do what kuoson is suggesting (but the 2nd suggestion does not help, if Status in (1, 2) is not very specific and/or Status is not indexed on the table Status). But also remember that keeping indexes up-to-date costs performance if you do many inserts/updates on the table.

A: 

To shorten my answer I gave 2 hours ago (when my cookies where switched off):

You need three indexes: Customer for table Order, OrderID for Stock and ProductCode for Items.

If you miss any of these, you'll have to wait for a complete table scan on the according table.