views:

84

answers:

3

I have a mysql table with 9 million records that doesn't have any indices set. I need to join this to another table based on a common ID. I'm going to add an index to this ID, but I also have other fields in the select and where clause.

Should I add an index to all of the fields in the where clause?

What about the fields in the select clause? Should I create one index for all fields, or an index per field?

Update - Added tables and query

Here is the query - I need to get the number of sales, item name, and item ID by item based on the store name and store ID (the store name and ID by themselves are not unique)

SELECT COUNT(*) as salescount, items.itemName, CONCAT(items.ID, items.productcode) as itemId 
FROM items JOIN sales ON items.itemId = sales.itemId WHERE items.StoreName = ? 
AND sales.storeID = ? GROUP BY items.ItemId ORDER BY salescount DESC LIMIT 10;

Here is the sales table:

+----------------+------------------------------+------+-----+---------+-------+
| Field          | Type                         | Null | Key | Default | Extra |
+----------------+------------------------------+------+-----+---------+-------+
| StoreId        | bigint(20) unsigned          | NO   |     | NULL    |       |
| ItemId         | bigint(20) unsigned          | NO   |     | NULL    |       |
+----------------+------------------------------+------+-----+---------+-------+

and the items table:

+--------------------+------------------------------+------+-----+---------+-------+
| Field              | Type                         | Null | Key | Default | Extra |
+--------------------+------------------------------+------+-----+---------+-------+
| ItemId             | bigint(20) unsigned          | NO   | PRI | NULL    |       |
| ProductCode        | bigint(20) unsigned          | NO   |     | NULL    |       |
| ItemName           | varchar(100)                 | NO   |     | NULL    |       |
| StoreName          | varchar(100)                 | NO   | PRI | NULL    |       |
+--------------------+------------------------------+------+-----+---------+-------+
+1  A: 

You should index all fields that will be searched for in the leading table in the WHERE clause and in the driven table in the WHERE and JOIN clauses.

Making the indexes to cover all fields used in the query (including SELECT and ORDER BY clauses) will also help, since no table lookups will be needed.

Just post your query here and I'll probably be able to tell you how to index the tables.

Update:

Your query will return at most 1 row with 1 as a COUNT(*)

This will select the sale with the given StoreID (which is the PRIMARY KEY), and join the items on the sale's itemId and given StoreName (this combination is a PRIMARY KEY too).

This join either succeeds (returning 1 row) or fails (returning no rows).

If it succeeds, the COUNT(*) will be 1.

If it's really what you want, then your table is indexed fine.

However, it seems to me that your table design is a little more complex and you just missed some fields when copying the field definitions.

Update 2:

  1. Create a composite index on sales (storeId, itemId)

  2. Make sure that you PRIMARY KEY on items is defined as (StoreName, ItemId) (in that order).

    If the PK is defined as (ItemID, StoreName), the create an index on items (StoreName, ItemID).

Quassnoi
This answer seems to general to me - as you know, `WHERE LastName like '%Smith%'` or `WHERE SUBSTRING(ID, 2, 1) = '2'` will not benefit from indexes at all.
RedFilter
It's hard to give more specific answer given the lack of information. But even with these conditions, a query can still benefit from the indexes. If the table records are large, then full index scan can be used to search for the records instead of the full table scan. `MySQL`, though, does not support late row lookups, so a query rewrite may be required.
Quassnoi
`@Orbman`: and the first query, given that `Smith` is capitalized, will benefit from a `FULLTEXT` index, being rewritten as `MATCH(LastName) AGAINST ('+Smith*' IN BOOLEAN MODE)`. This, of course, will fail on `McSmiths` and similar surnames :)
Quassnoi
I added the table and query
George
The sales StoreId is not a primary key - I removed that from the table. This query is returning multiple rows for me, it is just very slow.
George
A: 

Indexing is great -- when used in the correct form. Remember, indexes must be indexed.

Concentrate your indexes on your primary, shared keys, as well as fields which require heavy and common data comparisons, such as literal fields and date ranges.

Indexes are great when used correctly, but indexes arn't a cure-all problem. Even properly indexed tables can be brought to their knees with a bad query and a flick of the wrist.

George
A: 

Yes, you really should have indexes, but they should be appropriate for all your queries. Without having a good rummage about in your database its difficult to recommend exactly what indexes to configured.

9 milion rows is enough that indexes will make a big difference - but not so big that you can't afford to tinker a bit.

A crude solution would be to create indexes on items(storeid),items(itemid,storename), items(storename,itemid), sales(itemid),sales(storeid),sales(itemid,storeid) and sales(storeid,itemid) then drop the indexes that aren't getting used.

C.

symcbean