ansaurus

Question

Optimizing MySQL queries with IN operator

Answer 1

+2 A:

Index on categoryId won't help in this case, IN (...) queries will produce sequence scan instead of index lookup anyway.

I would consider first redesigning the system to get rid of multiple category select and if it is not appropriate, caching query results.

For example, you can create a helper table items_category_groups(hash, item_id) and after client query on multiple categories hash their combined ids and lookup this table. If not found, make an expensive query and fill this table. If found, make a cheap query joining these tables. Other caching tools like memcached will work too.

Andrey 2010-03-09 10:31:25

good idea but this query is used to pulls out product for given category and its subcategories (whole branch of this category) so it is imposible

Arkadiusz Kondas 2010-03-09 10:35:42

Then caching is the only way I see, although the cache size will be large (250k * count of category combinations), so I would store them in DB, not in memcache or something.

Andrey 2010-03-09 11:06:30

If you just want to pull out products from ONE category (and its subcategories), you can create a helper table (category_id, item_id) and fill it with all category_id-item_id pairs including subcategories. This table won't be very large (say, if you have category nesing level of 4, it will contain not more than 1m*number of categories two-int rows, it's not a big deal). Then you will fetch products with a single fast query using indexes.

Andrey 2010-03-09 11:13:33

but this is the same. I will also must use IN on the helper table

Arkadiusz Kondas 2010-03-09 11:29:40

No.For example, you have category 1 with subcategory 2 and 2 has a subcategory 3.Item 1 belongs to category 3.Then you will have 3 rows for one item in this table:1 1 |1 2 |1 3And if you want to select all items from, say, category 2, you write, roughly:SELECT items.* FROM items i, item_categories ic WHERE i.id=ic.item_id AND ic.category_id=2;And it will be equal toSELECT * FROM items WHERE category_id IN (2, 3);Of course, if you want to select items from two not related categories, this won't work, but in case of nested categories it will work just fine.

Andrey 2010-03-09 11:55:43

thx for anserw and example i will try and check results

Arkadiusz Kondas 2010-03-09 12:37:39

after my reflection: when i want add product ,for example, to category 3 i must put in item_categories 3 rows ? 1 1 | 1 2 | 1 3 ?If yes the save proccess will take longer and i must generate tree for each save.

Arkadiusz Kondas 2010-03-09 12:45:30

sure, but this is just one time on save, and even these 3 queries may work faster than one select with IN(). If you have much more selects than inserts, I guess this is worth it.

Andrey 2010-03-09 13:30:07

ansaurus

tags:

views:

answers:

Optimizing MySQL queries with IN operator

related questions