ansaurus

Question

Answer 1

A:

Have you tried:

SELECT COUNT(DISTINCT workcode) FROM worktag WHERE tagId IN (123, 124)

?

I'm not a MySQL expert, but it looks to me like you might be looking at a significant failure of the query optimizer.

On the other had, good for MySQL that it optimizes the OR in the second statement. I know databases that will successfully optimize the IN (), but not the OR version of the same logical request.

Larry Lustig 2009-10-09 20:22:14

Answer 2

+2 A:

Why don't you use a join instead of a subquery?

SELECT COUNT(DISTINCT workcode)
FROM worktag
LEFT JOIN tag
  ON worktag.tagId = tag.tagID
WHERE tag.name = 'programming'

P.S.: Seems to be reported as bug.

Zed 2009-10-09 20:22:59

My coworker just came over and gave me the exact same solution. According to him, mysql can more easily optimize a JOIN as apposed to a subquery. I don't like the statement as much (subquery looks nicer to me), but it works great. Thanks.

chadgh 2009-10-09 20:30:10

The sub-query is more imperative: you explicitly ask for selecting the IDs first, and then select the count based on the previous result. In my opinion this gives less freedom to the optimizer. (Your problem still seems to be a bug though, and not an EBKAC! :)

Zed 2009-10-09 20:35:51

Answer 3

+1 A:

A database admin told me recently, that the syntax WHERE x IN ( ... ) is a pain for the database. A join is almost always better:

SELECT COUNT(DISTINCT wt.workcode) 
  FROM worktag wt, tag t 
 WHERE wt.tagId = t.tagId 
   AND t.name='programming'

tangens 2009-10-09 20:23:17

It doesn't have to be, and at least with Oracle and Sql server, it's a myth. The OP has hit some kind of bad behavior in MySQL. Your point might be correct for MySQL, there's NOTHING inherently slow with IN.

erikkallen 2009-10-09 20:33:56

You need to tell your database admin that that statement is highly product dependent. Plenty of databases do better with IN than with either JOINs or OR. I can't see any particular reason why IN () would be a pain for a database, it seems to naturally offer plenty of opportunity for hitting an index.

Larry Lustig 2009-10-09 22:15:55

Answer 4

+1 A:

SELECT COUNT(DISTINCT workcode) 
FROM worktag 
inner join tag on worktag.tagid = tag.tagid
WHERE tag.name='programming'

Gratzy 2009-10-09 20:23:35

Answer 5

A:

I guess the optimizer does some bad guess. Replacing the query with an inner join might help.

erikkallen 2009-10-09 20:31:06

Answer 6

A:

MySQL generally doesn't do so well with subqueries, even independent ones. The posters who discussed joins are right - if you've got a choice, use a join. If you can't easily use a join (ie, foo.x in (select y from bar where y = xxx limit 10)), you're better off running the limit into a temporary IN MEMORY table and using a join on it.

If you're using MySQL a lot, use EXPLAIN and you'll see how it's using your indexes and such things.

2009-10-10 00:26:46

ansaurus

tags:

views:

answers:

MySQL performance using IN predicate

related questions