views:

69

answers:

3

Hey guys, trying to optimize this query to solve a duplicate user issue:

SELECT userid, 'ismaster' AS name, 'false' AS propvalue FROM user 
WHERE userid NOT IN (SELECT userid FROM userprop WHERE name = 'ismaster');

The problem is that the select after the NOT IN is 120.000 records and it's taking forever.

Using the explain prefix as suggested in the comments returns:

                                    QUERY PLAN

--------------------------------------------------------------------------------
--
 Seq Scan on user  (cost=5559.38..122738966.99 rows=61597 width=8)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Materialize  (cost=5559.38..7248.33 rows=121395 width=8)
           ->  Seq Scan on userprop  (cost=0.00..4962.99 rows=121395 width=8
)
                 Filter: ((name)::text = 'ismaster'::text)
(6 rows)

Any suggestion?

A: 

Did you put index on userid?

Or try another variation:

SELECT userid, 'ismaster' AS name, 'false' AS propvalue FROM user 
WHERE NOT EXISTS 
 (SELECT * FROM userprop 
 WHERE userpop.userid = user.userid 
   AND name = 'ismaster');
Michael Buen
Thx! Very fast and works, I wonder why though. :P
hdx
A: 

Is the name column indexed? How selective is the name value? Also anytime you want to have someone recommend changes to a query provide the query plan, even on what appears to be a simple query. That way we really know what the planner is doing.

StarShip3000
Sorry man, this is the first time I hear about a "Query Plan" I'm gonna learn more about it now. Thanks for teaching me something new :P
hdx
Check this out for starters http://www.postgresql.org/docs/8.4/interactive/sql-explain.html
StarShip3000
A: 

According to this answer, using a LEFT JOIN ... IS NULL might be either faster or slower than NOT EXISTS, depending on the RDBMS, though they're equivalent on PostGres.

SELECT u.userid, 'ismaster' AS name, 'false' AS propvalue FROM user u
LEFT JOIN userprop up ON u.userid = up.userid AND up.name <> 'ismaster'
WHERE up.userid IS NULL
aib