views:

117

answers:

3
SELECT a.*, b.* 
FROM a
    LEFT OUTER JOIN b 
        ON b.user IN (:userlist) 
        AND b.key = a.fk_to_b
WHERE 
a.user IN (:userlist) 
OR b.user IN (:userlist)
  • Table b has an index of: (user, key)

The database only uses the index when the :userlist parameter contains a single value. When :users contains multiple values (which internally expands to multiple OR statements?) the index is not used and a table scan (of b) is performed.

Why isn't the database using the index when multiple :userlist values are supplied?

Does anyone know of a more optimal version of this query?

+1  A: 

This query will work in all major systems and probably will be more efficient:

SELECT  a.*, NULL
FROM    a
WHERE   a.user IN (:userlist)
        AND a.fk_to_b NOT IN
        (
        SELECT  key
        FROM    b
        )
UNION ALL
SELECT  a.*, b.id
FROM    a
JOIN    b
ON      b.key = a.fk_to_b
WHERE   b.user IN (:userlist)

Could you please tell which RDBMS do you use?

Quassnoi
Multiple DBs - mainly Sybase, Ingres and Firebird.Interesting answer!Unfortunately, we're using compound primary keys so we can only implement the NOT IN using string concatenation (e.g. a.x + a.y NOT IN (SELECT x + y FROM b). Correct?Also I tested the lower section and it ran slower than my current query on Ingres.
Starfield
A: 

The quick answer is: It depends.

If you specify multiple values in :userlist, then the database server may choose to optimise the query in a different way, it may choose a full table scan for instance.

Most of the time, the best option is to see how the query is optimised, by doing

  1. EXPLAIN PLAN in Oracle
  2. Show Execution Plan in SQL server.

To help you more, we really need to know which database you're using.

MatthieuF
A: 

IN (:userlist) expands to multiple OR statements.
The query optimizer disregards OR lines/clauses.
Here's what to do if the DB is Oracle:

CREATE TABLE userListTable  
(  
   sessionId NUMBER(9),  
   user      NUMBER(9)  
);  

CREATE INDEX userListTableMulti1 ON userListTable(sessionId,user);

...

CREATE OR REPLACE FUNCTION fn_getUserList(parmUserList VARCHAR2)  
   RETURN NUMBER DETERMINISTIC  
   varUser      NUMBER(9);  
   varSessionId NUMBER(9);  
BEGIN  
   varSessionId := sys_context('USERENV','SESSIONID');  

   -- You have to work on a VARCHAR2TOLIST() function  
   FOR varUser IN VARCHAR2TOLIST(parmUserList) LOOP  
      INSERT INTO userListTable(sessionId,user)  
      VALUES(varSessionId, varUser)  
   END LOOP;  

   INSERT INTO resultsTable  
      SELECT  
         varSessionId as sessionId ,  
         a.*                       ,  
         b.*  
      FROM  
         (SELECT a.*  
          FROM a  
             INNER JOIN userListTable  
             ON a.user = userListTable.user AND  
                userListTable.sessionId = varSessionId) a  
         LEFT OUTER JOIN (SELECT b.*  
                          FROM b  
                             INNER JOIN userListTable  
                             ON b.user = userListTable.user AND  
                                userListTable.sessionId = varSessionId) b  
         ON b.key = a.fk_to_b;  

   RETURN varSessionId;  
END;  
/

...

// C Client side  
int   varSessionId;  
char* parmUserList;  
char* sqlStr;  

...  

sqlStr = (char*)malloc( strlen(parmUserList) + 17 ) ;  
sprintf(sqlStr,"fn_getUserList(%s)", parmUserList);  

// EXEC_SQL_FUNC_C_MACRO  
// EXEC_SQL_RETURN_QUERY_RESULTS_C_MACRO  
// EXEC_SQL_C_MACRO  
// are all based on the database API C libraries  

// Run the function for this session  
varSessionId = EXEC_SQL_FUNC_C_MACRO(sqlStr);  
free(sqlStr);  

// Get the results  
sqlStr = (char*)malloc(128);  
sprintf(  
   sqlStr,  
   "SELECT * "  
   "FROM resultsTable "  
   "WHERE sessionId=%s",  
   varSessionId);  
EXEC_SQL_RETURN_QUERY_RESULTS_C_MACRO(sqlStr);  
free(sqlStr);  

...  

// Clean up the resultsTable for this session  
sqlStr = (char*)malloc(128);  
sprintf(  
   sqlStr,  
   "DELETE "  
   "FROM resultsTable "  
   "WHERE sessionId=%s",  
   varSessionId);  
EXEC_SQL_C_MACRO(sqlStr);  
free(sqlStr);  

// Clean up the userListTable for this session  
sqlStr = (char*)malloc(128);  
sprintf(  
   sqlStr,  
   "DELETE "  
   "FROM userListTable "  
   "WHERE sessionId=%s",  
   varSessionId);  
EXEC_SQL_C_MACRO(sqlStr);  
free(sqlStr);
Dragos Toader
You would also need to do periodic sweeps of the resultsTable and userListTable to clean up any data left over from interrupted client side app sessions. Point being that using a sessionId to keep track of transactions frees you to use the DB fully without having to bend over backwards to get data out using complicated SQL. You'll also find that doing it this way makes everything easier to maintain in the long run.
Dragos Toader