views:

42

answers:

3

I have two queries that each return a list of node ids

SELECT node.nid 
  FROM dpf_node AS node 
 WHERE node.type = 'image' AND node.nid; 

SELECT node.nid 
  FROM dpf_node AS node, dpf_image_galleries_images AS image 
 WHERE image.image_nid = node.nid  
   AND node.type = 'image'
   AND image.gallery_nid = 138;

Both of these are working correctly

Ultimately though I want to get the list of nodes ids that are in the first list of results and not in the second, I have been using this query:

SELECT node.nid 
  FROM dpf_node AS node 
 WHERE node.type = 'image' 
   AND node.nid NOT IN (SELECT node.nid 
                          FROM dpf_node AS node, dpf_image_galleries_images AS image 
                         WHERE image.image_nid = node.nid 
                           AND node.type = 'image' 
                           AND image.gallery_nid = 138);

For a while this was working fine but just this evening it is beach balling sequal pro and causing apache to grind to a halt. I suspect(/hope) that clearing out the data and starting again will solve it but really want to fix the real problem in case it rears its head once the system is live.

+1  A: 

Change it to a NOT EXISTS query to help improve performance:

SELECT node.nid 
  FROM dpf_node AS node 
 WHERE node.type = 'image'
       AND NOT EXISTS (
         SELECT 1
         FROM dpf_node AS i_node
         JOIN dpf_image_galleries_images AS image ON i_node.nid = image.image_nod
         WHERE node.type = 'image'
         AND image.gallery_nid = 138
         AND i_node.nid = node.nid
       )

You should also verify that you have adequate indexes.

Donnie
This also causes the crashWhat do you mean by "verify that you have adequate indexes"?
msaspence
+1  A: 

MySQL might be doing rather silly stuff with correlated subqueries (look at the query plan). If you just want things from the first query that are not in the second, you might get significantly better performance from a LEFT OUTER JOIN of the two, with a filter condition that filters out rows that have nulls for the second set of results.

SquareCog
A: 

You should be able to do this with one select. Rather than selecting all the nodes you don't want just so you can exclude them, it makes more sense to just select the nodes you want in the first place. If there can be many relationships between image galleries and nodes and you only want unique node.nids then make it a select distinct.

SELECT node.nid
FROM dpf_node AS node
JOIN dpf_image_galleries_images AS image 
    ON node.nid = image.image_nod
WHERE node.type = 'image'
AND image.gallery_nid != 138

Depending on how your data is laid out and how much of it there is, this should perform much better than a where clause sub-query. And it's much easier to understand.

patrickmcgraw
This doesn't work because the join will select node.nid where that image has been attached to another galleries and the distinct will not filter these out only reduce them to one row
msaspence
I've rebuilt the data and as predicted it is working finealthough I hear what you say about trying to get it done to one selectI just cant figure out how to modify your query to prevent the problem descibed in the previous comment
msaspence