I'm seeing some very strange behavior in MySQL, which looks like some kind of weird bug. I know it's common to blame the tried and tested tool for one's own mistakes, but I've been going around in circles on this for a while.

I have 2 tables: I, with 2797 records, and C, with 1429. C references I.
I want to delete all records in I that are not used by C, so I'm doing:

select * from i where id not in (select id_i from c);

That returns 0 records, which, given the record counts in each table, is impossible: C has fewer rows than I, so some rows in I must be unreferenced. I'm also pretty sure that the query is right, since it's the same type of query I've been using for the last 2 hours to clean up other tables with orphaned records.

To make things even weirder...

select * from i where id in (select id_i from c);

DOES work, and brings me the 1297 records that I do NOT want to delete.
So, IN works, but NOT IN doesn't.

Even worse:

select * from i where id not in (
  select i.id from i inner join c ON i.id = c.id_i
);

That DOES work, although it should be equivalent to the first query (I'm just trying mad stuff at this point).
Alas, I can't use this query to delete, because the subquery reads from the same table I'm deleting from.

I'm assuming something in my database is corrupt at this point.

In case it matters, these are all MyISAM tables without any foreign keys whatsoever, and I've run the same queries on my dev machine and on the production server with the same result, so whatever corruption there might be survived a mysqldump / source cycle, which sounds awfully strange.

Any ideas on what could be going wrong, or, even more importantly, how I can fix/work around this?

Thanks!
Daniel

+8  A: 

I would guess that you have a row containing a NULL value. IN and NOT IN behave in a way that you might not expect when there are NULL values. From the manual:

To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.

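So when the subquery returns a NULL, IN evaluates to NULL for every row that has no match, and NOT applied to NULL is still NULL, not true as you hoped. A WHERE clause only keeps rows whose condition is true, so every such row is dropped.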
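
The same evaluation can be seen on plain literals, with no tables involved. This is only a small sketch of the rule quoted from the manual; the literal values are made up for illustration.

```sql
-- A match is found, so the NULL in the list does not matter.
SELECT 1 IN (1, 2, NULL);      -- 1
SELECT 1 NOT IN (1, 2, NULL);  -- 0

-- No match is found and the list contains NULL: the result is unknown.
SELECT 3 IN (1, 2, NULL);      -- NULL
SELECT 3 NOT IN (1, 2, NULL);  -- NULL

-- Negating an unknown value leaves it unknown.
SELECT NOT NULL;               -- NULL
```

Put differently, `3 NOT IN (1, 2, NULL)` behaves like `3 <> 1 AND 3 <> 2 AND 3 <> NULL`, and the last comparison can never be true.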

Here's a demonstration of the behaviour on a simplified example:

CREATE TABLE c (id_i INT NULL);
INSERT INTO c (id_i) VALUES
(1),
(2),
(NULL);

CREATE TABLE i (id INT NOT NULL);
INSERT INTO i (id) VALUES
(1),
(2),
(3);

SELECT * FROM i WHERE id NOT IN (SELECT id_i FROM c); -- returns nothing
SELECT * FROM i WHERE id IN (SELECT id_i FROM c);     -- returns 1,2
SELECT * FROM i WHERE id NOT IN (                     -- returns 3
  SELECT i.id FROM i INNER JOIN c ON i.id = c.id_i
);
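
One possible workaround, sketched against the sample tables above, is to filter the NULLs out of the subquery so that NOT IN only ever sees real values. The subquery reads only from c, so the same predicate should also be usable in a DELETE on i; try it on a copy of the data first.

```sql
-- Excluding NULLs from the list restores the expected result.
SELECT * FROM i
WHERE id NOT IN (SELECT id_i FROM c WHERE id_i IS NOT NULL); -- returns 3

-- The same predicate used to remove the orphaned rows.
DELETE FROM i
WHERE id NOT IN (SELECT id_i FROM c WHERE id_i IS NOT NULL);
```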
Mark Byers
Wow, I feel incredibly dumb, incredibly grateful to you, and very, VERY surprised I've never run into this behaviour in over 10 years of using SQL. Thank you so much, sir!
Daniel Magliola
@Daniel - `NOT IN (...,NULL)` will always return nothing. To understand why this is the case see this link http://www.simple-talk.com/sql/learn-sql-server/sql-and-the-snare-of-three-valued-logic/
Martin Smith
+2  A: 

I want to delete all records in I that are not used by C

For what it's worth, MySQL supports an extension to standard SQL for multi-table DELETE syntax, which you can use to get around idiosyncrasies in NOT IN predicates and NULL and subqueries.

DELETE i
FROM i LEFT OUTER JOIN c ON (i.id = c.id_i)
WHERE c.id_i IS NULL;

NB: I didn't test this, it's just from memory. Please read the manual yourself and test it on some sample data to make sure it works as you expect before using it on your real database.
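
As one way to do that testing, the same join can be run as a SELECT first to preview which rows the DELETE would remove. This is a sketch that assumes the lowercase table names i and c used in the question's queries.

```sql
-- Dry run: lists the rows of i that have no matching row in c.
-- Rows of c whose id_i is NULL never match the join condition,
-- so they do not hide any orphans here.
SELECT i.*
FROM i LEFT OUTER JOIN c ON (i.id = c.id_i)
WHERE c.id_i IS NULL;
```

If the preview looks right, replacing `SELECT i.*` with `DELETE i` gives the multi-table DELETE form.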

Bill Karwin
+1  A: 

What do you get returned if you try this query?

select * 
  from i 
 where NOT EXISTS (select id_i 
                     from c
                    where c.id_i = i.id
                  )
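
If that query returns the expected orphans, the same predicate could be used for the delete itself. This is a sketch rather than a tested statement: the subquery reads only from c and is correlated with the outer i, so it should not hit the restriction on selecting from the table being deleted from.

```sql
-- NOT EXISTS only asks whether a matching row exists,
-- so NULL values in c.id_i cannot turn the test into NULL.
DELETE FROM i
WHERE NOT EXISTS (SELECT 1
                    FROM c
                   WHERE c.id_i = i.id);
```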
Mark Baker
A: 

Correct, Mark!

EXISTS was also my first thought, because it tests for the existence of the entire row.

But I like Bill's outer join: it is the math way of doing it ...

A hint: aggregate expressions like SUM/AVG can end up as NULL (that is NULL, not 0) ;o)

So a query that runs without any error can still hand you NULL, and comparing that value directly will not work. You must check for NULL first ;o)
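
A small sketch of that hint, reusing the c table from the thread; the `WHERE 1 = 0` filter is only there to force an empty input.

```sql
-- An aggregate over zero rows is NULL, not 0.
SELECT SUM(id_i) FROM c WHERE 1 = 0;               -- NULL

-- COALESCE substitutes a default when the aggregate is NULL.
SELECT COALESCE(SUM(id_i), 0) FROM c WHERE 1 = 0;  -- 0
```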

Mike