Guys, I have two basic tables (id
, first_name
).
I want to run table 1 against table 2, delete duplicates, and spit out the cleansed version of table 1.
What's the easiest way to do this with PHP/MySQL?
Thanks.
Guys, I have two basic tables (id
, first_name
).
I want to run table 1 against table 2, delete duplicates, and spit out the cleansed version of table 1.
What's the easiest way to do this with PHP/MySQL?
Thanks.
Step 1 :
Create table t3 as (
select t1.*
from table1 t1 left outer join table2 t2 on t1.id = t2.id and t1.first_name = t2.first_name
where t2.id is null
)
Step 2 :
Delete table1, table2
Step 3 :
Rename table3 in table1
I don't know if it is the easiest way, but it is the fastest if you have big tables.
This will delete all records from t1
that also exist in t2
, leaving you with a stripped-down t1
, but possibly with records in t2
that don't exist in t1
:
DELETE FROM table1 t1
WHERE t1.id IN
(SELECT id from table2 t2
WHERE t2.id = t1.id AND t2.first_name = t1.first_name
)
*You may want to consider using EXISTS
instead of IN
, as per Brian Hooper's suggestion.
This will combine the two tables into a third table (t3
), removing duplicates along the way:
SELECT * INTO t3
FROM t1 UNION SELECT * FROM t2
That will work for SQL Server (definitely) and MySQL (I think) but MySQL supports CREATE TABLE table_name AS select-statement
, so you could use:
CREATE TABLE t3 AS
(SELECT * FROM t1 UNION DISTINCT SELECT * FROM t2)
*The DISTINCT
keyword is optional - it's the default behaviour
DELETE FROM table1 t1
WHERE EXISTS (SELECT *
FROM table2 t2
WHERE t2.id = t1.id AND
t2.first_name = t1.first_name);
I'd prefer the exists to IN which I have had trouble with in the past, with it taking a long time.
Why not use
DELETE FROM table1 WHERE first_name IN (SELECT first_name from table2)
? Correct me if I missed something.
This is without using subquery (not tested):
DELETE t1
FROM table1 t1
JOIN table2 t2
ON t2.id = t1.id
AND t2.first_name = t1.first_name