ansaurus

Question

Answer 1

+1 A:

This will populate NEW_TABLE with unique values, and the id value is the first id of the bunch:

INSERT INTO NEW_TABLE
  SELECT MIN(ot.id),
         ot.city,
         ot.post_code,
         ot.short_ccode
    FROM OLD_TABLE ot
GROUP BY ot.city, ot.post_code, ot.short_ccode

If you want the highest id value per bunch:

INSERT INTO NEW_TABLE
  SELECT MAX(ot.id),
         ot.city,
         ot.post_code,
         ot.short_ccode
    FROM OLD_TABLE ot
GROUP BY ot.city, ot.post_code, ot.short_ccode

OMG Ponies 2010-09-19 17:25:02

+1 This is what was asked for

Unreason 2010-09-20 11:49:57

Answer 2

A:

Hello there, my friend posted this question (we are both in that trouble :) ), it is not problem in syntax it is problem in getting this to work. It is a large amount of data which have to be grouped. I suppose some my.ini settings should be modified? :-) Or to make it somehow step by step? :) , thanx for help. Srdan from Slovenia.

Srki 2010-09-19 17:37:09

Answer 3

A:

You don't need to group data. Try this:

 delete from old_table
    USING old_table, old_table as vtable  
    WHERE (old_table.id > vtable.id)  
    AND (old_table.city=vtable.city AND 
old_table.post_code=vtable.post_code 
AND old_table.short_code=vtable.short_code)

I can't comment posts becouse of my points ... repair table old_table; next: show:

EXPLAIN SELECT old_table.id FROM   old_table, old_table as vtable  
        WHERE (old_table.id > vtable.id)  
        AND (old_table.city=vtable.city AND 
    old_table.post_code=vtable.post_code 
    AND old_table.short_code=vtable.short_code

Show: os~> ulimit -a; mysql>SHOW VARIABLES LIKE 'open_files_limit';

next: Remove all os restrictions form the mysql process.

ulimit -n 1024 etc.

iddqd 2010-09-19 17:46:18

Answer 4

A:

To avoid the memory issue, avoid the big select by having a small external program, using the logic as below. First, backup your database. Then:

do {
# find a record
x=sql: select * from table1 limit 1;
if (null x)
then
 exit # no more data in table1
fi
insert x into table2

# find the value of the field that should NOT be duplicated
a=parse(x for table1.a)
# delete all such entries from table1
sql: delete * from table1 where a='$a';

}

Beel 2010-09-19 17:55:36

Answer 5

A:

hello, this query is quite nice but not working for me. I have a fresh table, repaired and optimized, have tried to run this query:

[SQL] DELETE FROM jos_world_cities USING jos_world_cities, jos_world_cities AS vtable WHERE (jos_world_cities.id > vtable.id) AND( jos_world_cities.city = vtable.city AND jos_world_cities.post_code=vtable.post_code AND jos_world_cities.short_ccode = vtable.short_ccode )

[Err] 1030 - Got error 134 from storage engine

but got the error above?

br, Srdan

Srki 2010-09-19 18:07:28

Online sources state that this is a corruption error that needs to be fixed with REPAIR TABLE of myisamchk (read and follow http://forums.devshed.com/mysql-help-4/got-error-134-from-storage-engine-error-number-1030t-446448.html)

Unreason 2010-09-20 12:12:25

Answer 6

A:

@Srki, the problem with the DELETE FROM query is still that the query is too big, so please see my earlier post to avoid having too large of a query.

-134 ISAM error: no more locks.

The ISAM processor needs to lock a row or an index page, but no locks are available. The number of locks that an operation requires depends primarily on the number of rows that a single transaction modifies. You can reduce the number of locks that an operation needs by doing less in each transaction or by locking entire tables instead of locking rows. Depending on the implementation that you are using, the number of locks that is available is configured in one of three places: the operating-system kernel, the shared-memory segment, or the database server. Consult your database server administrator about making more locks available.

Beel 2010-09-19 18:58:35

Answer 7

A:

A bit dirty maybe, but it has done the trick for me the few times that I've needed it: Remove duplicate entries in MySQL.

Basically, you simply create a unique index consisting of all the columns that you wan't to be unique in the table.

As always before this kind of procedures, a backup before proceeding is recommended.

nikc 2010-09-20 10:19:38

Answer 8

A:

MySQL has a INSERT IGNORE. From the docs:

[...] however, when INSERT IGNORE is used, the insert operation fails silently for the row containing the unmatched value, but any rows that are matched are inserted.

So you could use your query from above b just adding a IGNORE

DrColossos 2010-09-20 10:34:32

Answer 9

A:

hello there, @Beel, we have tested that solution with c#. But there is a problem with speed. 1 record per second is too slow if I say that we have 7mio records...am i missing something maybe? :-) this is the code:

           povezava = "SERVER=" + textBox1.Text + ";" + "DATABASE=" +
            textBox2.Text + ";" + "UID=" + textBox3.Text + ";" + "PASSWORD=" + textBox4.Text + ";";
            conn = new MySqlConnection(povezava);
            String[] vrstica = new String[3];
            try
            {

                string sqlStavk = "SELECT * FROM jos_world_cities";
                bool konec = false;
                conn.Open();
                while (!konec)
                {
                    MySqlCommand cmd = new MySqlCommand(sqlStavk, conn);
                    MySqlDataReader dataReader = cmd.ExecuteReader();

                    konec = true;
                    while (dataReader.Read())
                    {
                        konec = false;
                        vrstica[0]=(string)dataReader["city"];
                        vrstica[1]=(string)dataReader["post_code"];
                        vrstica[2]=(string)dataReader["short_ccode"];
                    }
                    dataReader.Close();
                    string qu = "INSERT INTO jos_world_cities_uniq (city, post_code, short_ccode) VALUES ('" +vrstica[0]+ "', '" +vrstica[1]+ "', '" +vrstica[2]+ "')";
                    cmd = new MySqlCommand(qu,conn);
                    cmd.ExecuteNonQuery();


                    qu = "DELETE FROM jos_world_cities WHERE short_ccode LIKE '"+vrstica[2]+"'";
                    cmd = new MySqlCommand(qu,conn);
                    cmd.ExecuteNonQuery(); 

                }
               conn.Close();

            }
            catch (Exception exc)
            {
                MessageBox.Show("Napaka! : " + exc.Message);
            }

Srki 2010-09-20 11:44:23

@nikc, hey this seems promising...already testing, will take a while...to post the results.@iddqd is giving us the right thing also i think...just have to remove limitations on the server as he say...will try both, what will work better...@Unreason :-) I can run the queries but because of large amount of data it would take ages or it just does not finish the job...the table is not corrupted...indexes are fine, table is too big...

Srki 2010-09-20 12:55:36

Answer 10

A:

hello there, only @nikcs solution worked fine :-), thanks to all. It would possible work also @iddqds solution, but have to admit that I did not know how to set open_files_limit variable...tried in my.ini file but MySQL just did not want to listen :))) great, now I have my wanted data. Thanks, this website is great, everyone wants to help :) thanks to everyone, who helped me in this one :), BR Srdan

Srki 2010-09-20 16:58:21

ansaurus

tags:

views:

answers:

Remove duplicates in large MySql table

related questions