views:

195

answers:

7

Hey!

I am trying to find the fastest way to insert data into a table (data from a select) I always clear the table:

TRUNCATE TABLE table;

Then I do this to insert the data:

INSERT INTO table(id,total) (SELECT id, COUNT(id) AS Total FROM table2 GROUP BY id);

Someone told me I shouldn't do this. He said this would be much faster:

CREATE TABLE IF NOT EXISTS table (PRIMARY KEY (inskey)) SELECT id, count(id) AS total FROM table2 GROUP BY id

Any ideas on this one? I think my solution is cleaner, because I don't have to check for the table. This will be ran in a cron job a few times a day

EDIT: I wasn't clear. The truncate is always ran. It's just the matter of the fastest why to insert all the data

+2  A: 

I also think your solution is cleaner, plus the solution by "someone" looks to me to have some problems:

  • it does not actually delete old data that may be in the table
  • create table...select will create table columns with types based on what the select returns. That means changes in the table structure of table2 will propagate to table. That may or may not be what you want. It at least introduces an implicit coupling, which I find to be a bad idea.

As for performance, I see no reason why one should be faster than the other. So the usual advice applies: Choose the cleanest, most maintainable solution, test it, only optimize if performance is a problem :-).

sleske
A: 
CREATE TABLE IF NOT EXISTS table (PRIMARY KEY (inskey))
SELECT  id, count(id) AS total
FROM    table2
GROUP BY
        id

This will not delete old values from the table.

If that's what you want, it will be faster indeed.

Quassnoi
+1  A: 

Your solution would be my choice, the performance difference loss (if any, which I'm not sure because you don't drop/create the table and re-compute column type) is negligible and IMHO overweight cleanliness.

streetpc
A: 

Perhaps something has been lost in the translation between your Someone and yourself. One possibility s/he might have been referring to is DROP/SELECT INTO vs TRUNCATE/INSERT.

I have heard that the latter is faster as it is minimally logged (but then again, what's the eventual cost of the DROP here?). I have no hard stats to back this up.

Joel Goodwin
A: 

I agree with "sleske"s suggestion in asking you test it and optimize the solution yourself. DIY!

Every self respecting DB will give you the opportunity to rollback your transaction. 1. Rolling back your INSERT INTO... will require DB to keep track of every row inserted into the table 2. Rolling back the CREATE TABLE... is super easy for the DB - Simply get rid of the table.

Now, if you were designing & coding the DB, which would be faster? 1 or 2?

"someone"s suggestion DOES have merit especially if you are using Oracle.

Regards,
Shiva

Shiva
A: 

I'm sure that any time difference is indistinguishable, but yours is IMHO preferable because it's one SQL statement rather than two; any change in your INSERT statement doesn't require more work on the other statement; and yours doesn't require the host to validate that your INSERT matches the fields in the table.

le dorfier
A: 

From the manual: Beginning with MySQL 5.1.32, TRUNCATE is treated for purposes of binary logging and replication as DROP TABLE followed by CREATE TABLE — that is, as DDL rather than DML. This is due to the fact that, when using InnoDB and other transactional storage engines where the transaction isolation level does not allow for statement-based logging (READ COMMITTED or READ UNCOMMITTED), the statement was not logged and replicated when using STATEMENT or MIXED logging mode.

You can simplify your insert to:

INSERT INTO table
( SELECT id, COUNT(id) FROM table2 GROUP BY id );
Paul Morgan