views: 1207

answers: 11

When running a query like "INSERT INTO ... SELECT * FROM anotherTable", how do we handle the commit size? I.e., are all records from anotherTable inserted in a single transaction, or is there a way to set a commit size?

Thanks very much, ~Sri

PS: I am a first timer here, and this site looks very good!

+2  A: 

In good databases that is an atomic statement, so no, there is no way to limit the number of records inserted - which is a good thing!

Otávio Décio
A: 

You can't control the commit size unless you explicitly code it. For example, you could use a while loop and code up a way to limit the amount of data you're selecting.
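A minimal PL/SQL sketch of that idea (assuming Oracle, and using the hypothetical table names thisTable/anotherTable from the question; not a drop-in solution) fetches in batches and commits after each one:

DECLARE
  -- Sketch only: copy the rows in batches and commit every batch_size rows.
  CURSOR src_cur IS SELECT * FROM anotherTable;
  TYPE src_tab IS TABLE OF anotherTable%ROWTYPE;
  rows_buf   src_tab;
  batch_size CONSTANT PLS_INTEGER := 10000;  -- the "commit size"
BEGIN
  OPEN src_cur;
  LOOP
    FETCH src_cur BULK COLLECT INTO rows_buf LIMIT batch_size;
    EXIT WHEN rows_buf.COUNT = 0;

    FORALL i IN 1 .. rows_buf.COUNT
      INSERT INTO thisTable VALUES rows_buf(i);

    COMMIT;  -- one commit per batch instead of one big transaction
  END LOOP;
  CLOSE src_cur;
END;
/

(This assumes thisTable and anotherTable have the same column layout, as the INSERT ... SELECT * in the question implies.)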

JoshBerke
but with large inserts this is usually sub-optimal.
TrickyNixon
Not sure why this answer is getting down-voted as it's correct.
Nick Pierpoint
Me neither, Nick, never understood it.
JoshBerke
A: 

If you need the data set to be limited, build that limit into the query.

For example, in Microsoft SQL Server parlance, you can use "TOP N" to make sure the query only returns a limited number of rows.

INSERT INTO thisTable
  SELECT TOP 100 * FROM anotherTable;
Bill Karwin
But then he needs to insert 101-200 and then 201-300...
+1  A: 

The reason why I want to do that is to avoid the rollback segment running out of space. Also, I want to see results being populated in the target table at regular intervals.

I don't want to use a while loop because it might add performance overhead, wouldn't it?

~Sri

You should tag your question with "oracle" since IIRC the rollback segment is an Oracle feature.
Bill Karwin
I believe the expected approach on this site is that you should edit your original question with any further information, as it could potentially get lost as other answers are voted up.
RSlaughter
Can you add this as a modification to the original question?
David Aldridge
+1  A: 

I've written code in various languages, mostly Java, to do bulk inserts like what you described. Each time, mostly when parsing some input file or the like, I would basically just prepare a sub-set of data to insert from the total amount (usually batches of 4000 or so) and feed that data to our DAO layer, so it was done programmatically. We never noticed any real performance hit for doing it this way, and we were dealing with a few million records. If you have large data sets to insert, the operation will "take a while" regardless of how you do it.

tmeisenh
A: 

You are right, you may want to run large inserts in batches. The attached link shows a way to do it in SQL Server; if you are using a different backend you would do something similar, but the exact syntax might be different. This is a case when a loop is acceptable.

http://www.tek-tips.com/faqs.cfm?fid=3141

HLGEM
A: 

"The reason why I want to do that is to avoid the rollback segment going out of space. Also, I want to see results being populated in the target table at regular intervals."

The first is simply a matter of sizing the undo tablespace correctly. Since the undo for an insert is just a delete of the newly inserted row, it doesn't require a lot of space. Conversely, a delete generally requires more undo space because it has to keep a copy of the entire deleted row so it can re-insert it on rollback.

For the second, have a look at v$session_longops and/or rows_processed in v$sql
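For illustration, a rough sketch of those checks (standard Oracle dynamic performance views; "thisTable" stands in for the real target table):

-- Long-running operations and a rough percentage done
SELECT sid, opname, sofar, totalwork,
       ROUND(100 * sofar / totalwork, 1) AS pct_done,
       time_remaining
FROM   v$session_longops
WHERE  totalwork > 0
AND    sofar < totalwork;

-- Rows processed so far by the insert statement
SELECT sql_id, rows_processed, executions
FROM   v$sql
WHERE  sql_text LIKE 'INSERT INTO thisTable%';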

Gary
+1  A: 

In the context that the original poster wants to avoid rollback space problems, the answer is pretty straightforward: the rollback segments should be sized to accommodate the size of the transactions, not the other way round. You commit when your transaction is complete.

David Aldridge
+1 for being the most sensible answer here
Rob van Wijk
A: 

David Aldridge is right: size the rollback segment based on the maximum transaction when you want the INSERT to either succeed or fail as a whole.

Some alternatives:

If you don't care about being able to roll it back (which is what the segment is there for), you could ALTER TABLE and add the NOLOGGING clause. But that's not a wise move unless you're loading a reporting table where you drop all old rows and load new ones, or in some other special case.

If you're okay with some rows getting inserted and others failing for some reason, then add support for handling the failures using Oracle's DML error logging, i.e. the LOG ERRORS INTO clause of INSERT.
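A sketch of that last option (assuming Oracle 10gR2+ DML error logging, with the hypothetical table names from the question):

-- One-time setup: create an error-logging table for the target
EXEC DBMS_ERRLOG.CREATE_ERROR_LOG('THISTABLE')   -- creates ERR$_THISTABLE by default

-- Failing rows are recorded in ERR$_THISTABLE instead of failing the whole insert
INSERT INTO thisTable
SELECT * FROM anotherTable
LOG ERRORS INTO err$_thistable ('batch load') REJECT LIMIT UNLIMITED;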

Stew S
A: 

You may just want to make the indexes NOLOGGING. That way the table data is recoverable, but the indexes will need to be rebuilt if the table is recovered. Index maintenance can create a lot of undo.

RussellH
A: 
-- Note: in practice list the target columns explicitly, since the outer
-- SELECT * also returns RowNumber, which TableInserted probably doesn't have.
INSERT INTO TableInserted
SELECT *
FROM (
   SELECT *,
          ROW_NUMBER() OVER (ORDER BY ID) AS RowNumber
   FROM TableSelected
) X
WHERE RowNumber BETWEEN 101 AND 200

You could wrap the above into a while loop pretty easily, replacing the 101 and 200 with variables. It's better than doing 1 record at a time.

I don't know what versions of Oracle support window functions.
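For what it's worth, here is a hedged Oracle PL/SQL sketch of that loop wrapper (col1, col2 are placeholders for the real column list; Oracle's analytic ROW_NUMBER is assumed to be available):

DECLARE
  batch_size CONSTANT PLS_INTEGER := 100;
  lo PLS_INTEGER := 1;
  n  PLS_INTEGER;
BEGIN
  LOOP
    INSERT INTO TableInserted
    SELECT x.col1, x.col2            -- placeholder column list; rn stays out of the insert
    FROM (
      SELECT t.*, ROW_NUMBER() OVER (ORDER BY ID) AS rn
      FROM   TableSelected t
    ) x
    WHERE x.rn BETWEEN lo AND lo + batch_size - 1;

    n := SQL%ROWCOUNT;               -- capture the batch row count before committing
    COMMIT;
    EXIT WHEN n = 0;
    lo := lo + batch_size;
  END LOOP;
END;
/

As the next comment points out, each pass re-reads TableSelected to recompute the row numbers, so on a big source table this can get expensive unless the slice can be found cheaply (e.g. via an index on ID) or the numbered rows are staged in a temporary table first.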

Lurker Indeed
-1 because this is a horribly slow approach. For a 100,000 record table you are accessing "TableSelected" 1000 times, and each access likely is a full table scan ...
Rob van Wijk
You're assuming that it's always a table scan. Okay, put the data in a temp table. Where's your better way? I don't see it in here.
Lurker Indeed