views:

196

answers:

6

I have a 5-6 tables in my database that I need to populate with test data to test peroformance in my app.

I can write a code and do a bulk insert my prediction is that it will take nearly 3 days to run so I assumed there must be a quicker way.

Any ideas?

+3  A: 

Depending on how you are generating your data, SSIS may be the best way to insert data.

Also, make sure you turn off all triggers where possible and remove as many indexes as possible, re-adding them afterwards.

ck
+6  A: 

SSIS is one way to go, but you'll either be using the Bulk Insert Task, or looping through the file and calling an INSERT INTO statement. There is BCP available.

Ardman
I was considering that. Is BCP much quicker?
Jon
BCP isn't quicker than BULK INSERT. See this link: http://msdn.microsoft.com/en-us/library/aa178096(SQL.80).aspx
Ardman
Thanks, weird I always thought BCP was quicker
Jon
BCP is a lot quicker at extracting data. :o)
Ardman
A: 

Use recursive CTEs to generate and insert huge amounts of rows at a time:

with table1Test
as (
--
-- generate some test rows
--
)
insert into table1 select ... from table1Test
Ovidiu Pacurar
+1  A: 

BCP is very very fast, and not that difficult to use, also you can run it simply and quickly from a batch file afterwards.

SSIS, is super, you can build nice scripts with branching and decision trees.

For a once off situation, the speed will be similar in both.

You will be shocked at how fact it can handle 60 million. Wiht a neat BCP batch file, you can import, - us sql then to modify, export the new data, and create different sets of test data. All will work in seconds or minutes.

Remember also, if you need to specify a format file for bcp, one way to do that is to key in two records into the table, then bcp export from that table, allowing bcp to generate a fmt file. Now you have the file, you can import using it.

Good Luck,

Mark Breen Ireland BMW R80GS 1987

Mark Breen
A: 

You could have a look at the redgate tools, they have one called data generator. It may help you from having to write your own scripts and I'm sure they have this optimised somehow for speed of inserts.

(Disclaimer, I'm not associated with redgate, just like the software).

Paddy
+1  A: 

You can save significant time when inserting tons of data by disabling indexes beforehand, and re-enabling them upon completion. While each row is being added, it needs to constantly rebalance the index, sometimes splitting pages, etc. Better to skip a bunch of repetitive manipulation, and let it do the work once when the inserts are done.

If you're brave and you're sure that the data is correct, you might also kill any foreign key relationships, and add them back on completion. Otherwise, it's going to do a lookup to check the validity of each FK value in each row inserted. Same goes for other constraints.

Chris Wuestefeld