I need to insert 800,000 records into an MS Access table. I am using Delphi 2007 and the TAdoXxxx components. The table contains some integer fields, one float field, and one text field with only one character. There is a primary key on one of the integer fields (which is not an autoinc field) and two indexes, one on another integer field and one on the float field.

Inserting the data using AdoTable.AppendRecord(...) takes more than 10 minutes, which is not acceptable since this is done every time the user starts using a new database with the program. I cannot prefill the table because the data comes from another database (which is not accessible through ADO).

I managed to get down to around 1 minute by writing the records to a tab-separated text file and using a TADOCommand object to execute

insert into table (...) select * from [filename.txt] in "c:\somedir" "Text;HDR=Yes"
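
In Delphi this boils down to something like the following (connection, file, and field names simplified for illustration):

  // Cmd: TADOCommand (uses ADODB); AdoConnection: an open TADOConnection;
  // the tab-separated file c:\somedir\filename.txt has already been written
  Cmd := TADOCommand.Create(nil);
  try
    Cmd.Connection := AdoConnection;
    Cmd.CommandText :=
      'insert into mytable (f1, f2, f3) ' +
      'select * from [filename.txt] in "c:\somedir" "Text;HDR=Yes"';
    Cmd.Execute;
  finally
    Cmd.Free;
  end;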

But I don't like the overhead of this.

There must be a better way, I think.

EDIT:

Some additional information:

  • MS Access was chosen because it does not need any additional installation on the target machine(s) and the whole database is contained in one file which can be easily copied.
  • This is a single user application.
  • The data will be inserted only once and will not change for the lifetime of the database. However, the table contains one additional field that is used as a flag to indicate that the corresponding record in another database has been processed by the user.
  • One minute is acceptable (even up to 3 minutes would be), and my solution works, but it seems too complicated to me, so I thought there should be an easier way to do this.
  • Once the data has been inserted, the performance of the table is quite good.
  • When I started planning/implementing the feature of the program working with the Access database, the table was not required. It only became necessary later on, when another feature was requested by the customer. (Isn't that always the case?)

EDIT:

From all the answers I got so far, it seems that I already got the fastest method for inserting that much data into an Access table. Thanks to everybody, I appreciate your help.

+4  A: 

It would be quicker without the indexes. Can you add them after the import?

There are a number of suggestions that may be of interest in this thread: http://stackoverflow.com/questions/327658/slow-msaccess-disk-writing
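
If you try it, the index DDL is simple enough to run from a TADOCommand before and after the load (the index and field names here are invented):

  // Cmd: TADOCommand hooked up to the target database
  Cmd.CommandText := 'DROP INDEX idxInt ON mytable';
  Cmd.Execute;
  Cmd.CommandText := 'DROP INDEX idxFloat ON mytable';
  Cmd.Execute;
  // ... run the bulk INSERT ... SELECT here ...
  Cmd.CommandText := 'CREATE INDEX idxInt ON mytable (intfield)';
  Cmd.Execute;
  Cmd.CommandText := 'CREATE INDEX idxFloat ON mytable (floatfield)';
  Cmd.Execute;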

Remou
Yes, I thought about this, but some trials didn't really show much performance improvement.
dummzeuch
That link was interesting, thanks.
dummzeuch
+1  A: 

You're looking in the right direction in one way. Using a single statement to bulk insert will be faster than trying to iterate through the data and insert it row by row. Access, being a file-based database, will be exceedingly slow at iterative writes.

The problem is that Access handles its write optimization internally, and there's not really any way to control it. You've probably reached the maximum efficiency of an INSERT statement. For additional speed, you should probably evaluate whether there's any way around writing 800,000 records to the database every time you start the application.

Jekke
The OP said that it only occurs when the user creates a new database, not every time the app starts. That would be a problem :)
Ed Swangren
It is not every time I start the application. It happens only when I start using a new access database, which was created using a different tool. Once the data has been added, it will not change for the lifetime of the database.
dummzeuch
+2  A: 

Get SQL Server Express (free) and connect to it from Access as an external table. SQL Server Express is much faster than MS Access.

Diodeus
This is not an option in this case.
dummzeuch
Get SQL Server embedded - no install needed, just a DLL. Still a LOT better than Access.
TomTom
+8  A: 

Since you've said that the 800K records won't change for the life of the database, I'd suggest linking to the text file as a table and skipping the insert altogether.

If you insist on pulling it into the database, then 800,000 records in 1 minute is over 13,000 per second. I don't think you're going to beat that in MS Access.

If you want it to be more responsive for the user, then you might want to consider loading some minimal set of data, and setting up a background thread to load the rest while they work.
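
A rough sketch of that idea in Delphi (the class and method names are invented; note that a thread using ADO needs its own connection and its own COM initialization):

  uses ActiveX; // for CoInitialize/CoUninitialize

  type
    TLoadThread = class(TThread)
    protected
      procedure Execute; override;
    end;

  procedure TLoadThread.Execute;
  begin
    CoInitialize(nil); // ADO is COM-based
    try
      ImportRemainingRecords; // placeholder: opens its own TADOConnection
    finally
      CoUninitialize;
    end;
  end;

  // after the minimal data set has been loaded:
  TLoadThread.Create(False); // FreeOnTerminate handling omitted for brevity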

JosephStyons
I need the data inside the database because the user might copy it from one computer to another to continue working on it. Linking would require him to copy the text file as well and also keep the path to it intact. The intended users are not computer-savvy enough for this.
dummzeuch
How long does it take to simply import the text file as a table (i.e., don't insert it anywhere, just import it as-is)?
JosephStyons
A: 

Perhaps you could open an ADO Recordset on the table with lock mode adLockBatchOptimistic and CursorLocation adUseClient, write all the data to the recordset, then do a batch update (rs.UpdateBatch).
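
In Delphi terms, a minimal sketch might look like this (connection, table, and field names are invented):

  // DS: TADODataSet; i: Integer (uses ADODB)
  DS := TADODataSet.Create(nil);
  try
    DS.Connection := AdoConnection;       // assumed open TADOConnection
    DS.CursorLocation := clUseClient;     // adUseClient
    DS.LockType := ltBatchOptimistic;     // adLockBatchOptimistic
    DS.CommandText := 'select * from mytable where 1 = 0'; // empty cursor
    DS.Open;
    for i := 1 to 800000 do
    begin
      DS.Append;
      DS.FieldByName('intfield').AsInteger := i; // fill the real values here
      DS.Post;                            // cached client-side, no disk I/O yet
    end;
    DS.UpdateBatch;                       // write the whole batch in one go
  finally
    DS.Free;
  end;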

Tmdean
A: 

I'm not trolling when I ask this, but is MS Access the correct solution for something with 800,000 records?

I don't know much about DB's, so maybe someone can enlighten me.

Regards

Mick
Yes - using Access for something approaching a million records seems like a really dumb idea.
mP
I start to get nervous at about 100k records.
Marc Bernier
For naive file-copying users it's just simpler, and I've gone to several million records with little problem (esp. for a single user).
le dorfier
A million records really is just not that many for a single user database.
Tmdean
It's certainly not too many when the records are so small, as described in the original post. And I've had robust production apps using Jet with tables having *long* records and three tables of 250K records or more.
David-W-Fenton
+3  A: 

How about an alternate arrangement...

Would it be an option to make a copy of an existing Access database file that already has this table, and then just delete all the other data in it besides this one large table? (I don't know if Access has an equivalent of SQL Server's "truncate table".)

pfunk
If it doesn't, then at least "delete * from table" should work.
Hosam Aly
A: 

Also check how long it takes to simply copy the file. That is the lower bound on how fast you can write data. In databases like SQL Server, it usually takes a bulk-load utility to get close to that speed. As far as I know, MS never created a tool to write directly to MS Access tables the way bcp does. Specialized ETL tools also optimize some of the steps surrounding the insert; SSIS, for example, does transformations in memory, and DTS likewise has some optimizations.

MatthewMartin
+1  A: 

I would prefill the database, and hand them the file itself, rather than filling an existing (but empty) database.

If the data you have to fill changes, then keep an ODBC Access database (MDB file) synchronized on the server, using a bit of code to watch for changes in the main database and copy them to the Access database.

When the user requests a new database, zip up the MDB, transfer it to them, and open it.

Alternatively, you may be able to find code that opens and inserts data into databases directly.

Alternatively, alternatively, you may be able to find another format (other than CSV) which Access can import faster.

Adam Davis
Hm, your last sentence got me thinking: currently the other database is dBase, and I think ADO can also read dBase, so I might actually be able to use that original table directly in the insert statement...
dummzeuch
It sounds to me more and more like the original data doesn't change at all. If so, then your best option, I think, would be writing the data to the Access db file once and then just copying the file when the user needs a "new" database.
pfunk
No, that's not an option. The Access db will be created according to custom configurations. The only thing that will always be in that database is the table in question. Also, there are multiple main databases, and I need to copy from the one that is active when the user starts using the Access db.
dummzeuch
What kind of custom configuration are we talking about? Copying the database would just skip the step of creating the database; you just copy the existing file. Any custom configurations could still be done after the copy?
Joel Gauvreau
A: 

The best way to overcome an obstacle is to remove it.
Are you sure you need an RDBMS for just one table?

friol
There are more tables in the database. Actually, all the other tables are the ones that contain the real data; the table in question is only used to flag records in the main database as being processed (and no, I cannot use the main database for that).
dummzeuch
A: 

If it's coming from dbase, can you just copy the data and index files and attach directly without loading? Should be pretty efficient (from the people who bring you FoxPro.) I imagine it would use the existing indexes too.

At the least, it should be a pretty efficient single-command Import.

le dorfier
No, I can't. The new table actually contains one additional column, which I fill and query during later operation.
dummzeuch
+4  A: 

What about skipping the text file and using ODBC or OLEDB to import directly from the source table? That would mean altering your FROM clause to use the source table name and an appropriate connect string as the IN '' part of the FROM clause.

EDIT: Actually I see you say the original format is xBase, so it should be possible to use the xBase ISAM that is part of Jet instead of needing ODBC or OLEDB. That would look something like this:

INSERT INTO table (...) 
SELECT * 
FROM tablename IN 'c:\somedir\'[dBase 5.0;HDR=NO;IMEX=2;];

You might have to tweak that -- I just grabbed the connect string for a linked table pointing at a DBF file, so the parameters might be slightly different.
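
From Delphi, with the source table name only known at runtime, that could be driven along these lines (SourceTable and the paths are placeholders):

  // Cmd: TADOCommand on the target database; SourceTable: string
  Cmd.CommandText := Format(
    'INSERT INTO mytable SELECT * FROM [%s] ' +
    'IN ''c:\somedir\''[dBase 5.0;HDR=NO;IMEX=2;];',
    [SourceTable]);
  Cmd.Execute;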

David-W-Fenton
This might just be possible, but it would require some rather ugly redesign of the module (which currently does not know about the source table but only has an interface for getting the data). I'll check on that again tomorrow.
dummzeuch
I don't know enough about your code to know why it would be a problem -- since you were executing an INSERT on a text file, it seems you'd just skip creating the text file and replace it with a direct INSERT from the table, with SQL written on the fly if you don't know the tablename until runtime.
David-W-Fenton
A: 

How much do the 800,000 records change from one creation to the next? Would it be possible to pre-populate the records and then just update the ones that have changed in the external database when creating the new database?

This may allow you to create the new database file quicker.

Toby Allen
All those records may change, because it is possible (even likely) that the user switched to a different master database.
dummzeuch
+3  A: 

I would replace MS Access with another database; for your situation, SQLite looks like the best choice. It doesn't require any installation on the client machine, it's a very fast database, and it's one of the best embedded database solutions.

You can use it in Delphi in two ways:

  1. You can download the database engine DLL from the SQLite website and use a free Delphi component to access it, such as the Delphi SQLite components or SQLite4Delphi.

  2. Use DISQLite3, which has the engine built in, so you don't have to distribute the DLL with your application; they have a free version ;-)

If you still need to use MS Access, try using TADOCommand with a SQL INSERT statement directly instead of TADOTable; that should be faster than TADOTable.Append.
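
A minimal sketch of that direct-INSERT idea (table, field names, and values are invented; preparing the parameterized command once avoids re-parsing the SQL for every row):

  // Cmd: TADOCommand; i: Integer (uses ADODB)
  Cmd := TADOCommand.Create(nil);
  try
    Cmd.Connection := AdoConnection;
    Cmd.CommandText :=
      'insert into mytable (intfield, floatfield, textfield) ' +
      'values (:iv, :fv, :tv)';
    Cmd.Prepared := True;
    for i := 1 to 800000 do
    begin
      Cmd.Parameters.ParamByName('iv').Value := i;
      Cmd.Parameters.ParamByName('fv').Value := i * 0.5; // sample values
      Cmd.Parameters.ParamByName('tv').Value := 'x';
      Cmd.Execute;
    end;
  finally
    Cmd.Free;
  end;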

Mohammed Nasman
Unfortunately, replacing Access at this point in the project is not an option.
dummzeuch
Besides, are you *sure* that it would make importing the records faster?
Renaud Bompuis
A: 

How fast is your disk turning? If it's 7200 RPM, then 800,000 rows in 3 minutes is still 37 rows per disk revolution. I don't think you're going to do much better than that.

Meanwhile, if the goal is to streamline the process, how about a table link?

You say you can't access the source database via ADO. Can you set up a table link in MS Access to a table or view in the source database? Then a simple append query from the table link would copy the data over from the source database to the target database for you. I'm not sure, but I think this would be pretty fast.

If you can't set up a table link until runtime, maybe you could build the table link programmatically via ADO, then build the append query programmatically, then invoke the append query.
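
Building the link programmatically would actually go through ADOX rather than plain ADO; a late-binding sketch (the Jet property names are documented, but the paths and table names here are invented and unverified):

  // uses ComObj
  procedure CreateLinkedTable;
  var
    Cat, Tbl: OleVariant;
  begin
    Cat := CreateOleObject('ADOX.Catalog');
    Cat.ActiveConnection :=
      'Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\target.mdb';
    Tbl := CreateOleObject('ADOX.Table');
    Tbl.Name := 'linked_source';
    Tbl.ParentCatalog := Cat;
    Tbl.Properties['Jet OLEDB:Create Link'].Value := True;
    Tbl.Properties['Jet OLEDB:Link Provider String'].Value := 'dBase 5.0';
    Tbl.Properties['Jet OLEDB:Link Datasource'].Value := 'c:\sourcedir';
    Tbl.Properties['Jet OLEDB:Remote Table Name'].Value := 'sourcetable';
    Cat.Tables.Append(Tbl);
    // then run: INSERT INTO mytable SELECT * FROM linked_source
  end;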

Walter Mitty
Both the Access database and the master database can change due to user actions, so the only way to create a link inside the Access db would be at runtime. For now I will stick with the text file solution, because it is fast enough and easy to implement, and I don't know how to create links via ADO.
dummzeuch
A: 

Hi. The best way is a bulk insert from a text file, as others have said: write your records to a text file, then bulk insert the text file into the table. That should take less than 3 seconds.

tiphooo
+1  A: 

Your text-based solution seems the fastest, but you could make it quicker if you could get a preallocated MS Access file close to its final size. You can do that by filling a typical user database, closing the application (so the buffers are flushed), and manually deleting all records of that big table - but not shrinking/compacting it.

Then use that file to start the real filling - Access will not request any (or only very little) additional disk space. I don't remember whether MS Access has a way to automate this, but it can help a lot...
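
The runtime part of that idea is just a file copy before the import; a tiny sketch (file names invented; uses Windows and SysUtils):

  // template.mdb was shipped pre-grown to roughly the final size,
  // with the big table emptied but the file never compacted
  if not CopyFile('c:\app\template.mdb', PChar(TargetMdb), False) then
    RaiseLastOSError;
  // now open TargetMdb and run the bulk insert into the pre-sized file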

Fabricio Araujo
Interesting idea. Unfortunately, it will not work for me because I have no idea how large that table will actually become (800,000 records was the test case I used; the actual size depends on the contents of a different table that is not available when the Access db is being created). Maybe I'll do a few tests to see whether filling it to e.g. 500,000 records speeds up the later import significantly enough to make it worthwhile.
dummzeuch
My experience is with SQL Server, but other DB engines (e.g. Firebird) do not auto-shrink their files for exactly the same reason: to avoid asking the underlying OS for new disk space allocations if possible. Even the restore operation in SQL Server preallocates disk space.
Fabricio Araujo
How much improvement did you get, dummzeuch?
Fabricio Araujo
+1  A: 

You won't be importing 800,000 records in less than a minute, as someone mentioned; that's really fast already.

You can skip the annoying translate-to-text-file step however if you use the right method (DAO recordsets) for doing the inserts. See a previous question I asked and had answered on StackOverflow: http://stackoverflow.com/questions/2986831/ms-access-why-is-adodb-recordset-batchupdate-so-much-slower-than-application-imp

Don't use INSERT INTO even with DAO; it's slow. Don't use ADO either; it's slow. But DAO + Delphi + Recordsets + instantiating the DbEngine COM object directly (instead of via the Access.Application object) will give you lots of speed.
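
A rough sketch of that DAO route from Delphi via late binding (the ProgID is for DAO 3.6; database, table, and field names are placeholders):

  // uses ComObj; DBE, DB, RS: OleVariant; i: Integer
  DBE := CreateOleObject('DAO.DBEngine.36');
  DB := DBE.OpenDatabase('c:\somedir\target.mdb');
  RS := DB.OpenRecordset('mytable'); // table-type recordset
  for i := 1 to 800000 do
  begin
    RS.AddNew;
    RS.Fields['intfield'].Value := i; // fill the real values here
    RS.Update;
  end;
  RS.Close;
  DB.Close;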

apenwarr
Interesting approach. I'll give it a try later (the program in question has already been shipped with the text file import method).
dummzeuch