views:

734

answers:

10

I'm using ADO to save data to an MS Access database. It was taking quite a while to save the data to file (about 7 seconds - which is too long for our purposes). I looked at the number of SQL queries being run, and it's about 4200; although there isn't a whole heap of data.

The database connection seems to be the bottleneck. Do you know of any way to decrease the amount of time this takes; either by somehow combining multiple statements into one to reduce overhead, or some ADO/MS-Access trick?

Can you, for instance, insert multiple rows into a table at once, and would this be noticeably faster?

Extra info:

One reason we have so many queries is that we insert a row, and then have another query to retrieve its autoincremented ID; then use this ID to insert several more rows, linking them to the first

In response to several comments and responses: I am leaving the connection open the entire time, and performing it as a single transaction with BeginTransaction() and CommitTransaciton()

+2  A: 

More recent versions of Access support the @@IDENTITY variable. You can use this to retrieve the identity column after an insert, without doing a query.

INSERT INTO mytable (field1,field2) VALUES (val1,val2);
SELECT @@IDENTITY;

See this knowledge base article.

Andomar
A: 

we insert a row, and then have another query to retrieve its autoincremented ID; then use this ID to insert several more rows, linking them to the first

Is this one table, two tables or more than two tables?

If one table, you could consider a different design e.g. you could generate your own random identifiers, use the nested sets model rather than the adjacency list model, etc. Hard to know if you won't share your design ;-)

If two tables, assuming there is a FOREIGN KEY between the tables on the AUTOINCREMENT/IDENTITY column, you can create a VIEW that INNER JOINs the two tables and INSERT INTO the VIEW and the AUTOINCREMENT/IDENTITY value will be 'copied' to the referencing table. More details and a working example in this Stack Overflow answer (link below).

If more than one table, the VIEW trick doesn't extend beyond two tables AFAIK, so you may just have to live with the poor performance or change technology e.g. DAO is reported faster than ADO, SQL Server may be faster than ACE/Jet, etc. Again, feel free to share your design, there could be a 'think outside the box' solution.

How to Insert Data?

onedaywhen
A: 

For your situation, it may be best to use ADO to do the insert instead of running an SQL command.

for i = 1 to 100
   rs.AddNew
   rs("fieldOne") = "value1"
   rs("fieldOne") = "value2"
   rs.Update
   id = rs("uniqueIdColumn")
   'do stuff with id...
next

I know that using ADO is slow, but it's many times quicker than opening 4200 connections...

CB
I'm not opening 4200 connections, though - I'm leaving the connection open the whole time. Would it make a difference to do this?
Smashery
This should still be much quicker then running the queries separately.
CB
A: 

Some proposal (that can even be combinated):

  • Why don't you get the autoincremented ID before inserting the row, so you'll have the value already available for your next query? This should spare you some time.
  • The closing/opening of the connexion has also to be checked.
  • Have you thought about manipulating recordsets (with Visual basic code to update your data) instead of tables (with SQL instructions sent to the database)?
  • Another solution (if you are sure that the connection is the bottleneck) would be to create the table locally and then export it to the access file once the job is done.
  • As you are using ADO, you can even save your data as an XML file, load a recordset from this XML file, manipulate it with VBA, and export it to an Access database
Philippe Grondier
A: 

Some thoughts that may or may not be helpful:

  1. Let me second the SELECT @@IDENTITY suggestion -- definitely a vastly faster way to insert data and retrieve the Autonumber value than open an AddOnly recordset, updating the fields, saving the record and storing the Autonumber value. SQL INSERT will always be faster than using the row-by-row recordset.

  2. using DAO transactions you might be able to convert multiple such inserts into a batch that gets executed at once. I'm not sure about this, since I don't know exactly what you're doing and how many tables you're using. The point is you'd run the series of SQL INSERTs in a transaction, and then do the .Commit at the end, such that the actual writing to the real database file will be happening as a single operation, instead of as 4200 (or however many) individual operations.

David-W-Fenton
See my answers suggesting the 'row-by-row recordset' approach is much quicker than the @@IDENTITY suggestion.
onedaywhen
A: 

Not meaning to be a smart-arse ... But, is there a reason for continuing to use Access? SQL Server Express is free, faster and more capable ...

Adrien
Inertia, I guess. I know it's not the nicest solution.
Smashery
Unfortunately, I think we've all had our battles with inertia. Good luck to you.
Adrien
A: 

It sounds like you're importing data, and Access has much better facilities available for importing data. Yes, it inserts many records at a time, and it would be faster.

Can you describe the application a bit more?

le dorfier
We have a data structure which we need to save into an MDB file. It's made up of C structs (so all very static). However, as the start of a move to a more flexible data structure, we're moving away from serializing.
Smashery
A: 

The cheapest and (mostly everytime) best speed boost you can give to a database is to keep non changing data in cache.

Don't know if it can apply to you case, but it is very simple. You can just plug in a (free) library that do it or keep a local collection of items that were read. Example :

  • One want to read item X.
  • Does item X is in the local collection ?
  • No : read item X from database and put it in the local collection.
  • Yes : just return local copy of item X.

Of course, it can be a little more complicated when you are runnning a web site with a lot of server. In that case, juste use Application Blocks or Cached.

Sylvain
+4  A: 

Some folk have posted that @@IDENTITY would be fast, so here’s a proof (using VBA) of how my INSERT INTO two tables at once via a VIEW trick is about three times faster than doing two INSERTS and grabbing the @@IDENTITY values each time... which is hardly surprising because the latter involves three Execute statements and the former only involves one :)

On my machine for the 4200 iterations, the VIEW trick took 45 seconds and the @@IDENTITY approach took 127 seconds:

Sub InitInerts()
  On Error Resume Next
  Kill Environ$("temp") & "\DropMe.mdb"
  On Error GoTo 0
  Dim cat
  Set cat = CreateObject("ADOX.Catalog")
  With cat
    .Create _
        "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & _
        Environ$("temp") & "\DropMe.mdb"

    With .ActiveConnection

      Dim Sql As String

      Sql = _
      "CREATE TABLE TableA" & vbCr & "(" & vbCr & "   ID IDENTITY NOT" & _
      " NULL UNIQUE, " & vbCr & "   a_col INTEGER NOT NULL" & vbCr & ")"
      .Execute Sql

      Sql = _
      "CREATE TABLE TableB" & vbCr & "(" & vbCr & "   ID INTEGER NOT" & _
      " NULL UNIQUE" & vbCr & "      REFERENCES TableA (ID)," & _
      "  " & vbCr & "   b_col INTEGER NOT NULL" & vbCr & ")"
      .Execute Sql

      Sql = _
      "CREATE VIEW TestAB" & vbCr & "(" & vbCr & "   a_ID, a_col, " & vbCr & " " & _
      "  b_ID, b_col" & vbCr & ")" & vbCr & "AS " & vbCr & "SELECT A1.ID, A1.a_col," & _
      " " & vbCr & "       B1.ID, B1.b_col" & vbCr & "  FROM TableA AS" & _
      " A1" & vbCr & "       INNER JOIN TableB AS B1" & vbCr & "    " & _
      "      ON A1.ID = B1.ID"
      .Execute Sql

    End With
    Set .ActiveConnection = Nothing
  End With
End Sub

Sub TestInerts_VIEW()

  Dim con
  Set con = CreateObject("ADODB.Connection")
  With con
    .Open _
        "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & _
        Environ$("temp") & "\DropMe.mdb"

    Dim timer As CPerformanceTimer
    Set timer = New CPerformanceTimer
    timer.StartTimer

    Dim counter As Long
    For counter = 1 To 4200
      .Execute "INSERT INTO TestAB (a_col, b_col) VALUES (" & _
                   CStr(counter) & ", " & _
                   CStr(counter) & ");"
    Next

    Debug.Print "VIEW = " & timer.GetTimeSeconds

  End With

End Sub

Sub TestInerts_IDENTITY()

  Dim con
  Set con = CreateObject("ADODB.Connection")
  With con
    .Open _
        "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & _
        Environ$("temp") & "\DropMe.mdb"

    Dim timer As CPerformanceTimer
    Set timer = New CPerformanceTimer
    timer.StartTimer

    Dim counter As Long
    For counter = 1 To 4200
      .Execute "INSERT INTO TableA (a_col) VALUES (" & _
          CStr(counter) & ");"

      Dim identity As Long
      identity = .Execute("SELECT @@IDENTITY;")(0)

      .Execute "INSERT INTO TableB (ID, b_col) VALUES (" & _
                   CStr(identity) & ", " & _
                   CStr(counter) & ");"

    Next

    Debug.Print "@@IDENTITY = " & timer.GetTimeSeconds

  End With

End Sub

What this shows is the the bottleneck now is the overhead associated with executing multiple statements. What if we could do it in just one statement? Well, guess what, using my contrived example, we can. First, create a Sequence table of unique integers, being a standard SQL trick (every database should have one, IMO):

Sub InitSequence()

  Dim con
  Set con = CreateObject("ADODB.Connection")
  With con
    .Open _
        "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & _
        Environ$("temp") & "\DropMe.mdb"

    Dim sql As String

    sql = _
        "CREATE TABLE [Sequence]" & vbCr & "(" & vbCr & "   seq INTEGER NOT NULL" & _
        " UNIQUE" & vbCr & ");"
    .Execute sql

    sql = _
        "INSERT INTO [Sequence] (seq) VALUES (-1);"
    .Execute sql

    sql = _
        "INSERT INTO [Sequence] (seq) SELECT Units.nbr + Tens.nbr" & _
        " + Hundreds.nbr + Thousands.nbr AS seq FROM ( SELECT" & _
        " nbr FROM ( SELECT 0 AS nbr FROM [Sequence] UNION" & _
        " ALL SELECT 1 FROM [Sequence] UNION ALL SELECT 2 FROM" & _
        " [Sequence] UNION ALL SELECT 3 FROM [Sequence] UNION" & _
        " ALL SELECT 4 FROM [Sequence] UNION ALL SELECT 5 FROM" & _
        " [Sequence] UNION ALL SELECT 6 FROM [Sequence] UNION" & _
        " ALL SELECT 7 FROM [Sequence] UNION ALL SELECT 8 FROM" & _
        " [Sequence] UNION ALL SELECT 9 FROM [Sequence] ) AS" & _
        " Digits ) AS Units, ( SELECT nbr * 10 AS nbr FROM" & _
        " ( SELECT 0 AS nbr FROM [Sequence] UNION ALL SELECT" & _
        " 1 FROM [Sequence] UNION ALL SELECT 2 FROM [Sequence]" & _
        " UNION ALL SELECT 3 FROM [Sequence] UNION ALL SELECT" & _
        " 4 FROM [Sequence] UNION ALL SELECT 5 FROM [Sequence]" & _
        " UNION ALL SELECT 6 FROM [Sequence] UNION ALL SELECT" & _
        " 7 FROM [Sequence] UNION ALL SELECT 8 FROM [Sequence]" & _
        " UNION ALL SELECT 9 FROM [Sequence] ) AS Digits )" & _
        " AS Tens, ( SELECT nbr * 100 AS nbr FROM ( SELECT" & _
        " 0 AS nbr FROM [Sequence] UNION ALL SELECT 1 FROM" & _
        " [Sequence] UNION ALL SELECT 2 FROM [Sequence] UNION"
    sql = sql & _
        " ALL SELECT 3 FROM [Sequence] UNION ALL SELECT 4 FROM" & _
        " [Sequence] UNION ALL SELECT 5 FROM [Sequence] UNION" & _
        " ALL SELECT 6 FROM [Sequence] UNION ALL SELECT 7 FROM" & _
        " [Sequence] UNION ALL SELECT 8 FROM [Sequence] UNION" & _
        " ALL SELECT 9 FROM [Sequence] ) AS Digits ) AS Hundreds," & _
        " ( SELECT nbr * 1000 AS nbr FROM ( SELECT 0 AS nbr" & _
        " FROM [Sequence] UNION ALL SELECT 1 FROM [Sequence]" & _
        " UNION ALL SELECT 2 FROM [Sequence] UNION ALL SELECT" & _
        " 3 FROM [Sequence] UNION ALL SELECT 4 FROM [Sequence]" & _
        " UNION ALL SELECT 5 FROM [Sequence] UNION ALL SELECT" & _
        " 6 FROM [Sequence] UNION ALL SELECT 7 FROM [Sequence]" & _
        " UNION ALL SELECT 8 FROM [Sequence] UNION ALL SELECT" & _
        " 9 FROM [Sequence] ) AS Digits ) AS Thousands;"
    .Execute sql

  End With

End Sub

Then use the Sequence table to enumerate the values from 1 to 42000 and construct rows in a single INSERT INTO..SELECT statement:

Sub TestInerts_Sequence()

  Dim con
  Set con = CreateObject("ADODB.Connection")
  With con
    .Open _
        "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & _
        Environ$("temp") & "\DropMe.mdb"

    Dim timer As CPerformanceTimer
    Set timer = New CPerformanceTimer
    timer.StartTimer

    .Execute "INSERT INTO TestAB (a_col, b_col) " & _
             "SELECT seq, seq " & _
             "FROM Sequence " & _
             "WHERE seq BETWEEN 1 AND 4200;"

    Debug.Print "Sequence = " & timer.GetTimeSeconds



  End With

End Sub

That executes on my machine in 0.2 of a second!

onedaywhen
A: 

Allow me to refute the following assertions:

the SELECT @@IDENTITY suggestion -- definitely a vastly faster way to insert data and retrieve the Autonumber value than open an AddOnly recordset, updating the fields, saving the record and storing the Autonumber value. SQL INSERT will always be faster than using the row-by-row recordset.

I thought the two recordsets approach might be a bit quicker than the @@IDENTITY approach because I suspected it involves fewer database roundtrips. In my testing it was a lot quicker at 1.2 seconds, compared with 127 seconds for the @@IDENTITY approach. Here's my code (code to create the .mdb is posted in another answer):

Sub TestInerts_rs()

  Dim con
  Set con = CreateObject("ADODB.Connection")
  con.Open _
        "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & _
        Environ$("temp") & "\DropMe.mdb"

  Dim timer As CPerformanceTimer
  Set timer = New CPerformanceTimer
  timer.StartTimer

  Dim rs1
  Set rs1 = CreateObject("ADODB.Recordset")
  With rs1
    .ActiveConnection = con
    .CursorType = 1  ' keyset
    .LockType = 3  ' optimistic
    .Source = "SELECT a_col, ID FROM TableA;"
    .Open
  End With

  Dim rs2
  Set rs2 = CreateObject("ADODB.Recordset")
  With rs2
    .ActiveConnection = con
    .CursorType = 1  ' keyset
    .LockType = 3  ' optimistic
    .Source = "SELECT b_col, ID FROM TableB;"
    .Open
  End With

  Dim counter As Long
  For counter = 1 To 4200
    rs1.AddNew "a_col", counter

    Dim identity As Long
    identity = rs1.Fields("ID").value

    rs2.AddNew Array(0, 1), Array(counter, identity)

  Next

  Debug.Print "rs = " & timer.GetTimeSeconds

End Sub
onedaywhen