How could this be optimized for speed, by batching or other techniques? It's a 20 MB Access 2003 database that I am searching from Excel 2003 VBA.

My Access table is keyed (autonumber), so I thought this would provide intelligent non-linear searching, like a binary search. Currently, searching for 4000 values in a table of 147k records takes 4.2 minutes.

I found this in a search:

"The problem with a straight SELECT on the SQL Server side is that the DB will do a linear search through the table unless the column you're working with has an index on it; then the DB can be smarter." (StackOverflow: SQL C# Binary Search Question)

Is this true, and does it also apply to an Access 2003 database?

VBA Code, Example:

    Dim cnn As ADODB.Connection
    Dim rst As ADODB.Recordset
    Set cnn = New ADODB.Connection 'open the connection
    With cnn
      .Provider = "Microsoft.Jet.OLEDB.4.0"
      .Open "PNdb2003.mdb"
    End With

    'define the record set
    Set rst = New ADODB.Recordset
    rst.CursorLocation = adUseClient   'for smaller datasets that fit into RAM

    For Each myVariant In Selection.Cells
        strSearchText = myVariant

        Dim sSQL As String
        sSQL = "SELECT Key FROM [MasterTable] WHERE PN=""" & strSearchText & """"

        rst.Open Source:=sSQL, ActiveConnection:=cnn, CursorType:=adOpenStatic, LockType:=adLockOptimistic

        Cells(myVariant.Row, 7).CopyFromRecordset rst

        rst.Close
    Next myVariant

    cnn.Close
+1  A: 

Yes. Unless your SELECT filters on an indexed field, preferably a primary key (or clustered index), searches will in general be performed linearly, because the query optimizer has nothing that tells it the order or layout of the data.

GrayWizardx
+3  A: 

When you state "Access Table Keyed", is this the same field as the PN field? If not, and I suspect it isn't, then yes, creating an index on the PN field will greatly improve performance. You should also do this for any other fields on which you do any searching. Even indexing a boolean field can make a significant difference in searching, but do a before and after comparison.
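
The index can be added in the Access table designer, or from code. As a minimal sketch only, reusing the connection settings from the question: the index name idxPN is a hypothetical choice, a reference to the Microsoft ActiveX Data Objects library is assumed, and the statement should be run once rather than on every search.

    Sub AddPnIndex()
        Dim cnn As ADODB.Connection

        Set cnn = New ADODB.Connection
        With cnn
            .Provider = "Microsoft.Jet.OLEDB.4.0"
            .Open "PNdb2003.mdb"
        End With

        ' idxPN is a hypothetical name; any unused index name will do.
        ' Running this a second time raises an error because the index exists.
        cnn.Execute "CREATE INDEX idxPN ON [MasterTable] (PN)"

        cnn.Close
    End Sub

Timing the same 4000-value search before and after is the simplest way to confirm the effect.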

Tony Toews
Yep, bingo. I only had the Key field indexed, but indexing PN as well made a massive difference. Wow, 14 seconds for the same search that took 4.2 minutes.
ExcelCyclist
A: 

The optimization tips that seem most obvious based on your code sample are these:

1) Make sure you have an index on MasterTable.PN. This is absolutely necessary to maximize performance and avoid table scans.

2) Experiment with combining the parameters and running them in a single query instead of 4000 separate queries, perhaps with an IN clause that concatenates all the values you are interested in. This isn't 100% guaranteed to be faster, but in my experience there tends to be a ton of overhead in running multiple queries in a loop like this.

Important: Don't take these as certain optimizations. Test-run each version several times to confirm that it really is faster in your specific situation. Your mileage may vary.
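
A rough sketch of the IN approach from point 2, using the table and field names from the question. The batch size of 200 is an assumption (Jet limits how long a SQL statement can be, so the values are sent in chunks), and the Scripting.Dictionary used to match results back to cells is my own addition, since an IN query does not return rows in worksheet order.

    Sub LookupKeysInBatches()
        Const BATCH_SIZE As Long = 200      ' assumed chunk size; tune by testing
        Dim cnn As ADODB.Connection, rst As ADODB.Recordset
        Dim found As Object, cell As Range
        Dim sList As String, n As Long, total As Long

        Set found = CreateObject("Scripting.Dictionary")
        found.CompareMode = vbTextCompare   ' match Jet's case-insensitive text compare

        Set cnn = New ADODB.Connection
        cnn.Provider = "Microsoft.Jet.OLEDB.4.0"
        cnn.Open "PNdb2003.mdb"

        total = Selection.Cells.Count
        For Each cell In Selection.Cells
            n = n + 1
            ' Double any embedded quote so the SQL string literal stays valid.
            sList = sList & ",""" & Replace(CStr(cell.Value), """", """""") & """"

            ' Run one query per full batch, and one for the final partial batch.
            If n Mod BATCH_SIZE = 0 Or n = total Then
                Set rst = cnn.Execute("SELECT PN, [Key] FROM [MasterTable] " & _
                                      "WHERE PN IN (" & Mid$(sList, 2) & ")")
                Do Until rst.EOF
                    ' If a PN occurs more than once in the table, the last row wins.
                    found(CStr(rst.Fields("PN").Value)) = rst.Fields("Key").Value
                    rst.MoveNext
                Loop
                rst.Close
                sList = ""
            End If
        Next cell
        cnn.Close

        ' Write each key into column 7 of its row, as the original loop did.
        For Each cell In Selection.Cells
            If found.Exists(CStr(cell.Value)) Then
                Cells(cell.Row, 7).Value = found(CStr(cell.Value))
            End If
        Next cell
    End Sub

With 4000 values this issues 20 queries instead of 4000. Whether that beats the indexed single-value lookups is something to measure rather than assume.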

JohnFx
+1  A: 

What about creating a SQL JOIN between the table in your Excel workbook and the table in your Access database? Fetch the resultset once (most crucially, in the same order as your workbook), then use CopyFromRecordset once to populate all rows in the workbook in one go. I imagine this would be orders of magnitude faster than opening a recordset four thousand times.

Update

Can you provide the SQL for how you join an Excel spreadsheet and an Access table

Something like this:

    SELECT A1.customer_number
      FROM [MS Access;Database=C:\Tempo\New_Jet_DB.mdb;].Customers AS A1
           LEFT OUTER JOIN
              [Excel 8.0;HDR=YES;IMEX=1;Database=C:\db.xls;].[Sheet1$] AS E1
              ON A1.customer_number = E1.col1;
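
Adapted to the names in the question, running such a join from Excel VBA might look roughly like the sketch below. Several things here are assumptions rather than anything stated in the thread: the workbook has been saved (Jet reads the file on disk, not unsaved edits), the part numbers are on a sheet named Sheet1 with the header text PN in the first row, and columns G and H are free for output. The sheet is placed on the left of the join so that every workbook row is returned. Because a join does not guarantee row order, the sketch returns the PN next to each Key instead of relying on the rows lining up.

    Sub LookupKeysWithJoin()
        Dim cnn As ADODB.Connection, rst As ADODB.Recordset
        Dim sSQL As String

        Set cnn = New ADODB.Connection
        cnn.Provider = "Microsoft.Jet.OLEDB.4.0"
        cnn.Open "PNdb2003.mdb"

        ' Sheet1 and its PN header are hypothetical; adjust to the real layout.
        sSQL = "SELECT E.PN, M.[Key] " & _
               "FROM [Excel 8.0;HDR=YES;IMEX=1;Database=" & _
               ThisWorkbook.FullName & ";].[Sheet1$] AS E " & _
               "LEFT OUTER JOIN [MasterTable] AS M ON E.PN = M.PN"

        Set rst = cnn.Execute(sSQL)

        ' One call writes every PN/Key pair, starting at G2 (assumed output area).
        ThisWorkbook.Worksheets("Sheet1").Range("G2").CopyFromRecordset rst

        rst.Close
        cnn.Close
    End Sub

Rows with no match in MasterTable come back with an empty Key, so missing part numbers are easy to spot.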
onedaywhen
I don't do much work with Access from Excel, so I can't figure out how to do what you're recommending. Can you provide the SQL for how you join an Excel spreadsheet and an Access table in ADO from Excel?
David-W-Fenton
+1 for an excellent answer (I, too, would immediately prefer a set operation on 4000 rows in a batch to 4000 individual operations), and for providing the SQL to do it.
David-W-Fenton