I am trying to extract a table of values from an Excel 2003 spreadsheet using VB6, and the result needs to be stored in an ADODB recordset. The table looks like this:

    Name   Option.1  Option.2  Option.3  Option.4  Option.5  Option.6 
    -----------------------------------------------------------------
    Name1         2         3         4
    Name2         2         3         4
    Name3         2         3         4
    Name4         2         3         4
    Name5         2         3         4
    Name6         2         3         4
    Name7         2         3         4
    Name8         2         3         4
    Name9         2         3         4         5         6         7  

Upon connecting and executing the query "SELECT * FROM [Sheet1$]", or even a column-specific "SELECT [Option#6] FROM [Sheet1$]" (see footnote 1), and looping through the results, I get Null values for row Name9, columns Option.4 through Option.6, rather than the correct values 5, 6, and 7. It seems the connection to the spreadsheet makes a "best guess" at the valid table limits and only takes a set number of rows into account.

To connect to the spreadsheet, I have tried both connection providers Microsoft.Jet.OLEDB.4.0 and MSDASQL and get the same problem.

Here are the connection settings I use:

    Set cn = New ADODB.Connection
    With cn
        .Provider = "Microsoft.Jet.OLEDB.4.0"
        .ConnectionString = "Data Source=" & filePath & ";Extended Properties=Excel 8.0;"
        ' - - - - OR (use one pair of Provider/ConnectionString settings, not both) - - - -
        .Provider = "MSDASQL"
        .ConnectionString = "Driver={Microsoft Excel Driver (*.xls)};" & _
                            "DBQ=" & filePath & ";MaxScanRows=0;"
        .CursorLocation = adUseClient
        .Open
    End With
    Set rsSelects = New ADODB.Recordset
    Set rsSelects = cn.Execute("SELECT [Option#5] FROM " & "[" & strTbl & "]")

This problem only occurs when there are more than 8 rows (excluding the column names). I have set MaxScanRows=0 for the MSDASQL connection, but this produces the same results.

Notable project references I have included are:

  • MS ActiveX Data Objects 2.8 Library
  • MS ActiveX Data Objects Recordset 2.8 Library
  • MS Excel 11.0 Object Library
  • MS Data Binding Collection VB 6.0 (SP4)

Any help in this matter would be very appreciated!

(1) For some reason, when including a decimal point in the column name, it is interpreted as a #.


Thanks everyone! Halfway through trying to set up a Schema.ini "programmatically" from KB155512, onedaywhen's excellent post pointed me towards the solution:

    .Provider = "Microsoft.Jet.OLEDB.4.0"
    .ConnectionString = "Data Source=" & filePath & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

I would encourage anyone with similar problems to read the post and comments, since there are slight variations to a solution from one person to another.
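
For anyone who wants to try this end to end, here is a minimal sketch of the connection above in a complete routine. It assumes a project reference to the Microsoft ActiveX Data Objects 2.8 Library; DumpSheet and its parameter names are hypothetical, made up for illustration. With IMEX=1 the provider treats mixed-type columns according to the ImportMixedTypes registry setting (Text by default), so expect the values to come back as text rather than numbers.

```vb
' Sketch only: DumpSheet is a hypothetical helper, not part of ADO.
' Requires a reference to the Microsoft ActiveX Data Objects 2.8 Library.
Public Sub DumpSheet(ByVal filePath As String, ByVal sheetName As String)
    Dim cn As ADODB.Connection
    Dim rs As ADODB.Recordset
    Dim fld As ADODB.Field

    Set cn = New ADODB.Connection
    With cn
        .Provider = "Microsoft.Jet.OLEDB.4.0"
        ' The doubled quotes keep HDR and IMEX inside Extended Properties.
        .ConnectionString = "Data Source=" & filePath & ";" & _
            "Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"
        .CursorLocation = adUseClient
        .Open
    End With

    ' Worksheet names are addressed as [SheetName$].
    Set rs = cn.Execute("SELECT * FROM [" & sheetName & "$]")

    ' Print every field of every row so Null cells are easy to spot.
    Do Until rs.EOF
        For Each fld In rs.Fields
            Debug.Print fld.Name & "=" & IIf(IsNull(fld.Value), "<Null>", fld.Value),
        Next fld
        Debug.Print
        rs.MoveNext
    Loop

    rs.Close
    cn.Close
End Sub
```

Calling DumpSheet filePath, "Sheet1" from the Immediate window should list each row. If the Name9 row still shows Null for the later Option columns, the registry settings discussed in the answers are the next thing to check.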

+3  A: 

The Excel ISAM driver by default looks into the first handful of your rows and guesses their data type. Should there be (later in the table) data that does not fit into the initial assumption, it frowns and turns it to NULL.

Your MaxScanRows=0 setting is the key to this problem. It sounds like it would do the Right Thing (scan the whole table for the data type to use), but really it doesn't.

See onedaywhen's answer for further details; my first info about KB282263 was not the correct advice.

Tomalak
The KB article you linked to is for text files, not Excel. A schema.ini file cannot be used in conjunction with Excel (which is a shame).
onedaywhen
A: 

The best advice I can give you is to stop doing it in the VB6 environment. Open Excel, press ALT+F11 and you will get the VBA IDE. Put your code in there. From within this environment you can access the full Excel object model.

I've seen many people try to interact with Excel in many different ways, and they all have problems. Using either a VBA macro or the add-in method is the best way I have found of getting at the data. It is how Microsoft gets Excel and Project to integrate with TFS.

Sometimes you need to rethink the process a little for this approach to be suitable. For example, you may need to get the user of the spreadsheet to run a macro that pushes the data out of the spreadsheet, instead of running a process that pulls the data from it, but usually it is quite doable.

Darrel Miller
This solution presumes an active user, and some measure of control over the user's environment. This may not be the case, in particular if Excel is simply being used as a data entry mechanism (e.g., collecting 50 spreadsheets from various entry personnel). True, one could include the VBA with the workbook / worksheet / template, but it becomes more complicated if the data will end up in a variety of databases (e.g., replicated databases around the world). Also, this solution assumes homogeneity of Excel versions.
David T. Macknet
+1  A: 

You are correct: it is guessing the data type based on a number of rows. There are local machine registry keys you may be able to alter to influence the data type chosen. For more details, see External Data - Mixed Data Types.
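
As a rough sketch of how to inspect those settings from VB6: for Jet 4.0 the values I know of are TypeGuessRows (default 8, which matches the "more than 8 rows" symptom) and ImportMixedTypes, under HKLM\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel. The routine name is hypothetical, and the key path is an assumption that applies to the Jet 4.0 provider only; verify it on your own machine before relying on it.

```vb
' Sketch only: ShowExcelTypeGuessSettings is a hypothetical helper.
' Reads the Jet 4.0 Excel ISAM settings through late-bound WScript.Shell,
' so no extra project reference is needed.
Public Sub ShowExcelTypeGuessSettings()
    ' Assumed location of the Jet 4.0 Excel engine settings.
    Const KEY_PATH As String = "HKLM\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\"
    Dim sh As Object

    Set sh = CreateObject("WScript.Shell")

    ' Number of rows sampled when guessing each column's data type.
    Debug.Print "TypeGuessRows = " & sh.RegRead(KEY_PATH & "TypeGuessRows")
    ' How mixed-type columns are imported when IMEX=1 is set.
    Debug.Print "ImportMixedTypes = " & sh.RegRead(KEY_PATH & "ImportMixedTypes")
End Sub
```

Reading is harmless, but changing these values affects every application on the machine that uses the Jet Excel driver, so a per-connection setting is usually the safer first step.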

onedaywhen
At least I was in the ballpark. +1
Tomalak