views:

1450

answers:

4

I have an asp.net C# application where i am reading the contents of a spreadsheet using OLEDBConnection. I am using the below line of code to read from the excel spreadsheet.

 OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fullFilePath + ";Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'");

One of my column has data in various formats like strings,numbers,date etc in various rows.When running this ,When the data format is different,its not reading that value fromthe excel file. I searched in net a lot and found that we need to mention IMEX proprety in connection string.I added that,but no positive response !.

After surfine a lot, ifound that Any built-in Excel driver will query the first 8 rows of a sheet and then make a determination (without your permission or knowledge) as to what type of column it is, thereby ignoring anything that doesn’t meet this data type later in the sheet.

http://www.mattjwilson.com/blog/2009/02/13/microsoft-excel-drivers-and-imex/

Is there anyway to get rid of this problem ?

A: 

Update: it seems like Microsoft really does not recommend using Excel COM services on servers. Still, many developers do, both on non-.NET (as my employer does) and .NET (see here) enviroment, as alternatives are costly. All problems are mostly solvable (aside from potential scalability and performance problems in high-volume applications and in some cases licence problems). Costly alternatives are using third-party solutions like this.

You should not use OleDbConnection when you have data of different datatypes in one column. You can try to read from Excel using Excel COM/OLE API, for example (compiled from here, may contain errors):

Include the following reference into the project :

Microsoft Excel 10.0 Object Library

Microsoft Office 10.0 Object Library

Include the name space Excel.

  using Excel;
  ...
      Excel.ApplicationClass xl = new Excel.Application();
      xl.Visible = false;
      xl.UserControl = false;
      Excel.Workbook theWorkbook = xl.Workbooks.Open(
         fileName, 0, true, 5,
          "", "", true, Excel.XlPlatform.xlWindows, "\t", false, false,
          0, true); 
     Excel.Sheets sheets = theWorkbook.Worksheets;
     Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);
     System.Array myvalues;
     Excel.Range range = worksheet.get_Range("A1", "E1".ToString());
     myvalues = (System.Array)range.Cells.Value;

Important! You should free the resources used. From here:

// Need all following code to clean up and extingush all references!!!
theWorkbook.Close(null,null,null);
xl.Workbooks.Close();
xl.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject (range);
System.Runtime.InteropServices.Marshal.ReleaseComObject (sheets);
System.Runtime.InteropServices.Marshal.ReleaseComObject (xl);
System.Runtime.InteropServices.Marshal.ReleaseComObject (worksheet);
System.Runtime.InteropServices.Marshal.ReleaseComObject (theWorkbook);
worksheet=null;
sheets=null;
theWorkbook=null;
xl = null;
GC.Collect(); // force final cleanup!
Dmitry
You shouldn't generally use COM Interop from a server application.
Joe
Why? I can see no other way to deal with Excel, and working with Excel is a must in some kind of applications.
Dmitry
Added about cleanup - forgot the first time.
Dmitry
A: 

SpreadsheetGear for .NET can read, write, calculate, etc... Excel workbooks and allows you to access the underlying data (number, text, logical, error) of any cell or the formatted text of any cell using APIs such as IWorksheet.Cells[rowIndex, colIndex].Value or IWorksheet.Cells[rowIndex, colIndex].Text. There is no limitation based on the type of data in each column / cell. SpreadsheetGear is 100% safe .NET code (no COM interop, no unsafe native calls, etc...) so it is easier to deploy than other options - especially in server scenarios.

You can see live samples here and download the free trial here.

Disclaimer: I own SpreadsheetGear LLC

Joe Erickson
A: 

Hi there,

Did you find a solution to this? I have the imex=1 set in my code but it doesnt seem to do any good. The only solution I can think of is to change the TypeGuessRows reg entry and take the performance hit, but what happens if you dont have access to the registry because the security policy of that user is locked down. I cant believe that microsoft would write a feature and in order to turn it off you have to set a feature to test everything just to find out that you dont want to use the feature, seems backwards.

Thanks

Will

wdhough
+1  A: 

You are running into one of the many fun features of the JET engine. This one will basically sample all the data in each row for a single column and it will try to guess the data format. If you want your code to "just work" then there is a registry setting that will help with this. However be forewarned that this registry setting will affect how JET works with all imports on a system, not just your particular import.

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]
"ImportMixedTypes"="Text"
"TypeGuessRows"=dword:00000000

This registry setting will tell JET to check the format of every row in a column before guessing a format. If it finds mixed content it will import the row as text.

By default JET tests the first 25 rows when type guessing.

Alternatively you can change TypeGuessRows to 1 and JET will check the first row only when type guessing. That means if the first row is a number and the second row is a string JET will assume all rows are numbers and you will not be able to read them using ADO.NET

Another caveat: Make sure you are careful when editing your registry. You can decimate your system very quickly if you do not use care.

havana59er
But this does not work for a public website which is hosted in a public server
Shyju