views:

9894

answers:

6

We have a C#/ASP.NET (2.0) application running on IIS 6 on Windows Server 2003 Enterprise Edition. This application reads Excel files using OleDb, but there are instances when we get an "Unspecified Error" exception thrown from within the application.

The file is stored in the temporary directory by our file upload code before opening. Since we have anonymous access enabled in IIS and since we also use impersonation in web.config, the folder C:\Windows\Temp\ has the proper permissions for the Internet Guest User Account (IUSR_[MachineName]) to be able to create, modify and delete files there.

OleDb connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Windows\Temp\tmp123.tmp.xls;
Extended Properties="Excel 8.0;HDR=Yes;IMEX=1;"

[The "Data Source" attribute above would change for every file.]

The stack trace of the exception is:
    System.Exception: FileParsingFailed ---> System.Data.OleDb.OleDbException:  
    Unspecified error at  
    System.Data.OleDb.OleDbConnectionInternal..ctor(OleDbConnectionString constr,  
    OleDbConnection connection) at  
    System.Data.OleDb.OleDbConnectionFactory.CreateConnection(DbConnectionOptions options,   
    Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningObject) at  
    System.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(DbConnection  
    owningConnection, DbConnectionPoolGroup poolGroup) at  
    System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection  
    owningConnection) at  
    System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection  
    outerConnection, DbConnectionFactory connectionFactory) at  
    System.Data.OleDb.OleDbConnection.Open()  

Workaround:
So far, the only workaround we could come up with was to do an iisreset (we also have application pool recycling configured to happen once everyday in IIS, but it doesn't seem to help since the issue sometimes persists over several consecutive days). While this is not a good thing to do, what makes it worse is that we have other applications on the same web site that would get impacted whenever we reset IIS.

The questions:
1. How do we resolve this error since it happens occasionally and we don't see a pattern?
2. Are there any better (and free) ways of processing Excel files from C#/ASP.NET apart from OleDb? (We prefer not to install MS Office on the servers since it's not recommended by Microsoft)

Our limitations:
1. We're stuck with the MS Office 2003 (.xls) format and cannot move to the MS Office 2007 (OOXML) format.
2. The reasons we don't use CSV are because we may have commas within our data (this is a pain to deal with even if we use quoting) and we also use multiple worksheets in our spreadsheet (this cannot be done with CSV).

Thanks! :)

Update:
Thanks, Keith. It does seem like an issue with the Jet engine, but we use it because of the lack of (free and easy to use) alternatives.
Thanks, Joe. But we're on a limited budget - so we're mainly looking for free tools/libraries.

A: 

I suspect the error is something to do with the venerable Jet OLEDB engine. It's fairly creaky - fine for most desktop things but not much use for enterprise data interchange.

If you can upgrade to a recent C# 3/.Net 3.5 you can use the System.IO.Packaging library to open Office 2007 files (.xlsx or .xlsm files).

These files are actually zips - rename them to .zip and you can just view the XML files inside.

The XML files' formats are fairly horrible (for instance cell comments are VML, ugh!) but readable.

Alternatively get your users to save the Excel tables as CSVs. I'd avoid the Microsoft text driver DB provider though - it's rubbish and can't handle unicode. CSVs are easy to read anyway.

Keith
+1  A: 

Make sure you are closing your connections.

For example when developing MS Access applications (Jet) this error occurs if too many connections are left open. It works fine (there is your occasionally) for a while until it reaches max open connections.

CRice
+2  A: 

I've been using SpreadSheetGear.NET for a while, mostly to create Excel files, and it works well.

http://www.spreadsheetgear.com/products/spreadsheetgear.net.aspx

It provides binary Excel file reading/writing in native .NET, solving all the previous problems I've encountered trying to use OLE and JET to read and create Excel files.

The basic version used to come free as a perk for registering Visual C++ Express 2005. This was unadvertised, so it may or may not still exist with the 2008 edition.

David
A: 

SpreadsheetGear for .NET gives you an API for working with xls and xlsx workbooks from .NET. It is easier to use and faster than OleDB or the Excel COM object model (keep reading for some evidence of this).

Disclaimer: I own SpreadsheetGear LLC

Below is the code to create a 50,000 row by 10 column workbook with SpreadsheetGear, save it to disk, and then sum the numbers using OleDb and SpreadsheetGear. SpreadsheetGear reads the 500K cells in 0.31 seconds compared to 0.63 seconds with OleDB - just over twice as fast. SpreadsheetGear actually creates and reads the workbook in less time than it takes to read the workbook with OleDB.

The code is below. You can see the live samples or try it for yourself with the free trial.

using System;
using System.Data; 
using System.Data.OleDb; 
using SpreadsheetGear;
using SpreadsheetGear.Advanced.Cells;
using System.Diagnostics;

namespace SpreadsheetGearAndOleDBBenchmark
{
    class Program
    {
        static void Main(string[] args)
        {
            // Warm up (get the code JITed).
            BM(10, 10);

            // Do it for real.
            BM(50000, 10);
        }

        static void BM(int rows, int cols)
        {
            // Compare the performance of OleDB to SpreadsheetGear for reading
            // workbooks. We sum numbers just to have something to do.
            //
            // Run on Windows Vista 32 bit, Visual Studio 2008, Release Build,
            // Run Without Debugger:
            //  Create time: 0.25 seconds
            //  OleDb Time: 0.63 seconds
            //  SpreadsheetGear Time: 0.31 seconds
            //
            // SpreadsheetGear is more than twice as fast at reading. Furthermore,
            // SpreadsheetGear can create the file and read it faster than OleDB
            // can just read it.
            string filename = @"C:\tmp\SpreadsheetGearOleDbBenchmark.xls";
            Console.WriteLine("\nCreating {0} rows x {1} columns", rows, cols);
            Stopwatch timer = Stopwatch.StartNew();
            double createSum = CreateWorkbook(filename, rows, cols);
            double createTime = timer.Elapsed.TotalSeconds;
            Console.WriteLine("Create sum of {0} took {1} seconds.", createSum, createTime);
            timer = Stopwatch.StartNew();
            double oleDbSum = ReadWithOleDB(filename);
            double oleDbTime = timer.Elapsed.TotalSeconds;
            Console.WriteLine("OleDb sum of {0} took {1} seconds.", oleDbSum, oleDbTime);
            timer = Stopwatch.StartNew();
            double spreadsheetGearSum = ReadWithSpreadsheetGear(filename);
            double spreadsheetGearTime = timer.Elapsed.TotalSeconds;
            Console.WriteLine("SpreadsheetGear sum of {0} took {1} seconds.", spreadsheetGearSum, spreadsheetGearTime);
        }

        static double CreateWorkbook(string filename, int rows, int cols)
        {
            IWorkbook workbook = Factory.GetWorkbook();
            IWorksheet worksheet = workbook.Worksheets[0];
            IValues values = (IValues)worksheet;
            double sum = 0.0;
            Random rand = new Random();
            // Put labels in the first row.
            foreach (IRange cell in worksheet.Cells[0, 0, 0, cols - 1])
                cell.Value = "Cell-" + cell.Address;
            // Using IRange and foreach would be less code, 
            // but we'll do it the fast way.
            for (int row = 1; row <= rows; row++)
            {
                for (int col = 0; col < cols; col++)
                {
                    double number = rand.NextDouble();
                    sum += number;
                    values.SetNumber(row, col, number);
                }
            }
            workbook.SaveAs(filename, FileFormat.Excel8);
            return sum;
        }

        static double ReadWithSpreadsheetGear(string filename)
        {
            IWorkbook workbook = Factory.GetWorkbook(filename);
            IWorksheet worksheet = workbook.Worksheets[0];
            IValues values = (IValues)worksheet;
            IRange usedRahge = worksheet.UsedRange;
            int rowCount = usedRahge.RowCount;
            int colCount = usedRahge.ColumnCount;
            double sum = 0.0;
            // We could use foreach (IRange cell in usedRange) for cleaner 
            // code, but this is faster.
            for (int row = 1; row <= rowCount; row++)
            {
                for (int col = 0; col < colCount; col++)
                {
                    IValue value = values[row, col];
                    if (value != null && value.Type == SpreadsheetGear.Advanced.Cells.ValueType.Number)
                        sum += value.Number;
                }
            }
            return sum;
        }

        static double ReadWithOleDB(string filename)
        {
            String connectionString =  
                "Provider=Microsoft.Jet.OLEDB.4.0;" + 
                "Data Source=" + filename + ";" + 
                "Extended Properties=Excel 8.0;"; 
            OleDbConnection connection = new OleDbConnection(connectionString); 
            connection.Open(); 
            OleDbCommand selectCommand =new OleDbCommand("SELECT * FROM [Sheet1$]", connection); 
            OleDbDataAdapter dataAdapter = new OleDbDataAdapter(); 
            dataAdapter.SelectCommand = selectCommand; 
            DataSet dataSet = new DataSet(); 
            dataAdapter.Fill(dataSet); 
            connection.Close(); 
            double sum = 0.0;
            // We'll make some assumptions for brevity of the code.
            DataTable dataTable = dataSet.Tables[0];
            int cols = dataTable.Columns.Count;
            foreach (DataRow row in dataTable.Rows)
            {
                for (int i = 0; i < cols; i++)
                {
                    object val = row[i];
                    if (val is double)
                        sum += (double)val;
                }
            }
            return sum;
        }
    }
}
Joe Erickson
A: 

The OLEDB steps are follows : try this

MyConnection = new System.Data.OleDb.OleDbConnection("provider=Microsoft.Jet.OLEDB.4.0;Data Source='c:\csharp.net-informations.xls';Extended Properties=Excel 8.0;");

MyCommand = new System.Data.OleDb.OleDbDataAdapter("select * from [Sheet1$]", MyConnection);

MyCommand.TableMappings.Add("Table", "TestTable"); DtSet = new System.Data.DataSet(); MyCommand.Fill(DtSet);

http://csharp.net-informations.com/excel/csharp-excel-oledb.htm

blt.

bolton
A: 

I had the same issue and it seems that it is fixed by closing the connection to the file (xls or csv) on every iteration of the loop. I assume you are also looping through a list of files and .Open() a new connection to each file. If you .Close() the connection at the end of the loop, the problem seems to go away.

mt99