tags:

views:

569

answers:

4

Suppose you don't know the name of the first worksheet in an excel workbook. And you want to find a way to read from the first page. This snippet sometimes works, but not always. Is it just me? Or is there a no brainer way to do this?

            MyConnection = new System.Data.OleDb.OleDbConnection("provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + inputFile + "';Extended Properties=Excel 8.0;");

            String[] excelSheets = new String[tbl.Rows.Count]; 
            int i = 0;
            foreach (DataRow row in tbl.Rows)
            {
                excelSheets[i] = row["TABLE_NAME"].ToString();
                i++;
            }
            string pageName = excelSheets[0]; 

            OleDbDataAdapter myAdapter = new System.Data.OleDb.OleDbDataAdapter("SELECT * FROM [" + pageName + "]", MyConnection);

Note: I am looking for the name of the first worksheet.

+3  A: 

If you have Office installed on the machine, why not just use Visual Studio Tools for Office (VSTO). Here is essentially the code to get the worksheet:

Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook workbook =  app.Workbooks.Open(fileName,otherarguments);
Microsoft.Office.Interop.Excel.Worksheet worksheet =  workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
Jacob Adams
I need to use an OleDbConnection. VSTO while nice, is not allowed for this project. thanks
eviljack
holy crap that actually worked!
eviljack
sorry, I downvoted you till I worked out the "otherarguments" then realized it worked. Trying to upvote but it won't let me.
eviljack
The call to 'workbook.Worksheets[0]' should be 'workbook.Worksheets[1]' since the Excel.Worksheets collection is base 1.
Mike Rosenblum
@mike-rosenblum good catch. I've edited my answer to include this
Jacob Adams
+1  A: 

This is the same as this other question First sheet Excel

I think that the order of the returned table gets messed up. We would need to find a way to get the order of the tabs. For now if you check your code, sometime the first sheet is index 0. But it can be returned in any order. I have tried deleting the other sheets and with only one you get the right name. But that wouldn't be pratical.

edit : after some research, it could be the tabs are returned in order of names Using Excel OleDb to get sheet names IN SHEET ORDER see link

Khan
+2  A: 

Your code seems to be missing the defintion of tbl. I assume it is something like

DataTable tbl = MyConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);

If so, you will probably get the sheetnames but in the wrong order.

I could not find a proper solution for this issue, so I approached it from another point of view. I decided to look for sheets that actual had information on it. You can probably do this by looking at the rows, but the method I used was to look at the columns from the schema information. (This obviously will fail in your used sheet only has one column as unused sheets also have one column), but it worked in my case, and I also used it to check I had the expected number of columns (in my case nine)

This uses the GetOleDbSchemaTable(OleDbSchemaGuid.Columns, null) method to return the column information.

The code is probably irrelevant/trival, and as I happened to be learning LINQ when I came across this issue, so I wrote it in LINQ style

It does require a small class called LinqList which you can get here

DataTable columnDetails = objConn.GetOleDbSchemaTable(
                          System.Data.OleDb.OleDbSchemaGuid.Columns,  null);    

LinqList<DataRow> rows = new LinqList<DataRow>(columnDetails.Rows);        

var query= (from r in rows
             group r by r["Table_Name"] into results
             select new { results.Key , count=results.Count() }  
            ); 

var activeSheets =  (from sheet in query
                      where sheet.count == 9 
                      select sheet.Key
                     ).ToList();

 if (activeSheets.Count != 1)
 ... display error
sgmoore
+1 for avoiding the use of automation and showing an actual example of the OleDbConnection.GetOleDbSchemaTable method.
Mike Rosenblum
Not sure the LinqList<T> class is needed though. Instead,you could make use the Enumerable.Cast<T>() extension method, as in 'LinqList<DataRow> rows = columnDetails.Rows.Cast<DataRow>();'.
Mike Rosenblum
I take it you mean 'var rows = columnDetails.Rows.Cast<DataRow>(); 'And yes that seems to work as well.Thanks
sgmoore
Oh, shoot, yes, I meant 'IEnumerable<DataRow> rows = columnDetails.Rows.Cast<DataRow>();', but of course 'var' works fine as well. Thanks for the pickup.
Mike Rosenblum
A: 

SpreadsheetGear for .NET will let you load a workbook and get the names of sheets (with IWorkbook.Worksheets[sheetIndex].Name) and get the raw data or formatted text of each cell (it does more but that's probably what you are looking for if you are currently using OleDB).

You can download a free trial here.

Disclaimer: I own SpreadsheetGear LLC

Joe Erickson