ansaurus

Question

Answer 1

A:

I highly recommend saving this Excel document in CSV format before doing anything else with it. You can do so using this code. After you have a CSV, you can either parse it using that library or write your own parser for it.
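If you write your own parser, a minimal sketch of splitting a single CSV line could look like the following. The names `CsvSketch` and `SplitLine` are made up for illustration. The sketch assumes comma separators and double-quote quoting, and it does not handle quoted fields that span several lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CsvSketch
{
    // Splits one CSV line into fields.
    // Handles quoted fields and "" as an escaped quote inside a quoted field.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote means a literal quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Length = 0;      // start the next field
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());  // last field has no trailing comma
        return fields;
    }
}
```

For example, `CsvSketch.SplitLine("Q1,\"Smith, John\",123")` yields three fields, with the comma inside the quotes kept as part of the second one.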

Hamish Grubijan 2009-12-16 21:47:52

Answer 2

A:

Not a straight answer to your question but an alternative idea:

Your data looks like a pivot-ish table. I'd recommend "unpivoting" it into a simple table.

Example:

           Russia      USA 
Q1            123       323
Q2            456       321
Q3            567       843

Becomes:

Quarter Country  Value
Q1      Russia     123
Q1      USA        323
Q2      Russia     456
....

If that is the case (I'm not sure I understood your question correctly), then processing the data using an OleDB driver or some CSV approach should become much less painful.
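A rough sketch of the unpivot step in C#, assuming the grid has already been read into arrays. The `UnpivotedRow` and `UnpivotSketch` names are invented for this example.

```csharp
using System;
using System.Collections.Generic;

class UnpivotedRow
{
    public string Quarter;
    public string Country;
    public int Value;
}

static class UnpivotSketch
{
    // countries:    the column headers of the pivot table ("Russia", "USA")
    // quarters:     the first cell of each data row ("Q1", "Q2", ...)
    // values[r][c]: the number at data row r, country column c
    public static List<UnpivotedRow> Unpivot(string[] countries, string[] quarters, int[][] values)
    {
        var result = new List<UnpivotedRow>();
        for (int r = 0; r < quarters.Length; r++)
        {
            for (int c = 0; c < countries.Length; c++)
            {
                // One output row per (quarter, country) cell
                result.Add(new UnpivotedRow
                {
                    Quarter = quarters[r],
                    Country = countries[c],
                    Value = values[r][c]
                });
            }
        }
        return result;
    }
}
```

With the example table above (2 countries, 3 quarters) this produces 6 rows, in the order Q1/Russia, Q1/USA, Q2/Russia, and so on.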

Alex 2009-12-16 22:14:07

Answer 3

A:

You can access Excel directly using ADO.NET via the ODBC driver. See http://www.davidhayden.com/blog/dave/archive/2006/05/26/2973.aspx or Google for more info on how to do that. You may wish to try HDR=No in your connection string, since your first row isn't really proper headers by the looks of it.

I haven't done this for a while, but I remember that it is a bit "temperamental" and takes some playing around with to get the column names right, but it should work. Try SELECT * FROM [Sheet1$] and see what you get.
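A minimal sketch of that query, with some assumptions to check against your setup: it uses the Jet OLE DB provider (`Microsoft.Jet.OLEDB.4.0`, which is 32-bit only and reads .xls files) rather than an ODBC driver, and the class and method names are made up. `IMEX=1` is included on the assumption that you want columns with mixed content read as text. With `HDR=No` the columns come back named F1, F2, and so on.

```csharp
using System.Data;
using System.Data.OleDb;

static class ExcelOleDbSketch
{
    // Builds a Jet connection string for an .xls file.
    // HDR=No: do not treat the first row as column headers.
    public static string BuildConnectionString(string path)
    {
        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path +
               ";Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\"";
    }

    // Loads every row of one worksheet into a DataTable.
    // sheetName is the tab name without the trailing '$', e.g. "Sheet1".
    public static DataTable ReadSheet(string path, string sheetName)
    {
        var table = new DataTable();
        using (var connection = new OleDbConnection(BuildConnectionString(path)))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM [" + sheetName + "$]", connection))
        {
            // Fill opens the connection if it is closed and closes it again afterwards
            adapter.Fill(table);
        }
        return table;
    }
}
```

Calling `ExcelOleDbSketch.ReadSheet(path, "Sheet1")` runs the `SELECT * FROM [Sheet1$]` query mentioned above. Dump the resulting rows first to see how the driver typed each column.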

Evgeny 2009-12-16 22:24:14

But assuming that there are many rows above the name that are garbage, would this method still work?

Norla 2009-12-16 22:31:37

It might. You'd have to see what Excel returns to you and figure out how to detect what's garbage and what's not.
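One possible way to do that detection, sketched under the assumption that you know a few header texts that mark the start of the real data. The header names passed in and the `HeaderFinderSketch` class are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class HeaderFinderSketch
{
    // Returns the index of the first row that contains every expected header text
    // (case-insensitive, ignoring surrounding whitespace), or -1 if no row matches.
    public static int FindHeaderRow(IList<object[]> rows, params string[] expectedHeaders)
    {
        for (int r = 0; r < rows.Count; r++)
        {
            // Collect the non-empty cell texts of this row
            var cells = new List<string>();
            foreach (object cell in rows[r])
            {
                if (cell != null && !(cell is DBNull))
                    cells.Add(cell.ToString().Trim());
            }

            bool allFound = true;
            foreach (string header in expectedHeaders)
            {
                bool found = false;
                foreach (string text in cells)
                {
                    if (string.Equals(text, header, StringComparison.OrdinalIgnoreCase))
                    {
                        found = true;
                        break;
                    }
                }
                if (!found)
                {
                    allFound = false;
                    break;
                }
            }

            if (allFound)
                return r; // everything above this index is treated as garbage
        }
        return -1;
    }
}
```

If the data was loaded into a `DataTable`, each `DataRow.ItemArray` gives the `object[]` this method expects. Rows after the returned index are then the real data.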

Evgeny 2009-12-16 23:58:42

Answer 4

A:

As I have done before, I prefer to use an OLEDB connection to connect to an Excel document.

By the way, you can take a look at the following article for more information: http://www.codeproject.com/KB/office/excel%5Fusing%5Foledb.aspx

Ramezanpour 2009-12-16 22:29:34

From the link: "How does it happen? Apparently, the engine reads the first 8 cells of each column and check it's data type. if most of the first 8 cells are int / double, the problem remains." I have a minimum of 14 rows before I get to real data. And yes, these do have other data (garbage) scattered throughout them.

Norla 2009-12-16 22:33:58

As far as I know, there's no problem with the solution above. Maybe I don't understand your problem, but if it is about reading data from an Excel document, OLEDB is a solution :-)

Ramezanpour 2009-12-16 23:06:20

Answer 5

+2 A:

If you want to read from Excel in C#, I've used this library with great success. It'll give you the flexibility to parse columns and rows however you'd like:

http://sourceforge.net/projects/koogra/ (read-only)

Other open source libraries I haven't used but could be good:

http://nexcel.sourceforge.net/ (read-only)
http://npoi.codeplex.com/ (can read and write)
~~http://developer.novell.com/wiki/index.php/Poi.Net~~ (this project is dead)

Alternatively, you can use one of the many good Java libraries, and convert it into a C# assembly using IKVM:

http://jxls.sourceforge.net/
http://www.andykhan.com/jexcelapi/
http://poi.apache.org/ (this one's the grand-daddy of java XLS libraries)

I've covered how to do the IKVM Java -> C# conversion here (it's really not as horrible an option as you think):

http://splinter.com.au/blog/?p=207

Chris 2009-12-16 22:36:34

I look forward to trying koogra and nexcel. I'll let you know how they work out for me.

Norla 2009-12-16 22:47:40

I've been using koogra for more than a year in a production system that parses half a dozen Excel files daily. It really works quite well, but its documentation is nonexistent. Get in touch if you get stuck with it.

Chris 2009-12-16 22:50:13

+1 Chris, POI via IKVM isn't such an unwise option as it first appears. It's a very robust library.

Mark Nold 2009-12-17 01:46:59

POI.NET is definitely dead. NPOI is very much alive and robust, and definitely a better option than POI + IKVM.

Nate 2009-12-17 14:21:47

Answer 6

A:

SpreadsheetGear for .NET can load workbooks and access any cells on any sheet in any order. You can get the formatted text of the cell (such as "1/1/09") or the underlying value ("1/1/09" is stored as the double 39814.0 in Excel or SpreadsheetGear).

You can see some live ASP.NET samples here and download the free trial here if you want to try it yourself.

Disclaimer: I own SpreadsheetGear LLC

Joe Erickson 2009-12-16 23:56:14

Not gonna lie. I wish I had a disclaimer like that. I am, however, looking for something more... free-ish.

Norla 2009-12-17 14:37:16


C#: Reading data from an xls document
