Hi

I want to read an Excel file from C. The Excel 2007 file contains about 6000 rows and 2 columns, and I want to store the contents in a 2-D array. If there is a C library or any other method for doing this, please let me know.

Thanks.

+2  A: 

You have several choices:
1) Save your Excel worksheet as a CSV file and parse that.
2) Use the COM API (Windows-only, proprietary, and tricky).
3) See this link for a C++ class that you could modify.
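
For option 1, here is a minimal sketch of reading the exported CSV into a 2-D array. It assumes both columns are numeric and that no field is quoted or contains a comma; `read_csv`, `MAX_ROWS` and `NUM_COLS` are names made up for this illustration, not part of any library.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_ROWS 6000   /* assumption: roughly the row count from the question */
#define NUM_COLS 2

/* Reads up to max_rows lines of the form "number,number" from fp into data.
   Returns the number of rows stored. Lines that do not parse as two numbers
   (for example a header line) are skipped. */
size_t read_csv(FILE *fp, double data[][NUM_COLS], size_t max_rows)
{
    char line[256];
    size_t n = 0;

    while (n < max_rows && fgets(line, sizeof line, fp) != NULL) {
        char *end;
        double a = strtod(line, &end);
        if (end == line || *end != ',')
            continue;                 /* first field is not a number */
        char *second = end + 1;
        double b = strtod(second, &end);
        if (end == second)
            continue;                 /* second field is not a number */
        data[n][0] = a;
        data[n][1] = b;
        n++;
    }
    return n;
}
```

A caller would typically declare `static double data[MAX_ROWS][NUM_COLS];`, open the file with `fopen`, and pass both to `read_csv`. If the columns hold text rather than numbers, the `strtod` calls would need to be replaced with string copying.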

Romain Hippeau
4) Pure Python solution :) http://juno-devel.ovh.org/Public/Code/Python/xls2csv.0.4.py - it only depends on the `xlrd` package
Marco Mariani
The linked C++ class only appears to work for the older `.xls` format, not the Excel 2007 `.xlsx` format. There's little similarity between the formats.
Jerry Coffin
@Marco Mariani: The OP wants to read Excel data into memory; he doesn't want to butcher it and write it to a pseudo-CSV file. He should use `xlrd` directly [if he's happy with Python instead of C, and happy with using alpha code for reading XLSX files].
John Machin
A: 

Another C library for reading data from Excel files can be found here.

Praveen S
+3  A: 

Excel 2007 stores the data in a bunch of files, most of them in XML, all crammed together into a zip file. If you want to look at the contents, you can rename your .xlsx to whatever.zip and then open it and look at the files inside.

Assuming your Excel file just contains raw data, and all you care about is reading it (i.e., you do not need/want to update its contents and get Excel to open it again), reading the data is actually pretty easy. Inside the zip file, you're looking for the subdirectory xl/worksheets/, which contains a number of .xml files, one for each worksheet in the workbook (e.g., a default workbook will have three worksheets, stored as sheet1.xml, sheet2.xml and sheet3.xml).

Inside of those, you're looking for the <sheetData> tag. Inside of that, you'll have <row> tags (one for each row of data), and inside of them <c> tags with an attribute r="RC", where RC is replaced by the normal cell reference notation (e.g., "A1"). Each <c> tag will have a nested <v> tag where you'll find the value for that cell.
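
To place a value in a 2-D array you need to turn the r attribute of each <c> tag into array indices. The following is a rough sketch of that one step only; it does not unzip the file or parse the XML, and `parse_cell_ref` is a hypothetical helper name.

```c
#include <ctype.h>

/* Splits an A1-style reference such as "B12" into zero-based indices.
   Column letters are read as a base-26 number (A=1 ... Z=26, AA=27).
   Returns 0 on success, -1 if the text is not letters followed by digits.
   Note: very long inputs are not guarded against integer overflow. */
int parse_cell_ref(const char *ref, int *row, int *col)
{
    int c = 0, r = 0;
    const char *p = ref;

    while (isalpha((unsigned char)*p)) {
        c = c * 26 + (toupper((unsigned char)*p) - 'A' + 1);
        p++;
    }
    if (p == ref || !isdigit((unsigned char)*p))
        return -1;                    /* no letters, or no digits after them */
    while (isdigit((unsigned char)*p)) {
        r = r * 10 + (*p - '0');
        p++;
    }
    if (*p != '\0' || r == 0)
        return -1;                    /* trailing junk, or row number 0 */
    *row = r - 1;
    *col = c - 1;
    return 0;
}
```

With this, the cell whose reference is "B12" would go to `array[11][1]`.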

Jerry Coffin
I was just typing this up! +1 for thinking like me <grin>
cciotti
@Jerry Coffin, @cciotti: It's all **superficially** very easy ... BUT: you need to examine the `xl/_rels/workbook.xml.rels` stream in case the user has shuffled the sheet order; the `<v>` element for text cells gives you an index into the `xl/sharedStrings.xml` stream; got dates in your data? You might want to look in the `xl/styles.xml` stream and decode the `num_format_str` to tell whether your floats are dates or numbers, and look in the `xl/workbook.xml` stream for the date epoch (1900 or 1904) plus other maybe-useful workbook-level info; `_xdead__xbeef_`-style escaping of not-valid-XML characters.
John Machin
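
On the date epoch point, a small sketch of turning the whole-day part of a cell value into a calendar date once you know which epoch the workbook uses. The function and struct names are invented for this example. It relies on serial 25569 being 1970-01-01 in the 1900 system and on the 1904 system starting 1462 days later; deciding whether a given cell is a date at all still requires the style lookup described above.

```c
/* Calendar date produced by the conversion below. */
struct ymd { int year, month, day; };

/* Converts a day count relative to 1970-01-01 into a Gregorian date,
   using a widely used integer-only days-to-civil-date algorithm. */
static struct ymd civil_from_days(long z)
{
    z += 719468;
    long era = (z >= 0 ? z : z - 146096) / 146097;
    long doe = z - era * 146097;                      /* day of 400-year era */
    long yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    long doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    long mp = (5 * doy + 2) / 153;                    /* month, March = 0 */
    struct ymd d;
    d.day = (int)(doy - (153 * mp + 2) / 5 + 1);
    d.month = (int)(mp < 10 ? mp + 3 : mp - 9);
    d.year = (int)(yoe + era * 400 + (d.month <= 2 ? 1 : 0));
    return d;
}

/* serial: whole-day part of the cell value.
   date1904: nonzero if the workbook uses the 1904 epoch.
   Caveat: in the 1900 system, serials below 61 come out one day early,
   because Excel counts a nonexistent 1900-02-29 as serial 60. */
struct ymd excel_serial_to_date(long serial, int date1904)
{
    long unix_days = date1904 ? serial - 24107 : serial - 25569;
    return civil_from_days(unix_days);
}
```

The fractional part of the cell value, if any, is the time of day and is ignored here.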