Hi, I'm looking to parse spreadsheets (xls/ods) in Groovy. I have been using the Roo library for Ruby and want to try the same tasks in Groovy, since Java is already installed on a development server I use and I would like to keep the number of technologies on that server down to a small core.

I am aware that the ods format is zipped XML, and so can be parsed as such, but I would like to process the file using spreadsheet concepts, not XML concepts.

The ability to process xls files is not of major importance, but would save me having to save multiple xls files to ods (as this is for parsing data from clients).

Thanks

+7  A: 

I would suggest Apache POI for access to .xls files.

I've never had to work with the .ods format, so no information on that one.

jdmichal
+2  A: 

There's also JExcelAPI, which has a nice, clean, simple interface (for the most part).

Can't help you with ODS Files though.

cletus
I've used JExcelAPI in a Grails project and it worked beautifully. I highly recommend it.
anschoewe
A: 

If your spreadsheets are simple enough - without charts and other embedded content - you should simply convert the spreadsheet to CSV.

Pros:

  • Both xls and ods will produce the same CSV - You'll have to handle just one input type.
  • You won't have to mess with new versions of (Open) Office.
  • Handling plaintext is always more fun than other obscure formats.

Cons:

  • One that I can think of - finding a reliable converter from xls and ods to CSV. Shouldn't be too hard - OpenOffice has a built-in one.
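
Once the spreadsheet has been exported, reading it comes down to splitting each line into fields. Below is a minimal sketch of that step in plain Java (callable from Groovy as-is), assuming comma-separated fields, double quotes around fields that contain commas, and no line breaks inside a field. The class name `CsvLineParser` and its `parse` method are made up for illustration; for real client data a dedicated CSV library is probably the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any library named in this thread.
class CsvLineParser {

    // Splits one CSV record into fields. Handles double-quoted fields and
    // treats a doubled quote ("") inside a quoted field as a literal quote.
    // Assumption: the record does not contain embedded line breaks.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Print each field of a sample record on its own line.
        for (String field : parse("Acme,\"Smith, John\",42")) {
            System.out.println(field);
        }
    }
}
```

Note that everything comes back as a string, so dates and numbers still need converting afterwards.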
Adam Matan
In the perfect situation I would like to automate the data extraction, if needed, with as few steps as possible. So I want to treat the spreadsheets as the raw data.
Brian Heylin
A: 

A couple things:

1) I agree that using a CSV format can simplify some of the development work. OpenCSV can help with processing CSV files. There are other good CSV parsers for Java out there. Just remember that anything that's available for Java can be used by Groovy due to Groovy's unparalleled integration with Java.

2) I know you said you wanted to avoid handling XML, but Groovy makes XML processing exceedingly simple.
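
To give an idea of how little is involved for ODS specifically, here is a rough sketch using only the JDK (`java.util.zip` and the built-in DOM parser), written in Java so it can be called from Groovy unchanged. It opens the zip, finds `content.xml`, and returns the first sheet as a list of rows, each a list of cell strings. The class and method names (`OdsReader`, `readFirstSheet`) are invented for this example. It assumes Java 9+ (for `readAllBytes`) and deliberately ignores the `table:number-columns-repeated` and `table:number-rows-repeated` attributes, so runs of identical or empty cells come back as a single cell.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not an existing library class.
class OdsReader {
    // Namespaces defined by the OpenDocument format.
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Reads the first sheet of an .ods file as rows of cell text.
    static List<List<String>> readFirstSheet(InputStream ods) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(ods)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("content.xml")) {
                    byte[] xml = zip.readAllBytes(); // reads only the current entry
                    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                    factory.setNamespaceAware(true);
                    Document doc = factory.newDocumentBuilder()
                            .parse(new ByteArrayInputStream(xml));
                    return rowsOf(doc);
                }
            }
        }
        throw new IOException("content.xml not found; is this an ODS file?");
    }

    // Walks table -> table-row -> table-cell -> text:p.
    // Simplification: repeated-cell and repeated-row attributes are ignored.
    static List<List<String>> rowsOf(Document doc) {
        List<List<String>> rows = new ArrayList<>();
        NodeList tables = doc.getElementsByTagNameNS(TABLE_NS, "table");
        if (tables.getLength() == 0) {
            return rows;
        }
        Element sheet = (Element) tables.item(0);
        NodeList rowNodes = sheet.getElementsByTagNameNS(TABLE_NS, "table-row");
        for (int r = 0; r < rowNodes.getLength(); r++) {
            Element row = (Element) rowNodes.item(r);
            NodeList cellNodes = row.getElementsByTagNameNS(TABLE_NS, "table-cell");
            List<String> cells = new ArrayList<>();
            for (int c = 0; c < cellNodes.getLength(); c++) {
                Element cell = (Element) cellNodes.item(c);
                NodeList paragraphs = cell.getElementsByTagNameNS(TEXT_NS, "p");
                StringBuilder text = new StringBuilder();
                for (int p = 0; p < paragraphs.getLength(); p++) {
                    if (p > 0) {
                        text.append('\n'); // multi-line cell
                    }
                    text.append(paragraphs.item(p).getTextContent());
                }
                cells.add(text.toString());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java OdsReader.java some-file.ods
        try (InputStream in = new FileInputStream(args[0])) {
            for (List<String> row : readFirstSheet(in)) {
                System.out.println(String.join(" | ", row));
            }
        }
    }
}
```

This reads the displayed text of each cell rather than the typed value; the typed values live in attributes on `table:table-cell`, which is where knowing the schema starts to matter.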

yawmark
It sure does, but as always, I want even simpler :) I'm basically looking for a Roo clone, as in: open the spreadsheet, iterate over lines, iterate over cells, extract data. Otherwise it means opening and extracting a zip file, then parsing an XML file, plus knowing the XML schema.
Brian Heylin
+1  A: 

I second jdmichal's vote for Apache POI. I have selected it as our library of choice for handling Excel file input (.XLS). The project is also working on the .XLSX file format, if you ever decide you want to support that. Based on your specifications, I don't think you want to get into converting things to CSV, and it seems you already have established input and output paths. For anyone who hasn't had the joy of dealing with CSV to Excel conversion, it can get a bit dicey. I have spent hours dealing with issues created by Excel converting string data to numeric data. You can see other testimonies to this effect on the POI Case Studies page. Beyond these issues, I simply don't want to have to handle these inputs personally. I'd rather invest the programming effort and streamline the workflow for the future.

I too have not dealt with ODF and have no plans to support it in my current project. You might want to check out the OpenOffice.org ODF Toolkit Project.

Good luck and have fun, - D.

dshaw
+1  A: 

How about looking at 'odftoolkit' ? http://odftoolkit.openoffice.org/

Amit
+1  A: 

Groovy in Action has a chapter named "Groovy on Windows" that discusses using Scriptom, a Groovy/COM bridge (using JACOB under the covers), to access several Windows apps including Excel.

For OpenOffice, you can use ODF Toolkit, as Amit pointed out.

Matt Passell
+1  A: 

I suggest you take a look at SimpleXlsBuilder and SimpleXlsSlurper. Both are based on Apache POI and can cover your basic needs for reading from and writing to Excel 97 spreadsheets in a concise way.

A: 

You may want to take a look at SmartXLS for Java. It is a pure Java spreadsheet component that supports most Excel features, such as formulas, charts and rich text.

liya