views:

449

answers:

3

I need to combine several tab-separated value (TSV) files into an Excel 2007 (XLSX) spreadsheet, preferably using Python. There is not much cleverness needed in combining them - just copying each TSV file onto a separate sheet in Excel will do. Of course, the data needs to be split into columns and rows same as Excel does when I manually copy-paste the data into the UI.

I've had a look at the raw XML file Excel 2007 generates and it's huge and complex, so writing that from scratch doesn't seem realistic. Are there any libraries available for this?

+1  A: 

Looks like xlwt may serve your needs -- you can read each TSV file with Python's standard library csv module (which DOES do tab-separated as well as comma-separated etc, don't worry!-) and use xlwt (maybe via this cheatsheet;-) to create an XLS file, make sheets in it, build each sheet from the data you read via csv, etc. Not sure about XLSX vs plain XLS support but maybe the XLS might be enough...?

Alex Martelli
A: 

The best python module for directly creating Excel files is xlwt, but it doesn't support XLSX.

As I see it, your options are:

  1. If you only have "several", you could just do it by hand.
  2. Use pythonwin to control Excel through COM. This requires you to run the code on a Windows machine with Excel 2007 installed.
  3. Use python to do some preprocessing on the TSV to produce a format that will make step (1) easier. I'm not sure if Excel reads TSV, but it will certainly read CSV files directly.
John Fouhy
+1  A: 

Note that Excel 2007 will quite happily read "legacy" XLS files (those written by Excel 97-2003 and by xlwt). You need XLSX files because .....?

If you want to go with the defaults that Excel will choose when deciding whether each piece of your data is a number, a date, or some text, use pythonwin to drive Excel 2007. If the data is in a fixed layout such that other than a possible heading row, each column contains data that is all of one known type, consider using xlwt.

You may wish to approach xlwt via http://www.python-excel.org which contains an up-to-date tutorial for xlrd, xlwt, and xlutils.

John Machin