I know some Python xlsx readers are emerging, but from what I've seen they don't seem nearly as intuitive as the built-in csv module.

What I want is a module that can do something like this:

reader = xlsx.reader(open('/path/to/file'))

for sheet in reader:
    print 'In %s we have the following employees:' % (sheet.name)
    for row in sheet:
        print '%s, %s years old' % (row['Employee'], row['Age'])

Is there such a reader?

+2  A: 

Well, maybe not for the xlsx format, but certainly for xls. Grab xlrd from here:

http://www.python-excel.org/

Here's some example code to get a feel for how easy it is to work with:

import xlrd

# Zero-based column indexes of the fields to print; adjust for your sheet.
EMPLOYEE_CELL = 5
AGE_CELL = 6

reader = xlrd.open_workbook('C:\\path\\to\\excel_file.xls')
for sheet in reader.sheets():
    print 'In %s we have the following employees:' % (sheet.name)
    # Note: this walks every row, including any header row.
    for r in xrange(sheet.nrows):
        row_cells = sheet.row(r)  # list of Cell objects for row r
        print '%s, %s years old' % (row_cells[EMPLOYEE_CELL].value, row_cells[AGE_CELL].value)

If you can save the documents as xls, you should be good. I haven't run the code above, but it should be pretty close, if not 100% correct. Try it out and let me know.
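If you want the header-keyed access from the question (row['Employee'] rather than a column index), one way is a small wrapper in the spirit of csv.DictReader. This is only a rough sketch: dict_rows and its header_row parameter are names I made up, not part of xlrd, and it assumes the first row of each sheet holds the column names. It relies only on the sheet's nrows attribute and row_values() method.

```python
def dict_rows(sheet, header_row=0):
    """Yield each data row of a sheet as a dict keyed by the header row.

    `sheet` is assumed to behave like an xlrd Sheet: it has an `nrows`
    attribute and a `row_values(rowx)` method returning a list of values.
    `dict_rows` itself is a hypothetical helper, not an xlrd function.
    """
    if sheet.nrows <= header_row:
        return  # no header row, so there is nothing to yield
    headers = sheet.row_values(header_row)
    for r in range(header_row + 1, sheet.nrows):
        # Pair each header with the value in the same column of row r.
        yield dict(zip(headers, sheet.row_values(r)))
```

With that, the inner loop above could become "for row in dict_rows(sheet):" and then use row['Employee'] and row['Age']. Keep in mind that xlrd hands back numeric cells as floats, so an age may come out as 30.0 rather than 30.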

EDIT:

I'm guessing you're trying to do this on a non-Windows machine. You may be able to use something like PyODConverter to convert the document from xlsx to xls, and then run the script against the converted file. Something like this:

user@server:~# python DocumentConverter.py excel_file.xlsx excel_file.xls
user@server:~# python script_with_code_above.py

Once again, haven't tested it out but hopefully it'll work for your needs.

Eric Palakovich Carr
+2  A: 

xlrd has xlsx handling for basic data extraction, using the same APIs as for xls; it is in alpha testing at the moment. Send me a private e-mail if you're interested.

John Machin