I have an Excel spreadsheet that has many people's estimates of another person's height and weight. In addition, some people have left comments on both estimate cells like "This estimate takes into account such and such".

I want to take the data from the spreadsheet (I've already figured out how to parse it), and represent it in a plain text file such that I can easily parse it back into a structured format (using Perl, ideally).

Originally I thought to use YAML:

Tom:
  Height:
    Estimate: 5
    Comment: Not that confident
  Weight:
    Estimate: 7
    Comment: Very confident
Natalia: ...

But now I'm thinking this is a bit difficult to read, and I was wondering if there is some textual tabular representation that would be easier to read and still parsable.

Something like:

PERSON      HEIGHT     WEIGHT
-----------------------------
Tom         5          7
___START_HEIGHT_COMMENT___
    We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.  That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed [...]  
Wait, what's this project about again?
___END_HEIGHT_COMMENT___
___START_WEIGHT_COMMENT___
    We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.  That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed [...]  
Wait, what's this project about again?
___END_WEIGHT_COMMENT___

Natalia     2          4
John        3          3

Is there a better way to do this?

+3  A: 

CSV (Comma Separated Values).

You can even save it directly into this format from Excel, and read it directly into Excel from this format. Yet it is also human readable, and easily machine parseable.

Robert Harvey
I don't think this takes into account the comments.
Matthew Flaschen
You can simply use "comment" columns instead. The end result would be the following set of columns: person, height estimate, height comment, weight estimate, weight comment.
Wim Leers
CSV is far less readable than YAML. It's horrid if there are many columns and the column width varies.
Michael Carman
@Michael Carman: Horrid for whom? Humans or computers? Almost every environment I've touched has had a built-in CSV reader, but very few have built-in YAML libraries.
TokenMacGuy
@TokenMacGuy: Horrid for humans. The OP wanted something both readable and parseable. CSV makes the second easy but fails at the former, IMHO. Personally, I think YAML is a good choice, but the OP already ruled it out.
Michael Carman
A: 

Adding to Robert's answer: you can simply put the comments in additional columns (Excel's CSV export will quote any field that contains a comma, so the commas won't break the columns). More on the CSV format: www.csvreader.com/csv_format.php
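To illustrate the comment-column idea, here is a minimal sketch of a round trip through CSV. It assumes Python's standard `csv` module rather than Perl (on the Perl side, a CPAN module such as Text::CSV would be the likely counterpart), and the column names and sample rows are hypothetical. The point is only that commas and line breaks inside a comment survive, because the writer quotes those fields.

```python
import csv
import io

# Hypothetical column layout: one estimate and one comment column per attribute.
FIELDS = ["person", "height_estimate", "height_comment",
          "weight_estimate", "weight_comment"]

# Hypothetical sample data; comments deliberately contain a comma and a newline.
rows = [
    {"person": "Tom", "height_estimate": "5",
     "height_comment": "Not that confident, but close",
     "weight_estimate": "7",
     "weight_comment": "Very confident.\nMeasured twice."},
    {"person": "Natalia", "height_estimate": "2", "height_comment": "",
     "weight_estimate": "4", "weight_comment": ""},
]


def dump_csv(records):
    """Serialize records to CSV text; fields with commas or newlines get quoted."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def load_csv(text):
    """Parse CSV text back into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


text = dump_csv(rows)
restored = load_csv(text)
```

Note that every value comes back as a string, so numeric estimates would need converting after loading.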

Eric Drechsel
A: 

No reason you can't use XML, though I'd imagine it's overkill in this particular case.

Ari Roth
+1  A: 

Normally, if I want to capture data from a spreadsheet in textual form, I use CSV (which Excel can read and write). It's easy to generate and parse, and it's compatible with many other tools, but it doesn't rank high on the "human readable" chart. It can be read, but it's awkward for anything but simple files with equal field widths.

XML is an option, but YAML is easier to read. Being human-readable is one of the design goals of YAML. The YAML::Tiny module is a nice and lightweight module for typical cases.

It looks like what you have in mind is a plain text table, or possibly a tabular format with fixed-width columns. There are some modules on CPAN that might be useful: Text::Table, Text::SimpleTable, and others. These modules can generate a representation that's easy to read, but parsing it will be harder. (They're intended for data presentation, not storage and retrieval.) You'd probably have to build your own parser.
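As a rough idea of what "build your own parser" might involve for the table proposed in the question, here is a minimal sketch in Python (a Perl version would follow the same line-by-line structure). The function name `parse_estimates` and the sample text are hypothetical. It assumes names contain no whitespace, the first two lines are the header and the dashed rule, every START marker is closed by the matching END marker, and no comment line consists solely of a marker.

```python
import re

# Matches the comment delimiters used in the question's proposed format.
MARKER = re.compile(r"^___(START|END)_(HEIGHT|WEIGHT)_COMMENT___$")


def parse_estimates(text):
    """Parse the table-plus-comment-blocks layout into a nested dict."""
    people = {}
    current = None      # record for the most recent data row
    open_field = None   # "height" or "weight" while inside a comment block
    buffer = []
    for line in text.splitlines()[2:]:  # skip the header row and dashed rule
        marker = MARKER.match(line.strip())
        if marker:
            kind, field = marker.group(1), marker.group(2).lower()
            if kind == "START":
                open_field, buffer = field, []
            else:
                # Assumes the block belongs to the row directly above it.
                current[field]["comment"] = "\n".join(buffer).strip()
                open_field = None
        elif open_field is not None:
            buffer.append(line.strip())
        elif line.strip():
            name, height, weight = line.split()
            current = {"height": {"estimate": int(height), "comment": ""},
                       "weight": {"estimate": int(weight), "comment": ""}}
            people[name] = current
    return people


# Hypothetical sample in the question's layout.
SAMPLE = """\
PERSON      HEIGHT     WEIGHT
-----------------------------
Tom         5          7
___START_HEIGHT_COMMENT___
    Not that confident.
Wait, what's this project about again?
___END_HEIGHT_COMMENT___

Natalia     2          4
John        3          3
"""

result = parse_estimates(SAMPLE)
```

The fragility is visible even in this small sketch: a name with a space or a stray marker line breaks it, which is the trade-off against CSV or YAML, where an existing library handles quoting for you.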

Michael Carman
A: 

There's also Config::General for simple data, and its family of related classes.

Ether