tags:

views:

339

answers:

4

So I have a CSV that contains a filename and a file's contents. The file is full of newlines and all sorts of other characters. I need to have each row contain the filename in one column and the file contents in the next column, and then I need a new row, until the end of the file.

The data looks like this:

"filename.txt","hey there
buddy
how are you
doing"
"filename2.txt","and so on..."

According to http://creativyst.com/Doc/Articles/CSV/CSV01.htm#EmbedBRs, my CSV importer should read those line-breaky values as single values, not new rows. However, I have tried OO Calc, KSpread, Gnumeric, and even Microsoft Excel 2007 and none of them read it that way; they all consider each newline a new row.

Anyone know how to fix this? I've looked through other related questions but none of them seem to say.

If this doesn't work, I'll have to write the file directly with a Python Excel-writing module or something. Anyone know what to do here?

A: 

Specify quote (") as text qualifier when importing. Newlines will be ignored then.

LymanZerga
I do that and it doesn't seem to make a difference. I was using """" as the quote character for a while because my files' contents might contain ". Tried all of that with no difference.Right now using " for quote and , for delimiter because the linked site says that Excel would read that correctly. It doesn't.
cookiecaper
Judging by the set of tools you are trying to parse your csv with, I would also venture a guess that there might be a mix up with *nix newline characters and Windows' carriage return + line feed. I.e. Excel won't recognize newline as a proper line break.
LymanZerga
A: 

The free LumenWorks CSV reader handles newlines in the data if you set an appropriate flag.

Eric J.
A: 

Excel 2007 will read them fine, but you will still have the carriage returns in the column.

You'll need to remove them before importing the file.

If your creating the file yourself from SQL Server you can remove them easily.

REPLACE(REPLACE(Field, CHAR(13),' '), CHAR(10), ' ') AS FixedField
JeremySpouken
A: 

I had to switch to something that did Excel format I/O directly, I did not get Excel to parse newlines from CSV as expected.

Even then it didn't end up working out because Excel has a hard character-per-cell limit. The client decided not to continue to pursue this and worked out an alternate submission process with the developer of the software they needed the Excel file for.

cookiecaper