tags:

views:

1197

answers:

4

For part of a web application the user needs to import some data from a spreadsheet. Such as a list of names and email addresses. Currently we do this by asking the user to browse for and upload a CSV file.

Unfortunately, this isn't always possible as various corporate IT systems only allow users to access files from document management systems and they have no privileges to save these to a local drive or network share.

The solution we have is to allow the user to cut and paste the entire sheet (CSV) into a text area and submit that. On doing this you get a nice tab delimited list of the data that is easily parsed as below.

Lorem   ipsum dolor sit amet
consectetur adipiscing elit Vivamus fermentum
Vivamus et diam eu eros
egestas rutrum Morbi id neque
id  enim molestie tincidunt Nullam

Unfortunately, various cells could produce unexpected results. In the set below you can see a " within the word prerium, a tab with the word Suspendisse and a line break within the word sollicitudin.

bibendum    ante molestie lectus Sed
pret"ium    "Susp endisse" "sollic
itudin" nisi at
urna    Sed sit amet odio
eu  neque egestas tincidunt Nunc
metus   Curabitur at nibh Nulla

In this case I cannot just split on tabs and line breaks without a more enhance mechanism to deal with the tabs, quotes and line breaks within the actual data.

Does anyone know of any code that can handle this reliably? Or even if excel and the clipboard like this can be relied upon to produce consistant results?

I am working in Asp.net 3.5 using C#. The users excel version may vary but should always be Windows 2000/XP/Vista and IE 6/7.

A: 

try find something helpful in System.Windows.Forms.Clipboard members

abatishchev
A: 

I've seen one solution with pasting csv files that involved getting the data from the clipboard in the IDataObject, and then reading each line of the csv into a data object using a StreamReader with the IDataObject.GetData("csv")

This would probably be a bit more reliable than pasting the csv into a textbox and then cramming it into another data object.

Sorry I have no example code, I'll try and edit this when I get some time and try it out.

Luke
+1  A: 

This looks like a "delimited values" list to me, so basically the same as CSV with TAB as the field delimiter and line break as the row delimiter. You could try it with the CSV Reader library from CodeProject, it should be handle to handle different delimiters, not just comma.

csgero
A: 

From a quick check, it's possible to create an Excel cell value that contains a tab (try ="abc"&CHAR(9)&"def" and copy the cell to an html text area) and one or more quotes (e.g. ="this has""one quote"). I can't see any completely foolproof way to be sure what you're getting.

The problem is that you don't have any real control over the interaction between Excel (as the source) and the text area within the browser (as the paste target). I'm not even sure if the result would necessarily be the same across different browsers!

I wonder if there might be a way to manage this better via some intermediate application that can negotiate a more extensive clipboard exchange with Excel such that an unambiguous format can be delivered to the browser? Not exactly a seamless user experience, I know, but when you get into the technology, you soon discover that inter-application copy-and-paste is really really hard.

Given that ill-formed data should be the exception, I think I'd go ahead and write a parser (you could probably do it with a regular expression or two) for the "normal" case and present the parsed data back to the user for confirmation, with the ability to edit lines that weren't parsed properly. With a little Ajax magic, I'd guess you could deliver something fairly slick.

Mike Woodhouse