views:

671

answers:

4

After having the unpleasant surprise that Comma Seperated Value (CSV) files are not necessarily comma-separated, I'm trying to find out if there is any way to detect what the regional settings list separator value is on the client machine from http request.

Scenario is as follows: A user can download some data in CSV format from web site (RoR, if it matters). That CSV file is generated on the fly, sent to the user, and most of the time double-clicked and opened in MS Excel on Windows machine at the destination. Now, if the user has ',' set as the list separator, the data is properly arranged in columns, but if any other separator (';' is widely used here) is set, it all just gets thrown into a single column. So, is there any way to detect what separator is used on the client machine, and generate the file accordingly?

I have a sinking feeling that it is not, but I'd like to be sure before I pass the 'can't be done, sorry' line to the customer :)

A: 

Could you just have the users with non comma separators set a profile kind of option and then generate CSVs based on user settings with the default being commas?

srboisvert
A: 

Toms, as far as I'm aware there is no way of achieving what you're after. The most you can do is try and detect the user locale and map it against a database of locales/list separators, altering the list separator in the .CSV file as a result.

Barry Gallagher
A: 

I think everyone should use Calc from OpenOffice - it asks when you open a file about encoding, column separators and other. I don't know answer for your question, but maybe you can try to send data in html tables or in xml - excel should read both of them correctly. From my experience it isn't easy to export data to excel. Few weeks ago I have problem with it and after few hours of work I asked a person, who couldn't open my csv file in excel, about version. It was Excel 98...

Take a look on html example and xml.

klew
+2  A: 

Here's a JavaScript solution that I just wrote based on the method shown here:

function getListSeparator() {
    var list = ['a', 'b'], str;
    if (list.toLocaleString) {
        str = list.toLocaleString();
        if (str.indexOf(';') > 0 && str.indexOf(',') == -1) {
            return ';';
        }
    }
    return ',';
}

The key is in the toLocaleString() method that uses the system list separator.

You could use JavaScript to get the list separator and set it in a cookie which you could then detect from your server.

I checked all the Windows Locales, and it seems that the default list separator is virtually always either ',' or ';'. For some locales the drop-down list in the Control Panel offers both options; for others it offers just ','. One locale, Divehi, has a strange character that I've not seen before as the list separator, and, for any locale, it is possible for the user to enter any string they want as the list separator.

Putting random strings as the separator in a CSV file sounds like trouble to me, so my function above will only return either a ';' or a '.', and it will only return a ';' if it can't find a ',' in the Array.toLocaleString string. I'm not entirely sure about whether array.toLocaleString has a format that's guaranteed across browsers, hence the indexOf checks rather than picking out a character at a specific index.

Using Array.toLocaleString to get the list separator works on IE6, IE7, and IE8, but unfortunately it doesn't seem to work on Firefox, Safari, Opera, and Chrome (or at least the versions of those browsers on my computer): they all seem to separate array items with a comma, irrespective of the Windows "list separator" setting.

Also worth noting that by default Excel seems to use the system "decimal separator" when it's parsing numbers out of CSV files. Yuk. So, if you're localizing the list separator you might want to localize the decimal separator too.

MB