views:

1167

answers:

5

I am "exporting" data from a page to Excel using an HTML table. The table looks fine in the browser but in Excel special characters such as apostrophes, trademark symbols, etc. are improperly encoded. If I open the exported file in notepad and save it as ANSII encoded then open it in Excel everything looks fine again.

I tried setting the page encoding in ColdFusion using cfcontent, setencoding, and cfpagedirective with no luck so far. Any suggestions?

EDIT: As a temporary work around, the erroneous characters can be removed by saving the exported file to disk, opening it in Notepad, and then saving it again. Not exactly the best solution for the problem.

As for the strange characters:

® becomes ®

™ becomes â„¢

’ becomes ’

A: 

Rather than export using an HTML table, could you not export to a CSV and have "'s around each item so:

"value1","value2"

You can use the Java stringbuffer and create a queryToCSV function like this:

<cffunction name="queryToCSV" returntype="string" access="public" output="false">
 <cfargument name="query" type="query" required="true">

 <cfscript>
  var csv = createobject( 'java', 'java.lang.StringBuffer');
  var i = 1;
  var j = 1;
  var cols = "";
  var headers = "";
  var endOfLine = chr(13) & chr(10);
  if (arraylen(arguments) gte 2) headers = arguments[2];
  if (arraylen(arguments) gte 3) cols = arguments[3];
  if (not len( trim( cols ) ) ) cols = query.columnlist;
  if (not len( trim( headers ) ) ) headers = cols;
  headers = listtoarray( headers );
  cols = listtoarray( cols );

  for (i = 1; i lte arraylen( headers ); i = i + 1)
   csv.append( '"' & headers[i] & '",' );
  csv.append( endOfLine );

  for (i = 1; i lte query.recordcount; i= i + 1){
   for (j = 1; j lte arraylen( cols ); j=j + 1){
    if (isNumeric( query[cols[j]][i] ) )
     csv.append( query[cols[j]][i] & ',' );
    else
     csv.append( '"' & query[cols[j]][i] & '",' );

   }
   csv.append( endOfLine );
  }
  return csv.toString();
 </cfscript>
</cffunction>
Ian
While I certainly could, that doesn't seem to do anything for the end result. That is, the special characters still don't show up properly.
illvm
Not sure if it'd work, but, could using charsetEncode() help?
Ian
A: 

Have you tried charset="utf-8" in CFFile? If it still doesn't work, try charset="windows-1252" ?

FYI: CF9 can export to .xle (not just csv, real excel file!) with CFSpreadsheet

Henry
I'm not really writing it to a file before I send it, but I'll give it a whirl and see what comes up.
illvm
A: 

Is this a programming-related question?

At the very least, you could make like a programmer and supply some debug information. Forget apostrophe, there are a few different characters you could mean by that; what does a what should be a trademark symbol look like in Excel: do you get 0, 1, or 2 characters displayed? If non-zero, what character(s) is/are displayed? Is it possible for you to use the CODE() function in a formula to tell us the internal code for the character(s) that you see?

Update Thanks for the info about what funny characters actually appear on the screen. Based on that, and on how you managed to workaround it with Notepad, it appears that the saved page has been encoded in UTF-8, but Excel is displaying it as though it were encoded in the "ANSI" code page, e.g. cp1252 in places using a Western European language.

Next questions: how are you creating the HTML table? Up near the start of the HTML there should be something like charset=UTF-8 -- what do you see? Has the HTML file been saved with an extension of .htm, .html, or something else? How are you importing it into Excel? Which Excel: 2007, 2003, other?

John Machin
I don't think there is a meta tag for the charset. I'll give that a try and see if that works. Right now the HTML table is just being generated as plain HTML in ColdFusion then sent out with:<cfheader name="Content-Disposition" value="attachment; filename=MTR_#fromdate#_to_#todate#.xls"> <cfcontent type="application/vnd.ms-excel">.The office is using Excel 2007 for viewing it.
illvm
Added the meta tag and everything seems to be displaying correctly now. Thanks for the help.
illvm
added which meta tag?
Henry
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
illvm
+1  A: 

Have you looked into creating the speadsheet using XML? It makes the file a little more bloated and seems to have some problems when you have 10,000 + records. However, for small to medium sized datasets it works pretty well and provides the formatting functionality I beleive you are looking for.

http://www.bennadel.com/blog/917-Creating-Excel-Files-With-ColdFusion-XML-And-POI.htm

Jason
A: 

Firstly, what version of Excel are you trying to export to? Anything below 2007 (or possibly 2003) has notoriously bad handling of anything but the default character set (cp1252 I believe but I'm not 100% sure on this).

As others suggest, you're probably using the wrong character set in cfcontent - try something like:

<cfcontent type="application/vnd.ms-excel; charset=windows-1252">

If that's not the right charset you can try some others e.g. iso-8859-1 or us-ascii. See http://livedocs.adobe.com/coldfusion/7/htmldocs/wwhelp/wwhimpl/common/html/wwhelp.htm?context=ColdFusion_Documentation&amp;file=00000232.htm for the full list.

Also you might want to checkout cfreport when can generate excel files as well.

Loftx
Changing the charset in the HTTP headers didn't work. Adding the meta tag, however, did.
illvm