The following code gets a 'report line' as an array and uses fputcsv to transform it into CSV. Everything works except that, regardless of the charset I specify, a UTF-8 BOM ends up at the beginning of the file. This is exceptionally annoying because A) I am specifying ISO-8859-1 and B) we have lots of users whose tools display the UTF-8 BOM as garbage characters.

I have even tried writing the results to a string, stripping the UTF-8 BOM, and then echoing it, and I still get the BOM. Is it possible that the issue resides with Apache? If I change the fopen target to a local file, it is written just fine without the UTF-8 BOM.

header("Content-type: text/csv; charset=iso-8859-1");
header("Cache-Control: no-store, no-cache");
header("Content-Disposition: attachment; filename=\"report.csv\"");

$outstream = fopen("php://output",'w');

for($i = 0; $i < $report->rowCount; $i++) {
    fputcsv($outstream, $report->getTaxMatrixLineValues($i), ',', '"');
}
fclose($outstream);

exit;
+3  A: 

My guess would be that one of your PHP source files has a BOM, and that you have PHP's output buffering enabled.

chris
Output buffering IS enabled, but the PHP source files are in ISO-8859-1. When I disable output buffering, I get errors saying the headers were already sent by the time I reach my first header(..); line, but the only things ahead of it are require_once and session_start.
manyxcxi
I disabled output buffering, changed some code around, so I wouldn't get headers already sent issues and I'm still getting UTF-8 BOM data on streamed output.
manyxcxi
Chris, thank you for your help! I went through and made sure every file in my project was saved as ASCII. The issue seemed to be that even included files that had UTF-8 BOM data infected the output. With output buffering still enabled, and all the source code files changed to ASCII everything seems to be coming through fine!
manyxcxi
Yup. Even though your text editor may not visually render the BOM, it's there, and PHP sure sees it. To PHP, a file that starts with a BOM looks like _[BOM]<?php_, which is no different from _hello<?php_: everything before the opening tag is sent as output.
chris
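To track down which included files carry a BOM, something like the following could be used. It is only a rough sketch: the function names hasUtf8Bom and findBomFiles are made up for illustration, and it assumes the project's source files all end in .php.

```php
<?php
// Hypothetical helper: true if the file starts with the UTF-8 BOM bytes EF BB BF.
function hasUtf8Bom(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable files are simply skipped in this sketch
    }
    $prefix = fread($handle, 3);
    fclose($handle);
    return $prefix === "\xEF\xBB\xBF";
}

// Hypothetical helper: recursively lists the .php files under $dir that start with a BOM.
function findBomFiles(string $dir): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile()
            && strtolower($file->getExtension()) === 'php'
            && hasUtf8Bom($file->getPathname())
        ) {
            $found[] = $file->getPathname();
        }
    }
    sort($found);
    return $found;
}
```

Calling print_r(findBomFiles(__DIR__)); from the project root would then list the offending files, which can be re-saved without a BOM.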
A: 

I don't know if this solves your problem but have you tried using the print and implode functions to do the same thing?

header("Content-type: text/csv; charset=iso-8859-1");
header("Cache-Control: no-store, no-cache");
header("Content-Disposition: attachment; filename=\"report.csv\"");

for($i = 0; $i < $report->rowCount; $i++) {
    // Terminate each record with a newline so the rows do not run together.
    print(implode(',', $report->getTaxMatrixLineValues($i)) . "\n");
}

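This is untested, but it should be straightforward. Note that, unlike fputcsv, implode does not quote fields that contain commas or double quotes.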

zaf
A: 

Have you tried converting data before output?

$line = iconv("UTF-8","ISO-8859-1//TRANSLIT",$line);
dev-null-dweller
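Applied to the code in the question, the conversion might be done per field before each row is written. This is a sketch only: toLatin1Row is a made-up name, the input is assumed to be valid UTF-8, and what //TRANSLIT does with characters that have no ISO-8859-1 equivalent depends on the iconv implementation. It also will not remove a BOM that comes from a source file.

```php
<?php
// Hypothetical helper: converts every field of one report row from UTF-8 to ISO-8859-1.
function toLatin1Row(array $row): array
{
    return array_map(
        static function ($field) {
            $converted = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', (string) $field);
            // iconv returns false on failure; this sketch falls back to the original value.
            return $converted === false ? (string) $field : $converted;
        },
        $row
    );
}

// Example row (made-up data); "caf\xC3\xA9" is "café" encoded as UTF-8.
$out = fopen('php://output', 'w');
fputcsv($out, toLatin1Row(["caf\xC3\xA9", 42]), ',', '"', '\\');
fclose($out);
```

In the loop from the question, the second argument to fputcsv would become toLatin1Row($report->getTaxMatrixLineValues($i)).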