I'm getting stray characters in my PDF. I've stripped out \r\n, \r, \n and \t, trimmed everything, decoded HTML entities, and stripped tags. Nothing helps. The data comes from a MySQL database.

Any help would be appreciated.

A: 

Did you try using utf8_decode()? http://php.net/manual/en/function.utf8-decode.php

Itamar Bar-Lev
Tried it, didn't help
gAMBOOKa
A: 

You might be using a font that is not available.

starmageza
The text is displayed like Hello There
gAMBOOKa
A: 

Try something like this to determine its numeric value and replace it:

$str = "Hello \xA0 World"; // \xA0 stands in for the offending character; use the one from your data
echo str_replace(chr(ord("\xA0")), '[removed]', $str);

Output:

Hello [removed] World
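
To find out which bytes are actually in the string before replacing anything, a small scan like the one below might help. It is only a sketch: findOddBytes() is a made-up helper name, and the sample string (with a UTF-8 non-breaking space in it) is an assumption, not the data from the question.

```php
<?php
// Return every byte outside printable ASCII, keyed by its offset in the string.
// findOddBytes() is a hypothetical helper, not part of PHP or FPDF.
function findOddBytes(string $s): array
{
    $found = [];
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $code = ord($s[$i]);
        // 0x20..0x7E is the printable ASCII range; everything else is reported
        if ($code < 0x20 || $code > 0x7E) {
            $found[$i] = sprintf('0x%02X', $code);
        }
    }
    return $found;
}

// Assumed sample: "Hello", space, U+00A0 encoded as UTF-8 (C2 A0), space, "World"
print_r(findOddBytes("Hello \xC2\xA0 World"));
```

Once the byte values are known, they can be passed to str_replace() as shown above.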
Sarfraz
A: 

Have you tried

$string = "testContainingSpecialCharsäöüöüäüß";
$pdf->Cell(0,0,$string);

What characters should have been displayed instead of those things?

dhh
A: 

FPDF doesn't support Unicode characters, so that might be the cause of your problem. There's an extension you could try at http://acko.net/node/56, or alternatively you could switch to another PDF generator library (I recommend TCPDF).

Or, if you want to stick with FPDF, you could try using iconv to convert the text from UTF-8 to a supported character set (e.g. $str = iconv('UTF-8', 'windows-1252', $str);).
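
A minimal sketch of the iconv route, assuming the text really is UTF-8 and that every character exists in Windows-1252. toCp1252() is a hypothetical wrapper name; only iconv() itself is a real PHP function here.

```php
<?php
// Convert UTF-8 text to Windows-1252 before handing it to FPDF.
// toCp1252() is a hypothetical helper, not part of FPDF.
function toCp1252(string $utf8): string
{
    // Plain conversion: iconv() returns false if a character cannot be represented
    $out = iconv('UTF-8', 'windows-1252', $utf8);
    if ($out === false) {
        throw new RuntimeException('Text contains characters outside Windows-1252');
    }
    return $out;
}

$text = "\u{E4}\u{F6}\u{FC}\u{DF}"; // the letters a-umlaut, o-umlaut, u-umlaut, sharp s as UTF-8
echo bin2hex(toCp1252($text)), "\n"; // e4f6fcdf
```

Each two-byte UTF-8 sequence becomes a single Windows-1252 byte, which is what FPDF's built-in fonts expect.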

wimvds
We're actually using an FPDF extension, UFPDF, which supports Unicode.
gAMBOOKa
+1  A: 

Check the string's encoding (with mb_detect_encoding) before adding it to the PDF: is it a Unicode string? The data in the MySQL database can be stored as Unicode while your database connection uses a different encoding.
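
One way to run that check, as a rough sketch. It uses mb_check_encoding rather than mb_detect_encoding, because validating against one known encoding is more predictable than guessing among several. describeEncoding() is a hypothetical helper name, and the mbstring extension is assumed to be loaded.

```php
<?php
// Classify a string as ASCII, valid UTF-8, or something else.
// describeEncoding() is a hypothetical helper; it needs the mbstring extension.
function describeEncoding(string $s): string
{
    // Pure 7-bit text is identical in ASCII, Latin-1 and UTF-8
    if (preg_match('/^[\x00-\x7F]*$/', $s)) {
        return 'ASCII';
    }
    // Validate the byte sequence as UTF-8
    if (mb_check_encoding($s, 'UTF-8')) {
        return 'UTF-8';
    }
    return 'not UTF-8 (possibly ISO-8859-1 or Windows-1252)';
}

echo describeEncoding("\xC3\xA4"), "\n"; // UTF-8
echo describeEncoding("\xE4"), "\n";     // not UTF-8 (possibly ISO-8859-1 or Windows-1252)
```

If rows fetched from the database come back as "not UTF-8" even though the tables are UTF-8, the connection encoding is the likely culprit.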

Zyava
Interesting. The MySQL db is utf-8. How can I change the encoding of my db connection if that is the case?
gAMBOOKa
To set the connection encoding, execute the query mysql_query("SET NAMES 'utf8' COLLATE 'utf8_general_ci'") right after mysql_connect. If your database collation isn't utf8_general_ci, use the collation you need instead. If you have access to my.cnf, you can add the query there: init-connect="SET NAMES 'utf8' COLLATE 'utf8_general_ci'". More info here: http://dev.mysql.com/doc/refman/5.5/en/charset-connection.html
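
The mysql_* functions were removed in PHP 7, so on a current PHP the same idea could be sketched with PDO, where the charset goes into the DSN. The helper names utf8Dsn() and connectUtf8(), and the host, database and credentials in the usage comment, are assumptions for illustration only.

```php
<?php
// Build a PDO MySQL DSN that fixes the connection character set.
// utf8Dsn() and connectUtf8() are hypothetical helper names.
function utf8Dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Open a connection whose encoding is set through the DSN (no SET NAMES query needed).
function connectUtf8(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(utf8Dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, // throw on SQL errors
    ]);
}

// Usage (placeholder credentials, left commented out because it needs a live server):
// $pdo = connectUtf8('localhost', 'crawler', 'user', 'secret');
```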
Zyava
A: 

This looks like what happens when you copy and paste text from Microsoft Word. Does the PDF file contain text from an MS Word document by any chance? That might be your problem. There are some interesting comments on converting and stripping these characters in PHP on the PHP.net website: http://www.php.net/manual/en/function.strtr.php#39383

I am only presuming that the characters in your PDF file come from MS Word.
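
If that turns out to be the case, a strtr() replacement table is one way to deal with it. The sketch below is not the code from the linked comment. It assumes the text is in Windows-1252 and maps a handful of Word's "smart" punctuation bytes to plain ASCII; cleanWordChars() is a hypothetical helper name.

```php
<?php
// Replace common Windows-1252 "smart" punctuation with ASCII equivalents.
// cleanWordChars() is a hypothetical helper; the input is assumed to be Windows-1252.
function cleanWordChars(string $s): string
{
    return strtr($s, [
        "\x91" => "'",   // left single quote
        "\x92" => "'",   // right single quote / apostrophe
        "\x93" => '"',   // left double quote
        "\x94" => '"',   // right double quote
        "\x96" => '-',   // en dash
        "\x97" => '-',   // em dash
        "\x85" => '...', // ellipsis
    ]);
}

echo cleanWordChars("\x93Hello\x94 \x96 it\x92s here\x85"), "\n"; // "Hello" - it's here...
```

For UTF-8 input the same table would need the multi-byte sequences of those characters as keys instead.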

Dwayne
No, it's crawled data from webpages, and the data where the characters show is only filled with tabs and spaces.
gAMBOOKa