Hi!

I have an HTML/PHP5 page with a form. When the form is posted, it creates an XML file with the form input as data.

But every å, ä and ö comes out looking as if I had run utf8_encode() on it. I can't utf8_decode() them, because then the "service" I send the XML files to complains that the input is not UTF-8 (which it should be).

Parser failed. Reason :2: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xE5 0x73 0x61 0x2E

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8" />
        <title>Blabla</title>
    </head>

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<name>åäö</name>

I have tried mb_convert_encoding(), sending the data as CDATA with htmlentities() instead, and setting default_charset = "utf-8" in php.ini, but nothing works the way it is supposed to.

What should I do?

A: 

There was nothing wrong with the DOMDocument; I just forgot to specify UTF-8 in htmlentities() when I output the result.
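
For anyone hitting the same thing, here is a minimal sketch of what the fix might look like. The variable names and the hard-coded sample value are assumptions for illustration, not my actual code; the point is only the third argument to htmlentities().

```php
<?php
// Assumption: the form value arrives as UTF-8 bytes (the page declares charset=utf-8).
$name = "\xC3\xA5\xC3\xA4\xC3\xB6"; // "åäö" encoded as UTF-8

// Build the XML with DOMDocument and declare UTF-8 as the document encoding.
$doc = new DOMDocument('1.0', 'UTF-8');
$element = $doc->createElement('name');
$element->appendChild($doc->createTextNode($name)); // text node escapes &, <, > safely
$doc->appendChild($element);
$xml = $doc->saveXML();

// Escape the XML for display on a web page, naming the charset explicitly
// instead of relying on the default.
$escaped = htmlentities($xml, ENT_QUOTES, 'UTF-8');

// htmlspecialchars() only touches &, <, >, and quotes, so the UTF-8 letters pass through as-is.
$plain = htmlspecialchars($xml, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n", $plain, "\n";
```

The XML string itself ($xml) is what should go to the receiving service; the escaped versions are only for showing the markup in a browser.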

Johan Olsson
If your data is UTF-8, why don’t you use UTF-8 for your output too?
Gumbo
I did, but I missed that the default charset for the htmlentities() function is ISO-8859-1, so when I used htmlentities() on my UTF-8 data it transformed my åäö to Ã¥Ã¤Ã¶. The reason I use htmlentities() on the data is to show it on a web page with the <> visible.
Johan Olsson
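
The effect described in the last comment can be reproduced with a short sketch like the one below. The sample string is an assumption for illustration. Passing 'ISO-8859-1' explicitly mimics the old default; as far as I know, PHP 5.4 and later default to UTF-8, so on newer versions the problem only appears if the charset is set this way.

```php
<?php
$utf8 = "\xC3\xA5\xC3\xA4\xC3\xB6"; // "åäö" as UTF-8: two bytes per letter

// Wrong charset: each of the six bytes is read as a separate ISO-8859-1
// character (Ã, ¥, Ã, ¤, Ã, ¶) and converted to its own entity.
$wrong = htmlentities($utf8, ENT_QUOTES, 'ISO-8859-1');

// Correct charset: each two-byte sequence is read as one letter.
$right = htmlentities($utf8, ENT_QUOTES, 'UTF-8');

echo $wrong, "\n"; // &Atilde;&yen;&Atilde;&curren;&Atilde;&para;
echo $right, "\n"; // &aring;&auml;&ouml;
```

In a browser the first line renders as Ã¥Ã¤Ã¶, which is the same result utf8_encode() would give on data that is already UTF-8.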