Hi peeps,

I'm fairly new to PHP, and I'm trying to load an XML source from a remote location, so I have no control over the formatting. Unfortunately, the XML file I'm trying to load does not declare an encoding:

<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"> <NODE> </NODE> </ROOT>

When trying something like:

$doc = new DOMDocument( );
$doc->load(URI);

I get:

Input is not proper UTF-8, indicate encoding ! Bytes: 0xA3 0x38 0x2C 0x38

I've looked at ways to suppress this, but no luck. How should I load this so that I can use it with DOMDocument?

Thanks!

A: 

You can try using the XMLReader class instead. XMLReader is designed specifically for XML and lets you say which encoding to use when you open a document (the encoding argument defaults to null, meaning none is forced).
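A minimal sketch of that idea, assuming the feed really is ISO-8859-1 (an assumption, not something the source confirms). The inline `$xml` string is a hypothetical stand-in for the remote document; for a real URI, `$reader->open($uri, 'ISO-8859-1')` takes the same encoding argument.

```php
<?php
// Hypothetical sample standing in for the remote feed: no XML declaration,
// and the pound sign is the single byte 0xA3 from the error message.
$xml = "<ROOT xmlns:sql=\"urn:schemas-microsoft-com:xml-sql\"> <NODE>\xA38,8</NODE> </ROOT>";

$reader = new XMLReader();
// Second argument forces the input encoding; ISO-8859-1 is an assumption.
$reader->XML($xml, 'ISO-8859-1');

$values = [];
while ($reader->read()) {
    // Collect the text content of every NODE element.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'NODE') {
        $values[] = $reader->readString(); // returned as UTF-8
    }
}
$reader->close();
```

If the guessed encoding is wrong, the parse may still succeed but yield the wrong characters, so it is worth spot-checking a few values.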

Steven Surowiec
+1  A: 

You could pre-process the document to specify the encoding it is being delivered in by adding an XML declaration. Which encoding that is, you'll have to ascertain yourself, of course. DOMDocument should then parse it.

Example XML declaration:

<?xml version="1.0" encoding="UTF-8" ?>
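One way the pre-processing step might look, as a sketch only. `withXmlDeclaration` is a hypothetical helper, `$raw` stands in for whatever `file_get_contents($uri)` would return, and ISO-8859-1 is an assumed encoding that you would need to confirm for the real feed.

```php
<?php
// Hypothetical helper: prepend an XML declaration when the payload has none.
function withXmlDeclaration(string $xml, string $encoding): string
{
    $body = ltrim($xml); // the declaration must be the very first thing
    if (strncmp($body, '<?xml', 5) === 0) {
        return $body; // already declared, leave it alone
    }
    return '<?xml version="1.0" encoding="' . $encoding . '"?>' . "\n" . $body;
}

// Stand-in for the fetched document; 0xA3 is the byte from the error message.
$raw = "<ROOT xmlns:sql=\"urn:schemas-microsoft-com:xml-sql\"> <NODE>\xA38,8</NODE> </ROOT>";

$doc = new DOMDocument();
$loaded = $doc->loadXML(withXmlDeclaration($raw, 'ISO-8859-1'));

// DOM strings come back as UTF-8 regardless of the source encoding.
$text = $doc->getElementsByTagName('NODE')->item(0)->textContent;
echo $text, "\n"; // prints £8,8
```

Fetching the document as a string first and using loadXML() instead of load() is what makes room for the edit before parsing.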
Rushyo
The bytes being complained about indicate that it's ISO-8859-1, not UTF-8. In particular, 0xA3 is the GBP currency symbol (pound sign) in that encoding.
Dominic Mitchell
Obviously it wasn't UTF-8, or this wouldn't have been a problem. I refer to the crucial word 'example'. FYI, those bytes do not automatically imply ISO-8859-1 either.
Rushyo
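To illustrate the disagreement in the comments, here is a small sketch that decodes the four bytes from the error message under two single-byte encodings. It assumes the mbstring extension is available. ISO-8859-2 is picked only as a contrasting example, not because the feed is suspected to use it.

```php
<?php
// The four bytes quoted in the libxml error: 0xA3 0x38 0x2C 0x38.
$bytes = "\xA3\x38\x2C\x38";

// A lone 0xA3 is not a valid UTF-8 sequence, which is why the parser complains.
$isUtf8 = mb_check_encoding($bytes, 'UTF-8');

// Under ISO-8859-1 the first byte is the pound sign...
$asLatin1 = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');

// ...while under ISO-8859-2 the same byte is L with stroke (U+0141).
$asLatin2 = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-2');

var_dump($isUtf8);
echo $asLatin1, "\n", $asLatin2, "\n";
```

Both decodings are well formed, so the bytes alone cannot settle which encoding was intended. A price like "£8,8" is simply the more plausible reading in context.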