Hello.

I'm new to XML. I'm trying to parse an XML file to extract data from it, but I get the error message below when I call doc=minidom.parse('D:\\CONFIGRATION.xml') ...

xml.parsers.expat.ExpatError: not well-formed (invalid token): line 474, column 15

473 <Extras>
474    <extra Type>
475      jpg
476    </extra Type>
477    <extra Type>
478      psd
479    </extra Type>
480 </Extras>

Can anyone please help me? What is a well-formed XML document?

Thanks in advance

A: 

Check to see if your document has any errors on line 474, column 15. There is probably a clue at or near that point.

Also, did you misspell CONFIGURATION? You are missing a 'U'.
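
If it helps to see exactly where the parser gave up, here is a minimal sketch that catches the exception and prints the offending source line. It relies on the `lineno` and `offset` attributes that `ExpatError` carries. `SAMPLE` and `locate_error` are hypothetical names made up for this illustration, and the sample is a shortened stand-in for the real file, so its line number will differ from 474.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical stand-in for the failing file: same construct, far fewer lines.
SAMPLE = """<Extras>
   <extra Type>
     jpg
   </extra Type>
</Extras>"""


def locate_error(xml_text):
    """Return (lineno, offset, source_line) for the first parse error, or None."""
    try:
        minidom.parseString(xml_text)
    except ExpatError as err:
        # lineno is 1-based, so subtract 1 to index into the list of lines.
        lines = xml_text.splitlines()
        return err.lineno, err.offset, lines[err.lineno - 1]
    return None


result = locate_error(SAMPLE)
if result is not None:
    lineno, offset, source_line = result
    print(f"line {lineno}, column {offset}: {source_line!r}")
```

For a file on disk you could read its text first and pass it to the same function, which makes it easy to print the line the error message points at.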

jhs
He probably didn't misspell it (in the sense of not specifying a valid file) since it read the file correctly.
John Feminella
What does "well-formed document" mean anyway? Does it mean that I have unclosed tags? I checked, and it looked fine, I guess.
Moayyad Yaghi
@John, yeah, I just wanted to point out that his config file name is misspelled, in addition to this bug.
jhs
+2  A: 

You ask what "well-formed" means. It means that the XML conforms to the standard. Not being "well-formed" means you've used illegal syntax. In your specific case you have a tag that looks like:

<@extra Type>

You can't have a space in your tag name. You have other problems as well -- you can't start a tag name with @, and your closing tags are also wrong. The slash needs to immediately follow the <.
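
As a quick way to try these rules out, here is a small sketch that checks whether a string parses at all. `is_well_formed` is a hypothetical helper written for this illustration, and `extraType` is just one possible replacement name, not necessarily the one your application expects.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(xml_text):
    """Return True if minidom can parse the text, False on a syntax error."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


# Space inside the tag name, as in the question.
broken = "<Extras><extra Type>jpg</extra Type></Extras>"
# Same data with the space removed from the name (assumed rename).
fixed = "<Extras><extraType>jpg</extraType></Extras>"
# A tag name starting with @ is rejected too.
at_sign = "<@extra>jpg</@extra>"

print(is_well_formed(broken))
print(is_well_formed(fixed))
print(is_well_formed(at_sign))
```

Only the renamed version should parse. The other two fail for the reasons described above.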

The official specification for well-formed XML is on the W3C website. Check your XML against the specification. If you want more detailed information about your document, you can use one of the many XML validation services. Use your favorite search engine to search for "xml validation".

Bryan Oakley
+1, that's exactly what I was looking for.
Moayyad Yaghi
One more thing: the @ was only there to make the tags show up on this page. I didn't put @ in my actual file.
Moayyad Yaghi
+2  A: 

"Well-formed XML" means the document conforms to the W3C standards. The error message means your document does not meet those standards for some reason. For instance, those <EXTRA TYPE> tags are illegal because they contain spaces.
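
One possible repair, sketched below, assumes that "Type" was meant to be an attribute of an `extra` element and not part of the tag name. That is only a guess about the intent of the file. In XML, a space after the element name starts an attribute, and an attribute needs a quoted value. The `XML` string is an invented rewrite of the fragment from the question.

```python
from xml.dom import minidom

# Assumption: "Type" is an attribute of <extra>, so the value moves into it.
XML = """<Extras>
   <extra Type="jpg"/>
   <extra Type="psd"/>
</Extras>"""

doc = minidom.parseString(XML)

# Collect the Type attribute of every <extra> element in document order.
types = [node.getAttribute("Type") for node in doc.getElementsByTagName("extra")]
print(types)
```

If the program that writes the file expects a different layout, renaming the element so it has no space in it works just as well.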

Read an overview like this one at Developer.com.

APC