I am trying to parse an XML file using the SAX interface of libxml2 in C.

My problem is that whitespace between the end of one tag and the start of the next causes the "characters" callback to be invoked.

For example, this document:
<?xml version="1.0"?>
<doc>
<para>Hello, world!</para>
</doc>

produces these events:

start document
start element: doc
start element: para
characters: Hello, world!
end element: para
characters:  

end element: doc
characters:  

end document  

It would be really nice if this whitespace were not reported as "characters".

Does anybody have any idea why this is happening, or how it can be prevented?

+1  A: 

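This is, of course, happening because whitespace between elements is significant in XML: it is part of the document's character data, and a parser must pass all non-markup characters through to the application. Without a DTD, the parser cannot know whether that whitespace is ignorable. So it's just operating according to the specification.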

See, for instance, this discussion.
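
If changing the input file is not an option, one possible workaround is to filter in the handler itself. The following is only a minimal sketch, not the library's own mechanism: `is_xml_whitespace`, `on_characters` and `struct parse_state` are hypothetical names made up for illustration. The callback is written with `unsigned char` so that it compiles without the libxml2 headers; in libxml2, `xmlChar` is a typedef for `unsigned char`, so the signature should match the `characters` member of `xmlSAXHandler`.

```c
#include <stdio.h>

/* Returns 1 if the chunk holds only XML whitespace (space, tab, CR, LF),
 * which is the set of whitespace characters defined by XML 1.0.
 * An empty chunk also counts as whitespace. */
static int is_xml_whitespace(const unsigned char *ch, int len)
{
    for (int i = 0; i < len; i++) {
        if (ch[i] != ' ' && ch[i] != '\t' && ch[i] != '\r' && ch[i] != '\n')
            return 0;
    }
    return 1;
}

/* Hypothetical user data handed to the parser; here it only counts
 * how many non-blank text chunks were reported. */
struct parse_state {
    int text_events;
};

/* Candidate "characters" callback: silently drops whitespace-only chunks.
 * Note that ch is NOT NUL-terminated, so len must always be honoured. */
static void on_characters(void *ctx, const unsigned char *ch, int len)
{
    struct parse_state *state = ctx;

    if (is_xml_whitespace(ch, len))
        return; /* indentation or a newline between tags: ignore it */

    state->text_events++;
    printf("characters: %.*s\n", len, (const char *)ch);
}
```

To use it, `on_characters` would be assigned to the `characters` field of the handler struct, with a `struct parse_state` passed as the user data. Keep in mind that the parser may deliver a single text node in several chunks, so a whitespace-only chunk could in principle belong to real text; if that matters, buffer the chunks and decide at the end-element event instead.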

unwind
Ahem... got the point. So I need to linearize the XML. Thanks for the help, unwind!
puffadder