XML: what processing rules apply for values intertwined with tags? | ansaurus


Q:

XML: what processing rules apply for values intertwined with tags?

I've started working on a simple XML pull parser. I've just cleared up for myself what counts as correct XML syntax with regard to certain characters and sequences, ignorable whitespace and such (thank you, http://www.w3schools.com/xml/xml_elements.asp), but I realized I still don't know how a case like the one sketched below should be handled. Validome reports it as well-formed. Note that I only want to use XML files for data storage; no entities, DTDs or schemas are needed:

<bookstore>
   <book id="1">
      <author>Kurt Vonnegut Jr.</author>
      <title>Slapstick</title>
   </book>
We drop a pie here.
   <book id="2">Who cares anyway?
      <author>Stephen King</author>
      <title>The Green Mile</title>
   </book>
And another one here.
   <book id="3">
      <author>Next one</author>
      <title>This time with its own title</title>
   </book>
</bookstore>

"We drop a pie here." and "And another one here." are values of the 'bookstore' element. "Who cares anyway?" is a value related to the second 'book' element.

How are these processed, if at all? Will "We drop a pie here." and "And another one here." be concatenated to form one value for the 'bookstore' element, or are they treated separately and stored somewhere, affecting the outcome of parsing the element they belong to, or something else?

A:

The easiest way to find out is to parse it with a few standards-compliant parsers and dump the output.
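As a rough sketch of that approach, here is one way to dump what a single parser reports for the sample from the question. It uses Python's standard-library `xml.dom.minidom`, which is my choice for illustration and not something the question mentions; the helper name `dump_children` is made up for this example. Other parsers and APIs (SAX, StAX, pull parsers) expose the same text differently, so treat this as one data point.

```python
import xml.dom.minidom

# The sample document from the question, unchanged.
SAMPLE = """<bookstore>
   <book id="1">
      <author>Kurt Vonnegut Jr.</author>
      <title>Slapstick</title>
   </book>
We drop a pie here.
   <book id="2">Who cares anyway?
      <author>Stephen King</author>
      <title>The Green Mile</title>
   </book>
And another one here.
   <book id="3">
      <author>Next one</author>
      <title>This time with its own title</title>
   </book>
</bookstore>"""


def dump_children(element):
    """Return (kind, value) pairs for the direct children of element,
    in document order. Hypothetical helper, for illustration only."""
    out = []
    for node in element.childNodes:
        if node.nodeType == node.TEXT_NODE:
            out.append(("text", node.data))
        elif node.nodeType == node.ELEMENT_NODE:
            out.append(("element", node.tagName))
    return out


doc = xml.dom.minidom.parseString(SAMPLE)
children = dump_children(doc.documentElement)

# repr() makes the surrounding whitespace and newlines visible.
for kind, value in children:
    print(kind, repr(value))
```

With this parser, the text between the 'book' elements shows up as separate text nodes interleaved with the element nodes in document order, surrounding whitespace included, and "Who cares anyway?" appears as the first child of the second 'book'. Nothing is concatenated automatically; whether to join, trim, or ignore those pieces is left to the application.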

Tahir Akhtar 2009-05-19 17:20:13
