I want to parse the following type of text.

Example 1

<root>my name is <j> <b> mike</b> </j> </root>

Example 2

<root> my name is   <mytag1 attribute="val" >mike</mytag1> and yours is <mytag2> john</mytag2> </root>

Can I parse it using a DOM parser? I will not have the same format every time. The tags can be nested in different ways, and I don't know the format in advance.

A: 

You can use a DOM parser for the examples you've given - they're valid XML. However, you wouldn't be able to use it for non-XML as per your subject line.
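As a rough illustration of parsing without knowing the structure up front, here is a minimal sketch in Python using the standard library's `xml.dom.minidom`. The question doesn't name a language, so Python is an assumption, and `walk` is a hypothetical helper name. It visits whatever elements and text it finds rather than looking for specific tags.

```python
from xml.dom import minidom


def walk(node, depth=0, out=None):
    """Collect (depth, kind, value, attrs) tuples for elements and non-blank text."""
    if out is None:
        out = []
    if node.nodeType == node.ELEMENT_NODE:
        # Attributes are optional; an element without any yields an empty dict.
        attrs = dict(node.attributes.items())
        out.append((depth, "element", node.tagName, attrs))
        for child in node.childNodes:
            walk(child, depth + 1, out)
    elif node.nodeType == node.TEXT_NODE:
        text = node.data.strip()
        if text:  # skip whitespace-only text between tags
            out.append((depth, "text", text, {}))
    return out


# Example 2 from the question, used verbatim.
xml_text = ('<root> my name is   <mytag1 attribute="val" >mike</mytag1>'
            ' and yours is <mytag2> john</mytag2> </root>')
doc = minidom.parseString(xml_text)
items = walk(doc.documentElement)

for depth, kind, value, attrs in items:
    print("  " * depth + f"{kind}: {value} {attrs if attrs else ''}".rstrip())
```

Because the walk is recursive, the same function handles Example 1's deeper nesting without changes.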

When you say you can have "different formats in which the tags are nested" what exactly do you mean? If it's always simple nesting, e.g.

<root>
  <tag1>
    <tag2>
      <tag3>
        Stuff
      </tag3>
    </tag2>
  </tag1>
</root>

Then that will be fine. However, an XML parser won't like markup where an "outer" tag is closed before an "inner" one:

<root>
  <tag1>
    <tag2>
      Stuff
    </tag1> <!-- Invalid -->
  </tag2>
</root>
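
To check this up front, you can simply try parsing and catch the failure. The sketch below assumes Python's `xml.dom.minidom` (the question doesn't specify a language), and `is_well_formed` is a hypothetical helper name. `minidom.parseString` raises `xml.parsers.expat.ExpatError` when tags overlap like this.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(text):
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


# Properly nested: every inner tag closes before its outer tag.
nested = "<root><tag1><tag2><tag3>Stuff</tag3></tag2></tag1></root>"
# Overlapping: tag1 is closed while tag2 is still open.
overlapping = "<root><tag1><tag2>Stuff</tag1></tag2></root>"

print(is_well_formed(nested))
print(is_well_formed(overlapping))
```
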
Jon Skeet
A: 

Both of these examples are valid XML documents, so there's no reason you can't do this.

If your XML is very simple, especially if it combines text and tags together, you may want to run it through an XSL transformation first, either to get a format that is easier to parse or to convert it to another format, such as HTML.
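
Python's standard library has no XSLT processor, so the sketch below does not perform an XSL transformation. It swaps in a plain `xml.etree.ElementTree` walk to show the same idea of reducing mixed text and tags to a flatter shape first. The `flatten` helper and the (tag, text) output format are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def flatten(elem):
    """Return (tag, text) pairs in document order for all non-blank text.

    Text is attributed to the element that directly contains it.
    """
    pieces = []
    if elem.text and elem.text.strip():
        pieces.append((elem.tag, elem.text.strip()))
    for child in elem:
        pieces.extend(flatten(child))
        # `tail` is the text after the child's closing tag, which still
        # belongs to the parent element.
        if child.tail and child.tail.strip():
            pieces.append((elem.tag, child.tail.strip()))
    return pieces


# Example 1 from the question, used verbatim.
root = ET.fromstring("<root>my name is <j> <b> mike</b> </j> </root>")
flat = flatten(root)
print(flat)
```

The flat list is easier to post-process than the nested tree, which is the same benefit a real XSLT pre-pass would give.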

David Rabinowitz