    <p> text 1 </p>
    <div> <p> text 2</p> </div>
    <p> Here is a list
            <li> ListItem1 </li>
            <li> ListItem1 </li>
            <li> dl item </li>
            <li> dl item2 </li>
    </p>
    <p> I was here</p>

And I am trying to put it into a nicely formatted XML file. In my XSLT file I recursively check whether all children of a p or div are other p's or div's and, if so, just promote them; otherwise I use them as standalone paragraphs. I extended this idea so that a p or div with a child list shows up properly, without promoting the list's children.

The problem I am having is that the output XML I get is the following:

    <?xml version="1.0" encoding="utf-8"?><html>

    <p> text 1 </p>
     <p> text 2</p> 
     Here is a list
            <li> ListItem1 </li>
            <li> ListItem1 </li>
            <li> dl item </li>
            <li> dl item2 </li>

    <p> I was here</p>


"Here is a list" needs to be in paragraph tags too! I am going crazy trying to solve this ... Any input/links would be greatly appreciated.


You could first check whether a <p> has a closing tag </p>. If it doesn't, take all the text you find until you reach a new tag (a <p>, <div>, <li> or anything like that) and copy it to your XML file, wrapped in a full <p></p> structure that you add yourself.

This is how I would do it; it might not be the best way, but it will work.
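
A rough sketch of that idea in Python, since markup with a dangling <p> is not well-formed XML and an XSLT processor would reject it before any template runs. The function name `close_dangling_paragraphs` and the sample string are made up for illustration, and the regex assumes bare `<p>` tags with no attributes. It would also mangle a properly closed <p> that contains child elements, so it only fits the unclosed case described above.

```python
import re

# "<p>", then text up to the next tag, provided that tag is not "</p>".
# Assumption: paragraphs are written exactly as "<p>" and "</p>".
UNCLOSED_P = re.compile(r"<p>([^<]*)(?=<(?!/p>))")


def close_dangling_paragraphs(markup: str) -> str:
    """Wrap the text of every <p> that runs into another tag in <p>...</p>."""
    def repl(match):
        # Drop the trailing newline/indentation before adding the close tag.
        return "<p>" + match.group(1).rstrip() + "</p>\n"
    return UNCLOSED_P.sub(repl, markup)


# Hypothetical input: the second <p> is never closed.
sample = (
    "<p> text 1 </p>\n"
    "<p> Here is a list\n"
    "    <li> ListItem1 </li>\n"
    "<p> I was here</p>"
)
fixed = close_dangling_paragraphs(sample)
print(fixed)
```

Paragraphs that are already closed are left untouched, because the lookahead refuses to match when the next tag is `</p>`.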

This transformation:

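    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <!-- Identity rule: copy every node and attribute as is -->
     <xsl:template match="node()|@*">
      <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
     </xsl:template>

     <!-- A div or p that contains another div or p: drop the wrapper, keep its content -->
     <xsl:template match=
      "div[descendant::div or descendant::p]
      |
       p[descendant::div or descendant::p]
      ">
      <xsl:apply-templates/>
     </xsl:template>

     <!-- Text directly inside such a wrapper gets its own element, named after the wrapper -->
     <xsl:template match=
      "div[descendant::div or descendant::p]/text()
      |
       p[descendant::div or descendant::p]/text()
      ">
      <xsl:element name="{name(..)}">
       <xsl:copy-of select="."/>
      </xsl:element>
     </xsl:template>
    </xsl:stylesheet>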

when applied on the provided XML document, produces the wanted, correct output:

      <p> text 1 </p>
      <p> text 2</p>
      <p> Here is a list

            <li> ListItem1 </li>
            <li> ListItem1 </li>
            <li> dl item </li>
            <li> dl item2 </li>
      </p>
      <p> I was here</p>
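
To trace what the three templates do, their logic can be mimicked in plain Python with `xml.etree.ElementTree`. This is only an illustrative re-implementation under my own assumptions, not the stylesheet itself: the names `flatten`, `has_block_descendant` and `wrap_text` are invented, the sample input is a trimmed, well-formed guess at the document, and comments or processing instructions are ignored.

```python
import xml.etree.ElementTree as ET

BLOCK_TAGS = {"p", "div"}


def has_block_descendant(elem):
    """Mirror the XPath predicate [descendant::div or descendant::p]."""
    return any(d.tag in BLOCK_TAGS for d in elem.iter() if d is not elem)


def wrap_text(tag, text):
    """Put a text node into a new element, as xsl:element does."""
    wrapper = ET.Element(tag)
    wrapper.text = text
    return wrapper


def flatten(elem):
    """Return the list of elements that replace `elem` in the output."""
    if elem.tag in BLOCK_TAGS and has_block_descendant(elem):
        # Wrapper rule: drop the element itself, promote its children,
        # and wrap any non-blank text of its own in a same-named element.
        result = []
        if elem.text and elem.text.strip():
            result.append(wrap_text(elem.tag, elem.text))
        for child in elem:
            result.extend(flatten(child))
            if child.tail and child.tail.strip():
                result.append(wrap_text(elem.tag, child.tail))
        return result

    # Identity rule: copy the element and recurse into its children.
    # Whitespace-only text is skipped, like xsl:strip-space elements="*".
    copy = ET.Element(elem.tag, elem.attrib)
    if elem.text and elem.text.strip():
        copy.text = elem.text
    for child in elem:
        copy.extend(flatten(child))
        if child.tail and child.tail.strip():
            # Keep mixed-content text in place after the copied child.
            if len(copy):
                copy[-1].tail = (copy[-1].tail or "") + child.tail
            else:
                copy.text = (copy.text or "") + child.tail
    return [copy]


# Hypothetical, well-formed version of the input (root element assumed).
SAMPLE = """<html>
    <p> text 1 </p>
    <div> <p> text 2</p> </div>
    <p> Here is a list
        <li> ListItem1 </li>
        <li> ListItem1 </li>
    </p>
    <p> I was here</p>
</html>"""

root = flatten(ET.fromstring(SAMPLE))[0]
print(ET.tostring(root, encoding="unicode"))
```

The div around "text 2" disappears because it contains a p, while the paragraph with the list is copied whole: it has no div or p descendant, so only the identity rule applies to it and its li children stay where they are.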
