views:

5170

answers:

4

Hi, I am writing this because I have really hit the wall and cannot go ahead. In my database I have escaped HTML like this: "<p>My name is Freddy and I was".

I want to show it as HTML OR strip the HTML tags in my XSL template. Both solutions will work for me and I will choose the quicker solution.

I have read several posts online but cannot find a solution. I have also tried disable-output-escape with no success. Basically it seems the problem is that somewhere in the XSL execution the engine is changing this <p> into this: <p>.

It is converting the & into &. If it helps, here is my XSL code. I have tried several combinations with and without the output tag on the top.

Any help will be appreciated. Thanks in advance.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  <xsl:output method="html" omit-xml-declaration="yes"/>

  <xsl:template match="DocumentElement">
    <div>
      <xsl:attribute name="id">mySlides</xsl:attribute>
      <xsl:apply-templates>
        <xsl:with-param name="templatenumber" select="0"/>
      </xsl:apply-templates>
    </div>

    <div>
      <xsl:attribute name="id">myController</xsl:attribute>
      <xsl:apply-templates>
        <xsl:with-param name="templatenumber" select="1"/>
      </xsl:apply-templates>
    </div>
  </xsl:template>

  <xsl:template match="DocumentElement/QueryResults">
    <xsl:param name="templatenumber">tobereplace</xsl:param>

    <xsl:if test="$templatenumber=0">
      <div>
        <xsl:attribute name="id">myController</xsl:attribute>
        <div>
          <xsl:attribute name="class">article</xsl:attribute>
          <h2>
            <a>
              <xsl:attribute name="class">title</xsl:attribute>
              <xsl:attribute name="title"><xsl:value-of select="Title"/></xsl:attribute>
              <xsl:attribute name="href">/stories/stories-details/articletype/articleview/articleid/<xsl:value-of select="ArticleId"/>/<xsl:value-of select="SEOTitle"/>.aspx</xsl:attribute>
              <xsl:value-of select="Title"/>
            </a>
          </h2>
          <div>
            <xsl:attribute name="style">text-indent: 25px;</xsl:attribute>
            <xsl:attribute name="class">articlesummary</xsl:attribute>
            <xsl:call-template name="removeHtmlTags">
              <xsl:with-param name="html" select="Summary" />
            </xsl:call-template>
          </div>
        </div>
      </div>
    </xsl:if>
    <xsl:if test="$templatenumber=1">
      <div>
        <xsl:attribute name="id">myController</xsl:attribute>
        <span>
          <xsl:attribute name="class">jFlowControl</xsl:attribute>
          aa
        </span>
      </div>
    </xsl:if>
  </xsl:template>

  <xsl:template name="removeHtmlTags">
    <xsl:param name="html"/>
    <xsl:choose>
      <xsl:when test="contains($html, '&lt;')">
        <xsl:value-of select="substring-before($html, '&lt;')"/>
        <!-- Recurse through HTML -->
        <xsl:call-template name="removeHtmlTags">
          <xsl:with-param name="html" select="substring-after($html, '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$html"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
+6  A: 

You shouldn't store escaped HTML in your database. If your database contained the actual "<" character, then the "disable-output-escaping" command would do what you wanted.

If you can't change the data then you'll have to unescape the data before your perform the transform.

David
Thks! I had this insight right now: Instead of trying to unescape < with the HTML STRIP template, I have actually done the STRIP with this: < and it worked. Unfortunately, I am pulling information from another application's table and I can't change the way it works. I am using DotNetNuke.
Marcos Buarque
+1. I hate when people store structures like this in databases. Its a management nightmare.
Kent Fredric
+6  A: 

Based in the assumption that you have this HTML string,

<p>My name is Freddy &amp; I was

then if you escape it and store it in a database it would become this:

&lt;p&gt;My name is Freddy &amp;amp; I was

Consequently, if you retrieve it as XML (without unescaping it beforehand), the result would be this:

&amp;lt;p&amp;gt;My name is Freddy &amp;amp;amp; I was

and <xsl:value-of select="." disable-output-escaping="yes" /> would produce:

&lt;p&gt;My name is Freddy &amp;amp; I was

You are getting exactly the same thing you have in your database, but of course you see the HTML tags in the output. So what you need is a mechanism that does the following string replacements:

  • "&amp;lt;" with "&lt;" (effectively changing &lt; to < in unescaped ouput)
  • "&amp;gt;" with "&gt;" (effectively changing &gt; to > in unescaped ouput)
  • "&amp;quot;" with "&quot;" (effectively changing &quot; to " in unescaped ouput)
  • "&amp;amp;" with "&amp;" (effectively changing &amp; to & in unescaped ouput)

From your XSL I have inferred the following test input XML:

<DocumentElement>
  <QueryResults>
    <Title>Article 1</Title>
    <ArticleId>1</ArticleId>
    <SEOTitle>Article_1</SEOTitle>
    <Summary>&amp;lt;p&amp;gt;Article 1 summary &amp;amp;amp; description.&amp;lt;/p&amp;gt;</Summary>
  </QueryResults>
  <QueryResults>
    <Title>Article 2</Title>
    <ArticleId>2</ArticleId>
    <SEOTitle>Article_2</SEOTitle>
    <Summary>&amp;lt;p&amp;gt;Article 2 summary &amp;amp;amp; description.&amp;lt;/p&amp;gt;</Summary>
  </QueryResults>
</DocumentElement>

I have changed the stylesheet you supplied and implemented such a replacement mechanism. If you apply the following XSLT 1.0 template to it:

<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="my:namespace"
  exclude-result-prefixes="my"
>

  <xsl:output method="html" omit-xml-declaration="yes"/>

  <my:unescape>
    <my:char literal="&lt;" escaped="&amp;lt;" />
    <my:char literal="&gt;" escaped="&amp;gt;" />
    <my:char literal="&quot;" escaped="&amp;quot;" />
    <my:char literal="&amp;" escaped="&amp;amp;" />
  </my:unescape>

  <xsl:template match="DocumentElement">
    <div id="mySlides">
      <xsl:apply-templates mode="slides" />
    </div>
    <div id="myController">
      <xsl:apply-templates mode="controller" />
    </div>
  </xsl:template>

  <xsl:template match="DocumentElement/QueryResults" mode="slides">
    <div class="article">
      <h2>
        <a class="title" title="{Title}" href="{concat('/stories/stories-details/articletype/articleview/articleid/', ArticleId, '/', SEOTitle, '.aspx')}">
          <xsl:value-of select="Title"/>
        </a>
      </h2>
      <div class="articlesummary" style="text-indent: 25px;">
        <xsl:apply-templates select="document('')/*/my:unescape/my:char[1]">
          <xsl:with-param name="html" select="Summary" />
        </xsl:apply-templates>
      </div>
    </div>
  </xsl:template>

  <xsl:template match="DocumentElement/QueryResults" mode="controller">
    <span class="jFlowControl">
      <xsl:text>aa </xsl:text>
      <xsl:value-of select="Title" />
    </span>
  </xsl:template>

  <xsl:template match="my:char">
    <xsl:param name="html" />
    <xsl:variable name="intermediate">
      <xsl:choose>
        <xsl:when test="following-sibling::my:char">
          <xsl:apply-templates select="following-sibling::my:char[1]">
            <xsl:with-param name="html" select="$html" />
          </xsl:apply-templates>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$html" disable-output-escaping="yes" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <xsl:call-template name="unescape">
      <xsl:with-param name="html" select="$intermediate" />
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="unescape">
    <xsl:param name="html" />
    <xsl:choose>
      <xsl:when test="contains($html, @escaped)">
        <xsl:value-of select="substring-before($html, @escaped)" disable-output-escaping="yes"/>
        <xsl:value-of select="@literal" disable-output-escaping="yes" />
        <xsl:call-template name="unescape">
          <xsl:with-param name="html" select="substring-after($html, @escaped)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$html" disable-output-escaping="yes"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

Then this output HTML is produced:

<div id="mySlides">
  <div class="article">
    <h2>
      <a class="title" title="Article 1" href="/stories/stories-details/articletype/articleview/articleid/1/Article_1.aspx">Article 1</a>
    </h2>
    <div class="articlesummary" style="text-indent: 25px;">
      <p>Article 1 summary &amp; description.</p>
    </div>
  </div>
  <div class="article">
    <h2>
      <a class="title" title="Article 2" href="/stories/stories-details/articletype/articleview/articleid/2/Article_2.aspx">Article 2</a>
    </h2>
    <div class="articlesummary" style="text-indent: 25px;">
      <p>Article 2 summary &amp; description.</p>
    </div>
  </div>
</div>
<div id="myController">
  <span class="jFlowControl">aa Article 1</span>
  <span class="jFlowControl">aa Article 2</span>
</div>

Note

  • the use of a temporary namespace and embedded elements (<my:unescape>) to create a list of characters to replace
  • the use of recursion to emulate an iterative replacement of all affected characters in the input
  • the use of the implicit context within the unescape template to transport the information which character is to be replaced at the moment

Furthermore note:

  • the use of template modes to get different output for the same input (this replaces your templatenumber parameter)
  • most of the time there is no need for <xsl:attribute> elements. They can safely be replaced by inline notation (attributename="{attributevalue}")
  • the use of the concat() function to create the URL

Generally speaking, it is a bad idea to store escaped HTML in a database (more generally speaking: It is a bad idea to store HTML in a database.). You set yourself up to get all kinds of problems, this being one of them. If you can't change this setup, I hope that the solution helps you.

I cannot guarantee that it does the right thing in all situations, and it may open up security holes (think XSS), but dealing with this was not part of the question. In any case, consider yourself warned.

I need a break now. ;-)

Tomalak
A: 

thanks a lot.. harika olmuş teşekkürler.

A: 

"It is a bad idea to store HTML in a database"

What? How are you supposed to store it then? In an XML doc so you have to use XSLT anyway? As a web developer, we've always used SQL databases to store user-defined html data. There's nothing wrong with that method as long as it is sanitized properly for your purposes.

Shiggity