I am trying to create a new XML file from an existing one using XSL. When writing the new file, I want to mask the data appearing in the AccountName element.

This is what my XML looks like:

<?xml version="1.0" encoding="UTF-8"?>
<Sumit>
    <AccountName>Sumit</AccountName>
      <CCT_datasetT id="Table">
       <row>
         <CCTTitle2>Title</CCTTitle2>
       </row>
       </CCT_datasetT>
</Sumit>

Here is my XSL Code:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes" omit-xml-declaration="no" />

  <xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

  <xsl:template match="@*">
    <xsl:attribute namespace="{namespace-uri()}" name="{name()}"/>
  </xsl:template>

<xsl:template match="AccountName">
<AccountName>acc_no</AccountName>
</xsl:template>

</xsl:stylesheet>

When I apply the XSL code to my XML, I get the following output:

<?xml version="1.0" encoding="UTF-16"?>
<Sumit>
<AccountName>acc_no</AccountName>
<CCT_datasetT id="">
<row>
<CCTTitle2>Title</CCTTitle2>
</row>
</CCT_datasetT>
</Sumit>

with the following issues:

1) It creates the output using UTF-16 encoding

2) The output of the second line is:

<CCT_datasetT id="">

The attribute value (Table) is missing.

Can anyone please tell me how to get rid of these two issues? Many thanks.

A: 

Your XML file seems to be wrong. You can only have one root element.

erik
I had just pasted the snippet which had the issue. Anyways, I have updated the XML.
Sumit
A: 

Remove your second template rule. The first template rule (the identity rule) will already copy attributes for you. By including the second one (which has the explicit <xsl:attribute> instruction), you're creating a conflict--an error condition, and the XSLT processor is recovering by picking the one that comes later in your stylesheet. The reason the "id" attribute is empty is that your second rule is creating a new attribute with the same name but with no value. But again, that second rule is unnecessary anyway, so you should just delete it. That will solve the missing attribute value issue.
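
For reference, a minimal sketch of the stylesheet with the second rule removed, keeping only the identity rule and the AccountName override from the question. It is untested against MSXML, and the mask text acc_no is simply the value used in the question:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" omit-xml-declaration="no"/>

  <!-- Identity rule: copies every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for AccountName only: replaces the content with a fixed mask -->
  <xsl:template match="AccountName">
    <AccountName>acc_no</AccountName>
  </xsl:template>

</xsl:stylesheet>
```

The AccountName rule does not conflict with the identity rule, because a pattern that names an element has a higher default priority (0) than node() (-0.5).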

As for the output encoding, it sounds like your XSLT processor is not honoring the <xsl:output> directive you've given it, or it's being invoked in a context (such as a server-side framework?) where the encoding is determined by the framework, rather than the XSLT code. What XSLT processor are you using and how are you invoking it?

UPDATE (re: character encoding):

The save Method (DOMDocument) documentation says this:

Character encoding is based on the encoding attribute in the XML declaration, such as <?xml version="1.0" encoding="windows-1252"?>. When no encoding attribute is specified, the default setting is UTF-8.

I would try using transformNodeToObject() and save() instead of outputting to a string.

I haven't tested this, but you probably want something like this:

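// Create an empty DOM document to receive the transformation result
var result = new ActiveXObject("Microsoft.XMLDOM");

// Transform into the result document instead of into a string
xml.transformNodeToObject(xsl, result);

// Write the result document to disk
result.save("Output.xml");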

UPDATE (re: unwanted whitespace):

If you want to have ultimate control over what whitespace appears in the result, you should not specify indent="yes" on the <xsl:output> element. Try removing that.

Evan Lenz
Thanks for the help on the first point. Removing the template makes it work. For the second, I am using a JavaScript script to apply the XSL code to my XML. I can't post more than 600 characters here, so I am posting the code as a reply.
Sumit
+1  A: 

Try this:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" omit-xml-declaration="no" />

    <xsl:template match="@*|node()">
      <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
  </xsl:template>

    <!-- You don't actually need this template -->
    <!-- but I think this was what you were trying to do -->
    <xsl:template match="@*" priority="2">
      <xsl:attribute namespace="{namespace-uri()}" name="{name()}"><xsl:value-of select="."/></xsl:attribute>
    </xsl:template>

  <xsl:template match="AccountName" priority="2">
  <AccountName>acc_no</AccountName>
  </xsl:template>

</xsl:stylesheet>

As for the UTF issue, you are doing the right thing.

From www.w3.org/TR/xslt: The encoding attribute specifies the preferred encoding to use for outputting the result tree. XSLT processors are required to respect values of UTF-8 and UTF-16.

Mark Worth
Mark, as you and Evan pointed out, I don't need the template. But keeping it, written the way it is in your code, does solve the issue. Many thanks for the reply. :) Any idea why the encoding is getting changed?
Sumit
A: 

@Evan Lenz:

Here is the javascript code:

var oArgs = WScript.Arguments;

if (oArgs.length == 0)
{
   WScript.Echo ("Usage : cscript xslt.js xml xsl");
   WScript.Quit();
}
xmlFile = oArgs(0) + ".xml";
xslFile = oArgs(1) + ".xsl";


var xml = new ActiveXObject("Microsoft.XMLDOM")
xml.async = false
xml.load(xmlFile)

// Load the XSL
var xsl = new ActiveXObject("Microsoft.XMLDOM")
xsl.async = false
xsl.load(xslFile)

// Transform
var msg = xml.transformNode(xsl)



var fso = new ActiveXObject("Scripting.FileSystemObject");



// Open the text file at the specified location with write mode

var txtFile = fso.OpenTextFile("Output.xml", 2, false, 0);

txtFile.Write(msg);
txtFile.close();

It creates the output in a new file "Output.xml", but I don't know why the encoding is getting changed. I am more concerned about it, because of the following reason:

My input XML contains the following code:

<Status></Status>

And in the output it appears as

<Status>
</Status>

A carriage return is introduced for all empty tags. I am not sure whether it has something to do with the encoding. Please suggest.

Many Thanks.

Sumit
See my edited answer. I added some more info.
Evan Lenz
Well, I tried that and it does give me the output in UTF-8, but with two problems. First, it introduces whitespace before each line. Earlier the code was all left-aligned, as it appears in my first output, and I need to keep it that way. Second, it does not solve the empty tag problem: the closing tag still appears on the next line.
Sumit
Just added another note to my answer. Basically: try removing indent="yes".
Evan Lenz
I cannot remove indent="yes" because if I do, the entire output ends up on one single line in the output file (though the XML itself is intact). Also, any idea why a carriage return is introduced when it encounters an empty element? Many thanks for sticking with me.
Sumit
Try: <xsl:strip-space elements="*"/> to get rid of your whitespace.
Mark Worth
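
A sketch of where that declaration would go, assuming the rest of the stylesheet stays as in the question. xsl:strip-space has to be a top-level element, that is, a direct child of xsl:stylesheet. Whether it cures the line break inside empty elements under MSXML is not confirmed anywhere in this thread:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Drop whitespace-only text nodes from every element of the source document -->
  <xsl:strip-space elements="*"/>

  <!-- Identity rule and the AccountName override, unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="AccountName">
    <AccountName>acc_no</AccountName>
  </xsl:template>

</xsl:stylesheet>
```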
If your input document has whitespace, it should be propagated to the output via the identity template. It sounds like the XMLDOM is stripping whitespace-only text nodes by default. Try adding this to your code: xml.preserveWhiteSpace = true; And leave out indent="yes", which is still what I suspect is causing all the undesirable whitespace.
Evan Lenz