Hello,

I am trying to canonicalize an XML node using the System.Security.Cryptography.Xml.XmlDsigC14NTransform class in C# on .NET Framework 2.0.

The transform accepts three different input types: XmlNodeList, Stream and XmlDocument. I tried the transform with all of these input types, but I get different results. What I really want to do is canonicalize a single node, but as you can see in the output file, the output does not contain any of the inner XML.

Any suggestions about the proper way to canonicalize an XML node would be much appreciated. Best,

string path = @"D:\Test\xml imza\sign.xml";
XmlDocument xDoc = new XmlDocument();
xDoc.PreserveWhitespace = true;
using (FileStream fs = new FileStream(path, FileMode.Open))
{
    xDoc.Load(fs);
}

// canon node list
XmlNodeList nodeList = xDoc.SelectNodes("//Child1");

XmlDsigC14NTransform transform = new XmlDsigC14NTransform();
transform.LoadInput(nodeList);
MemoryStream ms = (MemoryStream)transform.GetOutput(typeof(Stream));

File.WriteAllBytes(@"D:\Test\xml imza\child1.xml", ms.ToArray());

// canon XMLDocument
transform = new XmlDsigC14NTransform();
transform.LoadInput(xDoc);
ms = (MemoryStream)transform.GetOutput(typeof(Stream));

File.WriteAllBytes(@"D:\Test\xml imza\doc.xml", ms.ToArray());

// Document to Stream
ms = new MemoryStream();
XmlWriter xw = XmlWriter.Create(ms);
xDoc.WriteTo(xw);
xw.Flush();
ms.Position = 0;

transform = new XmlDsigC14NTransform();
transform.LoadInput(ms);
ms = (MemoryStream)transform.GetOutput(typeof(Stream));

File.WriteAllBytes(@"D:\Test\xml imza\ms.xml", ms.ToArray());

// node to stream
ms = new MemoryStream();
xw = XmlWriter.Create(ms);
nodeList[0].WriteTo(xw);
xw.Flush();
ms.Position = 0;

transform = new XmlDsigC14NTransform();
transform.LoadInput(ms);
ms = (MemoryStream)transform.GetOutput(typeof(Stream));

File.WriteAllBytes(@"D:\Test\xml imza\ms2.xml", ms.ToArray());

sign.xml

<?xml version="1.0" encoding="utf-8" ?>
<Root Attr="root" xmlns:test="http://www.test.com/xades#">
  <Child1 Cttribute="c3" Attribute1="c1" Bttribute="c2">
    <child11 Attribute11="c11">Element11</child11>
  </Child1>
  <Child2 Attribute2="c2">
    <child21 Attribute21="c21">Element21</child21>
    <child22 Attribute22="c22">Element22</child22>
  </Child2>
  <Child3 Attribute3="c3">
    <child31 Attribute32="c31">
      <child311 Attribute311="c311">Element311</child311>
    </child31>
  </Child3>  
</Root>

child1.xml

<Child1 xmlns:test="http://www.test.com/xades#"></Child1>

doc.xml

<Root xmlns:test="http://www.test.com/xades#" Attr="root">&#xD;
  <Child1 Attribute1="c1" Bttribute="c2" Cttribute="c3">&#xD;
    <child11 Attribute11="c11">Element11</child11>&#xD;
  </Child1>&#xD;
  <Child2 Attribute2="c2">&#xD;
    <child21 Attribute21="c21">Element21</child21>&#xD;
    <child22 Attribute22="c22">Element22</child22>&#xD;
  </Child2>&#xD;
  <Child3 Attribute3="c3">&#xD;
    <child31 Attribute32="c31">&#xD;
      <child311 Attribute311="c311">Element311</child311>&#xD;
    </child31>&#xD;
  </Child3>  &#xD;
</Root>

ms.xml

<Root xmlns:test="http://www.test.com/xades#" Attr="root">
  <Child1 Attribute1="c1" Bttribute="c2" Cttribute="c3">
    <child11 Attribute11="c11">Element11</child11>
  </Child1>
  <Child2 Attribute2="c2">
    <child21 Attribute21="c21">Element21</child21>
    <child22 Attribute22="c22">Element22</child22>
  </Child2>
  <Child3 Attribute3="c3">
    <child31 Attribute32="c31">
      <child311 Attribute311="c311">Element311</child311>
    </child31>
  </Child3>  
</Root>

ms2.xml

<Child1 Attribute1="c1" Bttribute="c2" Cttribute="c3">
    <child11 Attribute11="c11">Element11</child11>
  </Child1>
A: 

I think I found the solution on MSDN, if I understood the problem correctly.

Does this solve the problem?:

string path = @"sign.xml";
var xDoc = new XmlDocument();
xDoc.PreserveWhitespace = true;
using (var fs = new FileStream(path, FileMode.Open))
{
    xDoc.Load(fs);
}

// canon node list
XmlNodeList nodeList = xDoc.SelectNodes("//Child1");

var transform = new XmlDsigC14NTransform(true)
                    {
                        Algorithm = SignedXml.XmlDsigExcC14NTransformUrl
                    };

var validInTypes = transform.InputTypes;
var inputType = nodeList.GetType();
if (!validInTypes.Any(t => t.IsAssignableFrom(inputType)))
{
    throw new ArgumentException("Invalid Input");
}

transform.LoadInput(xDoc);
var innerTransform = new XmlDsigC14NTransform();

innerTransform.LoadInnerXml(xDoc.SelectNodes("//."));
var ms = (MemoryStream) transform.GetOutput(typeof (Stream));
ms.Flush();
File.WriteAllBytes(@"child1.xml", ms.ToArray());

In child1.xml I have:

<Root xmlns:test="http://www.test.com/xades#" Attr="root">&#xD;
  <Child1 Attribute1="c1" Bttribute="c2" Cttribute="c3">&#xD;
    <child11 Attribute11="c11">Element11</child11>&#xD;
  </Child1>&#xD;
  <Child2 Attribute2="c2">&#xD;
    <child21 Attribute21="c21">Element21</child21>&#xD;
    <child22 Attribute22="c22">Element22</child22>&#xD;
  </Child2>&#xD;
  <Child3 Attribute3="c3">&#xD;
    <child31 Attribute32="c31">&#xD;
      <child311 Attribute311="c311">Element311</child311>&#xD;
    </child31>&#xD;
  </Child3>&#xD;
</Root>

Hope it helped. Tobias

schoetbi
Hello Tobias. I still cannot get the inner XML in a canonicalized node. I see that you are using another algorithm for canonicalization. I must strictly use XmlDsigC14NTransformUrl. Regardless, changing the algorithm or using the LoadInnerXml method doesn't seem to affect the output. Thank you for your time and effort.
artsince
I added my child1.xml. Isn't that what you wanted? What is your expected result?
schoetbi
What I want is something like this: <Child1 xmlns:test="http://www.test.com/xades#" Attribute1="c1" Bttribute="c2" Cttribute="c3"><child11 Attribute11="c11">Element11</child11></Child1>. I just want to canonicalize the Child1 node, not the entire document. I cannot get that with the C# methods. With some internal Java methods I can get a reasonable output. This discrepancy is annoying, but it is a subject for another question. Thanks again.
artsince
A: 

Have you checked MSDN: http://msdn.microsoft.com/en-us/library/fzh48tx1.aspx The sample on that page has a comment which says "This transform does not contain inner XML elements", which suggests a known issue.

You could try different XPaths like //Child1/* or //Child1|//Child1/* or //Child1//* or an explicit node() selection (check the full XPath syntax at http://msdn.microsoft.com/en-us/library/ms256471.aspx), but you are in a gray zone, gambling with a bug.

So, if your ms2.xml is the actual output you wanted, you'll just have to do that intermediate serialization for the time being.

Also fire up Reflector and take a look - the class is probably not terribly complicated.
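
The intermediate serialization mentioned above (the ms2.xml route in the question) could be wrapped up roughly as follows. This is only a sketch: NodeRoundTripCanonicalizer and its Canonicalize method are hypothetical names, not part of the framework. Note that the copy only carries what OuterXml writes out, so an unused namespace declaration inherited from an ancestor (such as xmlns:test on Root) does not appear, just as it is missing from ms2.xml. The bytes can therefore differ from what you get by canonicalizing the node in place.

```csharp
using System.IO;
using System.Security.Cryptography.Xml;
using System.Xml;

public static class NodeRoundTripCanonicalizer
{
    // Hypothetical helper: re-parses the node's serialized form as a
    // stand-alone document and canonicalizes that whole document.
    public static byte[] Canonicalize(XmlNode node)
    {
        XmlDocument copy = new XmlDocument();
        copy.PreserveWhitespace = true;   // keep whitespace exactly as serialized
        copy.LoadXml(node.OuterXml);      // the node becomes the root of the copy

        XmlDsigC14NTransform transform = new XmlDsigC14NTransform();
        transform.LoadInput(copy);        // whole-document input, so inner XML is kept
        MemoryStream ms = (MemoryStream)transform.GetOutput(typeof(Stream));
        return ms.ToArray();
    }
}
```

Usage would be along the lines of File.WriteAllBytes(@"ms2.xml", NodeRoundTripCanonicalizer.Canonicalize(nodeList[0])).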

ZXX
We actually looked into the canonicalization code with Reflector and got some idea of how it works. I guess we should also look into how canonicalization is handled while signing, as there are a lot of inner nodes involved for the SignedInfo tag.
artsince
It's not a known issue, it's by design and insisted upon by the relevant W3C RECs.
Jon Hanna
So it's a design flaw? :-) What if you give it both the parent and a child node: is it going to delete the child and then re-attach it? Or turn a tree into a forest? Say, //Child2[1]|//Child2[1]/*. Mind you, I'm not disputing the good intentions of the idea (to prevent inadvertently signing 1MB when 10K is enough), but the apparent self-contradiction of node removal based on an arbitrary node list (I can construct one to be much more devious: just a root and a node 3 levels deep with nothing in between :-).
ZXX
No, it's a design win. In the use cases that c14n is designed to cope with, someone may *want* to sign a node ("I stand by what this says") and not a child node ("This has nothing to do with me"), especially if parts of the document may mutate later. Hence c14n operates on the nodes returned by the XPath, while some other operations operate on the subtrees of the nodes returned by the XPath, because that is more useful in their cases. If c14n dealt with subtrees, it wouldn't serve some real use cases. Since we can specify "and attribute and descendant nodes" in the XPath, no one loses.
Jon Hanna
OK a design win :-)) So what will happen if you pass in nodes that are on the same path but have at least one node in between them that wasn't passed in?
ZXX
The result has them in their relative positions (descendants will still be descendants). See http://www.w3.org/TR/xml-c14n for more (I gave a working answer with zero knowledge of the Microsoft tools but a knowledge of the W3C spec, which in its own way shows MS have it right); http://www.w3.org/TR/xmldsig-core/ is also worth reading. Remember, it doesn't prevent uses that need subtrees, and the XPaths involved in getting whole subtrees can be processed efficiently; it just doesn't prevent those uses that need to leave subtrees out either.
Jon Hanna
A: 

I think your answer is in your question: "What I really want to do is to canonicalize a single node, but as you can see in the output file, the output does not contain any of the inner xml."

If I understand you, then really you don't want to canonicalise a single node, or you'd be happy that it doesn't contain the inner XML. You want to canonicalise a single subtree.

XPath returns nodes, not subtrees. Some operations on the nodes returned by an XPath expression will then include their child and attribute nodes by default, but canonicalisation deliberately isn't one of these, as potentially some of those very child nodes could be mutable in ways that you are not signing. In signing, you are only signing precisely those nodes that you say you are signing.

Changing the line in your code from:

XmlNodeList nodeList = xDoc.SelectNodes("//Child1");

to:

XmlNodeList nodeList =
    xDoc.SelectNodes("//Child1/descendant-or-self::node()|//Child1//@*");

Means I get the following in child1.xml:

<Child1 xmlns:test="http://www.test.com/xades#" Attribute1="c1" Bttribute="c2" Cttribute="c3">&#xD;
    <child11 Attribute11="c11">Element11</child11>&#xD;
  </Child1>

Am I correct in thinking that this is what you want?

Incidentally, more precision along the lines of:

XmlNodeList nodeList =
    xDoc.SelectNodes("//Child1[1]/descendant-or-self::node()|//Child1[1]//@*");

May be useful, as then the xpath evaluation can stop when it gets to the first </Child1>, with a performance gain that could be significant if your real data is large.
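
For reuse, the XPath change above could be folded into a small helper along these lines. This is a sketch under assumptions: SubtreeCanonicalizer and Canonicalize are hypothetical names, and elementXPath is assumed to be a single location path (no top-level union), since it is simply concatenated into the longer expression.

```csharp
using System.IO;
using System.Security.Cryptography.Xml;
using System.Text;
using System.Xml;

public static class SubtreeCanonicalizer
{
    // Hypothetical helper: canonicalizes the elements matched by elementXPath
    // together with all of their descendant nodes and attributes.
    public static string Canonicalize(XmlDocument doc, string elementXPath)
    {
        // The element itself plus every descendant node, and every attribute
        // anywhere in that subtree (including the element's own attributes).
        string subtreeXPath =
            elementXPath + "/descendant-or-self::node()|" + elementXPath + "//@*";
        XmlNodeList nodes = doc.SelectNodes(subtreeXPath);

        XmlDsigC14NTransform transform = new XmlDsigC14NTransform();
        transform.LoadInput(nodes);
        MemoryStream ms = (MemoryStream)transform.GetOutput(typeof(Stream));

        // Canonical XML is always UTF-8 encoded.
        return Encoding.UTF8.GetString(ms.ToArray());
    }
}
```

Calling SubtreeCanonicalizer.Canonicalize(xDoc, "//Child1") corresponds to the first XPath shown above.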

Jon Hanna
That output seems to be correct. Thank you very much for your time.
artsince
I am comparing the results with some Java code I wrote. With Java I do not get that "&#xD;" at the end of lines. Which one is the correct implementation?
artsince
If the source, at the time it reaches the processing above, has U+000D characters in the element content, then the &#xD; form is correct; if it doesn't, then the other form is correct. It's likely that the difference is not in the canonicalisation, but in the underlying stream reading or document loading. I suspect that one or other implementation is "fixing" the line breaks. Whichever is doing so is flawed, though again in terms of how you get the XML in and/or the nodes out, not the c14n.
Jon Hanna
@artsince It's suddenly occurred to me that I misspoke here. c14n is meant to preserve *explicit* U+000D characters (&#xD; etc. in the source). But XmlDocument doesn't distinguish these, so we need to handle this earlier. Going to add something to my answer to demonstrate this.
Jon Hanna
Thank you. I was confused about this and created a question here: http://stackoverflow.com/questions/3449209/handling-carriage-return-in-canonicalization-with-java
artsince
A: 

A separate answer on how to deal with the fact that XmlDocument doesn't distinguish U+000D in the source (should not be preserved) from explicit references such as &#xd; in the source (should be preserved).

Instead of:

using (FileStream fs = new FileStream(path, FileMode.Open))
{
    xDoc.Load(fs);
}

We first create a newline-cleaning TextReader:

private class LineCleaningTextReader : TextReader
{
  private readonly TextReader _src;
  public LineCleaningTextReader(TextReader src)
  {
    _src = src;
  }
  public override int Read()
  {
    int r = _src.Read();
    switch(r)
    {
      case 0xD:// \r
        switch(_src.Peek())
        {
          case 0xA: case 0x85: // \n or NEL char
            _src.Read();
            break;
        }
        return 0xA;
      case 0x85://NEL
        return 0xA;
      default:
        return r;
    }
  }
}

We then use this when loading the xDoc:

using (FileStream fs = new FileStream(path, FileMode.Open))
{
  using(TextReader tr = new StreamReader(fs))
    xDoc.Load(new LineCleaningTextReader(tr));
}

This then normalises newlines prior to processing, but leaves explicit &#xD; references alone.
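
To see whether a loaded document still holds U+000D characters in its content, a check along these lines might help. It is only a sketch: CarriageReturnCheck is a hypothetical name, and it inspects character data (text, whitespace, CDATA, comments) but not attribute values. Once the document has been loaded through the cleaning reader, any carriage return it still finds should have come from an explicit reference in the source.

```csharp
using System.Xml;

public static class CarriageReturnCheck
{
    // Hypothetical helper: returns true if any character-data node at or
    // below the given node contains a U+000D character.
    public static bool ContainsCarriageReturn(XmlNode node)
    {
        // XmlText, XmlWhitespace, XmlSignificantWhitespace, XmlCDataSection
        // and XmlComment all derive from XmlCharacterData.
        XmlCharacterData data = node as XmlCharacterData;
        if (data != null && data.Value != null && data.Value.IndexOf('\r') >= 0)
        {
            return true;
        }

        // Recurse into the children; attribute values are not inspected.
        foreach (XmlNode child in node.ChildNodes)
        {
            if (ContainsCarriageReturn(child))
            {
                return true;
            }
        }
        return false;
    }
}
```

For example, CarriageReturnCheck.ContainsCarriageReturn(xDoc) can be called right after xDoc.Load(...) with and without the cleaning reader to compare the two loading paths.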

Jon Hanna