What unit testing strategies do people recommend for testing that XML is being generated correctly?

My current tests seem a bit primitive, something along the lines of:

[Test]
public void pseudo_test()
{
   XmlDocument myDoc = _task.MyMethodToMakeXMLDoc();

   // NUnit's Assert.AreEqual takes the expected value first; OuterXml is a property.
   Assert.AreEqual("big string of XML", myDoc.OuterXml);
}
A: 

Why not assume that some commercial XML parser is correct and validate your XML against it? Something like:

Assert.IsTrue(myDoc.Xml.ParseOK)

Other than that, if you want to be thorough, I'd say you would have to build a parser yourself and validate each rule the XML specification requires.
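The `ParseOK` idea above can be approximated with the parser that ships with .NET. This is only a sketch: the class and method names (`XmlChecks`, `IsWellFormed`) are made up for illustration, and it checks well-formedness only, not validity against a schema.

```csharp
using System.Xml;

public static class XmlChecks
{
    // True when the text parses as well-formed XML.
    // This says nothing about whether it matches a schema.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            // LoadXml throws XmlException on a parse error.
            return false;
        }
    }
}
```

In a test this would read something like `Assert.IsTrue(XmlChecks.IsWellFormed(myDoc.OuterXml))`.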

vitorsilva
+1  A: 

Another possibility might be to use XmlReader and check for an error count > 0. Something like this:

    // Needs: using System.Collections.Specialized; using System.Xml; using System.Xml.Schema;
    // Declared as a field (not a local) so the validation callback can add to it.
    private readonly StringCollection _xmlErrors = new StringCollection();

    void CheckXml()
    {
        string xmlFile = "this.xml";
        string xsdFile = "schema.xsd";
        _xmlErrors.Clear();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationEventHandler += new ValidationEventHandler(this.ValidationEventHandler);
        settings.ValidationType = ValidationType.Schema;
        // The original read these three flags from UI checkboxes;
        // fixed values are used here, so pick whatever suits your test.
        settings.IgnoreComments = true;
        settings.IgnoreProcessingInstructions = true;
        settings.IgnoreWhitespace = true;
        settings.Schemas.Add(null, XmlReader.Create(xsdFile));

        using (XmlReader reader = XmlReader.Create(xmlFile, settings))
        {
            // Reading to the end fires the handler once per schema violation.
            while (reader.Read())
            {
            }
        }
        Assert.AreEqual(0, _xmlErrors.Count);
    }

    void ValidationEventHandler(object sender, ValidationEventArgs args)
    {
        _xmlErrors.Add("<" + args.Severity + "> " + args.Message);
    }
Mike K.
XMLUnit already will compare xml files and count the number of differences if you want...
djangofan
+2  A: 

If you have a standard format that you expect the output to follow, why not create an XML schema or DTD and validate against that? This won't depend on the data, so it will be flexible. Defining how the XML can be formed can also be helpful when designing your system.

Jeremy French
+3  A: 

Validate against an XML schema or DTD, and also check that key nodes have the values you expect.

svinto
+1, and C#'s XmlSerialization could help with this.
sixlettervariables
+7  A: 

XMLUnit may help you.

Ionuț G. Stan
A: 

Validate it against an XSD schema using the XmlSchema class. It's found under System.Xml.Schema. Another option would be to write a serialization class (using XmlSerializer) to deserialize your XML into an object. The gain is that it implicitly checks your structure, and after that the values can be easily accessed for testing through the resulting object.
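A minimal sketch of the deserialization route, assuming a made-up `Order` type and element names (`order`, `id`, `customer`) that stand in for whatever your XML actually contains. One caveat on the "implicit" check: XmlSerializer throws on malformed XML or an unexpected root element, but by default it skips elements it does not recognise, so this complements schema validation rather than replacing it.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical type standing in for whatever your XML represents.
[XmlRoot("order")]
public class Order
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    [XmlElement("customer")]
    public string Customer { get; set; }
}

public static class OrderXml
{
    // Deserializes the XML text into an Order.
    // Throws InvalidOperationException if the XML cannot be deserialized.
    public static Order Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Order));
        using (var reader = new StringReader(xml))
        {
            return (Order)serializer.Deserialize(reader);
        }
    }
}
```

A test can then assert on plain properties, for example `Assert.AreEqual("Ann", OrderXml.Parse(xml).Customer)`.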

AZ
there is a better validation method using XMLUnit ...
djangofan
+6  A: 

First, as pretty much everyone is saying, validate the XML if there's a schema defined for it. (If there's not, define one.)

But you can build tests that are a lot more granular than that by executing XPath queries against the document, e.g.:

string path = "/doc/element1[@id='key1']/element2[. = 'value2']";
Assert.IsNotNull(doc.SelectSingleNode(path));

This lets you test not only whether or not your document is semantically valid, but whether or not the method producing it is populating it with the values that you expect.
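Here is a self-contained sketch of that XPath check, using the same path as above. The helper name `XPathChecks.HasNode` and the sample document in the usage note are assumptions made for illustration.

```csharp
using System.Xml;

public static class XPathChecks
{
    // True when the XPath expression selects at least one node in the XML text.
    public static bool HasNode(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectSingleNode(xpath) != null;
    }
}
```

For the document `<doc><element1 id='key1'><element2>value2</element2></element1></doc>`, the path above selects a node; change `value2` in the predicate to anything else and it selects nothing, which is exactly the failure you want a test to report.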

Robert Rossney
A: 

Another reason to use a Schema to validate against is that while XML nodes are explicitly ordered, XML attributes are not.

So your string comparison of:

Assert.AreEqual("big string of XML", myDoc.OuterXml);

would fail if the attributes are in a different order, as could easily happen if one bit of XML was created manually and the other programmatically.
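If you do want a whole-document comparison that tolerates attribute order, one option (not mentioned in the question, and assuming LINQ to XML from System.Xml.Linq is available) is to sort the attributes on both sides before comparing. The names `AttributeOrder`, `Normalize` and `SameIgnoringAttributeOrder` are invented for this sketch.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class AttributeOrder
{
    // Returns a copy of the element with attributes sorted by name at every level.
    public static XElement Normalize(XElement element)
    {
        return new XElement(
            element.Name,
            element.Attributes().OrderBy(a => a.Name.ToString()),
            element.Nodes().Select(n => n is XElement child ? Normalize(child) : n));
    }

    // Compares two XML strings, ignoring the order in which attributes were written.
    public static bool SameIgnoringAttributeOrder(string expectedXml, string actualXml)
    {
        XElement expected = Normalize(XElement.Parse(expectedXml));
        XElement actual = Normalize(XElement.Parse(actualXml));
        return XNode.DeepEquals(expected, actual);
    }
}
```

With this, `<item id="1" name="x"/>` and `<item name="x" id="1"/>` compare as equal, while a changed attribute value or a reordered child element still fails.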

merlinc
A: 

Actually I used exactly the same test as you used in your question (above).

I think this way of testing is perfectly fine for simple situations.

jonathanconway
A: 

  • Verify the resulting document is well formed.
  • Verify the resulting document is valid.
  • Verify the resulting document is correct.

Presumably, you are crafting an XML document out of useful data, so you will want to ensure that you have the right coverage of inputs for your tests. The most common problems I see are:

  • Incorrectly escaped elements
  • Incorrectly escaped attributes
  • Incorrectly escaped element names
  • Incorrectly escaped attribute names

So if you haven't already done so, you would need to review the XML spec to see what's allowed in each place.
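One way to cover the escaping cases is a round-trip test: feed awkward characters in, serialize, parse the result back, and check the values survive. The sketch below builds a tiny document through the DOM; the element and attribute names (`note`, `title`) and the class `EscapingChecks` are made up, and in your own tests `BuildNote` would be replaced by the code under test.

```csharp
using System.Xml;

public static class EscapingChecks
{
    // Builds <note title="...">text</note> through the DOM and returns the serialized XML.
    public static string BuildNote(string title, string text)
    {
        var doc = new XmlDocument();
        XmlElement note = doc.CreateElement("note");
        note.SetAttribute("title", title);
        note.InnerText = text;
        doc.AppendChild(note);
        return doc.OuterXml;
    }

    // True when both values come back unchanged after serializing and re-parsing.
    public static bool RoundTrips(string title, string text)
    {
        var parsed = new XmlDocument();
        parsed.LoadXml(BuildNote(title, text));
        XmlElement note = parsed.DocumentElement;
        return note.GetAttribute("title") == title && note.InnerText == text;
    }
}
```

Inputs worth trying include `&`, `<`, `>` and both kinds of quote, in element text and in attribute values.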

How much "checking" should happen in each test isn't immediately clear. It will depend a lot on what a unit is in your problem space, I suppose. It seems reasonable that each unit test is checking that one piece of data is correctly expressed in the XML. In this case, I'm in agreement with Robert that a simple check that you find the right data at a single XPath location is best.

For larger automated tests, where you want to check the entire document, what I've found to be effective is to have an expected result that is also a document, walk through it node by node, use XPath expressions to find the corresponding node in the actual document, and then apply the correct comparison to the data encoded in the two nodes.

With this approach, you'll normally want to catch all failures at once, rather than aborting on first failure, so you may need to be tricksy about how you track where mismatches occurred.

With a bit more work, you can recognize certain element types as being excused from a test (like a time stamp), or to validate that they are pointers to equivalent nodes, or... whatever sort of custom verification you want.
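A rough sketch of that node-by-node walk, under some simplifying assumptions: no namespaces, only leaf element text is compared (attributes are not), and "excused" elements are identified by name. The class `XmlTreeComparer` and its members are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XmlTreeComparer
{
    // Returns one message per mismatch instead of stopping at the first one.
    public static List<string> Compare(XmlDocument expected, XmlDocument actual,
                                       ICollection<string> ignoredNames)
    {
        var mismatches = new List<string>();
        Walk(expected.DocumentElement, "", actual, ignoredNames, mismatches);
        return mismatches;
    }

    private static void Walk(XmlElement expectedNode, string parentPath, XmlDocument actual,
                             ICollection<string> ignoredNames, List<string> mismatches)
    {
        // Excused elements (for example a time stamp) are skipped entirely.
        if (ignoredNames.Contains(expectedNode.Name)) return;

        // Positional XPath such as /doc[1]/item[2] locates the counterpart node.
        string path = parentPath + "/" + expectedNode.Name + "[" + Position(expectedNode) + "]";
        XmlNode actualNode = actual.SelectSingleNode(path);
        if (actualNode == null)
        {
            mismatches.Add("Missing: " + path);
            return;
        }

        bool hasChildElements = false;
        foreach (XmlNode child in expectedNode.ChildNodes)
        {
            if (child is XmlElement childElement)
            {
                hasChildElements = true;
                Walk(childElement, path, actual, ignoredNames, mismatches);
            }
        }

        // Only leaf elements have their text compared.
        if (!hasChildElements && expectedNode.InnerText != actualNode.InnerText)
        {
            mismatches.Add(path + ": expected '" + expectedNode.InnerText
                           + "' but was '" + actualNode.InnerText + "'");
        }
    }

    // 1-based index of the element among preceding siblings with the same name.
    private static int Position(XmlElement node)
    {
        int position = 1;
        for (XmlNode s = node.PreviousSibling; s != null; s = s.PreviousSibling)
        {
            if (s.NodeType == XmlNodeType.Element && s.Name == node.Name) position++;
        }
        return position;
    }
}
```

A test would then assert that the returned list is empty and print its contents on failure, so every mismatch shows up in a single run.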

VoiceOfUnreason