I have a string variable containing XML, and an XSD file. I need to validate the XML in the string against the XSD file, and I know there is more than one way to do it (XmlDocument, XmlReader, ... ?).

After the validation I just have to store the XML, so I don't need it in an XDocument or XmlDocument.

What's the way to go if I want the fastest performance?

+3  A: 

I would go for XmlReader with XmlReaderSettings, because it does not need to load the complete XML into memory. That makes it more efficient for big XML files.

Johann Blais
A: 

XmlReader is fastest.

Aliostad
+2  A: 

I think the fastest way is to use an XmlReader that validates the document as it is being read. This allows you to validate the document in only one pass: http://msdn.microsoft.com/en-us/library/hdf992b8.aspx

Rune Grimstad
A: 

Use an XmlReader configured to perform validation, with the source being a TextReader.

You can manually specify the XSD the XmlReader is to use (via the XmlReaderSettings.Schemas property) if you don't want to rely on declarations in the input document.

A starting point (which just assumes XSD-instance declarations in the input document) would be:

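// Requires: using System; using System.IO; using System.Xml; using System.Xml.Schema;
var settings = new XmlReaderSettings {
  ConformanceLevel = ConformanceLevel.Document,
  ValidationType = ValidationType.Schema,
  // ReportValidationWarnings is needed for warnings to reach the handler at all
  ValidationFlags = XmlSchemaValidationFlags.ProcessSchemaLocation
                  | XmlSchemaValidationFlags.ProcessInlineSchema
                  | XmlSchemaValidationFlags.ReportValidationWarnings,
};
int warnings = 0;
int errors = 0;
// Count problems instead of throwing, so the whole document is checked in one pass
settings.ValidationEventHandler += (obj, ea) => {
  if (ea.Severity == XmlSeverityType.Warning) {
    ++warnings;
  } else {
    ++errors;
  }
};
XmlReader xvr = XmlReader.Create(new StringReader(inputDocInString), settings);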

try {
    while (xvr.Read()) {
        // do nothing
    }
    if (0 != errors) {
        Console.WriteLine("\nFailed to load XML, {0} error(s) and {1} warning(s).", errors, warnings);
    } else if (0 != warnings) {
        Console.WriteLine("\nLoaded XML with {0} warning(s).", warnings);
    } else {
        System.Console.WriteLine("Loaded XML OK");
    }

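    Console.WriteLine("\nSchemas loaded during validation:");
    // ListSchemas is a helper that is not shown here; the schema set lives on the reader's settings
    ListSchemas(xvr.Settings.Schemas, 1);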

} catch (System.Xml.Schema.XmlSchemaException e) {
    System.Console.Error.WriteLine("Failed to read XML: {0}", e.Message);

} catch (System.Xml.XmlException e) {
    System.Console.Error.WriteLine("XML Error: {0}", e.Message);

} catch (System.IO.IOException e) {
    System.Console.Error.WriteLine("IO error: {0}", e.Message);
}
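
If the schema should come from a known XSD file rather than from declarations inside the document, a variant along these lines might work. It is only a sketch: the class name, method name and parameters (XsdFileValidation, CountErrors, xml, xsdPath) are made up for illustration, and it counts errors only.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class XsdFileValidation
{
    // Hypothetical helper: returns the number of validation errors found in
    // 'xml' when it is checked against the schema stored at 'xsdPath'.
    public static int CountErrors(string xml, string xsdPath)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        // A null target namespace means: use the targetNamespace declared in the XSD itself.
        settings.Schemas.Add(null, xsdPath);

        int errors = 0;
        settings.ValidationEventHandler += (sender, e) =>
        {
            if (e.Severity == XmlSeverityType.Error) ++errors;
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens as a side effect of reading
        }
        return errors;
    }
}
```

Malformed XML still surfaces as an XmlException from Read(), so the catch blocks above remain relevant. Also note that an element whose namespace has no matching schema is only reported as a warning, which this sketch ignores.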
Richard
+1  A: 

Others have already mentioned the XmlReader class for doing the validation, so I won't elaborate on that.

Your question does not give much context. Will you be doing this validation repeatedly for many XML documents, or just once? I'm assuming a scenario where you are validating a lot of XML documents (from a third-party system?) and storing them for future use.

My contribution to the performance hunt would be to use a compiled XmlSchemaSet, which is thread safe, so several threads can reuse it without having to parse the XSD document again.

var xmlSchema = XmlSchema.Read(stream, null);
var xmlSchemaSet = new XmlSchemaSet();
xmlSchemaSet.Add(xmlSchema);
xmlSchemaSet.Compile();

CachedSchemas.Add(name, xmlSchemaSet);
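
Reusing the compiled set for each incoming document could then look roughly like the following. This is a sketch under assumptions: CachedSchemaValidator, Compile and IsValid are illustrative names, and the XSD is passed in as text rather than read from the stream used above.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class CachedSchemaValidator
{
    // Hypothetical helper: parse and compile the XSD once, then keep the result around.
    public static XmlSchemaSet Compile(string xsdText)
    {
        var set = new XmlSchemaSet();
        set.Add(XmlSchema.Read(new StringReader(xsdText), null));
        set.Compile();
        return set;
    }

    // Validates one document against the already compiled set.
    // A fresh XmlReaderSettings is created per call; only the schema set is shared.
    public static bool IsValid(string xml, XmlSchemaSet compiledSchemas)
    {
        bool valid = true;
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = compiledSchemas
        };
        settings.ValidationEventHandler += (sender, e) =>
        {
            if (e.Severity == XmlSeverityType.Error) valid = false;
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // reading drives the validation
        }
        return valid;
    }
}
```

Compile would be called once at startup and IsValid once per document. Malformed XML is not caught here and would still throw an XmlException.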
Simon Svensson
Yes, I validate and store a lot of XML documents from a third-party system for later use. The XSD is always the same, so your hint about compiling the schema set is much appreciated, thanks!
Hinek