What would be the best way to validate XML?

+1 Q:

I've been looking at XML serialization for C# and it looks interesting. I was reading this tutorial

http://www.switchonthecode.com/tutorials/csharp-tutorial-xml-serialization

and of course you can deserialize it back to a list of objects. So I am wondering: would it be better to deserialize it back to a list of objects and then go through each object and validate it, or to validate it against a schema first and then deserialize it and do stuff with it?

http://support.microsoft.com/kb/307379

Thanks

+1 A:

I guess it would depend a bit on what you want to validate, and for what purpose. If it is intended for interop with other systems, then validating via XSD is a reasonable idea, not least because you can use xsd.exe to write your classes for you from the XSD (you can also generate an XSD from XML or a DLL, but it isn't as accurate). Likewise, you can use XmlReader (appropriately configured) to check against the XSD.

If you just want valid .NET objects, I'd be tempted to leave the serialized form as an implementation detail, and write some C# validation code - perhaps implementing IDataErrorInfo, or using data-annotations.

Marc Gravell 2010-01-23 23:05:41
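As a rough illustration of the data-annotations option mentioned in the answer above, here is a minimal sketch. The `Customer` class, its rules, and the `CustomerValidator` helper are hypothetical names invented for this example; only the `System.ComponentModel.DataAnnotations` types are real.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical business object; the property names and limits are illustrative only.
public class Customer
{
    [Required(ErrorMessage = "Name must not be blank.")]
    public string Name { get; set; }

    [Range(0, 150, ErrorMessage = "Age must be between 0 and 150.")]
    public int Age { get; set; }
}

public static class CustomerValidator
{
    // Returns the error messages for one object; an empty list means it passed.
    public static List<string> Validate(Customer customer)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(customer);

        // Passing true for validateAllProperties makes [Range] run as well as [Required].
        Validator.TryValidateObject(customer, context, results, true);

        var messages = new List<string>();
        foreach (var result in results)
        {
            messages.Add(result.ErrorMessage);
        }
        return messages;
    }
}

public static class DataAnnotationsDemo
{
    public static void Main()
    {
        // Both rules are violated here: a blank name and an out-of-range age.
        var bad = new Customer { Name = "", Age = 200 };
        foreach (var message in CustomerValidator.Validate(bad))
        {
            Console.WriteLine(message);
        }
    }
}
```

With this approach the rules live on the object itself, so they apply the same way whether the object came from XML, a database, or a form.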

I have run into issues where xml was legal against the XSD definition but not against the actual schema. It's a good sanity check for the serializer to work against your input, but it is not true validation of the XML against the schema.

Spence 2010-01-23 23:07:30

@Spence - for info, what do you mean by "actual schema" in this scenario?

Marc Gravell 2010-01-23 23:10:54

Well, the output from xsd.exe would create, say, a property with an integer type. If your schema then puts a restriction on the range of that integer, xsd.exe will not represent this. Furthermore, for speed, the XmlSerializer will not check it either. So if you don't actually validate the input against the schema, you can have a business object and XML output which are actually illegal under your spec.

Spence 2010-01-23 23:17:44
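The point in the comment above can be sketched as follows. `OrderLine` is a hypothetical class of the shape xsd.exe might generate, and the 1 to 100 range is an assumed schema restriction that a plain `int` property cannot carry.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical generated class. Assume the schema restricts Quantity to 1..100;
// nothing in this class records that restriction.
public class OrderLine
{
    public int Quantity { get; set; }
}

public static class UncheckedDeserialization
{
    // Deserializes without any schema, so only the shape and the int parse are checked.
    public static OrderLine Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(OrderLine));
        using (var reader = new StringReader(xml))
        {
            return (OrderLine)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // 5000 is outside the assumed 1..100 range, yet deserialization succeeds.
        var line = Load("<OrderLine><Quantity>5000</Quantity></OrderLine>");
        Console.WriteLine(line.Quantity); // prints 5000
    }
}
```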

@Spence - I'm confused though; I'm suggesting (via XmlReader) to validate against the xsd - I *believe* this checks restrictions?

Marc Gravell 2010-01-23 23:26:09

Not unless you pass in the schemas to be validated against?

Spence 2010-01-23 23:32:54

An (appropriately configured) XmlReader probably covers that. I'm still making the point that an XmlReader on its own needs to be configured with the schemas to actually perform validation.

Spence 2010-01-24 20:38:18
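A minimal sketch of the difference discussed in the comments above: the same document read with a default `XmlReader` and with one configured for schema validation. The schema text, the element names, and the `ReaderValidation` helper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ReaderValidation
{
    // Illustrative schema: Quantity must be an integer from 1 to 100.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='OrderLine'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Quantity'>
          <xs:simpleType>
            <xs:restriction base='xs:int'>
              <xs:minInclusive value='1'/>
              <xs:maxInclusive value='100'/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Reads the whole document and returns any validation messages.
    // When useSchema is false the reader only checks that the XML is well-formed.
    public static List<string> Check(string xml, bool useSchema)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings();
        if (useSchema)
        {
            settings.ValidationType = ValidationType.Schema;
            settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
            settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);
        }

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens as the reader advances
        }
        return errors;
    }

    public static void Main()
    {
        const string xml = "<OrderLine><Quantity>5000</Quantity></OrderLine>";
        Console.WriteLine("Errors without schema: " + Check(xml, false).Count);
        Console.WriteLine("Errors with schema: " + Check(xml, true).Count);
    }
}
```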

Can I use xsd.exe to take my classes and make an XSD out of them?

chobo2 2010-01-25 00:12:43

@chobo2 - sure: `xsd.exe <assembly>.dll|.exe [/outputdir:] [/type: [...]]`

Marc Gravell 2010-01-25 05:32:14
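If running the command-line tool is inconvenient, a schema can also be produced in code. This is only a sketch using the schema export classes in `System.Xml.Serialization`; whether its output matches what xsd.exe writes for the same type is an assumption, and `Person` and `SchemaFromType` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical class to export; any XmlSerializer-compatible type should do.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class SchemaFromType
{
    // Builds the XSD text that describes how XmlSerializer maps the given type.
    public static string Export(Type type)
    {
        var schemas = new XmlSchemas();
        var exporter = new XmlSchemaExporter(schemas);

        // The importer reflects over the type and its serialization attributes.
        XmlTypeMapping mapping = new XmlReflectionImporter().ImportTypeMapping(type);
        exporter.ExportTypeMapping(mapping);

        var writer = new StringWriter();
        foreach (XmlSchema schema in schemas)
        {
            schema.Write(writer);
        }
        return writer.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Export(typeof(Person)));
    }
}
```

Note that a schema generated this way only describes structure and types; range or length restrictions still have to be added to the XSD by hand.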

I didn't know you could make an XSD from an assembly with decorated attributes, that's handy. How does it validate the xsi:type="" attribute?

Chris S 2010-01-25 10:28:50

You can create an XmlValidatingReader and pass that into your serializer. That way you can read the file in one pass and validate it at the same time.

I believe the same technique will work even if you are using hand rolled XML classes (for extremely large XML files) so you might find it worth a look.

Edit:

Sorry, I just reread some of my code: XmlValidatingReader is obsolete. You can do what you need with XmlReader.

See XmlReaderSettings.

Spence 2010-01-23 23:06:09
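The one-pass approach described in the answer above might look like the sketch below: a validating `XmlReader` is handed to `XmlSerializer`, and validation messages are collected while the object is built. The schema, `OrderLine`, and `ValidatingLoader` are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical serializable class matching the illustrative schema below.
public class OrderLine
{
    public int Quantity { get; set; }
}

public static class ValidatingLoader
{
    // Illustrative schema: Quantity must be an integer from 1 to 100.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='OrderLine'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Quantity'>
          <xs:simpleType>
            <xs:restriction base='xs:int'>
              <xs:minInclusive value='1'/>
              <xs:maxInclusive value='100'/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Deserializes and validates in a single pass; messages are appended to errors.
    public static OrderLine Load(string xml, List<string> errors)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        var serializer = new XmlSerializer(typeof(OrderLine));
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return (OrderLine)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var errors = new List<string>();
        var line = Load("<OrderLine><Quantity>5000</Quantity></OrderLine>", errors);
        Console.WriteLine("Quantity read: " + line.Quantity);
        foreach (var error in errors)
        {
            Console.WriteLine("Validation: " + error);
        }
    }
}
```

Collecting messages in a handler, as here, lets the caller decide whether to reject the object; leaving the handler out makes the reader raise an exception on the first schema violation instead.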

For speed I would do it in C#; for completeness, however, you might want to do it using an XSD. The issue with that is you have to learn the verbose and cumbersome XSD syntax, which from experience takes a lot of trial and error, is time-consuming, and offers little reward for serialization, particularly with constants, where you have to map them in both C# and the XSD.

You'll always be writing the XML from C#. Anything not recognized when it is read back in is simply ignored. If you aren't editing the XML with a text editor, you can guarantee that it will come back in the right way, in which case an XSD is definitely not needed.

Chris S 2010-01-23 23:06:10

Well, I am expecting people to edit the XML file, and it needs to be in the right format so that the data can be entered into my database.

chobo2 2010-01-25 00:10:02

@chobo - could you write an application for them to edit it with? If you aren't, or don't want to be, an XML [XSD] expert, this will end up taking less time.

Chris S 2010-01-25 10:22:15

If you validate the XML, you can only prove that it's structurally correct. An attempt to deserialize from the XML will tell you the same thing.

Typically business objects can implement business logic/rules/conditions that go beyond a valid schema. That type of knowledge should stay with the business objects themselves, rather than being duplicated in some sort of external validation routine (otherwise, if you change a business rule, you have to update the validator at the same time).

Eric J. 2010-01-23 23:07:16

I want to validate both the structure and the actual fields. For example, Field A should not be blank. I think a schema could do both of these, but I am not sure.

chobo2 2010-01-23 23:20:13
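On the question in the comment above, a schema can express a simple "not blank" rule alongside the structure. The sketch below assumes a hypothetical `Record` element with a required `FieldA` of at least one character, and checks it with the `Validate` extension method for `XDocument`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class NotBlankCheck
{
    // Illustrative schema: Record must contain FieldA, and FieldA needs at least one character.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Record'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='FieldA'>
          <xs:simpleType>
            <xs:restriction base='xs:string'>
              <xs:minLength value='1'/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Returns the validation messages for a document; an empty list means it passed.
    public static List<string> Validate(string xml)
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        var errors = new List<string>();
        XDocument.Parse(xml).Validate(schemas, (sender, e) => errors.Add(e.Message));
        return errors;
    }

    public static void Main()
    {
        string[] samples =
        {
            "<Record><FieldA>hello</FieldA></Record>", // valid
            "<Record><FieldA></FieldA></Record>",      // field present but empty
            "<Record/>"                                // field missing entirely
        };
        foreach (var xml in samples)
        {
            Console.WriteLine(xml + " -> " + Validate(xml).Count + " error(s)");
        }
    }
}
```

One caveat: `xs:string` preserves whitespace, so a value made only of spaces would still satisfy `minLength`; ruling that out would need an `xs:pattern` facet or a check in C#.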
