I have a chunk of XML data coming out of a database that I need to generate an XSD for. I got it all working using xsd.exe, but all the elements are showing up as string, even values like 2079.0200. How do I get xsd.exe to guess at types? Would the XmlSchemaExporter class be able to do this?

The issue here is that Visual Studio generates the XSD that I want (with decimal types, etc.) when I use the XML --> Create Schema command, but I don't want to have to do this by hand. I'm setting up a process that takes in a chunk of XML and generates an XSD, and it needs to have more types than just "string".

Related, but don't know if it's a solution yet (XmlSchemaInference class): http://stackoverflow.com/questions/74879/any-tools-to-generate-an-xsd-schema-from-an-xml-instance-document

A: 

The solution is to create the schema by hand, based on the one that's been generated. Then don't run XSD.EXE again.

John Saunders
Visual Studio manages to generate the schema that I want based on a sample of xml, so "do it by hand" is not a good answer. I wonder if I can hook into Visual Studio...
jcollum
Visual Studio is calling a .NET API that guesses the schema. You don't have to guess - you know what the schema should be. So use your knowledge to fix up the schema to be correct. Note that XSD.EXE / Visual Studio also can't guess whether or not an element is mandatory, or what the occurrence limits are, or the inheritance structure, or missing attributes, etc. You can't depend on the inferred schema.
John Saunders
I've got 40+ of these to generate and they don't have to be super accurate. It looks like XmlSchemaInference will do what I need, or at least that's the direction I'm going.
jcollum
You cannot place an upper bound on the degree of inaccuracy of the generated schema. All you can know for certain is that the original sample will validate against the schema. In fact, I'm not sure there's any guarantee that the same sample will continue to generate the same schema over time. Believe me, it's worth learning XML schema and getting this right.
John Saunders
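
The one guarantee mentioned above, that the original sample validates against the inferred schema, is easy to check for yourself. A minimal sketch, assuming a made-up sample document (the element names here are hypothetical, not from the question):

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

class ValidateAgainstInferredSchema
{
    static void Main()
    {
        // Hypothetical sample; the real XML would come from the database.
        string xml = "<order><id>17</id><total>2079.0200</total></order>";

        // Infer a schema from the sample.
        XmlSchemaSet schemas;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            schemas = new XmlSchemaInference().InferSchema(reader);
        }

        // Re-read the same sample with schema validation switched on.
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(schemas);

        int problems = 0;
        settings.ValidationEventHandler += (sender, e) =>
        {
            problems++;
            Console.WriteLine(e.Severity + ": " + e.Message);
        };

        using (XmlReader validating = XmlReader.Create(new StringReader(xml), settings))
        {
            // Reading to the end drives validation of every node.
            while (validating.Read()) { }
        }

        Console.WriteLine("Validation problems: " + problems);
    }
}
```

Running a different document from the same source through the validating reader is a quick way to see how far the inferred schema can be trusted beyond the sample it was built from.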
A: 

John's answer is valid for situations where accuracy is more important than speed. For my situation, I needed many schemas that were identical to what would be produced via the VS "Create Schema" command. So accuracy wasn't as important as matching a known baseline and speed.

This is what I ended up doing. It produced output identical to the VS "Create Schema" command:

// Requires: using System.Text; using System.Xml; using System.Xml.Schema;
XmlSchemaInference inf = new XmlSchemaInference();

// xml is a string holding the XML fragment being passed in
XmlSchemaSet schemas = inf.InferSchema(new XmlTextReader(xml, XmlNodeType.Element, null));
schemas.Compile();

// Assumes a single target namespace, so the set holds exactly one schema
XmlSchema[] schemaArray = new XmlSchema[1];
schemas.CopyTo(schemaArray, 0);

// The using block closes the writer even if Write throws
using (XmlTextWriter wr = new XmlTextWriter(xsdOutputFileNameAndPath, Encoding.UTF8))
{
    wr.Formatting = Formatting.Indented;
    schemaArray[0].Write(wr);
}
jcollum
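
If more than one sample document is available, XmlSchemaInference also has an InferSchema overload that takes an existing XmlSchemaSet and refines it with each new document. A rough sketch of that variation, not what the answer above did; the sample strings are invented and the schema is written to the console rather than to a file:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

class InferFromSeveralSamples
{
    static void Main()
    {
        // Hypothetical samples standing in for XML pulled from the database.
        string[] samples =
        {
            "<item><price>2079.0200</price></item>",
            "<item><price>15</price><note>rush</note></item>"
        };

        XmlSchemaInference inference = new XmlSchemaInference();
        XmlSchemaSet schemas = null;

        foreach (string sample in samples)
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(sample)))
            {
                // First sample: infer from scratch. Later samples: refine
                // the schema set inferred so far.
                schemas = schemas == null
                    ? inference.InferSchema(reader)
                    : inference.InferSchema(reader, schemas);
            }
        }

        // Write every schema in the set (one per target namespace).
        XmlWriterSettings settings = new XmlWriterSettings { Indent = true };
        foreach (XmlSchema schema in schemas.Schemas())
        {
            using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
            {
                schema.Write(writer);
            }
        }
    }
}
```

Looping over schemas.Schemas() instead of copying into a one-element array also avoids an exception when the input uses more than one namespace.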