tags:

views:

7974

answers:

6

Hi,

I am trying to get a xml file into a dataset and am using the following code:

DataSet ds = new DataSet("TestDataSet");
ds.ReadXml(FileName);

and this xml file:

<Catalog>
 <Rec>
   <ITEM dt:dt="string"/>
   <QTY dt:dt="string">1</QTY>
   <SUB dt:dt="string">1</SUB>
   <CATALOG dt:dt="string">ABC123</CATALOG>
  </Rec>
  <Rec>
   <ITEM dt:dt="string"/>
   <QTY dt:dt="string">1</QTY>
   <SUB dt:dt="string">1</SUB>
   <CATALOG dt:dt="string">ABC124</CATALOG>
  </Rec>
  <Rec>
   <ITEM dt:dt="string"/>
   <QTY dt:dt="string">1</QTY>
   <SUB dt:dt="string">1</SUB>
   <CATALOG dt:dt="string">ABC125</CATALOG>
  </Rec>
 </Catalog>

The trouble is that the after setting a watch on ds, it only appears to contain a table called Rec and a column called Rec_Id. If I remove the "dt:dt="String"" datatype everything works fine.

I am using C#.net 2008...

Can someone please advise of the correct way to import this data without having to alter the xml file?

Thanks

+1  A: 

In order for the ReadXml to function properly in this case, I think you need to specify a Schema in your XML file. Otherwise, the reader will not know what to do with the datatypes.

Jon
Bingo. For reference, try doing a WriteXML on a known DataSet and see what it produces. You'll notice in the root node for the class name it adds the namespace attributes Jon mentioned.
Yadyn
+2  A: 

As soon as you define your XML namespace used in the XML elements, you can easily import this - no problem.

You need to have your XML look something like this:

<Catalog xmlns:dt="some-xml-namespace-here">
 <Rec>
   <ITEM dt:dt="string"/>
   <QTY dt:dt="string">1</QTY>
   <SUB dt:dt="string">1</SUB>
   <CATALOG dt:dt="string">ABC123</CATALOG>
  </Rec>
  .....
 </Catalog>

After I do this, your two lines of code work like a charm and the data gets imported, no problem (into 5 tables inside the DataSet).

Marc

marc_s
This cannot be validated against any schema, but if you add xmlns="dt" as well, a schema can be created that will validate the xml.
Richard Hein
True - but the original poster didn't ask for validation - he only wants to import the file into a dataset. This can be accomplished as I showed.
marc_s
A: 

This will make it parseable. The dt namespace usually refers to xmlns:dt="urn:schemas-microsoft-com:datatypes". Something or someone messed up your XML, but if you must be able to import it, you can just modify the xmlns attributes on the catalog element as shown:

            string xml = @"<Catalog xmlns=""dt"" xmlns:dt=""dt"">
                        <Rec>
                        <ITEM dt:dt=""string""/>   
                        <QTY dt:dt=""string"">1</QTY>   
                        <SUB dt:dt=""string"">1</SUB>   
                        <CATALOG dt:dt=""string"">ABC123</CATALOG>  
                        </Rec>  
                        <Rec>   
                        <ITEM dt:dt=""string""/>   
                        <QTY dt:dt=""string"">1</QTY>   
                        <SUB dt:dt=""string"">1</SUB>   
                        <CATALOG dt:dt=""string"">ABC124</CATALOG>  
                        </Rec>  
                        <Rec>   
                        <ITEM dt:dt=""string""/>   
                        <QTY dt:dt=""string"">1</QTY>   
                        <SUB dt:dt=""string"">1</SUB>   
                        <CATALOG dt:dt=""string"">ABC125</CATALOG>  
                        </Rec> 
                        </Catalog>";

        DataSet ds = new DataSet("Whatev");

        TextReader txtReader = new StringReader(xml);
        XmlReader reader = new XmlTextReader(txtReader);
        ds.ReadXml(reader);
        Debug.Assert(ds.Tables.Count ==5);
        Debug.Assert((string)ds.Tables[2].Rows[0][0] == "string");
        Debug.Assert((string)ds.Tables[3].Rows[0][1] == "1");

Also Marc is correct, but with the definition above you can generate a matching schema:

<xs:schema xmlns:dt="dt" attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="dt" xmlns:xs="http://www.w3.org/2001/XMLSchema"&gt;&lt;xs:element name="Catalog"><xs:complexType>    <xs:sequence><xs:element maxOccurs="unbounded" name="Rec"><xs:complexType><xs:sequence><xs:element name="ITEM"><xs:complexType><xs:attribute ref="dt:dt" use="required" /></xs:complexType></xs:element><xs:element name="QTY"><xs:complexType>
              <xs:simpleContent><xs:extension base="xs:unsignedByte"><xs:attribute ref="dt:dt" use="required" /></xs:extension></xs:simpleContent></xs:complexType>
          </xs:element><xs:element name="SUB"><xs:complexType><xs:simpleContent><xs:extension base="xs:unsignedByte"><xs:attribute ref="dt:dt" use="required" /></xs:extension></xs:simpleContent></xs:complexType></xs:element><xs:element name="CATALOG"><xs:complexType><xs:simpleContent><xs:extension base="xs:string"><xs:attribute ref="dt:dt" use="required" /></xs:extension></xs:simpleContent></xs:complexType></xs:element></xs:sequence></xs:complexType></xs:element></xs:sequence></xs:complexType></xs:element><xs:attribute name="dt" type="xs:string" /></xs:schema>

The attribute "dt" is a reference attribute. So the xml cannot be valid against any schema without the xmlns="ds" declaration as well.

Richard Hein
It won't let me paste the whole schema in ...
Richard Hein
Had to remove all whitespace from the schema to be able to have it show all of it.
Richard Hein
As the file is automatically generated by another program, is there any way to specify a schema via the C# program prior to reading the file into the dataset? The reason I ask is that I do not want to have to (unless it is easy to do so) have to modify the xml file to add a schema so that my software can read it; I am trying to automate this without human intervention!! :)Also, what is the easiest way to create a schema? Are there any wizards or software that can assist in this?Thanks for your help guys :)
You can use Visual Studio to create a schema from an XML document. If you have the XML file open in the editor, the toolbar has an XML menu. In the menu is the option to Create Schema. You will not be able to create a Schema without either removing the "dt:" bit, or adding the xmlns attributes as shown. If I were you, and I just have to import it, then I wouldn't bother with the schema as Marc mentioned. However, there is no way around the fact that you must edit the XML before loading it into your DataSet. It's invalid otherwise.
Richard Hein
You can of course add the xmlns attributes required to your XML at runtime, in memory..., so fixing this shouldn't be much of a problem. You just have to add 1 (or 2 if you need a schema) to your Catalog element. Also, you can simply copy my schema and use it (if the real XML is the same ... if it's different, generate one as suggested after modifying a sample of your XML, or just modify mine).
Richard Hein
I am really confused now...:( The first thing puzzling me is this:If I understand correcly I have two options: 1) modify the file so that the catalog contains this <Catalog xmlns="dt"xmlns:dt="dt"> (which can be done in memory or before hand) or 2) I use a schema and reference it from the xml file.The second thing is this: Using Auxons example above, I managed to get something into a dataset however it created four tables ITEM, SUB, QTY and CATALOG however these need to be columns in the REC table...Have I missed something obvious?Thanks again for your help.
You have 5 tables total including REC right? DataSet.ReadXml has a constructor that accepts an XmlReadMode. The default is Auto. Auto means that it will infer a schema if there is no schema. Yours doesn't have one. So it infers it. The default inferrence produces separate tables for each element. If you want to control how the dataset is made, you must define a schema. You can include the schema inside (inline) your XML or use DataSet.ReadXmlSchema to read it first from a separate file, then use ReadXml.
Richard Hein
What is the relationship between the root Catalog element and the child CATALOG? Is there any?
Richard Hein
Actually, you know what, your XML is just in horrible shape, if that's the way you are going to get it, don't even bother messing around, just parse each node and build up a DataSet the way you want, it'll take less time than it has already taken me to try to help you.
Richard Hein
A: 
DataSet ds = new DataSet("Whatev");
        DataTable catalog = ds.Tables.Add("Catalog");
        DataColumn recCol = catalog.Columns.Add("Rec");
        DataTable rec = ds.Tables.Add("Rec");

        rec.Columns.AddRange(new DataColumn[] { 
            new DataColumn("ITEM", typeof(string)), 
            new DataColumn("QTY", typeof(string)),
            new DataColumn("SUB", typeof(string)),
            new DataColumn("CATALOG", typeof(string))
        });            

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        foreach (XmlNode recNode in doc.GetElementsByTagName("Rec"))
        {
            DataRow row = rec.Rows.Add(
                recNode["ITEM"].InnerText,
                recNode["QTY"].InnerText,
                recNode["SUB"].InnerText,
                recNode["CATALOG"].InnerText);
        }

There ya go. Now there will be two tables, Catalog and Rec. I suspect you only want Rec though, because Catalog is useless. So just remove the catalog datatable code if that's the case, or add an id attribute each catalog row and link it to rec:

DataSet ds = new DataSet("Whatev");
        DataTable catalog = ds.Tables.Add("Catalog");
        DataColumn idCol = catalog.Columns.Add("Id");
        DataTable rec = ds.Tables.Add("Rec");

        rec.Columns.AddRange(new DataColumn[] { 
            new DataColumn("ITEM", typeof(string)), 
            new DataColumn("QTY", typeof(string)),
            new DataColumn("SUB", typeof(string)),
            new DataColumn("CATALOG", typeof(string))
        });

        catalog.ChildRelations.Add("catToRecRelation", idCol, rec.Columns["CATALOG"]);

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        foreach (XmlNode recNode in doc.GetElementsByTagName("Rec"))
        {
            // Create id in parent Catalog node, based on CATALOG value
            catalog.Rows.Add(recNode["CATALOG"].InnerText);

            DataRow row = rec.Rows.Add(
            recNode["ITEM"].InnerText,
            recNode["QTY"].InnerText,
            recNode["SUB"].InnerText,
            recNode["CATALOG"].InnerText);
        }

        var childRows = catalog.Rows[0].GetChildRows("catToRecRelation");
Richard Hein
A: 

I have asked a similar question and get a nice answer that helped me. You can take a look my SO

adopilot
A: 

Please help me here!!

I have this URL where there is XML data. I have to extract that data from URL and dump it into DW table. I am using SSIS Script Task for that.

This is how the data looks like:

-<currency>

 <csymbol>AED</csymbol>

 <cname>United Arab Emirates Dirhams</cname>

 <crate>3.6732001305</crate>

 <cinverse>0.2722421770</cinverse>

</currency>

−<currency>

 <csymbol>AFN</csymbol>

 <cname>Afghanistan Afghanis</cname>

 <crate>44.0000000000</crate>

 <cinverse>0.0227272727</cinverse>

</currency>

−<currency>

 <csymbol>ALL</csymbol>

 <cname>Albania Leke</cname>

 <crate>104.4100000001</crate>

 <cinverse>0.0095776267</cinverse>

</currency>

This is the code i'm using to load it into some Object type or something. But i dont know how to do that.

 public void Main()
    {
        String URLString = "http://www.xe.com/dfs/datafeed2.cgi?beeline";
        XmlDocument doc = new XmlDocument();
        XmlTextReader reader = new XmlTextReader(URLString);
        doc.Load(reader);

        XmlNodeList currencynodes = doc.SelectNodes("currency");
        foreach(XmlNode currency in currencynodes)
        {
            XmlNode csymbol = currency.SelectSingleNode("csymbol");
            string csymbolvalue = csymbol.Value;

            XmlNode cname = currency.SelectSingleNode("cname");
            string cnamevalue = cname.Value;

            XmlNode crate = currency.SelectSingleNode("crate");
            string cratevalue = crate.Value;

            XmlNode cinverse = currency.SelectSingleNode("cinverse");
            string cinversevalue = cinverse.Value;

            Dts.Variables["User::oCurrencyConversion"].Value = csymbol.Value;
        }
jack
You need to ask a new question and not post your question as an answer: http://stackoverflow.com/questions/ask
Martin Liversage
Oops!! Sorry. Here is the link to new question. http://stackoverflow.com/questions/3754297/writing-xml-data-from-url-to-object-in-ssis-script-task
jack