tags:

views:

122

answers:

3
+3  Q: 

XML-to-SQL Mapping

We're looking for a way to parse large amounts of XML files that conform to a rigid schema (specifically this one). All we want to do is create 1-to-1 data mappings between several database tables and the XML file, so we can set up a trigger on the table where our web service software package inserts the file, then automatically parses it and inserts the data into the database as we have defined.

What're the best ways to accomplish this within the realm of Microsoft's SQL Server and similar?

A: 

Sounds like what you want to do is write an XSL transform that renders the XML into a format usable by SQLServer's bulk import tool. (I do mostly Oracle so I don't know what's available on the SQLServer side)

Another option would be to transform the XML into SQL 'insert' statements and run the result as a SQL script.

Whichever approach you choose, it will probably be important to consider transaction boundaries so that errors or invalid data in the input don't lead to inconsistencies in the database.

Jim Garrison
+2  A: 

You can use SQLXML Bulk load 4.0 to bulk load your XML file into the SQL Server tables.
SQLXML bulkload object uses the XML data file and Schema file.

SCHEMA (XSD or XML) file contains the mapping information between the XML node to database column name.

set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkload.4.0")
objBL.ConnectionString = "Connection String"
objBL.ErrorLogFile = "c:\error.log"
objBL.Execute "SampleSchema.xml", "SampleXMLData.xml"

http://msdn.microsoft.com/en-us/library/ms171878.aspx

http://msdn.microsoft.com/en-us/library/ms171806.aspx

Following is the sample data file and schema file.

Data File

<ROOT>  
  <Order OrderID="11" CustomerID="ALFKI">
    <Product ProductID="11" ProductName="Chai" />
    <Product ProductID="22" ProductName="Chang" />
  </Order>
  <Order OrderID="22" CustomerID="ANATR">
     <Product ProductID="33" ProductName="Aniseed Syrup" />
    <Product ProductID="44" ProductName="Gumbo Mix" />
  </Order>
</ROOT>

Schema Definition

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
<xsd:annotation>
  <xsd:appinfo>
    <sql:relationship name="OrderOD"
          parent="Ord"
          parent-key="OrderID"
          child="OrderDetail"
          child-key="OrderID" />

    <sql:relationship name="ODProduct"
          parent="OrderDetail"
          parent-key="ProductID"
          child="Product"
          child-key="ProductID" 
          inverse="true"/>
  </xsd:appinfo>
</xsd:annotation>

  <xsd:element name="Order" sql:relation="Ord" 
                            sql:key-fields="OrderID" >
   <xsd:complexType>
     <xsd:sequence>
        <xsd:element name="Product" sql:relation="Product" 
                     sql:key-fields="ProductID"
                     sql:relationship="OrderOD ODProduct">
          <xsd:complexType>
             <xsd:attribute name="ProductID" type="xsd:int" />
             <xsd:attribute name="ProductName" type="xsd:string" />
          </xsd:complexType>
        </xsd:element>
     </xsd:sequence>
        <xsd:attribute name="OrderID"   type="xsd:integer" /> 
        <xsd:attribute name="CustomerID"   type="xsd:string" />
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Rasik Jain
SQLXMLBulkLoad is known to cause severe performance issues in scenarios of processing large amount of xml files with relatively little data in it. The reason being it creates lot's of temporary staging tables for processing each file internally!!!. To avoid this you should merge some xml files at a time and bulkload them. Also make sure that all the files exist on your local drive.
A_Var
+1  A: 

You could consider using SSIS to create "ETL package" with XML data source(s). Then you execute the package. Inside the package you can specify schema, mappings etc. SSIS does have "For Each File In Directory" container etc..

Damir Sudarevic