Assuming you're using SQL Server, you should parse the XML into DataTable
s and use the SqlBulkCopy
object to shoot them into the database super-fast. There are lots of resources to help you learn about SqlBulkCopy. Here's a recent discussion from another StackOverflow question to get you started: http://stackoverflow.com/questions/3358797/sql-server-2008-tuning-with-large-transactions-700k-rows-transaction/3358867#3358867
If the XML file is really large, you should be careful what sort of parser you use. XDocument and XmlDocument load the whole thing into memory. If the files are small enough, say under 10MB, you should be fine using those parsers.
EDIT:
Here's a quick mock-up of how you could get the XML into DataTables. It's in VB since VB makes XML a tad easier.
Option Strict On : Option Explicit On : Option Infer On : Option Compare Binary
Imports System.Data
Imports System.Linq
Imports System.Xml.Linq
Module Module1
Sub Main()
Dim xml =
<Trip>
<TripDetails id="1">
<DepartureDate/>
<ReturnDate/>
<TripTypeA>3</TripTypeA>
<TripTypeB>1</TripTypeB>
<PurposeOfTrip>vacation</PurposeOfTrip>
<Region>5</Region>
<Countries>
<Country>105</Country>
<Country>135</Country>
</Countries>
<Cities>
<City>Cancun</City>
<City>Tokyo</City>
<City>Mayo</City>
</Cities>
<OverallRating>4</OverallRating>
<Suppliers>53</Suppliers>
<SuppliersComments>Good flight</SuppliersComments>
<Transport>
<TransportType>1</TransportType>
<TransportType>3</TransportType>
</Transport>
<TransportComment>Transportation was fast</TransportComment>
</TripDetails>
</Trip>
Dim dtTripDetails As New DataTable()
With dtTripDetails.Columns
.Add("TripID", GetType(Integer))
.Add("TripTypeA", GetType(Integer))
.Add("DepartureDate", GetType(DateTime))
.Add("TransportComment", GetType(String))
End With
Dim dtTripDetailXrefCountries As New DataTable()
With dtTripDetailXrefCountries.Columns
.Add("TripID", GetType(Integer))
.Add("CountryID", GetType(Integer))
End With
Dim xdetails = From td In xml.Descendants("TripDetails") Select td
For Each xdetailRecord As XElement In xdetails
Dim tripID As Integer = CInt(xdetailRecord.Attribute("id").Value)
Dim tripTypeA As Integer = CInt(xdetailRecord.Element("TripTypeA").Value)
Dim strDepDate As String = xdetailRecord.Element("DepartureDate").Value
Dim depDate As Object = If(String.IsNullOrEmpty(strDepDate), CType(DBNull.Value, Object), CType(DateTime.Parse(strDepDate), Object))
Dim transportComment As String = xdetailRecord.Element("TransportComment").Value
dtTripDetails.Rows.Add(tripID, tripTypeA, depDate, transportComment)
Dim xcountries = From c In xdetailRecord.Element("Countries").Elements("Country") Select c
For Each xcountryRecord As XElement In xcountries
Dim countryID As Integer = CInt(xcountryRecord.Value)
dtTripDetailXrefCountries.Rows.Add(tripID, countryID)
Next
Next
Console.WriteLine("Done")
Console.ReadKey(True)
End Sub
End Module
BTW - when doing this kind of ETL, it's best to pump your data into staging tables first rather than directly into your production tables. That way, you can validate data types and ensure referential integrity and handle key management and get everything perfectly situated without locking up or polluting your production tables.