I have a very large (2.5GB, 55 million node) XML file with the following format:

<TopNode>
    <Item id="Something">
        <Link>A link</Link>
        <Link>Another link</Link>
        <Link>One More Link</Link>
    </Item>
    <Item id="Something else">
        <Link>Some link</Link>
        <Link>You get the idea</Link>
    </Item>
</TopNode>

I want to flatten this into the following SQL table:

 -----------------------------------------
 |  Item          |          Link        |
 -----------------------------------------
 | Something      |  A link              |
 | Something      |  Another link        |
 | Something      |  One More Link       |
 | Something else |  Some link           |
 | Something else |  You get the idea    |
 -----------------------------------------

I'm using SQL Server 2008, if that makes a difference.

What's the simplest, most efficient way (preferably using the SQL Server/.NET stack) to get from point A to point B, keeping in mind the size of the file involved?

A: 

Take a look at Oslo/M.

Josh
+5  A: 

I would use XML Bulk Load. It's a nice approach because it streams the document rather than reading it all in at once. It's also quite fast, and it meets your requirement of sticking with a SQL Server-based tool.
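
To make the streaming idea concrete, here is a minimal Python sketch. It is not XML Bulk Load itself, just an illustration of the same principle using the standard library's `xml.etree.ElementTree.iterparse`. The names `iter_item_links`, `batches`, and `SAMPLE` are made up for this example, and the actual database insert is left out; each batch would be handed to whatever bulk-insert mechanism you use.

```python
import io
import xml.etree.ElementTree as ET
from itertools import islice

# Small stand-in for the 2.5GB file, same shape as the XML in the question.
SAMPLE = b"""<TopNode>
    <Item id="Something">
        <Link>A link</Link>
        <Link>Another link</Link>
        <Link>One More Link</Link>
    </Item>
    <Item id="Something else">
        <Link>Some link</Link>
        <Link>You get the idea</Link>
    </Item>
</TopNode>"""


def iter_item_links(source):
    """Yield (item_id, link_text) pairs while parsing incrementally."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    item_id = None
    for event, elem in context:
        if event == "start" and elem.tag == "Item":
            # Attributes are already available on the "start" event.
            item_id = elem.get("id")
        elif event == "end" and elem.tag == "Link":
            yield item_id, elem.text
        elif event == "end" and elem.tag == "Item":
            # Drop finished items so memory use stays flat.
            root.clear()


def batches(rows, size):
    """Group an iterable of rows into lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# For a real file, pass the path or an open binary file instead of BytesIO.
rows = list(iter_item_links(io.BytesIO(SAMPLE)))
```

Clearing the root after each `Item` is what keeps this workable on a very large file: only one item's worth of nodes is held in memory at a time.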

Mat Nadrofsky
Is the XML Bulk Load tool still available for SQL2008? All of the references I've seen to it are circa SQL2000...
Michael Dorfman
You betcha. http://technet.microsoft.com/en-us/library/ms171769.aspx
Mat Nadrofsky
You might need to make sure it's actually installed. http://technet.microsoft.com/en-us/library/cc645615.aspx
Mat Nadrofsky
Here's an example of how to actually use BulkLoad: http://support.microsoft.com/kb/316005
dilbert789