tags:

views:

93

answers:

3

What is the quickest (as in least effort, not super performance) way to convert 112K rows in XML to a Excel view.

A: 

Probably just read the XML in some high level language (JAVA, C#, etc. all have such facilities), write the file out as a .csv file, and then import it into excel using the Data->Import feature.

There may be better ways, this is one simple way though.

dcp
A: 

Why so complicated? Just open the File with File->Open choose xml and load it. See what'll happen.

Friedrich
+1  A: 

If you're using Excel 2007 and want to use XSLT, your best bet would probably be use the EXPath Zip Module features to modify an existing Excel .xslx file. An alternative would be to use a tool such as CoherentWeb (disclosure: this is developed by my company) that, as well as supporting EXPath Zip, also lets you modify .xslx files using standard XSLT.

My preferred option, however, would be to use a small Excel VBA Macro.

I've included sample code below for a VBA procedure called 'load' - this sample uses the XML DOM, so all 112K rows of your XML will be first loaded into memory, but if performance isn't an issue its simpler than the SAX alternative.

You would need to modify xpathToExtractRow to suit your XML input structure. There is also an assumption that the immediate child nodes of the XML row element contain the cell data you wish to import as text nodes, if not, you will need to use a SelectNode call to get the data you require.

Private dom As DOMDocument60

Public Sub load()

Dim nodeList As IXMLDOMNodeList
Dim nodeRow As IXMLDOMNode
Dim nodeCell As IXMLDOMNode
Dim rowCount As Integer
Dim cellCount As Integer
Dim rowRange As Range
Dim cellRange As Range
Dim sheet As Worksheet

Dim xpathToExtractRow As String
xpathToExtractRow = "/feed/row"

Set dom = New DOMDocument60
dom.load ("c:\test\source.xml")
Set sheet = ActiveSheet
Set nodeList = dom.SelectNodes(xpathToExtractRow)

rowCount = 0
For Each nodeRow In nodeList
    rowCount = rowCount + 1
    cellCount = 0
    For Each nodeCell In nodeRow.ChildNodes
        cellCount = cellCount + 1
        Set cellRange = sheet.Cells(rowCount, cellCount)
        cellRange.Value = nodeCell.Text
    Next nodeCell
Next nodeRow

End Sub

Sample Input XML:

<?xml version="1.0" encoding="utf-8"?>
<feed>
  <row>
    <firstname>joe</firstname>
    <lastname>smith</lastname>
    <country>jamaica</country>
  </row>
  <row>
    <firstname>bill</firstname>
    <lastname>coots</lastname>
    <country>uk</country>
  </row>
</feed>
pgfearo