What is the quickest (as in least effort, not super performance) way to convert 112K rows in XML to a Excel view.
Probably just read the XML in some high level language (JAVA, C#, etc. all have such facilities), write the file out as a .csv file, and then import it into excel using the Data->Import feature.
There may be better ways, this is one simple way though.
Why so complicated? Just open the File with File->Open choose xml and load it. See what'll happen.
If you're using Excel 2007 and want to use XSLT, your best bet would probably be use the EXPath Zip Module features to modify an existing Excel .xslx file. An alternative would be to use a tool such as CoherentWeb (disclosure: this is developed by my company) that, as well as supporting EXPath Zip, also lets you modify .xslx files using standard XSLT.
My preferred option, however, would be to use a small Excel VBA Macro.
I've included sample code below for a VBA procedure called 'load' - this sample uses the XML DOM, so all 112K rows of your XML will be first loaded into memory, but if performance isn't an issue its simpler than the SAX alternative.
You would need to modify xpathToExtractRow
to suit your XML input structure. There is also an assumption that the immediate child nodes of the XML row element contain the cell data you wish to import as text nodes, if not, you will need to use a SelectNode
call to get the data you require.
Private dom As DOMDocument60
Public Sub load()
Dim nodeList As IXMLDOMNodeList
Dim nodeRow As IXMLDOMNode
Dim nodeCell As IXMLDOMNode
Dim rowCount As Integer
Dim cellCount As Integer
Dim rowRange As Range
Dim cellRange As Range
Dim sheet As Worksheet
Dim xpathToExtractRow As String
xpathToExtractRow = "/feed/row"
Set dom = New DOMDocument60
dom.load ("c:\test\source.xml")
Set sheet = ActiveSheet
Set nodeList = dom.SelectNodes(xpathToExtractRow)
rowCount = 0
For Each nodeRow In nodeList
rowCount = rowCount + 1
cellCount = 0
For Each nodeCell In nodeRow.ChildNodes
cellCount = cellCount + 1
Set cellRange = sheet.Cells(rowCount, cellCount)
cellRange.Value = nodeCell.Text
Next nodeCell
Next nodeRow
End Sub
Sample Input XML:
<?xml version="1.0" encoding="utf-8"?>
<feed>
<row>
<firstname>joe</firstname>
<lastname>smith</lastname>
<country>jamaica</country>
</row>
<row>
<firstname>bill</firstname>
<lastname>coots</lastname>
<country>uk</country>
</row>
</feed>