I'm parsing an XML document into my own structure but building it is very slow for large inputs is there a better way to do it?
public static DomTree<String> createTreeInstance(String path)
throws ParserConfigurationException, SAXException, IOException {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder db = docBuilderFactory.newDocumentBuilder();
File f = new File(path);
Document doc = db.parse(f);
Node node = doc.getDocumentElement();
DomTree<String> tree = new DomTree<String>(node);
return tree;
}
Here is my DomTree constructor:
/**
* Recursively builds a tree structure from a DOM object.
* @param root
*/
public DomTree(Node root){
node = root;
NodeList children = root.getChildNodes();
DomTree<String> child = null;
for(int i = 0; i < children.getLength(); i++){
child = new DomTree<String>(children.item(i));
if (children.item(i).getNodeType() != Node.TEXT_NODE){
super.children.add(child);
}
}
}
UPDATE:
I have benchmarked the createTreeInstance() method using a 100MB XML file:
- Creating docBuilderFactory... Done [3ms]
- Creating docBuilder... Done [21ms]
- parsing file... Done [5646ms]
- getDocumentElement... Done [1ms]
- creating DomTree... Done [17076ms]
UPDATE:
As John Doe suggests below it may be more appropriate to use SAX - I have never used SAX before, so is there a good way to convert what I have to using SAX?