I want to parse a file that is similar to an HTML file. It's not exactly an HTML file: it can contain some user-defined tags. I don't know in advance how the tags are nested in one another. The tags may also have attributes. I think I should use a SAX parser. Does Java have a built-in SAX parser? Can I call a function each time I encounter a tag?
A:
SAX was originally Java only, so yes, Java has a built-in SAX parser: http://java.sun.com/j2se/1.4.2/docs/api/javax/xml/parsers/SAXParser.html . This will only work if your document is well-formed XML (every tag closed and properly nested), which HTML-like files often are not.
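To illustrate the "call a function when I encounter each tag" part of the question, here is a minimal sketch, assuming the input is well-formed. The class name SaxTagDemo, the collectEvents helper, and the sample markup are made up for illustration; only the javax.xml.parsers and org.xml.sax calls are the real API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records one line per opening and closing tag.
class SaxTagDemo {

    static List<String> collectEvents(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        // DefaultHandler supplies no-op callbacks; override only what is needed.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Called once per opening tag, whatever its name or nesting depth.
                StringBuilder sb = new StringBuilder("start " + qName);
                for (int i = 0; i < attrs.getLength(); i++) {
                    sb.append(' ').append(attrs.getQName(i))
                      .append('=').append(attrs.getValue(i));
                }
                events.add(sb.toString());
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Called once per closing tag (also for self-closing tags).
                events.add("end " + qName);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // Throws SAXParseException if the markup is not well-formed.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        // Sample input with a user-defined tag; assumed for illustration.
        String doc = "<page><mytag id=\"1\">hello</mytag><br/></page>";
        for (String event : collectEvents(doc)) {
            System.out.println(event);
        }
    }
}
```

The handler never needs to know the tag names or nesting in advance, which matches the situation described in the question. Feeding it markup with an unclosed tag makes parse() throw instead of calling further callbacks.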
stevedbrown
2009-08-28 15:18:19
+2
A:
I think you should use StAX instead, which is faster and easier to use than SAX. It's part of Java SE 6.
gustafc
2009-08-28 15:25:27
I disagree with it being easier to use. startElement() in SAX essentially passes you a map of attributes. With StAX you otherwise have to write a more complicated piece of code to derive this information.
cletus
2009-11-10 07:32:27
On the other hand, StAX lets you parse XML documents with a simple recursive descent parser where the call stack matches the element stack. Using SAX you'd have to write a state machine, which requires a lot more boilerplate and which at least I consider a lot harder to get right than a util method reading the attributes from a StAX cursor into a map.
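A rough sketch of what that looks like, using the cursor API (XMLStreamReader). The class name StaxDescentDemo and the methods attributes, parseElement, and outline are hypothetical names chosen for this example, and the output format is just an indented outline:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: recursive descent over a StAX cursor.
class StaxDescentDemo {

    // Util method: copy the attributes of the current START_ELEMENT into a map.
    static Map<String, String> attributes(XMLStreamReader r) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i < r.getAttributeCount(); i++) {
            map.put(r.getAttributeLocalName(i), r.getAttributeValue(i));
        }
        return map;
    }

    // Precondition: cursor is on a START_ELEMENT.
    // Postcondition: cursor is on the matching END_ELEMENT.
    static void parseElement(XMLStreamReader r, int depth, StringBuilder out)
            throws XMLStreamException {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(r.getLocalName()).append(' ').append(attributes(r)).append('\n');

        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                // A child element: recurse, so the call stack mirrors the element stack.
                parseElement(r, depth + 1, out);
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                return; // our own closing tag
            }
        }
    }

    static String outline(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder out = new StringBuilder();
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                parseElement(r, 0, out);
            }
        }
        r.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample input with user-defined tags; assumed for illustration.
        System.out.print(outline("<page><mytag id=\"1\"><inner/></mytag><other/></page>"));
    }
}
```

No explicit state variable is needed: the nesting depth lives in the recursion itself, and the attribute map is the small util method mentioned above.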
gustafc
2009-11-10 08:28:09
+3
A:
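Use the following packages: java.io, javax.xml.parsers, org.xml.sax, org.xml.sax.helpers.
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser parser = spf.newSAXParser();
XMLReader reader = parser.getXMLReader();
reader.setContentHandler(new MyContentHandler());
// Use the XMLReader to parse the entire file.
// Note: InputSource(String) treats its argument as a system identifier (URI).
InputSource is = new InputSource(filename);
reader.parse(is);
// DefaultHandler implements every ContentHandler method as a no-op,
// so only the callbacks of interest need to be overridden.
class MyContentHandler extends DefaultHandler {
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Called once for each opening tag; atts holds the tag's attributes.
    }
}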
adatapost
2009-08-28 15:27:24