What do you use the boolean variables for? To keep track of nesting?
I recently implemented this by using an enum for every element.
The code is at work but this is a rough approximation of it off the top of my head:
enum Element {
    // Special markers:
    ROOT,
    DONT_CARE,
    // Element      tag                 parents
    RootElement(    "root",             ROOT),
    AnElement(      "anelement"),       // DONT_CARE
    AnotherElement( "anotherelement"),  // DONT_CARE
    AChild(         "child",            AnElement),
    AnotherChild(   "child",            AnotherElement);

    // Read by the parser below; set by the constructors.
    final String tag;
    final Element[] parents;

    Element() {...}
    Element(String tag, Element... parents) {...}
}
class MySaxParser extends DefaultHandler {
    Map<Pair<Element, String>, Element> elementMap = buildElementMap();
    LinkedList<Element> nestingStack = new LinkedList<Element>();

    public void startElement(String namespaceURI, String sName, String qName, Attributes attrs) {
        // The parent is the innermost open element; ROOT marks the document level.
        Element parent = nestingStack.isEmpty() ? Element.ROOT : nestingStack.getLast();
        Element element = elementMap.get(pair(parent, sName));
        if (element == null)
            element = elementMap.get(pair(Element.DONT_CARE, sName));
        if (element == null)
            throw new IllegalStateException("I did not expect <" + sName + "> in this context");
        nestingStack.addLast(element);
        switch (element) {
            case RootElement: ... // Probably don't need cases for many elements at start unless we have attributes
            case AnElement: ...
            case AnotherElement: ...
            case AChild: ...
            case AnotherChild: ...
            default: // Most cases here. Generally nothing to do on startElement
        }
    }

    public void endElement(String namespaceURI, String sName, String qName) {
        // Similar to startElement() but most switch cases do something with the data.
        Element element = nestingStack.removeLast();
        if (!element.tag.equals(sName))
            throw new IllegalStateException("Mismatched end tag </" + sName + ">");
        switch (element) {
            ...
        }
    }

    // Construct the structure map from the parent information.
    private Map<Pair<Element, String>, Element> buildElementMap() {
        Map<Pair<Element, String>, Element> result = new LinkedHashMap<Pair<Element, String>, Element>();
        for (Element element : Element.values()) {
            if (element.tag == null) continue; // skip the ROOT and DONT_CARE markers
            if (element.parents.length == 0)
                result.put(pair(Element.DONT_CARE, element.tag), element);
            else for (Element parent : element.parents) {
                result.put(pair(parent, element.tag), element);
            }
        }
        return result;
    }

    // Convenience method to avoid the need for using "new Pair()" with verbose type parameters
    private <A, B> Pair<A, B> pair(A a, B b) {
        return new Pair<A, B>(a, b);
    }

    // A simple Pair class, just for completeness. Better to use an existing implementation.
    private static class Pair<A, B> {
        final A a;
        final B b;
        Pair(A a, B b) { this.a = a; this.b = b; }
        public boolean equals(Object o) {...}
        public int hashCode() {...}
    }
}
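The Pair class above leaves equals and hashCode elided, but the map lookup depends on them: a key built fresh in startElement must compare equal to the key stored by buildElementMap. Here is a minimal sketch of one way to fill them in, assuming component-wise comparison through java.util.Objects. PairDemo and the String-typed keys are hypothetical and exist only to show the behaviour.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-alone version of the two-part map key.
final class Pair<A, B> {
    final A a;
    final B b;

    Pair(A a, B b) {
        this.a = a;
        this.b = b;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Pair)) return false;
        Pair<?, ?> other = (Pair<?, ?>) o;
        // Two pairs are equal when both components are equal (null-safe).
        return Objects.equals(a, other.a) && Objects.equals(b, other.b);
    }

    @Override
    public int hashCode() {
        // Must agree with equals: derived from the same two components.
        return Objects.hash(a, b);
    }
}

class PairDemo {
    public static void main(String[] args) {
        Map<Pair<String, String>, String> map = new HashMap<>();
        map.put(new Pair<>("anelement", "child"), "AChild");
        // A newly constructed key with equal components finds the entry.
        System.out.println(map.get(new Pair<>("anelement", "child")));
        // The same tag under a different parent does not.
        System.out.println(map.get(new Pair<>("anotherelement", "child")));
    }
}
```

Without these two overrides, every lookup would fall back to object identity and return null.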
Edit:
The position within the XML structure is tracked by a stack of elements. When startElement is called, the appropriate Element enum constant is looked up in a Map generated from the parent information declared in the Element enum. The lookup key has two parts: 1) the parent element from the top of the tracking stack and 2) the element tag passed as the sName parameter. The Pair class is simply a holder for this two-part key.
This approach allows the same element tag, appearing in different parts of the XML structure with different semantics, to be represented by different Element enum constants. For example:
<root>
    <anelement>
        <child>Data pertaining to child of anelement</child>
    </anelement>
    <anotherelement>
        <child>Data pertaining to child of anotherelement</child>
    </anotherelement>
</root>
Using this technique, we don't need flags to track the context in order to know which <child> element is being processed. The context is declared as part of the Element enum definition, which reduces confusion by eliminating assorted state variables.
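To make this concrete, here is a small runnable sketch of the same idea applied to a shortened version of the XML above. It is an approximation under some assumptions, not the original work code. It keys on qName because the default SAXParserFactory is not namespace-aware and leaves the local name empty. It uses AbstractMap.SimpleImmutableEntry as an existing pair implementation in place of a hand-written Pair. The names ContextHandler, ContextDemo and the seen list are hypothetical.

```java
import java.io.StringReader;
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

enum Element {
    ROOT,       // marker: parent of the document element
    DONT_CARE,  // marker: the element may appear under any parent
    RootElement("root", ROOT),
    AnElement("anelement"),
    AnotherElement("anotherelement"),
    AChild("child", AnElement),
    AnotherChild("child", AnotherElement);

    final String tag;
    final Element[] parents;

    Element() { this.tag = null; this.parents = new Element[0]; }
    Element(String tag, Element... parents) { this.tag = tag; this.parents = parents; }
}

class ContextHandler extends DefaultHandler {
    private final Map<Map.Entry<Element, String>, Element> elementMap = buildElementMap();
    private final Deque<Element> nestingStack = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    final List<String> seen = new ArrayList<>(); // hypothetical sink for the results

    // Two-part key: (parent element, tag name).
    private static Map.Entry<Element, String> key(Element parent, String tag) {
        return new AbstractMap.SimpleImmutableEntry<>(parent, tag);
    }

    private static Map<Map.Entry<Element, String>, Element> buildElementMap() {
        Map<Map.Entry<Element, String>, Element> result = new HashMap<>();
        for (Element e : Element.values()) {
            if (e.tag == null) continue; // skip the markers
            if (e.parents.length == 0) result.put(key(Element.DONT_CARE, e.tag), e);
            else for (Element p : e.parents) result.put(key(p, e.tag), e);
        }
        return result;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        Element parent = nestingStack.isEmpty() ? Element.ROOT : nestingStack.peekLast();
        Element element = elementMap.get(key(parent, qName));
        if (element == null) element = elementMap.get(key(Element.DONT_CARE, qName));
        if (element == null)
            throw new IllegalStateException("I did not expect <" + qName + "> in this context");
        nestingStack.addLast(element);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        Element element = nestingStack.removeLast();
        switch (element) {
            case AChild:
            case AnotherChild:
                // The enum constant alone says which <child> this is.
                seen.add(element + ": " + text.toString().trim());
                break;
            default:
                break;
        }
    }
}

class ContextDemo {
    static List<String> parse(String xml) {
        ContextHandler handler = new ContextHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.seen;
    }

    public static void main(String[] args) {
        String xml = "<root><anelement><child>first</child></anelement>"
                   + "<anotherelement><child>second</child></anotherelement></root>";
        parse(xml).forEach(System.out::println);
    }
}
```

Running main prints "AChild: first" and then "AnotherChild: second": both elements carry the tag child, yet each resolves to its own constant with no boolean flags involved.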