I have a relatively general question regarding SAX. I understand how it works, and based on tutorials I've read, I've learned to keep state by having a ton of boolean data members like inNode and then, in each event handler, checking each boolean and handling the parameters accordingly.

To me, this seems really inefficient. Is there a more efficient way, or is that just the nature of SAX?

Thanks, Chris

A: 

This is how SAX works. It was designed for low memory usage and simpler processing. If your code becomes too complex, you might want to use DOM instead.

Zed
Fair enough, thanks!
Chris Thompson
+2  A: 

Often, you can keep state by having a simple stack of tags.

When you enter a node, you push.

When you leave a node, you pop.

Sometimes this is better than a lot of booleans. Instead, you examine the stack to see if the correct context is in place to preserve the data being parsed.
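
As a rough illustration of the push/pop idea, here is a minimal sketch using Python's standard xml.sax module. The class name TitleCollector, the sample document, and the choice to keep only title elements directly inside book are assumptions made up for this example, not something from the question.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of <title> elements that sit directly inside <book>."""

    def __init__(self):
        super().__init__()
        self.stack = []    # names of the currently open elements
        self.titles = []   # collected results
        self._buf = []     # text pieces of the element being captured

    def _in_book_title(self):
        # The context check is one list comparison instead of several booleans.
        return self.stack[-2:] == ["book", "title"]

    def startElement(self, name, attrs):
        self.stack.append(name)          # entering a node: push
        if self._in_book_title():
            self._buf = []

    def characters(self, content):
        # characters() may be called several times for one text node.
        if self._in_book_title():
            self._buf.append(content)

    def endElement(self, name):
        if self._in_book_title():
            self.titles.append("".join(self._buf))
        self.stack.pop()                 # leaving a node: pop


# Hypothetical sample input: the top-level <title> must be ignored.
doc = b"<library><title>Catalog</title><book><title>SAX Basics</title></book></library>"
handler = TitleCollector()
xml.sax.parseString(doc, handler)
print(handler.titles)
```

The only state is the stack itself, so adding another context to watch for means adding another comparison, not another flag to set and clear.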

S.Lott
Ah, that's really creative. I think it would make the code a lot cleaner too, although you would probably still have to have a bunch of if/else blocks, right? Unless you got really creative and had some sort of handler architecture stored in a hashmap with the key as the node name...
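
One way the hashmap idea could look, sketched in Python with the standard xml.sax module. The names DispatchingHandler, on_end, _store_name and _store_email, and the sample person document, are hypothetical and chosen only for illustration.

```python
import xml.sax


class DispatchingHandler(xml.sax.ContentHandler):
    """Route each closing element to a callback looked up by element name."""

    def __init__(self):
        super().__init__()
        self.text = []     # text pieces of the element currently being read
        self.record = {}   # values extracted so far
        # element name -> method called with the element's text when it closes
        self.on_end = {
            "name": self._store_name,
            "email": self._store_email,
        }

    def startElement(self, name, attrs):
        self.text = []

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        callback = self.on_end.get(name)   # a dict lookup replaces if/else chains
        if callback is not None:
            callback("".join(self.text).strip())
        self.text = []

    def _store_name(self, value):
        self.record["name"] = value

    def _store_email(self, value):
        self.record["email"] = value


# Hypothetical sample input.
doc = b"<person><name>Ann</name><email>ann@example.com</email></person>"
handler = DispatchingHandler()
xml.sax.parseString(doc, handler)
print(handler.record)
```

Keying on the bare element name ignores where the element sits in the document, so this only works as-is when names are unambiguous.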
Chris Thompson
Rarely do you have a "bunch" of if/else blocks. Usually you're doing an XPath-like matching of the current context to see if you want to preserve it. Since the context is a stack (a list in Python), the comparison is trivial. In other languages, comparing a stack against a template pattern might be a little harder, but it's more like regular expression or XPath matching than anything else.
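
A small sketch of what comparing the stack against a template pattern might look like in Python. The helper name matches and its tiny pattern syntax (a leading / for an exact path, a leading // for a suffix match, * for any single tag) are assumptions invented for this example. It is only loosely XPath-like and is not real XPath.

```python
def matches(stack, pattern):
    """Return True if the stack of open tag names matches the pattern.

    "/a/b"  matches only when the stack is exactly ["a", "b"].
    "//a/b" matches when the stack ends with ["a", "b"].
    "*" as a step matches any single tag name.
    """
    anchored = not pattern.startswith("//")
    steps = pattern.strip("/").split("/")
    if anchored and len(stack) != len(steps):
        return False
    if len(stack) < len(steps):
        return False
    # Compare the pattern steps against the innermost part of the stack.
    tail = stack[len(stack) - len(steps):]
    return all(step == "*" or step == tag for step, tag in zip(steps, tail))


stack = ["library", "book", "title"]
print(matches(stack, "/library/book/title"))   # exact path
print(matches(stack, "//book/title"))          # suffix of the path
print(matches(stack, "/book/title"))           # anchored, wrong depth
```

Inside a SAX handler, a check like matches(self.stack, "//book/title") would stand in for a group of boolean flags.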
S.Lott
It's always better than a bunch of booleans - actually, I find it impossible to imagine using SAX without a stack of some sort.
anon