I am a big fan of using Apache Commons Digester to load XML files into my object model.
I am dealing with large files that contain many duplicates (event logs), and would therefore like to String.intern() the strings for specific attributes (the ones that frequently repeat).
Since Digester parses the whole file before returning control, it first creates many duplicate strings that eat up a lot of memory. I could iterate over all my objects afterwards and intern the strings, but by then I have already paid the peak memory cost.
Another alternative is to have the corresponding setProperty bean function in my object model always intern its parameter, but I call the same function from my own code with already-interned strings, so that would be wasteful; besides, I don't want to introduce Digester-specific code into my model.
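To make that second alternative concrete, here is a minimal sketch of what I mean. The LogEvent class and its source property are hypothetical names used only for illustration; my real model differs.

```java
// Hypothetical bean standing in for one of my model classes.
final class LogEvent {
    private String source;

    public String getSource() {
        return source;
    }

    // Setter that always interns its argument. Every caller pays for the
    // intern lookup, including callers that pass an already-interned string.
    public void setSource(String source) {
        this.source = (source == null) ? null : source.intern();
    }
}

class InternDemo {
    public static void main(String[] args) {
        LogEvent a = new LogEvent();
        LogEvent b = new LogEvent();

        // new String(...) forces two distinct objects, much like a parser
        // creating a fresh string for every attribute value it reads.
        a.setSource(new String("auth-service"));
        b.setSource(new String("auth-service"));

        // Both events now share one canonical String instance.
        System.out.println(a.getSource() == b.getSource()); // prints "true"
    }
}
```

This removes the duplicates as the values are set, but it puts the interning policy inside the model, which is exactly what I would like to avoid.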
Is there a way to get Digester to intern or execute custom code before/after setting properties?