I need to use JDOM to generate XML files, which could be pretty big. I'm wondering how much additional memory JDOM needs beyond the data, mainly strings, that is already in memory. I wrote a simple program to test this, and the overhead turned out to be about twice the size of the XML content.

Does anybody know why JDOM needs that much additional memory, and whether there is a way to optimize it? Shouldn't JDOM objects just keep references to the existing strings?

Here is the program I used to test:

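import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// JDOM 1.x package names are assumed here; with JDOM 2 these are org.jdom2.*
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class TestJdomMemoryOverhead {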
    private static Runtime runtime = Runtime.getRuntime();

    public static void gc() {
        // Try to give the JVM some hints to run garbage collection
        for (int i = 0; i < 5; i++) {
            runtime.runFinalization();
            runtime.gc();
            Thread.yield();
        }
    }

    public static void generateXml(List<String> filenames) throws IOException {
        // generate a simple XML file by these file names. It looks like:
        // <?xml version="1.0" encoding="UTF-8"?>
        // <files>
        // <f n="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" />
        // <f n="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" />
        // <f n="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" />
        // ....
        // ....
        // </files>
        Element filesElem = new Element("files");
        Document doc = new Document(filesElem);
        for (String name : filenames) {
            Element fileElem = new Element("f");
            fileElem.setAttribute("n", name);
            filesElem.addContent(fileElem);
        }
        gc();
        System.out.println("After generating JDOM objects: " + (runtime.totalMemory() - runtime.freeMemory()) + " bytes");
        XMLOutputter outputter = new XMLOutputter(Format.getPrettyFormat());
        BufferedWriter writer = new BufferedWriter(new FileWriter("test.xml", false));
        outputter.output(doc, writer);
        writer.close();
        gc();
        System.out.println("After writing to XML file: " + (runtime.totalMemory() - runtime.freeMemory()) + " bytes");
    }

    public static void main(String[] cmdArgs) throws IOException {
        List<String> filenames = new ArrayList<String>();
        StringBuilder builder = new StringBuilder();
        // 30 Unicode characters, repeated 500,000 times. The memory to store
        // these file name strings should be about 30MB.
        builder.append("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        for (int i = 0; i < 500000; i++) {
            filenames.add(builder.toString());
        }
        gc();
        System.out.println("After generating file names: " + (runtime.totalMemory() - runtime.freeMemory()) + " bytes");
        generateXml(filenames);
        gc();
        System.out.println("Get back to main: " + (runtime.totalMemory() - runtime.freeMemory()) + " bytes");
    }
}

The output is:

After generating file names: 51941096 bytes
After generating JDOM objects: 125766824 bytes
After writing to XML file: 126036768 bytes
Get back to main: 51087440 bytes

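As you can see, the JDOM objects used about 70MB (125,766,824 - 51,941,096 = 73,825,728 bytes), which works out to roughly 148 bytes per <f> element.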

A: 

JDOM needs this much memory because it is a tree-based API, like DOM: the whole document tree is built in memory, as it is in your program. It does perform better than DOM, though. If you are creating large XML documents, you may want to consider a streaming API such as XMLStreamWriter, which is bundled with JDK 6.
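
For illustration, here is a minimal sketch of what the streaming approach could look like for the XML in the question, assuming the StAX API (javax.xml.stream) that ships with JDK 6 and later. The class name StreamingFileListWriter and the method writeFiles are made up for this example, and unlike Format.getPrettyFormat() the output is not indented. Each element is written out as soon as it is produced, so no document tree is kept in memory.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of JDOM or the JDK.
public class StreamingFileListWriter {

    // Writes <files><f n="..."/>...</files> directly to the given Writer,
    // one element at a time, without building a tree first.
    public static void writeFiles(Iterable<String> filenames, Writer out)
            throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        try {
            xml.writeStartDocument("UTF-8", "1.0");
            xml.writeStartElement("files");
            for (String name : filenames) {
                xml.writeEmptyElement("f");
                // Attribute values are escaped by the writer.
                xml.writeAttribute("n", name);
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.flush();
        } finally {
            // Closes the XMLStreamWriter only; the underlying Writer stays open.
            xml.close();
        }
    }

    public static void main(String[] args) throws IOException, XMLStreamException {
        // Small sample written to standard output.
        Writer out = new OutputStreamWriter(System.out, StandardCharsets.UTF_8);
        writeFiles(Arrays.asList("a.txt", "b.txt", "c.txt"), out);
        out.flush();
    }
}
```

To write to a file instead, pass a BufferedWriter wrapped around an OutputStreamWriter that uses UTF-8, so that the bytes on disk match the encoding declared in the XML header.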

Here's a short article on what JDOM isn't capable of

naikus