I have a large XML file (1 GB). I need to run many queries against it (using XPath, for example). Each result is only a small part of the XML. I want the queries to be as fast as possible, but the 1 GB file is probably too large to hold in working memory.
The XML looks something like this:
<all>
<record>
<id>1</id>
... lots of fields. (The fields differ greatly from record to record, and sometimes
include subrecords, so mapping onto a relational database would be hard.)
</record>
<record>
<id>2</id>
... lots of fields.
</record>
... lots and lots and lots of records
</all>
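For concreteness, this is roughly the lookup I want to run, written naively against the standard javax.xml.xpath API (the file name is made up). It works, but it builds a full in-memory DOM, which is exactly what won't scale to 1 GB:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class NaiveLookup {
    public static void main(String[] args) throws Exception {
        // Parses the whole document into memory; this is the step that fails at 1 GB.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse("records.xml"); // made-up file name
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Select one record by its id; other fields could serve as keys too.
        Node record = (Node) xpath.evaluate(
                "/all/record[id='2']", doc, XPathConstants.NODE);
        System.out.println(record == null ? "not found" : record.getTextContent());
    }
}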
I need random access, selecting records by using, for instance, the id as a key. (The id is the most important key, but other fields might be used as keys too.) I don't know the queries in advance; they arrive and have to be executed ASAP, in real time rather than in batches. SAX does not look very promising, because I don't want to reread the entire file for every query. But DOM doesn't look very promising either: the file is very large, and the structural overhead DOM adds almost certainly means it won't fit in working memory.
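To make the middle ground concrete: what I imagine is one streaming pass that builds an id-to-byte-offset index, after which each query seeks straight to its record. Below is a rough sketch of that idea using StAX and RandomAccessFile. The file name is made up, and the offset bookkeeping is dubious (StAX's getCharacterOffset() is implementation-dependent and only matches byte offsets for single-byte encodings), which is exactly why I'm hoping a library already solves this properly:

import java.io.FileInputStream;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class OffsetIndex {
    public static void main(String[] args) throws Exception {
        String file = "records.xml"; // made-up file name

        // Pass 1: stream once with StAX, remembering where each <record>
        // starts, keyed by the text of the first <id> inside it.
        Map<String, Long> offsetById = new HashMap<>();
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new FileInputStream(file));
        long recordStart = -1;
        boolean idCaptured = true;
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                if ("record".equals(r.getLocalName())) {
                    // Caveat: exactly where this offset points is
                    // implementation-dependent, and character offsets only
                    // equal byte offsets for single-byte encodings.
                    recordStart = r.getLocation().getCharacterOffset();
                    idCaptured = false;
                } else if ("id".equals(r.getLocalName()) && !idCaptured) {
                    offsetById.put(r.getElementText(), recordStart);
                    idCaptured = true; // ignore ids of nested subrecords
                }
            }
        }
        r.close();

        // Per query: seek straight to the record instead of rereading the file.
        Long pos = offsetById.get("2");
        if (pos != null) {
            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                raf.seek(pos);
                byte[] buf = new byte[4096];
                int n = raf.read(buf); // real code would scan on to </record>
                System.out.println(new String(buf, 0, n, "US-ASCII"));
            }
        }
    }
}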
Which Java library or approach would be best for handling this problem?