I have written a converter that takes OpenStreetMap XML files and converts them to a binary runtime rendering format that is typically about 10% of the original size. Input files are typically 3 GB and larger. The input is never loaded into memory all at once; instead it is streamed while points and polygons are collected, then a BSP is built over them and the result is written out. Recently, on larger files, the program runs out of memory and dies (the current offender has 14 million points and 1 million polygons). It is typically using about 1 to 1.2 GB of RAM when this happens. I've tried increasing virtual memory from 2 to 8 GB (on XP), but the change had no effect. Also, since this code is open source, I would like it to work regardless of the available RAM (albeit more slowly); it runs on Windows, Linux, and Mac.
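For context, the collection phase currently keeps every parsed point resident until the BSP step. One direction I've considered is spilling them straight to a scratch file and re-reading them in bounded batches, roughly like this (untested sketch; `Point` is a stand-in for my real record):

```cpp
// Rough idea: append points to a scratch file as they are parsed instead of
// keeping them all in a std::vector, then read them back in bounded batches.
#include <cstddef>
#include <cstdio>
#include <vector>

struct Point { double x, y; };   // placeholder for my real point record

// Collection phase: write each parsed point straight to disk.
void spill_point(std::FILE* scratch, const Point& p)
{
    std::fwrite(&p, sizeof p, 1, scratch);
}

// Processing phase: stream the points back in fixed-size batches so memory
// use stays roughly constant no matter how large the input was.
void for_each_batch(std::FILE* scratch, std::size_t batch_size)
{
    std::fseek(scratch, 0, SEEK_SET);
    std::vector<Point> batch(batch_size);
    std::size_t n;
    while ((n = std::fread(batch.data(), sizeof(Point), batch_size, scratch)) > 0) {
        // ... hand batch[0..n) to the BSP build / merge step ...
    }
}
```

But I'm unsure how well a BSP build tolerates being fed the data in pieces like this.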
What techniques can I use to avoid running out of memory? Processing the data in smaller subsets and then merging the final results? Rolling my own virtual-memory-style handler (see the sketch below)? Any other ideas?
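For the "own virtual memory" option, the direction I was imagining is backing the big arrays with a scratch file and mapping it into the address space, so the OS does the paging. A rough, untested sketch of the cross-platform part (`map_scratch` and the error handling are placeholders):

```cpp
// Hypothetical disk-backed buffer: maps a scratch file read/write so the OS
// pages it in and out on demand instead of it living in the heap.
#include <cstddef>
#ifdef _WIN32
#  include <windows.h>
#else
#  include <fcntl.h>
#  include <sys/mman.h>
#  include <unistd.h>
#endif

// Map `bytes` of the scratch file at `path` read/write; returns NULL on failure.
void* map_scratch(const char* path, std::size_t bytes)
{
#ifdef _WIN32
    HANDLE file = CreateFileA(path, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_TEMPORARY, NULL);
    if (file == INVALID_HANDLE_VALUE) return NULL;
    HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READWRITE,
                                        (DWORD)((unsigned long long)bytes >> 32),
                                        (DWORD)(bytes & 0xffffffffu), NULL);
    if (!mapping) { CloseHandle(file); return NULL; }
    return MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, bytes);
#else
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return NULL;
    if (ftruncate(fd, (off_t)bytes) != 0) { close(fd); return NULL; }
    void* p = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping keeps its own reference to the file
    return p == MAP_FAILED ? NULL : p;
#endif
}
```

My worry with this is that in a 32-bit process the address space itself is the limit (about 2 GB of user space on XP regardless of the pagefile size), so presumably I would have to map a sliding window rather than the whole file at once. Is that the right way to think about it?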