I have a file (fasta file to be specific) that I would like to index, so that I can quickly locate any substring within the file and then find the location within the original fasta file.
This would be easy to do in many cases, using a Trie or substring array, unfortunately the strings I need to index are 800+ MBs which means that doing...
I enjoy developing algorithms using the STL, however, I have this recurring problem where my data sets are too large for the heap.
I have been searching for drop-in replacements for STL containers and algorithms which are disk-backed, i.e. the data structures on stored on disk rather than the heap.
A friend recently pointed me toward...
Hi,
I need to store large number of integers. There can be
duplicates in the input stream of integers, I just need
to store distinct amongst them.
I was using stl set initially but It went OutOfMem when
input number of integers went too high.
I am looking for some C++ container library which would
allow me to store numbers with the said...
Does anyone know where to find a B+Tree on-disk implementation? I went through google forward and backward and unfortunately I couldn't find anything sensible. Other threads have suggested to maybe take the tree from sqlite, sqljet or bdb but these trees are nested in the whole database and you can't really "just" filter out the B+Tree.
...