I have a program that generates a variable amount of data that it has to store to use later. When should I choose to use malloc+realloc and when should I choose to use temporary files?
mmap(2,3p) (or file mappings) means never having to choose between the two.
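As a rough sketch of what that can look like on a POSIX system: the helper below backs a block of memory with an unnamed temporary file, so the data is addressed like ordinary memory while the kernel is free to page it to that file. The name map_scratch is made up for illustration, and using tmpfile() as the backing store is an assumption, not something this answer prescribes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns `size` bytes of file-backed scratch memory,
   or NULL on failure. Release it with munmap(ptr, size). */
void *map_scratch(size_t size)
{
    assert(size > 0);

    FILE *fp = tmpfile();               /* unnamed temporary file */
    if (fp == NULL)
        return NULL;

    int fd = fileno(fp);
    if (ftruncate(fd, (off_t)size) != 0) {  /* grow the file to the wanted size */
        fclose(fp);
        return NULL;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    fclose(fp);                         /* an established mapping outlives the descriptor */
    return p == MAP_FAILED ? NULL : p;
}
```

The returned pointer is used like a malloc'd block (p[0] = 'x'; and so on). Growing it later means another ftruncate plus a new mapping, which is where the extra bookkeeping comes in.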
In a modern OS, all the memory gets paged out to disk if needed anyway, so feel free to malloc() anything up to a couple of gigabytes.
Prefer a temporary file if you need/want it to be visible to other processes, and malloc/realloc if not. Also consider the amount of data compared to your address space and virtual memory: will the data consume too much swap space if left in memory? Also consider how good a fit the respective usage is for your application: file read/write etc. can be a pain compared to memory access... memory mapped files make it easier, but you may need custom library support to do dynamic memory allocation within them.
If you know the maximum size, it's not too big and you only need one copy, you should use a static buffer, allocated at program load time:
char buffer[1000];
int buffSizeUsed;
If any of those pre-conditions are false and you only need the information while the program is running, use malloc:
char *buffer = malloc (actualSize);
Just make sure you check that the allocations work and that you free whatever you allocate.
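For the variable-size case in the question, a minimal sketch of the malloc/realloc pattern with those checks might look like this. GrowBuf, growbuf_append and growbuf_free are hypothetical names, and the doubling growth policy with a 64-byte starting size is just one common choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; start from a zeroed struct. */
typedef struct {
    char  *data;      /* heap storage, NULL until the first append */
    size_t used;      /* bytes currently stored */
    size_t capacity;  /* bytes currently allocated */
} GrowBuf;

/* Append n bytes, growing the allocation as needed.
   Returns 0 on success, -1 on failure (the buffer is left unchanged). */
int growbuf_append(GrowBuf *b, const void *src, size_t n)
{
    assert(b != NULL && (src != NULL || n == 0));

    if (n > SIZE_MAX - b->used)
        return -1;                      /* total size would overflow size_t */

    size_t needed = b->used + n;
    if (needed > b->capacity) {
        size_t newcap = b->capacity ? b->capacity : 64;
        while (newcap < needed) {
            if (newcap > SIZE_MAX / 2) {  /* doubling would overflow */
                newcap = needed;
                break;
            }
            newcap *= 2;
        }
        /* Keep the old pointer until realloc is known to have worked. */
        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->capacity = newcap;
    }

    if (n > 0)
        memcpy(b->data + b->used, src, n);
    b->used = needed;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
void growbuf_free(GrowBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->used = 0;
    b->capacity = 0;
}
```

Assigning the result of realloc to a temporary matters: writing b->data = realloc(b->data, ...) directly would leak the original block if the call fails.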
If the information has to survive the termination of your program or be usable from other programs at the same time, it'll need to go into a file (or long-lived shared memory if you have that capability).
And, if it's too big to fit into your address space at once, you'll need to store it in a file and read it in a bit at a time.
That's basically going from the easiest/least-flexible to the hardest/most-flexible possibilities.
Where your requirements lie along that line is a decision you need to make.
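For the last case, where the data lives in a file and is read in a bit at a time, the shape of the code is roughly the following. This is only a sketch: sum_bytes_chunked is a made-up example task, and the 4096-byte window is an arbitrary choice.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example: add up every byte in a stream while holding
   only one fixed-size chunk in memory at a time.
   Returns 0 on success, -1 on a read error. */
int sum_bytes_chunked(FILE *fp, unsigned long *sum_out)
{
    unsigned char chunk[4096];
    unsigned long sum = 0;
    size_t got;

    assert(fp != NULL && sum_out != NULL);

    /* fread returns 0 at end of file or on error; ferror tells them apart. */
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        for (size_t i = 0; i < got; i++)
            sum += chunk[i];
    }
    if (ferror(fp))
        return -1;

    *sum_out = sum;
    return 0;
}
```

Memory use stays constant no matter how large the file is, at the cost of having to structure the processing around sequential passes.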
On a 32-bit system, you won't be able to malloc() more than 2GB or 3GB or so. The big advantage of files is that they are limited only by disk size. Even with a 64-bit system, it's unusual to be able to allocate more than 8GB or 16GB because there are usually limits on how large the swap file can grow.
Use temporary files if the size of your data is larger than the virtual address space of your target system (2-3 GB on 32-bit hosts) or if it's at least big enough that it would put serious resource strain on the system.
Otherwise use malloc.
If you go the route of temporary files, use the tmpfile function to create them, since on good systems they will never have names in the filesystem and have no chance of getting left around if your program terminates abnormally. Most people do not like the kind of temp file cruft that Microsoft Office products tend to leave all over the place. ;-)
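A small sketch of that, assuming the data is first built in memory and then spilled out. spill_to_tmpfile is a hypothetical helper name; tmpfile, fwrite, rewind and fclose are standard C.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy n bytes into a temporary file created by tmpfile().
   Returns the stream, positioned at the start and ready for reading,
   or NULL on failure. The file is removed automatically on fclose(). */
FILE *spill_to_tmpfile(const void *data, size_t n)
{
    assert(data != NULL || n == 0);

    FILE *fp = tmpfile();               /* opened for update in binary mode */
    if (fp == NULL)
        return NULL;

    if (n > 0 && fwrite(data, 1, n, fp) != n) {
        fclose(fp);                     /* closing also discards the file */
        return NULL;
    }

    rewind(fp);                         /* required before switching from writing to reading */
    return fp;
}
```

The caller reads the data back with fread and calls fclose when done; there is no file name to track and nothing to unlink.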
Use RAM for data that is private and lives only as long as a single process. Use a temp file if the data needs to persist beyond a single process.