Use file-only APIs in memory

+3 Q:

Some APIs only support output to files. For example, a library that converts a BMP to PNG might offer only a Save(file) option and no in-memory function. Disk IO is slow, though, and sometimes you just want in-memory operations.

Is there a generic solution to such a problem? Maybe a fake in-memory file of sorts that would allow one to use the library, yet not pay the performance penalty of disk IO?

+1 A:

Typically the OS interface for "temporary files" (e.g. tmpfile() / tmpnam()) actually creates storage inside the disk cache, so that the operations go to memory and not to disk (up to a certain limit). It's not a perfect solution, in that it relies on the OS rather than explicitly creating a file-like buffer inside the process space, but it's probably the easiest one.

tmpnam() is the generic C stdlib interface, but various OSes may have their own methods of doing what you want more precisely. For example, Windows has GetTempFileName().
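
As a rough sketch of the temporary-file route for a library that only takes a filename: save_to_file below is a hypothetical stand-in for the library call and capture_output is an illustrative helper, neither comes from a real API. It asks tmpnam() for a name, lets the "library" write there, reads the bytes back and deletes the file. Whether the data ever reaches the physical disk is still up to the OS.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a library call that can only write to a named file.
void save_to_file(const char* path, const std::string& payload) {
    assert(path != nullptr);
    std::ofstream out(path, std::ios::binary);
    out << payload;
}

// Let the file-only API write to a temporary name, then pull the bytes back
// into memory and delete the file.
std::vector<char> capture_output(const std::string& payload) {
    char name[L_tmpnam];
    if (std::tmpnam(name) == nullptr) return {};  // no unique name available
    // Caveat: another process could claim the name before it is opened here.

    save_to_file(name, payload);

    std::vector<char> bytes;
    {
        std::ifstream in(name, std::ios::binary);
        bytes.assign(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
    }  // the input stream is closed before the file is removed
    std::remove(name);
    return bytes;
}
```

Calling capture_output("some bytes") should hand the same bytes back in a vector. Some toolchains warn about tmpnam() because of the race noted in the comment.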

Crashworks 2009-06-28 08:45:52

These libraries often just accept a filename as input, not an ostream... So you're saying temp files are as fast as memory writing?

Assaf Lavie 2009-06-28 08:47:12

Assaf, it depends on your OS, configuration, and how much RAM you have. Set up correctly, you can avoid the physical disk completely. There will still be some filesystem overhead.

Matthew Flaschen 2009-06-28 08:57:44

Use tmpnam() if you need a filename rather than a file handle.

Crashworks 2009-06-28 09:01:35

There's no general answer to this, but as you tagged your question C++, you should remember that the in-memory stringstream classes, declared in the <sstream> standard header, provide the same interface to the world as do the fstreams, when it comes to reading and writing.
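
A small sketch of what that buys you, assuming the code that produces the output is written against std::ostream (write_image and encode_to_memory are made-up names for illustration, and the "IMG" header is not a real format):

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical encoder written against std::ostream rather than a file name.
// It emits a tiny text header followed by the raw pixel bytes.
void write_image(std::ostream& out, int width, int height,
                 const std::string& pixels) {
    assert(width >= 0 && height >= 0);
    out << "IMG " << width << ' ' << height << '\n';
    out.write(pixels.data(), static_cast<std::streamsize>(pixels.size()));
}

// The same function fills an in-memory buffer when handed a
// std::ostringstream; handing it a std::ofstream would write a file instead.
std::string encode_to_memory(int width, int height, const std::string& pixels) {
    std::ostringstream buffer;
    write_image(buffer, width, height, pixels);
    return buffer.str();
}
```

This only helps when the library exposes a stream-based entry point; it does nothing for an API that insists on opening the file itself.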

anon 2009-06-28 08:49:25

I think he's asking about APIs that only take a file name, and then open the file themselves.

jalf 2009-06-28 11:57:10

+1 A:

These libraries often just accept a filename as input, not an ostream

In this case, though not really being a programming solution, you could set up a ram disk.

VolkerK 2009-06-28 09:00:14

True. Yet I hardly think I can justify setting up a RAM disk on the customer's machine just so some utility function would not write to disk.

Assaf Lavie 2009-06-28 12:28:06

If the library lets you provide your own implementation of the file access functions (through function pointers or an interface class; a lot do, you'll just need to hunt around the header files a bit), then you should be able to provide an in-memory solution without too much trouble. There are two approaches I usually take, depending on the memory requirements of the system I'm working with.

The first is to allocate all the memory for the file up front in your implementation of the Open() call. You then just need to memcpy() into the buffer in your Write() calls and update how far through the buffer you are. Finally, in your implementation of Close(), simply write the file to disk using whatever IO function your platform provides. The advantage of this approach is that it's easy to implement; the disadvantage is that memory usage can be unpredictable if you don't know how large the final file will be. Will you need a 1 KB buffer or a 10 MB buffer? Do you have enough memory for the whole file?
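
A minimal sketch of this first approach, assuming the library accepts Open/Write/Close style callbacks. MemoryFile, mem_open, mem_write and mem_close are hypothetical names, and a real library will dictate its own signatures. Here Close() hands the bytes back to the caller instead of flushing them to disk, which is the purely in-memory variant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical in-memory "file" behind the Open/Write/Close callbacks.
struct MemoryFile {
    std::vector<unsigned char> storage;  // allocated once, in mem_open
    std::size_t used = 0;                // how far through the buffer we are
};

// Open(): allocate the whole buffer up front. capacity is a guess at the
// final file size.
MemoryFile* mem_open(std::size_t capacity) {
    MemoryFile* f = new MemoryFile;
    f->storage.resize(capacity);
    return f;
}

// Write(): memcpy into the buffer and advance the position. Returns the
// number of bytes accepted; 0 means nothing was written (empty request or
// the preallocated buffer is too small).
std::size_t mem_write(MemoryFile* f, const void* data, std::size_t size) {
    assert(f != nullptr);
    if (size == 0 || size > f->storage.size() - f->used) return 0;
    std::memcpy(f->storage.data() + f->used, data, size);
    f->used += size;
    return size;
}

// Close(): return the bytes written so far and free the handle.
std::vector<unsigned char> mem_close(MemoryFile* f) {
    assert(f != nullptr);
    std::vector<unsigned char> bytes(
        f->storage.begin(),
        f->storage.begin() + static_cast<std::ptrdiff_t>(f->used));
    delete f;
    return bytes;
}
```

Rejecting a write that does not fit keeps the sketch short; growing the buffer instead would trade the fixed memory bound for convenience.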

The second approach can avoid the memory problems of the above implementation, but is only really useful if the system isn't already providing buffered IO. This approach is to use a single buffer of a fixed size (e.g. 32 KB) and use your Write() implementation to fill the buffer as before. However, every time you reach the 32 KB limit, you write the buffer to disk and empty it, ready to be filled again. Your Close() implementation just needs to write any remaining data to disk. The size of the buffer you need will depend on the system you're using, so you may have to experiment a bit to find an optimal size.
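
The fixed-buffer variant might look roughly like this. BufferedFile and the buf_* functions are hypothetical names, and stdio stands in for "whatever IO function your platform provides". Since stdio already buffers, treat this purely as an illustration of the bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

constexpr std::size_t kCapacity = 32 * 1024;  // 32 KB as in the text; tune it

// Hypothetical fixed-size write buffer in front of a file.
struct BufferedFile {
    std::FILE* file;
    std::size_t used;
    unsigned char buffer[kCapacity];
};

// Write the buffered bytes to disk and empty the buffer.
bool flush_buffer(BufferedFile* f) {
    if (f->used == 0) return true;
    bool ok = std::fwrite(f->buffer, 1, f->used, f->file) == f->used;
    f->used = 0;
    return ok;
}

// Open(): open the target file and start with an empty buffer.
BufferedFile* buf_open(const char* path) {
    std::FILE* file = std::fopen(path, "wb");
    if (file == nullptr) return nullptr;
    BufferedFile* f = new BufferedFile;
    f->file = file;
    f->used = 0;
    return f;
}

// Write(): fill the buffer, flushing every time it becomes full.
bool buf_write(BufferedFile* f, const void* data, std::size_t size) {
    assert(f != nullptr);
    const unsigned char* src = static_cast<const unsigned char*>(data);
    while (size > 0) {
        std::size_t room = kCapacity - f->used;
        std::size_t chunk = size < room ? size : room;
        std::memcpy(f->buffer + f->used, src, chunk);
        f->used += chunk;
        src += chunk;
        size -= chunk;
        if (f->used == kCapacity && !flush_buffer(f)) return false;
    }
    return true;
}

// Close(): write whatever is left, then release the file and the handle.
bool buf_close(BufferedFile* f) {
    assert(f != nullptr);
    bool ok = flush_buffer(f);
    ok = (std::fclose(f->file) == 0) && ok;
    delete f;
    return ok;
}
```

Memory use stays at one buffer regardless of the final file size, at the cost of still touching the disk.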

If the library needs seek access to the file, it will be trivial to add to the "all in memory" solution, but a bit trickier to add to the buffered solution. Not impossible though, so it's still worth considering if memory overheads are an issue for you.

Midpoint 2009-06-28 09:13:59

+2 A:

You can hook the file I/O APIs (using Detours or N-CodeHook, for example) and route them to your own implementation (which will use memory instead).

Here is a walkthrough by someone who has done something similar. I'm sure that somewhere there is a full implementation that will do this for you, but I could not find one.

Shay Erlichmen 2009-06-28 10:02:15

Yay, +1 for Detours. You have to do some heavy testing before you declare your software "stable", but IIRC it was built for internal use at Microsoft to add DCOM features to existing, older applications (?)

VolkerK 2009-06-28 16:59:54

+3 A:

Use named pipes.

Similar constructs exist for both Windows and Unix (and this).

But I don't believe it is worth the effort of setting up all those constructs. Choose an alternative library or just write to disk if you can.

J-16 SDiZ 2009-06-28 10:43:10

I wish I could. But I'm stuck with this lib, and writing to disk is really too slow. Would a named pipe work with any library that uses stdio to write files?

Assaf Lavie 2009-06-28 12:31:07

Yes, for win32, fopen/fwrite/... work as long as the library accepts a path like \\.\pipe\nameOfPipe, and you can even have multiple clients connecting to the same named pipe. On *nix it's a file/node like any other, so yes, open/write/... will work, too.
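
For the *nix side, a rough POSIX-only sketch: library_save is a hypothetical stand-in for a stdio-based library that only takes a path, and save_via_fifo is an illustrative helper. It takes a single-process shortcut that only works while the output fits in the kernel's pipe buffer; anything larger needs a thread or process reading while the library writes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical stand-in for a library that only accepts a file name and
// writes through stdio.
bool library_save(const char* path, const std::string& payload) {
    assert(path != nullptr);
    std::FILE* out = std::fopen(path, "wb");
    if (out == nullptr) return false;
    bool ok = std::fwrite(payload.data(), 1, payload.size(), out) == payload.size();
    return (std::fclose(out) == 0) && ok;
}

// Create a FIFO, let the library write into it, and read the bytes back.
bool save_via_fifo(const char* fifo_path, const std::string& payload,
                   std::string& result) {
    if (mkfifo(fifo_path, 0600) != 0) return false;
    // A non-blocking open of the read end returns immediately, so the
    // library's fopen() for writing finds a reader and does not block.
    int fd = open(fifo_path, O_RDONLY | O_NONBLOCK);
    bool ok = fd >= 0 && library_save(fifo_path, payload);
    if (ok) {
        char chunk[4096];
        ssize_t n;
        // The writer has closed, so read() drains the data and then
        // reports end of file with 0.
        while ((n = read(fd, chunk, sizeof chunk)) > 0) {
            result.append(chunk, static_cast<std::size_t>(n));
        }
        ok = (n == 0);
    }
    if (fd >= 0) close(fd);
    unlink(fifo_path);
    return ok;
}
```

The Windows equivalent would go through CreateNamedPipe and a \\.\pipe\ path, and is not shown here.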

VolkerK 2009-06-28 17:08:22

My generic solution is "find another API". It doesn't always work, but for a lot of tasks, it's possible. It is certainly possible to find a PNG -> BMP converter that can work in memory.

jalf 2009-06-28 11:58:20
