I'm writing a straightforward C program on Linux and want to use an existing library's API, which expects its data to come from a file. I must pass it a file name as a const char*. But I already have the data, identical to the contents of such a file, sitting in a buffer allocated on the heap. There is plenty of RAM and we want high performance. I'd like to avoid writing a temporary file to disk, so what is a good way to feed the data to this API in a way that looks like a file?

Here's a cheap pretend version of my code:

marvelouslibrary.h:

int marvelousfunction(const char *filename);

normal-persons-usage.cpp, for which library was originally designed:

#include "marvelouslibrary.h"
int somefunction(char *somefilename)
{
    return marvelousfunction(somefilename);
}

myprogram.cpp:

#include "marvelouslibrary.h"
int one_of_my_routines() 
{
    unsigned char* stuff = new unsigned char[1000000];
    // fill stuff[] with...stuff!
    // stuff[] holds same bytes as might be found in a file

    /* magic goes here: make filename referring to stuff[] */

    return marvelousfunction( ??? );
}

To be clear, the marvelouslibrary does not offer any API functions that accept data by pointer; it can only read a file.

I thought of pipes and mkfifo(), but those seem meant for communicating between processes. I am no expert at these things. Does a named pipe work okay when it is read and written in the same process? Is this a wise approach?

Or skip being clever, go with plan "B" which is to shuddup and just write a temp file. However, I'd like to learn something new and find out what's possible in this situation, besides getting high performance.

A: 

mmap(), perhaps?

smcameron
well, there's two mentions of mmap() - will look at it.
DarenW
This would require more explanation. I don't see how mmap() could solve this problem.
bortzmeyer
+2  A: 

I'm not sure what kind of input the library function wants ... does it need a path/file name, or open file pointer, or open file descriptor?

If you don't want to hack the library and the function wants a string (path to a file), try making the temporary file in /dev/shm, which on Linux is normally a tmpfs mount that lives in RAM rather than on disk.

Otherwise, mmap might be the best option. Please be sure to research posix_madvise() when using mmap() (or its counterpart posix_fadvise() if using a temporary file).

It looks like you're talking about very little data to begin with, so I don't think you'll see a performance impact whichever route you take.

Edit

Sorry, I just re-read your question; perhaps I read it too fast the first time. There is no way you are going to feed a function like:

char * foo(const char *filepath)

... with mmap().

If you cannot modify the library to accept a file descriptor instead of (or as an alternative to) the path, just use /dev/shm and a temporary file. It will be quite cheap.
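A minimal sketch of the /dev/shm route, assuming /dev/shm exists and is writable on your system. The body of marvelousfunction() here is only a stand-in that counts bytes so the example is complete, and feed_via_shm() is a made-up helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Stand-in for the real library call (hypothetical body):
   returns the number of bytes in the file, or -1 on error. */
static int marvelousfunction(const char *filename)
{
    FILE *f = fopen(filename, "rb");
    if (!f) return -1;
    int count = 0;
    while (fgetc(f) != EOF) count++;
    fclose(f);
    return count;
}

/* Copy buf into a uniquely named file under /dev/shm, pass that
   name to the library, then remove the file again. */
static int feed_via_shm(const void *buf, size_t len)
{
    char path[] = "/dev/shm/marvelous-XXXXXX";
    int fd = mkstemp(path);            /* creates and opens a unique file */
    if (fd < 0) return -1;

    const char *p = buf;
    size_t left = len;
    while (left > 0) {                 /* write() may write less than asked */
        ssize_t n = write(fd, p, left);
        if (n <= 0) { close(fd); unlink(path); return -1; }
        p += n;
        left -= (size_t)n;
    }
    close(fd);

    int result = marvelousfunction(path);
    unlink(path);                      /* clean up the temporary file */
    return result;
}
```

In the question's code this would be called as feed_via_shm(stuff, 1000000). The data is still copied once, but the copy goes into memory-backed storage rather than to a disk.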

Tim Post
the data is megabytes in size.
DarenW
If you can, go with mmap()
Tim Post
ok, reading about mmap() - how to use it in this case? relevant web pages or book references?
DarenW
Can you update your question to include a prototype of the library function? Is it asking for a pathname, file pointer or file descriptor?
Tim Post
I, myself, do not understand how to use mmap() in such a case. mmap() works in the opposite direction, from a file descriptor to an address. I do not see how to use it to go from an address to a file name.
bortzmeyer
That's why I'm asking the OP what his function wants, and whether he can edit it. If not, I suggested /dev/shm.
Tim Post
I'm not quite clear if the function in question (in the library) wants a file name, pointer or descriptor.
Tim Post
code put in. yes, mmap does work backward to what i think i want. what i would like is to have no file involved, but the library can only get its data from a file, or some file-like entity such as a fifo, device, etc.
DarenW
+3  A: 

Given that you likely have a function like:

char *read_data(const char *fileName)

I think you will need to "skip being clever, go with plan "B" which is to shuddup and just write a temp file."

If you can dig around and find out that the call you are making in turn calls another function that takes a FILE * or an int file descriptor, then you can do something better.
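As an illustration of "something better", here is a sketch that assumes the library turned out to have an entry point taking a FILE *. That function, called marvelousfunction_stream() here, is purely hypothetical (the question says no such API is offered), and its body is a stand-in that counts bytes. fmemopen() is a real POSIX call that wraps a memory buffer in a stdio stream:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical FILE*-based entry point; stand-in body that
   returns the number of bytes readable from the stream. */
static int marvelousfunction_stream(FILE *f)
{
    int count = 0;
    while (fgetc(f) != EOF) count++;
    return count;
}

/* Wrap the in-memory buffer in a read-only stdio stream and pass
   it on. No file and no extra copy of the data is involved. */
static int feed_via_fmemopen(void *buf, size_t len)
{
    FILE *f = fmemopen(buf, len, "r");
    if (!f) return -1;
    int result = marvelousfunction_stream(f);
    fclose(f);
    return result;
}
```

This only helps if such a stream-based function really exists inside the library; it does nothing for an API that insists on a file name.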

One thought that does come to mind: can you change your code to write to a memory-mapped file instead of to the heap? That way you would already have a file on disk and would avoid the copying (though it'll still be on disk), and you can still give the function call the file name.
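A rough sketch of that idea, assuming you control where the buffer is allocated. map_new_file() and fill_and_feed() are made-up helper names, the memset() stands in for whatever fills stuff[], and marvelousfunction() has a stand-in body that counts bytes:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Stand-in for the real library call (hypothetical body):
   returns the number of bytes in the file, or -1 on error. */
static int marvelousfunction(const char *filename)
{
    FILE *f = fopen(filename, "rb");
    if (!f) return -1;
    int count = 0;
    while (fgetc(f) != EOF) count++;
    fclose(f);
    return count;
}

/* Create a file of len bytes and map it read/write. Stores through
   the returned pointer become the file's contents (MAP_SHARED). */
static void *map_new_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return NULL;
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); unlink(path); return NULL; }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                         /* the mapping stays valid after close */
    if (p == MAP_FAILED) { unlink(path); return NULL; }
    return p;
}

/* Use the mapping in place of the heap buffer, then pass the name. */
static int fill_and_feed(const char *path, size_t len)
{
    unsigned char *stuff = map_new_file(path, len);
    if (!stuff) return -1;
    memset(stuff, 'x', len);           /* fill stuff[] with...stuff! */
    munmap(stuff, len);                /* done writing; drop the mapping */
    int result = marvelousfunction(path);
    unlink(path);
    return result;
}
```

The path is the caller's choice. Pointing it at a memory-backed filesystem instead of a disk would combine this with the /dev/shm suggestion from the other answer.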

TofuBeer
so far, while it goes against my goal of learning something marvelous and new, this does let me get on with work and be productive.
DarenW
The "marvelous and new" thing you learned is that when you are making an API like that, you should provide a filename version AND a FILE * or file descriptor version too.
TofuBeer
A: 

Edit: Sorry. Just read the question. With my advice below, you fork a spare process, so the question of "does it work in a single process" does not come up. I also see no reason you couldn't spawn a separate thread to do the push...


Not in the least elegant, but you could:

  1. open a named pipe.
  2. fork a streamer that does nothing but try to write to the pipe
  3. pass the name of the pipe

which should be pretty robust...
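The three steps above can be sketched roughly as follows. This assumes the library only reads the file sequentially: a pipe cannot seek, so a library that calls fseek() or asks for the file size will not be happy. feed_via_fifo() is a made-up helper name and marvelousfunction() has a stand-in body that counts bytes:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the real library call (hypothetical body):
   returns the number of bytes in the file, or -1 on error. */
static int marvelousfunction(const char *filename)
{
    FILE *f = fopen(filename, "rb");
    if (!f) return -1;
    int count = 0;
    while (fgetc(f) != EOF) count++;
    fclose(f);
    return count;
}

static int feed_via_fifo(const void *buf, size_t len)
{
    /* 1. Make a named pipe inside a private temporary directory. */
    char dir[] = "/tmp/marvelous-XXXXXX";
    char path[64];
    if (!mkdtemp(dir)) return -1;
    snprintf(path, sizeof path, "%s/fifo", dir);
    if (mkfifo(path, 0600) != 0) { rmdir(dir); return -1; }

    /* 2. Fork a streamer that does nothing but write to the pipe. */
    pid_t pid = fork();
    if (pid < 0) { unlink(path); rmdir(dir); return -1; }
    if (pid == 0) {
        int fd = open(path, O_WRONLY); /* blocks until a reader opens */
        if (fd < 0) _exit(1);
        const char *p = buf;
        size_t left = len;
        while (left > 0) {
            ssize_t n = write(fd, p, left);
            if (n <= 0) _exit(1);
            p += n;
            left -= (size_t)n;
        }
        close(fd);                     /* reader sees end-of-file */
        _exit(0);
    }

    /* 3. Pass the name of the pipe to the library. */
    int result = marvelousfunction(path);

    if (result < 0) kill(pid, SIGTERM); /* streamer may still be blocked */
    waitpid(pid, NULL, 0);
    unlink(path);
    rmdir(dir);
    return result;
}
```

The forked child gets its own copy-on-write view of the buffer, so the parent needs no extra copy. A writer thread in place of fork() would follow the same shape.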

dmckee
A: 

You're on Linux, can't you just grab the source of the library and hack in the function you need? If it's useful to others, you could even send a patch to the original author, so it will be in future versions for everyone.

Paul Betts
yes, it would be nice to hack the library, but i'll have to put that on my to-do list for next week. not sure it's open source, though. even if _this_ time i can hack onward, next time i may be dealing with some proprietary junk for some client.
DarenW