Win32's CreateFile has FILE_FLAG_DELETE_ON_CLOSE, but I'm on Linux.

I want to open a temporary file which will always be deleted upon program termination. I could understand that in the case of a program crash it may not be practical to guarantee this, but in any other case I'd like it to work.

I know about RAII. I know about signals. I know about atexit(3). I know I can open the file and delete it immediately and the file will remain accessible until the file descriptor is closed (which even handles a crash). None of these seem like a complete and straightforward solution:

  1. RAII: been there, done that: I have an object whose destructor deletes the file, but the destructor is not called if the program is terminated by a signal. (A minimal sketch of such a guard follows this list.)
  2. signals: I'm writing a low-level library which makes registering a signal handler a tricky proposition. For example, what if the application uses signals itself? I don't want to step on any toes. I might consider some clever use of sigaction(2) to cope...but haven't put enough thought into this possibility yet.
  3. atexit(3): apparently useless, since it isn't called during abnormal termination (e.g. via a signal).
  4. preemptive unlink(2): this is pretty good except that I need the file to remain visible in the filesystem (otherwise the system is harder to monitor/troubleshoot).
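For concreteness, a minimal sketch of the guard from item 1 (the class name is invented for illustration, and unlink(2) stands in for whatever actually removes the resource):

    #include <string>
    #include <unistd.h>   // unlink()

    // Removes the named file when the guard goes out of scope.  The catch,
    // as noted above, is that the destructor never runs if the process is
    // killed by a signal.
    class FileGuard {
    public:
        explicit FileGuard(std::string path) : path_(std::move(path)) {}
        ~FileGuard() { ::unlink(path_.c_str()); }
        FileGuard(const FileGuard&) = delete;
        FileGuard& operator=(const FileGuard&) = delete;
    private:
        std::string path_;
    };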

What would you do here?

Further Explanation

I elided one detail in my original post which I now realize I should have included. The "file" in this case is not strictly a normal file, but rather a POSIX message queue. I create it via mq_open(). It can be closed via mq_close() or close() (the former is an alias for the latter on my system) and removed from the system via mq_unlink(). All of this makes it analogous to a regular file, except that I cannot choose the directory in which the file resides. This makes the currently most popular answer (placing the file in /tmp) unworkable, because the "file" is created by the system in a virtual filesystem with very limited capacity. (I've mounted the virtual filesystem at /dev/mqueue, following the example in man mq_overview.)

This also explains why I need the name to remain visible (making the immediate-unlink approach unworkable): the "file" must be shared between two or more processes.
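For reference, the lifecycle I'm dealing with looks roughly like this (the queue name is a placeholder; link with -lrt on older glibc):

    #include <mqueue.h>   // mq_open, mq_close, mq_unlink
    #include <fcntl.h>    // O_* flags
    #include <cstdio>

    int main() {
        // The name must start with '/'; the kernel decides where it appears
        // (under /dev/mqueue once that filesystem is mounted).
        const char* name = "/myqueue";   // placeholder

        mqd_t q = mq_open(name, O_CREAT | O_RDWR, 0600, nullptr);  // default attributes
        if (q == (mqd_t)-1) { std::perror("mq_open"); return 1; }

        // ... other processes can now mq_open("/myqueue") by name ...

        mq_close(q);      // closes this descriptor; the name stays visible
        mq_unlink(name);  // removes the name -- this is the step I need guaranteed
        return 0;
    }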

+6  A: 

If you're just making a temporary file, create it in /tmp or a subdirectory thereof, then make a best effort to remove it when done, through atexit(3) or similar. As long as you use unique names picked through mkstemp(3) or similar, you don't risk reading it again on subsequent runs even if it fails to be deleted because of a program crash.

At that point it's just a system-level problem of keeping /tmp clean. Most distros wipe it on boot or shutdown, or run a regular cronjob to delete old files.
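A minimal sketch of that approach, assuming a regular file (the path template and names are illustrative only):

    #include <cstdlib>    // atexit, mkstemp
    #include <unistd.h>   // close, unlink

    static char g_tmp_path[] = "/tmp/myapp-XXXXXX";  // template; mkstemp fills in the Xs

    static void cleanup_tmp() {
        unlink(g_tmp_path);            // best effort; ignore errors
    }

    int main() {
        int fd = mkstemp(g_tmp_path);  // creates a uniquely named file in /tmp
        if (fd == -1) return 1;
        std::atexit(cleanup_tmp);      // runs on normal exit, but not on fatal signals
        // ... use fd ...
        close(fd);
        return 0;
    }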

Kamil Kisiel
This is no fault of yours, but I can't use this solution. Please see my "Further Explanation" in the question. I cannot choose the path for the file, and the filesystem in which it resides has very limited size. If I am left with no other choice I might well sweep it with cron, an ugly workaround.
John Zwinck
+3  A: 

In the past, I have built a "temporary file manager" that kept track of temporary files.

One would request a temporary file name from the manager, and that name would be registered.

Once you no longer needed the temporary file name, you informed the manager and the filename was unregistered.

Upon receipt of a termination signal, all the registered temporary files were destroyed.

Temporary filenames were UUID-based to avoid collisions.
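A rough in-process sketch of the idea (UUID generation is omitted, the signal handler glosses over async-signal-safety, and a real version would also chain any pre-existing handlers, as discussed in the next answer):

    #include <csignal>
    #include <cstdlib>
    #include <set>
    #include <string>
    #include <unistd.h>   // unlink, _exit

    // Usage: register_file() right after creating a temporary file,
    // unregister_file() once it has been deleted normally.
    class TempFileManager {
    public:
        static TempFileManager& instance() {
            static TempFileManager mgr;
            return mgr;
        }
        void register_file(const std::string& path)   { files_.insert(path); }
        void unregister_file(const std::string& path) { files_.erase(path); }
        void cleanup() {
            for (const auto& f : files_) unlink(f.c_str());
            files_.clear();
        }
    private:
        TempFileManager() {
            std::atexit(on_exit_hook);     // normal termination
            std::signal(SIGINT,  on_signal);
            std::signal(SIGTERM, on_signal);
            std::signal(SIGHUP,  on_signal);
        }
        static void on_exit_hook()   { instance().cleanup(); }
        static void on_signal(int)   { instance().cleanup(); _exit(1); }
        std::set<std::string> files_;
    };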

David Segonds
Complex, but clearly it works - especially if the temporary file manager has a way to detect whether the process that requested the temporary file still exists. Gets a bit tricky if that process forks; doubly so if the parent exits.
Jonathan Leffler
You could add a requirement that when a process forks, the parent must inform the manager of the PID of the child process. This also allows the flexibility of using other types of inter-process communication to pass the filename of the temp file around.
KeithB
David, your answer sounds as if this "manager" lives as a module in the same process. Jonathan's comment makes it sound like it would be a separate process. I can see how it could work as a separate process, sure, but were you suggesting it could be in-process? If so I don't see as much value....
John Zwinck
The temporary file manager was in-process. This works well in most circumstances, but I agree that it is not foolproof.
David Segonds
+5  A: 

The requirement that the name remains visible while the process is running makes this hard to achieve. Can you revisit that requirement?

If not, then there probably isn't a perfect solution. I would consider combining a signal handling strategy with what Kamil Kisiel suggests. You could record the signal handlers that were installed before you install your own. If the existing disposition is SIG_IGN, you wouldn't normally install your own handler; if it is SIG_DFL, you would remember that; if it is something else - a user-defined signal handler - you would remember that pointer and install your own. When your handler is called, you'd do whatever you need to do and then call the remembered handler, thus chaining the handlers. You would also install an atexit() handler. And you would document that you do this, and the signals for which you do it.

Note that signal handling is an imperfect strategy: SIGKILL cannot be caught, so in that case the atexit() handler won't be called and the file will be left around.
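A sketch of the chaining idea for one signal, using sigaction(2) (error checking and the SA_SIGINFO case are omitted):

    #include <signal.h>   // sigaction, signal, raise

    static struct sigaction g_old_action;            // whatever was installed before us

    static void cleanup() { /* remove the temporary file here */ }

    static void chained_handler(int sig) {
        cleanup();
        if (g_old_action.sa_handler != SIG_DFL && g_old_action.sa_handler != SIG_IGN) {
            g_old_action.sa_handler(sig);            // chain to the user's handler
        } else {
            signal(sig, SIG_DFL);                    // restore the default action...
            raise(sig);                              // ...and let it happen
        }
    }

    static void install_cleanup_handler(int sig) {
        sigaction(sig, nullptr, &g_old_action);      // query the current disposition
        if (g_old_action.sa_handler == SIG_IGN)
            return;                                  // as above: leave SIG_IGN alone
        struct sigaction sa = {};
        sa.sa_handler = chained_handler;
        sigemptyset(&sa.sa_mask);
        sigaction(sig, &sa, &g_old_action);          // remember SIG_DFL or the old handler
    }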

David Segonds' suggestion - a temporary file name daemon - is interesting. For simple processes, it is sufficient; but if the process requesting the temporary file forks, expects the child to own the file thereafter, and exits, then the daemon has a problem detecting when the last process using the file dies, because it doesn't automatically know which processes have it open.

Jonathan Leffler
I do not believe I can remove the requirement for the filename visibility. I added "Further Explanation" to the original question which explains that this is not a regular file, and it lives in a fixed location with very limited capacity. I think I need to have the files visible for monitoring.
John Zwinck
Of course, if there is some other way to achieve my goals without the name visibility requirement, I could do that instead. Basically I need a decent way to keep the files from piling up, because they consume a very limited resource (imagine a filesystem with 16MB capacity and ~200KB files).
John Zwinck
Oh, and let me not forget, the names need to remain visible during normal operation because they are shared between programs. I can't just unlink them immediately after their creation--that would render them useless. I should have been explicit about that originally.
John Zwinck
A: 

You could have the process fork after creating the file; the child does the real work while the parent waits for the child to exit, then unlinks the file and exits itself.
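A minimal sketch of that arrangement (the path is a placeholder):

    #include <sys/wait.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main() {
        const char* path = "/tmp/example.tmp";          // placeholder
        int fd = open(path, O_CREAT | O_RDWR, 0600);    // create the file first
        if (fd == -1) return 1;

        pid_t pid = fork();
        if (pid == 0) {
            // Child: does all the real work while the file is visible by name.
            // ... real work here ...
            _exit(0);
        }

        // Parent: just waits for the child to terminate, then cleans up.
        close(fd);
        waitpid(pid, nullptr, 0);
        unlink(path);
        return 0;
    }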

Nikron
+1  A: 

Maybe someone has suggested this already, but I'm unable to spot it. Given all your requirements, the best I can think of is to have the filename communicated to a parent process, such as a start-up script, which cleans up after the process dies if the process failed to do so itself. This is essentially a watchdog, although the more common use case for a watchdog is to kill and/or restart the process when it fails.

If your parent process dies as well, you're pretty much out of luck, but most scripting environments are fairly robust and rarely die unless the script itself is broken, and a script is often easier to keep correct than a program.
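Roughly, such a wrapper could look like this (the program path and queue name are placeholders; how the name is actually communicated is up to you):

    #include <sys/wait.h>
    #include <unistd.h>
    #include <mqueue.h>

    int main() {
        const char* queue_name = "/myqueue";              // placeholder, agreed with the real program

        pid_t pid = fork();
        if (pid == 0) {
            execl("./real_program", "real_program", (char*)nullptr);  // placeholder path
            _exit(127);                                   // exec failed
        }

        waitpid(pid, nullptr, 0);                         // wait for the real program to die
        mq_unlink(queue_name);                            // clean up even if it crashed or was killed
        return 0;
    }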

roe
Actually, this is a pretty good idea - except that it is a low-level library that's being written. The library initialization routine could create the file and then fork. The child would go on to do all the real work; the parent would simply sit there, waiting for the child to terminate, then remove the file.
Jonathan Leffler
This is tricky to justify in a general-purpose set of routines; the program may have its own requirements on process structure. If it is permissible - that is, if you have enough control of the clients - then it would be pretty effective.
Jonathan Leffler
+1  A: 

Hi John. I just joined stackoverflow and found you here :)

If your problem is to manage mq files and keep them from piling up, you don't really need to guarantee file deletion upon termination. If you just want to keep useless files from piling up, then keeping a journal may be all you need. Add an entry to the journal file after an mq is opened and another entry when it is closed; when your library is initialized, check for inconsistencies in the journal and take whatever action is needed to correct them. If you worry about crashing while mq_open/mq_close is being called, you can also add a journal entry just before those functions are called.
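A rough sketch of such a journal, assuming one OPEN/CLOSE record per line in an append-only text file (the journal path and record format are made up for illustration):

    #include <fstream>
    #include <mqueue.h>
    #include <set>
    #include <string>

    static const char* kJournal = "/var/tmp/mq-journal.log";   // made-up location

    // Call journal("OPEN", name) right after mq_open() succeeds and
    // journal("CLOSE", name) right after the queue is closed and unlinked.
    static void journal(const char* action, const std::string& name) {
        std::ofstream out(kJournal, std::ios::app);
        out << action << ' ' << name << '\n';
    }

    // Call at library initialization: any queue with an OPEN record but no
    // matching CLOSE was probably left behind by a crash, so remove it.
    static void recover_from_journal() {
        std::ifstream in(kJournal);
        std::set<std::string> open_queues;
        std::string action, name;
        while (in >> action >> name) {
            if (action == "OPEN")  open_queues.insert(name);
            if (action == "CLOSE") open_queues.erase(name);
        }
        for (const auto& q : open_queues) mq_unlink(q.c_str());
    }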

Shing Yip