The vast majority of applications do not handle "disk full" scenarios properly.

Example: an installer doesn't see that the disk is full, ignores all errors, and finally happily announces "installation complete!", or an email program is unaware that the message it has just downloaded could not be saved, and tells the server to delete the original.

What techniques are there to handle this situation gracefully? Do you use them? Do you test them?

A: 

To extend this further, are there techniques for handling "disk full" on a SQL server? Should never happen I know, but it can (and has happened to me).

I'm familiar with testing for disk full on standalone PC apps (having grown up programming in the floppy disk era), and even on an older SCO Unix system that was very tight on space and would freeze if you ran it out of space. I'm not so familiar with what it means on a modern system.

Brian Knoblauch
The SQL server would presumably handle the low-level error by refusing to insert more data. You could handle that with transactions and rollback in your SQL.
Sherm Pendley
@Sherm, good point. Is there a way to know ahead of time though?
Brian Knoblauch
That would be prone to a race condition, where you check for space, find that you have enough, then something else uses the space before you get to use it.
Sherm Pendley
+2  A: 

I check for errors when I open, write to, or close a file. On the other hand, I don't do anything in particular to handle "disk full" errors. Instead, I rely on the underlying OS to report those errors to me in the same way it reports any other types of errors.
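
A minimal Python sketch of that approach. The helper name `save_bytes` is made up for illustration; the point is only that open, write, flush, fsync and close are all inside the same error check, since a full disk may not be reported until the data is actually flushed.

```python
import os


def save_bytes(path, data):
    """Write data to path. Return None on success, or the OSError that occurred."""
    try:
        # open(), write(), flush(), fsync() and the implicit close() at the end
        # of the "with" block can each raise OSError. A full disk is reported
        # the same way as any other I/O failure.
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
    except OSError as exc:
        return exc
    return None
```

The caller decides what to do with the returned error; nothing here is specific to "disk full".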

Sherm Pendley
Do you allow the user to retry the save (once they or the OS have freed space), or, like one infamous app, do you just quit anyway, throwing away the unsaved data?
Martin Beckett
+2  A: 

I work with data acquisition software. Since I can estimate the size of a file before I create it (based on the amount of data requested to acquire), I warn even before I create the file based on how much disk space exists if I expect that I will run out of room.
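
A rough sketch of such a pre-check in Python, using `shutil.disk_usage` from the standard library. The function name `has_room_for` and the 10% safety margin are assumptions chosen for illustration, not part of the approach described above.

```python
import shutil


def has_room_for(directory, expected_bytes, safety_margin=0.10):
    """Return True if `directory` currently has room for expected_bytes plus a margin."""
    free = shutil.disk_usage(directory).free
    return free >= expected_bytes * (1 + safety_margin)
```

This is only an early warning. Something else can use the space between the check and the write, so the write itself still has to be checked for errors.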

Nick
A: 

You are totally right. Software should handle this kind of situation gracefully.

You can always inspect the IOException to see whether the disk is full or whether the user lacks the rights to write to that location.
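
In Python the closest equivalent of an IOException is `OSError`, whose `errno` attribute tells the two cases apart. The sketch below is an illustration under that assumption; the function name and the returned labels are hypothetical.

```python
import errno

# ENOSPC: no space left on device. EDQUOT (quota exceeded) is treated the same
# way here where the platform defines it.
DISK_FULL = {errno.ENOSPC, getattr(errno, "EDQUOT", errno.ENOSPC)}
NO_PERMISSION = {errno.EACCES, errno.EPERM}


def classify_write_error(exc):
    """Map an OSError from a failed write to a coarse, user-presentable cause."""
    if exc.errno in DISK_FULL:
        return "disk_full"
    if exc.errno in NO_PERMISSION:
        return "permission_denied"
    return "other"
```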

SQL Server does handle this situation but doesn't recover from it. When the disk is full... it stops working. :)

Maxim
+1  A: 

It's important to structure your persistence routines inside a loop that prompts the user to select a location and then attempts to save, repeating until either the data saves correctly or the user chooses to cancel. It seems obvious to me, but I've seen a lot of apps where an exception thrown while persisting isn't handled at all, so execution is pushed up out of the save routine and the in-memory data is lost.
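
A sketch of such a loop in Python. The three callbacks (`choose_path`, `write_file`, `report_error`) are hypothetical stand-ins for whatever dialog and I/O code the application already has.

```python
def save_with_retry(data, choose_path, write_file, report_error):
    """Keep asking for a location until the save succeeds or the user cancels.

    choose_path() returns a path, or None if the user cancelled.
    write_file(path, data) raises OSError on failure.
    report_error(path, exc) tells the user what went wrong.
    Returns the path that was saved to, or None on cancel.
    """
    while True:
        path = choose_path()
        if path is None:
            return None  # user cancelled; the in-memory data is still intact
        try:
            write_file(path, data)
        except OSError as exc:
            report_error(path, exc)
            continue  # let the user free space or pick another location
        return path
```

Only `OSError` is caught here, so an unrelated bug in the save code still surfaces as an exception instead of being retried forever.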

AgentThirteen
+3  A: 

As a user, I want software to:

  1. Preserve my data.
  2. Validate my environment as early as possible, before I do any real work.
  3. If #2 is impossible, tell me about any special requirements.
  4. Clean up after itself.

As a developer, techniques to do this include:

  1. Aborting only when there is no alternative, and allowing the user a chance to make a new choice if the previous one fails (see AgentThirteen's answer).
  2. Checking for required resources (memory, disk space, peripherals) as early as possible. Stop immediately if failure is certain; display a warning if success is uncertain, allowing the user to choose whether to continue.
  3. Pre-allocating resources to ensure they will still be available when they are required.
  4. Displaying warnings and errors in non-modal dialogs so the user can place the application in the background and use other tools to fix the problem.
  5. Maintaining an "undo" list: the history of actions that have been performed so far. If the application must abort, offer an opportunity to undo those actions.
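
As an illustration of item 3 applied to disk space, here is a Python sketch that reserves a file's space before any real data is written. It assumes `os.posix_fallocate` where the platform offers it and otherwise falls back to writing zeros; the function name `preallocate` and the chunk size are made up for the example.

```python
import errno
import os


def preallocate(path, size, chunk=1 << 20):
    """Create `path` with `size` bytes reserved. Raises OSError if the space is not there."""
    with open(path, "wb") as f:
        if size <= 0:
            return
        if hasattr(os, "posix_fallocate"):
            try:
                os.posix_fallocate(f.fileno(), 0, size)
                return
            except OSError as exc:
                # Some filesystems do not support fallocate; fall through to
                # the slow path. Anything else (e.g. ENOSPC) is a real failure.
                if exc.errno not in (errno.EOPNOTSUPP, errno.EINVAL):
                    raise
        # Portable fallback: actually write zeros so the blocks are allocated.
        zeros = b"\0" * chunk
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())
```

Simply calling `truncate()` to the target size is not used here because many filesystems would create a sparse file and reserve nothing.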
Adam Liss
+1  A: 

As is often the case, my position on this question contradicts conventional thinking.

In general, I ignore out-of-disk-space the same way I ignore out-of-memory. Partly because it's impossible to reliably predict these conditions. Partly because when we're talking about the behavior of software in unexpected conditions (like a bug you don't know you have which causes you to eat all disk or memory), it's impossible to reason well enough about the situation to code for it AND TEST IT. (It's safe to assume that if you have code that isn't tested, it doesn't work).

However, there are specific conditions that indicate a different approach:

  • If you're holding important, unsaved, user state (like a text file with some edits), consider pre-saving the data in the background, so that a crash is recoverable later.

  • If you're about to write to disk based on an interactive user's command (e.g. File->Save), you can catch a failure and offer to let them try again.
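
For the first case, one possible shape of a background pre-save in Python is sketched below. The name `write_recovery_copy` and the `.recovery-` prefix are assumptions for illustration. The copy is written to a temporary file and then moved into place with `os.replace`, so a failure part-way through (including a full disk) leaves the previous recovery copy untouched.

```python
import os
import tempfile


def write_recovery_copy(path, text):
    """Replace the recovery file at `path` with `text`, or leave the old one intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".recovery-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # swap in the new copy only once it is complete
    except OSError:
        # Remove the partial temp file, then let the error propagate so the
        # failure stays visible to the caller.
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
```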

In either case, it's important that bugs look like bugs. Crashing bugs should crash. Catching unexpected exceptions and continuing quietly robs you of diagnostic opportunities while leaving your software in an unsafe state.

Jay Bazuzi
A: 

We have just added support for this in our product. Running on an embedded device, we check each hour whether free disk space has dropped below 20% (10 MB); if it has, we transmit a warning to the office server, log the problem, and warn the user.

Once in this state, we check every two minutes for less than 2 MB of free space; when that happens we gracefully stop the application (a guidance system) and refuse to run until the space problem is solved.
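
The two-level check could look roughly like this in Python. This is a sketch, not our actual code: the names are invented, and the thresholds are read from the description above as "below 20% free" for the warning and "below 2 MB free" for the stop.

```python
import shutil

WARN_FRACTION = 0.20          # warn when less than 20% of the disk is free
STOP_BYTES = 2 * 1024 * 1024  # stop when less than 2 MB is free


def space_state(free, total):
    """Classify free space as "ok", "warn" or "stop"."""
    if free < STOP_BYTES:
        return "stop"
    if free < total * WARN_FRACTION:
        return "warn"
    return "ok"


def next_check_seconds(state):
    """Hourly while healthy, every two minutes once space is low."""
    return 3600 if state == "ok" else 120


def check(path):
    """Classify the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return space_state(usage.free, usage.total)
```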

As our product is a core system in our customers' workplaces, this gets the administrator's attention.

Simeon Pilgrim
A: 

If I'm writing an app, I can have it recover in some way from I/O failure, but if I'm running someone else's app, I have to take a different approach, which is to recover as much disk space as possible. This is the method I use. There can be large amounts of recoverable disk space in the form of 1) a small number of large files hidden away deeply in obscure directories, or 2) large numbers of small files dispersed over the disk, again in non-obvious locations. This method finds them either way.

Mike Dunlavey
+1  A: 

For testing, create small partitions that will run out of space at different points, perhaps inside virtual PCs so that the tests are well-contained and reproducible.
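
A cheaper complement for unit tests, which is a different technique from the small-partition setup above, is to inject the failure with a fake file object. In the Python sketch below, `FullAfter` and `store_message` are hypothetical names; `store_message` stands in for whatever code under test must not report success when the write fails.

```python
import errno


class FullAfter:
    """Fake writable file that raises ENOSPC once `capacity` bytes would be exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def write(self, data):
        if self.used + len(data) > self.capacity:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.used += len(data)
        return len(data)

    def flush(self):
        pass


def store_message(out, payload):
    """Return True only if the whole payload was written and flushed."""
    try:
        out.write(payload)
        out.flush()
    except OSError:
        return False
    return True
```

Varying `capacity` moves the failure to different points in the write sequence, much as differently sized partitions would, but without exercising the real filesystem.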

Adrian McCarthy
A: 

Yeah, it's probably a really bad idea to try to do something "smart" about this kind of problem. Basically the most you can do is try to minimize the loss. For interactive apps with unsaved user data, try to avoid crashing before the user has a chance to resolve the problem.

SamB