Let's say you have a program, like a text editor or a word processor, that writes to user-created files. What steps should it take to minimize the risk of data loss or corruption in the face of crashes, out-of-space errors, sudden power loss, race conditions, etc.?

A: 

Use SQLite

Well, OK, it is weird to use a database for a text editor, but a word processor has so much state that it might make some sense. It certainly makes sense as a storage format for many kinds of applications. There is a page on the SQLite wiki about using it for undo/redo logs.

For a text editor you can use the same techniques that databases use: a write-ahead log or a rollback journal, plus careful synchronization of each commit with the disk. Or you can keep two versions of every file.
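As a rough illustration of the database approach, here is a minimal Python sketch using the standard `sqlite3` module. The schema (a single `paragraphs` table) and the function names are assumptions made up for this example, not a recommended document format. The point is only that a save either commits completely or rolls back completely.

```python
import sqlite3


def open_document(db_path):
    """Open (or create) a document stored in an SQLite file."""
    conn = sqlite3.connect(db_path)
    # Write-ahead logging plus full synchronization: commits reach the disk
    # before they are reported as done.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=FULL")
    # Hypothetical schema, invented for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paragraphs ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_paragraphs(conn, paragraphs):
    """Replace the stored document in a single transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so a failed save leaves the previous contents in place.
    with conn:
        conn.execute("DELETE FROM paragraphs")
        conn.executemany(
            "INSERT INTO paragraphs (body) VALUES (?)",
            [(p,) for p in paragraphs],
        )


def load_paragraphs(conn):
    """Return the stored paragraphs in insertion order."""
    rows = conn.execute("SELECT body FROM paragraphs ORDER BY id")
    return [row[0] for row in rows]
```

If `save_paragraphs` fails partway through (for example, on a constraint violation or a full disk), the earlier `DELETE` is rolled back along with everything else, so `load_paragraphs` still returns the last successfully saved version.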

Doug Currie
+1  A: 

A good rule of thumb for safeguarding important data is

NEVER MODIFY THE ONLY COPY

In the case of word processors and text editors, I believe it's standard to create a "shadow copy" (this might not be the technical term): a copy of the original file to which all changes are made. Periodically, or when the user requests it, you force a save that writes the modifications over the original file. The advantage is that if there is a failure at any moment, there is always at least one valid copy of the data.

The real goal is atomicity: an operation either succeeds completely or fails completely, and never leaves an incomplete state. There are many other ways to attain atomicity besides "shadow copies", but this is how I believe text editors do it.
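A minimal Python sketch of the shadow-copy idea might look like the following. The `.name.shadow` naming convention and the function names are assumptions for illustration only. It relies on `os.replace`, which renames a file over an existing one in a single step when both are on the same filesystem.

```python
import os


def shadow_path(path):
    """Return the shadow file's path (hypothetical naming convention)."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "." + name + ".shadow")


def write_shadow(path, text):
    """Write the in-progress document to the shadow copy only.

    The original file is never opened for writing here, so a crash in the
    middle of this function cannot damage it.
    """
    with open(shadow_path(path), "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the disk


def commit_shadow(path):
    """Make the shadow copy the real file in one atomic rename."""
    os.replace(shadow_path(path), path)
```

Until `commit_shadow` runs, the original is untouched. After it runs, the original name refers to the complete new contents. There is no point at which the original holds a half-written document.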

Falaina
A: 

I wrote an earlier answer to a similar problem that applies here as well. The steps are:

  1. Write a temporary file with the new data
  2. Move the temporary file to a backup file in the original file's directory.
  3. Perform an atomic swap of the backup and original file (File.Replace on Windows; on Unix, rename(), which atomically repoints the file name to the new inode).
  4. Delete the backup (now original) file.
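The steps above can be approximated in Python as follows. This is a sketch under some assumptions, not a definitive implementation: the temporary file is created directly in the original's directory (merging steps 1 and 2), the `.bak` suffix is a made-up name, and because the Python standard library has no true two-way swap, `os.replace` is used to atomically put the new file in place while a separate copy preserves the old contents until the save has succeeded.

```python
import os
import shutil
import tempfile


def safe_save(path, data):
    """Replace the contents of `path` with `data` (bytes)."""
    directory = os.path.dirname(os.path.abspath(path))
    backup = path + ".bak"  # hypothetical backup name

    # Steps 1-2: write the new data to a temporary file created in the
    # original's directory, so the later rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".save-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming

        # Keep a copy of the old contents until the new file is in place.
        had_original = os.path.exists(path)
        if had_original:
            shutil.copy2(path, backup)

        # Step 3: atomically replace the original with the new file.
        os.replace(tmp, path)
    except BaseException:
        # The save failed: remove the temporary file, leave the original.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise

    # Step 4: the save succeeded, so the backup is no longer needed.
    if had_original:
        os.remove(backup)
```

One thing to watch for: `tempfile.mkstemp` creates the file with owner-only permissions, so an application that cares about the original file's mode would need to restore it before the rename.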

Dour High Arch
A: 

This is perhaps outdated with today's multi-gigabyte machines, but when developing on the Mac I remember we used to allocate, in advance, a memory block large enough to perform a save operation.

If we ran out of memory, we could warn the user that memory was running out, then free that block so the actual save operation could take place.
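The pattern can be sketched in Python, although only as an illustration: Python's allocator gives no guarantee that releasing a block makes that memory available to the next allocation, so this shows the control flow rather than a reliable safeguard. The class and function names here are hypothetical.

```python
class EmergencyReserve:
    """A block of memory held back at startup for use during a save."""

    def __init__(self, size):
        self._block = bytearray(size)

    @property
    def available(self):
        return self._block is not None

    def release(self):
        # Drop the only reference so the memory can be reclaimed.
        self._block = None


def save_with_reserve(save, reserve, warn=print):
    """Run `save()`; on MemoryError, warn, free the reserve, and retry once."""
    try:
        save()
    except MemoryError:
        if not reserve.available:
            raise  # nothing left to free
        warn("Memory is running low; freeing the reserve to save your work.")
        reserve.release()
        save()
```

An application would create the reserve once at startup, for example `reserve = EmergencyReserve(1024 * 1024)`, and pass its normal save routine as `save`.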

Another feature that is important for protecting user data is undo, ideally unlimited undo/redo.
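A very small Python sketch of unlimited undo/redo uses two stacks. It stores a full snapshot of the text per edit, which is an assumption made to keep the example short; a real editor would more likely record individual changes. The class name is hypothetical.

```python
class UndoHistory:
    """Unlimited undo/redo over whole-document snapshots."""

    def __init__(self, initial=""):
        self.text = initial
        self._undo = []  # states before the current one, oldest first
        self._redo = []  # states that were undone, most recent last

    def edit(self, new_text):
        """Record the current state, then apply the change."""
        self._undo.append(self.text)
        self._redo.clear()  # a new edit invalidates the redo history
        self.text = new_text

    def undo(self):
        """Step back one edit; do nothing if there is nothing to undo."""
        if self._undo:
            self._redo.append(self.text)
            self.text = self._undo.pop()

    def redo(self):
        """Reapply the most recently undone edit, if any."""
        if self._redo:
            self._undo.append(self.text)
            self.text = self._redo.pop()
```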

Larry Watanabe