I'm planning on running a Ruby process that may take a month to finish. If possible, I'd like to ensure that a blackout or hitting the wrong button won't cost me the whole month's work.
Is there an easy way to periodically save the program's state to disk? (Techniques that involve more effort would include adding code that marshals everything apart from the database, or possibly running the process inside a virtual machine and snapshotting the guest operating system.)
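To make the "marshal it myself" option concrete, here is a minimal sketch of what I have in mind, assuming the state worth saving is a small explicit hash rather than the whole process. The `Checkpoint` class, `run_with_checkpoints`, and the `:next_index` / `:processed` keys are hypothetical names for illustration; only `Marshal`, `File`, and `IO#fsync` come from Ruby itself. The temp-file-then-rename step is there so that a crash mid-write should not leave a truncated checkpoint (rename is atomic on POSIX filesystems; I have not checked other platforms).

```ruby
# Hypothetical helper: persists a small, explicit state hash to disk.
class Checkpoint
  def initialize(path)
    @path = path
  end

  # Write to a temp file, flush it to disk, then rename it over the
  # previous checkpoint so readers never see a half-written file.
  def save(state)
    tmp = "#{@path}.tmp"
    File.open(tmp, "wb") do |f|
      f.write(Marshal.dump(state))
      f.fsync
    end
    File.rename(tmp, @path)
  end

  # Return the saved state, or the given default if none exists yet.
  def load(default)
    return default unless File.exist?(@path)
    File.open(@path, "rb") { |f| Marshal.load(f.read) }
  end
end

# Hypothetical driver: processes items in order, skipping those already
# covered by the last checkpoint, and saves progress every `every` items.
def run_with_checkpoints(items, checkpoint, every: 1000)
  state = checkpoint.load({ next_index: 0, processed: 0 })
  items.each_with_index do |item, i|
    next if i < state[:next_index]
    yield item
    state[:next_index] = i + 1
    state[:processed] += 1
    checkpoint.save(state) if (state[:next_index] % every).zero?
  end
  checkpoint.save(state)
  state
end
```

Note that anything processed after the last checkpoint gets replayed on restart, so the per-item work (here, the database writes) would need to be safe to repeat.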
(For those interested: the process involves parsing a multi-gigabyte XML file of a well-known website, processing some information, and saving the information to an ActiveRecord database as it goes along. Twice.)
Edit: The project is this one, and the XML file is pages-articles.xml (e.g. enwiki-20090306-pages-articles.xml). Nothing proprietary, I just didn't want to be in "Plz halp" mode. The first pass gets a list of Wikipedia page titles, the next pass determines the first link from each page to another page, and then I calculate some statistics.
Continuing from where I left off, as suggested by some answerers, is probably a valid option. If it crashes during the first pass, then I could probably re-run it, telling it not to add entries that already exist. If it crashes during the second pass, then I should ask it to build links only for pages that haven't already had their link calculated. If it crashes while calculating the statistics, I could just re-calculate the statistics.
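Roughly, the resume logic for the three stages could be sketched as below. This is only an illustration under assumptions: `Store` is an in-memory stand-in for the database, and `Page`, `add_title`, `pages_missing_link`, `pass_one`, `pass_two`, and `statistics` are hypothetical names, not my actual models. The `:none` sentinel is also my own assumption, used so that "no link found" can be told apart from "not calculated yet" (`nil`).

```ruby
# In-memory stand-in for the database; a real model would replace this.
Page = Struct.new(:title, :first_link)

class Store
  def initialize
    @pages = {}
  end

  # Adding a title that already exists is a no-op, so pass one can be re-run.
  def add_title(title)
    @pages[title] ||= Page.new(title, nil)
  end

  # Only pages whose first link has not been calculated yet.
  def pages_missing_link
    @pages.values.select { |page| page.first_link.nil? }
  end

  def pages
    @pages.values
  end
end

# Pass one: record every title, skipping entries that already exist.
def pass_one(store, titles)
  titles.each { |title| store.add_title(title) }
end

# Pass two: fill in links only where they are still missing.
# `links` maps a title to its first link; absent titles get :none.
def pass_two(store, links)
  store.pages_missing_link.each do |page|
    page.first_link = links.fetch(page.title, :none)
  end
end

# Statistics are derived purely from stored data, so they can be recomputed.
def statistics(store)
  store.pages.each_with_object(Hash.new(0)) do |page, counts|
    counts[page.first_link] += 1
  end
end
```

With this shape, restarting after a crash just means calling the same pass again: pass one skips known titles, pass two touches only unfinished pages, and the statistics are rebuilt from scratch.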
Another edit: A more general version of this question was asked at "Save a process’ memory for later use?". It looks like you can't easily back up long-running processes.