I'm writing a program that will continuously process files placed into a hot folder.
This program should run unattended with 100% uptime and no admin intervention. In other words, it should not fail on "stupid" errors: if someone deletes the output directory, it should simply recreate it and move on.
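To make the intent concrete, here is a minimal Python sketch of that behavior, assuming hypothetical `incoming` and `processed` folder names; the point is that the per-file step repairs its own environment (recreating the output directory if needed) and the loop survives any one file failing:

```python
import os
import shutil
import time

# Hypothetical paths; substitute your real hot folder and output dir.
HOT_FOLDER = "incoming"
OUTPUT_DIR = "processed"

def process_file(path: str) -> None:
    """Placeholder for the real per-file work."""
    # Recreate the output directory if someone deleted it out from
    # under us, then move the file there and carry on.
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    shutil.move(path, os.path.join(OUTPUT_DIR, os.path.basename(path)))

def watch_loop() -> None:
    os.makedirs(HOT_FOLDER, exist_ok=True)
    while True:
        for name in os.listdir(HOT_FOLDER):
            try:
                process_file(os.path.join(HOT_FOLDER, name))
            except OSError as exc:
                # Log and keep going: one bad file must not
                # bring down the whole service.
                print(f"recoverable error on {name}: {exc}")
        time.sleep(1)
```

This polling loop is only illustrative; a real service would likely use filesystem notifications and proper logging.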
What I'm thinking of doing is writing the entire program first, then going back through it looking for "error points" and adding handling code at each one.
What I'm trying to avoid is adding extraneous or unnecessary error handling, or building error handling into the control flow of the program (i.e. the error handling drives the flow of the program). Perhaps it could control the flow to a certain extent, but beyond that it would constitute bad design (subjectively speaking).
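To illustrate the distinction I mean, here is a hedged sketch (hypothetical `OUTPUT_DIR` and `write_result` names): the error handler only repairs the environment and retries, so the happy path stays linear and the exception never steers the program's normal logic:

```python
import os

OUTPUT_DIR = "processed"  # hypothetical output directory

def write_result(name: str, data: bytes) -> str:
    """Write data to the output dir, healing a deleted dir in place."""
    target = os.path.join(OUTPUT_DIR, name)
    try:
        with open(target, "wb") as f:
            f.write(data)
    except FileNotFoundError:
        # Recovery, not control flow: recreate the directory,
        # retry once, and return to the normal path.
        os.makedirs(OUTPUT_DIR, exist_ok=True)
        with open(target, "wb") as f:
            f.write(data)
    return target
```

The contrast would be code where the `except` branch carries the program into a different processing path entirely, which is the flow-control-by-exception style I want to avoid.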
What are some methodologies for "error proofing" a "critical" process?