I have a general question that is rather open-ended (i.e. "depends on platform, application type, etc.") but I am looking for general guidelines as an answer.

When is it preferable to design an application for continuous operation (100% uptime) vs. scheduled daily shutdown/restart?

Obviously, web apps need to be up all the time, so assume for this question that we are discussing an internal enterprise application, such as an accounting system, or a B2B system that is only used actively during weekday business hours.

Arguments I've heard for each are as follows:

Pro 100% Uptime: "once you get an application running, it's better to keep it up, because there's a chance it won't restart when you shut it down."

Pro daily restarts: "an application that is up continuously for 3 years might one day go down, and nobody will know how to bring it back online."

Other considerations are memory growth, performance, need for maintenance, etc. This is a programming issue because the choice you make can affect your technical design. For example, you don't need to code certain batch jobs or clear state daily if you know the application will be shut down and restarted every day.
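To make that last point concrete, here is a minimal sketch in Python of the kind of code a continuously running application needs and a daily-restarted one gets for free. The class name `DailyState` and the injectable `today` clock are hypothetical, chosen only for illustration.

```python
from datetime import date
from typing import Any, Callable, Dict


class DailyState:
    """Per-day in-memory state that clears itself when the calendar day changes.

    A process that is restarted every night would not need the roll-over
    check: its state starts empty after each restart.
    """

    def __init__(self, today: Callable[[], date] = date.today) -> None:
        self._today = today          # injectable clock (an assumption, eases testing)
        self._day = today()          # the day the current state belongs to
        self._data: Dict[str, Any] = {}

    def _roll_over(self) -> None:
        # Drop yesterday's state the first time we are touched on a new day.
        current = self._today()
        if current != self._day:
            self._data.clear()
            self._day = current

    def set(self, key: str, value: Any) -> None:
        self._roll_over()
        self._data[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        self._roll_over()
        return self._data.get(key, default)
```

With a scheduled nightly restart, the `_roll_over` logic could be dropped, at the cost of making correctness depend on the restart actually happening.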

Thoughts?

A: 

If you're not extremely confident in your team, it might be better to go down from time to time, to clear everything. Once a day could do it, but there is a range from this to "never" ...

But this is generally dictated by business constraints. If you don't have those constraints yet ...

Well, why don't you also postpone your decision then?

KLE
Over the past few years, I have actually built or contributed to several messaging systems that take both approaches. I've actually got an opinion of my own but I want to see what the community thinks.
noahz
OK. Thank you for your comment anyway.
KLE
+2  A: 

When is it preferable to design an application for continuous operation (100% uptime) vs. scheduled daily shutdown/restart?

I think this is really an orthogonal question to application design. Many web servers and application containers can support hot restarts. In other words, this is not a question so much of "application design" but rather a choice of technology. For example, you can avoid the question entirely by simply having N copies of your application (N > 1), then systematically bringing a particular instance down for maintenance and restarting as needed.
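As a rough illustration of the "N copies" idea, a rolling restart can be sketched in Python as below. The function name `rolling_restart` and the `restart` and `is_healthy` callbacks are hypothetical placeholders for whatever your container or load balancer actually provides; this is a sketch under those assumptions, not a recipe for any particular product.

```python
from typing import Callable, List


def rolling_restart(
    instances: List[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Restart instances one at a time so the service as a whole stays up.

    `restart` and `is_healthy` are supplied by the caller (hypothetical hooks).
    Returns the names of the instances that were restarted successfully.
    """
    if len(instances) < 2:
        # With a single copy, any restart is an outage.
        raise ValueError("need at least 2 instances to keep the service up")

    restarted: List[str] = []
    for name in instances:
        restart(name)
        if not is_healthy(name):
            # Stop here: the remaining instances are untouched and still serving.
            raise RuntimeError(f"{name} did not come back; aborting rollout")
        restarted.append(name)
    return restarted
```

Because only one instance is down at any moment, whether a single process runs for a day or a year stops mattering much for service availability.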

Furthermore, business needs and requirements should be determining the appropriate downtime, not your choice of technology.

Pro daily restarts: "an application that is up continuously for 3 years might one day go down, and nobody will know how to bring it back online."

Hogwash. That is a social/organizational argument, not a technical one. This is solved by having an obvious build process which includes starting the server as one of its possible tasks. That reduces the task of "restarting" to a single command.

John Feminella
+3  A: 

The arguments you state both for and against 100% uptime are foolish arguments, in my opinion. If you're worried about the application not restarting after it is shut down, then you have larger issues than uptime concerns. Likewise, if you feel that nobody will know how to bring it back online after a prolonged period of uptime, you have training and documentation issues.

The reality is that you should always design an application to be efficient when it comes to memory consumption and performance. Generally, by doing this you end up with an application that can successfully survive either as a long-running process or as one that restarts frequently. Keep in mind that your typical computer system is rebooted periodically anyway due to OS updates, etc.

Unless you have requirements and service level agreements that guarantee 100% uptime, this isn't usually something you have to be overly concerned about as long as you design an application efficiently.

Scott Dorman
What grinds me is having an application that runs in a server farm the developer has zero access to, yet if bugs manifest themselves (memory leaks, etc.) the "admins" fail to ever report the problem, pass on event logs, etc. Instead they put in some caveman-like regimen of server rebooting or service restarts on a daily cycle. Then the bugs can *never* get fixed.
Bob Riemersma
A: 

As others said, if you can't trust your app to start up again you have much larger issues.

From experience, my general, personal recommendation for web-apps is to cycle them once a day (in the early hours of the morning, i.e. at the lowest traffic point), staggered over the whole server cluster. No matter how memory efficient your app is, web-apps in particular can always have cache bloat issues over extended periods of up-time, and once you accept the inevitability of a restart, you absolutely want that to happen on your schedule and not at the whim of w3wp.exe.

Of course this all depends on the number of servers you have, the traffic manager you have (if any) and what your traffic profile looks like.
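For what "staggered over the whole server cluster" might look like, here is a minimal Python sketch that spreads restart times evenly across a low-traffic window. The function name `staggered_schedule`, the server names, and the window are all assumptions for illustration; a real setup would also drain each server from the traffic manager first.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def staggered_schedule(
    servers: List[str],
    window_start: datetime,
    window: timedelta,
) -> Dict[str, datetime]:
    """Assign each server its own restart time, evenly spaced in the window.

    The first server restarts at `window_start`; no two servers share a slot,
    so the cluster never loses more than one node to a scheduled restart
    (assuming each restart finishes within its slot).
    """
    if not servers:
        return {}
    step = window / len(servers)  # spacing between consecutive restarts
    return {name: window_start + i * step for i, name in enumerate(servers)}
```

For example, three servers and a one-hour window starting at 03:00 would give restart slots 20 minutes apart.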

annakata
"Obviously, web apps need to be up all the time, so assume for this question that we are discussing an internal enterprise application, such as an accounting system, or a B2B system that is only used actively during weekday business hours."
noahz
Having 100% service up-time is not the same as having 100% application up-time (assuming you have more than one server), so I thought my experience here was still relevant.
annakata
Understood. I'll take your answer as "100% up-time" where the application must be distributed to ensure 100% service up-time.
noahz
A: 

Apart from the "your app is not good enough if you need to restart it" arguments (which I think are right, and I like them), I would prefer something in the middle as a preventive measure.

If your application is not too big, and one person can restart it without much trouble, it would be fine to restart it maybe once a month, or 3 or 4 times a year. This ensures that the sysadmin knows how to do it (people come and go from companies) and keeps that knowledge fresh.

If you have a problem and your sysadmin has not restarted the application in two years, he will have manuals available and courses behind him, but he will probably have forgotten some steps, or be slower to solve the problem.

Another point to consider: is it a finished application, or are you still working on it? If it's an in-house application that you are still coding and upgrading frequently with new features, it can be worthwhile to restart it more often. If a problem appears, it is more likely to be hiding in the new code. Frequent restarts will help your programmers fix it and keep your sysadmin up to date on what's happening with the app.

Of course, making a perfect application is always a top priority, but we all know that is not always possible.

dexem
+1  A: 

Sorry, but either I'm not getting the point or this question is totally pointless.

An application, any application, should be designed, IMO, to stay up as long as it's needed. If an application/platform needs to be restarted daily, then it has memory leaks, or bugs, or is, in general, poorly written.

The point "don't make it stay up too long, otherwise you'd risk nobody will ever remember how to bring it up again" is quite laughable. I do Application Management (Operations) as my daily job, and I've never seen an application stay up for more than one month. After that period, you have to cope with OS maintenance, DB patching, software upgrades, etc.

So, to summarize: write applications that can stay up as long as it's needed.

friol
Unfortunately, my opinions on this question differ from those of colleagues with 10-15 years of experience. So it's not so obvious.
noahz
