We are currently facing stability issues with the web application product we are developing. The product was built in part by our partner contractor, and we want a good standard metric for stability. The main issue is constant crashing: the web application cannot tell when it is receiving more requests than it can handle, so memory builds up (much like a memory leak) and eventually the application dies with no possibility of recovery.
We would like to define a very simple measurement for our partner contractor to meet. We have thought about a few ideas:
- A system that can detect a high load of requests and serve a "server unavailable, try again" page until it recovers from the high load.
- A set number of concurrent users or page views that gives us a clear metric for when to use scalability options such as load balancing and caching.
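To make the first idea concrete, here is a rough sketch of the behaviour we have in mind. It assumes a Python WSGI application purely for illustration (our actual stack may differ), and the names `LoadShedder`, `max_in_flight` and `retry_after_seconds` are hypothetical. Once a fixed number of requests are in flight, further requests get a 503 response with a `Retry-After` header instead of piling up in memory.

```python
import threading


class LoadShedder:
    """Hypothetical WSGI middleware that rejects requests above a concurrency limit."""

    def __init__(self, app, max_in_flight=100, retry_after_seconds=5):
        self.app = app
        self.retry_after = str(retry_after_seconds)
        # One permit per request we are willing to serve at the same time.
        self.slots = threading.BoundedSemaphore(max_in_flight)

    def __call__(self, environ, start_response):
        # Non-blocking acquire: if no permit is free, shed the request immediately.
        if not self.slots.acquire(blocking=False):
            body = b"Server busy, please try again shortly.\n"
            start_response("503 Service Unavailable", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", self.retry_after),
            ])
            return [body]
        try:
            result = self.app(environ, start_response)
            try:
                # Buffer the whole response so the permit covers the full request.
                # This trades away streaming for simplicity in this sketch.
                return list(result)
            finally:
                close = getattr(result, "close", None)
                if close is not None:
                    close()
        finally:
            self.slots.release()
```

The value chosen for `max_in_flight` could double as the metric from the second idea: a number found through load testing, above which we would expect to add capacity rather than let a single instance degrade.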
At the moment we have to rely on caching and load balancing so that we can recycle the web applications every x hours (depending on the load) and keep them from dying constantly.
Thanks for your help.