I help maintain an enterprise web-based system (written in J2EE, though this is a more general question), and I'd like to know: what good tools are out there for measuring the "health" of an enterprise system? For instance, tools that check memory space on servers, the status of batch runs, the number of records processed in a given amount of time, and so on.

I don't wish to limit this to one tool per answer, though; multiple tools per answer are certainly acceptable.

+1  A: 

We use Nagios.

I'd provide more detail, but our admin guys set it up, so hopefully someone can give more info in the comments. What I do know is that we use it for hosting a couple of clients' sites, and the sites are rather large with quite a bit of traffic. It works exceptionally well.

JoshReedSchramm
+3  A: 

OpenNMS is a nice monitoring tool. Out of the box it can monitor various aspects of a server, mostly things like memory, network usage, and disk space. It's also open source, so it can be extended to monitor other things.

We use it to monitor thousands of services. It's very good at what it does.

It may not be a good fit for tracking the number of records processed; at least, we don't use it that way.

Steve K
A: 

+1 for OpenNMS. In addition to its out-of-the-box system-level monitoring, it can be easily extended with JMX, so your applications can expose their innards as JMX attributes, and OpenNMS can monitor them, graph them, raise alerts based on them, etc.
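To make the JMX part concrete, here is a minimal sketch of how an application might expose a "records processed" counter as JMX attributes using only the standard `javax.management` API. The class names (`BatchHealth`, `BatchStats`), the attribute names, and the object name `com.example.app:type=BatchStats` are made up for illustration, and the OpenNMS side (telling it which attributes to collect) is not shown here.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class BatchHealth {

    /** Management interface: each getter becomes a read-only JMX attribute. */
    public interface BatchStatsMBean {
        long getRecordsProcessed();
        long getLastBatchDurationMillis();
    }

    /** Hypothetical holder for batch metrics; the batch job updates it as it runs. */
    public static class BatchStats implements BatchStatsMBean {
        private final AtomicLong recordsProcessed = new AtomicLong();
        private volatile long lastBatchDurationMillis;

        /** Called by the batch job after each run. */
        public void recordBatch(long records, long durationMillis) {
            recordsProcessed.addAndGet(records);
            lastBatchDurationMillis = durationMillis;
        }

        @Override
        public long getRecordsProcessed() {
            return recordsProcessed.get();
        }

        @Override
        public long getLastBatchDurationMillis() {
            return lastBatchDurationMillis;
        }
    }

    /** Registers the bean with the JVM's platform MBean server under the given name. */
    public static ObjectName register(BatchStats stats, String name) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName objectName = new ObjectName(name);
        server.registerMBean(stats, objectName);
        return objectName;
    }

    public static void main(String[] args) throws Exception {
        BatchStats stats = new BatchStats();
        // The object name is an assumption; pick a domain that matches your application.
        ObjectName name = register(stats, "com.example.app:type=BatchStats");

        stats.recordBatch(1500, 4200);

        // Read the attribute back the way a JMX client would.
        Object value = ManagementFactory.getPlatformMBeanServer()
                .getAttribute(name, "RecordsProcessed");
        System.out.println("RecordsProcessed = " + value);
    }
}
```

Once a bean like this is registered, any JMX-capable tool (JConsole, for example) can read `RecordsProcessed` and `LastBatchDurationMillis` from the running JVM, provided remote JMX access is enabled where needed.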

We've also extended OpenNMS to send SMS alerts when things go wonky.

skaffman