I have several dozen 64-bit Windows 2003 servers in a high performance environment with very bursty system utilization. I am looking for a tool (or tools) to monitor and analyze system performance (eg, CPU utilization, bandwidth, etc).
This tool can either query servers from a central location (SNMP) or require installation of a component on each server. It should poll on a 1-second interval. It should be able to generate pretty graphs which show trends over time. As a nice-to-have, it should be able to send out emails or IMs when certain thresholds are exceeded.
The tools I have investigated so far, including Solarwinds and PRTG, are not designed to poll this frequently. They seem to be designed for a ~30-second interval.
Solarwinds wouldn't go lower than 3-sec, and PRTG chokes at 1-sec. They both default to 1-min. All three tools seem more focused on outage monitoring and reporting than metric collection. Given the bursty nature of our applications, such infrequent polling would result in a very inaccurate picture of performance.
I am considering rolling my own solution using perfmon. This would be a lot of work, and seems like reinventing the wheel. Are there any tools out there that meet my needs?