Sorry, I'm not really sure of the right way to ask this one so bear with me...

We have a web application that runs on a set of servers at a data center (not in our offices). We want to be able to 'advertise' to our clients/users that the availability or response time of our servers has met a standard throughout the day.

I am being asked to come up with a standard metric that we can easily advertise on our login screen that shows current "standard response time" checked every x minutes.

My thinking is that I need to capture something like the results of a traceroute from a server (in our office, on Amazon, etc.) to one of the data center servers, then turn that into a Red/Yellow/Green type of notifier for the login screen. It would let users know that our tests are responding normally, so if they are seeing delays, the cause could be their own network or connection to the internet. We have lots of clients in rural areas with poor connectivity, and we are trying to let them know any slowness might be on their end, not ours.

I've got the LAMP stack to work with, but this could also be some other system altogether as long as it can update the main server with the results.

I already have Pingdom reports available, but that's a bit more than people want to read sometimes.

Any ideas on what I can do?

Resolution:

I ended up going with Tim's PEAR Net_Ping idea. I used the following:

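require_once 'Net/Ping.php';
$ping = Net_Ping::factory();              // returns a PEAR_Error if no ping command is usable
$ping->setArgs(array('count' => 6));      // send 6 echo requests per check
$results = $ping->ping('x.x.x.x');
$avgPing = $results->_round_trip['avg'];  // average round-trip time from the parsed result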

That gets the average of 6 pings to the server. I then stored the result in a DB and was able to show the average of the last 5 checks to give an idea of health. We'll see how clients like it.
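
For anyone following along, the "average of the last 5 checks" step could look roughly like this. It is only a sketch: the function name recentAverage and the sample numbers are made up for illustration, and it assumes the stored checks have already been read from the DB into an array, oldest first.

```php
<?php
// Hypothetical helper (not part of Net_Ping): average the most recent
// $window samples, given in milliseconds and ordered oldest first.
function recentAverage(array $samples, int $window = 5): ?float
{
    // A negative offset takes the tail; shorter arrays are returned whole.
    $recent = array_slice($samples, -$window);
    if (count($recent) === 0) {
        return null; // no checks recorded yet
    }
    return array_sum($recent) / count($recent);
}

// Illustrative data only: seven stored checks, of which the last five count.
$history = [20.0, 22.0, 30.0, 40.0, 50.0, 60.0, 70.0];
$health = recentAverage($history);
```

Returning null when there is no history keeps "no data yet" separate from a real reading of 0 ms, so the login screen can decide how to show it.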

+1  A: 

A network monitoring app like Nagios or OpenNMS would give you most any statistic you could need, and at least Nagios already has a web interface. I believe OpenNMS does as well, but it requires Tomcat.

I've only worked with Nagios and that was many years ago (for a regional dial-up ISP, many of those phone calls every day). I had to scrape the web interface and borrow a few of its generated graph images for my own display, but it was easy enough to automate hourly via cron. I'd be surprised if it doesn't have some sort of add-on to make the process simpler now, if not a proper API.

tadamson
Thanks for the idea. I'm looking for something simple for my users but I may end up using this as a tool internally.
Jason
A: 

SmokePing (http://oss.oetiker.ch/smokeping/) isn't bad for tracking latency.

Jimmy Ruska
+3  A: 

If all you care about is network latency (not something like server load), why not ping from the server to a set of known good hosts? Then report the average ping time to the user (using a color scale for different response ranges).
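
A minimal sketch of the color scale idea, assuming the average ping time is in milliseconds. The thresholds (100 ms and 250 ms) and the name latencyStatus are placeholders I picked for the example, so tune them to whatever is normal for your servers.

```php
<?php
// Hypothetical thresholds in milliseconds; adjust to your own baseline.
const YELLOW_AT_MS = 100.0;
const RED_AT_MS = 250.0;

// Map an average round-trip time to a Red/Yellow/Green label.
// A null average (no reply at all) is treated as red.
function latencyStatus(?float $avgMs): string
{
    if ($avgMs === null || $avgMs >= RED_AT_MS) {
        return 'red';
    }
    if ($avgMs >= YELLOW_AT_MS) {
        return 'yellow';
    }
    return 'green';
}
```

The returned string could be used directly as a CSS class on the login page indicator.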

Then additionally you could ping the user's IP and show that 'status', since the goal is to illustrate where the 'lag' is.

This seems better and simpler than doing the latency check from an external server (at least to me).

To do the actual ping, you could use a cron job and process the results with a php script, or use a ping library. PEAR's Net_Ping isn't maintained, but does work.
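
If you go the cron plus PHP script route, parsing the ping output might look something like the sketch below. It assumes the Linux (iputils) or BSD/macOS summary line format ("rtt min/avg/max/mdev = ..." or "round-trip min/avg/max/stddev = ..."); other platforms print something different. The function names parseAvgRtt and pingAverage are made up for this example.

```php
<?php
// Hypothetical helper: pull the average round-trip time (ms) out of the
// summary line printed by ping on Linux or BSD/macOS.
function parseAvgRtt(string $output): ?float
{
    $pattern = '~(?:rtt|round-trip) min/avg/max/(?:mdev|stddev) = [\d.]+/([\d.]+)/~';
    if (preg_match($pattern, $output, $m) === 1) {
        return (float) $m[1];
    }
    return null; // no summary line, e.g. 100% packet loss
}

// Hypothetical wrapper: run the system ping command and parse its output.
// Assumes a Unix-style ping that accepts -c for the request count.
function pingAverage(string $host, int $count = 6): ?float
{
    $cmd = sprintf('ping -c %d %s 2>&1', $count, escapeshellarg($host));
    $output = shell_exec($cmd);
    return is_string($output) ? parseAvgRtt($output) : null;
}
```

A cron entry would then call a small script that runs pingAverage() against each host and stores the result.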

For something more complex than ping, you could fetch some pages (or images), then calculate the fetch time/response size. That may be a better indicator if bandwidth load is a potential issue (this is the essential concept behind 'speed test' sites).
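
A rough sketch of the fetch timing idea. The fetch itself is passed in as a callable so you can swap in file_get_contents, cURL, or anything else; timeFetch and the array keys are names I made up for the example.

```php
<?php
// Hypothetical helper: time a fetch and report how long it took and how
// many bytes came back. $fetch must return the body as a string, or
// false/null on failure.
function timeFetch(callable $fetch): array
{
    $start = microtime(true);
    $body = $fetch();
    $seconds = microtime(true) - $start;

    $ok = is_string($body);
    $bytes = $ok ? strlen($body) : 0;

    return [
        'ok' => $ok,
        'seconds' => $seconds,
        'bytes' => $bytes,
        // Throughput is only meaningful when something was actually received.
        'bytes_per_second' => ($bytes > 0 && $seconds > 0) ? $bytes / $seconds : null,
    ];
}
```

In practice you would call it with something like function () { return @file_get_contents($url); }, where $url points at a test page or image of known size on the server you want to check.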

Tim Lytle
I'm trying Net_Ping out...it gives tons of info when I use the example ( print_r($ping->ping('192.168.123.1')); ). Do you know how I could get the [avg] values out of [_round_trip] by itself?
Jason
I've had trouble getting data out of Net_Ping. In one case I just ended up parsing the raw output (which is pretty much the same as running the ping command and parsing it).
Tim Lytle