ansaurus

Question

Answer 1

A:

The fact that it was available for 22/24 hours doesn't tell you anything about the composition of the uptime and downtime.

Example 1: Up for 22 hours straight, down for 2 hours.
Example 2: Up for 11 hours, down for 1. Up for another 11 hours, down for 1.

In the 2nd example, availability is still about 91.7% (22 of 24 hours), but continuity is half as good: the longest uninterrupted run in the 24-hour period is 11 hours rather than 22.
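A minimal Python sketch of the two examples, assuming each day is described as a list of (is_up, hours) segments. The function name `summarize` and the data layout are illustrative only and do not come from any monitoring tool.

```python
def summarize(segments):
    """Return (availability, longest_up_run_hours) for (is_up, hours) segments.

    Assumes up and down segments alternate, so each up segment is one
    uninterrupted run.
    """
    total = sum(hours for _, hours in segments)
    up = sum(hours for is_up, hours in segments if is_up)
    longest = max((hours for is_up, hours in segments if is_up), default=0)
    return up / total, longest

example_1 = [(True, 22), (False, 2)]
example_2 = [(True, 11), (False, 1), (True, 11), (False, 1)]

for name, day in (("Example 1", example_1), ("Example 2", example_2)):
    availability, longest = summarize(day)
    print(f"{name}: availability {availability:.1%}, longest run {longest} h")
```

Both days report the same availability, while the longest run drops from 22 hours to 11.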

Babak Naffas 2009-07-23 18:36:47

Answer 2

A:

We used software called 'BigBrother' about four years ago. It would track our server assets, and then provide a detailed graph of the server's availability.

Network Monitoring software is probably your best bet for acquiring this kind of information on your own network.

I believe it would 'ping' the server every minute, and allow for 3 consecutive failures before registering a server as down. We had scripts that would page certain employees when a machine was identified as 'down'. Overall, we were very happy with its performance.
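The "consecutive failures" rule can be sketched as follows. This is not BigBrother's actual code or configuration. The function `detect_down` and the list of boolean probe results are hypothetical, and the sketch assumes the server is declared down on the third failure in a row.

```python
def detect_down(ping_results, threshold=3):
    """Yield the index of each probe at which the server is first declared down.

    ping_results: iterable of booleans, one per probe (True = reply received).
    threshold: number of consecutive failures that triggers a "down" event
               (assumed to be 3 here).
    """
    consecutive_failures = 0
    for index, ok in enumerate(ping_results):
        if ok:
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            # Fire exactly once per outage, on the probe that reaches the threshold.
            if consecutive_failures == threshold:
                yield index

# One probe per minute: a single dropped ping is ignored, a longer run is not.
probes = [True, False, True, True, False, False, False, False, True]
print(list(detect_down(probes)))
```

Requiring several failures in a row keeps a single lost packet from paging anyone, at the cost of detecting a real outage a few minutes late.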

BigBrother used to be freeware, but they have since gone to some other licensing scheme.

Kieveli 2009-07-23 18:42:06

Answer 3

A:

It might just be semantics.

Continuity/Reliability = total uptime.
Availability = total uptime when people want to use it.

If your server goes offline at 3am when no one is trying to use it, that decreases reliability but not availability.
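A rough Python sketch of that distinction, using the definitions above. The 08:00 to 18:00 demand window, the 24-hour day, and the function names are assumptions made for illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two intervals (0 if they do not intersect)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))

def uptime_fractions(outages, demand_start=8, demand_end=18, day_hours=24):
    """Return (total uptime fraction, uptime fraction within the demand window).

    outages: list of (start_hour, end_hour) pairs within a single day.
    """
    total_down = sum(end - start for start, end in outages)
    demand_down = sum(overlap(s, e, demand_start, demand_end) for s, e in outages)
    demand_hours = demand_end - demand_start
    return 1 - total_down / day_hours, 1 - demand_down / demand_hours

# A one-hour outage starting at 3am lowers the first number but not the second.
reliability, availability = uptime_fractions([(3, 4)])
print(f"total uptime {reliability:.1%}, uptime during demand {availability:.1%}")
```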

Rob Elliott 2009-07-23 18:48:31

Answer 4

A:

|----------|#|-----|#|-------------------|   
0          10h      15h                  35h 

'-' - means available,
'#' - means down, system failure

So, can the continuity be measured as (10h+5h+20h)/3 ≈ 11.7h, i.e. the average length of an available interval?

Alkersan 2009-07-23 19:04:16

Answer 5

A:

Mean time to failure (MTTF): the inverse of the failure frequency, for non-repairable systems.

Mean time between failures (MTBF): the inverse of the failure frequency, for repairable systems.

Mean time to repair (MTTR): how long it takes to fix a failure (in your case, to bring the system back online).

These are standard definitions, and there is plenty of operations research material on them.

Average "continuity" = MTBF

Availability = MTBF / (MTBF + MTTR)

This is the natural definition: out of every MTBF + MTTR of total time, the system is usable for MTBF.

Note that running two machines in parallel decreases MTBF (there are 2 machines, so more failures) but cuts MTTR to essentially the switchover time, so MTTR -> 0 (presumably) and availability goes up.

So, to measure what you term "continuity" (the actual MTBF):

Divide total uptime by the number of failures. Do not include the time spent bringing the device back online; that unavailable time is the repair time.
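The formulas above can be sketched in Python as follows. The function name `reliability_metrics` and the sample log are hypothetical; the sketch assumes you have recorded the length of each uninterrupted run and of each repair.

```python
def reliability_metrics(up_hours, repair_hours):
    """Return (mtbf, mttr, availability) from observed durations.

    up_hours: length of each uninterrupted run that ended in a failure.
    repair_hours: time taken to bring the system back after each failure.
    """
    failures = len(repair_hours)
    if failures == 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    mtbf = sum(up_hours) / failures      # total uptime / number of failures
    mttr = sum(repair_hours) / failures  # average time offline per failure
    availability = mtbf / (mtbf + mttr)
    return mtbf, mttr, availability

# Hypothetical log: two 11-hour runs, each followed by a 1-hour repair.
mtbf, mttr, availability = reliability_metrics([11, 11], [1, 1])
print(f"MTBF {mtbf} h, MTTR {mttr} h, availability {availability:.1%}")
```

With this log, availability works out to 11/12, the same share as total uptime over total time (22 of 24 hours).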

2009-07-23 19:20:40


How to measure continuity of process?
