Availability

If one considers both reliability (probability that the item will not fail) and maintainability (the
probability that the item is successfully restored after failure), then an additional metric is needed
for the probability that the component/system is operational at a given time, t (i.e. has not failed
or it has been restored after failure). This metric is availability. Availability is a performance
criterion for repairable systems that accounts for both the reliability and maintainability
properties of a component or system. It is defined as the probability that the system is operating
properly when it is requested for use. That is, availability is the probability that a system is not
failed or undergoing a repair action when it needs to be used. For example, if a lamp has a 99.9%
availability, there will be one time out of a thousand that someone needs to use the lamp and
finds out that the lamp is not operational either because the lamp is burned out or the lamp is in
the process of being replaced. (Note: Availability is always associated with time, much like
reliability and maintainability. As we will see in later sections, there are different availability
classifications, and for some of them the definition depends on the time under consideration.
Since these classifications have not been discussed yet, the time variable has been left
out of this 99.9% availability statement.)

This metric alone tells us nothing about how many times the lamp has been replaced. For all we
know, the lamp may be replaced every day or it could have never been replaced at all. Other
metrics are still important and needed, such as the lamp's reliability. The next table illustrates the
relationship between reliability, maintainability and availability.

A Brief Introduction to Renewal Theory

For a repairable system, the time of operation is not continuous. In other words, its life cycle can
be described by a sequence of up and down states. The system operates until it fails, then it is
repaired and returned to its original operating state. It will fail again after some random time of
operation, get repaired again, and this process of failure and repair will repeat. This is called a
renewal process and is defined as a sequence of independent and non-negative random variables.
In this case, the random variables are the times-to-failure and the times-to-repair/restore. Each
time a unit fails and is restored to working order, a renewal is said to have occurred. This type of
renewal process is known as an alternating renewal process because the state of the component
alternates between a functioning state and a repair state, as illustrated in the following graphic.

A system's renewal process is determined by the renewal processes of its components. For
example, consider a series system of three statistically independent components. Each
component has a failure distribution and a repair distribution. Since the components are in series,
when one component fails, the entire system fails. The system is then down for as long as the
failed component is under repair. Figure 7.1 illustrates this.

Figure 7.1: System downtime as a function of three component downtimes. Components A,
B and C are in series.
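The series-system behavior described above can be sketched with a small simulation. This is only an illustration under stated assumptions: each component follows its own alternating renewal process with exponential times-to-failure and times-to-repair, every repair is "as good as new," and components keep aging independently while another one is under repair. The function names and the (MTTF, MTTR) values are hypothetical.

```python
import random

def down_intervals(mttf, mttr, horizon, rng):
    """Return the (start, end) down intervals of one component on [0, horizon]."""
    intervals, t = [], 0.0
    while True:
        t += rng.expovariate(1.0 / mttf)          # time until the next failure
        if t >= horizon:
            return intervals
        repair_end = min(t + rng.expovariate(1.0 / mttr), horizon)
        intervals.append((t, repair_end))
        t = repair_end                            # renewed: the clock restarts here

def series_system_uptime_fraction(components, horizon, seed=0):
    """Fraction of [0, horizon] during which all series components are up.

    components: list of (mttf, mttr) pairs, one per component (assumed exponential).
    The system is down whenever at least one component is down, so the system
    downtime is the length of the union of all component down intervals.
    """
    rng = random.Random(seed)
    merged = sorted(iv for mttf, mttr in components
                    for iv in down_intervals(mttf, mttr, horizon, rng))
    downtime, cur_start, cur_end = 0.0, None, None
    for start, end in merged:
        if cur_end is None or start > cur_end:    # a new, disjoint outage begins
            if cur_end is not None:
                downtime += cur_end - cur_start
            cur_start, cur_end = start, end
        else:                                     # overlaps the current outage
            cur_end = max(cur_end, end)
    if cur_end is not None:
        downtime += cur_end - cur_start
    return 1.0 - downtime / horizon

# Hypothetical components A, B and C: (MTTF, MTTR) in hours.
print(series_system_uptime_fraction([(1000.0, 10.0), (2000.0, 20.0), (500.0, 5.0)],
                                    horizon=100_000.0))
```

Under the independence assumption, the estimate for a long horizon should be close to the product of the individual long-run availabilities MTTF / (MTTF + MTTR).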

One of the main assumptions in renewal theory is that the failed components are replaced with
new ones or are repaired so they are "as good as new," hence the name renewal. One can argue
that this is the case for every repair, provided the system is defined in enough detail. For
example, if the repair of a circuit board consists of replacing a single transistor on that board,
and the analysis (or RBD) is performed down to the transistor level, then the transistor itself
gets renewed. In cases where the
analysis is done at a higher level, or if the offending component is replaced with a used
component, additional steps are required. We will discuss this in later chapters using a
restoration factor in the analysis. For more details on renewal theory, interested readers can refer
to Elsayed [7] and Leemis [17].

Availability Classifications

The definition of availability is somewhat flexible and is largely based on what types of
downtimes one chooses to consider in the analysis. As a result, there are a number of different
classifications of availability, such as:

Instantaneous (or Point) Availability.
Average Uptime Availability (or Mean Availability).
Steady State Availability.
Inherent Availability.
Achieved Availability.
Operational Availability.

Instantaneous or Point Availability, A(t)

Instantaneous (or point) availability is the probability that a system (or component) will be
operational (up and running) at any random time, t. This is very similar to the reliability function
in that it gives a probability that a system will function at the given time, t. Unlike reliability, the
instantaneous availability measure incorporates maintainability information. At any given time, t,
the system will be operational if the following conditions are met [7]:

The item functioned properly from 0 to t with probability R(t) or it functioned properly since the
last repair at time u, 0 < u < t, with probability:

$$\int_0^t R(t-u)\, m(u)\, du$$

With m(u) being the renewal density function of the system.

Then the point availability is the summation of these two probabilities, or:

$$A(t) = R(t) + \int_0^t R(t-u)\, m(u)\, du \qquad (1)$$
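For a concrete feel for the point availability, consider the special case, assumed here purely for illustration, in which both the time-to-failure and the time-to-repair are exponentially distributed with means MTTF and MTTR. In that case A(t) has a well-known closed form, A(t) = mu/(lambda + mu) + lambda/(lambda + mu) * exp(-(lambda + mu) t), with lambda = 1/MTTF and mu = 1/MTTR. The Python sketch below evaluates it; the function name, arguments and numbers are illustrative and not taken from the text.

```python
import math

def point_availability(t, mttf, mttr):
    """Point availability A(t) of one unit with exponential failure and repair times.

    Assumes the unit is new (up) at t = 0 and that every repair is as good as new.
    """
    lam = 1.0 / mttf              # failure rate
    mu = 1.0 / mttr               # repair rate
    long_run = mu / (lam + mu)    # value that A(t) approaches as t grows
    return long_run + (lam / (lam + mu)) * math.exp(-(lam + mu) * t)

# Hypothetical unit: MTTF = 100 hr, MTTR = 10 hr.
for t in (0, 5, 50, 500):
    print(t, round(point_availability(t, mttf=100.0, mttr=10.0), 4))
```

The function starts at 1 (the unit is assumed to be working at t = 0) and decreases toward MTTF / (MTTF + MTTR).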

Average Uptime Availability (or Mean Availability), $\bar{A}(T)$

The mean availability is the proportion of time during a mission or time period that the system is
available for use. It represents the mean value of the instantaneous availability function over the
period (0, T] and is given by:

$$\bar{A}(T) = \frac{1}{T} \int_0^T A(t)\, dt \qquad (2)$$
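When the point availability is available only as a function that can be evaluated, the mean availability over (0, T] can be approximated numerically. The sketch below uses the trapezoidal rule; the helper names, the step count and the exponential test function (MTTF = 100 hr, MTTR = 10 hr) are assumptions made for illustration.

```python
import math

def mean_availability(point_avail, T, steps=10_000):
    """Approximate (1/T) * integral of A(t) over [0, T] with the trapezoidal rule."""
    h = T / steps
    total = 0.5 * (point_avail(0.0) + point_avail(T))
    for k in range(1, steps):
        total += point_avail(k * h)
    return total * h / T

def exp_point_availability(t, mttf=100.0, mttr=10.0):
    """Closed-form A(t) for exponential failure and repair times (assumed case)."""
    lam, mu = 1.0 / mttf, 1.0 / mttr
    return mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * t)

print(mean_availability(exp_point_availability, T=200.0))
```

Because A(t) starts at 1 and decreases in this case, the mean over a finite period is somewhat higher than the value A(t) eventually settles at.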

Steady State Availability, A(∞)

The steady state availability of the system is the limit of the instantaneous availability function as
time approaches infinity or:

$$A(\infty) = \lim_{t \to \infty} A(t) \qquad (3)$$

(Note: For practical considerations, the instantaneous availability function will start approaching
the steady state availability value after a time period of approximately four times the average
time-to-failure.)

Figure 7.2 also illustrates this graphically.

Figure 7.2: Illustration of point availability approaching steady state.


In other words, one can think of the steady state availability as a stabilizing point where the
system's availability is a constant value. However, one has to be very careful in using the steady
state availability as the sole metric for some systems, especially systems that do not need regular
maintenance. A large scale system with repeated repairs, such as a car, will reach a point where it
is almost certain that something will break and need repair once a month. However, this state
may not be reached until, say, 500,000 miles. Obviously, if I am an operator of rental vehicles
and I only keep the vehicles until they reach 50,000 miles, then this value would not be of any
use to me. Similarly, if I am an auto maker and only warrant the vehicles to X miles, is knowing
the steady state value useful?

Inherent Availability, A_I

Inherent availability is the steady state availability when considering only the corrective
downtime of the system.

For a single component, this can be computed by:

$$A_I = \frac{MTTF}{MTTF + MTTR}$$

This gets slightly more complicated for a system. For a system, one needs to use the mean time
between failures, or MTBF, and compute the inherent availability as follows:

$$A_I = \frac{MTBF}{MTBF + MTTR}$$

This may look simple. However, one should keep in mind that until steady state is reached, the
MTBF may be a function of time (e.g. a degrading system), thus the above formulation should be
used cautiously. Furthermore, it is important to note that the MTBF defined here is different from
the MTTF (or more precisely for a repairable system, MTTFF, mean time to first failure).

Achieved Availability, A_A

Achieved availability is very similar to inherent availability with the exception that preventive
maintenance (PM) downtimes are also included. Specifically, it is the steady state availability
when considering corrective and preventive downtime of the system. It can be computed by
looking at the mean time between maintenance actions, MTBM, and the mean maintenance
downtime, $\bar{M}$, or:

$$A_A = \frac{MTBM}{MTBM + \bar{M}}$$
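As a rough illustration, achieved availability can be estimated from a maintenance record. The sketch below assumes that MTBM is taken as the total operating time divided by the number of maintenance actions (corrective and preventive together) and that the mean maintenance downtime is the average active duration of those actions. The function name and all numbers are hypothetical.

```python
def achieved_availability(operating_time, maintenance_durations):
    """Achieved availability: MTBM / (MTBM + mean maintenance downtime).

    operating_time: total time the system was running (downtime excluded).
    maintenance_durations: active durations of all corrective AND preventive actions.
    """
    n = len(maintenance_durations)
    if n == 0:
        return 1.0                          # no maintenance actions, no maintenance downtime
    mtbm = operating_time / n               # mean time between maintenance actions
    mean_downtime = sum(maintenance_durations) / n
    return mtbm / (mtbm + mean_downtime)

corrective = [6.0, 9.0]          # hours, hypothetical repairs
preventive = [2.0, 2.0, 2.0]     # hours, hypothetical PM tasks
print(achieved_availability(2000.0, corrective + preventive))
```

With these definitions the ratio reduces to operating time divided by operating time plus total maintenance downtime, so adding preventive actions to the list can only lower the result relative to counting corrective actions alone.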

Operational Availability, A_o

Operational availability is a measure of the average availability over a period of time and it
includes all experienced sources of downtime, such as administrative downtime, logistic
downtime, etc.

Operational availability is the ratio of the system uptime to the total time. Mathematically, it is
given by:

$$A_o = \frac{\text{Uptime}}{\text{Operating Cycle}} \qquad (4)$$

Where the operating cycle is the overall time period of operation being investigated and uptime
is the total time the system was functioning during the operating cycle. (Note: The operational
availability is a function of time, t, or operating cycle.)
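The uptime-to-operating-cycle ratio can be sketched as a small helper that subtracts every recorded source of downtime from the operating cycle. The function name, the downtime categories and the hours used below are hypothetical and serve only to illustrate the calculation.

```python
def operational_availability(operating_cycle, downtimes):
    """Operational availability: uptime / operating cycle.

    operating_cycle: length of the period under investigation.
    downtimes: mapping of downtime source -> accumulated downtime in the same units.
    """
    total_down = sum(downtimes.values())
    if not 0 <= total_down <= operating_cycle:
        raise ValueError("total downtime must lie between 0 and the operating cycle")
    uptime = operating_cycle - total_down
    return uptime / operating_cycle

# Hypothetical 720-hour operating cycle with four sources of downtime (hours).
downtimes = {"corrective": 12.0, "preventive": 6.0, "logistic": 40.0, "administrative": 14.0}
print(operational_availability(720.0, downtimes))
```

Keeping the downtime sources separate makes it easy to see how much of the lost availability comes from waiting (logistic and administrative time) rather than from the repair itself.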

When there is no specified logistic downtime or preventive maintenance, Eqn. (4) returns the
Mean Availability of the system.

The operational availability is the availability that the customer actually experiences. It is
essentially the a posteriori availability based on actual events that happened to the system. The
previous availability definitions are a priori estimations based on models of the system failure
and downtime distributions. In many cases, operational availability cannot be controlled by the
manufacturer due to variation in location, resources and other factors that are the sole province of
the end user of the product.

Introduction to Repairable Systems Example 1

As an example, consider the following scenario. A diesel power generator is supplying electricity
at a research site in Antarctica. The personnel are not satisfied with the generator. They
estimated that in the past six months, they were without electricity due to generator failure for an
accumulated time of 1.5 months.

Therefore, the operational availability of the diesel generator experienced by the personnel of the
station is:

$$A_o = \frac{6 - 1.5}{6} = \frac{4.5}{6} = 0.75$$

Obviously, this is not satisfactory performance for an electrical generator in such a climate, so
alternatives to this source of electricity are investigated. One alternative under consideration is a
wind-powered electrical turbine, which the manufacturer claims to have a 99.71% availability.
This is much higher than the availability experienced by the crew of the Antarctic research
station for the diesel generator. Upon investigation, it was found that the wind-turbine
manufacturer estimated the availability based on the following information:

Failure Distribution            Repair Distribution
Exponential, MTTF = 2400 hr     Exponential, MTTR = 7 hr

Based on the above information, one can estimate the mean availability for the wind turbine over
a period of six months to be:

$$\bar{A} \approx \frac{MTTF}{MTTF + MTTR} = \frac{2400}{2400 + 7} \approx 0.9971$$

This availability, however, was obtained solely by considering the claimed failure and repair
properties of the wind-turbine. Waiting downtime was not considered in the above calculation.
Therefore, this availability measure cannot be compared to the operational availability for the
diesel generator since the two availability measurements have different inputs. This form of
availability measure is also known as inherent availability. In order to make a meaningful
comparison, the inherent availability of the diesel generator needs to be estimated. The diesel
generator has an MTTF = 50 days (or 1200 hours) and an MTTR = 3 hours. Thus, an estimate of
the mean availability is:

$$\bar{A} \approx \frac{MTTF}{MTTF + MTTR} = \frac{1200}{1200 + 3} \approx 0.9975$$

Note that the inherent availability of the diesel generator is actually a little bit better than the
inherent availability of the wind-turbine! Even though the diesel generator has a higher failure
rate, its mean-time-to-repair is much smaller than that of the wind turbine, resulting in a slightly
higher inherent availability value. This example illustrates the potentially large differences in the
types of availability measurements, as well as their misuse.

In this example, the operational availability is much lower than the inherent availability. This is
because the inherent availability does not account for downtime due to administrative time,
logistic time, the time required to obtain spare parts or the time it takes for the repair personnel to
arrive at the site.
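The figures in this example can be checked with a few lines of Python. This is a sketch only: it uses the ratio MTTF / (MTTF + MTTR) as a stand-in for the six-month mean availability, which is a close approximation here because six months is very long compared with the repair times. The function and variable names are illustrative.

```python
def inherent_availability(mttf, mttr):
    """Long-run availability when only corrective downtime is counted."""
    return mttf / (mttf + mttr)

# Diesel generator as experienced on site: 1.5 months down out of 6 months.
diesel_operational = (6.0 - 1.5) / 6.0

# Inherent availability from the stated MTTF and MTTR values (hours).
wind_inherent = inherent_availability(mttf=2400.0, mttr=7.0)
diesel_inherent = inherent_availability(mttf=1200.0, mttr=3.0)

print(f"Diesel, operational: {diesel_operational:.4f}")
print(f"Wind turbine, inherent: {wind_inherent:.4f}")
print(f"Diesel, inherent: {diesel_inherent:.4f}")
```

Comparing like with like, the diesel generator's inherent availability is marginally higher than the wind turbine's, while its operational availability is far lower than either inherent figure.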
