
System Simulation

Dr. Dessouky
Description
Simulation is a very powerful and widely used management
science technique for the analysis and study of complex systems.

Simulation may be defined as a technique that imitates the
operation of a real-world system as it evolves over time. This is
normally done by developing a simulation model. A simulation
model usually takes the form of a set of assumptions about the
operation of the system, expressed as mathematical or logical
relations between the objects of interest in the system.

Simulation has its advantages and disadvantages. We will focus
our attention on simulation models and the simulation technique.
Simulation
What is simulation:

The process of designing a mathematical or
logical model of a real system and then
conducting computer-based experiments
with the model to describe, explain, and
predict the behavior of the real system.
Simulation
Where simulation fits in
Simulation
Programming
Analysis
Modeling
Probability &
Statistics
Basic Terminology
In most simulation studies, we are concerned with
the simulation of some system.
Thus, in order to model a system, we must
understand the concept of a system.
Definition: A system is a collection of entities that
act and interact toward the accomplishment of some
logical end.
Systems generally tend to be dynamic: their status
changes over time. To describe this status, we use
the concept of the state of a system.
Example Simulation Model
Ford - # of Panels per day (throughput)
Emergency Room (beds, doctors, nurses), (minor, moderate, major,
critical)
TRW Ballistic Missile Survivability against Soviet Threat
Paramount Farms Pistachio
Miami University Parking
HMT Disks Throughput
Christopher Ranch Garlic Capacity
Power Integration Semiconductor Capacity, and random machine
down times

Value of Simulation

Empirical method versus mathematical
model

Allows you to calculate the extreme values,
not just the expected value
Simulation
What is simulation

Simulation is the actual running of the
model system to gain insight into its
performance.
Simulation
Why use simulation

Simulation is used to better understand the
expected performance of the real system
and to test the effectiveness of the system
design.
Simulation
Why use simulation
Without building them
experimental system
new concepts
Without disturbing them
costly experimentation
unsafe experimentation.
Without destroying them
Determine limits of stress
Queuing systems
Performance measures (output)

Data requirements (input)

Uses of model

Kendall's notation
Queuing systems
System Performance measures (outputs)
Expected number of customers in system
Expected number of customers in queue
Expected time in system
Expected time in queue
Server utilization
Probability of n customers in system
Throughput
Queuing systems
Data requirements (Inputs)
Interarrival time distribution
Service time distribution
Number of servers
Queue discipline
System capacity
Size of input population
Kendall's notation (M/M/s/FCFS/K/M)
Alternative to simulation
Simulation
Analytic models
Physical experimentation
Visit other sites
Simulation vs. analytic modeling
Advantage:
various performance measures
greater realism
easier to understand
model the steady-state as well as the transient
behavior.
Disadvantage:
May not provide you with the optimal solution
time to construct model will be longer.
Simulation vs. Physical
Advantage:
High Speed
Not disruptive
Replication easy
Control variations
Generally less costly
Disadvantage:
Realism
Validity
Simulation vs. Alternatives
V A S P
Realism
V A
V A S P
Cost
Representing system
System:
a collection of mutually interacting objects
designed to accomplish a goal (machine
repair system)
Entities:
denotes an element/object within the boundary of
the system (machines, operators, repairmen)
Entity: the object on which work is being performed
Resource: performs the work

Representing system
Attribute:
Characteristic or property of an entity
(machine ID, Type of breakdown, time that
machine went down)
Activity:
transforms the state of an object usually over
some time (repairman service time, machine
run time)
Representing system
State of the system:
Numeric values that contain all the
information necessary to describe the system
at any time.

Delays:
Processes that take a conditional length of
time in the system
Representing system
Events:
Change the state of the system (end of service
of machine, machine breaks down)

Queue:
a set used to model waiting
Ex. Elevator systems
Entities
Elevators, people
Sets
People waiting at each floor
Attributes
Elevators: capacity, speed, destination,
current location of each elevator
People: inter-arrival time at each floor,
destination of each person

Ex. Elevator systems
State of system:
# of people on each elevator
# of people on each floor
Activities
Load/Unloading passenger
Travel to next floor (speed and distance)
Persons travel to elevator
Ex. Elevator systems
Delays:
Persons waiting for elevator
Events:
Elevator arrival
End unloading
End Loading
Person Arrival

Static Simulation vs. Dynamic
Simulation
There are two types of simulation models,
static and dynamic.
Definition: A static simulation model is a
representation of a system at a particular
point in time.
We usually refer to a static simulation as a
Monte Carlo simulation.
Static Simulation vs. Dynamic
Simulation
Definition: A dynamic simulation is a
representation of a system as it evolves
over time.
Within these two classifications, a
simulation may be deterministic or
stochastic.
A deterministic simulation model is one
that contains no random variables; a
stochastic simulation model contains one
or more random variables.
Discrete Event vs. Continuous
Event Simulation
Discrete event:
state of system changes only at discrete points
in time(events)
ex. Machine repair problem

Programming
Look at system only when events occur; time is
advanced from event to event.

Discrete Event vs. Continuous
Event Simulation
Continuous event:
state of system changes continuously over
time
Ex. Level of fluid in tank

Programming:
Advances time in small intervals. Use differential
equations to represent flows.
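A minimal sketch of the fixed-step approach for the tank example, assuming a hypothetical flow model in which the level changes at rate inflow - outflow_coeff * level. The function name, the flow model, and all parameter values are illustrative assumptions, not taken from the slides.

```python
def simulate_tank(level0, inflow, outflow_coeff, dt, t_end):
    """Euler integration of d(level)/dt = inflow - outflow_coeff * level.

    The differential equation is an assumed example of a flow model;
    time is advanced in small fixed intervals of length dt.
    """
    level = level0
    t = 0.0
    history = [(t, level)]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # State changes continuously, so it is updated at every small step.
        level += dt * (inflow - outflow_coeff * level)
        t += dt
        history.append((t, level))
    return history


history = simulate_tank(level0=0.0, inflow=2.0, outflow_coeff=0.5, dt=0.01, t_end=20.0)
final_time, final_level = history[-1]
```

Under these assumed values the level settles near inflow / outflow_coeff, which gives a quick sanity check on the step size.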
An Example of a Discrete-Event
Simulation
To simulate a queuing system, we first have to
describe it.
We assume arrivals are drawn from an infinite
calling population.
There is unlimited waiting room capacity, and
customers will be served in the order of their
arrival (FCFS).
Arrivals occur one at a time in a random fashion.
All arrivals are eventually served, with the
distribution of service times as shown in the
book.

Service times are also assumed to be random.
After service, all customers return to the calling
population.
For this example, we use the following variables
to define the state of the system: (1) the number
of customers in the system; (2) the status of the
server, that is, whether the server is busy or idle;
and (3) the time of the next arrival.
An event is defined as a situation that causes the
state of the system to change instantaneously.

All the information about them is maintained in a
list called the event list.
Time in a simulation is maintained using a
variable called the clock time.
We begin this simulation with an empty system
and arbitrarily assume that our first event, an
arrival, takes place at clock time 0.
Next we schedule the departure time of the first
customer.
Departure time = clock time now + generated service
time


Also, we now schedule the next arrival into the
system by randomly generating an interarrival time
from the interarrival time distribution and setting
the arrival time as
Arrival time = clock time now + generated interarrival time
Both these events and their scheduled times are
maintained on the event list.
This approach of simulation is called the next-event
time-advance mechanism, because of the way the
clock time is updated. We advance the simulation
clock to the time of the most imminent event.


As we move from event to event, we carry out the
appropriate actions for each event, including any
scheduling of future events.
The jump to the next event in the next-event
mechanism may be a large one or a small one; that
is, the jumps in this method are variable in size.
We contrast this approach with the fixed-increment
time-advance method.
With this method, we advance the simulation clock
in increments of Δt time units, where Δt is some
appropriate time unit, usually 1 time unit.

For most models, however, the next event
mechanism tends to be more efficient
computationally.
Consequently, we use only the next-event
approach for the development of the models for
the rest of the chapter.
To demonstrate the simulation model, we need to
define several variables:
TM = clock time of the simulation
AT = scheduled time of the next arrival


DT = scheduled time of the next departure
SS = status of the server (1=busy, 0=idle)
WL = length of the waiting line
MX = length (in time units) of a simulation run
We now begin the simulation by initializing
all the variables. This simple example
illustrates some of the basic concepts in
simulation and the way in which simulation
can be used to analyze a particular problem.
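The next-event mechanism described above can be sketched as follows, using the slide's variable names TM, AT, DT, SS, WL and MX. Exponential interarrival and service times, the function name, and the returned statistics are assumptions made for illustration; the slides only state that both times are random.

```python
import random


def simulate_single_server(mean_interarrival, mean_service, MX, seed=1):
    """Next-event time-advance simulation of a single-server FCFS queue."""
    rng = random.Random(seed)
    TM = 0.0            # clock time of the simulation
    AT = 0.0            # scheduled time of the next arrival (first arrival at 0)
    DT = float("inf")   # scheduled time of the next departure (none yet)
    SS = 0              # status of the server (1 = busy, 0 = idle)
    WL = 0              # length of the waiting line
    served = 0
    area_wl = 0.0       # time integral of WL, for the time-average queue length

    while min(AT, DT) <= MX:
        next_time = min(AT, DT)           # most imminent event
        area_wl += WL * (next_time - TM)
        TM = next_time                    # jump the clock to that event
        if AT <= DT:                      # arrival event
            if SS == 0:
                SS = 1
                DT = TM + rng.expovariate(1.0 / mean_service)
            else:
                WL += 1
            AT = TM + rng.expovariate(1.0 / mean_interarrival)
        else:                             # departure event
            served += 1
            if WL > 0:
                WL -= 1
                DT = TM + rng.expovariate(1.0 / mean_service)
            else:
                SS = 0
                DT = float("inf")
    area_wl += WL * (MX - TM)             # account for time after the last event
    return {"served": served, "avg_queue_length": area_wl / MX,
            "final_WL": WL, "final_SS": SS}


result = simulate_single_server(mean_interarrival=1.0, mean_service=0.5, MX=1000.0)
```

Keeping only the next arrival and next departure is enough here because a single server has at most one pending event of each type; a general model would keep a sorted event list instead.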
World View: the structural concepts and views
that guide the development of the
simulation model
Event Orientation: defines the changes in state that occur
at each event time

Process Orientation: describes the process through
which the entities in the system flow

Activity Scanning Orientation: describes the activities in
which the entities in the system engage
Discrete Event Simulation
Event scheduling
Write modules that describe changes in the
state of the system at each event
Main program advances time
One subprogram for each event
General purpose programming language

Discrete Event Simulation
Process interaction
Write modules that describe the progress of
entities through the system
As entities move, the system changes state
Entities are held to represent activities and
delays
Promodel programming language
Event scheduling
Time is advanced from event to event
Future events list: ordered list of
upcoming events
As events are scheduled, they are added to the
list
As events occur they are removed from list
Activities in event ( one / event type)
Event scheduling
List is required to keep track of entities in a
set
Statistics: two types
Sample statistics: average of some values (W)
W = (W1 + W2 + ... + Wn)/n = Total Wait / # of waits
Time average statistics: time weighted (L)
L = (0(t1) + 1(t2-t1) + 2(t3-t2) + 1(t4-t3)) / t4
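A short sketch of the two statistic types. The function names and the change-point representation are assumptions for illustration; the numeric event times below are made up to mirror the pattern 0, 1, 2, 1 in the formula for L.

```python
def sample_average(values):
    """Sample statistic: plain average of the observed values (e.g. waits)."""
    return sum(values) / len(values)


def time_average(change_points, end_time):
    """Time-average statistic.

    change_points is a list of (time, value) pairs sorted by time; each value
    holds from its time until the next change point (or until end_time).
    """
    area = 0.0
    following = change_points[1:] + [(end_time, None)]
    for (t, value), (t_next, _) in zip(change_points, following):
        area += value * (t_next - t)      # value weighted by how long it lasted
    return area / (end_time - change_points[0][0])


# Queue length is 0 until t1=2, 1 until t2=5, 2 until t3=6, 1 until t4=10.
L = time_average([(0, 0), (2, 1), (5, 2), (6, 1)], end_time=10)
W = sample_average([3.0, 1.0, 4.0])
```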

Activity scanning
Activity scanning
Time is modeled in fixed time increments to
check if activity occurred
Small time increments are inefficient
Large time increments may miss activity
describes the activities in which the entities in
the system engage.
Process Oriented
Process oriented:
Many simulation models include elements
which occur in defined patterns
The logic associated with such a system or
events can be generalized and defined by a
single statement
A simulation language could then translate
such statement into the appropriate sequence
of events
describes the processes through which the
entities in the system flow.
Process Oriented
Process oriented:
These statements, define a sequence of events
which are automatically executed by the
simulation language as the entities move
through the process
Create arrival entities every t time units
However, since we are normally restricted to a
set of standardized statements provided by the
simulation language, our model flexibility is
not as great as with the event orientation
Feature provided by a language
Conceptual framework(entities, attributes,
resource, queues)
Maintenance of event list
Random variable generation
Animation
Debugging function
Output analysis
Input analysis
Report generation

Simulation Languages
One of the most important aspects of a simulation
study is the computer programming.
Several special-purpose computer simulation
languages have been developed to simplify
programming.
The best known and most readily available
simulation languages include GPSS, GASP IV
and SLAM.
Most simulation languages use one of two different
modeling approaches or orientations; event
scheduling or process interaction.

GPSS uses the process-interaction approach.
SLAM allows the modeler to use either approach
or even a mixture of the two, whichever is the
most appropriate for the model being analyzed.
Of the general-purpose languages, FORTRAN is
the most commonly used in simulation.
In fact, several simulation languages, including
GASP IV and SLAM, use a FORTRAN base.

To use GASP IV we must provide a main
program, an initialization routine, and the event
routines.
For the rest of the program, we use the GASP
routines.
Because of these prewritten routines, GASP IV
provides a great deal of programming flexibility.
GPSS, in contrast to GASP, is a highly structured
special-purpose language.
GPSS does not require writing a program in the
usual sense.

Building a GPSS model then consists of
combining these sets of blocks into a flow
diagram so that it represents the path an
entity takes as it passes through the system.
SLAM was developed by Pritsker and
Pegden (1979). It allows us to develop
simulation models as network models,
discrete-event models, continuous models, or
any combination of these.

The decision of which language to use is one of
the most important that a modeler or an analyst
must make in performing a simulation study.
Simulation languages offer several
advantages.
The most important of these is that the special-
purpose languages provide a natural framework
for simulation modeling and most of the features
needed in programming a simulation model.
The Simulation Modeling Steps
We now discuss the process for a complete
simulation study and present a systematic
approach of carrying out a simulation.
A simulation study normally consists of several
distinct stages. (See Figure in the book)
However, not all simulation studies consist of all
these stages or follow the order stated here.
On the other hand, there may even be
considerable overlap between some of these
stages.
Problem/Model Formulation
State the objective of the study.
Identify the Problem. Determine any underlying
causes if possible.
Determine the input variables.
Controllable Variables.
Uncontrollable Variables.
Make assumptions / boundaries that were used to
simplify the model.
Determine Performance measures used to
measure the objective. (Output)
Data collection/acquisition
Determine the Data Collection System or
Estimates to be used.
Observe the system
Historical or Similar Systems
Theoretical Estimates
Engineering Estimates
Operator Estimates
Vendor Estimates
Identify the data collected.
How it was collected.
How it was represented in the model.
Model Construction or
Development
Identify The Real System
Determine Conceptual Model -Activities
and Events
Develop the Logical Model.
Identify the Programming Language used.
Computer Implementation (Promodel,
Arena, Slam Systems).
Model Construction or
Development
Modeling Tips

Art vs. Science
Over Simplification vs. Unnecessary Detail
Start Simple
Add stronger assumptions
Model Verification and
Validation
Verification: Determining whether
simulation model works as intended.
Verifying the Model.
Structure: Walk Through of the Model
Debugger.
Trace = print or writing in process calculations.
Animation.
Model testing
Analytical Model.
Model Verification and
Validation
Verification.
Logical Model.
Are events represented correctly?
Are mathematical formulas and relationships
correct?
Are statistical measures formulated correctly?
Computer Model/Simulation Model.
Does the code contain all aspects of the logical
model?
Are the statistics and formulas calculated correctly?
Does the model contain coding errors?
Model Verification and
Validation
Validation: determining whether the simulation
model is a credible representation
of the real system.
Compare the model with the actual system
by performing statistical tests (t-test and
confidence interval).
Conceptual Model.
Does the model contain all relevant elements,
events and relationships?
Will the model answer the questions of concern?
Model Verification and
Validation
Logical Model.
Does the model contain all events included in the
conceptual model?
Does the model contain all the relationships of the
conceptual model?
Computer Model/Simulation Model.
Is the computer model a valid representation of the
real system?
Can the computer model duplicate the performance
of the real system?
Does the computer model output have credibility
with system experts and decision makers?
Experimentation and Analysis of
Results
Experimentation The execution of the
simulation model to obtain output values

Analysis of Results The process of analyzing
the simulation outputs to draw inferences and
make recommendations for problem resolution

Implementation and Documentation
The process of implementing decisions
resulting from the simulation and
documenting the model and its use.
Manual Simulation Example
Given the following arrival times for a single
server system what will be the average number
in the queue, average number in the system,
average time in system, average time in queue,
the number of completed jobs, number in the
queue, number in the system, and server
utilization at time 15 if the service time is 3 time
units for each entity.

1, 3, 5, 9,13,15,17
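One way to check a hand trace of this exercise is sketched below. The function name and the conventions are assumptions, since the exercise does not state them: an arrival at exactly time 15 is counted as present, and the per-job averages are taken over jobs completed by time 15.

```python
def trace_single_server(arrivals, service_time, t_end):
    """Deterministic single-server FCFS trace, observed over [0, t_end]."""
    starts, departs = [], []
    server_free = 0.0
    for a in arrivals:
        start = max(a, server_free)       # wait if the server is still busy
        server_free = start + service_time
        starts.append(start)
        departs.append(server_free)

    present = [i for i, a in enumerate(arrivals) if a <= t_end]
    done = [i for i in present if departs[i] <= t_end]
    # Areas under the number-in-queue and number-in-service curves up to t_end.
    queue_area = sum(min(starts[i], t_end) - arrivals[i] for i in present)
    busy_area = sum(min(departs[i], t_end) - min(starts[i], t_end) for i in present)
    return {
        "avg_number_in_queue": queue_area / t_end,
        "avg_number_in_system": (queue_area + busy_area) / t_end,
        "avg_time_in_queue": sum(starts[i] - arrivals[i] for i in done) / len(done),
        "avg_time_in_system": sum(departs[i] - arrivals[i] for i in done) / len(done),
        "completed_jobs": len(done),
        "number_in_queue": sum(1 for i in present if starts[i] > t_end),
        "number_in_system": sum(1 for i in present if departs[i] > t_end),
        "server_utilization": busy_area / t_end,
    }


result = trace_single_server([1, 3, 5, 9, 13, 15, 17], service_time=3, t_end=15)
```

Other conventions (for example, averaging waits over every job that has started service) give different per-job averages, so the convention should be stated alongside any answer.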

Data Collection
Activities may be represented as
Constants
Random variables
Collection of data
Design a data collection form
Record more than single attribute in case you
need to use data in a different way.
Use several sessions to get representative data
Use control charts
Data Collection
Machine | Begin Repair | End Repair | Time Elapsed
Data Collection
Testing data
Independence

Randomness

Homogeneity
Data Collection
Test of Independence
Ho: Measure A is independent of measure B

H1: Measure A is not independent of measure
B.

Inventory and day of week

Data Collection
Test of Randomness
H0: f(xi | xj) = f(xi) : independent
H1: f(xi | xj) ≠ f(xi) : dependent
For example, when simulating a production
process in which the items can be defective or
good, it would be important to know whether
successive items are randomly distributed or
whether runs of good items tend to be followed
by runs of defective items.
Data Collection
Test of Homogeneity
Tests for whether multiple sets of data can be
considered as coming from the same statistical
population are generally referred to as tests of
homogeneity (distribution free).
H0 : G(x) = H(x)
H1 : G(x) ≠ H(x)
Two different workers working on the same
machine.
Random Variable
Two types
Discrete

Continuous

Random Variable
Probability mass function
Discrete

$P(X = x_i) = p(x_i)$

$\sum_i p(x_i) = 1$
Random Variable
Probability density function
Continuous
$f(x) = e^{-x}, \quad x > 0$
$P(X = a) = 0$
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
$P(a < X < b) = \int_a^b f(x)\,dx$
Random Variable
Cumulative distribution function (CDF)

$F(x) = P(X \le x)$

Discrete: $F(x) = \sum_{x_i \le x} p(x_i)$

Continuous: $F(x) = \int_{-\infty}^{x} f(t)\,dt$

Random Variable
Expected value

$\mu = E(X)$
$= \sum_i x_i\, p(x_i)$ (discrete)
$= \int_{-\infty}^{\infty} x f(x)\,dx$ (continuous)
Random Variable
Variance

$V(X) = \sigma^2 = E[(X - \mu)^2]$
$= E[X^2 - 2\mu X + \mu^2]$
$= E(X^2) - \mu^2$
$= \sum_i x_i^2\, p(x_i) - \left(\sum_i x_i\, p(x_i)\right)^2$


Random Variable
Standard deviation

Sums of R.V.

$SD(X) = \sigma = \sqrt{V(X)}$

$Y = a_1 X_1 + a_2 X_2$
$E(Y) = a_1 E(X_1) + a_2 E(X_2)$
$V(Y) = a_1^2 V(X_1) + a_2^2 V(X_2)$ (for independent $X_1$, $X_2$)
Random Variable

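Sample mean: $\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$

Sample variance: $S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \dfrac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1}$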
Poisson Probability Distribution
Consider a discrete r.v. which is often useful
when dealing with the number of occurrences
of an event over a specified interval of time.

Suppose we want to find the probability
distribution of the accidents at the intersection
of Rural and Apache during a one week
period.

The R.V. we are interested in is the number of
accidents.
Poisson Probability Distribution
i. The Poisson distribution provides a good model for the probability
distribution of the number of rare events that occur in space, time,
or volume, where $\lambda$ is the average rate at which events occur.

ii. Definition: a r.v. X is said to have a Poisson distribution if the p.m.f. of
X is
$P(x) = f(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, \ldots$
where $\lambda$ is the rate per unit time or per unit area.

iii. $E[X] = \lambda$, $V(X) = \lambda$
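As a quick numerical check of the Poisson p.m.f. $\lambda^x e^{-\lambda}/x!$ and of the fact that its mean and variance both equal $\lambda$, here is a small sketch. The function name and the rate of 2 events per interval are illustrative assumptions.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson r.v. with rate lam, x = 0, 1, 2, ..."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


lam = 2.0                                   # assumed rate, e.g. accidents per week
support = range(0, 60)                      # truncated support; the tail is negligible
total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)
```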


Exponential Distribution
Previously, we discussed the Poisson random variable,
which was the number of events occurring in a given
interval. This number was a discrete r.v. and the
probabilities associated with it could be described by the
Poisson Probability Distribution.

Not only is the number of events a r.v., but the waiting
time between events is also a random variable. This r.v. is a
continuous r.v. for it can assume any positive value.

This r.v. is an exponential r.v. which can be described by
the exponential distribution.
Exponential Distribution
i. Pdf:
$f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0,\ \lambda > 0 \\ 0 & \text{otherwise} \end{cases}$

where $\lambda$ = rate at which events occur

ii. Correspondingly,
$F(x) = P(X \le x) = \int_0^x \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}, \quad x \ge 0$
$E[X] = \dfrac{1}{\lambda}$
$V(X) = \dfrac{1}{\lambda^2}$

iii. An important application of the exponential distribution is to
model the distribution of component lifetime. A reason for its
popularity is the memoryless property of the
exponential distribution.

The Uniform Distribution
The simplest distribution is the one in which a continuous r.v. can assume
any value within an interval [a, b].

Def:
A continuous r.v. X is said to have a uniform distribution on the
interval [a, b] if the probability density function (pdf) of X is:
$f(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$

The Uniform Distribution
The cumulative distribution is
$F(x) = P(X \le x) = \int_a^x f(t)\,dt = \int_a^x \dfrac{1}{b - a}\,dt = \dfrac{x - a}{b - a}, \quad a \le x \le b$
$E[X] = \int_a^b x f(x)\,dx = \int_a^b \dfrac{x}{b - a}\,dx = \dfrac{a + b}{2}$
$V(X) = \dfrac{(b - a)^2}{12}$




The Uniform Distribution
Note:

An important uniform distribution is
that for when a = 0 and b = 1, namely
U(0, 1)
A U(0,1) r.v. can be used to simulate
observation of other random variables
of the discrete and continuous type.
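As one illustration of turning a U(0,1) observation into an observation of another random variable, the sketch below inverts the exponential CDF $F(x) = 1 - e^{-\lambda x}$ (the inverse-transform method). The function name and the rate value are assumptions for illustration.

```python
import math
import random


def exponential_variate(rate, u):
    """Solve u = 1 - exp(-rate * x) for x, with u an observation from U(0,1)."""
    return -math.log(1.0 - u) / rate


rng = random.Random(42)
rate = 2.0                                   # assumed rate, so E[X] = 1/rate = 0.5
samples = [exponential_variate(rate, rng.random()) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
```

Using 1 - u rather than u keeps the logarithm defined, because `random()` can return 0.0 but never 1.0.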

The Triangular Distribution
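Continuous distribution with minimum a, mode b, and maximum c:
$f(x) = \begin{cases} \dfrac{2(x - a)}{(b - a)(c - a)} & a \le x \le b \\ \dfrac{2(c - x)}{(c - b)(c - a)} & b \le x \le c \\ 0 & \text{elsewhere} \end{cases}$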
The Triangular Distribution
$F(x) = \begin{cases} 0 & x < a \\ \dfrac{(x - a)^2}{(b - a)(c - a)} & a \le x \le b \\ 1 - \dfrac{(c - x)^2}{(c - b)(c - a)} & b \le x \le c \\ 1 & x > c \end{cases}$
The Triangular Distribution
$E(X) = \dfrac{a + b + c}{3}$
$V(X) = \dfrac{a^2 + b^2 + c^2 - ab - ac - bc}{18}$
Parameter estimates from data $x_1, \ldots, x_n$:
$a = \min\{x_1, \ldots, x_n\}$
$c = \max\{x_1, \ldots, x_n\}$
$b = 3\bar{x} - a - c$

Normal Distribution
Measurements on many random variables follow a bell-shaped distribution.

Random variables of this type are closely approximated by a Normal
Probability Distribution.

A continuous r.v. X is said to have a normal distribution if the pdf of X is
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$

The distribution contains 2 parameters ($\mu$ and $\sigma$). These are the expected
value and the standard deviation, and hence locate the center of the distribution and
measure its spread.
Normal Distribution
The Standard Normal Distribution

To compute $P(a \le X \le b)$ when $X \sim N(\mu, \sigma^2)$, we must evaluate
$\int_a^b f(x)\,dx = \int_a^b \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,dx$

Note: None of the standard integration techniques can be used
to evaluate this integral. Instead, for $\mu = 0$ and $\sigma^2 = 1$, the integral has
been evaluated numerically and the values tabulated. Using the
table, probabilities for any other values of $\mu$ and $\sigma^2$ can be
determined.
Normal Distribution
The normal distribution for parameter values
$\mu = 0$ and $\sigma^2 = 1$ is called the standard normal
distribution. A r.v. that has a standard normal
distribution is called a standard normal random
variable (denoted by Z). The pdf of Z is:
$f(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty$

Normal Distribution
The cumulative distribution of Z is
$P(Z \le z) = \int_{-\infty}^{z} f(y)\,dy$ and is denoted by $\Phi(z)$

Note: The N(0,1) table returns the cumulative
probability up to z, or $\Phi(z)$
Normal Distribution
Non-standard Normal Distribution

The table only provides probabilities for r.v.
following the N(0,1) distribution. Thus, when
$X \sim N(\mu, \sigma^2)$ (i.e. not $\mu = 0$, $\sigma^2 = 1$), probabilities
involving X are computed by standardizing
the r.v. to the N(0,1) scale: $Z = (X - \mu)/\sigma$.
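A small sketch of standardizing in code, using the error function from the standard library in place of the printed N(0,1) table. The function names and the example values (mean 10, standard deviation 2) are assumptions for illustration.

```python
import math


def phi(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_interval_probability(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), using Z = (X - mu) / sigma."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


# Probability of falling within one standard deviation of the mean.
p = normal_interval_probability(a=8.0, b=12.0, mu=10.0, sigma=2.0)
```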
Selecting a Distribution
Theoretical prior knowledge
Random arrival => exponential IAT
Sum of large manufactures => Normal CLT
Compare histogram with probability mass
or probability density
Data Collection
Little variability => model as a constant.
Variability => model as a random variable.
Empirical vs. theoretical: select a
distribution, estimate the parameters of the
distribution, perform a goodness-of-fit test.
$\chi^2$ goodness of fit test
Compare observed versus theoretical
density
A collection of data can be viewed as a sample
from a specified p.d.f.
$H_0$: the $X_i$'s are IID r.v. with density f(x)
$H_1$: the $X_i$'s are not IID r.v. with density f(x)
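$\chi^2$ goodness of fit test
Test statistic
$TS = \sum_{i=1}^{k} \dfrac{(o_i - e_i)^2}{e_i}$
where $o_i$ and $e_i$ are the observed and expected counts in interval i
Critical value
If $H_0$ is true, $TS \sim \chi^2_{k-1-(\text{number of parameters estimated}),\,\alpha}$
A large T.S. would cause rejection of $H_0$
Reject $H_0$ if T.S. > $\chi^2_{critical}$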
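The chi-square test statistic, the sum over intervals of (observed - expected)^2 / expected, can be sketched as below. The function name and the counts are made-up illustrations; the critical value would still be read from a chi-square table with k - 1 - (number of estimated parameters) degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (o_i - e_i)^2 / e_i over the k intervals."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 100 observations in k = 5 equal-probability intervals.
observed = [18, 22, 20, 25, 15]
expected = [20.0] * 5
ts = chi_square_statistic(observed, expected)
```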
$\chi^2$ goodness of fit test
Issues: the test is an art

Number of intervals > 2
Size of intervals: expected counts $e_i$ approximately the same, and > 5
Requires relatively large amount of data
K-S test
Compare observed with theoretical CDF
Limited to continuous distribution, known
parameters
$H_0$: the $X_i$ are IID r.v. with CDF F(x)
$H_1$: the $X_i$ are not IID r.v. with CDF F(x)
Test statistic; critical value from table
K-S test
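Test statistic (with the $X_i$ sorted in increasing order)
$TS = \max\left\{ \max_i\left(\dfrac{i}{n} - \hat{F}(X_i)\right),\ \max_i\left(\hat{F}(X_i) - \dfrac{i - 1}{n}\right) \right\}$
A large T.S. would cause rejection
Critical value
$\alpha = 0.10$: $1.22/\sqrt{n}$
$\alpha = 0.05$: $1.36/\sqrt{n}$
$\alpha = 0.01$: $1.63/\sqrt{n}$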
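A minimal sketch of the K-S statistic and the critical values $1.22/\sqrt{n}$, $1.36/\sqrt{n}$ and $1.63/\sqrt{n}$. The function names and the tiny data set are assumptions for illustration; these critical values are large-sample approximations, so a table should be used for small n.

```python
import math


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))   # i/n - F(X_i)
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))        # F(X_i) - (i-1)/n
    return max(d_plus, d_minus)


def ks_critical_value(n, alpha):
    """Approximate critical value for alpha in {0.10, 0.05, 0.01}."""
    coefficient = {0.10: 1.22, 0.05: 1.36, 0.01: 1.63}[alpha]
    return coefficient / math.sqrt(n)


# Hypothetical data tested against U(0,1), whose CDF is F(x) = x on [0, 1].
ts = ks_statistic([0.7, 0.1, 0.4], cdf=lambda x: x)
reject = ts > ks_critical_value(n=3, alpha=0.05)
```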
Parameter estimation
Set of data $x_1, x_2, \ldots, x_n$

Method of moments => equate E(X),
V(X) to $\bar{x}$ and $s^2$
$\bar{x} = \dfrac{\sum x_i}{n}, \qquad s^2 = \dfrac{\sum x_i^2 - n\bar{x}^2}{n - 1}$
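A sketch of the method of moments for two of the distributions covered earlier: the exponential, where E(X) = 1/λ, and the uniform, where E(X) = (a+b)/2 and V(X) = (b-a)²/12. The function names and data are illustrative assumptions.

```python
import math


def sample_moments(xs):
    """Sample mean and sample variance (n - 1 in the denominator)."""
    n = len(xs)
    mean = sum(xs) / n
    variance = (sum(x * x for x in xs) - n * mean * mean) / (n - 1)
    return mean, variance


def fit_exponential_rate(xs):
    """Equate E(X) = 1/lambda to the sample mean."""
    mean, _ = sample_moments(xs)
    return 1.0 / mean


def fit_uniform(xs):
    """Equate (a+b)/2 and (b-a)^2/12 to the sample mean and variance."""
    mean, variance = sample_moments(xs)
    half_width = math.sqrt(3.0 * variance)    # (b - a)/2 = sqrt(12 * s^2)/2
    return mean - half_width, mean + half_width


data = [2.0, 4.0, 6.0, 8.0]                   # made-up observations
mean, variance = sample_moments(data)
rate = fit_exponential_rate(data)
a, b = fit_uniform(data)
```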
Parameter estimation
Maximum likelihood => find the parameter value
that maximizes the likelihood of obtaining the
given sample
Produces efficient and consistent estimates
Not always unbiased
Superior properties to methods of moments
Common sense.
Statistical Analysis of Simulations
As previously mentioned, output data from
simulation always exhibit random variability,
since random variables are input to the simulation
model.
We must utilize statistical methods to analyze
output from simulations.
The overall measure of variability is generally
stated in the form of a confidence interval at a
given level of confidence.
Thus, the purpose of the statistical analysis is to
estimate this confidence interval.
Output analysis
Need multiple observations to estimate
variability
Y1, Y2, Y3, ..., Yn
Estimate a confidence interval for the
measure of performance
Estimate the number of observations
required to obtain the desired precision
Output analysis
What is an observation?
Is observation a sample statistic or time
average statistic?
Is this a steady state simulation or
terminating simulation?
Are the observations independent or
correlated?
Terminating vs Steady State Simulation
Often, the type of model determines which
type of output analysis is appropriate for a
particular simulation.
However, the system or model may not
always be the best indicator of which
simulation would be the most appropriate.
It is quite possible to use the terminating
simulation approach for systems more
suited to steady-state simulations, and vice
versa.
Observation vs Time Based
Observation (Sample)
Average Time In System
Average Time In Queue
Time Based
Average Number in System
Average Number in Queue
Machine Utilization


Terminating simulation
Simulation in which the output measure of
performance is defined over a specific
interval of time with a specific starting
condition and a specific ending condition
Retail sales during a business day
Project network
Time to produce a batch of parts in a work cell
Military Simulations
Terminating simulation
Has a specified starting and ending
condition.
Each observation must have the same
starting and ending.
Observations are obtained by replication.
Use a different seed for random number
generation.
Steady state simulation
Simulation in which the output measure of
performance is defined over an infinite
interval of time independent of the initial
state of the system and stopping condition
Average production from an assembly line of
well trained employees
Inventory simulation
Steady state simulation
Independent of starting and ending
condition.
Remove initial condition bias
Specify warm-up period (transient period) .
Set initial condition to steady state.
Have a very long run length
Steady state simulation
1. Individual Yi: average of individuals.
2. Replication Yi: average of each one.
3. Batch means: batch by time, by number.

Terminating vs. Steady state simulation
Terminating
Observations are obtained by replication
Each observation must reflect the specified
starting and ending condition
Use a different seed for each replication
Y1, Y2, ..., Yr => one independent
observation per replication
Confidence interval for steady state
simulation
Y1, Y2, ..., Yn
Trying to estimate a long run performance
measure independent of starting and
ending conditions
Two problems
Initial condition bias
Dependent observations

Confidence interval for steady state
simulation
Outline
Removing initial condition bias
Creating independent observation
Replication/ deletion
Batch means
Confidence interval for replication
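Let $Y_1, Y_2, \ldots, Y_R$ be measures of
performance from R independent
replications.
Independent -> different seed for each run
$\bar{Y} = \dfrac{\sum_{i=1}^{R} Y_i}{R}, \qquad S^2 = \dfrac{\sum_{i=1}^{R} (Y_i - \bar{Y})^2}{R - 1}$
$\bar{Y} \pm t_{R-1,\,1-\alpha/2} \sqrt{\dfrac{S^2}{R}}$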
Confidence interval for replication
Approximate due to need for Yi ~ Normal
$(1 - \alpha)$ confidence interval => probability
of containing the true mean
$Var(\bar{Y}) = Var\left(\dfrac{Y_1 + \cdots + Y_R}{R}\right)$
$= \dfrac{1}{R^2}\left(Var(Y_1) + \cdots + Var(Y_R)\right)$
$= \dfrac{R S^2}{R^2} = \dfrac{S^2}{R}$
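A sketch of the replication confidence interval, mean ± t·sqrt(S²/R). The standard library has no t distribution, so the t quantile is passed in by the caller (read from a table); the function name and the five replication outputs are made up for illustration.

```python
import math


def replication_confidence_interval(ys, t_value):
    """Confidence interval for the mean of R independent replication outputs.

    t_value is the t quantile with R - 1 degrees of freedom at 1 - alpha/2,
    supplied by the caller from a t table.
    """
    r = len(ys)
    mean = sum(ys) / r
    s2 = sum((y - mean) ** 2 for y in ys) / (r - 1)
    half_length = t_value * math.sqrt(s2 / r)      # Var(mean) = S^2 / R
    return mean - half_length, mean + half_length


# Five hypothetical replications; t with 4 df at 0.975 is 2.776.
low, high = replication_confidence_interval([10.0, 12.0, 11.0, 13.0, 9.0], t_value=2.776)
```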
Number of replication needed
Suppose we desire a confidence interval
$\bar{Y} \pm I$, where I is the half length.
Based on a preliminary run of $R_0$
replications, we have an estimate of $S^2$ and the
confidence interval
$\bar{Y} \pm t_{R_0-1,\,1-\alpha/2} \sqrt{\dfrac{S^2}{R_0}}$
Number of replication needed


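Find R such that
$I = t_{R-1,\,1-\alpha/2} \sqrt{\dfrac{S^2}{R}}$

If R is large, $t_{R-1,\,1-\alpha/2} \approx Z_{1-\alpha/2}$, so
$R^* = \dfrac{Z_{1-\alpha/2}^2\, S^2}{I^2} = \left(\dfrac{Z_{1-\alpha/2}\, S}{I}\right)^2$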
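The large-R approximation R* = (Z·S/I)² can be sketched as below, rounding up to a whole number of replications. The function name and the pilot values (S² = 2.5, half length 0.5) are assumptions for illustration.

```python
import math
from statistics import NormalDist


def replications_needed(s2, half_length, alpha):
    """Approximate number of replications for a given confidence-interval half length.

    Uses the normal quantile in place of the t quantile, which is
    reasonable only when the resulting R is large.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil(z * z * s2 / half_length ** 2)


# Variance estimate from a hypothetical pilot run, desired half length 0.5.
r_star = replications_needed(s2=2.5, half_length=0.5, alpha=0.05)
```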
Test for comparing two means
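$H_0: \mu_1 - \mu_2 = 0 \qquad H_1: \mu_1 - \mu_2 \ne 0$
Two approaches:
- Form a $(1 - \alpha)$ confidence interval on $\mu_1 - \mu_2$:
$\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\,\nu} \sqrt{V(\bar{Y}_1 - \bar{Y}_2)}$
Reject $H_0$ if the confidence interval does not contain 0.

- Perform a t test
$t = \dfrac{(\bar{Y}_1 - \bar{Y}_2) - 0}{\sqrt{V(\bar{Y}_1 - \bar{Y}_2)}}$

Reject if $|t| > t_{\nu,\,\alpha/2}$, where $\nu$ is the degrees of freedom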
Assumptions
Case 1: $Y_1, Y_2, \ldots, Y_{R_1}$ giving $\bar{Y}_1, s_1^2$

Case 2: $Y_1, Y_2, \ldots, Y_{R_2}$ giving $\bar{Y}_2, s_2^2$


Observations are independent
Observation are normally distributed
Variances are unknown/known.
Variances are equal/unequal
Observations are paired/unpaired.
Test for comparing two means
Equal Variance
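1. Assumptions: independent, normal, unknown variance, unpaired, equal
variance.
2. $S_p^2 = \dfrac{\sum_i (Y_{1i} - \bar{Y}_1)^2 + \sum_i (Y_{2i} - \bar{Y}_2)^2}{R_1 + R_2 - 2} = \dfrac{(R_1 - 1) S_1^2 + (R_2 - 1) S_2^2}{R_1 + R_2 - 2}$

3. $Var(\bar{Y}_1 - \bar{Y}_2) = Var(\bar{Y}_1) + Var(\bar{Y}_2) = \dfrac{S_p^2}{R_1} + \dfrac{S_p^2}{R_2}$

4. $(1 - \alpha)$ confidence interval: $\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\,R_1+R_2-2} \sqrt{\dfrac{S_p^2}{R_1} + \dfrac{S_p^2}{R_2}}$

5. t-test: $t = \dfrac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\dfrac{1}{R_1} + \dfrac{1}{R_2}}}$

t-crit = $t_{R_1+R_2-2,\,\alpha/2}$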


6. Note: Many simulations do not have equal variance.

Test for comparing two means
One sided test


Need to make hypothesis in advance

Use t test, adjust critical value
Test for comparing two means
Test for normal population with known variance
- Assumptions: independent, normal, known variance,
unpaired, unequal variance.
- 2 populations: $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$
- Sample m from $X_1$ and sample n from $X_2$
- Want to test whether $\mu_1 = \mu_2$
- $H_0: \mu_1 = \mu_2$
  $H_1: \mu_1 \ne \mu_2$
- Test statistic:
$Z_0 = \dfrac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$

=

Test for comparing two means
Unequal Variance


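1. Assumptions: independent, normal, unknown variance,
unpaired, unequal variance.
2. $Var(\bar{Y}_1 - \bar{Y}_2) = Var(\bar{Y}_1) + Var(\bar{Y}_2) = \dfrac{S_1^2}{R_1} + \dfrac{S_2^2}{R_2}$

3. $(1 - \alpha)$ confidence interval: $\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\,\nu} \sqrt{\dfrac{S_1^2}{R_1} + \dfrac{S_2^2}{R_2}}$
with degrees of freedom
$\nu = \dfrac{\left(\dfrac{S_1^2}{R_1} + \dfrac{S_2^2}{R_2}\right)^2}{\dfrac{\left(S_1^2/R_1\right)^2}{R_1 - 1} + \dfrac{\left(S_2^2/R_2\right)^2}{R_2 - 1}}$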
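A sketch of the unequal-variance comparison: the t statistic uses S1²/R1 + S2²/R2 as the variance of the difference in means, and the degrees of freedom follow the Welch-Satterthwaite form with R1 - 1 and R2 - 1 in the denominator. The function name and the two sets of replication outputs are made up for illustration; the critical t value would be read from a table.

```python
import math


def welch_t_and_df(y1, y2):
    """t statistic and approximate degrees of freedom for unequal variances."""
    r1, r2 = len(y1), len(y2)
    mean1, mean2 = sum(y1) / r1, sum(y2) / r2
    var1 = sum((y - mean1) ** 2 for y in y1) / (r1 - 1)
    var2 = sum((y - mean2) ** 2 for y in y2) / (r2 - 1)
    var_diff = var1 / r1 + var2 / r2                 # Var(mean1 - mean2)
    t = (mean1 - mean2) / math.sqrt(var_diff)
    df = var_diff ** 2 / ((var1 / r1) ** 2 / (r1 - 1) + (var2 / r2) ** 2 / (r2 - 1))
    return t, df


t_stat, df = welch_t_and_df([10.0, 12.0, 11.0, 13.0, 9.0],
                            [14.0, 15.0, 13.0, 16.0, 17.0])
```

The degrees of freedom are generally not an integer; rounding down before using a t table is the conservative choice.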


Test for comparing two means
Paired Test

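- Assumptions: independent, normal, unknown variance,
equal # of replications
- Case 1: $Y_1, Y_2, \ldots, Y_R$
  Case 2: $Y_1, Y_2, \ldots, Y_R$
  Differences: $d_1, d_2, \ldots, d_R$, where $d_i = y_{1i} - y_{2i}$ (replication $i$ of case 1 minus replication $i$ of case 2)
  $\bar{d} = \dfrac{\sum_i d_i}{R}$, $S_d^2 = \dfrac{\sum_i (d_i - \bar{d})^2}{R - 1}$

- $H_0: \mu_1 - \mu_2 = 0 \Rightarrow \mu_d = 0$
  $H_1: \mu_1 - \mu_2 \neq 0 \Rightarrow \mu_d \neq 0$

- $(1-\alpha)$ confidence interval: $\bar{d} \pm t_{\alpha/2,\,R-1} \sqrt{V(\bar{d})}$, where $V(\bar{d}) = \dfrac{S_d^2}{R}$

- $t = \dfrac{\bar{d}}{\sqrt{S_d^2 / R}}$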
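The paired test can be sketched as below. This is an illustrative sketch only: `paired_t` is a hypothetical name, and the two lists stand in for outputs of two cases run with the same number of replications.

```python
import math
from statistics import mean, variance

def paired_t(y1, y2):
    """Return (t statistic, degrees of freedom, standard error of d-bar)."""
    if len(y1) != len(y2):
        raise ValueError("the paired test needs an equal number of replications")
    d = [a - b for a, b in zip(y1, y2)]  # d_i = y_1i - y_2i
    r = len(d)
    std_err = math.sqrt(variance(d) / r)  # sqrt(S_d^2 / R)
    return mean(d) / std_err, r - 1, std_err

# Hypothetical paired outputs; the differences are 1, 2, 1, 2
t_stat, df, std_err = paired_t([10, 12, 11, 13], [9, 10, 10, 11])
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The statistic is compared with $t_{\alpha/2,\,R-1}$ from a t table.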


Test for comparing two variances
F-test for equal variance
1. $H_0: \sigma_1^2 = \sigma_2^2$
   $H_1: \sigma_1^2 \neq \sigma_2^2$
2. Test statistic: $F = \dfrac{S_1^2}{S_2^2}$
3. Critical value: $F_{R_1-1,\,R_2-1,\,\alpha/2}$
4. Example

$F = 5.4/2.55 = 2.12$
$\alpha = 0.10$, $F_{\text{critical}} = F_{9,\,9,\,0.05} = 3.18$; cannot reject $H_0$.

Common Random Number
The process of comparing cases using the
same set of random numbers,
creating identical conditions
Observation: $V(\bar{Y}_1 - \bar{Y}_2) = V(\bar{Y}_1) + V(\bar{Y}_2) - 2\,\mathrm{Cov}(\bar{Y}_1, \bar{Y}_2)$
Confidence interval: $\bar{Y}_1 - \bar{Y}_2 \pm t_{R-1,\,\alpha/2} \sqrt{V(\bar{Y}_1 - \bar{Y}_2)}$
Use the paired test
Random Numbers
Generation of U(0,1) random numbers:
algorithm used by the RND function

Generation of random variates from
various distributions: algorithm used by
EXPONENTIAL, UNIFORM, and so on
(these algorithms use U(0,1) random
numbers).
Random Number Generation
Desirable properties
Fast and efficient
Capable of repeating same sequence
Statistically equivalent to U(0,1)
Independent and dense
Large cycle length or period
Low storage requirements
Old method: tables

Random Number Generation
Pseudo random number generators
A nonrandom sequence of numbers, each
completely determined by its predecessor, the
algorithm, and, initially, the seed.
Linear Congruential Generator
$Z_i = (a Z_{i-1} + C) \bmod m$
$Z_0$ = seed
$U_i = Z_i / m$ (random number)

If we choose a, C, and m correctly, then
we achieve the maximum period.
$0 \le Z_i \le m - 1$
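The recursion can be written as a short Python generator. The parameter values below (a = 5, C = 3, m = 16, seed 7) are a deliberately tiny, illustrative choice and not the ones used by any particular simulation package.

```python
def lcg(seed, a, c, m):
    """Yield (Z_i, U_i) pairs from Z_i = (a * Z_{i-1} + c) mod m."""
    z = seed
    while True:
        z = (a * z + c) % m
        yield z, z / m  # U_i = Z_i / m lies in [0, 1)

# Illustrative toy parameters (an assumption, chosen so the cycle is easy to see)
gen = lcg(seed=7, a=5, c=3, m=16)
zs = [next(gen)[0] for _ in range(16)]
print(zs)  # every integer 0..15 appears once before the sequence repeats
```

With these values the generator visits all 16 possible integers before returning to the seed, so it has the maximum period m.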

Linear Congruential Generator
Rules for full period:
(a) C is relatively prime to m; that is, no integer
other than 1 exactly divides both C and m
(b) Every prime factor of m is also a prime factor
of a - 1
(c) If m is exactly divisible by 4, then a - 1 must
be exactly divisible by 4
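The three full-period conditions are easy to check mechanically. The sketch below is one possible way to do it; the helper names `prime_factors` and `has_full_period` are hypothetical.

```python
from math import gcd

def prime_factors(n):
    """Return the set of prime factors of n by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(a, c, m):
    """Check the three full-period conditions for Z_i = (a*Z_{i-1} + c) mod m."""
    relatively_prime = gcd(c, m) == 1                                     # C and m share no factor
    factors_divide = all((a - 1) % p == 0 for p in prime_factors(m))     # prime factors of m divide a - 1
    multiple_of_four = (m % 4 != 0) or ((a - 1) % 4 == 0)                # the divisible-by-4 rule
    return relatively_prime and factors_divide and multiple_of_four

print(has_full_period(5, 3, 16))  # toy mixed generator
print(has_full_period(5, 0, 16))  # C = 0: the first condition fails
```

The second call shows why a multiplicative generator (C = 0) can never reach the full period m.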

Linear Congruential Generator
A full period does NOT mean always a
good random number generator

Multiplicative Generators
$Z_i = a Z_{i-1} \bmod m$
$Z_0$ = seed

Saves an addition, more popular

Multiplicative Generators
C = 0
m divides both m and C
Condition (a) is violated
Not full period
P = m - 1 is the largest available period
Multiplicative Generators
$2^b$ is not a good choice for m:
only a fraction of the possible numbers can be generated (the period is at most $2^{b-2}$)
Let m = $2^b - 1$
Testing a random number generator
Testing the distribution
Generate 1000 or more observations
$\chi^2$ test or K-S test for U(0, 1)
Use 100 intervals
Test for independence
Runs up
Tests designed to compare observed and
expected distribution
E(X) = 0.5, V(X) = 1/12, where a = 0, b = 1
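As a rough illustration of the distribution test, the following Python sketch computes the chi-square statistic for 1000 observations over 100 equal-width intervals. The function name is hypothetical, Python's built-in generator stands in for the generator under test, and the statistic must still be compared with a tabulated chi-square critical value with k - 1 degrees of freedom.

```python
import random

def chi_square_uniform(us, k=100):
    """Chi-square statistic comparing observed and expected counts in k equal intervals."""
    counts = [0] * k
    for u in us:
        counts[min(int(u * k), k - 1)] += 1  # interval index of u
    expected = len(us) / k                   # each interval should hold n / k values
    return sum((observed - expected) ** 2 / expected for observed in counts)

rng = random.Random(1)  # fixed seed so the run can be repeated
observations = [rng.random() for _ in range(1000)]
statistic = chi_square_uniform(observations)
print(f"chi-square statistic = {statistic:.1f} on 99 degrees of freedom")
```

A perfectly even spread gives a statistic of 0, while a generator that puts everything in one interval gives a very large value.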
Random variate generation
Assume a random number generator is
available to generate $U_i \sim U(0, 1)$
Goal: generate $X_i$ from a specified
distribution f(x), p(x), or F(x)
Three methods
Inverse transformation method
Convolution method
Acceptance/Rejection method
Random variate generation
Apply these methods to the five
distributions we are using in this class
Uniform
Triangular
Exponential
Normal
Poisson
Inverse transformation method
General idea: use the CDF
Select $U_i$
Find the corresponding $x_i$
That is, $x_i = F^{-1}(U_i)$
Advantage of inverse transformation
method
One $U_i$ per $x_i$
Disadvantage
The CDF may not always exist in closed form
Inverse transformation method
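Exponential distribution
$f(x) = \lambda e^{-\lambda x}$, $x > 0$
$F(x) = 1 - e^{-\lambda x}$, $x > 0$
$U_i = F(X_i) = 1 - e^{-\lambda X_i}$
$1 - U_i = e^{-\lambda X_i}$
$\ln(1 - U_i) = -\lambda X_i$
$X_i = -(1/\lambda) \ln(1 - U_i)$, or equivalently $X_i = -(1/\lambda) \ln(U_i)$, because $1 - U_i$ is also U(0, 1)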
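The derivation translates directly into code. In this sketch the function name and the rate value 2.0 are illustrative assumptions; the form with $1 - U_i$ is used because Python's `random()` can return 0 but never 1, which keeps the logarithm defined.

```python
import math
import random

def exponential_variate(lam, u):
    """Inverse-transform a U(0,1) number u into an exponential variate with rate lam."""
    return -math.log(1.0 - u) / lam

rng = random.Random(12345)  # fixed seed so the run is repeatable
sample = [exponential_variate(2.0, rng.random()) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
print(f"sample mean = {sample_mean:.3f} (theoretical mean 1/lambda = 0.5)")
```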
Inverse transformation method
Triangular distribution
$F(x) = \begin{cases} \dfrac{(x-a)^2}{(b-a)(c-a)}, & a \le x \le b \\ 1 - \dfrac{(c-x)^2}{(c-b)(c-a)}, & b \le x \le c \end{cases}$
Is $u_i \le \dfrac{b-a}{c-a}$?
Yes: $x_i = a + \sqrt{(b-a)(c-a)\,u_i}$
No: $x_i = c - \sqrt{(c-b)(c-a)(1-u_i)}$
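The yes/no branch for the triangular distribution can be sketched as a small function, taking a as the minimum, b as the mode, and c as the maximum. The function name and the test values a = 0, b = 1, c = 3 are hypothetical.

```python
import math

def triangular_variate(a, b, c, u):
    """Inverse-transform u in [0, 1] into a triangular variate (min a, mode b, max c)."""
    if u <= (b - a) / (c - a):
        # u falls on the rising part of the CDF
        return a + math.sqrt((b - a) * (c - a) * u)
    # u falls on the falling part of the CDF
    return c - math.sqrt((c - b) * (c - a) * (1.0 - u))

# The breakpoint u = (b - a)/(c - a) maps to the mode b
for u in (0.0, 1.0 / 3.0, 1.0):
    print(u, triangular_variate(0.0, 1.0, 3.0, u))
```

The two branches meet at the mode, so the generator is continuous in u.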
Convolution Method
Applicable to situations where the random
variable of interest can be expressed as a
sum of other random variables that are IID
(independent and identically distributed)

$X = Y_1 + Y_2 + \cdots + Y_n$
Idea: generate $Y_1, \ldots, Y_n$ and add these up
to calculate X

Convolution Method
Normal distribution
Focus: generating $Z_i \sim N(0, 1)$
$Z_i = \dfrac{x_i - \mu}{\sigma}$, so $x_i = \mu + \sigma Z_i \sim N(\mu, \sigma)$
$f(z) = \dfrac{1}{\sqrt{2\pi}} e^{-z^2/2}$

Generating $Z_i$
Inverse transformation: F(x) does not exist in closed form
Acceptance/Rejection: not bounded
Convolution Method
Normal distribution
Generate $U_i$
Generate $Z_i$
Then, since $Z_i = \dfrac{x_i - \mu}{\sigma}$ with $Z_i \sim N(0, 1)$,
$X_i = \mu + \sigma Z_i$
Acceptance/Rejection Method
Applicable to distribution functions that
are hard to integrate
Idea
Find a majorizing function t(x), where t(x) >= f(x)
Sample a value of x from t(x); call it x*
Sample $U_i$; if $U_i \le f(x^*) / t(x^*)$, accept x*
Simplification: for this class we will
always use a rectangular majorizing function
9.3 Random Numbers and Monte
Carlo Simulation
The procedure of generating these times from the
given probability distributions is known as
sampling from probability distributions, or
random variate generation, or Monte Carlo
sampling.
We will discuss several different methods of
sampling from discrete distributions.
The principle of sampling from discrete
distributions is based on the frequency
interpretation of probability.

In addition to obtaining the right frequencies, the
sampling procedure should be independent; that is,
each generated service time should be independent
of the service times that precede it and follow it.
This procedure of segmentation and using a roulette
wheel is equivalent to generating integer random
numbers between 00 and 99.
This follows from the fact that each random number
in a sequence has an equal probability of showing
up, and each random number is independent of the
numbers that precede and follow it.

A random number, $R_i$, is defined as an
independent random sample drawn from a
continuous uniform distribution whose
probability density function (pdf) is given
by
$f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
Random Number Generators
Since our interest in random numbers is for use
within simulations, we need to be able to generate
them on a computer.
This is done by using mathematical functions called
random number generators.
Most random number generators use some form of a
congruential relationship. Examples of such
generators include the linear congruential generator, the
multiplicative generator, and the mixed generator.
The linear congruential generator is by far the most
widely used.

Each random number generated using these
methods will be a decimal number between 0
and 1.
Random numbers generated using congruential
methods are called pseudorandom numbers.
Random number generators must have these
important characteristics:
1. The routine must be fast
2. The routine should not require a lot of core storage
3. The random numbers should be replicable; and
4. The routine should have a sufficiently long cycle

Most programming languages have built-in
library functions that provide random (or
pseudorandom) numbers directly.
Computer Generation of Random
Numbers
We now take the method of Monte Carlo
sampling a stage further and develop a procedure
using random numbers generated on a computer.
The idea is to transform the U(0,1) random
numbers into integer random numbers between
00 and 99 and then to use these integer random
numbers to achieve the segmentation by
numbers.
We now formalize this procedure and use it to
generate random variates for a discrete random
variable.

The procedure consists of two steps:
1. We develop the cumulative probability
distribution (cdf) for the given random
variable, and
2. We use the cdf to allocate the integer random
numbers directly to the various values of the
random variables.
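The two steps can be sketched in Python as follows. The service times and probabilities are made-up illustration values, and the sketch compares a U(0,1) number with the cdf directly, which is equivalent to allocating the integers 00 to 99 to the values when the probabilities are multiples of 0.01.

```python
import bisect
from itertools import accumulate

def make_discrete_sampler(values, probs):
    """Step 1: build the cdf. Step 2: map a U(0,1) number to the matching value."""
    cdf = list(accumulate(probs))
    def sample(u):
        index = bisect.bisect_right(cdf, u)      # first value whose cdf exceeds u
        return values[min(index, len(values) - 1)]
    return sample

# Hypothetical service times (minutes) with probabilities 0.2, 0.5, 0.3;
# this corresponds to random numbers 00-19, 20-69, and 70-99
service_time = make_discrete_sampler([1, 2, 3], [0.2, 0.5, 0.3])
print(service_time(0.10), service_time(0.50), service_time(0.95))
```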

9.4 An Example of Monte Carlo
Simulation
The book uses a Monte Carlo simulation to
simulate a news vendor problem.
The procedure in this simulation is different from
the queuing simulation, in that the present
simulation does not evolve over time in the same
way.
Here, every day is an independent simulation.
Such simulations are commonly referred to as
Monte Carlo simulations.
9.5 Simulations with Continuous
Random Variables
In many simulations, it is more realistic and
practical to use continuous random variables.
We present and discuss several procedures for
generating random variates from continuous
distributions.
The basic principle is similar to the discrete case.
We first generate a U(0,1) random number and
then transform it into a random variate from the
specified distribution.

The selection of a particular algorithm will
depend on the distribution from which we want
to generate, taking into account such factors as
the exactness of the random variables, the
computations and storage efficiencies, and the
complexity of the algorithm.
The two most commonly used algorithms are the
inverse transformation method (ITM) and the
acceptance-rejection method (ARM).
Inverse Transformation Method
The inverse transformation method is generally
used for distributions whose cumulative
distribution function can be obtained in closed
form.
Examples include the exponential, the uniform,
the triangular, and the Weibull distributions.
For distributions whose cdf does not exist in
closed form, it may be possible to use some
numerical method, such as a power-series
expansion, within the algorithm to evaluate the
cdf.

The ITM is relatively easy to describe and
execute.
It consists of the following steps:
Step 1: Given the probability density function f(x) for a
random variable X, obtain the cumulative distribution
function F(x) as
$F(x) = \int_{-\infty}^{x} f(t)\,dt$
Step 2: Generate a random number r.
Step 3: Set F(x) = r and solve for x.

We consider the distribution given by the function
$f(x) = \begin{cases} x/2, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$
A function of this type is called a ramp function.
To obtain random variates from the distribution
using the inverse transformation method, we first
compute the cdf as
$F(x) = \int_0^x \dfrac{t}{2}\,dt = \dfrac{x^2}{4}, \quad 0 \le x \le 2$

In Step 2, we generate a random number r.
Finally, in Step 3, we set F(x) = r and solve for x:
$\dfrac{x^2}{4} = r$, so $x = \pm 2\sqrt{r}$
Since the service times are defined only for positive
values of x, a service time of $x = -2\sqrt{r}$ is not feasible,
which leaves $x = 2\sqrt{r}$ as the
solution for x. This equation is called a random
variate generator or a process generator.
Thus, to obtain a service time, we first generate a
random number and then transform it using the
preceding equation.

As this example shows, the major
advantage of the inverse transformation
method is its simplicity and ease of
application.
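The ramp-function generator is a one-line transformation. The sketch below, with a hypothetical function name, also checks it against the cdf: since F(1) = 1/4, about a quarter of the generated service times should be at most 1.

```python
import math
import random

def ramp_variate(r):
    """Process generator for f(x) = x/2 on [0, 2]: solve x^2 / 4 = r for x >= 0."""
    return 2.0 * math.sqrt(r)

rng = random.Random(7)  # fixed seed so the run is repeatable
times = [ramp_variate(rng.random()) for _ in range(20000)]
share_at_most_one = sum(t <= 1.0 for t in times) / len(times)
print(f"share of service times <= 1: {share_at_most_one:.3f} (F(1) = 0.25)")
```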
Acceptance Rejection Method
There are several important distributions,
including the Erlang (used in queuing models)
and the beta (used in PERT), whose cumulative
distribution functions do not exist in closed form.
For these distributions, we must resort to other
methods of generating random variates, one of
which is the acceptance rejection method
(ARM).
This method is generally used for distributions
whose domains are defined over finite intervals.


Given a distribution whose pdf, f(x), is defined
over the interval $a \le x \le b$, the algorithm consists
of the following steps:
Step 1: Select a constant M such that M is the largest
value of f(x) over the interval [a, b].
Step 2: Generate two random numbers, $r_1$ and $r_2$.
Step 3: Compute $x^* = a + (b - a) r_1$. (This ensures
that each member of [a, b] has an equal chance to be
chosen as x*.)
Step 4: Evaluate the function f(x) at the point x*. Let
this be f(x*).

Step 5: If $r_2 \le \dfrac{f(x^*)}{M}$,
deliver x* as a random variate from the distribution
whose pdf is f(x). Otherwise, reject x* and go back to
Step 2.
Note that the algorithm continues looping back to
Step 2 until a random variate is accepted.
This may take several iterations. For this reason,
the algorithm can be relatively inefficient.
The efficiency, however, is highly dependent on
the shape of the distribution.
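Steps 1 through 5 can be sketched as a loop. In this illustration the target density is the ramp function f(x) = x/2 on [0, 2], for which M = 1; the function names are hypothetical and the `rng` argument is just a convenience for repeatable runs.

```python
import random

def accept_reject(f, a, b, M, rng):
    """Acceptance-rejection with a constant bound M >= f(x) on [a, b]."""
    while True:
        r1, r2 = rng.random(), rng.random()  # Step 2
        x_star = a + (b - a) * r1            # Step 3: uniform candidate on [a, b]
        if r2 <= f(x_star) / M:              # Steps 4 and 5: accept with probability f(x*)/M
            return x_star
        # otherwise reject x* and loop back to Step 2

def ramp_pdf(x):
    """Illustrative target density f(x) = x/2 on [0, 2]; its maximum is M = 1."""
    return x / 2.0

rng = random.Random(2024)
draws = [accept_reject(ramp_pdf, 0.0, 2.0, 1.0, rng) for _ in range(10000)]
share_at_most_one = sum(x <= 1.0 for x in draws) / len(draws)
print(f"share of draws <= 1: {share_at_most_one:.3f} (exact value 0.25)")
```

For this density only half of the candidates are accepted on average, which illustrates why the efficiency depends on the shape of the distribution.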

There are several ways by which the method can
be made more efficient.
One of these is to use a function in Step 1 instead
of a constant.
We now give an intuitive justification of the
validity of the ARM.
In particular, we want to show that the ARM
does generate observations from the given
random variable X.
Direct and Convolution Methods for
the Normal Distribution
Both the inverse transformation method and the
acceptance-rejection method are inappropriate for
the normal distribution, because (1) the cdf does
not exist in closed form and (2) the distribution
is not defined over a finite interval.
Other methods are available, such as an algorithm based on
convolution techniques and a direct
transformation algorithm that produces two
standard normal variates with mean 0 and
variance 1.
The Convolution Algorithm
In the convolution algorithm, we make direct use of
the Central Limit Theorem.
The Central Limit Theorem states that the sum Y of
n independent and identically distributed random
variables (say $Y_1, Y_2, \ldots, Y_n$, each with mean $\mu$ and
finite variance $\sigma^2$) is approximately normally
distributed with mean $n\mu$ and variance $n\sigma^2$.
If we want to generate a normal variate X with
mean $\mu$ and variance $\sigma^2$, we first generate Z using
this process generator and then transform it using the
relation $X = \mu + \sigma Z$. This relation is unique to the normal distribution.
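One common way to apply the convolution idea, shown here as an assumption rather than as the exact generator intended above, is to sum n = 12 U(0,1) numbers: each has mean 1/2 and variance 1/12, so the sum has mean 6 and variance 1, and subtracting 6 gives an approximately standard normal Z. The function name is hypothetical.

```python
import math
import random
from statistics import mean, stdev

def normal_by_convolution(mu, sigma, rng, n=12):
    """Approximate N(mu, sigma^2) variate from the sum of n U(0,1) numbers (n = 12 assumed)."""
    total = sum(rng.random() for _ in range(n))
    z = (total - n * 0.5) / math.sqrt(n / 12.0)  # standardize: mean n/2, variance n/12
    return mu + sigma * z                        # X = mu + sigma * Z

rng = random.Random(99)
xs = [normal_by_convolution(10.0, 2.0, rng) for _ in range(20000)]
print(f"mean = {mean(xs):.2f}, standard deviation = {stdev(xs):.2f}")
```

Because the sum of 12 uniforms is bounded, this sketch can never produce a Z beyond 6 in absolute value, so the tails are only approximate.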
The Direct Method
The direct method for the normal distribution
was developed by Box and Muller (1958).
Although it is not as efficient as some of the newer
techniques, it is easy to apply and execute.
The algorithm generates two U(0,1) random
numbers, $r_1$ and $r_2$, and then transforms them into
two normal variates, each with mean 0 and
variance 1, using the direct transformation
$Z_1 = (-2 \ln r_1)^{1/2} \sin(2\pi r_2)$
$Z_2 = (-2 \ln r_1)^{1/2} \cos(2\pi r_2)$

It is easy to transform these standardized normal
variates into normal variates $X_1$ and $X_2$ from
the distribution with mean $\mu$ and variance $\sigma^2$,
using the equations
$X_1 = \mu + \sigma Z_1$
$X_2 = \mu + \sigma Z_2$
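A minimal sketch of the direct method follows. The function name, the seed, and the values mu = 10 and sigma = 2 are illustrative assumptions; `1 - random()` is used for $r_1$ so that the logarithm never receives 0.

```python
import math
import random

def box_muller(r1, r2):
    """Transform two U(0,1) numbers (r1 > 0) into two standard normal variates."""
    radius = math.sqrt(-2.0 * math.log(r1))
    z1 = radius * math.sin(2.0 * math.pi * r2)
    z2 = radius * math.cos(2.0 * math.pi * r2)
    return z1, z2

rng = random.Random(5)
r1 = 1.0 - rng.random()  # lies in (0, 1], so log(r1) is defined
r2 = rng.random()
z1, z2 = box_muller(r1, r2)
mu, sigma = 10.0, 2.0    # illustrative mean and standard deviation
x1, x2 = mu + sigma * z1, mu + sigma * z2
print(x1, x2)
```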
9.6 An Example of a Stochastic
Simulation
Cabot Inc. is a large mail order firm in Chicago.
Orders arrive into the warehouse via telephones.
At present, Cabot maintains 10 operators on-line
24 hours a day.
The operators take the orders and feed them
directly into a central computer, using terminals.
Each operator has one terminal. At present, the
company has a total of 11 terminals.
That is, if all terminals are working, there will be
1 spare terminal.

Cabot managers believe that the terminal system
needs evaluation, because the downtime of
operators due to broken terminals has been
excessive.
They feel that the problem can be solved by the
purchase of some additional terminals for the spares
pool.
It has been determined that a new terminal will cost
a total of $75 per week.
It has also been determined that the cost of terminal
downtime, in terms of delays, lost orders, and so on
is $1000 per week.

Given this information, the Cabot managers would like
to determine how many additional terminals they
should purchase.
This model is a version of the machine repair problem.
It is easy to find an analytical solution to the problem
using the birth-death processes.
However, in analyzing the historical data for the
terminals, it has been determined that although the
breakdown times can be represented by the
exponential distribution, the repair times can be
adequately represented only by the triangular
distribution.

This implies that analytical methods cannot be used
and that we must use simulation.
To simulate this system, we first require the
parameters of both the distributions.
The data show that breakdowns occur at a rate of
1 per week per terminal.
In other words, the time between breakdowns for a terminal
is exponential with a mean equal to 1 week.
Analysis of the repair times shows that this
distribution can be represented by the triangular
distribution
$f(x) = \begin{cases} -10 + 400x, & 0.025 \le x \le 0.075 \\ 50 - 400x, & 0.075 \le x \le 0.125 \end{cases}$
which has a mean of 0.075 week.

The repair staff on average can repair 13.33
terminals per week.
To find the optimal number of terminals, we
must balance the cost of the additional terminals
against the increased revenues generated as a
result of the increase in the number of terminals.
In this simulation we increase the number of
terminals in the system, n, from the present total
of 11 in increments of 1.

For this fixed value of n, we then run our simulation
model to estimate the net revenue.
Net revenue here is defined as the difference
between the increase in revenues due to the
additional terminals and the cost of these additional
terminals.
We keep on adding terminals until the net revenue
position reaches a peak.
To calculate the net revenue, we first compute the
average number of on-line terminals, $EL_n$, for a
fixed number of terminals in the system, n.

Once we have a value of $EL_n$, we can
compute the expected weekly downtime
costs, given by $1000(10 - EL_n)$.
Then the increase in revenue as a result of
increasing the number of terminals from 11
to n is $1000(EL_n - EL_{11})$. Mathematically,
we compute $EL_n$ as
$EL_n = \dfrac{\int_0^T N(t)\,dt}{T} = \dfrac{\sum_{i=1}^{m} A_i}{T}$

where
T = length of simulation
N(t) = number of terminals on-line at time t ($0 \le t \le T$)
$A_i$ = area of rectangle under N(t) between $e_{i-1}$ and $e_i$
(where $e_i$ is the time of the ith event)
m = number of events that occur in the interval [0, T]
Between time 0 and time $e_1$, the time of the first
event, the total on-line time for all the terminals is
given by $10e_1$, since each terminal is on-line for a
period of $e_1$ time units.

If we now run this simulation over T time units and
sum up the areas $A_1, A_2, A_3, \ldots$, we can get an
estimate for $EL_{10}$ by dividing this sum by T. This
statistic is called a time-average statistic.
We would like to set up the process in such a way that
it will be possible to collect the statistics to
compute the areas $A_1, A_2, A_3, \ldots$.
That is, as we move from event to event, we would
like to keep track of at least the number of terminals
on-line between the events and the time between
events.
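The bookkeeping for a time-average statistic can be sketched as follows. The event trace is invented for illustration (it is not simulation output), and `time_average` is a hypothetical helper that accumulates the rectangle areas between successive events.

```python
def time_average(initial_level, events, T):
    """Time average of a piecewise-constant N(t) over [0, T].

    events is a time-ordered list of (event_time, new_level) pairs.
    """
    area, last_time, level = 0.0, 0.0, initial_level
    for event_time, new_level in events:
        if event_time > T:
            break
        area += level * (event_time - last_time)  # rectangle A_i between two events
        last_time, level = event_time, new_level
    area += level * (T - last_time)               # final partial rectangle up to T
    return area / T

# Hypothetical trace: 10 terminals on-line, a breakdown at t = 2 with no spare,
# a repair completed at t = 3, and another breakdown at t = 6
trace = [(2.0, 9), (3.0, 10), (6.0, 9)]
el = time_average(10, trace, T=8.0)
downtime_cost = 1000 * (10 - el)
print(el, downtime_cost)
```

Here the areas are 20, 9, 30, and 18, so the time average is 77/8 and the weekly downtime cost formula can be applied to it directly.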

We first define the state of the system as the
number of terminals in the repair facility.
The only time the state of the system will change
is when there is either a breakdown or a
completion of a repair.
Therefore, there are two events in this simulation:
breakdown and completion of repairs.
To set up the simulation, our first task is to
determine the process generators for both the
breakdown and the repair times.

We use the ITM to develop the process generators.
For the exponential distribution, the process
generator is simply $x = -\log r$ (natural logarithm).
In the case of the repair times, applying the ITM gives
us
$x = 0.025 + \sqrt{0.005\,r} \quad (0 \le r \le 0.5)$
and
$x = 0.125 - \sqrt{0.005\,(1 - r)} \quad (0.5 \le r \le 1.0)$
as the process generators.

For each n, we start the simulation in the state
where there are no terminals in the repair facility.
In this state, all 10 operators are on-line and any
remaining terminals are in the spares pool.
Our first action in the simulation is to schedule
the first series of events, the breakdown times for
the terminals presently on-line.
Having scheduled these events, we next
determine the first event, the first breakdown, by
searching through the current event list.

We then move the simulation clock to the time
of this event and process this breakdown.
To process a breakdown, we take two separate
series of actions
1. Determine whether a spare is available.
2. Determine whether the repair staff is idle.
These actions are summarized in the system
flow diagram shown in the book in Figure 17.
Otherwise, we process a completion of a repair.

To process the completion of a repair, we also
undertake two series of actions.
1. At the completion of a repair, we have an additional
working terminal, so we determine whether the terminal
goes directly to an operator or to the spares pool.
2. We check the repair queue to see whether any terminals
are waiting to be repaired.
We proceed with the simulation by moving from
event to event until the termination time T.
At this time, we calculate all the relevant measures
of performance from the statistical counters.

Our key measure is the net revenue for the
current value of n.
If this revenue is greater than the revenue for a
system with n-1 terminals, we increase the value
of n by 1 and repeat the simulation with n +1
terminals in the system.
Otherwise, the net revenue has reached a peak.
The simulation outlined in this example can be
used to analyze other policy options that
management may have.

The simulation model provides a very
flexible mechanism for evaluating
alternative policies.
