
Chapter 3: Modeling Process Quality

INDE 6363
Dr. Elias Keedy

Introduction

Probability distributions and statistical inference methods
Modeling the output of a process
Describing Variation
Graphical methods
Numerical descriptions
Probability distributions
Important Probability Distributions
Discrete distributions
Continuous distributions
Some Useful Approximations

Quality and Variability

Definition of quality

How to improve quality?

How to describe variation?


Graphical methods

Numerical descriptions

Probability distributions

Describing Variation (I)
Graphical Methods
1. Histogram (Section 3.1.2)
2. Box Plot (Section 3.1.4)
What properties can be displayed? (Example: Ex 3-4)
Location (i.e., center) of the data
Spread of the data
Shape (i.e., symmetric, short-tail) of the data

3. Probability Plot (Section 3.4)


Graphical method for determining whether the data conform to a hypothesized distribution
If the data fall approximately along a straight line, the hypothesized distribution describes the data adequately

Exercise: Probability Plots

[Figure: four probability plots, panels A, B, C, and D]

Describing Variation (II)
Numerical Descriptions
Measures of Location (Central Tendency)
Mean
Median

Measures of Spread (Dispersion)


Variance
Standard deviation

Measures of Shape
Skewness (Symmetry)
Kurtosis (Heaviness of tails)

Useful Results on Mean and Variance
If X is a random variable and a and b are constants, then
$E[b] = b$, $V[b] = 0$
$E[aX] = aE[X]$, $V[aX] = a^2 V[X]$
$E[aX + b] = aE[X] + b$, $V[aX + b] = a^2 V[X]$
$E[X + Y] = E[X] + E[Y]$, $V[X] = E[(X - \mu)^2] = E[X^2] - \mu^2$, where $\mu = E[X]$

For independent random variables X and Y,

$E[XY] = E[X]E[Y]$, $V[X + Y] = V[X] + V[Y]$
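
The rules above can be checked exactly on a small discrete distribution. The sketch below is only an illustration: the fair six-sided die, the constants a = 3 and b = 2, and the helper names `mean` and `variance` are assumptions chosen for the check, not part of the slides.

```python
from fractions import Fraction

# Assumed example distribution: a fair six-sided die, stored as {value: probability}.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pmf):
    """E[X] for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V[X] = E[(X - mu)^2]."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

a, b = 3, 2  # arbitrary constants for the check

# Distribution of aX + b.
scaled = {a * x + b: p for x, p in die.items()}

# Distribution of X + Y for two independent dice (joint probability = product).
total = {}
for x, px in die.items():
    for y, py in die.items():
        total[x + y] = total.get(x + y, 0) + px * py

print(mean(scaled) == a * mean(die) + b)           # E[aX + b] = aE[X] + b
print(variance(scaled) == a ** 2 * variance(die))  # V[aX + b] = a^2 V[X]
print(variance(total) == 2 * variance(die))        # V[X + Y] = V[X] + V[Y]
```

Using `Fraction` keeps the arithmetic exact, so the equalities hold without rounding tolerance.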

Describing Variation (III)
Probability Distributions
Discrete:
Probability Mass Function (PMF): p(x)
Cumulative Distribution Function (CDF): F(x)

Continuous:
Probability Density Function (PDF): f(x)
Cumulative Distribution Function (CDF): F(x)

[Figure: p(x) versus x and f(x) versus x (with an interval from a to b marked), each above its CDF F(x), which rises to 1]

Important Distributions
Discrete Probability Distributions
Hypergeometric distribution
Binomial distribution
Poisson distribution

Continuous Probability Distributions


Normal distribution
Sampling distributions
Chi-Square distribution
Student t distribution
F distribution

Hypergeometric Distribution
[Figure: a lot of N items in total, of which D are items of interest; n items are selected without replacement, and x of them are items of interest, so x ~ Hypergeometric]

When to use this model in quality control?


The number of defective items (x) in a sample (n) that is selected
randomly without replacement from a lot of N items in which D items
are defective (used in acceptance sampling)

Hypergeometric Distribution
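$p(x) = \dfrac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}$, $x = 0, 1, \ldots, \min(n, D)$

$\mu = \dfrac{nD}{N}$, $\sigma^2 = \dfrac{nD}{N}\left(1 - \dfrac{D}{N}\right)\left(\dfrac{N-n}{N-1}\right)$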

Example: A lot of size N = 30 contains 3 nonconforming units. What
is the probability that a sample of 5 units selected at random contains
exactly one nonconforming unit? (Excel: HYPGEOMDIST)
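
As a rough cross-check of this example outside Excel, the hypergeometric formula can be evaluated directly. This is a minimal sketch using only Python's standard library; the function name `hypergeom_pmf` is a hypothetical helper, not something the slides define.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """P(exactly x items of interest in a sample of n drawn without replacement
    from a lot of N items containing D items of interest)."""
    if x < 0 or x > min(n, D) or n - x > N - D:
        return 0.0
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

# Values from the example: lot of 30, 3 nonconforming, sample of 5.
N, D, n = 30, 3, 5
p_one = hypergeom_pmf(1, N, D, n)

# Mean and variance from the formulas for the hypergeometric distribution.
mu = n * D / N
sigma2 = mu * (1 - D / N) * (N - n) / (N - 1)

print(f"P(x = 1) = {p_one:.4f}")
print(f"mean = {mu:.3f}, variance = {sigma2:.4f}")
```

The probability comes out near 0.37, and the probabilities over x = 0, ..., 3 sum to one, which is a quick sanity check on the helper.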

Binomial Distribution
Bernoulli trials:
1) Two mutually exclusive outcomes
2) All trials statistically independent
3) Constant probability of success p
Binomial distribution:
The number of "successes", x, in n Bernoulli trials follows a binomial distribution.
When to use this model in quality control?
Sampling from an infinitely large population. The constant p
usually represents the fraction of defective or nonconforming items
in the population (used in acceptance sampling)

Binomial Distribution
$p(x) = \binom{n}{x} p^x (1-p)^{n-x}$, $x = 0, 1, 2, \ldots, n$, $0 \le p \le 1$

$E(x) = np$, $V(x) = np(1-p)$ [Note: $V(x) < E(x)$]

Example: A lot of size N = 30 contains 3 nonconforming units. What
is the probability that five units selected at random with replacement
contain exactly one nonconforming unit? (Excel: BINOMDIST)
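
A minimal sketch of the same calculation in Python, assuming that sampling with replacement gives a constant p = 3/30 on every draw. The helper name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(exactly x successes in n independent Bernoulli trials with success probability p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# With replacement, each draw is nonconforming with probability 3/30.
n, p = 5, 3 / 30
p_one_binom = binom_pmf(1, n, p)

print(f"P(x = 1) = {p_one_binom:.5f}")
print(f"E(x) = {n * p:.2f}, V(x) = {n * p * (1 - p):.2f}")
```

The result is about 0.328, somewhat lower than the without-replacement answer for the same lot, because here the sample is a sizeable fraction of the lot (n/N = 5/30).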

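Estimation of the Binomial Distribution Parameter
$\hat{p}$ is the ratio of the observed number of defective or
nonconforming items in a sample, x, to the sample size n:

$\hat{p} = \dfrac{x}{n}$

$\mu_{\hat{p}} = p$, $\sigma^2_{\hat{p}} = \dfrac{p(1-p)}{n}$

The probability distribution of $\hat{p}$ is obtained from the binomial:

$P\{\hat{p} \le a\} = P\left\{\dfrac{x}{n} \le a\right\} = P\{x \le na\} = \sum_{x=0}^{[na]} \binom{n}{x} p^x (1-p)^{n-x}$

where $[na]$ denotes the largest integer less than or equal to $na$.
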

Examples
1. A production process operates with 1% nonconforming output. Every
hour a sample of 25 units of product is taken, and the number of
nonconforming units counted. If one or more nonconforming units
are found, the process is stopped and the quality control technician
must search for the cause of nonconforming production. Evaluate
the performance of this decision rule.

2. A sample of 100 units is selected from a production process that is
1% nonconforming. What is the probability that $\hat{p}$ will exceed the
true fraction nonconforming by k standard deviations, where k = 1, 2,
and 3?
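
One possible way to work both examples numerically is sketched below. It assumes the sample counts are binomial, reads "performance of the decision rule" in Example 1 as the chance of stopping a process that is running at its usual 1%, and uses a hypothetical helper `binom_cdf`.

```python
from math import comb, floor, sqrt

def binom_cdf(k, n, p):
    """P(x <= k) for x ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# Example 1: stop the process if one or more nonconforming units appear in 25.
p_stop = 1 - binom_cdf(0, 25, 0.01)
print(f"Example 1: P(stop per sample) = {p_stop:.4f}")

# Example 2: P(p_hat > p + k*sigma) with n = 100, p = 0.01.
n, p = 100, 0.01
sigma = sqrt(p * (1 - p) / n)          # standard deviation of p_hat
tail = {}
for k in (1, 2, 3):
    # p_hat > p + k*sigma is the same event as x > n*(p + k*sigma)
    cutoff = floor(n * (p + k * sigma))
    tail[k] = 1 - binom_cdf(cutoff, n, p)
    print(f"Example 2: k = {k}, P(x > {cutoff}) = {tail[k]:.4f}")
```

Under these assumptions the rule in Example 1 stops the process in roughly 22% of samples even though nothing has changed, which suggests the rule is too sensitive for a process at 1% nonconforming.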

Poisson Distribution
Assumptions:
Occurrences are statistically independent
Occurrences are equally likely to occur within any unit of
time/area
The average occurrence rate (per unit) is a known constant $\lambda$
The number of random events occurring during a
specific time period follows a Poisson distribution
$p(x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$, $x = 0, 1, \ldots$
$\mu = \lambda$, $\sigma^2 = \lambda$
When to use this model in quality control?
Number of defects on a unit
Number of random occurrences in a period of time

Poisson Distribution: Example
Surface-finish defects in a small electric appliance occur at random
with a mean rate of 0.1 defects per unit. Find the probability that a
randomly selected unit will contain at least one surface-finish defect.

(Minitab: Calc > Probability Distributions > Poisson)
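
For readers without Minitab, the same probability can be sketched with the standard library. The helper name `poisson_pmf` is hypothetical; the rate 0.1 defects per unit is taken from the example.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(exactly x occurrences) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 0.1  # mean number of surface-finish defects per unit

# "At least one defect" is the complement of "no defects".
p_at_least_one = 1 - poisson_pmf(0, lam)

print(f"P(x >= 1) = {p_at_least_one:.4f}")
```

Working through the complement avoids summing an infinite series: only p(0) is needed.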


Relationships of Three Discrete Distributions

Hypergeometric → Binomial → Poisson

H: Sample without replacement from a finite population
B: Sample with replacement, or sample from an infinitely large population
B: Finite, constant number n of trials
P: Infinitely many possible places/times of occurrence, with a very small and constant probability of occurrence at each place

Some Useful Approximations
Hypergeometric, Binomial, and Poisson

N: population size
n: sample size

Hypergeometric → Binomial: if n/N < 0.1, use p = D/N
Binomial → Poisson: if n is large and p < 0.1, use $\lambda = np$

Exercises: What is the distribution of x?
1. A production process operates with 2% nonconforming output.
Every hour a sample of 50 units of product is taken, and the
number of nonconforming units counted as x.

2. 60% of pulleys are produced using Lathe #1, and 40% are produced
using Lathe #2. A random sample of four production parts
contains x parts coming from Lathe #1.

3. Circuit boards are produced in lots of size 20. The sample of size
3 is drawn from the lot at one time and tested. The lot contains 3
nonconforming boards and x is the number of nonconforming
boards in the sample.
4. Let x be the number of misprints on one page of a daily
newspaper, if the average misprints per page is 2.

5. There are 1000 fish in a pond, and 100 of them are tagged. x is the
number of tagged fish among 5 randomly caught fish.

Exercises: What is the distribution of x?
6. Accidents in a building are assumed to occur randomly with an
average rate of 36 per year. There will be x accidents in the
coming February.

7. A book of 200 pages has 2 error pages. There are x error pages
in a random selection of 10 pages.

8. The probability that a salesman will make a sale on one call is 0.3.
Each day, this salesman makes 10 calls. Let x denote the number
of sales made in one day.

9. The average number of flaws per running yard of a certain type of
cotton fabric is 0.01. Let x be the number of flaws in a 100-yard
roll of this fabric.

10. The probability that a basketball player will make a free throw is
0.7. Let x denote the number of free throws he will make in a
game of seven free throw attempts.
Normal Distribution
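[Figure: density f(x) of x ~ N(μ, σ²), and density φ(z) of the standard normal z = (x - μ)/σ ~ N(0, 1)]

$f(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}$, $-\infty < x < \infty$

$E(x) = \mu$, $V(x) = \sigma^2$

$\Pr(\mu - \sigma \le x \le \mu + \sigma) = \Pr(-1 \le z \le 1) = 68.26\%$
$\Pr(\mu - 2\sigma \le x \le \mu + 2\sigma) = \Pr(-2 \le z \le 2) = 95.46\%$
$\Pr(\mu - 3\sigma \le x \le \mu + 3\sigma) = \Pr(-3 \le z \le 3) = 99.73\%$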

Normal Distribution
Some useful formulas:
$P\{z \ge a\} = 1 - P\{z \le a\}$
$P\{z \le -a\} = P\{z \ge a\}$
$P\{z \ge -a\} = P\{z \le a\}$

Example:
A quality characteristic of a product is normally distributed with mean
$\mu$ and standard deviation one. Specifications on the characteristic
are 6 < x < 8. A unit that falls within specifications on this quality
characteristic results in a profit of C0 = 1. However, if x < 6 or if x > 8,
the profit is 0. Find the value of $\mu$ that maximizes the expected
profit.
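
A numerical sketch of this example follows. It assumes the expected profit is C0 times the probability of falling within specifications, and it searches a grid of candidate means (the grid from 6 to 8 in steps of 0.01 is an arbitrary choice for illustration, as is the helper name `expected_profit`).

```python
from statistics import NormalDist

def expected_profit(mu, lower=6.0, upper=8.0, sigma=1.0, c0=1.0):
    """Expected profit = c0 * P(lower < x < upper) for x ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return c0 * (dist.cdf(upper) - dist.cdf(lower))

# Candidate process means between the two specification limits.
candidates = [6 + i / 100 for i in range(201)]
best_mu = max(candidates, key=expected_profit)
best_profit = expected_profit(best_mu)

print(f"best mu = {best_mu:.2f}, expected profit = {best_profit:.4f}")
```

The search lands on the midpoint of the specification limits, which matches the symmetry argument: with a symmetric density and equal penalties on both sides, centering the process maximizes the probability of conformance.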

Linear Combinations

If x1 and x2 are independent, normally distributed random variables
with $x_1 \sim N(\mu_1, \sigma_1^2)$ and $x_2 \sim N(\mu_2, \sigma_2^2)$, then $y = x_1 + x_2$ also follows the
normal distribution, with $y \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.

Example:
Three shafts are made and assembled in a linkage. The length of
each shaft, in centimeters, is distributed as follows:
Length of Shaft 1 ~ N(75, 0.09)
Length of Shaft 2 ~ N(60, 0.16)
Length of Shaft 3 ~ N(25, 0.25)
a) What is the distribution of the length of the linkage?
b) What is the probability that the length of the linkage will be
longer than 160.5 cm? Longer than 159.5 cm?
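
The shaft example can be sketched as follows, assuming the three lengths are independent and that the second argument of N(·, ·) is a variance, as in the notation used for linear combinations. Note that `statistics.NormalDist` takes a standard deviation, so the summed variance is square-rooted.

```python
from math import sqrt
from statistics import NormalDist

# Means (cm) and variances (cm^2) of the three shaft lengths.
means = [75, 60, 25]
variances = [0.09, 0.16, 0.25]

# a) For independent normals, means add and variances add.
linkage_mean = sum(means)
linkage_var = sum(variances)
linkage = NormalDist(linkage_mean, sqrt(linkage_var))

# b) Upper-tail probabilities for the linkage length.
p_longer_160_5 = 1 - linkage.cdf(160.5)
p_longer_159_5 = 1 - linkage.cdf(159.5)

print(f"linkage ~ N({linkage_mean}, {linkage_var:.2f})")
print(f"P(length > 160.5) = {p_longer_160_5:.4f}")
print(f"P(length > 159.5) = {p_longer_159_5:.4f}")
```

Because 160.5 and 159.5 sit symmetrically around the mean of 160, the two probabilities are complements of each other.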

* Central Limit Theorem *
Let x1, x2, ..., xn be a random sample of size n taken from a
population with mean $\mu$ and variance $\sigma^2$, and let $y = x_1 + x_2 + \cdots + x_n$.
Then, when n is large enough, approximately

$y \sim N(n\mu, n\sigma^2)$

Implication: The sum of n i.i.d. random variables is
approximately normal when n is large enough, regardless of
the distributions of the individual variables.
What is the sampling distribution of the sample mean $\bar{x}$?

The normal distribution is widely used in quality engineering
due to the Central Limit Theorem.
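
A small simulation can make the theorem concrete. Everything in this sketch is an illustrative assumption: the exponential population (mean 1, variance 1), n = 30 terms per sum, 20,000 replicates, and the fixed random seed.

```python
import random
from statistics import fmean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

n = 30                 # number of terms in each sum (assumed for illustration)
replicates = 20_000    # number of simulated sums
mu, sigma2 = 1.0, 1.0  # mean and variance of the Exponential(rate=1) population

# Each entry is y = x1 + x2 + ... + xn for a fresh sample from a skewed population.
sums = [sum(random.expovariate(1.0) for _ in range(n)) for _ in range(replicates)]

sample_mean = fmean(sums)
sample_var = pvariance(sums)

# Fraction of sums within one standard deviation of n*mu; a normal gives about 0.68.
sd = (n * sigma2) ** 0.5
within_one_sd = sum(abs(y - n * mu) <= sd for y in sums) / replicates

print(f"mean of sums     = {sample_mean:.2f} (theory: {n * mu:.0f})")
print(f"variance of sums = {sample_var:.2f} (theory: {n * sigma2:.0f})")
print(f"within one sd    = {within_one_sd:.3f}")
```

Even though the individual exponential values are strongly skewed, the sums should have a mean and variance close to nμ and nσ², and the one-sigma coverage should be close to the normal value.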
Sampling Distributions
Three important sampling distributions based on
the normal distribution:
Chi-square distribution

t distribution

F distribution

Chi-Square Distribution
The chi-square distribution is associated with the sum of squared
standard normal random variables.
If x1, x2, ..., xn are independent standard normal random variables
and $y = x_1^2 + x_2^2 + \cdots + x_n^2$,
then y follows $\chi^2_n$, a chi-square distribution with n degrees of
freedom.

$f(y) = \dfrac{1}{2^{n/2}\,\Gamma(n/2)}\, y^{(n/2)-1} e^{-y/2}$, $y > 0$

$\Gamma(n/2) = (n/2 - 1)(n/2 - 2)\cdots 3 \cdot 2 \cdot 1$ for even n
$\Gamma(n/2) = (n/2 - 1)(n/2 - 2)\cdots \frac{5}{2} \cdot \frac{3}{2} \cdot \frac{1}{2}\sqrt{\pi}$ for odd n

$E(y) = n$, $V(y) = 2n$

The most popular use of the chi-square distribution is to test
hypotheses on the variance of a normal distribution.

Student t Distribution
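If x is a standard normal random variable and y is a chi-square random
variable with k degrees of freedom, and if x and y are independent,
then $t = \dfrac{x}{\sqrt{y/k}}$ is distributed as t with k degrees of freedom.

$f(t) = \dfrac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\,\Gamma\left(\frac{k}{2}\right)} \left(\dfrac{t^2}{k} + 1\right)^{-(k+1)/2}$, $-\infty < t < \infty$

$E(t) = 0$, $V(t) = \dfrac{k}{k-2}$ for $k > 2$

As $k \to \infty$, the t distribution reduces to a standard normal distribution.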

The most popular use of the t distribution is to test hypotheses on the
mean of a normal distribution with unknown variance.

F Distribution

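If w and y are two independent chi-square random variables with u
and v degrees of freedom, respectively, then the ratio

$F_{u,v} = \dfrac{w/u}{y/v}$

is distributed as F with u numerator degrees of freedom and v
denominator degrees of freedom.

$f(x) = \dfrac{\Gamma\left(\frac{u+v}{2}\right)\left(\frac{u}{v}\right)^{u/2}}{\Gamma\left(\frac{u}{2}\right)\Gamma\left(\frac{v}{2}\right)} \cdot \dfrac{x^{(u/2)-1}}{\left[\left(\frac{u}{v}\right)x + 1\right]^{(u+v)/2}}$, $0 < x < \infty$
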

The most popular use of the F distribution is to test hypotheses on the
variances of two normal distributions.

Some Useful Approximations

Hypergeometric, Binomial, Poisson, and Normal

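N: population size
n: sample size

Hypergeometric → Binomial: if n/N < 0.1, use p = D/N
Binomial → Poisson: if n is large and p < 0.1, use $\lambda = np$
Binomial → Normal: if np > 10 and $0.1 \le p \le 0.9$, use $\mu = np$, $\sigma^2 = np(1-p)$
Poisson → Normal: if $\lambda \ge 15$, use $\mu = \lambda$, $\sigma^2 = \lambda$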

Example
A textbook has 500 pages on which typographical errors could
occur. Suppose that there are exactly 10 error pages randomly
located on those pages.
(1) Find the probability that 50 randomly selected pages will contain
at least two error pages.
(2) Calculate the desired probability in (1) using the Binomial,
Poisson, and Normal approximations. Which approximations are
satisfactory? Why?
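
One way to compare the four answers numerically is sketched below. It treats the selected pages as a sample without replacement (hypergeometric, N = 500, D = 10, n = 50), and it applies a continuity correction of 0.5 in the normal approximation; that correction is an assumption of this sketch, not something the slides specify.

```python
from math import comb, exp, sqrt
from statistics import NormalDist

N, D, n = 500, 10, 50   # pages, error pages, pages sampled
p = D / N               # binomial parameter from the approximation rule p = D/N
lam = n * p             # Poisson parameter from the rule lambda = np

def hyper_pmf(x):
    """Exact hypergeometric probability of x error pages in the sample."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

def binom_pmf(x):
    """Binomial approximation with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# P(x >= 2) = 1 - P(x = 0) - P(x = 1) under each model.
exact = 1 - hyper_pmf(0) - hyper_pmf(1)
binomial = 1 - binom_pmf(0) - binom_pmf(1)
poisson = 1 - exp(-lam) * (1 + lam)
# Normal approximation with mu = np, sigma^2 = np(1-p), continuity-corrected at 1.5.
normal = 1 - NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(1.5)

for name, value in [("exact", exact), ("binomial", binomial),
                    ("Poisson", poisson), ("normal", normal)]:
    print(f"{name:>8}: {value:.4f}")
```

Here n/N = 0.1 and p = 0.02, so the binomial and Poisson answers should sit close to the exact value, while np = 1 is far below the np > 10 guideline and the normal answer should be noticeably off.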

Example
An electronic component for a laser range-finder is produced in lots of size N = 25. An
acceptance testing procedure is used by the purchaser to protect against lots that contain
too many nonconforming components. The procedure consists of selecting five
components at random from the lot (without replacement) and testing them. If none of the
components is nonconforming, the lot is accepted.

a. If the lot contains three nonconforming components, what is the probability of lot
acceptance?

b. Calculate the desired probability in (a) using the binomial approximation. Is this
approximation satisfactory? Why or why not?

c. Suppose the lot size was N=150. Would the binomial approximation be satisfactory in
this case?

d. Suppose that the purchaser will reject the lot with the decision rule of finding one or
more nonconforming components in a sample of size n, and wants the lot to be rejected
with probability at least 0.95 if the lot contains five or more nonconforming components.
How large should the sample size n be?
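
A possible numerical treatment of parts (a) through (d) is sketched below. The helper `p_accept` is hypothetical. For part (d), the sketch assumes the binding case is a lot with exactly five nonconforming components, since a lot with more than five is even more likely to be rejected.

```python
from math import comb

def p_accept(N, D, n):
    """P(no nonconforming components in a sample of n drawn without replacement
    from a lot of N containing D nonconforming components)."""
    return comb(N - D, n) / comb(N, n)

# (a) exact acceptance probability and (b) binomial approximation, N = 25.
exact_25 = p_accept(25, 3, 5)
binom_25 = (1 - 3 / 25) ** 5

# (c) the same comparison for a lot of N = 150.
exact_150 = p_accept(150, 3, 5)
binom_150 = (1 - 3 / 150) ** 5

# (d) smallest n with P(reject) >= 0.95 when the lot holds 5 nonconforming components.
n_required = next(n for n in range(1, 26) if 1 - p_accept(25, 5, n) >= 0.95)

print(f"N = 25:  exact = {exact_25:.4f}, binomial = {binom_25:.4f}")
print(f"N = 150: exact = {exact_150:.4f}, binomial = {binom_150:.4f}")
print(f"required sample size n = {n_required}")
```

With N = 25 the sampling fraction is n/N = 0.2, above the 0.1 guideline, and the gap between the exact and binomial values is visible; with N = 150 the fraction drops to about 0.03 and the two values nearly coincide.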
