Descriptive Statistics
Measure of Location
The most common measure of central tendency is the arithmetic mean. The arithmetic mean ($\bar{y}$) of a sample is defined as the sum of the individual data points ($y_i$) divided by the number of points ($n$):
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$$
Measures of Spread
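The most common measure of spread for a sample is the standard deviation ($s_y$) about the mean:
$$s_y = \sqrt{\frac{S_t}{n-1}}$$
where $S_t$ is the total sum of the squares of the residuals between the data points and the mean:
$$S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2$$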
If a quantity is normally distributed, the range defined by $\bar{y} - s_y$ to $\bar{y} + s_y$ will encompass approximately 68% of the total measurements, and the range defined by $\bar{y} - 2s_y$ to $\bar{y} + 2s_y$ will encompass approximately 95%.
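The definitions above can be sketched in a few lines of Python. This is only an illustration: the helper name `describe` is hypothetical, the sample values are the force measurements that appear in a later example of this document, and the coefficient of variation is taken here as $s_y / \bar{y} \times 100\%$, which is the usual definition but is an assumption as far as this text is concerned.

```python
import math

def describe(y):
    """Return (mean, sample standard deviation, coefficient of variation in %)."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(y) / n
    # S_t: sum of squared deviations from the mean
    s_t = sum((yi - mean) ** 2 for yi in y)
    # Sample standard deviation uses n - 1 in the denominator
    s_y = math.sqrt(s_t / (n - 1))
    # Assumed definition: c.v. = s_y / mean, expressed as a percentage
    cv = s_y / mean * 100.0
    return mean, s_y, cv

# Force values borrowed from the regression example later in the document
force = [25, 70, 380, 550, 610, 1220, 830, 1450]
mean, s_y, cv = describe(force)
print(f"mean = {mean:.3f}, s_y = {s_y:.3f}, c.v. = {cv:.2f}%")
print(f"approx. 68% range: {mean - s_y:.1f} to {mean + s_y:.1f}")
```

The 68% range printed at the end is only meaningful if the data are roughly normally distributed, which is not checked here.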
Example:
Compute the mean, standard deviation, and coefficient of
variation for the data in the following table.
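A straight line fitted to a set of paired observations can be written as
$$y = a_0 + a_1 x + e$$
where $a_0$ is the intercept, $a_1$ is the slope, and $e$ is the error, or residual, between the model and the observations:
$$e = y - a_0 - a_1 x$$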
One strategy for fitting a best line through the data is to minimize the sum of the squares of the residual errors over all the available data:
$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$
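To obtain the best fit, we differentiate $S_r$ with respect to each coefficient and set the result equal to zero:
$$\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) = 0$$
$$\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)\, x_i = 0$$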
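$$\sum_{i=1}^{n} y_i - \sum_{i=1}^{n} a_0 - \sum_{i=1}^{n} a_1 x_i = 0$$
$$\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} a_0 x_i - \sum_{i=1}^{n} a_1 x_i^2 = 0$$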
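then, noting that $\sum_{i=1}^{n} a_0 = n a_0$, the normal equations can be written in matrix form as
$$\begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{Bmatrix}$$
Solving for the coefficients gives
$$a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$
$$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$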
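The closed-form solution of the 2 × 2 normal equations translates almost directly into code. The following is a minimal sketch, assuming plain Python lists as input; the function name `linear_fit` and the small test data set (points lying exactly on $y = 1 + 2x$) are illustrative choices, not part of the original material.

```python
def linear_fit(x, y):
    """Least-squares straight line y = a0 + a1*x from the normal equations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    # Determinant of the 2x2 normal-equation matrix
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a1 = (n * sum_xy - sum_x * sum_y) / denom
    a0 = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return a0, a1

# Hypothetical check: points that lie exactly on y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
a0, a1 = linear_fit(x, y)
print(a0, a1)
```

Because the test points have no scatter, the fit should recover the intercept and slope of the generating line.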
$$s_y = \sqrt{\frac{S_t}{n-1}}$$
$$S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
$$r^2 = \frac{S_t - S_r}{S_t}$$
Example:
In this application, force is the dependent variable ($y$) and velocity is the independent variable ($x$).
x_i   10   20   30   40   50    60   70    80
y_i   25   70  380  550  610  1220  830  1450
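As a rough illustration, the straight-line fit and $r^2 = (S_t - S_r)/S_t$ for the data above can be computed as follows. This is a sketch rather than the worked solution from the original slides; the variable names are arbitrary.

```python
x = [10, 20, 30, 40, 50, 60, 70, 80]            # velocity
y = [25, 70, 380, 550, 610, 1220, 830, 1450]    # force

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Slope and intercept from the normal equations
denom = n * sum_xx - sum_x ** 2
a1 = (n * sum_xy - sum_x * sum_y) / denom
a0 = (sum_xx * sum_y - sum_x * sum_xy) / denom

# S_t: spread about the mean; S_r: spread about the fitted line
y_mean = sum_y / n
s_t = sum((yi - y_mean) ** 2 for yi in y)
s_r = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
r2 = (s_t - s_r) / s_t

print(f"a0 = {a0:.4f}, a1 = {a1:.5f}, r^2 = {r2:.4f}")
```

Note that the fitted intercept is negative, so the straight line predicts a negative force at low velocity, which is one reason to consider the nonlinear models.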
$$y = a_2 x^{b_2}$$
$$y = a_3 \frac{x}{b_3 + x}$$
$$y = a_1 e^{b_1 x}$$
This equation can be linearized by taking its natural logarithm to yield
$$\ln y = \ln a_1 + b_1 x$$
Thus, a plot of $\ln y$ versus $x$ will yield a straight line with a slope of $b_1$ and an intercept of $\ln a_1$.
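A small sketch of this linearization is shown below. The data are synthetic, generated from assumed values $a_1 = 2$ and $b_1 = 0.5$ with no noise, purely so that the recovered parameters can be checked; `linear_fit` is a hypothetical helper implementing the straight-line normal equations.

```python
import math

def linear_fit(x, y):
    """Return (intercept, slope) of the least-squares straight line."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    denom = n * sum_xx - sum_x ** 2
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return intercept, slope

# Synthetic, noise-free data from y = 2 * exp(0.5 * x) (assumed values)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Fit ln(y) against x, then transform the intercept back
ln_y = [math.log(yi) for yi in y]
intercept, slope = linear_fit(x, ln_y)
a1 = math.exp(intercept)   # intercept is ln(a1)
b1 = slope                 # slope is b1
print(a1, b1)
```

With real, noisy data the fit minimizes the squared error in $\ln y$ rather than in $y$, so the result generally differs slightly from a direct nonlinear fit.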
$$y = a_2 x^{b_2}$$
This equation is linearized by taking its base-10 logarithm to give
$$\log y = \log a_2 + b_2 \log x$$
Thus, a plot of $\log y$ versus $\log x$ will yield a straight line with a slope of $b_2$ and an intercept of $\log a_2$. Note that a logarithm of any base can be used to linearize this model. However, as done here, the base-10 logarithm is most commonly employed.
$$y = a_3 \frac{x}{b_3 + x}$$
This equation is linearized by inverting it to give
$$\frac{1}{y} = \frac{1}{a_3} + \frac{b_3}{a_3} \frac{1}{x}$$
Thus, a plot of $1/y$ versus $1/x$ will be linear, with a slope of $b_3/a_3$ and an intercept of $1/a_3$.
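The inversion can be sketched in code as follows. The data are synthetic, generated without noise from assumed values $a_3 = 3$ and $b_3 = 2$, and `linear_fit` is a hypothetical helper for the straight-line fit; none of these appear in the original material.

```python
def linear_fit(x, y):
    """Return (intercept, slope) of the least-squares straight line."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    denom = n * sum_xx - sum_x ** 2
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return intercept, slope

# Synthetic, noise-free data from y = 3x / (2 + x) (assumed values)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0 * xi / (2.0 + xi) for xi in x]

# Fit 1/y against 1/x
inv_x = [1.0 / xi for xi in x]
inv_y = [1.0 / yi for yi in y]
intercept, slope = linear_fit(inv_x, inv_y)

a3 = 1.0 / intercept   # intercept is 1/a3
b3 = slope * a3        # slope is b3/a3
print(a3, b3)
```

The transformation requires nonzero $x$ and $y$ values, since both are inverted.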
Example:
Fit the following data using a logarithmic transformation. The general relationship between force and velocity is described by
$$F = a v^b$$
v, m/s   10   20   30   40   50    60   70    80
F, N     25   70  380  550  610  1220  830  1450
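One way to carry out this fit is to regress $\log_{10} F$ on $\log_{10} v$ and transform the intercept back. The sketch below assumes base-10 logarithms, as in the power-model discussion; the variable names are illustrative.

```python
import math

v = [10, 20, 30, 40, 50, 60, 70, 80]            # velocity, m/s
F = [25, 70, 380, 550, 610, 1220, 830, 1450]    # force, N

# Transform both variables: log F = log a + b * log v
log_v = [math.log10(vi) for vi in v]
log_F = [math.log10(Fi) for Fi in F]

n = len(v)
sum_x = sum(log_v)
sum_y = sum(log_F)
sum_xx = sum(xi * xi for xi in log_v)
sum_xy = sum(xi * yi for xi, yi in zip(log_v, log_F))

# Straight-line fit in log-log space
denom = n * sum_xx - sum_x ** 2
b = (n * sum_xy - sum_x * sum_y) / denom          # slope is the exponent b
log_a = (sum_xx * sum_y - sum_x * sum_xy) / denom  # intercept is log10(a)
a = 10 ** log_a

print(f"log a = {log_a:.4f}, a = {a:.4f}, b = {b:.4f}")
```

The fitted exponent comes out close to 2, so the force grows roughly with the square of the velocity for this data set.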
POLYNOMIAL REGRESSION
The least-squares procedure can be readily extended to fit the
data to a higher-order polynomial. For example, suppose that
we fit a second-order polynomial or quadratic:
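$$y = a_0 + a_1 x + a_2 x^2 + e$$
$$S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 x_i^2)^2$$
$$\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 x_i^2) = 0$$
$$\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 x_i^2)\, x_i = 0$$
$$\frac{\partial S_r}{\partial a_2} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 x_i^2)\, x_i^2 = 0$$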
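or
$$\begin{bmatrix} n & \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 & \sum_{i=1}^{n} x_i^3 \\ \sum_{i=1}^{n} x_i^2 & \sum_{i=1}^{n} x_i^3 & \sum_{i=1}^{n} x_i^4 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \\ \sum_{i=1}^{n} x_i^2 y_i \end{Bmatrix}$$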
Example:
Fit a second-order polynomial to the data
x 0 1 2 3 4 5
y 2.1 7.7 13.6 27.2 40.9 61.1
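A possible way to work this example is to assemble the 3 × 3 normal equations for the quadratic and solve them by Gaussian elimination. The sketch below uses only the standard library; the helper `solve` is a hypothetical, minimal elimination routine with partial pivoting, not a method prescribed by the original slides.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - s) / m[r][r]
    return sol

x = [0, 1, 2, 3, 4, 5]
y = [2.1, 7.7, 13.6, 27.2, 40.9, 61.1]

order = 2
# Entry (j, k) of the normal-equation matrix is sum(x^(j+k));
# entry j of the right-hand side is sum(x^j * y)
A = [[sum(xi ** (j + k) for xi in x) for k in range(order + 1)]
     for j in range(order + 1)]
rhs = [sum((xi ** j) * yi for xi, yi in zip(x, y)) for j in range(order + 1)]

a0, a1, a2 = solve(A, rhs)
print(f"a0 = {a0:.5f}, a1 = {a1:.5f}, a2 = {a2:.5f}")
```

Changing `order` extends the same construction to higher-order polynomials, although the normal equations become increasingly ill-conditioned as the order grows.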
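$$y = a_0 + a_1 x_1 + a_2 x_2 + e$$
$$S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i})^2$$
$$\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i}) = 0$$
$$\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i})\, x_{1,i} = 0$$
$$\frac{\partial S_r}{\partial a_2} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i})\, x_{2,i} = 0$$
$$\begin{bmatrix} n & \sum_{i=1}^{n} x_{1,i} & \sum_{i=1}^{n} x_{2,i} \\ \sum_{i=1}^{n} x_{1,i} & \sum_{i=1}^{n} x_{1,i}^2 & \sum_{i=1}^{n} x_{1,i} x_{2,i} \\ \sum_{i=1}^{n} x_{2,i} & \sum_{i=1}^{n} x_{1,i} x_{2,i} & \sum_{i=1}^{n} x_{2,i}^2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_{1,i} y_i \\ \sum_{i=1}^{n} x_{2,i} y_i \end{Bmatrix}$$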
x1 0 2 2.5 1 4 7
x2 0 1 2 3 6 2
y 5 10 9 0 3 27
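For the data above, the normal equations for $y = a_0 + a_1 x_1 + a_2 x_2$ can be assembled and solved as in the following sketch. It relies only on the standard library; `solve` is a hypothetical, minimal Gaussian-elimination helper, and treating the constant term as a column of ones is an implementation convenience rather than something stated in the original.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - s) / m[r][r]
    return sol

x1 = [0, 2, 2.5, 1, 4, 7]
x2 = [0, 1, 2, 3, 6, 2]
y = [5, 10, 9, 0, 3, 27]

# Basis columns: 1 (for a0), x1 (for a1), x2 (for a2)
columns = [[1.0] * len(y), x1, x2]

# Entry (j, k) is sum(z_j * z_k); right-hand side entry j is sum(z_j * y)
A = [[sum(p * q for p, q in zip(cj, ck)) for ck in columns] for cj in columns]
rhs = [sum(p * yi for p, yi in zip(cj, y)) for cj in columns]

a0, a1, a2 = solve(A, rhs)
print(f"a0 = {a0:.4f}, a1 = {a1:.4f}, a2 = {a2:.4f}")
```

The tabulated points happen to lie exactly on a plane, so the residuals of this fit are zero to within rounding error.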