How rounded are these pebbles ? Where did they come from ?
Statistics
Histograms
Probability
Error Analysis
Regression
What is a Statistic ?
Is this a statistic ?
In 1970, the oil refining capacity of Belgium was 32.6 million tonnes per year.
This should give you a much better idea of your beach rocks
Specimen: One object
Sample: A subset of the objects
Population: All the objects
So What is a Statistic ?
Election Polls
Obama 365
McCain 162
Election Polls
Obama 66,882,230
McCain 58,343,671
Mass (g)
374
389
395
364
224
250
378
376
330
310
$\bar{w} = \frac{1}{N} \sum_{i=1}^{N} w_i$
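As a quick check of the mean formula, the ten masses listed above can be averaged in a few lines of Python. This is only an illustrative sketch; the variable names are not from the slides.

```python
# Masses (g) of the ten pebbles listed above.
masses = [374, 389, 395, 364, 224, 250, 378, 376, 330, 310]

# Mean: the sum of all masses divided by the number of pebbles N.
n = len(masses)
mean_mass = sum(masses) / n

print(f"N = {n}, mean mass = {mean_mass:.1f} g")  # N = 10, mean mass = 339.0 g
```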
Mass (g)
225
250
310
330
364
374
376
378
389
395
399
But how much does this tell us about all the sample pebbles ?
Deviation of each pebble from the mean: $(\text{mass} - \bar{w})$
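One way to see how much the mean tells us is to look at how far each pebble sits from it. The sketch below computes the deviation of each of the eleven sorted masses from their mean. The root-mean-square summary at the end is an added assumption (it is the usual population standard deviation), not something the slide states at this point.

```python
import math

# The eleven sorted masses (g) listed above.
masses = [225, 250, 310, 330, 364, 374, 376, 378, 389, 395, 399]

mean_mass = sum(masses) / len(masses)

# Deviation of each pebble from the mean: (mass - mean).
deviations = [m - mean_mass for m in masses]

# Assumed summary: root-mean-square deviation (population standard deviation).
mean_square = sum(d ** 2 for d in deviations) / len(deviations)
rms_deviation = math.sqrt(mean_square)

for m, d in zip(masses, deviations):
    print(f"mass = {m} g, deviation = {d:+.1f} g")
print(f"mean = {mean_mass:.1f} g, RMS deviation = {rms_deviation:.1f} g")
```

The deviations always sum to zero, which is why they are squared before averaging.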
Range (g)    Number
200-235      1
236-260      3
261-285      7
286-315      9
316-335      16
336-365      22
366-385      19
386-415      14
416-435      6
436-465      2
[Histogram: Frequency versus Range (g)]
A histogram displays the pebble mass count in bins (10 bins shown)
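The binning step can be sketched in Python as follows. The bin ranges are taken from the table above; the data are the eleven sorted masses listed earlier, used only as a small example, since the histogram in the slides is built from a larger set of pebbles.

```python
# Bin ranges (g) from the histogram table: (low, high), both inclusive.
bins = [(200, 235), (236, 260), (261, 285), (286, 315), (316, 335),
        (336, 365), (366, 385), (386, 415), (416, 435), (436, 465)]

# Small example data set: the eleven sorted masses (g) listed earlier.
masses = [225, 250, 310, 330, 364, 374, 376, 378, 389, 395, 399]

# Count how many masses fall inside each bin.
counts = [sum(1 for m in masses if low <= m <= high) for low, high in bins]

# Print a simple text histogram, one row per bin.
for (low, high), count in zip(bins, counts):
    print(f"{low}-{high} g: {'#' * count} ({count})")
```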
What is Probability ?
Frequency Distribution & Probability
Range (g)    Number    Probability
200-235      1         .01
236-260      3         .03
261-285      7         .07
286-315      9         .09
316-335      16        .16
336-365      22        .22
366-385      19        .19
386-415      14        .14
416-435      6         .06
436-465      2         .02
[Plot: Probability versus Range (g)]
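The slides do not say how the probability column was obtained. A minimal sketch, assuming each probability is the bin count divided by the total number of pebbles, reproduces the listed values to two decimal places:

```python
# Bin counts from the frequency table.
counts = [1, 3, 7, 9, 16, 22, 19, 14, 6, 2]

# Total number of pebbles across all bins.
total = sum(counts)

# Assumed rule: probability of a bin = count in that bin / total count.
probabilities = [c / total for c in counts]
rounded = [round(p, 2) for p in probabilities]

print(f"total = {total}")
for c, p in zip(counts, rounded):
    print(f"count = {c:2d}, probability = {p:.2f}")
```

Before rounding, the probabilities sum to exactly 1, which is what makes them a probability distribution.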
Gaussian Distribution
$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\bar{x})^2 / (2\sigma^2)}$
[Figure: P(x) plotted against x]
This is a Gaussian distribution for $\bar{x} = 5.0$ and $\sigma = 2.0$. You are more likely to obtain a value between 4 and 6, where the graph is high.
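To see where the curve is high, the standard Gaussian density can be evaluated directly. The sketch below uses the mean of 5.0 and standard deviation of 2.0 quoted above; the function name `gaussian_pdf` is only illustrative.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Gaussian probability density P(x) for a given mean and standard deviation."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

mean, sigma = 5.0, 2.0

# Tabulate P(x) for integer x from 0 to 10; the values peak at x = mean.
for x in range(0, 11):
    print(f"x = {x:2d}  P(x) = {gaussian_pdf(x, mean, sigma):.4f}")
```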
[Figure: P(x) against x with shaded areas under the curve]
We can quantify this by looking at the area under the curve. The total area under the curve is 1.0.
This area is much smaller than the dark gray block between 4 and 7.
To quantify these areas we use established values for multiples of the standard deviation from the mean.
[Figure: P(x) against x, with the $1.0\sigma$ and $2.0\sigma$ ranges marked]
The area under the curve between 3 and 7 is 0.683 and is termed $1.0\sigma$ (this is known as the 68% confidence limit).
The area under the curve between 1 and 9 is 0.954 and is termed $2.0\sigma$ (this is known as the 95% confidence limit).
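These areas can be checked numerically. The sketch below assumes the same mean of 5.0 and standard deviation of 2.0 and uses the error function from Python's standard library; the helper name `gaussian_area` is illustrative.

```python
import math

def gaussian_area(a, b, mean, sigma):
    """Area under the Gaussian curve between a and b, via the error function."""
    def cdf(x):
        # Cumulative probability of obtaining a value less than or equal to x.
        return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))
    return cdf(b) - cdf(a)

mean, sigma = 5.0, 2.0

one_sigma = gaussian_area(3.0, 7.0, mean, sigma)   # mean +/- 1 standard deviation
two_sigma = gaussian_area(1.0, 9.0, mean, sigma)   # mean +/- 2 standard deviations
block_4_7 = gaussian_area(4.0, 7.0, mean, sigma)   # the block between 4 and 7

print(f"area between 3 and 7: {one_sigma:.3f}")
print(f"area between 1 and 9: {two_sigma:.3f}")
print(f"area between 4 and 7: {block_4_7:.3f}")
```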
Linear Regression:
How to Fit a Line to Scattered Data
Now that we've learned statistical analysis of a single variable, we can also consider statistical analysis of two related variables. We may be able to approximate this relationship by a straight line.
[Figure: scatter plot; axis label: Pebble diameter]
The line drawn to the right is one possibility. How can we determine whether this line is better than another in a quantitative way ?
$\Delta y = (y - y_{\text{line}})$
This gives you the deviation of one point from the line. To obtain the mean square deviation, we take the average of $\Delta y^2$ for all points:
$\frac{1}{N} \sum_{i=1}^{N} (y_i - y_{\text{line},i})^2$
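The mean square deviation gives a number for comparing candidate lines: the smaller it is, the better the line follows the points. The sketch below is only an illustration. The (x, y) values are invented, the two candidate lines are arbitrary, and the closed-form least-squares fit is the standard textbook result rather than something taken from the slides.

```python
def mean_square_deviation(xs, ys, slope, intercept):
    """Average of (y - y_line)^2 over all points, for the line y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def least_squares_line(xs, ys):
    """Slope and intercept that minimize the mean square deviation (closed form)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Hypothetical (x, y) data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Two arbitrary candidate lines, compared by their mean square deviation.
msd_a = mean_square_deviation(xs, ys, slope=2.0, intercept=0.0)
msd_b = mean_square_deviation(xs, ys, slope=1.5, intercept=1.5)

# The least-squares line can do no worse than any candidate.
fit_slope, fit_intercept = least_squares_line(xs, ys)
msd_fit = mean_square_deviation(xs, ys, fit_slope, fit_intercept)

print(f"candidate A: {msd_a:.3f}")
print(f"candidate B: {msd_b:.3f}")
print(f"least-squares line y = {fit_slope:.2f}x + {fit_intercept:.2f}: {msd_fit:.3f}")
```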