John Loucks
St. Edward’s University
1
Chapter 2, Part A
Descriptive Statistics:
Tabular and Graphical Presentations
■ Summarizing Categorical Data
■ Summarizing Quantitative Data
2
Summarizing Categorical Data
■ Frequency Distribution
■ Relative Frequency Distribution
■ Percent Frequency Distribution
■ Bar Chart
■ Pie Chart
■ Crosstabulation
3
Frequency Distribution
A frequency distribution is a tabular summary of
data showing the frequency (or number) of items
in each of several non-overlapping classes.
The objective is to provide insights about the data
that cannot be quickly obtained by looking only at
the original data.
4
Frequency Distribution
Rating Frequency
Poor 2
Below Average 3
Average 5
Above Average 9
Excellent 1
Total 20
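A minimal Python sketch of the same tally, for readers who want to check the counts outside Excel. The slides do not list all 20 ratings in order, so the `ratings` list below is rebuilt from the class counts in the table above; the names `CLASSES`, `ratings`, and `frequency` are illustrative.

```python
from collections import Counter

# Class order as shown in the frequency table.
CLASSES = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

# Assumption: the individual ratings are reconstructed from the class counts;
# their original order in the worksheet is not given in the slides.
ratings = (["Poor"] * 2 + ["Below Average"] * 3 + ["Average"] * 5
           + ["Above Average"] * 9 + ["Excellent"] * 1)

# Count how many items fall in each non-overlapping class.
counts = Counter(ratings)
frequency = {c: counts[c] for c in CLASSES}

for rating, freq in frequency.items():
    print(f"{rating:<15}{freq:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```

`Counter` plays the role that COUNTIF plays in the worksheet: one count per class label.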
6
Using Excel’s COUNTIF Function
to Construct a Frequency Distribution
■ Excel Formula
Worksheet
     A                B   C                D
1    Quality Rating       Quality Rating   Frequency
2    Above Average        Poor             =COUNTIF($A$2:$A$21,C2)
3    Below Average        Below Average    =COUNTIF($A$2:$A$21,C3)
4    Above Average        Average          =COUNTIF($A$2:$A$21,C4)
5    Average              Above Average    =COUNTIF($A$2:$A$21,C5)
6    Average              Excellent        =COUNTIF($A$2:$A$21,C6)
7    Above Average        Total            =SUM(D2:D6)
8    Above Average
Note: Rows 9-21 are not shown.
7
Using Excel’s COUNTIF Function
to Construct a Frequency Distribution
■ Excel Value
Worksheet
     A                B   C                D
1    Quality Rating       Quality Rating   Frequency
2    Above Average        Poor             2
3    Below Average        Below Average    3
4    Above Average        Average          5
5    Average              Above Average    9
6    Average              Excellent        1
7    Above Average        Total            20
8    Above Average
Note: Rows 9-21 are not shown.
8
Relative Frequency Distribution
The relative frequency of a class is the fraction or
proportion of the total number of data items
belonging to the class.
A relative frequency distribution is a tabular
summary of a set of data showing the relative
frequency for each class.
9
Percent Frequency Distribution
The percent frequency of a class is the relative
frequency multiplied by 100.
A percent frequency distribution is a tabular
summary of a set of data showing the percent
frequency for each class.
10
Relative Frequency and
Percent Frequency Distributions
■ Example: Marada Inn
Rating          Relative Frequency   Percent Frequency
Poor            .10                  10
Below Average   .15                  15
Average         .25                  25
Above Average   .45                  45
Excellent       .05                  5
Total           1.00                 100
Callouts: .10(100) = 10 (percent frequency for Poor); 1/20 = .05 (relative frequency for Excellent).
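The two conversions can be sketched in a few lines of Python. The class counts come from the Marada Inn frequency distribution; the variable names are illustrative, not part of the original slides.

```python
# Frequencies from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())  # 20 ratings in all

# Relative frequency = class frequency / total number of items.
relative = {c: f / total for c, f in frequency.items()}

# Percent frequency = relative frequency * 100 (computed as 100*f/total
# to keep the arithmetic exact for these values).
percent = {c: 100 * f / total for c, f in frequency.items()}

for c in frequency:
    print(f"{c:<15}{relative[c]:>6.2f}{percent[c]:>6.0f}")
```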
11
Using Excel to Construct Relative
Frequency and Percent Frequency
Distributions
■ Excel Formula
Worksheet
     C                D                         E                    F
1    Quality Rating   Frequency                 Relative Frequency   Percent Frequency
2    Poor             =COUNTIF($A$2:$A$21,C2)   =D2/$D$7             =E2*100
3    Below Average    =COUNTIF($A$2:$A$21,C3)   =D3/$D$7             =E3*100
4    Average          =COUNTIF($A$2:$A$21,C4)   =D4/$D$7             =E4*100
5    Above Average    =COUNTIF($A$2:$A$21,C5)   =D5/$D$7             =E5*100
6    Excellent        =COUNTIF($A$2:$A$21,C6)   =D6/$D$7             =E6*100
7    Total            =SUM(D2:D6)               =SUM(E2:E6)          =SUM(F2:F6)
8
Note: Columns A-B and rows 9-21 are not shown.
12
Using Excel to Construct Relative
Frequency and Percent Frequency
Distributions
■ Excel Value
Worksheet
     C                D           E                    F
1    Quality Rating   Frequency   Relative Frequency   Percent Frequency
2    Poor             2           0.10                 10
3    Below Average    3           0.15                 15
4    Average          5           0.25                 25
5    Above Average    9           0.45                 45
6    Excellent        1           0.05                 5
7    Total            20          1.00                 100
8
Note: Columns A-B and rows 9-21 are not shown.
13
Bar Chart (In Excel this is called a Column
Chart)
A bar chart is a graphical device for depicting
qualitative data.
On one axis (usually the horizontal axis), we specify
the labels that are used for each of the classes.
A frequency, relative frequency, or percent frequency
scale can be used for the other axis (usually the
vertical axis).
Using a bar of fixed width drawn above each class
label, we extend the height appropriately.
The bars are separated to emphasize the fact that each
class is a separate category.
14
Bar Chart (In Excel this is called a Column
Chart)
[Bar chart: Marada Inn Quality Ratings. Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency.]
15
Using Excel’s Chart Tools
to Construct a Bar Chart
[Excel column chart: Marada Inn Quality Ratings. Horizontal axis: Quality Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency, 0 to 10.]
19
Pie Chart
20
Pie Chart
21
Example: Marada Inn
22
Using Excel’s Chart Tools
to Construct a Pie Chart
[Excel pie chart: Marada Inn Quality Ratings. Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]
24
Excel’s PivotTable Report
and PivotChart Report
25
Summarizing Quantitative Data
■ Frequency Distribution
■ Relative Frequency and Percent Frequency Distributions
■ Dot Plot
■ Histogram
■ Cumulative Distributions
■ Ogive
■ Stem-Leaf Display
■ Crosstabulation
■ Scatter Diagram
26
Frequency Distribution
31
Using Excel’s PivotTable Report
to Construct a Frequency Distribution
■ Excel Value Worksheet
A B C D
1 Parts Cost Parts Cost Count of Parts Cost
2 91 50-59 2
3 71 60-69 13
4 104 70-79 16
5 85 80-89 7
6 62 90-99 7
7 78 100-109 5
8 69 Grand Total 50
Note: Rows 9-51 are not shown.
35
Relative Frequency and
Percent Frequency Distributions
■ Example: Hudson Auto Repair
Insights Gained from the % Frequency Distribution:
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32% or almost one-third)
of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.
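These percentages can be checked with a short Python sketch. The 50 parts costs are read off the stem-and-leaf display that appears later in this deck (their worksheet order is not given), and the class limits 50-59 through 100-109 follow the PivotTable output; `LOWER`, `WIDTH`, and `percent` are illustrative names.

```python
from collections import Counter

# Assumption: values read off the stem-and-leaf display later in the deck.
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]

LOWER, WIDTH = 50, 10  # first class starts at 50; each class is 10 wide

# Assign each cost to a class index, then count items per class.
counts = Counter((c - LOWER) // WIDTH for c in costs)
n = len(costs)

percent = {}
for k in sorted(counts):
    lo = LOWER + k * WIDTH
    percent[f"{lo}-{lo + WIDTH - 1}"] = 100 * counts[k] / n

for label, pct in percent.items():
    print(f"{label:<8}{pct:>5.0f}%")

under_70 = percent["50-59"] + percent["60-69"]
print(f"Under $70: {under_70:.0f}%")
```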
37
Dot Plot
38
Dot Plot
[Dot plot. Horizontal axis: Cost ($), 50 to 110.]
39
Histogram
40
Histogram
[Histogram. Horizontal axis: Parts Cost ($), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109; vertical axis: frequency.]
41
Using Excel’s Chart Tools
to Construct a Histogram
[Excel histogram: Tune-up Parts Cost. Horizontal axis: Parts Cost ($), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109; vertical axis: Frequency, 0 to 20.]
44
Histogram
■ Symmetric
• Left tail is the mirror image of the right tail
• Examples: heights and weights of people
[Histogram of a symmetric distribution. Vertical axis: Relative Frequency, 0 to .35.]
48
Cumulative Distributions
Cumulative frequency distribution – shows the
number of items with values less than or equal to the
upper limit of each class.
Cumulative relative frequency distribution – shows
the proportion of items with values less than or
equal to the upper limit of each class.
Cumulative percent frequency distribution – shows
the percentage of items with values less than or
equal to the upper limit of each class.
49
Cumulative Distributions
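Cost ($)   Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
≤ 59       2                      .04                             4
≤ 69       15                     .30                             30
≤ 79       31                     .62                             62
≤ 89       38                     .76                             76
≤ 99       45                     .90                             90
≤ 109      50                     1.00                            100
Callouts: 2 + 13 = 15 (cumulative frequency for ≤ 69); 15/50 = .30 (cumulative relative frequency); .30(100) = 30 (cumulative percent frequency).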
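A minimal Python sketch of the running totals behind this table. The class frequencies (2, 13, 16, 7, 7, 5) are taken from the PivotTable output for the Hudson Auto Repair parts costs; the variable names are illustrative.

```python
from itertools import accumulate

upper_limits = [59, 69, 79, 89, 99, 109]   # upper limit of each class
frequency = [2, 13, 16, 7, 7, 5]           # class frequencies (n = 50)

n = sum(frequency)

# Cumulative frequency: running total of the class frequencies.
cum_freq = list(accumulate(frequency))

# Cumulative relative and percent frequencies follow from the running total.
cum_rel = [cf / n for cf in cum_freq]
cum_pct = [100 * cf / n for cf in cum_freq]

for u, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {u:<4}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```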
50
Ogive
51
Ogive
52
Ogive with
Cumulative Percent Frequencies
■ Example: Hudson Auto Repair
[Ogive: Tune-up Parts Cost. Horizontal axis: Parts Cost ($), 50 to 110; vertical axis: Cumulative Percent Frequency, 0 to 100. Labeled point: (89.5, 76).]
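The points an ogive connects can be listed with a short Python sketch. It assumes each point is plotted halfway between adjacent class limits (for example 89.5 between 89 and 90), which matches the labeled point (89.5, 76); the starting point (49.5, 0) is an assumption added so the curve begins at zero.

```python
upper_limits = [59, 69, 79, 89, 99, 109]   # upper class limits
cum_pct = [4, 30, 62, 76, 90, 100]         # cumulative percent frequencies

# Assumption: the curve starts at zero just below the first class (49.5),
# then rises to each cumulative percent at the upper limit plus 0.5.
points = [(49.5, 0)] + [(u + 0.5, p) for u, p in zip(upper_limits, cum_pct)]

for x, y in points:
    print(f"({x}, {y})")
```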
53
Using Excel’s PivotChart Report
58
End of Chapter 2, Part A
59
Chapter 2, Part B
Descriptive Statistics:
Tabular and Graphical Presentations
■ Exploratory Data Analysis: Stem-and-Leaf Display
■ Crosstabulations and Scatter Diagrams
60
Exploratory Data Analysis
61
Stem-and-Leaf Display
62
Example: Hudson Auto Repair
63
Stem-and-Leaf Display
64
Stem-and-Leaf Display
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
(The number to the left of the bar is a stem; each digit to the right is a leaf.)
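A minimal Python sketch of how such a display can be built with leaf unit 1. The 50 parts costs are the values implied by the display and by the class counts earlier in the deck; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Assumption: values read off the display; their worksheet order is unknown.
costs = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]

def stem_and_leaf(values):
    """Group sorted values by stem (all but the last digit); leaf = last digit."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    return dict(rows)

rows = stem_and_leaf(costs)
for stem in sorted(rows):
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in rows[stem])}")
```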
65
Stretched Stem-and-Leaf Display
66
Stretched Stem-and-Leaf Display
 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
67
Stem-and-Leaf Display
Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed
to equal 1.
68
Example: Leaf Unit = 0.1
69
Example: Leaf Unit = 10
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
The 82 in 1682 is rounded down to 80 and is represented as an 8.
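The rounding-down rule can be sketched in Python. The helper below is hypothetical and assumes non-negative integer values and an integer leaf unit (1, 10, 100, ...); it does not cover fractional leaf units such as 0.1.

```python
def split_stem_leaf(value, leaf_unit=1):
    """Return (stem, leaf) for a non-negative integer and an integer leaf unit."""
    scaled = value // leaf_unit   # drop the digits below the leaf unit (rounds down)
    return divmod(scaled, 10)     # the last remaining digit is the leaf

# 1682 with leaf unit 10: the trailing 2 is dropped, leaving stem 16 and leaf 8.
print(split_stem_leaf(1682, leaf_unit=10))  # (16, 8)

# With the default leaf unit of 1, 91 splits into stem 9 and leaf 1.
print(split_stem_leaf(91))                  # (9, 1)
```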
70
Crosstabulations and Scatter Diagrams
71
Crosstabulation
A crosstabulation is a tabular summary of data for
two variables and helps to reveal the
relationship between the two variables.
■ Crosstabulation can be used when:
• One variable is categorical and the other is
quantitative,
• Both variables are categorical, or
• Both variables are quantitative.
The left and top margin labels define the classes for
the two variables.
72
Crosstabulation
Total 30 20 35 15 100
73
Crosstabulation
74
Crosstabulation
Example: Finger Lakes Homes
Total 30 20 35 15 100
Callouts: the row totals give the frequency distribution for the price range variable; the column totals (the Total row) give the frequency distribution for the home style variable.
75
Using Excel’s PivotTable Report
to Create a Crosstabulation
Excel Worksheet (showing partial
data)
A B C D E
1 Home Price ($) Style
2 1 >99K Colonial
3 2 <=99K Log
4 3 >99K Log
5 4 <=99K A-Frame
6 5 <=99K Colonial
7 6 <=99K Split-Level
8 7 >99K A-Frame
9 8 >99K Colonial
Note: Rows 10-101 are not shown.
76
Using Excel’s PivotTable Report
to Create a Crosstabulation
Displaying the Initial PivotTable Field List and
PivotTable Report
Step 1 Click the Insert tab on the Ribbon
Step 2 In the Tables group, click the icon above the
word PivotTable
Step 3 When the Create PivotTable dialog box appears:
Choose Select a Table or Range
Enter A1:C101 in the Table/Range box
Choose New Worksheet as the location for
the PivotTable Report
Click OK
77
Using Excel’s PivotTable Report
to Create a Crosstabulation
Setting Up the PivotTable Field List
Step 1 In the PivotTable Field List, go to Choose Fields
to add to report
Drag the Price ($) field to Row Labels area
Drag the Style field to Column Labels area
Drag the Home field to the Values area
Step 2 Click on Sum of Home in the Values area
Step 3 Click Value Field Settings from the list of options
Step 4 When the Value Field Settings dialog box appears:
Under Summarize value field by, choose Count
Click OK
78
Using Excel’s PivotTable Report
to Create a Crosstabulation
Value Worksheet
A B C D E F G
1 Count of Home Style
2 Price ($) Colonial Log Split-Level A-Frame Grand Total
3 <=99K 18 6 19 12 55
4 >99K 12 14 16 3 45
5 Grand Total 30 20 35 15 100
6
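The counting that the PivotTable performs can be sketched in Python. Only the eight homes visible in the partial worksheet are used here, since rows 10-101 are not shown; the counts therefore do not match the full 100-home table. The names `homes`, `cells`, `row_totals`, and `col_totals` are illustrative.

```python
from collections import Counter

# Assumption: only the eight (price, style) pairs shown in the partial worksheet.
homes = [
    (">99K", "Colonial"),
    ("<=99K", "Log"),
    (">99K", "Log"),
    ("<=99K", "A-Frame"),
    ("<=99K", "Colonial"),
    ("<=99K", "Split-Level"),
    (">99K", "A-Frame"),
    (">99K", "Colonial"),
]

# One count per (row class, column class) cell, plus the margin totals.
cells = Counter(homes)
row_totals = Counter(price for price, _ in homes)
col_totals = Counter(style for _, style in homes)

styles = ["Colonial", "Log", "Split-Level", "A-Frame"]
print(f"{'Price ($)':<10}" + "".join(f"{s:>12}" for s in styles) + f"{'Total':>8}")
for price in ["<=99K", ">99K"]:
    row = "".join(f"{cells[(price, s)]:>12}" for s in styles)
    print(f"{price:<10}{row}{row_totals[price]:>8}")
print(f"{'Total':<10}" + "".join(f"{col_totals[s]:>12}" for s in styles) + f"{len(homes):>8}")
```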
79
Crosstabulation: Row or Column
Percentages
Converting the entries in the table into row
percentages or column percentages can
provide additional insight about the
relationship between the two variables.
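A short Python sketch of both conversions, using the cell counts from the Finger Lakes Homes PivotTable output. Percentages are rounded to two decimals for display; the variable names are illustrative.

```python
styles = ["Colonial", "Log", "Split-Level", "A-Frame"]
table = {
    "<=99K": [18, 6, 19, 12],
    ">99K": [12, 14, 16, 3],
}

# Row percentages: each cell divided by its row total.
row_pct = {
    price: [round(100 * c / sum(counts), 2) for c in counts]
    for price, counts in table.items()
}

# Column percentages: each cell divided by its column total.
col_totals = [sum(col) for col in zip(*table.values())]
col_pct = {
    price: [round(100 * c / t, 2) for c, t in zip(counts, col_totals)]
    for price, counts in table.items()
}

print("Row percentages:", row_pct)
print("Column percentages:", col_pct)
```

For example, 18 of the 55 homes priced at or below $99K are Colonial (a row percentage of about 32.73), while 18 of the 30 Colonial homes are priced at or below $99K (a column percentage of 60).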
80
Crosstabulation: Row Percentages
81
Crosstabulation: Column Percentages
82
Crosstabulation: Simpson’s Paradox
84
Scatter Diagram
A Positive Relationship
85
Scatter Diagram
A Negative Relationship
86
Scatter Diagram
No Apparent Relationship
87
Scatter Diagram
88
Scatter Diagram
[Scatter diagram. Horizontal axis (x): Number of Interceptions, 0 to 4; vertical axis (y): Number of Points Scored, 0 to 35.]
89
Example: Panthers Football Team
90
Using Excel’s Chart Tools to Construct
a Scatter Diagram and Trendline
Excel Worksheet (showing data)
     A                         B                         C
1    Number of Interceptions   Number of Points Scored
2    1                         14
3    3                         24
4    2                         18
5    1                         17
6    3                         30
7
91
Using Excel’s Chart Tools to
Construct a Scatter Diagram and
Trendline
Step 1 Select cells A2:B6
Step 2 Click the Insert tab on the Excel Ribbon
Step 3 In the Charts group, click Scatter
Step 4 When the list of scatter diagram subtypes appears:
Click Scatter with only Markers
Step 5 In the Chart Layout group, click Layout 1
Step 6 Select the Chart Title and replace it with Scatter
Diagram for the Panthers
Step 7 Select the Horizontal Axis (Value) Title and
replace it with Number of Interceptions
. . . continued
92
Using Excel’s Chart Tools to
Construct a Scatter Diagram and
Trendline
Step 8 Select the Vertical (Value) Axis Title and replace
it with Number of Points Scored
Step 9 Right click Series 1 Legend Entry and click Delete
- - - - - - - - - - - - - - - - To Add a Trendline - - - - - - - - - - - - - - -
Step 10 Position the pointer over any data point in the
scatter diagram and right-click to display options
Step 11 Choose Add Trendline
Step 12 When the Format Trendline dialog box appears:
Select Trendline Options
Choose Linear from Trend/Regression Type list
Click Close
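A linear trendline is a least-squares line through the points, so its slope and intercept can be checked by hand. The Python sketch below applies the standard least-squares formulas to the five Panthers games in the worksheet; it is an illustration, not Excel's internal routine.

```python
# Panthers data from the worksheet: interceptions (x) and points scored (y).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 30]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2).
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

slope = sxy / sxx
intercept = y_bar - slope * x_bar

print(f"y = {slope:.2f}x + {intercept:.2f}")
```

The positive slope agrees with the upward pattern in the scatter diagram: games with more interceptions tend to have more points scored.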
93
Using Excel’s Chart Tools to
Construct a Scatter Diagram and
Trendline
[Excel scatter diagram with trendline: Scatter Diagram for the Panthers. Horizontal axis: Number of Interceptions, 0 to 4; vertical axis: Number of Points Scored, 0 to 35.]
94
Tabular and Graphical Methods
Data
Categorical Data Quantitative Data
96