
Slides by John Loucks, St. Edward’s University

1
Chapter 2, Part A
Descriptive Statistics:
Tabular and Graphical Presentations
■ Summarizing Categorical Data
■ Summarizing Quantitative Data

2
Summarizing Categorical Data

■ Frequency Distribution
■ Relative Frequency Distribution
■ Percent Frequency Distribution
■ Bar Chart
■ Pie Chart
■ Crosstabulation

3
Frequency Distribution

A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes.

The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.

4
Frequency Distribution

■ Example: Marada Inn


Guests staying at Marada Inn were asked to rate the
quality of their accommodations as being excellent,
above average, average, below average, or poor. The
ratings provided by a sample of 20 guests are:

Below Average    Average          Above Average
Above Average    Above Average    Above Average
Above Average    Below Average    Below Average
Average          Poor             Poor
Above Average    Excellent        Above Average
Average          Above Average    Average
Above Average    Average

5
Frequency Distribution

■ Example: Marada Inn

Rating Frequency
Poor 2
Below Average 3
Average 5
Above Average 9
Excellent 1
Total 20
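
The tally above can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original slides; the variable names (`ratings`, `classes`, `frequency`) are illustrative, and the ratings are typed in the order they appear on the previous slide.

```python
from collections import Counter

# The 20 guest ratings from the Marada Inn sample, in slide order.
ratings = [
    "Below Average", "Average", "Above Average",
    "Above Average", "Above Average", "Above Average",
    "Above Average", "Below Average", "Below Average",
    "Average", "Poor", "Poor",
    "Above Average", "Excellent", "Above Average",
    "Average", "Above Average", "Average",
    "Above Average", "Average",
]

# Fixed class order so the table reads from worst to best rating.
classes = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

counts = Counter(ratings)                     # tally each distinct rating
frequency = {c: counts[c] for c in classes}   # reorder into class order

for c in classes:
    print(f"{c:<15}{frequency[c]:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```

`Counter` plays the same role here as COUNTIF does in the Excel worksheet: one count per class label.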

6
Using Excel’s COUNTIF Function
to Construct a Frequency Distribution
■ Excel Formula Worksheet

      A                B    C                D
 1    Quality Rating        Quality Rating   Frequency
 2    Above Average         Poor             =COUNTIF($A$2:$A$21,C2)
 3    Below Average         Below Average    =COUNTIF($A$2:$A$21,C3)
 4    Above Average         Average          =COUNTIF($A$2:$A$21,C4)
 5    Average               Above Average    =COUNTIF($A$2:$A$21,C5)
 6    Average               Excellent        =COUNTIF($A$2:$A$21,C6)
 7    Above Average         Total            =SUM(D2:D6)
 8    Above Average

Note: Rows 9-21 are not shown.

7
Using Excel’s COUNTIF Function
to Construct a Frequency Distribution
■ Excel Value Worksheet

      A                B    C                D
 1    Quality Rating        Quality Rating   Frequency
 2    Above Average         Poor             2
 3    Below Average         Below Average    3
 4    Above Average         Average          5
 5    Average               Above Average    9
 6    Average               Excellent        1
 7    Above Average         Total            20
 8    Above Average

Note: Rows 9-21 are not shown.

8
Relative Frequency Distribution

The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.

A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.

9
Percent Frequency Distribution

The percent frequency of a class is the relative frequency multiplied by 100.

A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.

10
Relative Frequency and
Percent Frequency Distributions
■ Example: Marada Inn

                  Relative     Percent
Rating            Frequency    Frequency
Poor                 .10          10
Below Average        .15          15
Average              .25          25
Above Average        .45          45
Excellent            .05           5
Total               1.00         100

For example, the relative frequency of Excellent is 1/20 = .05, and the percent frequency of Poor is .10(100) = 10.
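
As a rough illustration of the two definitions, the sketch below derives the relative and percent frequencies from the Marada Inn frequency table. The dictionary names are illustrative and not from the slides.

```python
# Frequencies from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

n = sum(frequency.values())  # total number of data items (20)

# Relative frequency = class frequency / total number of items.
relative = {c: f / n for c, f in frequency.items()}

# Percent frequency = relative frequency multiplied by 100.
percent = {c: 100 * r for c, r in relative.items()}

for c in frequency:
    print(f"{c:<15}{relative[c]:>6.2f}{percent[c]:>6.0f}")
```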
11
Using Excel to Construct Relative
Frequency and Percent Frequency
Distributions
■ Excel Formula Worksheet

      C                D                          E            F
                                                  Relative     Percent
 1    Quality Rating   Frequency                  Frequency    Frequency
 2    Poor             =COUNTIF($A$2:$A$21,C2)    =D2/$D$7     =E2*100
 3    Below Average    =COUNTIF($A$2:$A$21,C3)    =D3/$D$7     =E3*100
 4    Average          =COUNTIF($A$2:$A$21,C4)    =D4/$D$7     =E4*100
 5    Above Average    =COUNTIF($A$2:$A$21,C5)    =D5/$D$7     =E5*100
 6    Excellent        =COUNTIF($A$2:$A$21,C6)    =D6/$D$7     =E6*100
 7    Total            =SUM(D2:D6)                =SUM(E2:E6)  =SUM(F2:F6)
 8

Note: Columns A-B and rows 9-21 are not shown.

12
Using Excel to Construct Relative
Frequency and Percent Frequency
Distributions
■ Excel Value Worksheet

      C                D            E            F
                                    Relative     Percent
 1    Quality Rating   Frequency    Frequency    Frequency
 2    Poor                 2          0.10          10
 3    Below Average        3          0.15          15
 4    Average              5          0.25          25
 5    Above Average        9          0.45          45
 6    Excellent            1          0.05           5
 7    Total               20          1.00         100
 8

Note: Columns A-B and rows 9-21 are not shown.

13
Bar Chart (In Excel this is called a Column Chart)
■ A bar chart is a graphical device for depicting
qualitative data.
■ On one axis (usually the horizontal axis), we specify
the labels that are used for each of the classes.
■ A frequency, relative frequency, or percent frequency
scale can be used for the other axis (usually the
vertical axis).
■ Using a bar of fixed width drawn above each class
label, we extend the height appropriately.
■ The bars are separated to emphasize the fact that each
class is a separate category.

14
Bar Chart (In Excel this is called a Column
Chart)
[Figure: bar chart titled "Marada Inn Quality Ratings". Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent). Vertical axis: Frequency, scaled from 1 to 10.]

15
Using Excel’s Chart Tools
to Construct a Bar Chart

Step 1. Select cells C1:D6
Step 2. Click the Insert tab on the Ribbon
Step 3. In the Charts group, click Column
Step 4. When the list of column chart subtypes appears:
        Go to the 2-D Column section
        Click Clustered Column (the leftmost chart)
Step 5. In the Chart Layouts group, click the More button
        (the downward pointing arrow with a line over it)
        to display all the options

… continued

16
Using Excel’s Chart Tools
to Construct a Bar Chart

Step 6. Choose Layout 9
Step 7. Click the Chart Title and replace it with
        Marada Inn Quality Ratings
Step 8. Click the Horizontal Axis (Category) Title and
        replace it with Quality Rating
Step 9. Click the Vertical Axis (Value) Title and
        replace it with Frequency
Step 10. Right click the Series 1 Legend Entry and choose
         Delete from the list of options that appear

… continued

17
Using Excel’s Chart Tools
to Construct a Bar Chart

Step 11. Right click the vertical axis and choose
         Format Axis from the options that appear
Step 12. When the Format Axis dialog box appears:
         Go to the Axis Options section
         Select Fixed for Major Unit and enter 2.0 in
         the corresponding box
         Click Close

18
Using Excel’s Chart Tools
to Construct a Bar Chart
[Figure: the finished bar chart, "Marada Inn Quality Ratings", placed in worksheet rows 9-21 of columns C-E. Horizontal axis: Quality Rating (Poor, Below Average, Average, Above Average, Excellent). Vertical axis: Frequency, 0 to 10 in steps of 2.]

19
Pie Chart

■ The pie chart is a commonly used graphical device
for presenting relative frequency distributions for
qualitative data.
■ First draw a circle; then use the relative frequencies
to subdivide the circle into sectors that correspond to
the relative frequency for each class.
■ Since there are 360 degrees in a circle, a class with a
relative frequency of .25 would consume .25(360) = 90
degrees of the circle.
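
The sector arithmetic can be checked for every class at once. The sketch below applies the .25(360) = 90 rule to the Marada Inn relative frequencies; the variable names are illustrative.

```python
# Relative frequencies from the Marada Inn example.
relative = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class gets a sector of (relative frequency x 360) degrees.
degrees = {c: r * 360 for c, r in relative.items()}

for c, d in degrees.items():
    print(f"{c:<15}{d:>6.0f} degrees")
```

Because the relative frequencies sum to 1, the sectors together fill the full 360 degrees.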

20
Pie Chart

[Figure: pie chart titled "Marada Inn Quality Ratings". Sectors: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]

21
Example: Marada Inn

■ Insights Gained from the Preceding Pie Chart


• One-half of the customers surveyed gave Marada
a quality rating of “above average” or “excellent”
(looking at the left side of the pie). This might
please the manager.
• For each customer who gave an “excellent” rating,
there were two customers who gave a “poor”
rating (looking at the top of the pie). This should
displease the manager.

22
Using Excel’s Chart Tools
to Construct a Pie Chart

Excel’s chart tools can be used to develop a pie chart for the Marada quality rating data in much the same way we developed the bar chart.
The major difference is that in step 3 we would choose Pie in the Charts group.

23
Using Excel’s Chart Tools
to Construct a Pie Chart
[Figure: the finished pie chart, "Marada Inn Quality Ratings", placed in worksheet rows 9-20 of columns C-E. Sectors: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]

24
Excel’s PivotTable Report
and PivotChart Report

You have now seen how Excel’s COUNTIF function can be used to develop a frequency distribution and Excel’s Chart Tools can be used to create bar and pie charts.
But there is a more powerful set of Excel tools that can be used for categorical data:
• PivotTable report
• PivotChart report

25
Summarizing Quantitative Data

■ Frequency Distribution
■ Relative Frequency and Percent Frequency Distributions
■ Dot Plot
■ Histogram
■ Cumulative Distributions
■ Ogive
■ Stem-and-Leaf Display
■ Crosstabulation
■ Scatter Diagram

26
Frequency Distribution

■ Example: Hudson Auto Repair

The manager of Hudson Auto would like to gain a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

27
Frequency Distribution

■ Example: Hudson Auto Repair


Sample of Parts Cost ($) for 50 Tune-ups
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73

28
Frequency Distribution

■ Guidelines for Selecting Number of Classes


• Use between 5 and 20 classes.
• Data sets with a larger number of elements
usually require a larger number of classes.
• Smaller data sets usually require fewer classes.

29
Frequency Distribution

■ Guidelines for Selecting Width of Classes

• Use classes of equal width.
• Approximate Class Width = (Largest Data Value − Smallest Data Value) / Number of Classes

30
Frequency Distribution

■ Example: Hudson Auto Repair


If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5 ≅ 10

Parts Cost ($) Frequency


50-59 2
60-69 13
70-79 16
80-89 7
90-99 7
100-109 5
Total 50
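
The class-width calculation and the resulting frequency distribution can be sketched as follows. The data are the 50 parts costs from the earlier slide; starting the first class at 50 and using a width of 10 follow the table above, while the variable names are illustrative.

```python
# The 50 tune-up parts costs ($) from the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6

# Approximate class width = (largest - smallest) / number of classes.
approx_width = (max(costs) - min(costs)) / num_classes  # (109 - 52) / 6

width = 10   # approx_width rounded up to a convenient value, as on the slide
start = 50   # lower limit of the first class, as on the slide

# Build the (lower limit, upper limit) pair for each class: 50-59, 60-69, ...
classes = [(start + i * width, start + (i + 1) * width - 1)
           for i in range(num_classes)]

# Count how many costs fall inside each class (limits inclusive).
frequency = {f"{lo}-{hi}": sum(lo <= c <= hi for c in costs)
             for lo, hi in classes}

print(f"Approximate class width: {approx_width}")
for label, f in frequency.items():
    print(f"{label:<10}{f:>3}")
print(f"{'Total':<10}{sum(frequency.values()):>3}")
```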

31
Using Excel’s PivotTable Report
to Construct a Frequency Distribution

Step 1 Click the Insert tab on the Ribbon


Step 2 In the Tables group, click the icon above the
word PivotTable
Step 3 When the Create PivotTable dialog box appears:
Choose Select a table or range
Enter A1:A51 in the Table/Range box
Choose Existing Worksheet as the location
for the PivotTable
Enter C1 in the Location box
Click OK
… continued

32
Using Excel’s PivotTable Report
to Construct a Frequency Distribution

Step 4 In the PivotTable Field List, go to Choose Fields
       to add to report:
       Drag the Parts Cost field to the Row Labels area
       Drag the Parts Cost field to the Values area
Step 5 Click on Sum of Parts Cost in the Values area
Step 6 Click Value Field Settings from the list of options
       that appear
Step 7 When the Value Field Settings dialog box appears:
       Under Summarize value field by, choose Count
       Click OK

33
Using Excel’s PivotTable Report
to Construct a Frequency Distribution

To construct the frequency distribution, we must group the rows containing parts costs.
Step 1 Right click any cell in the PivotTable report
       containing a parts cost.
Step 2 Choose Group from the list of options that appear
Step 3 When the Grouping dialog box appears:
       Enter 50 in the Starting at box
       Enter 109 in the Ending at box
       Enter 10 in the By box
       Click OK

34
Using Excel’s PivotTable Report
to Construct a Frequency Distribution
■ Excel Value Worksheet

      A             B    C              D
 1    Parts Cost         Parts Cost     Count of Parts Cost
 2    91                 50-59            2
 3    71                 60-69           13
 4    104                70-79           16
 5    85                 80-89            7
 6    62                 90-99            7
 7    78                 100-109          5
 8    69                 Grand Total     50

Note: Rows 9-51 are not shown.

35
Relative Frequency and
Percent Frequency Distributions
■ Example: Hudson Auto Repair

Parts        Relative     Percent
Cost ($)     Frequency    Frequency
50-59           .04           4
60-69           .26          26
70-79           .32          32
80-89           .14          14
90-99           .14          14
100-109         .10          10
Total          1.00         100

For example, for the 50-59 class the relative frequency is 2/50 = .04 and the percent frequency is .04(100) = 4.

36
Relative Frequency and
Percent Frequency Distributions
■ Example: Hudson Auto Repair
Insights Gained from the % Frequency Distribution:
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32% or almost one-third)
of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.

37
Dot Plot

■ One of the simplest graphical summaries of data is a dot plot.
■ A horizontal axis shows the range of data values.
■ Then each data value is represented by a dot placed above the axis.

38
Dot Plot

■ Example: Hudson Auto Repair

[Figure: dot plot titled "Tune-up Parts Cost". Horizontal axis: Cost ($), 50 to 110, with one dot above the axis for each data value.]

39
Histogram

■ Another common graphical presentation of
quantitative data is a histogram.
■ The variable of interest is placed on the horizontal
axis.
■ A rectangle is drawn above each class interval with
its height corresponding to the interval’s frequency,
relative frequency, or percent frequency.
■ Unlike a bar graph, a histogram has no natural
separation between rectangles of adjacent classes.

40
Histogram

■ Example: Hudson Auto Repair


[Figure: histogram titled "Tune-up Parts Cost". Horizontal axis: Parts Cost ($), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109. Vertical axis: Frequency, 2 to 18.]

41
Using Excel’s Chart Tools
to Construct a Histogram

Step 1. Select cells C2:D7


Step 2. Click the Insert tab on the Ribbon
Step 3. In the Charts group, click Column
Step 4. When the list of column chart subtypes appears:
Go to the 2-D Column section
Click Clustered Column (the leftmost chart)
Step 5. In the Chart Layouts group, click the More
button (the downward pointing arrow with
a line over it) to display all the options
… continued

42
Using Excel’s Chart Tools
to Construct a Histogram

Step 6. Choose Layout 8


Step 7. Select the Chart Title and replace it with
Tune-up Parts Cost
Step 8. Select the Horizontal (Category) Axis Title and
replace it with Parts Cost ($)
Step 9. Select the Vertical (Value) Axis Title and replace
it with Frequency

43
Using Excel’s Chart Tools
to Construct a Histogram
[Figure: the finished histogram, "Tune-up Parts Cost", placed in worksheet rows 10-24 of columns C-E. Horizontal axis: Parts Cost ($), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109. Vertical axis: Frequency, 0 to 20 in steps of 5.]
24

44
Histogram

■ Symmetric
• Left tail is the mirror image of the right tail
• Examples: heights and weights of people

[Figure: symmetric histogram. Vertical axis: Relative Frequency, 0 to .35.]

45
Histogram

■ Moderately Skewed Left
• A longer tail to the left
• Example: exam scores

[Figure: histogram moderately skewed to the left. Vertical axis: Relative Frequency, 0 to .35.]

46
Histogram

■ Moderately Skewed Right
• A longer tail to the right
• Example: housing values

[Figure: histogram moderately skewed to the right. Vertical axis: Relative Frequency, 0 to .35.]

47
Histogram

■ Highly Skewed Right
• A very long tail to the right
• Example: executive salaries

[Figure: histogram highly skewed to the right. Vertical axis: Relative Frequency, 0 to .35.]

48
Cumulative Distributions

Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.

Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.

Cumulative percent frequency distribution: shows the percentage of items with values less than or equal to the upper limit of each class.

49
Cumulative Distributions

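■ Hudson Auto Repair

                           Cumulative   Cumulative
            Cumulative     Relative     Percent
Cost ($)    Frequency      Frequency    Frequency
≤ 59            2             .04           4
≤ 69           15             .30          30
≤ 79           31             .62          62
≤ 89           38             .76          76
≤ 99           45             .90          90
≤ 109          50            1.00         100

For example, for the ≤ 69 row the cumulative frequency is 2 + 13 = 15, the cumulative relative frequency is 15/50 = .30, and the cumulative percent frequency is .30(100) = 30.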
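
The running totals in this table can be generated from the class frequencies. The following is a small sketch using `itertools.accumulate` from the Python standard library; the list names are illustrative.

```python
from itertools import accumulate

# Class frequencies for 50-59, 60-69, ..., 100-109 (Hudson Auto Repair).
frequency = [2, 13, 16, 7, 7, 5]
upper_limits = [59, 69, 79, 89, 99, 109]

n = sum(frequency)  # 50 tune-ups in the sample

# Running total of the frequencies: items at or below each upper limit.
cum_freq = list(accumulate(frequency))

# Divide by n for proportions, then multiply by 100 for percentages.
cum_rel = [cf / n for cf in cum_freq]
cum_pct = [100 * r for r in cum_rel]

for limit, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {limit:<5}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```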

50
Ogive

■ An ogive is a graph of a cumulative distribution.
■ The data values are shown on the horizontal axis.
■ Shown on the vertical axis are the:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies
■ The frequency (one of the above) of each class
is plotted as a point.
■ The plotted points are connected by straight
lines.

51
Ogive

■ Hudson Auto Repair

• Because the class limits for the parts-cost
data are 50-59, 60-69, and so on, there
appear to be one-unit gaps from 59 to 60,
69 to 70, and so on.
• These gaps are eliminated by plotting points
halfway between the class limits.
• Thus, 59.5 is used for the 50-59 class, 69.5
is used for the 60-69 class, and so on.
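
To make the halfway-point rule concrete, the sketch below builds the (x, y) pairs that would be plotted for a cumulative percent frequency ogive. It only computes the points and does not draw the graph; the names are illustrative.

```python
# Upper class limits and cumulative percent frequencies (Hudson Auto Repair).
upper_limits = [59, 69, 79, 89, 99, 109]
cum_pct = [4, 30, 62, 76, 90, 100]

# Plot each point halfway between a class's upper limit and the next
# class's lower limit: 59 -> 59.5, 69 -> 69.5, and so on.
points = [(limit + 0.5, pct) for limit, pct in zip(upper_limits, cum_pct)]

for x, y in points:
    print(f"({x}, {y})")
```

Connecting these points with straight lines gives the ogive; the pair (89.5, 76) is the point called out on the next slide.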

52
Ogive with
Cumulative Percent Frequencies
■ Example: Hudson Auto Repair
[Figure: ogive titled "Tune-up Parts Cost". Horizontal axis: Parts Cost ($), 50 to 110. Vertical axis: Cumulative Percent Frequency, 20 to 100. The point (89.5, 76) is labeled.]

53
Using Excel’s PivotChart Report

You have now seen how Excel’s PivotTable report can be used to construct a frequency distribution for quantitative data and how Excel’s Chart tools can be used to construct the corresponding histogram.
However, Excel’s PivotChart report can be used to develop a frequency distribution and a graphical display at the same time.

54
Using Excel’s PivotChart Report

Step 1. Click the Insert tab on the Ribbon
Step 2. In the Tables group, click the word PivotTable
Step 3. Choose PivotChart from the options that appear
Step 4. When the Create PivotTable with PivotChart
        dialog box appears:
        Choose Select a table or range
        Enter A1:A51 in the Table/Range box
        Choose Existing Worksheet as the location for
        the PivotTable and PivotChart
        Enter C1 in the Location box
        Click OK
… continued
55
Using Excel’s PivotChart Report

Step 5. In the PivotTable Field List, go to Choose Fields
        to add to report
        Drag the Parts Cost field to the Axis Fields
        (Categories) area
        Drag the Parts Cost field to the Values area
Step 6. Click Sum of Parts Cost in the Values area
Step 7. Click Value Field Settings from the list of options
        that appear
Step 8. When the Value Field Settings dialog appears:
        Under Summarize value field by, choose Count
        Click OK
… continued
56
Using Excel’s PivotChart Report

Step 9. Right click cell C2 in the PivotTable report or any
        other cell containing a parts cost
Step 10. Choose Group from the list of options
Step 11. When the Grouping dialog box appears:
         Enter ___ in the Starting at box
         Enter ___ in the Ending at box
         Click OK
Step 12. Click inside the resulting PivotChart
Step 13. Click the Design tab on the Ribbon
… continued

57
Using Excel’s PivotChart Report

Step 14. In the Chart Layouts group, click the More
         button (the downward pointing arrow with a
         line over it) to display all the options
Step 15. Choose Layout 8
Step 16. Select the Chart Title and replace it with
         Tune-up Parts Costs
Step 17. Select the Horizontal Axis (Category) Title and
         replace it with Parts Cost ($)
Step 18. Select the Vertical (Value) Axis Title and replace
         it with Frequency

58
End of Chapter 2, Part A

59
Chapter 2, Part B
Descriptive Statistics:
Tabular and Graphical Presentations
■ Exploratory Data Analysis: Stem-and-Leaf Display
■ Crosstabulations and Scatter Diagrams

60
Exploratory Data Analysis

■ The techniques of exploratory data analysis consist of
simple arithmetic and easy-to-draw pictures that can
be used to summarize data quickly.
■ One such technique is the stem-and-leaf display.

61
Stem-and-Leaf Display

■ A stem-and-leaf display shows both the rank order
and shape of the distribution of the data.
■ It is similar to a histogram on its side, but it has the
advantage of showing the actual data values.
■ The first digits of each data item are arranged to the
left of a vertical line.
■ To the right of the vertical line we record the last
digit for each item in rank order.
■ Each line in the display is referred to as a stem.
■ Each digit on a stem is a leaf.

62
Example: Hudson Auto Repair

The manager of Hudson Auto would like to gain a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

63
Stem-and-Leaf Display

■ Example: Hudson Auto Repair

Sample of Parts Cost ($) for 50 Tune-ups
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73

64
Stem-and-Leaf Display

■ Example: Hudson Auto Repair

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Each row is a stem; each digit to the right of the line is a leaf.
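
A stem-and-leaf display with leaf unit 1 can be built by splitting each value into its tens part (the stem) and its units digit (the leaf). The sketch below is one way to do it for the Hudson data; the names `stems` and `lines` are illustrative.

```python
from collections import defaultdict

# The 50 tune-up parts costs ($) from the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

stems = defaultdict(list)
for cost in sorted(costs):          # sorting puts the leaves in rank order
    stem, leaf = divmod(cost, 10)   # 104 -> stem 10, leaf 4
    stems[stem].append(leaf)

# One text line per stem: "stem | leaf leaf leaf ..."
lines = [f"{stem:>2} | {' '.join(str(leaf) for leaf in stems[stem])}"
         for stem in sorted(stems)]

print("\n".join(lines))
```

The number of leaves on each stem matches the class frequencies 2, 13, 16, 7, 7, 5 from the frequency distribution, which is why the display resembles a histogram turned on its side.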

65
Stretched Stem-and-Leaf Display

■ If we believe the original stem-and-leaf display has
condensed the data too much, we can stretch the
display by using two stems for each leading digit(s).

■ Whenever a stem value is stated twice, the first value
corresponds to leaf values of 0 − 4, and the second
value corresponds to leaf values of 5 − 9.

66
Stretched Stem-and-Leaf Display

■ Example: Hudson Auto Repair

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
67
Stem-and-Leaf Display

■ Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed
to equal 1.

68
Example: Leaf Unit = 0.1

If we have data with values such as

8.6   11.7   9.4   9.1   10.2   11.0   8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1

 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

69
Example: Leaf Unit = 10

If we have data with values such as

1806   1717   1974   1791   1682   1910   1838

a stem-and-leaf display of these data will be

Leaf Unit = 10

16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.
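
The leaf-unit rule can be sketched as a small function. This is an illustrative helper, not something from the slides: `stem_and_leaf` and its `leaf_unit` parameter are hypothetical names, and the function assumes integer data with an integer leaf unit (so it covers the Leaf Unit = 10 case but not Leaf Unit = 0.1).

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit):
    """Return {stem: [leaves]} for integer data and an integer leaf unit."""
    display = defaultdict(list)
    for v in sorted(values):
        scaled = v // leaf_unit          # drop digits below the leaf unit: 1682 -> 168
        stem, leaf = divmod(scaled, 10)  # the last remaining digit is the leaf
        display[stem].append(leaf)
    return dict(display)

data = [1806, 1717, 1974, 1791, 1682, 1910, 1838]
display = stem_and_leaf(data, leaf_unit=10)

print("Leaf Unit = 10")
for stem in sorted(display):
    print(f"{stem} | {' '.join(str(leaf) for leaf in display[stem])}")
```

Integer division discards the units digit, which is the "rounded down" behavior described for 1682.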

70
Crosstabulations and Scatter Diagrams

■ Thus far we have focused on methods that are used
to summarize the data for one variable at a time.
■ Often a manager is interested in tabular and
graphical methods that will help understand the
relationship between two variables.
■ Crosstabulation and a scatter diagram are two
methods for summarizing the data for two variables
simultaneously.

71
Crosstabulation
■ A crosstabulation is a tabular summary of data for
two variables and helps to reveal the
relationship between the two variables.
■ Crosstabulation can be used when:
• One variable is categorical and the other is
quantitative,
• Both variables are categorical, or
• Both variables are quantitative.
■ The left and top margin labels define the classes for
the two variables.

72
Crosstabulation

■ Example: Finger Lakes Homes

The number of Finger Lakes homes sold for each style and price for the past two years is shown below. Price range is the quantitative variable; home style is the categorical variable.

Price                     Home Style
Range         Colonial   Log   Split   A-Frame   Total
≤ $99,000        18        6     19       12       55
> $99,000        12       14     16        3       45

Total            30       20     35       15      100

73
Crosstabulation

■ Example: Finger Lakes Homes

Insights Gained from Preceding Crosstabulation
• The greatest number of homes (19) in the sample
are a split-level style and priced at less than or
equal to $99,000.
• Only three homes in the sample are an A-Frame
style and priced at more than $99,000.

74
Crosstabulation

■ Example: Finger Lakes Homes

Price                     Home Style
Range         Colonial   Log   Split   A-Frame   Total
≤ $99,000        18        6     19       12       55
> $99,000        12       14     16        3       45

Total            30       20     35       15      100

The Total column (55, 45) is the frequency distribution for the price range variable.
The Total row (30, 20, 35, 15) is the frequency distribution for the home style variable.
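
The two margins can be recomputed from the cell counts. The sketch below stores the crosstabulation as a dictionary of rows; the names (`counts`, `row_totals`, `col_totals`) and the ASCII label "<= $99,000" are illustrative choices.

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Cell counts from the Finger Lakes Homes crosstabulation, one list per price range.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Right margin: frequency distribution for the price range variable.
row_totals = {price: sum(row) for price, row in counts.items()}

# Bottom margin: frequency distribution for the home style variable.
col_totals = {style: sum(row[i] for row in counts.values())
              for i, style in enumerate(styles)}

grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```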
75
Using Excel’s PivotTable Report
to Create a Crosstabulation
■ Excel Worksheet (showing partial data)
A B C D E
1 Home Price ($) Style
2 1 >99K Colonial
3 2 <=99K Log
4 3 >99K Log
5 4 <=99K A-Frame
6 5 <=99K Colonial
7 6 <=99K Split-Level
8 7 >99K A-Frame
9 8 >99K Colonial
Note: Rows 10-101 are not shown.

76
Using Excel’s PivotTable Report
to Create a Crosstabulation
■ Displaying the Initial PivotTable Field List and
PivotTable Report
Step 1 Click the Insert tab on the Ribbon
Step 2 In the Tables group, click the icon above the
word PivotTable
Step 3 When the Create PivotTable dialog box appears:
Choose Select a Table or Range
Enter A1:C101 in the Table/Range box
Choose New Worksheet as the location for
the PivotTable Report
Click OK

77
Using Excel’s PivotTable Report
to Create a Crosstabulation
■ Setting Up the PivotTable Field List
Step 1 In the PivotTable Field List, go to Choose Fields
       to add to report
       Drag the Price ($) field to the Row Labels area
       Drag the Style field to the Column Labels area
       Drag the Home field to the Values area
Step 2 Click on Sum of Home in the Values area
Step 3 Click Value Field Settings from the list of options
Step 4 When the Value Field Settings dialog box appears:
       Under Summarize value field by, choose Count
       Click OK

78
Using Excel’s PivotTable Report
to Create a Crosstabulation
■ Value Worksheet
A B C D E F G
1 Count of Home Style
2 Price ($) Colonial Log Split-Level A-Frame Grand Total
3 <=99K 18 6 19 12 55
4 >99K 12 14 16 3 45
5 Grand Total 30 20 35 15 100
6

79
Crosstabulation: Row or Column Percentages
■ Converting the entries in the table into row
percentages or column percentages can
provide additional insight about the
relationship between the two variables.

80
Crosstabulation: Row Percentages

■ Example: Finger Lakes Homes

Price                     Home Style
Range         Colonial    Log     Split   A-Frame   Total
≤ $99,000       32.73    10.91    34.55    21.82     100
> $99,000       26.67    31.11    35.56     6.67     100

Note: row totals are actually 100.01 due to rounding.

(Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100 = 26.67
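
Row percentages divide each cell by its row total. A minimal sketch, using the cell counts from the Finger Lakes Homes crosstabulation (names are illustrative):

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Cell counts from the Finger Lakes Homes crosstabulation.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentage = (cell count / row total) x 100, rounded to two decimals.
row_pct = {
    price: [round(100 * c / sum(row), 2) for c in row]
    for price, row in counts.items()
}

for price, pcts in row_pct.items():
    cells = "  ".join(f"{p:6.2f}" for p in pcts)
    print(f"{price:<12}{cells}")
```

Summing the rounded values in either row gives 100.01, which is the rounding effect mentioned in the note.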

81
Crosstabulation: Column Percentages

■ Example: Finger Lakes Homes

Price                     Home Style
Range         Colonial    Log     Split   A-Frame
≤ $99,000       60.00    30.00    54.29    80.00
> $99,000       40.00    70.00    45.71    20.00
Total             100      100      100      100

(Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100 = 40.00
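
Column percentages work the same way but divide by the column total. A short sketch with illustrative names:

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]
prices = ["<= $99,000", "> $99,000"]

# Cell counts from the Finger Lakes Homes crosstabulation, one list per price range.
counts = [
    [18, 6, 19, 12],   # <= $99,000
    [12, 14, 16, 3],   # > $99,000
]

# Column percentage = (cell count / column total) x 100, rounded to two decimals.
col_pct = {}
for j, style in enumerate(styles):
    col_total = sum(row[j] for row in counts)
    col_pct[style] = [round(100 * row[j] / col_total, 2) for row in counts]

for style, pcts in col_pct.items():
    print(f"{style:<10}" + "  ".join(f"{p:6.2f}" for p in pcts))
```

Each list holds the percentages for the two price ranges in the order given by `prices`.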

82
Crosstabulation: Simpson’s Paradox

■ Data in two or more crosstabulations are often
aggregated to produce a summary crosstabulation.
■ We must be careful in drawing conclusions about the
relationship between the two variables in the
aggregated crosstabulation.
■ Simpson’s Paradox: In some cases the conclusions
based upon an aggregated crosstabulation can be
completely reversed if we look at the unaggregated
data. Before drawing conclusions about
relationships between two variables (for
aggregated data), you must investigate whether
any hidden variables could affect the results.

83
Scatter Diagram and Trendline

■ A scatter diagram is a graphical presentation of the
relationship between two quantitative variables.
■ One variable is shown on the horizontal axis and the
other variable is shown on the vertical axis.
■ The general pattern of the plotted points suggests the
overall relationship between the variables.
■ A trendline is an approximation of the relationship.

84
Scatter Diagram

■ A Positive Relationship

85
Scatter Diagram

■ A Negative Relationship

86
Scatter Diagram

■ No Apparent Relationship

87
Scatter Diagram

■ Example: Panthers Football Team

The Panthers football team is interested in
investigating the relationship, if any, between
interceptions made and points scored.
x = Number of y = Number of
Interceptions Points Scored
1 14
3 24
2 18
1 17
3 30

88
Scatter Diagram

[Figure: scatter diagram. Horizontal axis (x): Number of Interceptions, 0 to 4. Vertical axis (y): Number of Points Scored, 0 to 35.]

89
Example: Panthers Football Team

■ Insights Gained from the Preceding Scatter Diagram
• The scatter diagram indicates a positive relationship
between the number of interceptions and the
number of points scored.
• Higher points scored are associated with a higher
number of interceptions.
• The relationship is not perfect; the plotted points in
the scatter diagram do not all lie on a straight line.

90
Using Excel’s Chart Tools to Construct
a Scatter Diagram and Trendline
■ Excel Worksheet (showing data)
A B C
Number of Number of
1 Interceptions Points Scored
2 1 14
3 3 24
4 2 18
5 1 17
6 3 30
7

91
Using Excel’s Chart Tools to
Construct a Scatter Diagram and
Trendline
Step 1 Select cells A2:B6
Step 2 Click the Insert tab on the Excel Ribbon
Step 3 In the Charts group, click Scatter
Step 4 When the list of scatter diagram subtypes appears:
Click Scatter with only Markers
Step 5 In the Chart Layout group, click Layout 1
Step 6 Select the Chart Title and replace it with Scatter
Diagram for the Panthers
Step 7 Select the Horizontal Axis (Value) Title and
replace it with Number of Interceptions
… continued
92
Using Excel’s Chart Tools to
Construct a Scatter Diagram and
Trendline
Step 8 Select the Vertical (Value) Axis Title and replace
it with Number of Points Scored
Step 9 Right click Series 1 Legend Entry and click Delete
- - - - - - - - - - - - - - - - To Add a Trendline - - - - - - - - - - - - - - -
Step 10 Position the pointer over any data point in the
        scatter diagram and right-click to display options
Step 11 Choose Add Trendline
Step 12 When the Format Trendline dialog box appears:
        Select Trendline Options
        Choose Linear from Trend/Regression Type list
        Click Close
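
For readers who want to see what a linear trendline amounts to numerically, the sketch below fits a straight line to the Panthers data by ordinary least squares, which is the method Excel's Linear trendline option uses. The slides do not show the fitted equation, so the slope and intercept here are computed from the data rather than taken from the source, and the variable names are illustrative.

```python
# Panthers data: x = number of interceptions, y = number of points scored.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 30]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope: sum of cross-deviations over sum of squared x-deviations.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

slope = sxy / sxx
intercept = y_bar - slope * x_bar  # the fitted line passes through (x_bar, y_bar)

print(f"Trendline: y = {slope:.2f}x + {intercept:.2f}")
```

The positive slope is consistent with the positive relationship read off the scatter diagram.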
93
Using Excel’s Chart Tools to
Construct a Scatter Diagram and
Trendline
[Figure: the finished chart, "Scatter Diagram for the Panthers", placed in worksheet rows 8-20 of columns A-C. Horizontal axis: Number of Interceptions, 0 to 4. Vertical axis: Number of Points Scored, 0 to 35.]

94
Tabular and Graphical Methods

Categorical Data
  Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Percent Freq. Distribution
    • Crosstabulation
  Graphical Methods
    • Bar Graph
    • Pie Chart

Quantitative Data
  Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • % Freq. Dist.
    • Cum. Freq. Dist.
    • Cum. Rel. Freq. Distribution
    • Cum. % Freq. Distribution
    • Crosstabulation
  Graphical Methods
    • Dot Plot
    • Histogram
    • Ogive
    • Stem-and-Leaf Display
    • Scatter Diagram
95
End of Chapter 2, Part B

96
