
Math 1040 Skittles Term Project

The Skittles Term Project began with each student in my class buying a 2.17-ounce bag of
Skittles candy and counting how many candies of each color were in the bag (partial candies
were not counted). After counting, we submitted our results to Professor Southerland, who
compiled them into classroom totals for the number of candies overall and by color. From
these data we are to create a pie chart and a Pareto chart of the number of candies of each
color.

Organizing and Displaying Categorical Data: Colors

My results:
Color     Number   Proportion
Red       13       0.213
Orange    12       0.197
Yellow    18       0.295
Green     11       0.18
Purple    7        0.115
Total     61       1

Classroom Results:
Color     Number
Red       292
Orange    307
Yellow    299
Green     297
Purple    317
Total     1512
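
The proportions reported for each color are the color count divided by the total. The sketch below recomputes them from the counts in the two tables and sorts the classroom counts in the descending order a Pareto chart uses. The function names `proportions` and `pareto_order` are illustrative helpers, not part of the original project.

```python
# Counts taken from the "My results" and "Classroom Results" tables.
my_bag = {"Red": 13, "Orange": 12, "Yellow": 18, "Green": 11, "Purple": 7}
classroom = {"Red": 292, "Orange": 307, "Yellow": 299, "Green": 297, "Purple": 317}


def proportions(counts):
    """Return each color's share of the total, rounded to three decimals."""
    total = sum(counts.values())
    return {color: round(n / total, 3) for color, n in counts.items()}


def pareto_order(counts):
    """Return (color, count) pairs sorted from most to least frequent."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


my_props = proportions(my_bag)
class_props = proportions(classroom)

print(my_props)
print(class_props)
print(pareto_order(classroom))
```
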

[Pie chart: My Proportion Results. Red 21%, Orange 20%, Yellow 30%, Green 18%, Purple 12%]

[Pie chart: Classroom Proportion of Skittles of different colors. Purple 21%, Orange 20%,
Yellow 20%, Green 20%, Red 19%]

[Pareto chart: My Proportion Results. Yellow 0.30, Red 0.21, Orange 0.20, Green 0.18,
Purple 0.12]

[Pareto chart: Classroom Proportion of Skittles of different colors. Purple 0.21,
Orange 0.20, Yellow 0.20, Green 0.20, Red 0.19]

In the classroom pie chart the colors are fairly evenly represented. Orange, green, and
yellow each come in at about 20%, purple is highest at 21%, and red is lowest at 19%. This
suggests that the Skittles company aims for roughly equal numbers of each color overall.
Comparing the proportions in my bag with the classroom proportions, the yellow candies
(30% versus 20%) and the purple candies (12% versus 21%) show the largest differences.
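
One way to see which colors differ most between my bag and the class is to subtract the two sets of proportions. The sketch below does that with the counts reported earlier. It is a descriptive comparison only, not a significance test, and the variable names are illustrative.

```python
# Counts from the "My results" and "Classroom Results" tables.
my_bag = {"Red": 13, "Orange": 12, "Yellow": 18, "Green": 11, "Purple": 7}
classroom = {"Red": 292, "Orange": 307, "Yellow": 299, "Green": 297, "Purple": 317}

my_total = sum(my_bag.values())
class_total = sum(classroom.values())

# Difference in proportion (my bag minus class) for each color.
gaps = {
    color: my_bag[color] / my_total - classroom[color] / class_total
    for color in my_bag
}

# Colors ordered from the largest absolute gap to the smallest.
largest = sorted(gaps, key=lambda color: abs(gaps[color]), reverse=True)

for color in largest:
    print(f"{color:7s} {gaps[color]:+.3f}")
```
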

Organizing and Displaying Quantitative Data: the Number of Candies per Bag
Number of bags in the sample: 25
Total Candies in my bag: 61
Class Mean: 60.5
Class Std. Dev.: 1.76

Five-Number Summary
Minimum          55
First Quartile   60
Median           61
Third Quartile   62
Maximum          63

Skittles Boxplot

[Boxplot of total candies per bag. Axis runs from 50 to 64]

Skittles Histogram

[Histogram of total number of candies. Axes: Bin, Frequency]

Total Candies   Frequency
50              0
52              0
54              0
56              1
58              2
60              9
62              12
64              1

From the boxplot and histogram above, the distribution appears to be skewed to the left,
which is what I expected to see based on the data entered. The tallest bar, 12 bags, is at
around 62 candies; the next 9 bags have around 60 candies, and 2 bags have around 58
candies. The remaining two bags are at the extremes, one at around 56 candies and one at
around 64. My total number of candies matched the class median of 61, which was pretty
cool. My bag held 61 candies and the class mean for the 25 bags was 60.5, so the two are
pretty close. The proportions of yellow and purple candies in my bag differed noticeably
from the class proportions for those colors, but the other colors match fairly closely.
The histogram and the boxplot show a distribution that is approximately normal but skewed
to the left, with an outlier at 55.
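
A common way to check whether 55 counts as an outlier is the 1.5 x IQR rule. The report does not say which rule was used, so this is an assumption. The sketch uses the quartiles from the five-number summary (first quartile 60, third quartile 62) with the minimum of 55 and maximum of 63.

```python
# Values from the five-number summary.
minimum, q1, q3, maximum = 55, 60, 62, 63

# Interquartile range and the 1.5 x IQR fences (assumed outlier rule).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

min_is_outlier = minimum < lower_fence
max_is_outlier = maximum > upper_fence

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print(f"minimum {minimum} is an outlier: {min_is_outlier}")
print(f"maximum {maximum} is an outlier: {max_is_outlier}")
```

Under this rule any bag with fewer than 57 candies falls below the lower fence, which is consistent with treating 55 as an outlier.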

Quantitative vs. Categorical (qualitative) Data


Categorical data are labels or categories, such as color, that can only be counted, while
quantitative data are numerical values that can be counted or measured, such as the number
of candies in a bag. For quantitative data you can calculate statistics such as the mean,
median, standard deviation, and range; for categorical data these calculations are not
possible, and you may only be able to report frequencies. Bar charts and pie charts do not
make sense for quantitative data; use histograms, stem-and-leaf plots, box plots, line
graphs, and XY charts instead. Conversely, histograms, stem-and-leaf plots, box plots, line
graphs, and XY charts do not make sense for categorical data; use pie charts and bar charts
instead. Categorical and quantitative data describe items differently and need different
charts to express their points.
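
As a small illustration of the distinction, the sketch below treats the colors in my bag as categorical data (only frequencies and the most common category are meaningful) and the candies per bag as quantitative data (a mean is meaningful). The color list is rebuilt from the counts in my results, and the class mean is computed from the classroom total of 1512 candies across 25 bags. The per-bag counts themselves are not given in the report, so no other quantitative statistic is attempted.

```python
from collections import Counter
import statistics

# Categorical: one color label per candy, rebuilt from my bag's counts.
colors = (
    ["Red"] * 13 + ["Orange"] * 12 + ["Yellow"] * 18
    + ["Green"] * 11 + ["Purple"] * 7
)
frequencies = Counter(colors)            # frequency table by color
modal_color = statistics.mode(colors)    # most common category

# Quantitative: the mean number of candies per bag for the class.
total_candies, bags = 1512, 25
mean_per_bag = total_candies / bags

print(frequencies)
print("most common color:", modal_color)
print("mean candies per bag:", mean_per_bag)
```
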
(See the 95% C.I. scan)
(See the 99% C.I. scan)
(See the 98% C.I. scan)
In the 95% C.I. we estimated the true proportion of purple candies, the
population parameter, from the classroom sample. We are 95% confident that
the true population proportion of purple candies is between .189 and .231.
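
The scan with the hand calculation is not reproduced here, so the following is only a sketch of the usual large-sample interval for a proportion, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n). It assumes the classroom counts were used (317 purple out of 1512 candies). The function name `proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumed inputs: purple count and total from the classroom results.
low, high = proportion_ci(317, 1512, 0.95)
print(f"{low:.3f} < p < {high:.3f}")
```

Small differences in the last digit can arise if the sample proportion is rounded before the margin of error is computed.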

In the 99% C.I. we estimated the true mean number of candies per bag from
the classroom sample. We are 99% confident that the true mean number of
candies per bag is between 59.5 and 61.5.
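
A sketch of the t interval for a mean, using the class mean of 60.5, the standard deviation of 1.76, and n = 25. The critical value 2.797 is the standard t-table value for 24 degrees of freedom with 0.005 in each tail. Hard-coding it instead of computing it is a simplification made here.

```python
from math import sqrt

n = 25            # number of bags
mean = 60.5       # class mean
std_dev = 1.76    # class standard deviation
t_critical = 2.797  # t table: 24 degrees of freedom, 99% confidence

# Margin of error: t * s / sqrt(n)
margin = t_critical * std_dev / sqrt(n)
low, high = mean - margin, mean + margin

print(f"{low:.1f} < mu < {high:.1f}")
```
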
In the 98% C.I. we estimated the true standard deviation of the number of
candies per bag from the classroom sample. We are 98% confident that the
true standard deviation of the number of candies per bag is between 1.315
and 2.617.
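
A sketch of the chi-square interval for a standard deviation, using s = 1.76 and n = 25. The critical values 42.980 and 10.856 are the standard chi-square table values for 24 degrees of freedom with 0.01 in each tail. They are hard-coded here as an assumption about the table used.

```python
from math import sqrt

n = 25
std_dev = 1.76
chi2_right = 42.980  # chi-square table: 24 df, area 0.01 to the right
chi2_left = 10.856   # chi-square table: 24 df, area 0.99 to the right

df = n - 1
# Interval: sqrt((n-1)s^2 / chi2_right) < sigma < sqrt((n-1)s^2 / chi2_left)
low = sqrt(df * std_dev**2 / chi2_right)
high = sqrt(df * std_dev**2 / chi2_left)

print(f"{low:.3f} < sigma < {high:.3f}")
```
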

Hypothesis Testing: a procedure for testing a claim about a property of a
population.

(See Hypothesis Test .01 scan)

(See the Hypothesis Test .05 scan)

At the .01 significance level, there is not enough evidence to reject the
claim that 20% of Skittles candies are green. There is a .01 probability of
making a Type I error (rejecting a claim that is actually true).
At the .05 significance level, we have enough evidence to reject the claim
that the mean number of Skittles per bag is 56. There is a .05 probability of
making a Type I error.
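
The test scans are not reproduced here, so the following is a hedged sketch of how the two test statistics could be computed. It assumes two-tailed tests, the classroom green count (297 of 1512) for the proportion test, and the class mean, standard deviation, and n = 25 for the mean test. The critical values 2.576 and 2.064 are standard table values for those assumptions.

```python
from math import sqrt

# Test 1 (alpha = .01): claim that 20% of candies are green.
green, candies = 297, 1512      # assumed sample: classroom counts
p0 = 0.20
p_hat = green / candies
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / candies)
z_critical = 2.576              # two-tailed z critical value, alpha = .01
reject_green = abs(z) > z_critical

# Test 2 (alpha = .05): claim that the mean number per bag is 56.
n, mean, std_dev = 25, 60.5, 1.76
mu0 = 56
t = (mean - mu0) / (std_dev / sqrt(n))
t_critical = 2.064              # two-tailed t critical value, 24 df, alpha = .05
reject_mean = abs(t) > t_critical

print(f"z = {z:.3f}, reject claim p = 0.20: {reject_green}")
print(f"t = {t:.2f}, reject claim mu = 56: {reject_mean}")
```

Under these assumptions the decisions agree with the conclusions stated above: the proportion claim is not rejected and the mean claim is rejected.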
Reflection
For the C.I. for the mean, the normality requirement is not met: we do not
know that the population is normally distributed, and the sample size is less
than 30. The histogram is not normal; it is skewed to the left, and there is
one outlier that may affect the results. The C.I. may therefore not be
reliable. The sample should also be a simple random sample.
For hypothesis testing, the requirement that the sample be a simple random
sample is not met.
For the test of a proportion, np and nq must each be at least 5:
np = (25)(.20) = 5
nq = (25)(.80) = 20
Both conditions are met, so we can assume that the binomial distribution can
be approximated by a normal distribution.
For the test of the mean, n is less than 30, so the requirement for a normal
distribution is not met.
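
The usual check for approximating a binomial distribution with a normal one is that np and nq are both at least 5. The sketch below applies it with n = 25 and the claimed proportion of 0.20, matching the products 5 and 20 above. Fractions are used so the comparison with 5 is exact.

```python
from fractions import Fraction

n = 25
p = Fraction(1, 5)   # claimed proportion of green candies, 0.20
q = 1 - p            # 0.80

np_value = n * p
nq_value = n * q

# Requirement: np >= 5 and nq >= 5.
normal_approx_ok = np_value >= 5 and nq_value >= 5

print(f"np = {np_value}, nq = {nq_value}, requirement met: {normal_approx_ok}")
```
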
For the C.I. for the standard deviation, the strict requirement of a normal
distribution is not met, which will affect the interval. The requirement of a
simple random sample is met.
The sample size is small, and a larger sample would give better estimates.
There may be errors from miscounting the candies. Because the sample
distribution is not normal, the results may contain errors. It is also
possible to make a Type I or a Type II error.
A larger class would supply more bags, increasing the sample size and
improving the estimates.
The C.I. contains the sample estimate, and we are 95% confident that the
true value of the population proportion of purple candies is within
.189 < p < .231.
The C.I. contains the sample estimate, and we are 99% confident that the
true mean number of candies per bag is within 59.5 < μ < 61.5.
The C.I. contains the sample estimate, and we are 98% confident that the
true standard deviation of the number of candies per bag is within
1.315 < σ < 2.617.
