You have spent the last 10 weeks learning about data and representations:
boxplots, scatter plots, linear regressions, outliers, influential points, and correlations. In
this data-saturated world, it's really important to understand and apply your knowledge
to analyzing actual data. We've done this in class: you collected some data (kneeling
heights and arm spans, for example) and you looked at some data about manatees, flight
times, and growth rates. But, except for the data you collected, someone else scoured the
interwebs and books to find and organize that data for you. Data isn't always clean like
that. It doesn't always come in nice little matched-up pairs. Someone has to do that work.
Now it's your turn.
Your Product
Your analysis should have at least:
- a table of your selected data and a paragraph explaining what the variables represent
- a box plot of each variable, complete with a description (remember SOCS: shape, outliers, center, and spread)
- a comparison of the variables, describing similarities and differences of the box plots
- a scatterplot of your data
- the equation of the best fit (regression) line
- an interpretation of the meaning of the slope in the context of your data
- some predictions based on the linear model (using the equation to find a missing value)
- a thorough analysis of outliers and influential points (removing an outlier from the data to see if it is influential)
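The regression, slope, and prediction items above can be sketched in a few lines. This is a minimal illustration, not the required method: the arm span and kneeling height values are made up for the example (they are not class data), centimeters are an assumed unit, and `least_squares_line` is a hypothetical helper name.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical paired measurements in cm (illustration only, not real data).
arm_span = [150, 160, 170, 180, 190]
kneeling_height = [112, 120, 126, 135, 142]

slope, intercept = least_squares_line(arm_span, kneeling_height)

# Use the equation to predict a missing value: kneeling height at a 175 cm arm span.
predicted = slope * 175 + intercept

print(f"y = {slope:.2f}x + {intercept:.2f}")
print(f"predicted kneeling height at 175 cm: {predicted:.2f}")
```

With these made-up numbers the slope is 0.75, which would be read in context as: each additional centimeter of arm span is associated with about 0.75 cm more kneeling height.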
https://ourworldindata.org/happiness-and-life-satisfaction/
https://ourworldindata.org/internet/
People's access to information results in less innocent happiness and more stress.
Statistics/Human Evolutionary Anatomy
Data Description, Analysis and Evolutionary Significance
T2 2016-17
Shape:
Internet users: bimodal
Happiness_levels: bimodal
Internet users:
Median: 250.313
Mean: 252.453
Mode: NA
Happiness levels:
Median: 7.161
Mean: 7.1702
Mode: NA
7. What do these center, shape, spread, outliers, and plot(s) tell you about the data?
Both histograms are bimodal, although there are several modes in the data.
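A mode reported as NA alongside a bimodal histogram is not a contradiction: the histogram shape describes peaks across bins, while the mode counts exact repeated values. The sketch below shows one plausible way a summary ends up with "Mode: NA". It assumes NA is reported when every value occurs exactly once, which is an assumption about the tool that produced the summary. The sample values are hypothetical and `mode_or_na` is an illustrative helper name.

```python
from collections import Counter
from statistics import mean, median

def mode_or_na(values):
    """Return the most frequent value(s), or "NA" when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value is unique, so no single value is "most common".
        return "NA"
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical yearly scores (illustration only, not the project data).
sample = [7.0, 7.1, 7.2, 7.3, 7.4]

print("mean:", round(mean(sample), 3))
print("median:", median(sample))
print("mode:", mode_or_na(sample))
```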
8. Scatterplot of the data, with linear regression line (line of best fit).
Happiness_levels_usa has two outliers, possibly because of an event in one year that
caused people's happiness to spike or drop by a larger-than-usual amount.
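One common way to decide whether a point counts as an outlier on a box plot is the 1.5 x IQR rule. The sketch below is a minimal version of that rule, not necessarily the method used for the plots here. The data values are hypothetical, `iqr_outliers` is an illustrative helper name, and different calculators and software compute quartiles slightly differently, so the fences may not match another tool exactly.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    # "inclusive" treats the data as the full population of interest;
    # other quartile conventions can give slightly different fences.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data (illustration only, not the project data).
data = [10, 12, 13, 14, 15, 16, 17, 18, 40]

outliers = iqr_outliers(data)
print("outliers:", outliers)
```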
(with outlier)
(without outlier)
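Comparing the fit with and without the outlier is how an influential point is identified: if the regression line changes a lot when the point is removed, the point is influential. A minimal sketch of that comparison follows. The numbers are hypothetical and chosen so the effect is easy to see, and `fit_line` is an illustrative helper name.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data (illustration only); the last point is the suspected outlier.
xs = [1, 2, 3, 4, 10]
ys = [2, 4, 6, 8, 0]

slope_all, intercept_all = fit_line(xs, ys)
slope_trim, intercept_trim = fit_line(xs[:-1], ys[:-1])

print(f"with outlier:    y = {slope_all:.2f}x + {intercept_all:.2f}")
print(f"without outlier: y = {slope_trim:.2f}x + {intercept_trim:.2f}")
```

In this made-up example the slope changes from -0.4 to 2 when the last point is removed, so that point would be called influential.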
Part D: Reasoning/Conclusion
Now it's time to present the story of the data in a clear way.
12. Provide at least two pieces of evidence & reasoning that support your chosen
claim.
a. First piece of evidence:
b.
Reasoning:
Over time, as people gain more access to the internet and it becomes a human right,
it is possible that their ignorant bliss is taken away by the constant news that finds its
way to everyone through social media.
Reasoning:
Here you can see that as the internet access levels in North America per year increase,
the happiness levels of US citizens decrease.
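The strength and direction of a relationship like this can be summarized with the correlation coefficient r. The sketch below computes Pearson's r by hand. The values are hypothetical (not the Our World in Data figures) and `pearson_r` is an illustrative helper name. A negative r describes a decreasing linear association only; on its own it does not show that one variable causes the other.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired values (illustration only, not the project data).
internet_access = [1, 2, 3, 4, 5]
happiness = [5, 4, 4, 2, 1]

r = pearson_r(internet_access, happiness)
print(f"r = {r:.3f}")
```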
13. Write a conclusion that supports your claim, and speaks about the
evolutionary significance.
Overall, people who have more access to the internet tend to be less happy, which tells
us that we should look more into what pushes us as humans to want that access so
much, since it is not benefiting us in terms of contentment.
Humans probably have a genetic predisposition to use screens, and this data tells us that
screen use ultimately affects our happiness negatively.
Further study of the connection between humans and causes of screen time addiction is
clearly warranted.
Statistics & Probability | Represents data with appropriate plots for single variables and also for paired data | Represents the data in a new and interesting way that addresses a question of your choosing
Representing & Describing Data | Accurately describes the distributions of the data being studied. Interprets linear model, including the correlation coefficient and making predictions | Demonstrates a deep understanding of the content through a thorough analysis of the data