
Math 205

Project: Crash Test Dummies


Spring 2014

Purpose: The purpose of this lab is for you to demonstrate what you have learned this
semester in Statistics and in R. The project is more open-ended than the labs. You are
going to have to develop some of your own questions to pursue, using the rich data set
provided.

Data: The data set consists of crash results using Crash Test Dummies. An Excel file
accompanies this project. There are 352 cases and 14 variables. Their descriptions follow.
There are 4 potential response variables listed as variables 5-8. Each of these is
quantitative. Many of the remaining would make good independent variables for
associating with the response. Some are categorical; some are quantitative.

Crash Test Dummies
Reference:
National Transportation Safety Administration
Description:
Data based on trials in which stock automobiles are crashed into a wall at 35 mph with
dummies in the driver and front passenger seats.

Number of cases:
352
Variable Names:

1. make: Car make
2. Model: Model of that car
3. carID: Usually the combination of make and model
4. carID_&_Year: Full ID of the car
5. Head_IC: Head injury criterion
6. Chest_decel: Chest deceleration
7. L_Leg: Left femur load
8. R_Leg: Right femur load
9. D/P: Whether the dummy is in the Driver or Passenger seat
10. Protection: Kind of protection (seat belt, air bag, etc.)
11. Doors: Number of doors on the car
12. Year: Year of the car
13. Wt: Weight in pounds
14. Size: A categorical variable classifying the cars by type (e.g., light, minivan)


Requirements: You are to conduct an analysis on the data. It is far too rich a data set to
explore completely in a single project, but you should be able to pose and answer a
handful of interesting questions using graphical displays, statistical summaries, tests,
confidence intervals, or models. For example, a simple question would be to ask, "Is
there a difference in left femur load according to car make?" Such a question could be
explored graphically with parallel boxplots of L_Leg over the different makes. It could be
explored analytically by choosing just two makes and conducting a t-test for difference in
means. If you want to read ahead to one-way ANOVA, you could conduct an F-test for
the same question with the comparison being among all car makes. A more complex
question might be, "Does the effect of car weight on head injury criterion differ according
to whether or not seatbelts were used?" For that question, you might create a regression
model for head injury criterion as a function of car weight for each seatbelt situation. We
haven't studied how to compare regressions, so your comparison would be somewhat
subjective, but we do know how to interpret each regression model individually.
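
The two example questions above could be approached in R roughly as follows. This is only a sketch: the data frame is simulated so that the code runs on its own, the numbers in it are invented, and the level names (MakeA, belt_used, and so on) are placeholders rather than values from the project file. Only the column names make, L_Leg, Head_IC, Wt, and Protection come from the variable list.

```r
# Simulated stand-in data: every value below is invented for illustration
# and is NOT the project data set. Level names are hypothetical placeholders.
set.seed(205)
n <- 120
crash <- data.frame(
  make       = factor(sample(c("MakeA", "MakeB", "MakeC"), n, replace = TRUE)),
  Protection = factor(sample(c("belt_used", "no_belt"), n, replace = TRUE)),
  Wt         = round(runif(n, 2000, 4500))
)
crash$L_Leg   <- rnorm(n, mean = 1000, sd = 300)
crash$Head_IC <- 1500 - 0.2 * crash$Wt + rnorm(n, sd = 200)

# Graphical view: parallel boxplots of left femur load by make
boxplot(L_Leg ~ make, data = crash, xlab = "make", ylab = "L_Leg")

# Two makes only: two-sample t-test for a difference in means
# (t.test uses the Welch unequal-variance version by default)
two_makes <- droplevels(subset(crash, make %in% c("MakeA", "MakeB")))
tt <- t.test(L_Leg ~ make, data = two_makes)
print(tt)

# All makes at once: one-way ANOVA F-test
fit_aov <- aov(L_Leg ~ make, data = crash)
print(summary(fit_aov))

# More complex question: one regression of Head_IC on Wt per protection level
fits <- lapply(split(crash, crash$Protection),
               function(d) lm(Head_IC ~ Wt, data = d))
print(sapply(fits, coef))  # intercept and slope for each group, side by side
```

Comparing the two slopes printed at the end is the "somewhat subjective" comparison described above; each fitted model can still be interpreted on its own with summary().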

(1) Choose a minimum of 5 questions to explore.
a. Clearly state what association/relationship you are exploring.
b. Use R, Minitab, or JMP to explore the data and provide any graphs and
analysis.
c. Provide neatly summarized output and graphs in the report.
d. Interpret your results.

(2) You must include, somewhere in your investigation, the following techniques.
You may use some more than once. Don't expect to use every technique for every
question.
a. Scatterplot, with model line/curve included in the graph.
b. Boxplot
c. Bargraph or Pie Chart
d. Table
e. Summary statistics
f. Regression model
g. Confidence interval (t or z)
h. Hypothesis tests (one of each)
i. t or z
ii. Chi-square (Contingency Table)
iii. F
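
For several of the techniques listed above, the base R calls might look like the sketch below. It again uses simulated, invented data so that it runs without the Excel file. The name D_P is an assumed rename of the D/P variable (a slash is not valid in a plain R column name), and the Protection levels are hypothetical placeholders.

```r
# Simulated stand-in data: invented values, NOT the project data set.
# D_P is an assumed rename of "D/P"; level names are hypothetical.
set.seed(2014)
n <- 200
crash <- data.frame(
  D_P         = factor(sample(c("Driver", "Passenger"), n, replace = TRUE)),
  Protection  = factor(sample(c("belt_used", "no_belt"), n, replace = TRUE)),
  Wt          = round(runif(n, 2000, 4500)),
  Chest_decel = rnorm(n, mean = 50, sd = 10)
)
crash$Head_IC <- 1500 - 0.2 * crash$Wt + rnorm(n, sd = 200)

# Table and bar graph of two categorical variables
tab <- table(crash$D_P, crash$Protection)
print(tab)
barplot(tab, beside = TRUE, legend.text = TRUE)

# Chi-square test on the contingency table
# (for a 2x2 table, chisq.test applies a continuity correction by default)
chi <- chisq.test(tab)
print(chi)

# Summary statistics and a 95% t confidence interval for a mean
print(summary(crash$Chest_decel))
ci <- t.test(crash$Chest_decel, conf.level = 0.95)$conf.int
print(ci)

# Scatterplot with the fitted regression line drawn on the graph
fit <- lm(Head_IC ~ Wt, data = crash)
plot(Head_IC ~ Wt, data = crash)
abline(fit)
print(coef(fit))
```

With the real data, replace the simulated data frame with the imported Excel file and check the actual level names before running the tests.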

Paper: Your paper should include the requirements above, be approximately 10 pages in
length (including statistical output), and be prepared with the following structure.

(1) Introduction (brief overview of the data set and types of questions)
(2) Explorations
a. Question 1 (This can be a couple of related questions)
i. Associations you are exploring
ii. R results
iii. Interpretation
b. Question 2
i.
(3) Summary of most significant findings (text only, one paragraph)
(4) Future Investigations (What would you ask next and how would you answer it
with this data?)

Grading Rubric:
(1) For questions (80%)
a. Clarity of question investigated (25%)
b. R investigation: appropriate use and presentation (50%)
c. Clarity and correctness of interpretation (25%)
(2) For paper (20%)
a. Well-written. Free of grammar errors, etc. (50%)
b. Correct format (50%)

Note: You may work in groups of 2.

Due: May 19, 2014.
