COURSE: DATA MINING
10/26/2016
Chapter 2
DATA EXPLORATION AND REDUCTION
Table of contents
1. Have a look at data
2. Explore individual variables
3. Explore multiple variables
4. More explorations
5. Data reduction
SECTION 1
HAVE A LOOK AT DATA
churn[1:10,"Day.Mins"]
churn$Day.Mins[1:10]
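Both lines above select the first ten values of the same column. The following is a minimal sketch of that equivalence, together with a few other common first looks at a data frame. The real churn data is not reproduced here, so the small data frame below is a hypothetical stand-in with made-up values.

```r
# Hypothetical stand-in for the churn data (values are made up)
churn <- data.frame(
  Day.Mins = c(265.1, 161.6, 243.4, 299.4, 166.7, 223.4,
               218.2, 157.0, 184.5, 258.6, 129.1, 187.7),
  Churn = c("False.", "False.", "False.", "False.", "False.", "False.",
            "False.", "False.", "False.", "False.", "True.", "False.")
)

# Row/column indexing and $ indexing return the same numeric vector
by_index <- churn[1:10, "Day.Mins"]
by_name  <- churn$Day.Mins[1:10]
identical(by_index, by_name)

# Other common first looks at a data frame
dim(churn)               # number of rows and columns
str(churn)               # column names and types
head(churn, 3)           # first three rows
summary(churn$Day.Mins)  # five-number summary plus the mean
```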
SECTION 2
EXPLORE INDIVIDUAL VARIABLES
var(churn$Day.Mins)
sd(churn$Day.Mins)
quantile(churn$Day.Mins)
quantile(churn$Day.Mins,c(0.1,0.5,0.65))
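The calls above can be tried on any numeric vector. A small sketch, assuming a made-up vector `day_mins` in place of `churn$Day.Mins`, shows that `sd()` is the square root of `var()` and how the second argument of `quantile()` selects the percentiles.

```r
# Made-up daytime minutes standing in for churn$Day.Mins
day_mins <- c(120, 150, 180, 200, 210, 175, 160, 190, 230, 140)

v <- var(day_mins)   # sample variance (divides by n - 1)
s <- sd(day_mins)    # sample standard deviation
all.equal(s, sqrt(v))

# Default quantiles: minimum, quartiles, maximum
q_default <- quantile(day_mins)
names(q_default)

# Custom probabilities: 10th, 50th and 65th percentiles
q_custom <- quantile(day_mins, probs = c(0.1, 0.5, 0.65))
q_custom
```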
Type ?hist to open the help page for hist(). What do these arguments control?
breaks
probability
xlab
ylab
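One way to explore these four arguments is to set each of them explicitly. The sketch below assumes simulated values in place of `churn$Day.Mins`; the labels and the bin count are illustrative choices, not the course's settings.

```r
# Simulated values standing in for churn$Day.Mins
set.seed(1)
day_mins <- rnorm(200, mean = 180, sd = 50)

pdf(NULL)  # draw to a null device so the sketch also runs without a display
h <- hist(day_mins,
          breaks = 20,          # suggested number of bins (R may adjust it)
          probability = TRUE,   # y-axis shows density instead of counts
          xlab = "Day minutes",
          ylab = "Density",
          main = "Histogram of day minutes")
lines(density(day_mins))        # overlay a kernel density estimate
dev.off()

# On the density scale the bar areas sum to 1
sum(h$density * diff(h$breaks))
```

In an interactive session the `pdf(NULL)` and `dev.off()` lines can be dropped so that the plot appears on screen.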
plot(density(churn$Day.Mins))
  no  yes
3010  323
barplot(table(churn$Int.l.Plan))
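For a categorical variable, `table()` counts the occurrences of each level and `barplot()` draws one bar per count. A minimal sketch, assuming a made-up vector `intl_plan` in place of `churn$Int.l.Plan`:

```r
# Made-up plan indicator standing in for churn$Int.l.Plan
intl_plan <- c("no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no")

counts <- table(intl_plan)   # frequency of each level
counts
prop.table(counts)           # the same counts as proportions

pdf(NULL)  # null device, so the plot call also works without a display
barplot(counts, xlab = "International plan", ylab = "Number of customers")
dev.off()
```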
SECTION 3
EXPLORE MULTIPLE VARIABLES
cov(churn$Eve.Mins,churn$Day.Mins)
[1] 19.45318
cor(churn$Eve.Mins,churn$Day.Mins)
[1] 0.007042511
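Covariance depends on the units of the two variables, while correlation rescales it by both standard deviations, so a value near zero such as the 0.007 above indicates almost no linear relationship. The sketch below checks that identity on made-up vectors standing in for `churn$Eve.Mins` and `churn$Day.Mins`.

```r
# Made-up minutes standing in for churn$Eve.Mins and churn$Day.Mins
eve_mins <- c(197.4, 195.5, 121.2, 61.9, 148.3, 220.6, 348.5, 103.1)
day_mins <- c(265.1, 161.6, 243.4, 299.4, 166.7, 223.4, 218.2, 157.0)

cv <- cov(eve_mins, day_mins)   # covariance, in squared minutes
r  <- cor(eve_mins, day_mins)   # Pearson correlation, unitless, in [-1, 1]

# Correlation is covariance divided by the product of the standard deviations
all.equal(r, cv / (sd(eve_mins) * sd(day_mins)))
```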
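# Scatter plot coloured by churn status
# (the start of the original call is missing; the x and y variables here are assumed)
plot(churn$Day.Mins, churn$Eve.Mins,
     col = ifelse(churn$Churn=="True.","red","blue"))
# Adding legend
legend("topright",c("True.","False."),col =c("red","blue"),pch = 1,title = "Churn")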
pairs(~Day.Mins+Eve.Mins+Night.Mins,data=churn)
aggregate(Day.Mins~Churn,data=churn,FUN=mean)
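The `aggregate()` call computes the mean of `Day.Mins` within each level of `Churn`. A minimal sketch on a hypothetical stand-in data frame (made-up values), with `tapply()` shown as another way to get the same group means:

```r
# Hypothetical stand-in for the churn data (values are made up)
churn <- data.frame(
  Day.Mins = c(265.1, 161.6, 243.4, 299.4, 166.7, 223.4),
  Churn    = c("False.", "False.", "True.", "True.", "False.", "True.")
)

# Mean of Day.Mins within each level of Churn, returned as a data frame
by_churn <- aggregate(Day.Mins ~ Churn, data = churn, FUN = mean)
by_churn

# tapply() gives the same group means as a named vector
group_means <- tapply(churn$Day.Mins, churn$Churn, mean)
group_means
```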
library(descr)
crosstab(churn$VMail.Plan,churn$Churn,type="f",addmargins=TRUE)
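A frequency cross-tabulation with totals can also be sketched in base R, without the descr package. This is a swapped-in alternative using `table()` and `addmargins()`, not the `crosstab()` call above, and the data frame is a hypothetical stand-in with made-up values.

```r
# Hypothetical stand-in for the churn data (values are made up)
churn <- data.frame(
  VMail.Plan = c("yes", "no", "no", "yes", "no", "no", "yes", "no"),
  Churn      = c("False.", "False.", "True.", "False.",
                 "True.", "False.", "False.", "False.")
)

# Two-way frequency table: rows are VMail.Plan, columns are Churn
tab <- table(VMail.Plan = churn$VMail.Plan, Churn = churn$Churn)

# Append row and column totals (labelled "Sum")
tab_totals <- addmargins(tab)
tab_totals

# Row proportions: share of each Churn level within each plan group
prop.table(tab, margin = 1)
```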