Professional Documents
Culture Documents
Index
253
254 INDEX
E frequentists, 114n14
economy of scale, 199 functional dimension analysis
Edison, Thomas, 60 (FDA), 219
Empirical Model Building and future
Response Surfaces (Box), 57 of big data, data mining and
ensemble methods, 124126 machine learning, 233241
enterprise data warehouses development of algorithms,
(EDWs), 37, 58, 198199, 238241
217 of software development,
entropy, 105 237238
Estimation of Dependencies Based on
Empirical Data (Vapnik), 107 G
Euclidean distance, 133 gain, 69
exploratory data analysis, Galton, Francis, 75
performing, 5960 gap, 135
external evaluation criterion, Gauss, Carl, 75
134135 Gaussian distribution, 119
extremity-based model, 15 Gaussian kernel, 112
generalized additive models
F (GAM), 83
Facebook, 7 generalized linear models
failing fast paradigm, 18 (GLMs)
feature films, as example of about, 8486
segmentation, 132 applications for big data, 8990
feedback loop, 223 probit, 8689
file crawling, 176 Gentleman-Givens
file system computing, 3739 computational method, 83
financial services company case global minimum, assurance of, 98
study, 197204 Global Positioning System (GPS),
Fisher, Sir Ronald, 98 56
forecasting, new product, Goodnight, Jim, 50
152153 Google, 6, 10
Forrester Wave and Gartner Gosset, William, 80n1
Magic Quadrant, 50 gradient boosting, 106
FORTRAN language, 4445, gradient descent method, 96
46, 49 graphical processing unit (GPU),
fraud, 16, 152 3031
258 INDEX
H I
Hadoop, 7, 3739, 38f,
8 40f,
0 45 IBM, 7, 239
Hadoop Streaming Utility, 49 identity output activation
hardware function, 94
central processing unit (CPU), incremental response modeling
2930 about, 141142
cost of, 36 building model, 142143
graphical processing unit measuring, 143148
(GPU), 3031 index, 177
memory, 3133 inference, in Bayesian networks,
network, 3334 122123
storage (disk), 2729 information retrieval, 176177,
health care provider case study, 183184, 211, 245246
205214 in-memory databases (IMDBs),
HEDIS, 207208 37, 40f
0
Hertzano, Ephraim, 129130 input scaling, 98
Hessian, 9697, 96n11 Institute of Electrical and
Hession Free Learning, 101 Electronic Engineers
Hickey, Rick (programmer), 46 (IEEE), 6
hierarchical clustering, 132, 138 internal evaluation criterion, 134
high-performance marketing Internet, birth of, 5
solution, 202203 Internet of Things, 6, 236237
high-tech product manufacturer interpretability, as drawback of
case study, 229232 SVM, 113
Hinton, Geoffrey E., 100 interval prediction, 6667
Learning Representation by iPhone (Apple), 7
Back-Propagating Errors, 92 IPv4 protocol, 7
Hornik, Kurt IRE, 208
Multilayer Feedforward iterative model building, 6061
networks are Universal
Approximators, 92 J
HOS, 208 Java and Java Virtual Machine
Huber M estimation, 83 (JVM) languages, 5,
hyperplane, 107n13 4446
Hypertext Transfer Protocol Jennings, Ken, 182
(HTTP), 5 Jeopardy (TV show), 7, 180191,
hypothesis, 234 239241
INDEX 259
K MapReduce paradigm, 38
Kaggle, 161n3 marketing campaign process,
kernels, 112 traditional, 198202
K-means clustering, 132, 137138 Markov blanket bayesian
Kolmogorov-Smirnov (KS) network (MB), 120
statistic, 70 Martens, James, 101
massively parallel processing
L
(MPP) databases, 37, 40f,0
Lagrange multipliers, 111
202203, 211213, 217
language detection, 222
Matplotlib library, 49
latent factor models, 164
matrix, rank of, 164n1
Learning Representation by
maximum likelihood
Back-Propagating Errors
estimation, 117
(Rumelhart, Hinton and
Mayer, Marissa, 24
Williams), 92
McAuliffe, Sharon, 20
Legrendre, Adrien-Marie, 75
McCulloch, Warren, 90
lift, 69
mean, 51
likelihood, 114
Medicare Advantage, 206
Limited Memory BFGS (LBFGS)
memory, 3133
method, 9798
methodology, for building
linear regression, 15
models, 5861
LinkedIn, 6
Michael J. Fox Foundation, 161
Linux, 237, 238
microdata, 235n1
Lisp programming language, 46
Microsoft, 237
lobular carcinoma, 2
Minsky, Marvin
locally estimated scatterplot
Perceptrons, 9192
smoothing (LOESS), 83
misclassification rate, 105
logistic regression model, 85
MIT, 236
low-rank matrix factorization,
mobile application
164, 166
recommendations case study,
M 225228
machine learning, future of, model lift, 204
233241 models
See also specific topics creating, 223
Maimon, Oded methodology for building,
Data Mining with Decision Trees, 5861
103 scoring, 201, 213
260 INDEX