By Frank Ohlhorst
Copyright 2013 by John Wiley & Sons, Inc.
CHAPTER
21
THE CASE FOR BIG DATA
This includes the
drivers of the project, how others are using Big Data, what
business processes Big Data will align with, and the overall goal
of implementing the project.
It is often difficult to quantify the benefits of
Big Data as static and tangible. Big Data analytics is all about the
Teradata, IBM, HP, Oracle, and many other companies have been
offering terabyte-scale data warehouses for more than a decade, but
those offerings were tuned for processes in which data warehousing was
the primary goal. Today, data tend to be collected and stored in a wider
variety of formats and can include structured, semistructured, and
unstructured elements, which each tend to have different storage
and management requirements. For Big Data analytics, data must be
able to be processed in parallel across multiple servers. This is a necessity,
given the amounts of information being analyzed.
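
To make the idea of parallel processing concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any vendor's product: the log lines, the function names, and the use of a local thread pool as a stand-in for separate servers are all hypothetical. Each partition of the data is summarized independently, and the partial results are then merged into one answer.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_status_codes(partition):
    """Map step: count the last field (a status code) of each line in one partition."""
    return Counter(line.split()[-1] for line in partition)


def parallel_count(partitions, workers=4):
    """Run the map step on every partition concurrently, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(count_status_codes, partitions)
        total = Counter()
        for partial in partial_counts:
            total.update(partial)  # reduce step: add partial counts together
    return total


# Hypothetical log lines, split into partitions as if each were stored on a separate server.
partitions = [
    ["GET /index 200", "GET /missing 404"],
    ["POST /login 200", "GET /index 200"],
    ["GET /admin 403"],
]

totals = parallel_count(partitions)
print(dict(totals))
```

Because no partition depends on another, this pattern lets capacity grow by adding servers rather than by enlarging a single machine. On a real cluster, a distributed framework would take the place of the thread pool used here.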
In addition to having exhaustively maintained transactional
data from databases and carefully culled data residing in data
warehouses, organizations are reaping untold amounts of log data from
servers and forms of machine-generated data, customer comments
from internal and external social networks, and other sources of loose,
unstructured data.
Such data sets are growing at an exponential rate, thanks to
Moore's Law. Moore's Law states that the number of transistors that
BEYOND HADOOP