Big Data and Advanced Analytics
Technologies and Use Cases
Agenda
Derhonsaeairteatywortnmetoe atin
‘Stermen cn be enone spensysuaeaene Sopra ob ores
renee iey acto and aed oar
{nina te ns scereat ecg sent one
+ Acokarttegen tat ppotby ta
1 eplatone inn dm weetnaryentcmnetante cine oappat
Sh maacnestnape
+ hss eae ne Donets oy go bbws
The Evolution of Digital Data
es xy fis. cay pat
i dneetng Dt tums dae
Data Growth: Choose an Analyst!
Big data growth
Data Growth: Multi-Structured Data
‘Demande hethas unin, ened
eromipongscheras|
‘Seal company aa
‘atv ees este an acres dla
‘sun notiregmiecinon eta warehouse
ceeeecantsata
ata reaeie moses oe anes
Data Growth: Big Data
Big data technologies apply to all types of digital data, not just multi-structured data
"Big" is a relative term and is different for each organization and application
What you do with big data and how you use it for business benefit should be the main consideration; analytics play a key role here
The Value of Data: IBM 2012 Study
The Changing World of BI Analytics
Advanced Analytics
isa a peechve sme
+ Hectan feleeg ae sang
+ Spenioaitins wih enbeasee 81
Data Management
+ dnp eon dabase tens ht
te rovve prceberaance te
Breit enayte clone
+ memento igh peromance
once es 0 5 Hc tr
ange yest
Advanced Analytics Example: SC Digest 2012
ye apy hl, you se
‘espe snes to bebe ander yar tea
Gleand patos to une edit os ua
‘Blenedianh snate aera nena Sane eh
‘ne
Pode sts ung dt ope enon
Ptr, Msie canon assole wi secs he
Sony cam yu ee preter nye ofa Ms
‘tena cali ci
0
Prose antes —usng doseage he eptma
Sel eanony ansaid theta It
‘besos or. you uo pesreve ones oat
nectar ite eee oan teers
The Role of the Data Scientist: CITO Interviews.
oa scents un 0g iano tig va, eon pres at cen was,
srdinigtnal tors tsbeos cers,
‘Seow min stare en: above ta cet nee tobe 20
(Seay ant sng corms see
“ane ular, Pasa Sct cn
“dala sens sraove no can can. sub, explore, motel ans nwt
ita tevangnecing Sabi and racine teeming Data Seto ey ae
‘Seat wong in cata bl opecte da fol at wan ras
‘ron Cet Soni ay
ids Sedasohae nessa’ sls eae vue ane ae ae
‘Sesandresesina aaa ge uence
Sh faue, Pel Enger Ararnca
Data Science Skills Requirements
Business domain subject matter expert with strong analytical skills
Creativity and good communications
Knowledgeable in statistics, machine learning and data visualization
Able to develop data analysis solutions using modeling/analysis methods and languages such as MapReduce, R, SAS, etc.
Adept at data engineering, including discovering and mashing/blending large amounts of data
Is this one person or a team of specialists?
Data Science: Further Reading
What Then is Big Data?
Represents anayte ana data
crass nani taooin at
“Set cfoverapping teenager that
‘sable cosorarso deploy aay
syste antiga ste spect
Bisness needs ans workoace
Optimization may involve egcoving
Big Data and Data Life Cycle Management
Data Management and Analytic Performance
+ Capacity planning
+ Managing data warehouse growth
+ Analytic performance management and optimization
+ Service level agreements
Data Governance
+ Security: user access, enti, rmestog.
+ Quality: governed/ungoverned data
+ Backup and recovery
+ tong and retention hse a analy, compliance
The Impact of Big Data on the Data Life Cycle
‘acento ie atin EON arose aan colton
“ology te ng oma ems tne,
‘eed cnn nn consar nena
The Extended (or Logical) Data Warehouse
Beyond the EDW: Optimized Platforms
Optimized Analytic Platforms: Variables
Analytics Required to Meet Business Needs
+ Complexity - reporting, OLAP or advanced analytics
+ Agility - latency of data, analytics, decisions, recommendations and actions
+ Workload mix - complexity of overall analytic workload, concurrent data modification
Data Required to Meet Business Needs
+ Volume - amount of data to be managed
+ Velocity - rate of data generation or change
+ Variety - types of data to be managed
+ Complexity - number of data sources and relationships
Optimized Platforms: Analytic RDBMSs
Extending traditional RDBMSs with features designed specifically for analytic processing and new analytic techniques
Hardware exploitation
Parallel computing
New data types
New storage structures
Data compression
Support for hybrid storage
Intelligent workload management
In-memory data
In-memory analytics
In-database aggregation & analytics
Analytic RDBMSs: Hardware Exploitation
Faster processors
Multi-core processors
Intelligent hardware
64-bit memory spaces
Large capacity disk drives
Fast hard disk and solid-state drives
Hybrid storage configurations
Scale-up/out parallel processing configurations
Loran hardware haces eters)
Reduced power and cooling requirements
Packaged hardware/software appliances
Analytic RDBMSs: New Storage Structures
Vendors are adding new physical storage structures to improve performance, reduce storage requirements and support new types of analysis
amps compressed coun, 2,
Implementation varies by vendor
It is important to recognize that physical storage structures should be independent of the data model and the data manipulation language (DML)
+ ten rorwe ornenseatoral ee
Analytic RDBMSs: Data Storage Options
Large-capacity hard drives (HDD)
Solid-state drives (SSDs)
Hi andconsut promos
trey nae ry fen
Dynamic RAM (DRAM)
Memory versus Storage
Memory
+ Data is directly addressable by the CPU via a memory bus
+ Eliminates I/O overhead and provides fast access to data
+ Types of memory
> Freesor coco) —er et vate tn
> pane RAM fl naozeene) vai ta
Storage
+ Data is addressable via a device interconnect or network protocol
+ Several types of storage for persisting data
> Commodity HDD: high capacity and low cost (e.g., SATA)
> Enterprise HDD: more reliable, more costly (e.g., SAS)
1 MAND ts ony dees every ela ne cst ea, PCED,
(tern 0 fash meng ay hyd)
What is In-Memory Computing?
A worn whet a he dat ng
froccosat beated ne camper
fremont direst sceaate
(aime Crus nemoy is
Proves non speed peromance
teroLt an Slwotrade by
tiring VO trae ces
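A minimal sketch of the memory-versus-storage distinction, assuming Python's bundled SQLite as a stand-in for a real in-memory DBMS. The table name, sample rows and helper functions are hypothetical. It shows only that the same query runs unchanged whether the data sits in RAM or in a file; it makes no performance claim.

```python
import os
import sqlite3
import tempfile


def load_sales(conn, rows):
    """Create a small sales table and load the given (region, amount) rows."""
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def total_by_region(conn):
    """Run the same aggregate query regardless of where the data lives."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


# Hypothetical sample data, for illustration only.
rows = [("east", 10.0), ("west", 5.0), ("east", 2.5)]

# In-memory database: the table lives entirely in the process's RAM.
mem_conn = sqlite3.connect(":memory:")
load_sales(mem_conn, rows)
mem_result = total_by_region(mem_conn)

# File-backed database: the same table is persisted to storage.
tmp_dir = tempfile.mkdtemp()
disk_conn = sqlite3.connect(os.path.join(tmp_dir, "sales.db"))
load_sales(disk_conn, rows)
disk_result = total_by_region(disk_conn)

mem_conn.close()
disk_conn.close()
```

Both connections return identical results; only the location of the data differs, which is where the I/O saving of in-memory processing would come from.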
The Changing World of BI Analytics
Advanced Analytics em e001 a5
clara peeves
fe eee se Ysieng new
+ Glens vn esses
Big Data Management em 209 a2
* si ne case items ht
ter morves preebetarance one
‘Brose oeyte rales
+ ncstonst ste uch Heo
Aang yest
+ Shnamnoceenaantes tr aniens
Why In-Memory Computing for BI Analytics?
Benefits
+ Technology answer: improved speed and performance, e.g., quickly run complex analytics
+ Business answer: what if you could do ...? e.g., real-time fraud detection
Considerations
+ Types of in-memory data and in-memory analytics and their benefits
+ Relationship to in-database processing, e.g., in-database aggregation and in-database analytics
In-Memory Data
In-Memory Database Systems: Vendor Examples
ae
Shah tpmase
Ir nde set Pere wie
ener Data et
Chats aon Cant, Mw,
In-Memory Data: Important to Note
ITPro impact: In-Memory Analytics Databases
(ewer email mat
srinmamor atabse
In-Database Technologies
In-Database Aggregation
+ Some RDBMSs support pre-aggregation to enhance performance
+ Should be transparent to user - optimizer decides when to use an aggregate
+ Various names - materialized views, materialized query tables, etc.
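A minimal sketch of pre-aggregation, assuming Python's bundled SQLite and hypothetical table names (`sales`, `sales_by_store`). SQLite has no materialized views, so the summary table is built by hand. In an RDBMS with materialized views or materialized query tables, the optimizer would maintain the aggregate and choose it transparently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("s1", "2013-01-01", 100),  # hypothetical sample rows
        ("s1", "2013-01-02", 150),
        ("s2", "2013-01-01", 70),
    ],
)

# Pre-aggregate once. This stands in for a materialized view, which the
# RDBMS itself would create and refresh.
conn.execute(
    "CREATE TABLE sales_by_store AS "
    "SELECT store, SUM(amount) AS total, COUNT(*) AS n "
    "FROM sales GROUP BY store"
)

# Query against the detail rows (scans every sale).
detail = conn.execute(
    "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
).fetchall()

# Same answer from the much smaller pre-aggregated table.
summary = conn.execute(
    "SELECT store, total FROM sales_by_store ORDER BY store"
).fetchall()

conn.close()
```

Here the query is routed to the summary table manually; the point of the transparency bullet is that a real optimizer makes that choice without the user rewriting the query.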
In-Database Analytics
+ Bring the processing to the data rather than the data to the processing
+ Consists primarily of predefined analytic functions - created by RDBMS vendor, third-party vendor, open source community, or developers
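A minimal sketch of bringing the processing to the data, assuming Python's bundled SQLite. The aggregate name `pop_variance`, the `readings` table and the sample values are hypothetical. A developer-written analytic function is registered with the database and invoked from SQL, then compared with the alternative of pulling every row to the client.

```python
import sqlite3
import statistics


class PopulationVariance:
    """Developer-written aggregate (hypothetical) evaluated by the SQL engine."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def step(self, value):
        # Welford's online update: one pass, no rows retained.
        if value is None:
            return
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    def finalize(self):
        return self.m2 / self.n if self.n else None


conn = sqlite3.connect(":memory:")
conn.create_aggregate("pop_variance", 1, PopulationVariance)
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.0), ("a", 2.0), ("a", 3.0), ("a", 4.0), ("b", 10.0), ("b", 10.0)],
)

# In-database: only one result row per sensor leaves the database.
in_db = conn.execute(
    "SELECT sensor, pop_variance(value) FROM readings "
    "GROUP BY sensor ORDER BY sensor"
).fetchall()

# Client-side alternative: every detail row is moved to the application.
by_sensor = {}
for sensor, value in conn.execute("SELECT sensor, value FROM readings"):
    by_sensor.setdefault(sensor, []).append(value)
client_side = [(s, statistics.pvariance(v)) for s, v in sorted(by_sensor.items())]

conn.close()
```

SQLite is embedded, so the function still runs in the same process here. In a client/server or parallel RDBMS the same pattern avoids shipping detail rows over the network.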
In-Database Analytic Functions
Analytic functions stored in an RDBMS offer several benefits
Usa. la lett en nen ound a func dows
‘odo use they drat heetoknow hw to cetiep ers
Imperato undestenbe vet psa ecessng end how teens
Ian 09 atarote ROBND.n ROBIE ptsed money
Several approaches to using in-database functions
"+ ROBMS tut ston artnet, ng, easel Ancor
Functens prvi Spay vendo. eg. Fuzz.
+ Open sere unten, 29, Apache an,
eserves ncn sg ne
Analytic RDBMSs: Vendor Examples
Str opase es
er Satons
tes operon
‘hatin Recount Ace
This ate Damar peice a tga rie pence
Nery aae Rosato
Data Warehouse DBMSs: Gartner 2013 Ma.
Optimized Platforms: Non-Relational Systems - 1
Several web companies developed their own non-relational (NoSQL or NewSQL) systems to support extreme data volumes
+= Googe earl Goa be sytem, Mio Reaic,
Setanie Sudven
+ Mapacanane gocermactinse nest
+ Serle ese elpmen’ une [
Denney oie pen sauce amma |
Non-relational systems are not new, but modern versions are often open source
+ Depljedon ion cstwntetaxharvarein 2
+ Seva ype tte aaa srs a
+ Keynes reais Hato
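A minimal single-process sketch of the MapReduce programming model, which the deck lists among data science languages and which is commonly used on non-relational platforms such as Hadoop. The function names and sample records are hypothetical. A real framework would distribute the map and reduce phases across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (word, 1) pairs for one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


# Hypothetical input records, for illustration only.
records = ["big data big analytics", "data warehouse"]

pairs = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Because each record is mapped independently and each key is reduced independently, both phases can run in parallel, which suits batch processing over very large files.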
Optimized Platforms: Non-Relational Systems - 2
Many types of products, APIs and languages
Can handle varieties of data and processing that are difficult to support using a traditional RDBMS
Optimized Platforms: Non-Relational Systems - 2
Many types of products, APIs and languages
“A tanec tigen 0 ge
PORN etna
‘Setapote hoster”
oe bt ten 8
possnppetmane
otece
eles rom ne a,
ae eters came a ace cute,
"tenons apWPirzslamD components
The Hadoop Ecosystem
Data Management: Hadoop Option
Data Management: Relational vs Non-Relational
Given the number of options and fast-changing marketplace, comparisons are difficult
Focus is on analytic RDBMS versus Hadoop HDFS - DBMS versus file system, which is an "apples to oranges" comparison
At a high level, an analytic DBMS is suited to complex interactive workloads and Hadoop HDFS to batch processing of multi-structured data
From a Hadoop perspective, HBase is becoming more important, but lack of SQL support is an inhibitor
He sper fr HES ceeepmentb ts usebatch ap
Resse
CCacern setong pas win upon ive SOL eae bt
Sins soa bt HOPS ard MBase
Workload suitability and performance are important, but development and administration effort and tool support are also key considerations
Data Management: Language Considerations
Optimized Platforms: Cloud Option
“ant ocre ge rab pee hen
nana vce COs cheesy abtecsvent
(Soge Chott Mysore gay sce
Kear
No Reon: 8 Oma
“eonb pane, See Euscoe
enon remesiTntsiucure wasiao oleae tens
‘eras moan Ga espera senounoociess ness
Rapidy ing anny ite cele recuterens
Ine toate tester operate
Feoesone ppestns es operate
Spent sees scoss many eel Amazon dl cetera regons
{oemancerenbiy ae vay
Dif serie nvm ae endony ike fn tetra ie
‘ciromant sn corte ataingin the ce fa sue are
‘Sons: Ne onge ts aproch becuse ecoizetnat ne hae of
‘bonnes reqed erent way fag tin
tea ti sl ccteitecntaroun ed GO Hepioe
Summary: Big Data Benefits
Traian eaaaee
Choosing the Right Solution
Use Cases and Application Examples
wc oneston Exe
seme rermenceiap means
Software Selection: Some Key Options
Use Cases and Technologies
Example Telco Provider: Real-Time Embedded BI
Example: Sears Holdings (MetaScale)
$49 billion retail organization with over 4000 stores (Sears and Kmart)
Numerous legacy systems with applications written in COBOL (some over 100 ion lines)
unig ou of cspacty butt he ourent cot of $3K 57 par MIP pr
Yt anatet scion neon fun
Requirements
* Reoiee tna rmoar stn reauee ard ETL
+ Reaice anal pacesaing tes end route ie hy ras
+ Caps east at ette ansacton (POS dala, wed 1,
Spiro vents eto says
Solution: Hadoop Data Hub and Analytics Accelerator
Esnancd pricing oppostion
lsu esau 10 fe sen ais be EDM eng odes
‘fren king weet foseupand un
+ Ric pete sn nee
+ rina mds an now Be rn wok ry request)
Impedance
teeing open waa sable ey ane abel othe cite
Coase aayana
+ Repleed 600 COOOL sppteston wh 0 ies et Pig and Jr LOS
Frente Ses
+ Agteson cn tow bean ile tenes pt yt leper
Fev casoer reais Epa canpeler sichas azo
Josey based co 100% eFScones STEIN waste
Solution: Hadoop Data Hub and Analytics Accelerator (cont.)
Reduce time to run batch applications
*Esang mentee woo 2 nut was bearing nse ot
‘tbh prongs sant 00min ros of ca
Baht we rel Fg an ni Fo oe Ke
Reduce time to run analytic and interactive BI applications
"Ens bach ancineracive 8 ppeabons wee aig lng orn
Stdsolarerde ey osibet oe ae
“Dats fom eer 20 saccesnow sere pane on Hcoop
‘ater use fortes
Pissed ETL ander resng evel Eee
Conclusions
Pleased with Hadoop's ability to run enterprise workloads - enables detailed data to be stored and analyses to be run that were not previously possible
Hadoop is only one component of the EDW ecosystem and strategy
Hadoop requires significant education and implementation effort and is lacking tools for enterprise integration
New Sears subsidiary (MetaScale) formed to help enterprises integrate
‘nating estore wh Hacoopbeeaura "75% of CEOs and Cis dot
‘ten know wht op
Example: International Bank Trading Desk
Solution: Analytic RDBMS as an Analytics Accelerator
‘Ths intrstional bank fre wie ange of
evean ois over a0 lion osorers
(One of the bank's ding desis uses me
pplanes for an analyte soliton tat andes
‘head ho analyst ions fom of
Monten leading was reduce rom days to2
Key customer queis were rediced tom 3-4
Gayeto avout Finer
Appliance was treated as a black box by the IT group for compliance reasons
Example a
Solution: Analytic RDBMS for New LOB Application
Meszaltat aden lin dat