By Frank Ohlhorst
Copyright 2013 by John Wiley & Sons, Inc.
CHAPTER
Security, Compliance, Auditing, and Protection
improve security. However, these security techniques all carry a
processing burden that can severely affect performance.
The fourth caveat is liability. Accessible data carry with them
liability, such as the sensitivity of the data, the legal requirements
connected to the data, privacy issues, and intellectual property concerns.
Adequate security in the Big Data realm becomes a strategic balancing
act among these caveats along with any additional issues the
caveats create. Nonetheless, effective security is an obtainable, if not
perfect, goal. With planning, logic, and observation, security becomes
manageable and omnipresent, effectively protecting data while still
offering access to authorized users and systems.
CLASSIFYING DATA
Big Data often presents the worst-case scenario for most backup
appliances, in which the workload mix consists of billions of small
files and a small number of large files. Finding a backup solution that
can ingest this mixed workload of data at full speed and that can scale
to massive capacities may be the biggest challenge in the Big Data
backup market.
BIG DATA AND COMPLIANCE
Compliance issues are becoming a big concern in the data center, and
these issues have a major effect on how Big Data is protected, stored,
accessed, and archived. Whether Big Data is going to reside in the data
warehouse or in some other more scalable data store remains
unresolved for most of the industry; it is an evolving paradigm. However,
one thing is certain: Big Data is not easily handled by the relational
databases that the typical database administrator is used to working
with in the traditional enterprise database server environment. This
means it is harder to understand how compliance affects the data.
Big Data is transforming the storage and access paradigms to an
emerging new world of horizontally scaling, unstructured databases,
which are better at solving some old business problems through
analytics. More important, this new world of file types and data is
prompting analysis professionals to think of new problems to solve,
some of which have never been attempted before. With that in mind,
it becomes easy to see that a rebalancing of the database landscape is
about to commence, and data architects will finally embrace the fact
that relational databases are no longer the only tool in the tool kit.
This has everything to do with compliance. New data types and
methodologies are still expected to meet the legislative requirements
placed on businesses by compliance laws. There will be no excuses
accepted and no passes given if a new data methodology breaks the law.
Preventing compliance from becoming the next Big Data nightmare
is going to be the job of security professionals. They will have to
ask themselves some important questions and take into account the
growing mass of data, which are becoming increasingly unstructured
and are accessed from a distributed cloud of users and applications
looking to slice and dice them in a million and one ways. How will
1. Server and
network administrators, cloud administrators, and other employees
often have access to more information than their jobs require
because the systems simply lack the appropriate access controls.
Just because a user has operating system level access to a specific
server does not mean that he or she needs, or should have, access to
the Big Data stored on that server.
2. Most consumers today would not
conduct an online transaction without seeing the familiar padlock
symbol or at least a certification notice designating that particular
transaction as encrypted and secure. So why wouldn't you
require the same data to be protected at rest in a Big Data store?
All Big Data, especially sensitive information, should remain
encrypted, whether it is stored on a disk, on a server, or in the
cloud and regardless of whether the cloud is inside or outside
the walls of your organization.
3.
Cryptographic keys are the gateway to the
encrypted data. If the keys are left unprotected, the data are
easily compromised. Organizations, often those that have
cobbled together their own encryption and key management
solution, will sometimes leave the key exposed within the
configuration file or on the very server that stores the encrypted
data. This leads to the frightening reality that any user with
access to the server, authorized or not, can access the key and
the data. In addition, that key may be used for any number of
other servers. Storing the cryptographic keys on a separate,
hardened server, either on the premises or in the cloud, is the
best practice for keeping data safe and an important step in
regulatory compliance. The bottom line is to treat key security
with as much rigor as the data set itself, if not more.
4.
You may encrypt your data to control
access, but what about the user who has access to the
configuration files that define the access controls to those data?
Encrypting more than just the data and hardening the security
of your overall environment, including applications, services,
and configurations, gives you peace of mind that your sensitive
information is protected from malicious users and rogue
employees.
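Two of the points above lend themselves to a short illustration: that operating system level access should not imply data access, and that raw keys should not sit in configuration files. The following Python sketch is only one way to express those ideas. The User class, the can_read and exposed_key_settings functions, the "keyref:" prefix, and the sample host, data set, and setting names are all hypothetical and chosen purely for illustration; the name-based check for key settings is a deliberately simple heuristic, not a complete audit.

```python
from dataclasses import dataclass

# Hypothetical convention: a key-related setting holds only a reference that a
# separate, hardened key server resolves. Raw key material never appears here.
KEY_REF_PREFIX = "keyref:"


@dataclass(frozen=True)
class User:
    name: str
    os_access: frozenset    # servers the user can log in to
    data_grants: frozenset  # data sets the user is explicitly allowed to read


def can_read(user: User, dataset: str) -> bool:
    """Allow reads only on an explicit data grant.

    OS-level access to the server that hosts the data is deliberately ignored.
    """
    return dataset in user.data_grants


def exposed_key_settings(config: dict) -> list:
    """Return names of key-related settings that hold a raw value, not a reference."""
    return sorted(
        name
        for name, value in config.items()
        if "key" in name.lower() and not str(value).startswith(KEY_REF_PREFIX)
    )


admin = User("sysadmin", frozenset({"node-17"}), frozenset())
analyst = User("analyst", frozenset(), frozenset({"clickstream"}))

print(can_read(admin, "clickstream"))    # False: server login is not a data grant
print(can_read(analyst, "clickstream"))  # True: explicit grant

config = {
    "encryption_key": "3q2+7wABAgMEBQYH",  # raw key left in the config file
    "backup_key": "keyref:backup-2013",    # reference resolved elsewhere
    "port": 8020,
}
print(exposed_key_settings(config))  # ['encryption_key']
```

A check like exposed_key_settings could run as part of a configuration review, so that a key left beside the data it protects is caught before deployment rather than after a breach.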
With Big Data consolidating all sorts of private, public, corporate,
and government data into a large data store, there are bound to be pieces
of IP in the mix: simple elements, such as photographs, to more complex
elements, such as patent applications or engineering diagrams. That
information has to be properly protected, which may prove to be
difficult, since Big Data analytics is designed to find nuggets of information
and report on them.
Here is a little background: Between 1985 and 2010, the number of
patents granted worldwide rose from slightly less than 400,000 to more
than 900,000. That's an increase of more than 125 percent over one
generation (25 years). Patents are filed and backed with IP rights (IPRs).
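The percentage quoted above can be checked with a few lines of arithmetic. This sketch uses the rounded figures from the text as bounds; the variable names are illustrative only. Because the starting count was slightly below 400,000 and the ending count above 900,000, the true increase is somewhat more than the value computed here.

```python
granted_1985 = 400_000  # upper bound: "slightly less than 400,000"
granted_2010 = 900_000  # lower bound: "more than 900,000"

# Percentage increase relative to the starting value.
increase = (granted_2010 - granted_1985) / granted_1985 * 100

print(f"{increase:.0f}%")  # 125%
```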
Technology is obviously pushing this growth forward, so it only
makes sense that Big Data will be used to look at IP and IP rights to
determine opportunity. This should create a major concern for
companies looking to protect IP and should also be a catalyst to take action.
Fortunately, protecting IP in the realm of Big Data follows many of the
same rules that organizations have already come to embrace, so IP
protection should already be part of the culture in any enterprise.
The same concepts just have to be expanded into the realm of Big
Data. Some basic rules are as follows:
These guidelines can be applied to almost any information security
paradigm that is geared toward protecting IP. The same guidelines can
be used when designing IP protection for a Big Data platform.