Big Data Analytics: Turning Big Data into Big Money

By Frank Ohlhorst
Copyright 2013 by John Wiley & Sons, Inc.

CHAPTER

Security, Compliance, Auditing, and Protection

The sheer size of a Big Data repository brings with it a major security challenge, generating the age-old question presented to IT: How can the data be protected? However, that is a trick question: the answer has many caveats, which dictate how security must be imagined as well as deployed. Proper security entails more than just keeping the bad guys out; it also means backing up data and protecting data from corruption.

The first caveat is access. Data can be easily protected, but only if you eliminate access to the data. That's not a pragmatic solution, to say the least. The key is to control access, but even then, knowing the who, what, when, and where of data access is only a start.

The second caveat is availability: controlling where the data are stored and how the data are distributed. The more control you have, the better you are positioned to protect the data.

The third caveat is performance. Higher levels of encryption, complex security methodologies, and additional security layers can all

c07 22 October 2012; 17:58:55


improve security. However, these security techniques all carry a processing burden that can severely affect performance.

The fourth caveat is liability. Accessible data carry with them liability, such as the sensitivity of the data, the legal requirements connected to the data, privacy issues, and intellectual property concerns.

Adequate security in the Big Data realm becomes a strategic balancing act among these caveats along with any additional issues the caveats create. Nonetheless, effective security is an obtainable, if not perfect, goal. With planning, logic, and observation, security becomes manageable and omnipresent, effectively protecting data while still offering access to authorized users and systems.

PRAGMATIC STEPS TO SECURING BIG DATA

Securing the massive amounts of data that are inundating organizations can be addressed in several ways. A starting point is to basically get rid of data that are no longer needed. If you do not need certain information, it should be destroyed, because it represents a risk to the organization. That risk grows every day for as long as the information is kept. Of course, there are situations in which information cannot legally be destroyed; in that case, the information should be securely archived by an offline method.

The real challenge may be determining whether the data are needed, a difficult task in the world of Big Data, where value can be found in unexpected places. For example, getting rid of activity logs may be a smart move from a security standpoint. After all, those seeking to compromise networks may start by analyzing activity so they can come up with a way to monitor and intercept traffic to break into a network. In a sense, those logs present a serious risk to an organization, and to prevent the logs from being exposed, the best method may be to delete them after their usefulness ends.

However, those logs could be used to determine scale, use, and efficiency of large data systems, an analytical process that falls right under the umbrella of Big Data analytics. Here a catch-22 is created: Logs are a risk, but analyzing those logs properly can mitigate risks as well. Should you keep or dispose of the data in these cases?

There is no easy answer to that dilemma, and it becomes a case of choosing the lesser of two evils. If the data have intrinsic value for analytics, they must be kept, but that does not mean they need to be kept on a system that is connected to the Internet or other systems. The data can be archived, retrieved for processing, and then returned to the archive.

CLASSIFYING DATA

Protecting data becomes much easier if the data are classified; that is, the data should be divided into appropriate groupings for management purposes. A classification system does not have to be very sophisticated or complicated to enable the security process, and it can be limited to a few different groups or categories to keep things simple for processing and monitoring.

With data classification in mind, it is essential to realize that not all data are created equal. For example, internal e-mails between two colleagues should not be secured or treated the same way as financial reports, human resources (HR) information, or customer data.

Understanding the classifications and the value of the data sets is not a one-task job; the life-cycle management of data may need to be shared by several departments or teams in an enterprise. For example, you may want to divide the responsibilities among technical, security, and business organizations. Although it may sound complex, it really isn't all that hard to educate the various corporate stakeholders to understand the value of data and where their responsibilities lie.

Classification can become a powerful tool for determining the sensitivity of data. A simple approach may just include classifications such as financial, HR, sales, inventory, and communications, each of which is self-explanatory and offers insight into the sensitivity of the data.

Once organizations better understand their data, they can take important steps to segregate the information, which will make the deployment of security measures like encryption and monitoring more manageable. The more data are placed into silos at higher levels, the easier it becomes to protect and control them. Smaller sample sizes are easier to protect and can be monitored separately for specific necessary controls.
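
A simple classification scheme of this kind can be sketched in a few lines of Python. The category names come from the text above; the sensitivity levels assigned to them, the `Controls` fields, and the rule that unknown categories get the strictest handling are illustrative assumptions, not recommendations from this chapter.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical mapping from the example categories to a sensitivity level.
# The levels are assumptions chosen only to illustrate the idea.
CATEGORY_SENSITIVITY = {
    "financial": Sensitivity.HIGH,
    "hr": Sensitivity.HIGH,
    "sales": Sensitivity.MEDIUM,
    "inventory": Sensitivity.LOW,
    "communications": Sensitivity.LOW,
}


@dataclass(frozen=True)
class Controls:
    encrypt_at_rest: bool
    monitor_access: bool


def controls_for(category: str) -> Controls:
    """Return the controls for a category; unknown categories get the strictest."""
    level = CATEGORY_SENSITIVITY.get(category.lower(), Sensitivity.HIGH)
    return Controls(
        encrypt_at_rest=level >= Sensitivity.MEDIUM,
        monitor_access=level == Sensitivity.HIGH,
    )
```

Keeping the table this small reflects the point made above: a handful of categories is enough to decide which silo needs encryption and which needs closer monitoring.
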

PROTECTING BIG DATA ANALYTICS

It is sad to report that protecting data is often forgotten in the data center, an afterthought that falls behind current needs. The launch of Big Data initiatives is no exception. Big Data offers more of a challenge than most other data center technologies, making it the perfect storm for a data protection disaster.

The real cause of concern is the fact that Big Data contains all of the things you don't want to see when you are trying to protect data. Big Data can contain unique sample sets: for example, data from devices that monitor physical elements (e.g., traffic, movement, soil pH, rain, wind) on a frequent schedule, surveillance cameras, or any other type of data that are accumulated frequently and in real time. All of the data are unique to the moment, and if they are lost, they are impossible to recreate.

That uniqueness also means you cannot leverage time-saving backup preparation and security technologies, such as deduplication; this greatly increases the capacity requirements for backup subsystems, slows down security scanning, makes it harder to detect data corruption, and complicates archiving.
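
The effect of uniqueness on deduplication can be illustrated with a minimal sketch. It assumes a simple content-hash scheme over fixed chunks (real backup products use more elaborate chunking), and it uses random bytes as a stand-in for sensor readings; the function name `dedup_ratio` is hypothetical.

```python
import hashlib
import os


def dedup_ratio(chunks):
    """Fraction of chunks that must be stored after content-hash deduplication.

    1.0 means deduplication saved nothing; values near 0 mean large savings.
    """
    chunks = list(chunks)
    if not chunks:
        return 1.0
    unique_hashes = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(unique_hashes) / len(chunks)


# Repetitive data (for example, many copies of the same block) deduplicates well.
repetitive = [b"same block of bytes"] * 1000

# Data that are unique to the moment do not. Random bytes stand in for
# sensor samples here.
unique_readings = [os.urandom(32) for _ in range(1000)]
```

With these inputs, the repetitive set needs only one stored chunk out of a thousand, while every one of the unique readings must be kept, which is why capacity requirements grow so quickly for this kind of data.
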

There is also the issue of the large size and number of files often found in Big Data analytic environments. In order for a backup application and associated appliances or hardware to churn through a large number of files, bandwidth to the backup systems and/or the backup appliance must be large, and the receiving devices must be able to ingest data at the rate that the data can be delivered, which means that significant CPU processing power is necessary to churn through billions of files.
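
A rough back-of-the-envelope model shows how these bottlenecks interact. The sketch below assumes that transfer and per-file processing overlap, so the slower of the two dominates, and that the effective transfer rate is the lower of link bandwidth and appliance ingest rate. All figures in the example are hypothetical.

```python
def backup_hours(total_bytes, file_count, link_bytes_per_s,
                 ingest_bytes_per_s, files_per_s):
    """Estimate backup duration in hours under a simple pipelined model."""
    # Data can move no faster than the slower of the link and the receiver.
    effective_rate = min(link_bytes_per_s, ingest_bytes_per_s)
    transfer_s = total_bytes / effective_rate
    # Per-file overhead (open, catalog, close) is bounded by CPU throughput.
    per_file_s = file_count / files_per_s
    return max(transfer_s, per_file_s) / 3600


# Hypothetical workload: 100 TB in 2 billion files, a 10 Gb/s link
# (1.25 GB/s), appliance ingest of 1 GB/s, and 50,000 files per second.
estimate = backup_hours(100e12, 2e9, 1.25e9, 1e9, 50_000)
```

Under these assumed numbers the ingest rate, not the link, sets the pace. If per-file processing drops to 1,000 files per second, file handling becomes the bottleneck instead, regardless of bandwidth.
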

There is more to backup than just processing files. Big Data normally includes a database component, which cannot be overlooked. Analytic information is often processed into an Oracle, NoSQL, or Hadoop environment of some type, so real-time (or live) protection of that environment may be required. A database component shifts the backup ideology from a massive number of small files to be backed up to a small number of massive files to be backed up. That changes the dynamics of how backups need to be processed.

Big Data often presents the worst-case scenario for most backup appliances, in which the workload mix consists of billions of small files and a small number of large files. Finding a backup solution that can ingest this mixed workload of data at full speed and that can scale to massive capacities may be the biggest challenge in the Big Data backup market.

BIG DATA AND COMPLIANCE

Compliance issues are becoming a big concern in the data center, and these issues have a major effect on how Big Data is protected, stored, accessed, and archived. Whether Big Data is going to reside in the data warehouse or in some other more scalable data store remains unresolved for most of the industry; it is an evolving paradigm. However, one thing is certain: Big Data is not easily handled by the relational databases that the typical database administrator is used to working with in the traditional enterprise database server environment. This means it is harder to understand how compliance affects the data.

Big Data is transforming the storage and access paradigms to an emerging new world of horizontally scaling, unstructured databases, which are better at solving some old business problems through analytics. More important, this new world of file types and data is prompting analysis professionals to think of new problems to solve, some of which have never been attempted before. With that in mind, it becomes easy to see that a rebalancing of the database landscape is about to commence, and data architects will finally embrace the fact that relational databases are no longer the only tool in the tool kit.

This has everything to do with compliance. New data types and methodologies are still expected to meet the legislative requirements placed on businesses by compliance laws. There will be no excuses accepted and no passes given if a new data methodology breaks the law.

Preventing compliance from becoming the next Big Data nightmare is going to be the job of security professionals. They will have to ask themselves some important questions and take into account the growing mass of data, which are becoming increasingly unstructured and are accessed from a distributed cloud of users and applications looking to slice and dice them in a million and one ways. How will security professionals be sure they are keeping tabs on the regulated information in all that mix?

Many organizations still have to grasp the importance of such areas as payment card industry and personal health information compliance and are failing to take the necessary steps because the Big Data elements are moving through the enterprise with other basic data. The trend seems to be that as businesses jump into Big Data, they forget to worry about very specific pieces of information that may be mixed into their large data stores, exposing them to compliance issues.

Health care probably provides the best example for those charged with compliance as they examine how Big Data creation, storage, and flow work in their organizations. The move to electronic health record systems, driven by the Health Insurance Portability and Accountability Act (HIPAA) and other legislation, is causing a dramatic increase in the accumulation, access, and inter-enterprise exchange of personal identifying information. That has already created a Big Data problem for the largest health care providers and payers, and it must be solved to maintain compliance.

The concepts of Big Data are as applicable to health care as they are to other businesses. The types of data are as varied and vast as the devices collecting the data, and while the concept of collecting and analyzing the unstructured data is not new, recently developed technologies make it quicker and easier than ever to store, analyze, and manipulate these massive data sets.

Health care deals with these massive data sets using Big Data stores, which can span tens of thousands of computers to enable enterprises, researchers, and governments to develop innovative products, make important discoveries, and generate new revenue streams. The rapid evolution of Big Data has forced vendors and architects to focus primarily on the storage, performance, and availability elements, while security, which is often thought to diminish performance, has largely been an afterthought.

In the medical industry, the primary problem is that unsecured Big Data stores are filled with content that is collected and analyzed in real time and is often extraordinarily sensitive: intellectual property, personal identifying information, and other confidential information. The disclosure of this type of data, by either attack or human error, can be devastating to a company and its reputation.

However, because this unstructured Big Data doesn't fit into traditional, structured, SQL-based relational databases, NoSQL, a new type of data management approach, has evolved. These nonrelational data stores can store, manage, and manipulate terabytes, petabytes, and even exabytes of data in real time.

No longer scattered in multiple federated databases throughout the enterprise, Big Data consolidates information in a single massive database stored in distributed clusters and can be easily deployed in the cloud to save costs and ease management. Companies may also move Big Data to the cloud for disaster recovery, replication, load balancing, storage, and other purposes.

Unfortunately, most of the data stores in use today (including Hadoop, Cassandra, and MongoDB) do not incorporate sufficient data security tools to provide enterprises with the peace of mind that confidential data will remain safe and secure at all times. The need for security and privacy of enterprise data is not a new concept. However, the development of Big Data changes the situation in many ways. To date, those charged with network security have spent a great deal of time and money on perimeter-based security mechanisms such as firewalls, but perimeter enforcement cannot prevent unauthorized access to data once a criminal or a hacker has entered the network.

Add to this the fact that most Big Data platforms provide little to no data-level security along with the alarming truth that Big Data centralizes most critical, sensitive, and proprietary data in a single logical data store, and it's clear that Big Data requires big security.

The lessons learned by the health care industry show that there is a way to keep Big Data secure and in compliance. A combination of technologies has been assembled to meet four important goals:

1. Server and network administrators, cloud administrators, and other employees often have access to more information than their jobs require because the systems simply lack the appropriate access controls. Just because a user has operating system level access to a specific server does not mean that he or she needs, or should have, access to the Big Data stored on that server.

2. Most consumers today would not conduct an online transaction without seeing the familiar padlock symbol or at least a certification notice designating that particular transaction as encrypted and secure. So why wouldn't you require the same data to be protected at rest in a Big Data store? All Big Data, especially sensitive information, should remain encrypted, whether it is stored on a disk, on a server, or in the cloud and regardless of whether the cloud is inside or outside the walls of your organization.

3. Cryptographic keys are the gateway to the encrypted data. If the keys are left unprotected, the data are easily compromised. Organizations (often those that have cobbled together their own encryption and key management solution) will sometimes leave the key exposed within the configuration file or on the very server that stores the encrypted data. This leads to the frightening reality that any user with access to the server, authorized or not, can access the key and the data. In addition, that key may be used for any number of other servers. Storing the cryptographic keys on a separate, hardened server, either on the premises or in the cloud, is the best practice for keeping data safe and an important step in regulatory compliance. The bottom line is to treat key security with as much rigor as the data set itself, if not more.

4. You may encrypt your data to control access, but what about the user who has access to the configuration files that define the access controls to those data? Encrypting more than just the data and hardening the security of your overall environment (including applications, services, and configurations) gives you peace of mind that your sensitive information is protected from malicious users and rogue employees.
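
The first point in the list above, that operating system access should not imply data access, can be sketched as an explicit allowlist check. The process names, data set names, and the `may_read` function are hypothetical and exist only for illustration; a real deployment would enforce this in the data store or an access-control layer rather than in application code.

```python
# Hypothetical policy: the only (process, data set) pairs allowed to read data.
ALLOWED_READERS = {
    ("analytics-job", "claims"),
    ("billing-service", "payments"),
}


def may_read(process_name: str, dataset: str, has_os_access: bool) -> bool:
    """Grant read access only when the process is explicitly allowed.

    OS-level access to the server is necessary here but never sufficient.
    """
    return has_os_access and (process_name, dataset) in ALLOWED_READERS
```

An administrator's shell on the server would fail this check even with full operating system access, because the shell is not a listed reader of any data set.
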

There is still time to create and deploy appropriate security rules and compliance objectives. The health care industry has helped to lay some of the groundwork. However, the slow development of laws and regulations works in favor of those trying to get ahead on Big Data. Currently, many of the laws and regulations have not addressed the unique challenges of data warehousing. Many of the regulations do not address the rules for protecting data from different customers at different levels.

For example, if a database has credit card data and health care data, do the PCI Security Standards Council requirements and HIPAA apply to the entire data store or only to the parts of the data store that have their types of data? The answer is highly dependent on your interpretation of the requirements and the way you have implemented the technology.

Similarly, social media applications that are collecting tons of unregulated yet potentially sensitive data may not yet be a compliance concern. But they are still a security problem that, if not properly addressed now, may be regulated in the future. Social networks are accumulating massive amounts of unstructured data, a primary fuel for Big Data, but they are not yet regulated, so this is not a compliance concern but remains a security concern.

Security professionals concerned about how things like Hadoop and NoSQL deployments are going to affect their compliance efforts should take a deep breath and remember that the general principles of data security still apply. The first principle is knowing where the data reside. With the newer database solutions, there are automated ways of detecting data and triaging systems that appear to have data they shouldn't.
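
One common way to detect regulated data automatically is to scan text for strings that look like payment card numbers and confirm them with the Luhn checksum. The sketch below is a minimal illustration of that idea, not a description of any particular product; the pattern and function names are assumptions, and a production scanner would need to handle many more formats and false positives.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return digit strings in text that look like valid card numbers."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

Running `find_card_numbers` over sampled records from a data store that is not supposed to hold payment data gives a quick triage signal: any hit marks a system that deserves a closer look.
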

Once you begin to map and understand the data, opportunities should become evident that will lead to automating and monitoring compliance and security through data warehouse technologies. Automation offers the ability to decrease compliance and security costs and still provide the higher levels of assurance, which validates where the data are and where they are going.

Of course, automation does not solve every problem for security, compliance, and backup. There are still some very basic rules that should be used to enable security while not derailing the value of Big Data:

Big Data is all about handling volume while providing results, being able to deal with the velocity and variety of data, and allowing organizations to capture, analyze, store, or move data in real time. Security controls that limit any of these processes are a nonstarter for organizations serious about Big Data.

Some data security solutions encrypt at the file level or lower, such as specific data values, documents, or rows and columns. Those methodologies can be cumbersome, especially for key management. File level or internal file encryption can also render data unusable because many applications cannot analyze encrypted data. Likewise, encryption at the operating system level, but without advanced key management and process-based access controls, can leave Big Data woefully insecure. To maintain the high levels of performance required to analyze Big Data, consider a transparent data encryption solution optimized for Big Data.

Vendor lock-in is becoming a major concern for many enterprises. Organizations do not want to be held captive to a sole source for security, whether it is a single-server vendor, a network vendor, a cloud provider, or a platform. The flexibility to migrate between cloud providers and models based on changing business needs is a requirement, and this is no different with Big Data technologies. When evaluating security, you should consider a solution that is platform-agnostic and can work with any Big Data file system or database, including Hadoop, Cassandra, and MongoDB.

THE INTELLECTUAL PROPERTY CHALLENGE

One of the biggest issues around Big Data is the concept of intellectual property (IP). First we must understand what IP is, in its most basic form. There are many definitions available, but basically, intellectual property refers to creations of the human mind, such as inventions, literary and artistic works, and symbols, names, images, and designs used in commerce. Although this is a rather broad description, it conveys the essence of IP.

With Big Data consolidating all sorts of private, public, corporate, and government data into a large data store, there are bound to be pieces of IP in the mix, from simple elements, such as photographs, to more complex elements, such as patent applications or engineering diagrams. That information has to be properly protected, which may prove to be difficult, since Big Data analytics is designed to find nuggets of information and report on them.

Here is a little background: Between 1985 and 2010, the number of patents granted worldwide rose from slightly less than 400,000 to more than 900,000. That's an increase of more than 125 percent over one generation (25 years). Patents are filed and backed with IP rights (IPRs).

Technology is obviously pushing this growth forward, so it only makes sense that Big Data will be used to look at IP and IP rights to determine opportunity. This should create a major concern for companies looking to protect IP and should also be a catalyst to take action. Fortunately, protecting IP in the realm of Big Data follows many of the same rules that organizations have already come to embrace, so IP protection should already be part of the culture in any enterprise.

The same concepts just have to be expanded into the realm of Big Data. Some basic rules are as follows:

If all employees understand what needs to be protected, they can better understand how to protect it and whom to protect it from. Doing that requires that those charged with IP security in IT (usually a chief security officer, or CSO) communicate on an ongoing basis with the executives who oversee intellectual capital. This may require meeting at least quarterly with the chief executive, operating, and information officers and representatives from HR, marketing, sales, legal services, production, and research and development (R&D). Corporate leaders will be the foundation for protecting IP.

CSOs with extensive experience normally recommend doing a risk and cost-benefit analysis. This may require you to create a map of your company's assets and determine what information, if lost, would hurt your company the most. Then consider which of those assets are most at risk of being stolen. Putting these two factors together should help you figure out where to best allocate your protective efforts.

Confidential information should be labeled appropriately. If company data are proprietary, note that on every log-in screen. This may sound trivial, but in court you may have to prove that someone who was not authorized to take information had been informed repeatedly. Your argument won't stand up if you can't demonstrate that you made this clear.

Physical as well as digital protection schemes are a must. Rooms that store sensitive data should be locked. This applies to everything from the server farm to the file room. Keep track of who has the keys, always use complex passwords, and limit employee access to important databases.

Awareness training can be effective for plugging and preventing IP leaks, but it must be targeted to the information that a specific group of employees needs to guard. Talk in specific terms about something that engineers or scientists have invested a lot of time in, and they will pay attention. Humans are often the weakest link in the defense chain. This is why an IP protection effort that counts on firewalls and copyrights but ignores employee awareness and training is doomed to fail.

A growing variety of software tools are available for tracking documents and other IP stores. The category of data loss protection (or data leakage prevention) grew quickly in the middle of the first decade of this century and now shows signs of consolidation into other security tool sets. Those tools can locate sensitive documents and keep track of how they are being used and by whom.

You must take a panoramic view of security. Suppose someone is scanning the internal network, your internal intrusion detection system goes off, and someone from IT calls the employee who is doing the scanning and says, "Stop doing that." The employee offers a plausible explanation, and that's the end of it. Later the night watchman sees an employee carrying out protected documents, whose explanation, when stopped, is "Oops, I didn't realize that got into my briefcase." Over time, the HR group, the audit group, the individual's colleagues, and others all notice isolated incidents, but no one puts them together and realizes that all these breaches were perpetrated by the same person. This is why communication gaps between infosecurity and corporate security groups can be so harmful. IP protection requires connections and communication among all the corporate functions. The legal department has to play a role in IP protection, and so do HR, IT, R&D, engineering, and graphic design. Think holistically, both to protect and to detect.

If you were spying on your own company, how would you do it? Thinking through such tactics will lead you to consider protecting phone lists, shredding the papers in the recycling bins, convening an internal council to approve your R&D scientists' publications, and coming up with other worthwhile ideas for your particular business.
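
The holistic approach described in the list above, in which isolated incidents noticed by different groups are put together, can be sketched as a simple correlation step. The incident tuples, employee identifiers, and the threshold of two distinct reporting groups are hypothetical choices for illustration.

```python
from collections import defaultdict


def correlate(incidents, threshold=2):
    """Flag people reported by at least `threshold` distinct groups.

    Each incident is a (reporting_group, person, description) tuple.
    Returns a dict mapping each flagged person to the sorted list of groups.
    """
    groups_by_person = defaultdict(set)
    for reporting_group, person, _description in incidents:
        groups_by_person[person].add(reporting_group)
    return {
        person: sorted(groups)
        for person, groups in groups_by_person.items()
        if len(groups) >= threshold
    }


# Hypothetical incident log pooled from separate departments.
incidents = [
    ("IT", "emp-17", "scanned the internal network"),
    ("physical security", "emp-17", "carried out protected documents"),
    ("HR", "emp-42", "missed a mandatory training session"),
]
```

On this sample log, `correlate(incidents)` flags only `emp-17`, the one person reported by both IT and physical security. Neither group would have seen the pattern from its own records alone.
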

These guidelines can be applied to almost any information security paradigm that is geared toward protecting IP. The same guidelines can be used when designing IP protection for a Big Data platform.
