
Creating Unified IT Monitoring and Management in Your Environment

Don Jones



Introduction to Realtime Publishers
by Don Jones, Series Editor

For several years now, Realtime has produced dozens and dozens of high-quality books that just happen to be delivered in electronic format, at no cost to you, the reader. We've made this unique publishing model work through the generous support and cooperation of our sponsors, who agree to bear each book's production expenses for the benefit of our readers.

Although we've always offered our publications to you for free, don't think for a moment that quality is anything less than our top priority. My job is to make sure that our books are as good as, and in most cases better than, any printed book that would cost you $40 or more. Our electronic publishing model offers several advantages over printed books: You receive chapters literally as fast as our authors produce them (hence the "realtime" aspect of our model), and we can update chapters to reflect the latest changes in technology.

I want to point out that our books are by no means paid advertisements or white papers. We're an independent publishing company, and an important aspect of my job is to make sure that our authors are free to voice their expertise and opinions without reservation or restriction. We maintain complete editorial control of our publications, and I'm proud that we've produced so many quality books over the past years.

I want to extend an invitation to visit us at http://nexus.realtimepublishers.com, especially if you've received this publication from a friend or colleague. We have a wide variety of additional books on a range of topics, and you're sure to find something that's of interest to you, and it won't cost you a thing. We hope you'll continue to come to Realtime for your educational needs far into the future.

Until then, enjoy.

Don Jones

Contents

Introduction to Realtime Publishers
Chapter 1: Managing Your IT Environment: Four Things You're Doing Wrong
  IT Management: How We Got to Where We Are Today
  Problem 1: You're Managing IT in Silos
  Problem 2: You Aren't Connecting Your Users, Service Desk, and IT Management
  Problem 3: You're Measuring the Wrong Things
  Problem 4: You're Losing Knowledge
  How Truly Unified Management Can Fix the Problems
  Summary
Chapter 2: Eliminating the Silos in IT Management
  Too Many Tools Means Too Few Solutions
  Domain-Specific Tools Don't Facilitate Cooperation
  The Cloud Question: Unifying On-Premise and Off-Premise Monitoring
  Missing Pieces
  Not All of IT Is a Problem: Ordering, Routing, and Providing Services
  Coming Up Next
Chapter 3: Connecting Everyone to the IT Management Loop
  Starting the Loop: Connecting Monitoring to the Service Desk
  Making Changes: How to Find a Change Management Window
  Communicating: How to Bring Users into the Loop
  SLAs: Setting and Meeting Realistic Expectations
  Tell Me What You Really Think
  When Everyone Doesn't Need to See Everything: A Multi-Tenant Approach
  Call It a Private Management Cloud: Allocating Costs
  Conclusion
  Coming Up Next
Chapter 4: Monitoring: Look Outside the Data Center
  Monitoring Technical Counters vs. the End-User Experience
  How the EUE Drives Better SLAs
  How It's Done: Synthetic Transactions, Transaction Tracking, and More
  Top-Down Monitoring: From the EUE to the Root Problem
  Agent vs. Agentless Monitoring
  Monitoring What Isn't Yours
  Critical Capability: You Need to Monitor Everything
  Conclusion
  Coming Up Next
Chapter 5: Turning Problems into Solutions
  Closing the Loop: Connecting the Service Desk to Monitoring
  Retaining Knowledge Means Faster Future Resolution
    Knowledge Bases
    Tickets as Knowledge Base Articles
    Unifying the Knowledge Base
    Making Tickets an Asset
  Past Performance Is an Indication of Future Results
    It's the Performance Database
  Summary
  Coming Up Next
Chapter 6: Unified Management, Illustrated
  The Case Studies
    Detecting and Solving Problems
    Fulfilling User Orders
  A Shopping List for Unified IT Management
  Ways to Buy Your Unified IT
  Conclusion


Copyright Statement
© 2012 Realtime Publishers. All rights reserved. This site contains materials that have been created, developed, or commissioned by, and published with the permission of, Realtime Publishers (the "Materials") and this site and any such Materials are protected by international copyright and trademark laws. THE MATERIALS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. The Materials are subject to change without notice and do not represent a commitment on the part of Realtime Publishers or its web site sponsors. In no event shall Realtime Publishers or its web site sponsors be held liable for technical or editorial errors or omissions contained in the Materials, including without limitation, for any direct, indirect, incidental, special, exemplary or consequential damages whatsoever resulting from the use of any information contained in the Materials. The Materials (including but not limited to the text, images, audio, and/or video) may not be copied, reproduced, republished, uploaded, posted, transmitted, or distributed in any way, in whole or in part, except that one copy may be downloaded for your personal, noncommercial use on a single computer. In connection with such use, you may not modify or obscure any copyright or other proprietary notice. The Materials may contain trademarks, services marks and logos that are the property of third parties. You are not permitted to use these trademarks, services marks or logos without prior written consent of such third parties. Realtime Publishers and the Realtime Publishers logo are registered in the US Patent & Trademark Office. All other product or service names are the property of their respective owners.
If you have any questions about these terms, or if you would like information about licensing materials from Realtime Publishers, please contact us via e-mail at info@realtimepublishers.com.


Chapter 1: Managing Your IT Environment: Four Things You're Doing Wrong

At the very start of the IT industry, "monitoring" meant having a guy wander around inside the mainframe looking for burnt-out vacuum tubes. There wasn't really a way to locate the tubes that were working a bit harder than they were designed for, so monitoring, such as it was, was an entirely reactive affair.

In those days, the "Helpdesk" was probably that same guy answering the phone when one of the other dozen or so "computer people" needed a hand feeding punch cards into a hopper, tracking down a burnt-out tube, and so on. The concepts of tickets, knowledge bases, service level agreements (SLAs), and so forth hadn't yet been invented.

IT management has certainly evolved since those days, but it unfortunately hasn't evolved as much as it could or should have. Our tools have definitely become more complex and more mature, but the way in which we use those tools (our IT management processes) is in some ways still stuck in the days of reactive tube-changing.

Some of the philosophies that underpin many organizations' IT management practices are really becoming a detriment to the organizations that IT is meant to support. The discussion in this chapter will revolve around several core themes, which will continue to drive the subsequent chapters in this book. The goal will be to help change your thinking about how IT management (particularly monitoring) should work, what value it should provide to your organization, and how you should go about building a better-managed IT environment.

IT Management: How We Got to Where We Are Today
In the earliest days of IT, we dealt with fairly straightforward systems. Even simplistic, by today's standards. The IT team often consisted of people who could fix any of the problems that arose, simply because there weren't all that many moving parts. It's as if IT was a car: A machine capable of complexity and of doing many different things, but perfectly comprehendible, in its entirety, by a single human being.


As we started to evolve that IT car into a space shuttle, we gradually needed to allow for specialization. Individual systems became so complex in and of themselves that we needed domain-specific experts to be able to monitor, maintain, and manage each system. Messaging systems. Databases. Infrastructure components. Directory services. The vendors who produced these systems, along with third parties, developed tools to help our experts monitor and manage each system.

That's really where things went wrong. It seemed perfectly sensible at the time, and indeed there was probably no other way to have done things, but that establishment of domain-specific silos (each with their own tools, their own procedures, and their own expertise) was the seed for what would become a towering problem inside many IT shops.

Fast forward to today, when our systems are vastly more complex, vastly interconnected, and increasingly not even hosted within our own data centers. When a user encounters a problem, they obviously can't tell us which of our many complex systems is at fault. They simply tell us what they observe and experience about the problem, which may be the aggregate result of several systems' interactions and interdependencies. Our users see a holistic environment: IT. That doesn't correspond well to what we see on the back end: databases, servers, directories, files, networks, and more. As a result, we often spend a lot of time trying to track down the root cause of problems. Worse, we often don't even see the problems coming, because the problems only exist when you look at the end result of the entire environment rather than at individual subsystems. Users feel completely disconnected from the process, shielded from IT by a sometimes-helpful, sometimes-not Helpdesk. IT management has a difficult time wrapping their heads around things like performance, availability, and so on, simply because they're forced to use metrics that are specific to each system on the network rather than look at the environment as a whole.
The way we've built out our IT organizations has led to very specific business-level issues, which have become common concerns and complaints throughout the world:

• IT has difficulty defining and meeting business-level SLAs. "The messaging server will be up 99% of the time" isn't a business-level SLA; it's a technical one. "Email will flow between internal and external users 99% of the time" is a business-level SLA, but it can be difficult to measure because that statement involves significantly more systems than just the email server.
• IT has difficulty proactively predicting problems based on system health, and remains largely reactive to problems.
• When problems occur, IT often spends far too much time pinpointing the root cause of the problem.
• IT's concept of performance and system health is driven by systems (database servers, directory services, network devices, and so forth) rather than by how users and the organization as a whole are experiencing the services delivered by those systems.
• IT has a tough time rapidly adopting new technologies that can benefit the business. Paradoxically, IT is often the part of the organization most opposed to change, because change is usually the trigger for problems. Broken systems don't help anyone, but an inability to quickly incorporate changes can also be a detriment to the organization's competitiveness and flexibility.
• IT has a really tough time adopting new technologies that are significantly outside the team's experience or physical reach, most specifically the bevy of outsourced offerings commonly grouped under the term "cloud computing." These technologies and approaches to technology are so different from what's come before that IT doesn't feel confident that they can monitor and manage these new systems. Thus, they resist implementing these types of systems for fear that doing so will simply damage the organization.
• Even with modern self-service Helpdesk systems, users feel incredibly powerless and out of touch when it comes to IT.
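To see why the business-level SLA in the first item is harder to meet than the technical one, it can help to work through the arithmetic. The short Python sketch that follows rests on a simplified model that is an assumption, not something stated in the text: email flow depends on a chain of components, each component fails independently, and the service is down whenever any one of them is down. The component names and the 99% figures are hypothetical.

```python
from math import prod


def composite_availability(component_availabilities):
    """Availability of a service that needs every listed component at once.

    Assumes independent failures and a simple serial dependency:
    if any one component is down, the service is down.
    """
    return prod(component_availabilities)


# Hypothetical chain behind "email will flow between internal and external users".
email_chain = {
    "messaging server": 0.99,
    "directory service": 0.99,
    "internal network": 0.99,
    "firewall": 0.99,
    "internet link": 0.99,
}

end_to_end = composite_availability(email_chain.values())
print(f"Each component: 99.00%  End to end: {end_to_end:.2%}")
```

Under those assumptions, five components that each meet a 99% technical SLA deliver only about 95% end to end, which is why measuring the email server alone says little about whether email is actually flowing.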

All of these business-level problems are the direct result of how we've always managed IT. Our processes for monitoring and managing IT basically have four core problems. Not every organization has every single one of these, of course, and most organizations are at least aware of some of these and work hard to correct them. Ultimately, however, organizations need to ensure that all four of these core problems are addressed. Doing so will immediately begin to resolve the business-level issues I've outlined.

Problem 1: You're Managing IT in Silos
Figures 1.1, 1.2, and 1.3 illustrate one of the fundamental problems in IT monitoring and management today.

Figure 1.1: Windows Performance Monitor.

Figure 1.2: SQL Server Performance.

Figure 1.3: Router Performance.

These figures each illustrate a different performance chart for various components of an IT system. Each of these images was produced using a tool that is more or less specialized for the exact thing that was being monitored. The tool that produced the router performance chart, for example, can't produce the same chart for a database server or even for a router that's located on someone else's network.


This is such a core, fundamental problem that many IT experts can't even recognize that it is a problem. Using these domain-specific tools is such an integrated and seemingly natural part of how IT works that many of us simply can't imagine a different way. But we need to move past using these domain-specific tools as our first line of defense when it comes to monitoring and troubleshooting.

Why?

One major reason is that these tools keep us all from being on the same page. IT experts can't even have meaningful cross-discipline discussions when these tools become involved. "I'm looking at the database server, and the performance is at more than 200 TPMs," one expert says. "Well, that must be a problem because the router is running well over 10,000 PPMs." Those two experts don't even have a common language for performance because they're locked into the domain-specific, deeply technical aspects of the technologies they manage.

Domain-specific tools also encourage what is probably the worst single practice in all of IT: looking at systems in isolation. The database guy doesn't have the slightest idea what makes a router tick, what constitutes good or bad performance in a messaging server, or what to look for to see if the directory services infrastructure is running smoothly. So the database guy puts on a set of blinders and just looks at his database servers. But those servers don't exist in a vacuum; they're impacted by, and they in turn impact, many other systems. Everything works together, but we can't see that using domain-specific tools.

We have to permanently remove the walls between our technical disciplines, breaking down the silos and getting everyone to work as a single team. In large part, that means we're going to have to adopt new tools that enable IT silos to work as a team, putting the information everyone needs into a common context. Sure, domain-specific tools will always have their place, but they can't be our first line of information.

Case Study
Jerry works for a typical IT department in a midsize company. His specialty is Windows server administration, and his team includes specialists for Web applications, Microsoft SQL Server and Oracle, VMware vSphere, and for the network infrastructure. The company outsources certain enterprise functionality, including their Customer Relationship Management (CRM) and email.
Recently, a problem occurred that caused the company's main Web site to stop sending customer order confirmation emails. Jerry was initially called to solve the problem, on the assumption that it was with the company's outsourced messaging solution. Jerry discovered, however, that user email was flowing normally. He passed the problem to the Web specialist, who confirmed that the Web site was working properly but that emails sent by it were being rejected. Jerry filed a ticket with the messaging hosting company, who responded that their systems were in working order and that he should check the passwords that the Web servers were using.

After more than a day of back and forth with the hosting company and various experts, the problem was traced to the company's firewall. It had recently been upgraded to a new version, and that version was now blocking outgoing message traffic from the company's perimeter network, which is where the Web servers were located. The network infrastructure specialist was called in to reconfigure the firewall, and the problem was solved.

This narrative precisely demonstrates the problem: By managing our IT teams as domain-specific silos, we significantly hinder their ability to work together to solve problems. The fact that IT experts require domain-specific tools shouldn't be a barrier to breaking down those silos and getting our team to work more efficiently together. This becomes especially important when pieces of the infrastructure are outsourced; those hosting companies are an unbreakable silo, as they're not responsible for any systems other than the ones they provide to us. However, the dependencies that our systems and processes have on their systems mean our own team still has to be able to monitor and troubleshoot those outsourced systems as if they were located right in the data center.

Problem 2: You Aren't Connecting Your Users, Service Desk, and IT Management
Communication is a key component of making any team work, and the team that is your organization is no exception. In the case of IT, we typically use Helpdesk systems as our means of enabling communications, but that isn't always sufficient. Helpdesk systems are almost always built around the concept of reacting to problems, then managing that reaction; they're almost by definition not proactive.

For example, how do you tell your users that a given system will have degraded performance or will be offline for some period of time? Probably through email, which creates a couple of problems:

• Important messages tend to get lost in the glut of email that users deal with daily.
• Users who don't get the message tend to go the Helpdesk route, which doesn't include a means of intercepting their mental process and letting them know that the problem was planned for.


Most IT teams do know the things that need to be communicated throughout the organization, for example:

• SLAs
• The current status of SLAs (whether they're being met)
• Planned outages and degraded service
• Average response times for specific services
• Known issues that are being worked on

What most IT teams have a problem with is communicating these items consistently across the entire organization. Some organizations rely on email, which as I've already pointed out can be inefficient and not consistently effective. Some organizations will use an intranet Web site, such as a SharePoint portal, to post notices, but these sites aren't directly integrated with the Helpdesk, making it an extra step to keep them updated and requiring users to remember to check them.

Case Study
Tom works as an inside salesperson for a midsize manufacturing company. Recently, the application that Tom uses to track prospects and create new orders started responding very slowly, and over the course of the day, stopped working completely.

Tom's initial action was to call his company's IT Helpdesk. The Helpdesk technician sounded harried and frustrated, and told Tom, "We know, we're working on it," and hung up. Tom had no expectation when the system might return to normal, and was afraid to bother the Helpdesk by calling back for more details.

Over the course of that day, the Helpdesk logged calls from nearly every salesperson, each of whom called on their own to find out what was going on. Eventually, the Helpdesk simply stopped logging the calls, telling everyone that, "A ticket is already open," and disconnecting the call.

Someone on the IT management team eventually sent out an email explaining that a server had failed and that the application wasn't expected to be online until the next morning. Tom wished he had known earlier; although he'd originally planned to make sales calls all day, if he'd known that the application would be down for that long, he could have switched to other activities for the day or even just taken the day off.


Management communications are equally important, and equally challenging. Providing frank numbers on service levels, response times, outages, and so forth is crucial in order for management to make better decisions about IT, but that information can often be difficult to come by.

Problem 3: You're Measuring the Wrong Things
This problem is very likely at the heart of everything IT is not doing to help better align technology with business needs. The following case study outlines the scenario.

Case Study
Shelly works in the Accounting department for her company. Recently, while trying to close the books for her company, the accounting application began to react very slowly. She called her company's IT Helpdesk to report the problem.

The Helpdesk technician listened to her then said that, "Everything on that server looks fine right now. I'll open a ticket and ask someone to look at it, but since we are currently within our service level agreement for response times, it will be a low-priority ticket."

Shelly continued to struggle with the slowly responding application. Eventually, someone was dispatched to her desktop. She demonstrated that every other application was responding normally. She pointed out that other people in her department were having similar problems with the application. The technician made her close all of her applications and then restarted her computer, to no effect. He shrugged, entered some notes into his smartphone, and left.

By the next morning, the application's response times were better, but they were far from normal. Shelly continued to call the Helpdesk for updates on her ticket's status, but it seemed as if the IT team had given up on trying to fix the problem and refused to even admit that there was a problem.

This kind of scenario unfortunately happens all too often in many organizations. It exactly illustrates what happens when several problems are happening at once: IT is operating as a set of individual silos rather than as a team, and each silo has its own definition for words like "slow." A root issue here is that everyone is measuring the wrong thing. Figure 1.4 shows how the average IT team sees a multi-component, distributed application.

Figure 1.4: IT perspective of a distributed application.

They see the components. Domain experts measure the performance of each component using technical metrics, such as processor utilization, response time, and so forth. When a component's performance exceeds certain predefined thresholds, someone in IT pays attention. Figure 1.5, however, shows how a user sees this same application.

Figure 1.5: User's perspective of a distributed application.

The user doesn't (and often can't) see any of the components. They simply see an application, and either it's responding the way they expect, or it isn't. It doesn't matter a bit to the user if every single constituent component is running at an acceptable level of processor utilization, whatever that means. They simply care whether the application is working. This creates a major disconnect between the user population and IT, as Figure 1.6 illustrates.

Figure 1.6: IT vs. user measurements of performance.

Users and IT measure very different things. An IT-centric SLA might specify a given response time for queries sent to a database server; that often has little to do with whether an application is seen as "slow" by users. Worse, as we start to migrate services and components to "the cloud," we lose much of our ability to measure those components' performance the way we do for things that are in our own data center. The result? Nobody can agree on what an SLA should say.

This all has to change. We have to start measuring things more from a user perspective. The performance of individual components is important, but only as it contributes to the total experience that a user perceives. We need to define SLAs that put everyone (users and IT) on the same page, then manage to those SLAs using tools that enable us to do so.

Some organizations will tell you that they're moving, or have moved, to a "service-based" IT offering. What that generally means in broad terms is that the organization is seeking to provide IT as a set of services to the organization's various departments and users. In many instances, however, those service-oriented organizations are still focused on components and devices, which isn't a service-oriented approach at all. When your phone line goes down, you don't call the phone company (on your cell phone, probably) and start asking questions about switches and trunk lines; you ask when your dial tone will be back. The back-end infrastructure is meaningless to the user. You don't ask for a service credit based on how long a particular phone company office will be offline; you ask for that credit based on how long you went without a dial tone. That's the model IT needs to move toward.
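The gap between component thresholds and what the user experiences can be shown with a few lines of Python. This is only a sketch under stated assumptions: the component names, timings, thresholds, and the 2,000 ms user-facing target are all hypothetical, and it assumes a request passes through the components one after another so that the user's wait is the sum of the individual times.

```python
from dataclasses import dataclass


@dataclass
class ComponentReading:
    name: str
    response_ms: float   # time this request spent in the component
    threshold_ms: float  # IT-centric alert threshold for the component


def components_in_alarm(readings):
    """Names of components whose own technical threshold is exceeded."""
    return [r.name for r in readings if r.response_ms > r.threshold_ms]


def user_response_ms(readings):
    """End-to-end wait, assuming the components are visited in sequence."""
    return sum(r.response_ms for r in readings)


# Hypothetical readings for a single user request.
readings = [
    ComponentReading("web server", 400, 500),
    ComponentReading("application server", 700, 800),
    ComponentReading("database server", 900, 1000),
    ComponentReading("network", 150, 200),
]
USER_TARGET_MS = 2000  # hypothetical user-facing SLA target

alarms = components_in_alarm(readings)
total = user_response_ms(readings)
print("Component alarms:", alarms or "none")
print(f"User waited {total:.0f} ms; target met: {total <= USER_TARGET_MS}")
```

With these made-up numbers, no component raises an alarm, yet the user waits 2,150 ms against a 2,000 ms target. Every silo can truthfully report that its system looks fine while the application is still slow for the person using it.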

Problem 4: You're Losing Knowledge
The last problematic practice we'll look at is the issue of lost institutional knowledge. This problem is a purely human one, and frankly it's going to be difficult to address. Here's a quick scenario to set the scene.

Case Study
Aaron works for his company's IT department. He's been with the company for 3 years and is responsible for several of the company's systems and infrastructure components. One Tuesday, Aaron is contacted by his company's IT Helpdesk. "We're assigning you a ticket about the Oracle system," he's told. "Once every couple of months it starts acting really weird, and someone has to fix it."

"I'm not the Oracle guy," Aaron says. "That's Jill."

"Yeah, but Jill's out on vacation for 2 weeks. So you'll have to fix it."

"I've no idea what to do!"

"Well, figure something out. The CEO gets upset when this takes too long to fix."

Unfortunately, too much knowledge gets wrapped up in the heads of specific individuals. In fact, it's a sad truth that many organizations deal with this problem by simply discouraging IT team members from taking lengthy vacations, and often resist other activities that would put them out of touch, such as sending them to conferences and classes to continue their education and to learn new skills.

More than a few organizations have made half-hearted attempts at building knowledge bases, in a hope that some of this institutional knowledge can be committed to electronic paper, preserved, and made more accessible. The problem is that IT professionals aren't necessarily good writers, so the act of producing the knowledge base is difficult for them. It also takes time, and that is time the organization is often unwilling to commit, especially in the face of other daily pressures and demands.

As I said, this is a problem that's difficult to fix. The IT team realizes it's a problem, and is generally willing to fix it, but they're not tech writers, and often have a limited ability to fix the problem. You can usually create management requirements that require problems and solutions be logged in a Helpdesk ticketing system, but searching through that system for problems and solutions can often be difficult and time-consuming, much like searching for solutions on an Internet search engine, with all of the false hits such a search generally produces.

But we must find a way to address this problem. Knowledge about the company's infrastructure and how to solve problems has to be captured and preserved. This requirement is crucial not only to solving problems faster in the future but also to eventually preventing those problems by making better IT management decisions.
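One way to picture what "captured and preserved" might mean in practice is a ticket record that keeps the affected system, the problem, and the solution as separate fields, so that a search can be narrowed by system before matching keywords. The Python sketch here is purely illustrative: the Ticket fields, the find_solutions helper, and the sample tickets are hypothetical and do not describe any particular Helpdesk product.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    system: str
    problem: str
    solution: str  # empty string if the ticket was never resolved


def find_solutions(tickets, system, keywords):
    """Resolved tickets for one system whose problem text mentions every keyword."""
    wanted = [k.lower() for k in keywords]
    return [
        t for t in tickets
        if t.system.lower() == system.lower()
        and t.solution
        and all(k in t.problem.lower() for k in wanted)
    ]


# Hypothetical ticket history.
tickets = [
    Ticket("Oracle", "Queries hang after the monthly batch job", "Restarted the listener service"),
    Ticket("Oracle", "Backup job failed overnight", ""),
    Ticket("Email", "Outbound messages hang in the queue", "Cleared the stuck queue"),
]

matches = find_solutions(tickets, "oracle", ["hang"])
for t in matches:
    print(f"{t.system}: {t.problem} -> {t.solution}")
```

Filtering on the system field first is what removes the false hit from the email ticket, which a plain keyword search for "hang" would have returned.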

How Truly Unified Management Can Fix the Problems
This book is going to be all about fixing these four problems, and the means by which I'll propose to do so falls under the umbrella term unified management. Essentially, unified management is all about bringing everything together in one place.

We'll break down the silos between IT disciplines, putting everyone onto the same console, getting everyone working from the same data set, and getting everyone working together on problems. We'll do that in a way that brings users, IT, and management into a single viewport of IT service and performance. We'll create more transparency about things like service levels, letting users see what's happening in the environment so that they're more informed.

We'll inform users in a way that's meaningful to them rather than using invisible, back-end technical metrics. We'll rebuild the entire concept of SLAs into something that's meaningful first to users and management, and that can withstand the transition to hybrid IT that's being brought about by outsourcing certain IT services to "the cloud."

Finally, we'll find a way to capture information about our environment, including solutions to problems, to enable faster time-to-resolution when problems occur. In addition, this information will enable management to make smarter decisions about future technology directions and investments.

We'll try to do all of this in a way that won't cost the organization an arm and a leg nor take half a lifetime to actually implement. That will involve a certain amount of creativity, including looking at outsourced solutions. The idea of an outsourced solution providing monitoring for insourced components is fairly innovative, and we'll see what applicability it has.

I should point out that much of what we'll be looking at can work to support the IT management frameworks that many organizations are adopting these days, including the ITIL framework that's become popular in the past few years. You certainly don't have to be an ITIL expert to take advantage of the new processes and techniques I'll suggest, nor do you even have to think about implementing ITIL (or any other framework) if your organization isn't already doing so. If you are using a framework, however, you'll be pleased to know that everything I have to propose should fit right into it.

Summary
This chapter has established the four main themes that will drive the remaining chapters in this book. These core things represent what many experts believe are the biggest and most fundamental problems with how IT is managed today, and represent the things that we'll focus on fixing throughout the remainder of this book. Our focus will be on changing management philosophies and practices, not on simply picking out new tools, although new tools may be something you'll acquire to help support these new practices.

Chapter 2 will focus on the first problematic practice, which is the fact that IT tends to be managed in domain-specific silos. We'll look at the technical reasons organizations have been more or less forced to manage this way, and explore ways in which you can start to change that practice.

Chapter 3 will look at connecting people: IT management, your users, your service desk, and more. Only by bringing everyone into the process can IT better align itself to the needs of the organization.

Our third problem practice will be the subject of Chapter 4, where we dive into looking outside the data center for monitoring. The goal will be to solve the problems we've discussed in this chapter, further focusing IT on its value to the organization.

Chapter 5 will discuss ways to turn problems into future solutions. Although modern organizations are fully aware of the need for Help desk tracking and knowledge building, how those activities are managed as part of the larger IT management process can make a huge difference in their value-add to the organization.

We'll conclude in Chapter 6, with an attempt to visualize an IT environment where these new, unified management practices are in place. I'll provide narratives from several case studies, helping you see how these modernized practices work in a real environment.

Chapter 2: Eliminating the Silos in IT Management
In the previous chapter, I proposed that one of the biggest problems in modern IT is the fact that we manage our environment in technology-specific silos: database administrators are in charge of databases, Windows admins are in charge of their machines, VMware admins run the virtualization infrastructure, and so forth. I'm not actually proposing that we change that exact practice; having domain-specific experts on the team is definitely a benefit. However, having these domain-specific experts each using their own unique, domain-specific tool definitely creates problems. In this chapter, we'll explore some of those problems, and see what we can do to solve them and create a more efficient, unified IT environment.

Too Many Tools Means Too Few Solutions
"Comparing apples to oranges" is an apt phrase when it comes to how we manage performance, troubleshooting, and other core processes in IT. Tell an Exchange Server administrator that there's a performance problem with the messaging system, and he'll likely jump right into Windows Performance Monitor, perhaps with a pre-created counter set that focuses on disk throughput, processor utilization, RPC request count, and so forth, as shown in Figure 2.1.

Figure 2.1: Monitoring Exchange.

If the Exchange administrator can't find anything wrong with the server, he might pass the problem over to someone else. Perhaps it will be the Active Directory administrator because Active Directory plays such a crucial role in Exchange's operation and performance. Out comes the Active Directory administrator's favorite performance tool, perhaps similar to the one shown in Figure 2.2. This is truly a domain-specific tool, with special displays and measurements that relate specifically to Active Directory.

Figure 2.2: Monitoring Active Directory.

If Active Directory looks fine, then the problem might be passed over to the network infrastructure specialist. Out comes another tool, this one designed to look at the performance of the organization's routers (see Figure 2.3).

Figure 2.3: Monitoring router performance.

Combined, all of these tools have led these three specialists to the same decision: Everything's working fine. In spite of the fact that Exchange is clearly, from the users' point of view, not working fine, there's no evidence that points to a problem.

Simply put, this is a "too many tools, too few answers" problem. In today's complex IT environments, performance, along with other characteristics like availability and scalability, is the result of many components interacting with each other and working together. You can't manage IT by simply looking at one component; you have to look at entire systems of interacting, interdependent components.

Our reliance on domain-specific tools holds us back from finding the answers to our IT problems. That reliance also holds us back when it comes time to grow the environment, manage service level agreements (SLAs), and handle other core tasks. I've actually seen instances where domain-specific tools acted almost as blinders, preventing an expert who should have been able to solve a problem, or at least identify it, from doing so as quickly as he or she might have done.

Case Study

Heather is a database administrator for her organization. She's responsible for the entire database server, including the database software, the operating system (OS), and the physical hardware.

One day she receives a ticket indicating that users are experiencing sharply reduced performance from the application that uses her database. She whips out her monitoring tools, and doesn't see a problem. The server's CPU is idling along, disk throughput is well within norms, and memory consumption is looking good. In fact, she notices that the amount of workload being sent to the server is lower than she's used to seeing. That makes her suspect the network is having traffic jams, so she reassigns the ticket to the company's infrastructure team. That team quickly reassigns the ticket right back to her, assuring her that the network is looking a bit congested, but it's all traffic coming from her server.

Heather looks again, and sees that the server's network interface is humming along with a bit more traffic than usual. Digging deeper, she finally realizes that the server is experiencing a high level of CRC errors, and is thus having to retransmit a huge number of packets. Clients experience this problem as a general slowdown because it takes longer for undamaged packets to reach their computers.

Heather's focus on her specific domain expertise led her to toss the problem over the wall to the infrastructure team, wasting time. Because she wasn't accustomed to looking at her server's network interface, she didn't check it as part of her routine performance troubleshooting process.
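The check Heather skipped is easy to make routine. The following is only a minimal sketch, not taken from any particular monitoring product: it assumes you can read two cumulative counters (total packets and CRC errors) from a network interface at two points in time. The names `InterfaceSample`, `crc_error_rate`, and `interface_needs_attention`, and the 1% threshold, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class InterfaceSample:
    """Cumulative counters read from one network interface (hypothetical shape)."""
    packets: int      # total packets seen so far
    crc_errors: int   # total frames with CRC errors so far


def crc_error_rate(earlier: InterfaceSample, later: InterfaceSample) -> float:
    """Return CRC errors per packet over the interval between two samples."""
    packets = later.packets - earlier.packets
    errors = later.crc_errors - earlier.crc_errors
    if packets <= 0:
        # No traffic (or a counter reset): report no measurable error rate.
        return 0.0
    return errors / packets


def interface_needs_attention(earlier: InterfaceSample,
                              later: InterfaceSample,
                              threshold: float = 0.01) -> bool:
    """Flag the interface when the error rate exceeds an assumed 1% threshold."""
    return crc_error_rate(earlier, later) > threshold
```

Putting a check like this next to the CPU, disk, and memory checks would have surfaced the retransmissions on Heather's first pass, before the ticket bounced between teams.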

Domain-Specific Tools Don't Facilitate Cooperation
If the components of our complex IT systems are cooperative and interdependent, our IT professionals are often anything but. In other words, IT management tends to encourage the silos that are built around specific technology domains. There's the database administration group, the Active Directory group, the infrastructure group, and so forth. Even companies that practice matrix management, in which multiple domain experts are grouped into a functional team, still tend to accept the silos around each technical domain.

There are two major reasons that these silos persist, and almost any IT professional can describe them to you:

• "I don't know anything about that." Each domain expert is an expert in his technical area. The database administrator isn't proficient at monitoring or managing routers, and doesn't especially want to work with them anyway. There's little real value in extensive technical cross-training for most organizations, simply because their staff doesn't have the time. Devoting time to secondary and tertiary disciplines reduces the amount of time available for their primary job responsibilities.

• "I don't want anyone messing with my stuff." IT professionals want to do a good job, and they're keenly aware that most problems come about as the result of change. Allow someone to change something, and you're asking for trouble. If someone changes something in your part of the environment, and you don't know about their activity, you'll have a harder time fixing any resulting problems.

Both of these reasons are completely valid, and I'm in no way suggesting that everyone on the IT team become an expert in every technology that the organization must support. However, the attitudes reflected in these two perspectives require some minor adjustment.

One reason I keep coming back to domain-specific tools is because they encourage this kind of walled-garden separation, and do nothing to encourage even the most cursory cooperation between IT specialists. Cooperation, when it exists, comes about through good human working relationships, and those relationships often struggle with the fact that each specialist is looking at a different set of data and working from a different sheet of music, so to speak. I've been in environments and seen administrators spend hours arguing about whose fault something was, each pointing to their own domain-specific tools as evidence.

Case Study

Dan is an Active Directory administrator for his company, and is responsible for around two dozen domain controllers, each of which runs in a virtual machine. Peg is responsible for the organization's virtual server infrastructure, and manages the physical hosts that run all of the virtual machines.

One afternoon, Peg gets a call from Dan. Dan's troubleshooting a performance problem on some of the domain controllers, and suspects that something is consuming resources on the virtualization host that his domain controllers need.

Peg opens her virtual server console and assures Dan that the servers aren't maxed out on either physical CPU or memory, and that disk throughput is well within expected levels. Dan counters by pointing to his Active Directory monitoring tools, which show maxed-out processor and memory statistics, and lengthening disk queues that indicate data isn't being written to and read from disk as quickly as it should be. Peg insists that the physical servers are fine. Dan asks if the virtual machines' settings have been reconfigured to provide fewer resources to them, and Peg tells him no.

The two go back and forth like this for hours. They're each looking at different tools, which are telling them completely different things. Because they're not able to speak a common technology language, they're not able to work together to solve the problem.

We don't need to have every IT staffer be an expert in every IT technology; we do need to make it easier for specialists to cooperate with one another on things like performance, scalability, availability, and so forth. That's difficult to do with domain-specific tools. The router administrator doesn't want a set of database performance monitoring tools, and the database administrator doesn't especially want the router admin to have those tools. Having domain-specific tools for someone else's technical specialization is exactly how the two attitudes I described earlier come into play.

Ultimately, the problem can be solved by having a unified tool set. Get everyone's performance information onto the same screen. That way, everyone is playing from the same rulebook, looking at the same data, and that data reflects the entire, interdependent environment. Everyone will be able to see where the problem lies, then they can pull out the domain-specific tools to start fixing the actual problem area, if needed.

The Cloud Question: Unifying On-Premise and Off-Premise Monitoring
This concept of a unified monitoring console becomes even more important as organizations begin shifting more of their IT infrastructure into "the cloud."

The Cloud Is Nothing New

I have to admit that I'm not a big fan of "the cloud" as a term. It's very sales-and-marketing flavored, and the fact is that it isn't a terribly new concept. Organizations have outsourced IT elements for years. Probably the most-outsourced component is Web hosting, either outsourcing single Web sites into a shared hosting environment, or outsourcing collocated servers into someone else's data center.

For the purposes of this discussion, "the cloud" simply refers to some IT element being outsourced in a way that abstracts the underlying infrastructure. For example, if you have collocated servers in a hosting company's data center, you don't usually have details about their internal network architecture, their Internet connectivity, their routers, and so forth; the data center is the piece you're paying to have abstracted for you. In a modern cloud computing model like Windows Azure or Amazon Elastic Compute Cloud, you don't have any idea what physical hosts are running your virtual machines; that physical server level is what you're paying to have abstracted, along with supporting elements like storage, networking, and so on. For a Software as a Service (SaaS) offering, you don't even know what virtual machines might be involved in running the software because you're paying to have the entire underlying infrastructure abstracted.

Regardless of which bits of your infrastructure wind up in some outsourced service provider's hands, those bits are still a part of your business. Critical business applications and processes rely on those bits functioning. You simply have less control over them, and typically have less insight into how well they're running at any given time.

This is where domain-specific tools fall apart completely. Sure, part of the whole point of outsourcing is to let someone else worry about performance, but outsourced IT still supports your business, so you at least need the ability to see how the performance of outsourced elements is affecting the rest of your environment. If nothing else, you need the ability to authoritatively point the finger at the specific cause of a problem, even if that cause is an outsourced IT element and you can't directly effect a solution. This is where unified monitoring truly earns a place within the IT environment. For example, Figure 2.4 shows a very simple unified dashboard that shows the overall status of several components of the infrastructure, including several outsourced components, such as Amazon Web Services.

Figure 2.4: Unified monitoring dashboard.

The idea is to be able to tell, at a glance, where performance is failing, to drill through for more details, and then to either start fixing the problem (if it exists on your end of the cloud) or escalate the problem to someone who can.

Let's be very clear on one thing: Any organization that's outsourcing any portion of its business IT environment and cannot monitor the basic performance of those outsourced elements is going to be in big trouble when something eventually goes wrong. Sure, you have SLAs with your outsourcing partners, but read those SLAs. Typically, they only commit to a refund of whatever fees you pay if the SLA isn't met. That does nothing to compensate you for lost business that results from the unmet SLA. It's in your best interests, then, to keep a close watch on performance. That way, when it starts to go bad, you can immediately contact your outsourcing partner and get someone working on a fix so that the impact on your business can at least be minimized.
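The "at a glance" idea behind a unified dashboard can be illustrated with a very small roll-up. This is a sketch under stated assumptions, not how any specific product works: it assumes each component, whether in-house or outsourced, reports one of three status levels, and that the overall status is simply the worst one reported. The names `Status`, `overall_status`, and `failing_components` are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    """Assumed status levels, ordered from healthiest to worst."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def overall_status(components: dict[str, Status]) -> Status:
    """Roll component statuses up to a single value: the worst one wins."""
    return max(components.values(), default=Status.OK)


def failing_components(components: dict[str, Status]) -> list[str]:
    """Return the names of components that are not OK, for drill-down."""
    return sorted(name for name, status in components.items()
                  if status != Status.OK)
```

The point of the design is that in-house and outsourced components sit in the same dictionary. A hosted service that goes critical turns the top-level indicator red exactly as a local server would, which tells you immediately whether to fix the problem yourself or escalate it to the provider.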

Missing Pieces
There's another problem when it comes to performance monitoring and management, scalability planning, and so forth: missing pieces. Our technology-centric approach to IT tends to give us a myopic view of our environment. For example, consider the diagram in Figure 2.5. This is a typical (if simplified) diagram that any IT administrator might create to help visualize the components of a particular application.

Figure 2.5: Application diagram.

The problem is that there are obviously missing pieces. For example, where's the infrastructure? Whoever created this diagram clearly doesn't have to deal with the infrastructure (routers and switches and so forth), so they didn't include it. It's assumed, almost abstracted like an outsourced component of the infrastructure. Maybe Figure 2.6 is a more accurate depiction of the environment.

Figure 2.6: Expanded application diagram.

And even with this diagram, there are still probably missing pieces. This reality is probably one of the biggest dangers in IT management today: We forget about pieces that are outside our purview.

Again, this is where a unified monitoring system can create an advantage. Rather than focusing on a single area of technology, like servers, it can be technology-agnostic, focusing on everything. There's no need to leave something out simply because it doesn't fit within the tool's domain of expertise; everything can be included.

In fact, an even better approach is to focus on unified monitoring tools that can actually go out and find the components in the environment. Software doesn't have to make the same assumptions, or have the same technology prejudices, as humans. A unified monitoring console doesn't care if you happen to be a Hyper-V expert, or if you prefer Cisco routers over some other brand. It can simply take the environment as it is, discovering the various components and constructing a real, accurate, and complete diagram of the environment. It can then start monitoring those components (perhaps prompting you for credentials for each component, if needed), enabling you to get that complete, all-in-one, unified dashboard. I've been in environments where not using this kind of auto-discovery became a real problem.

Case Study

Terry is responsible for the infrastructure components that support his company's primary business application. Those components include routers, switches, database servers, virtualization hosts, messaging servers, and even an outsourced SaaS sales management application. Terry's heard about the unified monitoring idea, and his organization has invested in a service that provides unified monitoring for the environment. Terry's carefully configured each and every component so that everything shows up in the monitoring solution's dashboard.

One afternoon, the entire application goes down. Terry leaps to the unified monitoring console, and sees several alarm indications. He drills down and discovers that the connection to the SaaS application is unavailable. Drilling further, he sees that the router for that connection is working fine, and that the firewall is up and responsive. He's at a complete loss.

Several hours of manual troubleshooting and wire-tracing reveal something about the environment that Terry didn't know: There's a router on the other side of the firewall as well, and it's failed. Normal Internet communications are still working because those travel through a different connection, but the connection that carries the SaaS application's traffic is offline. The extra router is actually a legacy component that pretty much everyone had forgotten about.

A monitoring solution capable of automated discovery wouldn't have forgotten, though. It could have detected the extra router and included it in Terry's dashboard, making it much easier for him to spot the problem. In fact, it might have prompted him to replace or remove that router much earlier, once he realized it existed.
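The gap in Terry's setup comes down to a set difference: what exists on the network versus what someone remembered to configure. The sketch below assumes a discovery scan has already produced a set of device names; how that scan works is outside the sketch, and the function name `unmonitored_devices` and the device names are hypothetical.

```python
def unmonitored_devices(discovered: set[str], monitored: set[str]) -> list[str]:
    """Return devices found by discovery that nobody has set up monitoring for."""
    return sorted(discovered - monitored)


# Hypothetical inventory: the manually configured list omits a legacy router.
discovered = {"core-router", "firewall", "db-server", "legacy-router"}
monitored = {"core-router", "firewall", "db-server"}

for device in unmonitored_devices(discovered, monitored):
    print(f"Discovered but not monitored: {device}")
```

Run periodically, a comparison like this turns a forgotten component into a visible to-do item rather than a surprise during an outage.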

Discovery can also help identify components that don't fit neatly within our technology silos, and that don't "belong" to anyone. Infrastructure components like routers and switches are commonly used examples of these "orphan" components because not every organization maintains a dedicated infrastructure specialist to support these devices. However, legacy applications and servers, specialty equipment, and other components can all be overlooked when they're not anyone's specific area of responsibility. Discovery helps keep us from overlooking them.

Not All of IT Is a Problem: Ordering, Routing, and Providing Services
Most organizations tend to get into the habit of thinking of their IT department as firefighters. IT exists to solve problems. That isn't true, of course, and any organization probably (hopefully) depends on IT to carry out day-to-day tasks and requests more than it relies on IT to solve problems. But the day-to-day tasks are easy to overlook, whereas firefighting gets everyone's attention.

The result of this way of thinking is that IT management tends to focus on tools that help make problem-solving easier. Unified monitoring is exactly that kind of tool: If nothing ever went wrong, we wouldn't need it. It's there to make problem-solving faster, primarily in the areas of performance and availability. Right?

Not quite. Truly unified management also entails making day-to-day IT tasks easier for everyone involved. Users, for example, need to order and receive routine services, from simple password resets and account unlocks to new hardware and software requests. I'll make what some consider to be a bold statement and say that those routine requests should be treated in the exact same way as a problem. Look at any IT management framework, such as ITIL, and you'll find that concept runs throughout: Routine IT requests should be part of a unified management process, which also includes problem-solving.

Consider some of these broad functional capabilities that a unified management (versus mere monitoring) system can offer both to problem-solving activities and to routine IT services:

• Workflow. When problems arise, following a structured process, or workflow, can help make problem-solving more consistent and efficient. Similarly, structured workflows can help make routine IT services more efficient and consistent. The workflows will be different for problem-solving and for various routine services, but having the ability to manage and monitor workflows can be a real benefit.

• Approvals. Workflows should include approvals. This capability is most obvious for routine services like hardware and software requests, security requests, and so on, but it can be just as important for problem-solving. Not every problem can be fixed by changing a setting or rebooting a device; sometimes you'll need to make a more significant change, and having the ability to formally route approval to make that change is a benefit.

• Routing. The specialist who fixes a problem is usually the last one to hear about it. Front-line resources, such as your Help desk and your end users, are the first responders. Being able to select a problem category and have a ticket routed to the right individual helps speed problem resolution. The same is true for routine services: Things get done quicker when the right person has the request. Automated routing capabilities can help get the right person on the job more quickly and more accurately.

• Self-service. Reducing phone calls and manual email juggling is crucial to achieving better efficiency. Self-service can help do that for both problems and routine requests. When users experience a problem, self-service can allow them to submit tickets as well as help them solve the problem on their own, through a knowledge base. When users need routine service, self-service helps them submit that request without having to engage additional IT services.

• Service catalog. Part of self-service is the ability to create an online "store" for services that users can request.

There are more capabilities, of course, but we'll cover them in upcoming chapters. These are simply some of the basic capabilities that we need in order to make both routine IT requests and problem-solving more consistent and efficient.

Coming Up Next
This chapter has been about breaking down the silos between technology specialties, or at least building doorways between them. That helps to solve one of the major problems in modern IT monitoring and management. The next chapter will tackle a somewhat more complicated problem: Keeping everyone in the management loop. It's about improving communications. Unfortunately, communications are too often a voluntary, secondary exercise; we have to make an effort to communicate, and when we're really feeling the pressure, it's easy to want to put that effort elsewhere. So we need to adopt processes and tools that make communications more automatic, helping keep everyone in the loop without requiring a massive secondary effort to do so.

Chapter 3: Connecting Everyone to the IT Management Loop
IT management has for too long involved discrete, disconnected processes that often leave key participants wondering what's going on. Bringing everyone (users, managers, IT professionals, and more) into the loop can create significant benefits as well as reduce the tendency to fall back into discipline-based silos. This is where the integration between monitoring and service desk truly happens, and these concepts deliver the most critical, central themes discussed throughout this book. It's all about communication: ways to better achieve communication as well as create opportunities for continuous improvement.

Users sometimes perceive their IT department as out-of-touch, ivory-tower geeks with poor people skills. Whether or not that's true depends on the actual IT team members, but the perception, fair or not, often exists. That's because IT can too often be the last ones to know about things that users perceive as problems. "Sure, the server might be humming along within specs, but the order entry application is incredibly slow." "IT says that email is working fine, but I've been waiting on an incoming purchase order for an hour. The email system can't possibly be working correctly!"

IT has its own unique problems to deal with, and they sometimes involve a disconnect with management. Finding windows in which to make approved changes, for example, can be incredibly tricky. Simply coordinating the changes that are proposed, approved, under development, ready for implementation, and so forth can be difficult. Many organizations have adopted change management frameworks, such as those proposed by ITIL, that outline specific processes for reviewing and approving changes. Physically coordinating that process, however, can seem like herding cats. It's even worse when IT has been divided into silos: The database team might have a change scheduled for tonight, but that change is going to conflict with the power supply changes being implemented by the data center team. We need to get everyone on the same page.

Starting the Loop: Connecting Monitoring to the Service Desk
Most organizations today have a ticket-based system for coordinating IT activities. These organizations also usually have monitoring systems in place to watch their IT systems and alert them to any problems. Too few organizations, however, have connected these two systems. Ideally, that's what you want: A single, integrated IT management system that can detect problems and then automatically open tickets for the appropriate individuals. If the email server is down, the appropriate administrator should get a ticket. Those tickets, of course, should include notifications via text message, email, or whatever other medium is appropriate so that alerted individuals know they have an alert.

That auto-assignment (you might even choose to call it auto-routing) of tickets needs to be pretty intelligent. Different systems, in different locations, at different times, all might change how the ticket is created, thus changing who is assigned to work the problem. Tickets should be as complete as possible, meaning as many fields as possible should be automatically populated; you shouldn't have to rely on a Help desk, or someone else, to fill in the details. Those details might include the affected server's information. Figure 3.1 shows what this kind of auto-generated ticket might look like, with several key bits of information pre-populated by the system.

Figure 3.1: Automatically generated tickets in response to alarms.
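To make the auto-routing idea concrete, here is a minimal sketch of an alarm being turned into a pre-populated, pre-assigned ticket. Everything in it is an assumption made for illustration: the `Alarm` and `Ticket` shapes, the routing table, the team names, and the business-hours rule are hypothetical and do not describe any particular service desk product.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Alarm:
    """What a monitoring system might report (hypothetical fields)."""
    system: str
    location: str
    raised_at: time
    detail: str


@dataclass
class Ticket:
    """A ticket with its key fields filled in automatically."""
    summary: str
    system: str
    location: str
    assignee: str


# Assumed routing rules: (system, location) -> team, used during business hours.
ROUTES = {
    ("email", "HQ"): "messaging-team",
    ("database", "HQ"): "dba-team",
}
BUSINESS_START, BUSINESS_END = time(8, 0), time(18, 0)
ON_CALL = "on-call-engineer"   # who gets after-hours alarms
FALLBACK = "service-desk"      # who gets alarms no rule matches


def open_ticket(alarm: Alarm) -> Ticket:
    """Create a ticket from an alarm; system, location, and time pick the assignee."""
    if BUSINESS_START <= alarm.raised_at < BUSINESS_END:
        assignee = ROUTES.get((alarm.system, alarm.location), FALLBACK)
    else:
        assignee = ON_CALL
    summary = f"{alarm.system} alarm at {alarm.location}: {alarm.detail}"
    return Ticket(summary, alarm.system, alarm.location, assignee)
```

For example, `open_ticket(Alarm("email", "HQ", time(10, 0), "server down"))` would be assigned to the hypothetical `messaging-team`, while the same alarm raised at 2:30 a.m. would go to the on-call engineer. The technician receives a ticket that already names the system and location, with nobody at the Help desk typing those details in.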

The idea is to have a service desk solution (that's the software that helps coordinate and manage IT activities, often through tickets) working with the monitoring solution, thus creating a truly integrated response to IT problems.

This is all intended to provide specific benefits. First and foremost is faster problem resolution. By not waiting for users to inform you of a problem, you're getting started on solving the problem faster. By having pre-populated tickets, the IT team is able to work more quickly because they're starting with more information.

There's a bit more depth that can be added, if you have the right service desk software in place. Frameworks like ITIL encourage root-cause analysis, meaning your team should focus not only on solving today's specific problem but also on making the overall environment more stable and problem-resistant. To that end, a service desk solution can define two types of problems: global issues and specific incidents.

Specific incidents might be day-to-day problems like, "Email moving slowly throughout the organization," "Order entry application operating slowly," and so forth. Those might all be tied to a global issue of "Unexplained network slowdowns," which could be examined and solved, perhaps by locating a router that was overheating and dropping more packets than usual.

Sometimes, specific incidents might not be entirely solved until the overarching global issue is solved. By tracking those individual incidents along with the global issue, you can help keep your users and managers more informed. For example, once that overheating router is discovered and replaced, everyone affected by an associated specific issue could be notified: "Hey, we think we've found the root cause for all the slowdowns, so things should be better from here on out." Figure 3.2 shows how a single global problem can be attached to multiple incidents.

Figure 3.2: Relating multiple incidents to a single problem.

I've used a couple of key words in the foregoing discussion and want to take a moment to specifically define them in the context of this book:

• An incident is something that happens in the environment, such as a failed server or a slow application.

• IT staff create problem records to help manage the incident. Problems may in fact be associated with multiple incidents, as in the case of that overheating router, which caused multiple disparate failures throughout the environment.

I'm going to start using those two terms more consistently from here on. Hopefully, some of the benefits of combining monitoring with problem-solving will become clear. For example, more simplistic Help desk solutions allow multiple tickets to be opened against what is essentially the exact same issue. That can result in a lot of duplicated effort, as multiple IT team members attempt to work the issues on their own. It can also result in a lot of paperwork because solving the root cause then requires technicians to spend time laboriously closing each ticket. With a more sophisticated system in place, everything can be consolidated into a single, managed problem record. Doing so creates additional benefits, such as identifying solutions or workarounds, which I'll discuss in upcoming chapters.
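The relationship between incidents and a problem record can be sketched as a simple one-to-many data structure. This is an illustration only; the class names and the rule that resolving a problem resolves every attached incident are assumptions made for the example, not a description of how any given service desk behaves.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Something that happened in the environment, as reported by a user."""
    summary: str
    reported_by: str
    resolved: bool = False


@dataclass
class Problem:
    """A single managed record that several incidents can be attached to."""
    summary: str
    incidents: list[Incident] = field(default_factory=list)

    def attach(self, incident: Incident) -> None:
        self.incidents.append(incident)

    def resolve(self) -> list[str]:
        """Close every attached incident and return who should be notified."""
        for incident in self.incidents:
            incident.resolved = True
        return sorted({incident.reported_by for incident in self.incidents})
```

With this shape, replacing the overheating router means calling `resolve()` once on the "Unexplained network slowdowns" problem. Every related incident closes together, and the returned list says exactly who should get the "we think we've found the root cause" message.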

Problems and incidents, however, aren't the only reason that users interact with IT. Hopefully, they're not even the major reason your users interact with IT! Aside from reporting incidents, users also need to request routine services: advice, new hardware requests, routine change requests, access requests, and so forth. These interactions should be managed through a more formal workflow in which users submit their request, have it assigned to the appropriate technician after being approved, and are able to track the status of their request. For example:

1. A user might visit a Web site to browse a catalog of items they can request, such as access to systems, changes to hardware, and so forth.
2. A user selects an item from the catalog, and provides whatever details are necessary to complete the request.
3. A ticket is created in the service desk that represents the user's request. Depending upon the request, the ticket might first be routed to the user's manager for approval.
4. Once approved, the ticket would be automatically routed to the appropriate technician or IT team for completion.
5. The user would receive status updates, perhaps via email, throughout this process, keeping them informed of its progress. The status updates would include a "completed" update once the request was finished.

By using the same ticket-based system employed for problem-solving to address routine requests, IT technicians can rely on a single interface to manage their workload. Figure 3.3 shows what a routine request ticket might look like.
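The five steps above amount to a small state machine: a request is submitted, optionally waits for approval, is assigned, and is completed, with a status update recorded at each transition. The sketch below is one hypothetical way to model that. The state names and the `Request` class are assumptions, and the `updates` list stands in for the emails a real service desk would send.

```python
from enum import Enum


class State(Enum):
    SUBMITTED = "submitted"
    PENDING_APPROVAL = "pending approval"
    ASSIGNED = "assigned"
    REJECTED = "rejected"
    COMPLETED = "completed"


class Request:
    """A routine service request moving through an assumed approval workflow."""

    def __init__(self, user: str, item: str, needs_approval: bool):
        self.user = user
        self.item = item
        self.updates: list[str] = []   # stand-in for status emails to the user
        self._move(State.SUBMITTED)
        # Step 3: some requests go to the user's manager before anyone works them.
        self._move(State.PENDING_APPROVAL if needs_approval else State.ASSIGNED)

    def _move(self, state: State) -> None:
        """Change state and record a status update for the requester."""
        self.state = state
        self.updates.append(f"{self.item}: {state.value}")

    def decide(self, approved: bool) -> None:
        """Step 4: record the approver's decision and route accordingly."""
        if self.state is not State.PENDING_APPROVAL:
            raise ValueError("request is not waiting for approval")
        self._move(State.ASSIGNED if approved else State.REJECTED)

    def complete(self) -> None:
        """Step 5: the technician finishes the work."""
        if self.state is not State.ASSIGNED:
            raise ValueError("only an assigned request can be completed")
        self._move(State.COMPLETED)
```

Guarding each transition keeps a request from being completed before it is approved, which is the kind of consistency a formal workflow is meant to provide.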

Figure 3.3: Routine requests can also be made into tickets.

Even better, IT management can rely on all IT work being documented and tracked in a single system, enabling management to stay informed through reports, dashboards, and other mechanisms. Figure 3.4 shows an example of what such a report might look like.

Figure 3.4: Management reports become more effective when they include all IT workload.

The idea is to keep everyone in the loop: users remain informed, IT remains informed, management remains informed. Much of the burden of keeping everyone informed is handled by the software, which can send email updates and other kinds of notifications so that everyone is aware of what's happening at all times.

Making Changes: How to Find a Change Management Window
Large, multi-discipline IT departments have inherent problems. In the previous chapter, I discussed the problem of silo-based problem-solving, where domain experts spend time passing a problem back and forth because everyone is looking at different tools and data to determine whether the problem is theirs. We're certainly not going to get rid of domain experts, so the solution is to get tools that could put everything into a single console in order to unify everyone's efforts.

Another problem created by those silos relates to change management. At the start of this chapter, I outlined one of those problems: The database team is ready to implement a change, but it's going to be in conflict with a change being implemented by another group. Managing change windows is becoming increasingly difficult. Not only are applications and services needed round the clock, creating tiny change windows in the first place, but the varying needs of different experts create contention for those already-small windows.

"Boss, we'd have that fix in place, but we can only implement it at night. It's going to take 4 hours, which just fits inside the window management allows us. But all this week, other teams have been using the window, and the changes they're making are blocking us from doing anything at the same time." It's not an unusual situation. It gets tough for management to even track what changes are pending and to slot them into the shrinking time that's available to make them.

The lack of visibility into these windows, and the contention for them, makes it impossible to even make a management decision. For example, if management could see the number of changes stacked up, and see the contention, they might decide to expand the window for a period of time in order to get the changes implemented. They might not decide to do that, but they'd be consciously making a decision rather than remaining ignorant of the actual problem.

The solution, of course, is software that facilitates the coordination of departments. Think about it: If you're using a service desk solution to track tickets, then tickets can be created for proposed changes. Those tickets would be assigned to a technician, routed for reviews and approvals, and so forth, all via some workflow you designed. That's an excellent way to support ITIL processes, by the way. The tickets themselves can then feed a unified calendar, built right into the service desk, which allows change planners to schedule activities. They can see agreed maintenance windows, manage contention between conflicting changes, and so forth. By getting this information into a familiar calendar form, they can also make decisions about whether to widen maintenance windows if doing so is necessary and beneficial to the organization. Figure 3.5 shows a change management calendar.

Figure 3.5: Managing change schedules in a calendar view.

This is just another way to help keep everyone in the loop. Management now has a clear visual depiction of change and schedule contention. Such a calendar could even be made available to users so that they could see what changes were scheduled and plan their own activities accordingly.
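To illustrate how a change calendar might flag contention between scheduled changes, here is a minimal Python sketch. The ticket IDs, team names, dates, and the find_contention function are all hypothetical; a real service desk product would draw this data from its own change tickets and may detect conflicts quite differently.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical change tickets: (ticket id, team, start time, duration in hours).
CHANGES = [
    ("CHG-1", "database", datetime(2024, 1, 8, 22, 0), 4),
    ("CHG-2", "network", datetime(2024, 1, 8, 23, 0), 2),
    ("CHG-3", "storage", datetime(2024, 1, 9, 22, 0), 3),
]


def find_contention(changes):
    """Return pairs of ticket ids whose scheduled times overlap."""
    conflicts = []
    for change_a, change_b in combinations(changes, 2):
        id_a, _, start_a, hours_a = change_a
        id_b, _, start_b, hours_b = change_b
        end_a = start_a + timedelta(hours=hours_a)
        end_b = start_b + timedelta(hours=hours_b)
        # Two intervals overlap when each one starts before the other ends.
        if start_a < end_b and start_b < end_a:
            conflicts.append((id_a, id_b))
    return conflicts


print(find_contention(CHANGES))
```

A planner could use a list like this either to reschedule one of the conflicting changes or to make the case for a wider maintenance window.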

Communicating: How to Bring Users into the Loop
The idea of keeping users informed certainly isn't new, but many organizations that have attempted to better engage their users haven't met with unqualified success. Too often, "keep users in the loop" solutions take the form of self-service Web portals, where users can log in to check the status of their tickets or to check the status of a particular service. That's all well and good, but Web portals like that don't always fall within the natural workflow of a user. For example, most users, when confronted with some kind of problem, don't necessarily think to check a Web site and see if something's wrong; they call the Help desk.

Users do, however, spend a lot of time in their email inbox. Why not make that your channel for communication? Organizations don't use this method of communication in part because doing so could easily become a time burden for your IT team. "So on top of solving the problem, I have to send out hourly update emails with the status of the problem? Sounds like a Dilbert cartoon!"

In reality, a good service desk solution can do it for you. Sending an email update when a user's ticket is updated, for example, is an easy operation for a piece of software. Such emails can be informative, and help users feel comfortable that their request is being handled. Figure 3.6 shows what one might look like.

Figure 3.6: Keeping users informed with detailed emails.

What's more compelling is a service desk solution that can actually accept requests via email rather than expecting users to go to a self-service Web portal and open a ticket. Face it: Your users are more likely to pick up the phone than visit a Web site, unless you've placed significant artificial barriers in the way, like complex voice menus in the phone system. Users are more likely to send an email. If your service desk, rather than a human technician, can receive those emails and use them to create a ticket, you've truly created a system your users are likely to embrace. Such tickets could still be auto-assigned and routed, helping the right technician to start working the problem more quickly.

Even for your users' routine, non-problem requests, email updates can be valuable. When their request is approved, rejected, underway, completed, and so forth, an email update helps keep users informed without additional human effort.

Note
I want to emphasize that self-service portals are a good thing. They can provide a rich user experience, help guide users to self-service solutions, and more. They just shouldn't be the only means of communicating with users.
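As a rough illustration of turning an incoming email into an auto-assigned ticket, the following Python sketch uses the standard library's email package. The routing table, queue names, ticket fields, and sample message are assumptions made up for this example; they do not describe any particular service desk product.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical routing table: keyword found in the subject -> support queue.
ROUTES = {"password": "identity-team", "printer": "desktop-team"}


def ticket_from_email(raw_message):
    """Build a simple ticket record from a raw, single-part email message."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    _, sender_address = parseaddr(msg.get("From", ""))
    # Auto-assign on the first matching keyword; otherwise use a default queue.
    queue = next(
        (q for keyword, q in ROUTES.items() if keyword in subject.lower()),
        "service-desk",
    )
    return {
        "requester": sender_address,
        "summary": subject,
        "details": msg.get_payload().strip(),
        "assigned_to": queue,
        "status": "open",
    }


RAW = """From: Pat Example <pat@example.com>
To: helpdesk@example.com
Subject: Printer on floor 3 is jammed

It has been jammed since this morning.
"""

print(ticket_from_email(RAW))
```

The same ticket record could then drive the status update emails described earlier, so the user never has to leave the inbox.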

SLAs: Setting and Meeting Realistic Expectations
Unless you've been living under a rock for the past decade or so, Service Level Agreements (SLAs) are probably pretty familiar to you. These are, in their simplest form, an agreement by the IT team to provide a specific level of performance or availability for a specific service or application. "The email service will be available 99.999% of the time on an annualized basis" is an example of a very simple SLA.

But SLAs can get complicated quickly. You can't just pull a number out of thin air; what level of service can you reasonably provide? What level of service have you historically provided, and is that meeting the business' needs? Once established, how do you track the SLA to make sure you're actually meeting it, and ideally get some kind of notification when you're in danger of breaking the agreement?

SLAs might not be the only type of agreement you need to define and track. Some organizations also use underpinning contracts (UCs) or operational level agreements (OLAs) for different in- and outsourced services; these often support SLAs.

A well-built service desk and monitoring solution can help you handle these agreements more precisely. You'll start by defining top-level SLAs, then creating and managing UCs and OLAs as appropriate.

Once defined, the solution should be able to track ongoing performance and availability, perhaps offering a simple dashboard, like the one shown in Figure 3.7, that illustrates your compliance with your SLAs. You might also have more comprehensive and detailed reports on SLA metrics.
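To make the 99.999% example concrete, it helps to translate an availability percentage into a downtime budget before committing to it. The short Python sketch below does that arithmetic; the function names are illustrative only, and a 365-day year is assumed.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct, days=365):
    """Downtime budget, in minutes, implied by an availability target."""
    total_minutes = days * MINUTES_PER_DAY
    return total_minutes * (1 - availability_pct / 100)


def measured_availability_pct(downtime_minutes, days=365):
    """Availability actually delivered, given the observed downtime."""
    total_minutes = days * MINUTES_PER_DAY
    return 100 * (1 - downtime_minutes / total_minutes)


# "Five nines" leaves a little over five minutes of downtime per year.
print(round(allowed_downtime_minutes(99.999), 2))
# "Three nines" leaves roughly eight and three-quarter hours per year.
print(round(allowed_downtime_minutes(99.9) / 60, 2))
```

Comparing the budget with what you have historically delivered is one way to check whether a proposed number is realistic rather than pulled out of thin air.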

Figure 3.7: Managing SLAs with at-a-glance dashboards.

Most importantly, however, the solution needs to provide you with the ability to define rules for your SLAs so that tickets can be created and auto-assigned to the appropriate technicians when SLAs are in danger of being broken. Further, the solution should support escalation rules so that if an SLA that is in danger of being broken is not corrected within a certain amount of time, the solution can automatically call for backup, summoning additional technicians, notifying management, and so forth.

There's also a strong need to recognize that no SLA is perfect. Sometimes, for whatever reason, the business will decide to take a service offline. Perhaps it's for a software upgrade or for some kind of infrastructure maintenance. In those cases, you're not breaking the SLA; you're agreeing, along with whatever part of the business will be affected, to temporarily suspend the SLA to get the work done. A service desk solution should support these types of exceptions, including SLAs that are only valid during certain hours, holiday exceptions, agreed-upon reduced service windows, maintenance windows, and so forth.

The idea is to automate SLA definition and management and to automate the notifications that go with SLAs. If an SLA is broken, you might agree that the affected business users will receive an automatic notification. That lets them know that IT knows about the problem and is working on it without forcing users to visit a self-service portal and open a ticket. That kind of proactive response can go a long way toward improving IT-user relationships, and in helping IT be viewed as responsive to, and supportive of, business requirements.
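The rule, escalation, and exception behavior described above can be sketched as a small decision function. This is only an illustration under assumed rules: the 30-minute escalation threshold, the action names, and the sla_action function are invented for the example, and a real solution would let you configure all of them.

```python
def sla_action(minutes_at_risk, in_agreed_window, escalate_after=30):
    """Decide what a hypothetical SLA rule engine should do right now.

    minutes_at_risk  -- how long the SLA has been in danger of being broken
    in_agreed_window -- True during maintenance or other agreed exceptions
    escalate_after   -- assumed number of minutes before calling for backup
    """
    if in_agreed_window:
        # The business agreed to suspend the SLA, so nothing is being broken.
        return "suspended"
    if minutes_at_risk <= 0:
        return "ok"
    if minutes_at_risk >= escalate_after:
        # Not corrected in time: summon more technicians, notify management.
        return "escalate"
    # Newly at risk: create a ticket and auto-assign it.
    return "open_ticket"


for minutes, window in [(0, False), (5, False), (45, False), (45, True)]:
    print(minutes, window, sla_action(minutes, window))
```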

Tell Me What You Really Think
IT managers like IT to think of users as customers. In some cases, your users might actually be customers, in the sense of sending you a check for specific services you provide: customers in the literal sense. In other cases, your users might be internal users but still customers in the sense that they consume services that you, the IT department, provide, and that you get paid for your efforts.

A big problem that IT has always struggled with is its perception by its customers. Do customers think you're doing a good job? What is "a good job"?

For this reason, monitoring End-User Experience (EUE) metrics, which I discussed in the first chapter, has become a hot trend in the IT industry. You might see that your server's performance is within norms, but by the time you throw in old client computers, routers, network cabling, and everything else involved in delivering a service to users, they have a completely different perception of the server's performance. Measuring the EUE is a way to get some insight into that aggregate perception that your users (your customers, if you prefer) deal with.

Businesses have traditionally used another important technique to discover their customers' perceptions: surveys. Phone your credit card company, and the robot who answers the phone might inform you that you've been selected to participate in a short satisfaction survey, which will begin when you finish speaking with the agent who is about to come on the line. Walk into a theme park, and a smiling employee with a tablet computer asks you a few questions. Look at the register receipt from your last purchase, and you might find that you're eligible to win a gift card or other prize if you complete an online survey about your shopping experience.

Surveys are an effective way of finding out what users really think, and a good service desk application should provide you with the ability to survey your customers. Perhaps you want to ask them their opinion after each request that's completed. Maybe you want to be a bit less intrusive, and only survey them after every 3 or 4 requests. Whatever you decide, a service desk solution should be able to automate the process. You might even want to engage customers in ad-hoc surveys to further your understanding of their perceptions about day-to-day performance, availability, service levels, and so forth.

Of course, surveys are useless without the ability to aggregate the data and see how you're doing. The back end of a survey system must include reporting capabilities, perhaps with charts and graphs, that help you visualize your customers' perception of your service. Compare this report to your SLA compliance report: do you see any differences? If your SLA shows that you're doing a great job, and your customer surveys aren't so glowing, then maybe your SLAs aren't set at the right levels, or maybe your SLAs aren't the only metric you should be looking at.

I've worked with a number of customers who have found themselves in exactly that situation: "Our SLAs are all being met, every day, but our users still don't think we do a good job. What's the problem?" We discovered the answer with a few ad-hoc surveys that touched on soft issues, such as the IT team's attitude when helping users. Turns out that the team came across as brusque and sometimes rude. We spent some time with the team, and discovered that they felt an incredible amount of pressure because of the number of tickets assigned to them. In the end, the company developed internal metrics to track each IT member's workload and efficiency, and worked to bring each person's workload to a more manageable level while continuing to survey those soft issues such as attitude.

The moral of the story is that SLAs aren't the only metric you need to concern yourself with, and integrated surveying can help reveal critical information to help pinpoint overall service problems.
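Two of the ideas in this section lend themselves to a small sketch: surveying only after every few requests, and comparing survey results with SLA compliance. The function names, the 1-to-5 rating scale, and the sample numbers below are assumptions for illustration only.

```python
from statistics import mean


def should_survey(completed_requests, every=3):
    """Survey a user only after every Nth completed request."""
    return completed_requests > 0 and completed_requests % every == 0


def perception_gap(sla_compliance_pct, survey_scores, scale_max=5):
    """Difference between SLA compliance and satisfaction, both as percentages.

    A large positive gap suggests the SLAs are being met while users remain
    unhappy, so the SLAs may be measuring the wrong things.
    """
    satisfaction_pct = 100 * mean(survey_scores) / scale_max
    return sla_compliance_pct - satisfaction_pct


print([n for n in range(1, 10) if should_survey(n)])
print(perception_gap(100, [3, 3, 3]))
```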

When Everyone Doesn't Need to See Everything: A Multi-Tenant Approach
Multi-tenant is a growing trend amongst IT solution vendors, and for good reason. Obviously, service providers operate the very definition of multi-tenant systems. If you're a service provider, or perhaps more specifically a Managed Service Provider (MSP), then you know the importance of having tools that can be customized and partitioned for each of your customers. Customer A wants these dashboards, while Customer B wants those. Customer B certainly doesn't want to see Customer A's tickets (and Customer A doesn't want Customer B to see them!). In the past, it's been pretty common for such multi-tenant features to only be present in solutions that were designed for MSPs.

Today, however, that's changing. Large, multi-divisional companies want to deploy solutions that can serve all of their divisions' needs without necessarily deploying a unique solution for each division. That's where multi-tenancy can help, enabling a single solution to be customized, partitioned, and presented to each division as if they were the only ones using the solution, when in fact the solution is consistently serving everyone. Different divisions can get a different view of just their portion of the environment. For example, Division A might see one dashboard, while Division B sees something completely different.

Again, multi-tenancy isn't something that every single company or organization is going to need. However, it's a nice feature to have in your back pocket if the time comes when you do need it, so be sure to consider this functionality as you're evaluating various solutions, even if multi-tenancy isn't an immediate need. Of course, if you're an MSP, multi-tenancy is definitely a must-have feature.

We're continuing to support this chapter's theme of keeping everyone in the loop: The ability to provide specific, customized, partitioned environments to your varying customers, whether internal or external, helps keep them more informed and more accurately informed.
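The partitioning idea can be shown with a very small sketch: one shared ticket store, where every query is scoped to a single tenant. The Ticket and TicketStore classes and the tenant names are hypothetical, and real multi-tenant products enforce this separation at many more layers (dashboards, reports, permissions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    tenant: str   # the customer or division that owns the ticket
    summary: str


class TicketStore:
    """A single shared store that only ever exposes one tenant's tickets."""

    def __init__(self):
        self._tickets = []

    def add(self, ticket):
        self._tickets.append(ticket)

    def visible_to(self, tenant):
        # Every read is filtered by tenant, so Customer B never sees
        # Customer A's tickets even though both live in the same system.
        return [t for t in self._tickets if t.tenant == tenant]


store = TicketStore()
store.add(Ticket("customer-a", "Mail queue backing up"))
store.add(Ticket("customer-b", "VPN drops every hour"))
print(store.visible_to("customer-a"))
```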

Call It a Private Management Cloud: Allocating Costs
There's one more thing we should look at to keep everyone in the loop, and that's with regard to their costs: The ability to provide customers with detailed reports on their usage of the infrastructure, and to potentially bill them for their usage based upon those reports. Figure 3.8 shows what such a report might look like.

Figure 3.8: Reporting on usage-based metering and billing.

Again, this kind of reporting is an obvious, must-have feature for MSPs, but it has increasing applicability to organizations that deal only with internal customers.

One of the key elements of cloud computing is the concept of billing you based on your actual usage. The cloud provider builds and manages the infrastructure, which is shared amongst their customers. Each customer then pays for the bits they use. That's an obvious and well-understood model for the public cloud, but it's becoming a model for the private cloud as well. Rather than accepting IT as a giant bucket of overhead, companies are looking more and more at allocating IT's costs across the consumers of those IT services. Marketing wants to spin up a dozen virtual Web servers for a new Web site? Okay; do they have the budget to pay for it?

Chargebacks, as they're called, are certainly nothing new. But monitoring and service desk solutions are increasingly able to provide the level of detail that you need to actually make chargebacks work. The technological advancements that have made public clouds possible can be readily integrated into private data centers for the same purposes: billing (or allocating costs) for actual usage.

Tying IT costs directly to the consumers of those IT services is a great approach for helping IT make better business decisions. Rather than putting IT in the role of "gatekeeper" for who can and cannot have specific services, the organization's management gets to decide what money will be spent, by whom, on what services. That's how it should be. In one sense, IT has always been an outsourced activity: Although the IT team might be paid by the organization, they don't materially participate in the organization's actual profit-making activities. They're a separate division. Essentially, the business has outsourced IT (albeit to an internal team), so why not have IT deliver usage-based billing statements just like any other vendor would be expected to do?

It's just another way of keeping everyone in the loop. Even if you don't use your usage-based billing reports for actual billing or chargeback, they're a useful way of helping upper management understand the cost and value of their IT investment. "Yes, you spent a zillion dollars on IT last quarter, but here's why, and here's how that investment was consumed by the organization. If you want to cut back, start by looking at the consumers, and finding ways to make them consume less."
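At its simplest, a chargeback report splits a shared cost in proportion to metered usage. The sketch below assumes a single usage measure (for example, virtual machine hours) and invented department names and figures; real metering typically combines several resources with different rates.

```python
def allocate_costs(total_cost, usage_by_consumer):
    """Split a shared cost across consumers in proportion to their usage."""
    total_usage = sum(usage_by_consumer.values())
    if total_usage == 0:
        raise ValueError("no usage recorded, nothing to allocate against")
    return {
        consumer: round(total_cost * usage / total_usage, 2)
        for consumer, usage in usage_by_consumer.items()
    }


# Hypothetical quarter: 10,000 in infrastructure cost, usage in VM-hours.
USAGE = {"marketing": 120, "finance": 60, "engineering": 20}
for department, cost in allocate_costs(10_000, USAGE).items():
    print(f"{department}: {cost:.2f}")
```

Even when no money changes hands, a breakdown like this shows management which consumers drive the spend.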

Conclusion
This chapter has been all about keeping people in the loop when it comes to IT management. From keeping users more updated and engaged in the IT process, to keeping technicians more connected to ongoing events, to keeping management more informed so that they can make better decisions, it's been about communications. There's very little I've discussed in this chapter that any organization couldn't start doing today, if they were willing to expend enough effort. The key, however, is in accomplishing these goals with little or no effort, by using a system of integrated software tools that understand how to do these things for you.

Coming Up Next
In the next chapter, we're going to look at a challenge that's become more and more common in IT: key services and IT elements exist outside the data center. Yes, you can call it "the cloud," or you could simply call it "outsourced services." Whatever you call them, they're still critical to the business, and you need to treat them the same way you treat all of the in-house services. You can't treat them as a separate silo, because then you'll be forcing yourself to manage them differently. Of course, monitoring outsourced services is a whole different ballgame than managing in-house services, so we'll need to find some clever solutions.

Chapter 4: Monitoring: Look Outside the Data Center
IT has moved beyond our own data centers, and nearly every organization has at least one or two outsourced IT services. Although we're probably always going to have on-premise assets to manage and monitor, we need to realize that in most cases, monitoring has to start outside the data center, both in the sense of accommodating off-premises services and in the sense of focusing more closely on what end users are actually experiencing.

Monitoring Technical Counters vs. the End-User Experience
The traditional IT monitoring approach is what I call inside-out: It starts within the data center and moves outward toward the end user. Figure 4.1 provides a visual for this idea, illustrating how typical monitoring focuses on the back end: database servers, application servers, Web servers, cloud services, and so forth. The general reason for this approach is that we have the best control and insight over what's inside the data center. If everything inside the data center is running smoothly, it stands to reason that the end users who consume the data center's services will be happy.

Figure 4.1: Monitoring from the data center outward.

Most Service Level Agreements (SLAs) derive from this approach: We promise a certain amount of uptime, and we set up monitoring thresholds around data center-centric measurements like CPU utilization, network utilization, disk utilization, and so forth. We also tend to look at low-level response times: query response time, disk response time, network latency, and so on.

There's something deeply and inherently inaccurate about the underlying assumption of this approach: Even if you start with a perfect pile of bricks, there's no guarantee that you're going to end up with a stable building in the end. In other words, what end users experience isn't merely the sum of the data center's various metrics. A smoothly running data center usually leads to satisfied users, but that isn't always the case.

It's obviously important for us to continue monitoring these data center-centric measurements, but those can't be the only thing we monitor and measure. Current thinking in the industry is that we need to more directly measure what the end user experiences. In fact, end-user experience, or EUE, has become a common term in more forward-thinking management circles.

Here's another way to think of it: Suppose you go to a restaurant to eat. Your steak comes out cooked wrong, they brought the wrong side items, and the waitress is rude. The manager, standing back in the kitchen, thinks everything is fine: the steaks are hot, the veggies are hot, and the waitress smiles at him every time she goes back there. He's focused on the back end, with no knowledge of your expectations. Restaurants address this by having the manager periodically roam around and ask, "Is everything okay?" That's monitoring the EUE: Rather than looking at his back-of-house metrics, he's going out into the cube farm (er, restaurant floor) and testing the waters.

How the EUE Drives Better SLAs
You establish metrics for what the EUE should be for various operations: so many seconds to complete such-and-such a transaction, and so forth. When that metric isn't met, you start drilling down into the infrastructure to find out why. That's where more traditional data center monitoring re-enters the picture. Rather than using query response time or whatever to derive the end user's experience, we're using it to troubleshoot things when the end user's experience is clearly not where we need it to be. Figure 4.2 shows how EUE monitoring sort of reverses the model.

Figure 4.2: Monitoring the EUE.

You'll still have thresholds and other considerations, but they're set at levels that have historically been able to deliver an acceptable EUE. As Figure 4.3 shows, a failed EUE is your cue to start looking at deeper, more technical-level measurements so that you can see what's contributing to the end-user problem.

Figure 4.3: Tracking the cause of a poor EUE.

In reality, it doesn't always take a major change in the back end to cascade into a real problem for the EUE. A database server's response times slow by a millisecond or two, resulting in an application server taking an extra half-second to process a transaction, resulting in a front-end server taking an extra second to present the next screen of information, resulting in the user's client application taking a couple of extra seconds. Add up those couple of extra seconds over the course of a day, and you've lost an hour or so, and told a lot of customers, "Sorry this is taking so long, the computer is slow today."

In Figure 4.3, both an internal database server and a cloud computing service are responding slowly (indicated by the red flags). Neither one might be alarming in and of itself, but together they're combining to form a noticeable problem for the end user. Normally, a minor fluctuation on the database server might not raise any alarms. It's the cascade of effects that results in a poor EUE. Once we definitively know (because we've been watching it) that the EUE is declining, we can start looking for causes. Because we're looking for a problem, rather than just routinely monitoring, that minor back-end performance decrease will be more noticeable.
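The "lost an hour or so" claim is easy to check with a little arithmetic. The sketch below uses the per-tier delays from the example above; the figure of 1,000 transactions per day is an assumption chosen only to show the scale, and how the delays accumulate in a real system will vary.

```python
# Extra delay per transaction at each tier, in seconds, taken from the
# example in the text (2 ms, half a second, one second, a couple of seconds).
EXTRA_SECONDS = {
    "database server": 0.002,
    "application server": 0.5,
    "front-end server": 1.0,
    "client application": 2.0,
}


def daily_time_lost_minutes(extra_seconds_by_tier, transactions_per_day):
    """Total time lost per day, in minutes, if the delays simply add up."""
    per_transaction = sum(extra_seconds_by_tier.values())
    return per_transaction * transactions_per_day / 60


# Assumed workload: 1,000 transactions per day.
print(round(daily_time_lost_minutes(EXTRA_SECONDS, 1000), 1), "minutes per day")
```

No single tier looks alarming on its own, yet at this assumed volume the combined delay approaches an hour a day.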

The ability to measure the EUE lets you create much more realistic SLAs. Instead of telling users, "We'll guarantee a query response time of 2 seconds," you tell them, "Such-and-such a transaction will take no more than 3 seconds to process." That's something an end user can monitor for themselves: Click enter, and count "one one-thousand, two one-thousand, three one... ah, it's done." That kind of SLA sets an expectation that users can relate to. They'll know when the system is slow because they're measuring the same thing you are. Ideally, you'll know of slowdowns before the user, or at close to the same time, because you'll have tools in place to monitor things from the user's perspective.

How It's Done: Synthetic Transactions, Transaction Tracking, and More
This kind of monitoring isn't always easy. It's possible to throw monitoring agents onto end-user computers when they're all employees, but what about a Web application, where the end users are actually external customers? They probably wouldn't be excited about having you install monitoring agents on their computers just to track your application's performance.

Instead, modern monitoring tools rely on techniques like transaction tracking. With this technique, monitoring components on your end watch an individual transaction as it flows through your systems, literally measuring the time it takes the transaction to be processed. This can be done at a variety of levels of detail. For example, tools usually associated with software performance profiling can get very detailed, tracking transactions through individual software modules. At a higher systems management level, you might just track the transaction's start-to-finish time.

Often, this is also done by inserting synthetic transactions into the system. Essentially, a monitoring system pretends to be a client, then inserts transactions into the system that will be processed but then later ignored. These allow the monitoring system to more precisely figure out the actual time-to-complete for various transactions. This idea is illustrated in Figure 4.4.

Figure 4.4: Using transaction tracking to monitor the EUE.

There are a lot of variations on these techniques, and a lot of specialized tools that you can acquire to actually implement them. In the end, though, it's important to remember that the entire activity is designed to measure just one thing: the EUE. You're not, at this point, trying to figure out what individual systems' performance is or trying to track down the root cause of the problem. You're simply trying to determine whether there is a problem.
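A synthetic transaction check boils down to "act like a client, time the whole thing, compare against the EUE target." The Python sketch below shows that shape only. The fake_checkout function is a stand-in that merely sleeps; in practice it would be replaced by a real scripted request against your application, and the 3-second threshold is borrowed from the SLA example earlier.

```python
import time


def run_synthetic_transaction(transaction, threshold_seconds):
    """Time one start-to-finish run and report whether it breached the target."""
    start = time.perf_counter()
    transaction()
    elapsed = time.perf_counter() - start
    return {"elapsed": elapsed, "breached": elapsed > threshold_seconds}


def fake_checkout():
    # Hypothetical stand-in for a scripted client transaction.
    time.sleep(0.05)


result = run_synthetic_transaction(fake_checkout, threshold_seconds=3.0)
print(f"took {result['elapsed']:.3f}s, breached: {result['breached']}")
```

Note that the result says only whether there is a problem; finding where the time went is the job of the drill-down monitoring discussed next.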

Top-Down Monitoring: From the EUE to the Root Problem
The EUE is intended to be an extremely high-level diagnostic; it tells you that something is wrong. It won't tell you what. For that, you're going to need to go back to the traditional monitoring you've always known and loved, only this time you won't just be watching in a vacuum: You'll be looking for a problem that you know exists.

This is not the time to pull out the domain-specific monitoring tools we've discussed in prior chapters. You still want to stick with a monitoring system that can monitor everything in a single pane of glass. That doesn't necessarily mean a framework that's aggregating domain-specific tools, either; it means a monitoring system designed specifically to look at each of your systems. With the right understanding of what performance should look like at each component level, such a system can quickly tell you where the problem lies. Then you can dig out the domain-specific tools to troubleshoot the particular problem, again with the knowledge that there is a problem, and that the component you're looking at is the one causing the problem.

Deriving the EUE
So why can't you simply use better thresholds on your back-end monitoring to figure out when the EUE is declining? Because EUE focuses on the entire system. The database can be slower, provided that time doesn't cascade through the rest of the system. A slow router doesn't necessarily mean a slow EUE, although in combination with other factors, it might be the tipping point. That's why you need to look directly at what end users are experiencing, then go looking for the root cause.
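Once the EUE has failed, the drill-down amounts to comparing each component against what its performance "should look like." The sketch below ranks components by how far they have drifted from a baseline. The component names, response times, and the 25% tolerance are all invented for illustration; how a real monitoring system models baselines is product-specific.

```python
def rank_suspects(current_ms, baseline_ms, tolerance=1.25):
    """List components running slower than baseline, worst offender first.

    tolerance is the assumed ratio of observed to baseline response time
    above which a component is worth a closer look.
    """
    suspects = []
    for component, observed in current_ms.items():
        ratio = observed / baseline_ms[component]
        if ratio > tolerance:
            suspects.append((component, round(ratio, 2)))
    return sorted(suspects, key=lambda item: item[1], reverse=True)


# Hypothetical response times in milliseconds.
BASELINE = {"database": 20, "cloud service": 200, "web server": 50}
CURRENT = {"database": 30, "cloud service": 400, "web server": 52}

print(rank_suspects(CURRENT, BASELINE))
```

The ranked list tells you which domain-specific tool to reach for first.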

Agent vs. Agentless Monitoring
There's a lot of argument in the monitoring industry about the best way to monitor. Do you install agents? Some folks believe so, and feel that agents provide the best and most detailed information. Other folks don't like installing and maintaining agents throughout their environment, and correctly point out that not every component of a system can even have agents installed. Routers, off-premise services, and so forth typically can't support dedicated monitoring software, after all. So one approach is definitely to install agents, as illustrated in Figure 4.5.

Figure 4.5: Monitoring via agents.

You'll typically have those agents reporting back to some centralized monitoring server or system. Depending on your approach, you might have agents installed on every system where they can be installed, potentially even on some end-user computers for spot monitoring (although that's pretty unusual).

Some monitoring solutions will let you get away without installing agents on every system, and might not even need agents on any of your components. As illustrated in Figure 4.6, these solutions typically use external means of picking up on system performance.

Figure 4.6: Agentless monitoring.

Whether agentless monitoring can pick up as much data, or pick up all the data you need, depends a lot on what kind of components are in your network, and what kind of monitoring techniques and technologies are in use. It's a major competitive point between different vendors, so it's something to pay close attention to.

That monitoring provider in Figure 4.6 is my lead-in to a key point about modern applications. You're almost always going to wind up with some kind of hybrid system that relies partly on agents and partly on agentless monitoring because some of what you'll be monitoring won't be in your own data center.

Monitoring What Isn't Yours
The off-premise stuff is where our traditional monitoring falls apart. It's unlikely that Amazon is going to give you detailed performance statistics for their Elastic Compute Cloud (EC2), and it's unlikely that Microsoft would do so for Windows Azure. SalesForce.com isn't going to send you database query response times or Web server CPU utilization. Even the hosting company where you've collocated your own servers isn't going to be sending you detailed data about their routers' dropped-packet percentages, or any other infrastructure-level statistic.

Yet those numbers matter to you. If you have an application that relies on cloud computing components, collocated servers, Software as a Service (SaaS) solutions, or any other outsourced component, then the performance of those components affects your application performance (your EUE). In short, when Amazon feels performance issues, so do your users.

That's where hybrid monitoring enters the picture. As Figure 4.7 shows, it usually takes the form of some external monitoring service, which collects key external performance information from major outsource providers (the cloud components, if you will, with the data collection shown as red lines) and reports back to your central monitoring console, as indicated by the green line.

This is where a lot of what I've been outlining in the previous chapters really starts to come together:

You need both your internal systems and any external components monitored in the same view. There's no way you can treat your systems as systems if you can't get all of the constituent components into a single monitoring space.

The key competitive point for these hybrid monitoring services is the breadth of external components that they can monitor. Make sure you're choosing one that can monitor everything you've got, including all the outsourced dependencies.

Monitoring the EUE becomes important because there may well be a lot of fluctuation with your external services and your use of them. For example, you won't care that Azure is experiencing slow response times during periods when your users aren't relying on that service. You only need to pay attention, and be alerted, when your users are experiencing a problem.

Figure 4.7: Hybrid monitoring.

In fact, monitoring of external, outsourced components is the key concern that makes many organizations feel they can't rely on cloud computing. "How will we manage it?" they ask. "How will we monitor it?" Along with data security, it's probably the biggest question asked when organizations start considering adding the cloud to their IT portfolio. This is how you'll monitor it: Using specialized monitoring services that add the cloud to your single pane of glass. These tools put outsourced components on the same level as your on-premise components, and let you monitor them the same way.
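Conceptually, the "single pane of glass" is one list in which on-premise and external readings sit side by side and are judged by the same yardstick. The sketch below merges two hypothetical feeds and sorts them by how close each component is to its threshold. The field names, component names, and numbers are assumptions for illustration, not any vendor's data format.

```python
def unified_view(internal, external):
    """Merge on-premise and external readings into one list, worst first."""
    rows = [dict(r, source="on-premise") for r in internal]
    rows += [dict(r, source="external") for r in external]
    # Rank by how far each reading is into (or past) its own threshold.
    return sorted(
        rows, key=lambda r: r["response_ms"] / r["threshold_ms"], reverse=True
    )


# Hypothetical readings from an internal collector and an external service.
INTERNAL = [{"component": "database", "response_ms": 40, "threshold_ms": 50}]
EXTERNAL = [{"component": "cloud storage", "response_ms": 900, "threshold_ms": 600}]

for row in unified_view(INTERNAL, EXTERNAL):
    print(row["source"], row["component"], row["response_ms"])
```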

What's interesting is the way in which some vendors are architecting these solutions. Many of them sell on-premise monitoring solutions, which look a lot like what's in Figure 4.7. They actually monitor the cloud components on their end, but deliver that information to you; their solution then collects your on-premise data and presents everything in a consolidated view.

But it doesn't have to be that way. As Figure 4.8 shows, you could also go with a hosted monitoring solution, where your internal performance data is shipped to the cloud (shown by blue lines), combined with performance information on your outsourced components (red lines), and presented in a single view via a Web portal or some other tool.

Figure 4.8: Outsourced, hybrid monitoring.

It's an interesting model because it takes much of the responsibility of monitoring out of your hands, and lets you focus on the services you're delivering to your users. It isn't the right model for every organization, of course, but it's an interesting option.

Critical Capability: You Need to Monitor Everything
The last really important piece of the puzzle is to make sure you're monitoring everything. Every everything. Take a look at Figure 4.9, which is the example system I've been using all along. Is anything missing? If we monitored each of the components shown, in some fashion, would we be monitoring enough?

Figure 4.9: Is this everything you'd need to monitor?

Definitely not. There's a lot missing from this diagram, and it's mostly things that can have a massive impact on performance. Take a look at Figure 4.10; it's a bit more complete.

Figure 4.10: Make sure you're monitoring everything.

Routers. Switches. Firewalls. Proxy servers. Directory servers. DNS, both internal and external. And probably a lot more. If your EUE is declining, you need to be able to find the root cause, and you can only do that if your monitoring solution can see every possible root cause.

This is why a lot of monitoring solutions these days offer automated discovery, in addition to letting you add components on your own. Discovery can find the stuff you're likely to miss because you're not thinking of it as part of the system. Infrastructure elements like routers and switches. Dependencies like directory services and DNS. Potential bottlenecks like firewalls and proxies. It all matters, so it's important that it all get onto that single pane of glass that you use to monitor overall system health.
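One practical use of discovery results is a simple gap check: what did discovery find that is not yet on the monitoring console? The sketch below shows the idea with plain sets. The host names are made up, and the discovery step itself (which real products perform with network scans and similar techniques) is outside the scope of the example.

```python
def unmonitored(discovered, monitored):
    """Return discovered components that are missing from monitoring."""
    return sorted(set(discovered) - set(monitored))


# Hypothetical inventories.
DISCOVERED = ["db01", "app01", "web01", "core-router", "fw-edge", "dns-internal"]
MONITORED = ["db01", "app01", "web01"]

print(unmonitored(DISCOVERED, MONITORED))
```

In this made-up case the gaps are exactly the kinds of components the text warns about: a router, a firewall, and DNS.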

Conclusion
Monitoring is the thing that lets us manage SLAs, lets us spot problems brewing, and lets us keep our systems running the way the business needs them to. But traditional monitoring isn't necessarily the only or best way to go about meeting the business' needs. More importantly, as businesses start to rely more and more on external components in their systems, traditional monitoring just can't get all of the necessary facts into a single view. Hybrid monitoring can. By using a combination of traditional monitoring techniques, cloud-provided performance data, and other techniques, we can get entire systems onto a single view, into a single dashboard, and into a single focus.

Coming Up Next
In the next chapter, we'll address a fundamental problem that all organizations seem to struggle with: repeatability. In other words, once you've solved a problem, how can you solve it more quickly if it happens again in the future? We'll look at turning problems into solutions and improving service delivery in the future.

Chapter 5: Turning Problems into Solutions
The satirical news outlet The Onion recently ran a story related to the economy. In it, the publication claimed that a special kind of scientist called a "historian" was advancing the novel idea of "looking at the past." "Sometimes," one pseudo-historian was quoted, "we can look at how people tried to solve problems which are similar to those problems we are having today. We can look and see how their solutions worked, and that can give us an idea of whether or not the same solution will work for us." Hah!

Although targeted at politicians who seem to keep making the same mistakes over and over, The Onion's jibe is pretty applicable to IT as well. "Look, if this same problem happened 3 months ago, and we solved it then, perhaps we can solve it more quickly now. What, exactly, did we do last time? Maybe doing the same thing again will have the same effect that it did then!"

I'll put it another way: Perhaps you have children, or at least know someone who does. Ever tell a kid not to touch the hot pot that's on the stove? Sure. Did they touch it? Of course. How many times? Usually just once. That's because human beings are designed to learn primarily by making mistakes. Provided we remember the mistake, and that we remember how to avoid it or solve it, we can do so in the future very quickly. Memory becomes the key factor, and as we get older, stop touching hot pots, and start playing with computers at work, it sometimes gets harder to remember. This chapter is all about the final aspect of unified management: Taking problems that we've solved, and turning those into solutions for the future.

Closing the Loop: Connecting the Service Desk to Monitoring
Before we dive into the "memory" aspect of solving problems, we've first got to close the operational loop in our unified monitoring toolset. Earlier in this book, we discussed that one aspect of a unified management system is the ability to monitor devices and services, such as a database server. When a problem condition is detected, the monitoring system creates an alert, which is typically shown on a console, and may involve notifying someone via email or text message. A truly unified system may also open a problem ticket in the organization's IT ticket-tracking system. The ticket enables management to track the problem and its time-to-resolution. It also allows the ticket to be passed to different personnel who collaborate to solve the problem. The ticket can even be pre-populated with information germane to the case, helping the person working the problem to get going more quickly. Figure 5.1 illustrates this first step: The alert showing up in the console, and the ticket being created from that.

Figure 5.1: Getting an alert and opening a ticket.

Eventually, one hopes, the problem will be corrected. At that point, it's common for the person who completed it to close the ticket, marking it as completed.

But what about the alert?

Of course, the real-time monitoring component of the system will realize that the problem no longer exists, but that doesn't necessarily get rid of the alert. Typically, you want alerts left in place until the problem is confirmed as being handled, which means that in addition to closing the ticket, you've also got to go in and clear the alert. This is actually pretty common in organizations that don't have a unified management system: Close the ticket in one system, then log into the monitoring system and mark the alert as "handled." In a truly unified system, however, it would make sense for the closed ticket to also clear the associated alert because the alert is what created the ticket in the first place. Figure 5.2 shows how this loop can be closed within a single system.

Figure 5.2: Closing a ticket clears the original alert.

There's a perfectly good reason to have alerts and tickets remain separate from each other. A ticket tends to be an "internal use only" type of thing. It contains technical information, intended for use in solving a problem and for reporting on that resolution process. An alert, however, is consumable by a wider range of people. An alert might surface in a companywide dashboard, for example, showing users that a given system is indeed impacted. You don't necessarily clear the alert just because the monitoring system is no longer seeing a problem, because temporary relief from a problem doesn't necessarily indicate a resolved problem. You might want the alert to remain in place as a high-level indicator that, "we know it's not working perfectly right now." But at some point you'll want to clear the alert and return the affected system to "operating normally" status; having that happen automatically as part of closing the ticket can be a convenient way to keep the two different audiences updated more easily.
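The loop described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UnifiedDesk`, `Alert`, and `Ticket` names are hypothetical and do not correspond to any particular product. The sketch shows the two behaviors discussed above: a recovered condition does not clear the alert by itself, while closing the ticket that the alert created does.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Alert:
    alert_id: int
    node: str
    symptom: str
    condition_present: bool = True  # what live monitoring currently sees
    cleared: bool = False           # what the dashboard audience sees


@dataclass
class Ticket:
    ticket_id: int
    summary: str
    alert_id: Optional[int] = None  # the alert that created this ticket, if any
    status: str = "open"


class UnifiedDesk:
    """Hypothetical in-memory stand-in for a unified monitoring/ticketing system."""

    def __init__(self) -> None:
        self.alerts: Dict[int, Alert] = {}
        self.tickets: Dict[int, Ticket] = {}

    def raise_alert(self, node: str, symptom: str) -> Ticket:
        """Record an alert and open a ticket pre-populated from it."""
        alert = Alert(len(self.alerts) + 1, node, symptom)
        self.alerts[alert.alert_id] = alert
        ticket = Ticket(len(self.tickets) + 1, f"{node}: {symptom}", alert.alert_id)
        self.tickets[ticket.ticket_id] = ticket
        return ticket

    def condition_recovered(self, alert_id: int) -> None:
        """Monitoring no longer sees the problem; the alert stays visible."""
        self.alerts[alert_id].condition_present = False

    def close_ticket(self, ticket_id: int) -> None:
        """Closing the ticket also clears the alert that created it."""
        ticket = self.tickets[ticket_id]
        ticket.status = "closed"
        if ticket.alert_id is not None:
            self.alerts[ticket.alert_id].cleared = True
```

Keeping `condition_present` and `cleared` as separate flags mirrors the two audiences: monitoring tracks the live condition, while the dashboard status changes only when someone confirms the problem is handled.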

Retaining Knowledge Means Faster Future Resolution
Once a problem is solved, it doesn't go away. At least, you hope it doesn't go away. As I pointed out in the beginning of this chapter, every problem solved is a potential turbo boost for solving problems in the future, both that exact same problem as well as related ones. In other words, you want to retain information about the problem, as well as its solution, so that it can become useful in the future.

Knowledge Bases
Probably the oldest formal means of retaining this information is the knowledge base. Originally, these were separate databases, consisting of articles about how to solve problems. When you have a problem, you first search the knowledge base to see if any clues exist to help solve the problem.

One of the earliest knowledge bases to be widely distributed was Microsoft's, which the company made available in the early 1990s on CD. Today, it's a massive collection of online articles; so massive, in fact, that there's actually a knowledge base article on how to query the knowledge base (shown in Figure 5.3, just in case you don't believe me).

Figure 5.3: Microsoft Knowledge Base article.

This illustrates just one of the problems with a knowledge base: People have to learn to use it, and have to remember to use it. Unfortunately, IT professionals aren't necessarily the audience most likely to reach for "the manual" or a knowledge base when a problem crops up. They're a lot more likely to just dive in and try to use their own knowledge to solve the problem. Using the knowledge base ("search the KB," in the vernacular) usually only happens after they've exhausted their internal knowledge. Part of this attitude comes from their professional competency, part from the poor usability of most knowledge bases, and part from the fact that knowledge bases can get outdated pretty quickly.

Which illustrates another major problem with a knowledge base: The task of keeping it updated. Unless you're careful at the outset to tag articles with things like product versions and so forth, it can get really easy for the knowledge base to become a repository of misinformation. Consider a line-of-business application, version 1.5, that has a particular problem. You document that in a KB article, then rely upon that knowledge to fix the problem whenever it arises. Finally, your developers correct the problem in v1.6. Does anyone go back and update the KB article? No. Even if the KB article indicates that it applies to v1.5, it doesn't provide guidance beyond that. Was the problem fixed in v1.6? Will the same fix procedure work in v1.6? If you're using v1.6 and the problem occurs again, should you follow the v1.5 procedure or report the problem as a new one since the developers thought they'd fixed it?

All of this presumes, of course, that you've addressed the major problem of knowledge bases: Getting articles into them in the first place. Vendors like Microsoft spend millions of dollars per year on the salaries of people who do little more than write documentation and contribute articles to knowledge bases. Are you willing to make that kind of investment? I've seen many companies set up a knowledge base, use it enthusiastically for a few months, and then let it slide and fall into disuse.

Tickets as Knowledge Base Articles
The first solution to many of the knowledge base's inherent problems was to simply discard the separate knowledge base and instead use closed tickets as a form of knowledge base. This is pretty much what every major ticket-tracking system these days offers.

This approach solves the primary knowledge base problem of how to get content into the system, because it simply repurposes content that's already in the system: tickets. With a good ticketing system, it can also help solve the problem of "what this applies to," because your tickets are typically categorized with a specific product or service. So you'll at least know, when you're reading an old ticket, what product and version it applied to, although you won't necessarily know if it still applies to the current version of that product or device.

Using Helpdesk tickets as a knowledge base doesn't solve the problem of getting people to actually search them for answers. In fact, using Helpdesk tickets can make the problem worse. Think about it: Every time a problem arises, a new ticket is created. So when you search the knowledge base (that is, the old tickets) using a keyword or by just selecting a product or device, you're going to get a lot more search results because you're going to be looking at every ticket that matches your criteria.

Helpdesk tickets don't always make a great source of self-service documentation, either. Not all IT folks are the best writers in the world, and tickets have a way of gathering (let's just call it) "informal language," which you might not want to surface to your end users. For example, a user who logs onto your self-service knowledge base, trying to solve a problem on their own rather than bugging your Helpdesk, might not be encouraged if the result they found said something like, "Rebooted stupid user's computer." Technicians might also provide very little in the way of detail. For example, it's not unusual to see "fixed" as the resolution to a ticket, which is not very useful for future reference. But using Helpdesk tickets as a knowledge base source isn't far off from the real solution.

Unifying the Knowledge Base
There are two things you can do to turn more Helpdesk tickets into useful knowledge base articles. First, you need some automation. Whenever a new ticket is created, the ticketing system should automatically go looking for related past tickets, presenting them as candidates to whatever technician is working the problem. One great example of this technique is used by the StackOverflow.com site (itself a kind of ticket/knowledge base combination) when you ask a new question. It automatically searches past questions and presents them to you in a visually interruptive fashion: They're inserted below your question, but above where you'd type the details of your question, as Figure 5.4 shows. It essentially forces you to review those suggestions so that you can quickly see if your question has, perhaps, already been answered.

Figure 5.4: Suggested answers to a question.

As you begin to type your question's details, additional suggestions are shown off to the side, again helping you use the database of past answers rather than requiring you to explicitly search it in an extra step.

So a unified system can help take that extra step (see Figure 5.5). By including potentially relevant tickets, or links to them, in with a newly created ticket, the system can give technicians a jump start on solving the problem by calling their attention to similar situations in the past, along with the solutions to those situations.

Figure 5.5: Using old tickets to solve new problems.

In fact, for tickets generated automatically from an alert, the ticketing system can potentially do a much better job of finding older tickets that actually relate to the problem. Because the system doesn't mind taking the extra steps, it can include more detailed search criteria, such as the nature of the problem, the device or service affected, and so forth. A technician might not think to include all the detail, which would net them a large set of search results, which is often what discourages them from searching in the first place. By getting a narrow result set to begin with, the automatically referenced tickets are more likely to relate to the problem at hand.

Such a system can be made even better if the Helpdesk system includes a couple of check boxes in its tickets. When closing a ticket, a technician should be able to independently indicate:

• Whether this ticket contains a bona fide resolution to the problem. For example, sometimes the technician might solve the problem by looking at an older ticket, meaning the current ticket might not contain a lot of detail on how the problem was fixed. But if the technician has fixed the problem, and has filled in the details of what was done, then the current ticket can be marked as a "solution ticket," making it bubble to the top of future search results.
• Whether this ticket contains an end-user-consumable, self-service solution. Many ticket-tracking systems these days include both public and private notes fields, helping to ensure that end users don't see anything that they might find upsetting or insulting, while accepting the fact that technicians will put that stuff into a ticket sometimes. By having a specific indication of which tickets are consumable outside the IT team, and which ones specifically contain user-implementable solutions, you can build a self-service knowledge base that actually works.

Figure 5.6 shows how a system might implement this. In this case, rather than a check box, the system uses a "visibility" drop-down to change a ticket from being held to being published.
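To make the idea concrete, here is a minimal Python sketch of an automatic related-ticket lookup. Everything in it is an assumption made for illustration: the `ClosedTicket` fields, the `related_tickets` function, and the simple word-overlap match are hypothetical, and a real ticketing system would use its own search engine. The sketch filters on the affected product, ranks "solution tickets" first, and can restrict results to published tickets for a self-service portal.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClosedTicket:
    ticket_id: int
    product: str
    symptom: str
    resolution: str
    is_solution: bool = False   # technician marked it as a bona fide resolution
    visibility: str = "held"    # "held" (internal only) or "published"


def related_tickets(
    history: List[ClosedTicket],
    product: str,
    symptom: str,
    limit: int = 5,
    public_only: bool = False,
) -> List[ClosedTicket]:
    """Return past tickets for the same product, solution tickets first."""
    wanted = set(symptom.lower().split())
    scored = []
    for ticket in history:
        if ticket.product != product:
            continue
        if public_only and ticket.visibility != "published":
            continue
        overlap = len(wanted & set(ticket.symptom.lower().split()))
        if overlap == 0:
            continue
        scored.append((ticket.is_solution, overlap, ticket))
    # Solution tickets bubble to the top; ties are broken by word overlap.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [ticket for _, _, ticket in scored[:limit]]
```

A system that opens tickets from alerts could call such a function with the product and symptom taken straight from the alert, so the technician never has to run the search by hand.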

Figure 5.6: Controlling ticket visibility.

Simply the presence of these check boxes (or other indicators) can help to remind technicians that documented solutions are desirable. From a management perspective, organizations might set quotas for technicians: At least 75% of the tickets you close must either contain a detailed solution or refer to the ticket that does contain the solution. Metrics like that are manageable through ticket system reports, and can help ensure that tickets really do begin to serve as a basis for knowledge retention.
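A quota like this is easy to compute from closed-ticket data. The following Python sketch is one way it might look, assuming a hypothetical export with the fields shown; the field names and the `documented_share` and `meets_quota` functions are illustrative, not part of any real ticketing product. A reference only counts when the referenced ticket itself contains a detailed solution.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ClosedTicket:
    ticket_id: int
    technician: str
    has_detailed_solution: bool = False
    solution_ticket_id: Optional[int] = None  # older ticket holding the fix


def documented_share(tickets: List[ClosedTicket], technician: str) -> float:
    """Fraction of a technician's closed tickets that document a solution."""
    by_id: Dict[int, ClosedTicket] = {t.ticket_id: t for t in tickets}
    mine = [t for t in tickets if t.technician == technician]
    if not mine:
        return 0.0

    def is_documented(ticket: ClosedTicket) -> bool:
        if ticket.has_detailed_solution:
            return True
        referenced = by_id.get(ticket.solution_ticket_id)
        return referenced is not None and referenced.has_detailed_solution

    return sum(1 for t in mine if is_documented(t)) / len(mine)


def meets_quota(tickets: List[ClosedTicket], technician: str, quota: float = 0.75) -> bool:
    """True when the technician meets the documentation quota (75% by default)."""
    return documented_share(tickets, technician) >= quota
```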

Making Tickets an Asset
The overall idea is to take your tickets from being a way of tracking problems and work to being a complete life cycle for problem solving. In order to be effective, "tickets as a solution" has to overcome some of the common human behaviors and implementation issues that have often been hurdles in the past:

• Technicians don't always search the ticket database, so that should happen automatically to some degree, with tickets being offered as potential solutions.
• Technicians' search skills aren't always that great, so a unified system should, using the information it already has at its disposal, make a first attempt at finding relevant tickets.
• Technicians' writing skills aren't always a priority, so the system should emphasize the need for complete solutions, management should focus on that as a metric, and technicians should be able to offer both internally and externally consumable versions of a solution, when appropriate.

With the right system (particularly one connected to your monitoring system, making a truly unified environment), solutions to problems can be just a click away.

Past Performance Is an Indication of Future Results
Another way to use historical data is in developing service-level expectations. I'm deliberately avoiding the phrase service-level agreement because an SLA is a formal document that often includes some element of an organization's politics. A service-level expectation, however, is the level of service that, based upon past performance, you can realistically expect to achieve in the future. An SLA will ideally be based upon those real-world expectations, if you can provide them.

One issue with many organizations' SLAs is that they're not actually based in reality. Sometimes, someone will make up an ambitious goal to look good, like promising 99.999% availability, and then boldly state that they will "just manage to that number." Other times, someone will take an overly cautious approach when establishing service levels, forcing the organization to expect a lesser level of service than it realistically could.

At fault are the tools we use. This gets all the way back to the first chapter of this book, when I wrote about the silos that IT tends to work in, and the varying domain-specific tools we rely on to troubleshoot and solve problems. Those same domain-specific tools are what we use to measure our existing performance levels. Because those tools don't all speak a common language or use a common set of metrics, it's actually really tough to figure out what our actual service levels are.

The bottom line is this: You have an existing environment. All political and internal issues aside, your existing infrastructure is capable of delivering some level of technically measurable performance. You just need to discover what that is, using a common and easily communicated set of metrics, based upon your infrastructure's current capability.

You can't really do that by using a hodgepodge of domain-specific tools, though, and you really can't do it when your infrastructure starts to contain outsourced elements. Start bringing in cloud computing platforms, collocated servers, Software as a Service (SaaS) elements, and so on, and you'll find that your domain-specific tools just can't provide enough information. So how can you establish a good service-level expectation?

This drives right back to the previous chapters in this book. Say you've got a complex set of services and applications (who doesn't these days?). Figure 5.7 shows an infrastructure offering a lot of different elements, some inside the data center, some out.

Figure 5.7: Modern environments include many different components.

You start your measuring at the one place it matters most: the end user. Put some probes, agents, synthetic transactions, or whatever else you need in place to figure out what your users are actually seeing, today, in terms of performance. Monitor that over several days that represent real, normal workload (no fair picking holidays as your day to monitor) and you'll know what your infrastructure is actually providing. It stands to reason that you can't expect any better, but that you also shouldn't put up with anything worse. If that service-level expectation isn't as good as your SLA, well, that's fine. You can start looking for areas where you can improve, bringing things up to that SLA level.

You'll also want to capture individual performance from each component, and this is where things can get tricky. It's crucial that, at this level of monitoring, you get everything onto a single console, in a single language, using a single set of metrics. What you're looking for is a performance range for each component that represents a normal workday. Provided each component operates within that observed range, you should be delivering the end-user experience that you measured. Those ranges provide the basis for your monitoring thresholds: Anything outside those ranges is something you need to be alerted to.

With that service-level expectation established, you can start measuring different workload levels. See how things look on an especially busy day, for example, and what they look like on a light day (this is where it's okay to pick a holiday, for example). You'll start to get a feel for how your end-user experience differs under those different workloads, and how the elements of your infrastructure change under different workloads.

Making sure that any outsourced elements are included in all of this is, of course, absolutely crucial. As I've pointed out in earlier chapters, monitoring those is a bit different than monitoring things that live inside your own data center. You'll either need a unified monitoring solution that's truly capable of hybrid monitoring, or you'll need a special set of tools to gather performance information on those outsourced pieces.

Notice that I've laid out two sets of metrics for you to monitor: performance and workload. Too often, I see SLAs that don't take the workload into account. "We will provide a 100ms response time." Okay, under what workload, specifically? Because maybe I can give you a 100ms response time under what I consider to be a normal workload, but if you start loading on additional users and functions, then that response time is obviously going to fall off. Again, monitoring solutions can help with this by not only measuring performance of things like processor, memory, disk, and so forth, but also workload, like the number of transactions being processed, the number of network packets being routed, and so on. It's important that your performance expectations include that workload context so that you can begin to make better service-level agreements in the future.
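The "observed range becomes the threshold" idea can be sketched as follows. This is a simplified illustration, not a description of how any particular monitoring product works: the reading format, the metric names, and the `observed_ranges` and `needs_alert` functions are all assumptions. Workload metrics (such as transactions per second) are recorded next to performance metrics so that both are baselined the same way.

```python
from typing import Dict, Iterable, Tuple

# A reading is (component, metric, value); names are hypothetical examples.
Reading = Tuple[str, str, float]
Range = Tuple[float, float]


def observed_ranges(normal_day: Iterable[Reading]) -> Dict[Tuple[str, str], Range]:
    """Build a (low, high) range per component and metric from normal-day data."""
    ranges: Dict[Tuple[str, str], Range] = {}
    for component, metric, value in normal_day:
        key = (component, metric)
        low, high = ranges.get(key, (value, value))
        ranges[key] = (min(low, value), max(high, value))
    return ranges


def needs_alert(ranges: Dict[Tuple[str, str], Range], reading: Reading) -> bool:
    """True when a reading falls outside its baseline, or has no baseline yet."""
    component, metric, value = reading
    if (component, metric) not in ranges:
        return True  # assumption: an unbaselined metric is worth a look
    low, high = ranges[(component, metric)]
    return not (low <= value <= high)
```

Using the raw minimum and maximum keeps the sketch short; in practice you might prefer percentiles over several normal days so that a single outlier does not widen the range.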

It's the Performance Database
All of this performance data needs to not only be captured but also stored. That's where a lot of monitoring solutions miss the point: They're monitoring in real time, and they're alerting to problems, but they're not always saving the data they see. Let's expand the example application to include that performance database; it's in Figure 5.8.

Figure 5.8: Adding a performance database to the environment.

The point of this figure is simply that you need to get performance data from every component, even the outsourced ones, into that database. Why? Two reasons:

• This database is what's going to show you what your performance really looks like on what you consider to be a normal day. This is where your performance expectations will come from, and it's hopefully what you'll use to derive more realistic and meaningful SLAs.
• This database is what's going to tell you when your performance is trending away from previously established norms. I'm not referring here to a situation where one component's performance goes wonky due to a problem; live monitoring and alerting will take care of that. The database is there to spot the long-term trends: "Hey, did you know that performance is down 1% from last month, which was down 0.75% from the month before? At this rate, you'll be unable to meet your SLAs in 6 months."

And frankly, a good monitoring solution shouldn't even show you the prototypical trend line in performance as its first step. The first step should be a simple dashboard: "You're meeting your SLA, and based on current trends, will continue to do so for the foreseeable future." Or, "You're meeting your SLA but barely, and based on trends, you aren't going to be able to meet your SLA for more than a month or two."

From there, you can drill down into graphs and charts that give you more detail so that you can find the component or components that look like the current bottleneck, and start making plans to get more capacity in place before you miss your SLA.
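The kind of projection behind such a dashboard message can be sketched with a straight-line fit over stored monthly figures. This is a deliberately simple illustration under stated assumptions: the `months_until_breach` function is hypothetical, the input is a list of monthly average response times in milliseconds (higher is worse), and a real product would likely use a more careful forecasting model than a linear trend.

```python
from math import ceil
from typing import Optional, Sequence


def months_until_breach(monthly_ms: Sequence[float], sla_limit_ms: float) -> Optional[int]:
    """Estimate how many months remain before the fitted trend crosses the SLA.

    Returns 0 if the trend is already at or past the limit, and None if the
    trend is flat or improving (no breach is projected).
    """
    n = len(monthly_ms)
    if n < 2:
        raise ValueError("need at least two months of history")
    # Ordinary least-squares line through (0, y0), (1, y1), ...
    mean_x = (n - 1) / 2
    mean_y = sum(monthly_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(monthly_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    latest = intercept + slope * (n - 1)  # fitted value for the latest month
    if latest >= sla_limit_ms:
        return 0
    if slope <= 0:
        return None
    return ceil((sla_limit_ms - latest) / slope)
```

A dashboard could turn the result into the plain-language statements described above, for example reporting "will continue to meet the SLA" when the function returns None and a warning when it returns a small number.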

Summary
Embracing the past to make a better future: that's been the theme of this chapter. Whether you're gathering ticket resolution information so that it can be used to solve future problems more quickly, or gathering performance information so that you can establish expectations and predict capacity, it's all about keeping historical data and leveraging it to put the organization on a better footing for tomorrow.

Coming Up Next
In the last chapter of this book, we're going to step all the way back to the beginning and look at unified management from a case study perspective. I'll use my consulting and field experience to construct a composite case study, drawing elements of unified management together to show you what a modern, truly unified environment can look like. I'll share specific problems from each environment, and explain how unified management helped solve those problems more quickly and effectively.

Chapter 6: Unified Management, Illustrated
In this final chapter of the book, I want to revisit everything from the first five chapters. However, I'm going to do so in the form of case studies. I've been fortunate enough to speak with several consulting clients of mine who've been struggling with the same issues I've outlined, and who've recently been trying solutions that follow the basic approach I've described. They've agreed to let me share their stories (although they've asked that I not use their names or company names) so that you can get a before-and-after look at how this unified management thing should work. Along the way, I'll also share some of the challenges and roadblocks they've encountered. A switch to unified management isn't always going to be hassle-free, so I think it's valuable for you to see what they've had to deal with, and how they think they're going to do so.

This chapter will also include some of the practical information on unified management that hasn't made it into the previous chapters. I'll provide a consolidated shopping list of unified management features so that when you start examining solutions, you can have that list in hand to help you. I'll also look at different purchasing models that vendors are offering these days to give you an idea what kind of flexibility you might have for acquiring and implementing a solution.

The Case Studies
A unified management solution has to provide features for what I believe are two distinct broad use cases. The first is in helping you to react to problems, while the second helps you manage non-problem requests, such as requests for changes within the environment. I'm going to provide a distinct story for each of these. They're both drawn from the same consulting customer, although you'll meet different people from that organization in each narrative.

Detecting and Solving Problems
Lisa is a senior systems administrator, responsible primarily for the Windows-based systems in her environment. Her counterpart, Peter, is responsible for the company's Unix- and Linux-based server infrastructure. Both have considerable areas of overlapping responsibility, as many of the company's line-of-business (LOB) applications rely on both Windows- and *nix-based resources.

"It isn't just the servers, of course," Lisa told me. "It's what's running on those servers: databases, Web services, you name it. Someone else supports those different pieces, so there used to be a lot of time spent arguing about whose fault something was."

I asked her for an example of how things worked in their environment prior to implementing a unified management system. She laughed and brought out a file that she'd clearly held onto for some time. It looked like the text from a Helpdesk ticket's notes. Here's the complete text, with names edited; I've added some [editorial] notes for items that I had to ask Lisa to explain.

    OPENED BY HelpDesk AT 2009-06-14 13:34
    User states that BOS [an LOB application] is extremely slow. Have several e-mails about this in the q also. Server BOSDB02 responding slowly to pings.
    ASSIGNED TO LHarte [this is Lisa]
    NOTES BY LHarte AT 2009-06-14 15:26
    BOSDB02 is working fine, apart from the fact that SQL is hogging 100% of the CPU. Passing to DBA.
    ASSIGNED TO DShields
    NOTES BY DShields AT 2009-06-14 16:53
    Probably the indexes again, SQL is taking longer to complete queries than it should. Will schedule indexes to be rebuilt tonight
    NOTES BY HelpDesk AT 2009-06-15 10:44
    Still getting calls on this
    NOTES BY DShields AT 2009-06-15 11:12
    Indexes rebuilt
    ASSIGNED TO HelpDesk
    NOTES BY HelpDesk AT 2009-06-15 11:34
    Still getting calls, BOSDB02 is still slow to ping
    ASSIGNED TO DShields
    NOTES BY DShields AT 2009-06-15 13:12
    SQL is still slow, looks like it is in disk IO. Fragmented disk? Need server support.
    ASSIGNED TO LHarte
    NOTES BY LHarte AT 2009-06-15 13:47
    Server disk shows less than 2% frag, not the problem. IO is slow because SQL is thrashing the disks. Maybe your DB is fragged. I'll call you.
    ASSIGNED TO DShields

The conversation clearly went offline at that point because the next entry simply indicates "problem resolved." Unfortunately, there was no official documentation of what went wrong or what was done to fix it, but Lisa explained. "We kept going back and forth between us: he'd see something in Performance Monitor that looked like the server was slow, and bounce it to me, and I'd tell him that it's because his SQL Server was causing the problem and bounce it right back. I don't even have permission to look inside SQL Server, and he just kept wanting to get the ticket out of his queue.

"In the end, it actually turned out to be a problem with the SAN, which was Peter's problem. Something had gone wrong with our main SAN connection and we were on a slower backup link, and something was wrong with that link's configuration, so it wasn't running at full speed or something. We were seeing it as slow disk IO because Windows obviously thinks that the SAN is just one big locally attached volume. We were running all kinds of tests on the server and in SQL Server to try and find the problem, but none of our tools were able to realize that the real problem was further under the hood someplace."

Peter recalled the incident. "It was weird because there wasn't anything actually broken, so none of the tools I use to monitor the SAN gave off any alerts. The problem was a configuration problem on several of our hosts. The tools don't see that as broken, of course, although it was causing them to access the SAN a lot slower than they're used to.

"The real problem was that this cropped up on about seven machines all at once. We didn't correlate the problem at first, because every single machine was affected slightly differently because they all use the SAN for different purposes. There's only one major database on that SAN, but there's a small Web farm and a file server. So the symptoms the users saw were different, and the problems were all routed to different people to handle. It was the file server guys who brought the problem to me. They saw the disk queue length going up pretty dramatically, and they knew that had to be the SAN, so I got involved."

"That was the problem we dealt with all the time back then," Lisa said. "We all focused specifically on the bit we were responsible for, but these days there are so many interactions and dependencies that we can't see from a tool level that we'd get all tied up when a problem happened."

I also spoke with Kevin, who manages the company's Helpdesk. He says those types of problems were especially trying for his team because users would keep calling and the Helpdesk had no idea what was related and what wasn't, or what the status of anything was. "Users would call in with something that sounded new to whoever answered the phone, so they'd open a new ticket. We were probably slowing down whoever was trying to fix the problem just by loading new tickets on them for the same problem. But we had no real communication. If you answered the phone, you looked to see if there was an open ticket on anything that sounded similar. But there was no one place where we kept track of all the currently open problems. I finally just had a whiteboard installed in the Helpdesk office, and outstanding problems would get written up there. So when a call came in you could at least look to see if the problem was already open, then look up that ticket to see what was happening and give some status to the user on the phone."

I asked Lisa how things worked now, after the company had implemented a unified management system. "We've been on it for about a year now," she told me, "and it's completely different." She showed me a ticket for a problem that had occurred recently. "This is what we see, now."

    ALARM 2011-06-14 12:13:42
    NODE Windows Server BOSDB02
    SQL Server Instance DEFAULT
    SYMPTOM: SQL Server response time exceeds threshold
    IP: 10.10.15.212
    SQL Server database shows 34% free
    SQL Server fragmentation shows <5%
    Disk queue length <1
    Network utilization <40%
    CPU utilization <60%
    Memory utilization <75%
    RELATED ALARM 2011-06-14 12:10:52
    NODE Router MBS3667
    Interface fault

"Just looking at that, I can start to guess what the problem is." She showed me the monitoring console that the entire IT team now worked from, which looks similar to Figure 6.1. "You can see that it's basically a network diagram. It shows the servers and the services they run, but it also shows things like routers and switches. So when a server alarms, it'll also look for alarms on any dependencies, like a router. In this case, we had a router interface that was going bad and starting to drop packets. That triggered an alarm right away to the router guy, but it also alarmed all of the servers that use that router to communicate because clients (and the monitoring system) saw the servers' response times go up. Just having that data in front of us saves a ton of time testing for problems. The system basically runs a series of basic checks whenever there's a problem, so it gets those preliminary steps out of the way for us."
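The "look for alarms on any dependencies" behavior Lisa describes can be approximated with a walk over a dependency map. The sketch below is an assumption-laden illustration: the `related_alarms` function and the shape of the dependency data are hypothetical, and the node names other than BOSDB02 and MBS3667 (which appear in the alarm above) are made up for the example.

```python
from typing import Dict, List, Set


def related_alarms(
    node: str,
    depends_on: Dict[str, List[str]],
    alarmed: Set[str],
) -> List[str]:
    """Return the dependencies of `node` (direct or indirect) that are alarming."""
    seen = {node}  # never report the node itself, and guard against cycles
    stack = list(depends_on.get(node, []))
    found = []
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        if dep in alarmed:
            found.append(dep)
        stack.extend(depends_on.get(dep, []))
    return sorted(found)


# Hypothetical dependency map: the server talks through a router and uses a SAN.
depends_on = {
    "BOSDB02": ["MBS3667", "SAN01"],
    "MBS3667": ["CORE-SW1"],
}
print(related_alarms("BOSDB02", depends_on, alarmed={"BOSDB02", "MBS3667"}))
```

With a map like this, the system can attach the router alarm to the server's ticket as a related alarm, which is what lets a technician guess at the cause before running any tests.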

Figure 6.1: Tracing alarms visually.

She said the team spends a lot less time passing problems back and forth because it's usually much clearer where the problem lies when the system is looking at the entire stack. "This is huge when the problem is actually outside the data center. We have a number of applications that interface with SalesForce.com, and whenever those guys have a problem, or more commonly when our ISP gets a little slow, our users see it as our application being slow. But the monitoring system knows about the dependencies, and it's usually already alarmed us. We'll post a message about the affected applications on our end, and start calling the service provider to log a ticket with them."

Posting a message, Kevin says, has helped the Helpdesk tremendously. "We have this Web portal where users can log tickets, and current system status is shown right there. So before they even open a ticket, they can see that we know there's a problem. Once we trained them to trust us on that, they stopped logging duplicate tickets."

He admits that the training was a big step. "We didn't do it initially," he said, "but once users realized we were being pretty honest and consistent about posting problems, they started to trust us more. We had a big communications effort, and now there's even a mailing list users can add themselves to so that they get a message whenever a system they use is affected. Being proactive cuts back on the Helpdesk volume a ton."

The benefits of a unified management system were pretty clear for this team: faster time to resolution, less passing the buck, and more proactive communications with their end users. The biggest challenge they faced?

"A trust thing," Lisa told me. "We had to learn to trust this new system to monitor everything as well as we could with the tools we were familiar with. So the first few times things went wrong, we went right back to what we knew to troubleshoot the problem. Once we realized that we were seeing the same data, we started trusting the new system more, and just started relying on it. We'll still dig out the old tools if we have to dive deep into an affected system, but by the time we do that we know the problem is in that system, so we're not wasting time. You don't pass the buck to someone else at that point, you stay in that problem area until you spot the problem."

Fulfilling User Orders
Kevin provides the link to the other side of the unified management story. "We're not just responsible for opening tickets for problems. We also open tickets when routine changes need to be made." I asked him to give me an example of how this was handled prior to the implementation of their unified management system, and he pulled out an archived ticket.

    OPENED BY HelpDesk AT 2010-08-12 15:50
    User BDOUDS needs a new SharePoint site deployed as intranet/projects/universitybid. User will be site admin.
    ASSIGNED TO JHoltz
    NOTES BY JHoltz AT 2010-08-13 08:27
    Sent email to Bill's manager confirming. Also sent email to Special Projects confirming.
    NOTES BY JHoltz AT 2010-08-16 11:12
    Bill's manager, KHICKEY, confirms. Still waiting to hear from Special Projects.
    NOTES BY JHoltz AT 2010-08-18 11:05
    Still waiting to hear from Special Projects. Left VM.
    NOTES BY HelpDesk AT 2010-08-20 10:34
    User is asking for status.
    NOTES BY JHoltz AT 2010-08-20 11:34
    Tell him to call Special Projects. I just need them to confirm since this comes out of their budget.
    NOTES BY JHoltz AT 2010-08-22 13:11
    Special Projects confirmed. Set up site and assigned BDOUDS as site owner.
    STATUS SET TO RESOLVED AT 2010-08-22 13:12

"That kind of thing went on all the time. Someone would call us asking for some access or whatever. We'd assign the ticket to someone in IT, but then they'd spend time figuring out who was responsible. We used to have a big book," he added, pointing to a thick three-ring binder on his shelf, "that told us who was responsible for pretty much everything. Then you'd wait and wait to hear back from them. This one took, what, two weeks to resolve? That's insane, and the whole time the user is calling us to check on the status, when we're not the ones holding things up. This took Jeff 10 minutes to do once he got approval."

And in the world of unified management?

"It's actually pretty cool," Kevin said. "Now we have a big online catalog with everything a user might want. It's kind of like an online store. They submit their request through there, and the system opens a ticket automatically. But each item is associated with a workflow, so IT doesn't even hear about it until the ticket has been routed through the proper approvers and been approved. Once we see it, it's a done deal, so we just implement it. For some things, we're even implementing scripts that do the implementation for us, so it's completely hands-off." The organization worked out, and documented, the desired workflows for each possible product. Kevin provided an example of that documentation, shown in Figure 6.2. "This kind of documentation is important because we worked off of this to implement the workflows. The business owners can come up with these flowcharts on their own, then we just implement them on the designated products in the catalog."

Figure 6.2: Documented workflow used to drive automated review/approvals for catalog requests.
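A catalog item with an attached approval workflow might be modeled roughly as follows. This is a hedged sketch, not the customer's actual system: the `Catalog` and `Request` classes, the item name, and the approver labels are hypothetical. It shows the two properties Kevin describes: IT only sees a request once every approver has signed off, and the requester can check where the request is waiting at any time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Request:
    user: str
    item: str
    pending: List[str]                      # approvers still to sign off, in order
    approved_by: List[str] = field(default_factory=list)

    @property
    def status(self) -> str:
        """What the requester sees when checking on the request."""
        if self.pending:
            return f"waiting on {self.pending[0]}"
        return "approved: ticket sent to IT"


class Catalog:
    def __init__(self, workflows: Dict[str, List[str]]) -> None:
        self.workflows = workflows          # item name -> ordered approvers
        self.it_queue: List[Request] = []   # tickets IT is allowed to work

    def submit(self, user: str, item: str) -> Request:
        request = Request(user, item, list(self.workflows[item]))
        self._route(request)
        return request

    def approve(self, request: Request, approver: str) -> None:
        if not request.pending or request.pending[0] != approver:
            raise ValueError(f"{approver} is not the next approver")
        request.approved_by.append(request.pending.pop(0))
        self._route(request)

    def _route(self, request: Request) -> None:
        # Only fully approved requests ever reach the IT queue.
        if not request.pending:
            self.it_queue.append(request)
```

In the SharePoint example from the archived ticket, the workflow for a new site might list the requester's manager followed by the budget owner; the status text would then have told the user that the request was waiting on Special Projects, not on IT.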

We were discussing access permissions as an example, so I asked what happened when those needed to change. "They never did," Kevin admitted. "Once you had access, you usually kept it until you left the company. We just didn't keep track of it. Now, the catalog keeps track of it. If you don't need something, you return it to the store, it goes through whatever approvals, and we get a ticket to remove your access. Different managers also have to occasionally complete an attestation, which is where they review who has access to their resources and let us know if anyone needs to be removed, or if everyone can stay. We're not the gatekeepers anymore."

I noted that an automated workflow wouldn't necessarily guarantee a speedy response time. "Oh, users still have to wait two weeks for approvals, sometimes. But when they submit their request through the catalog, they can check the status of that request on their own. They can see that it hasn't made it to us, and they can take it on themselves to bug their manager or whatever. We're totally out of the loop until it's approved, and they know that, because the request shows it hasn't even made it to us yet." Such a system does a better job of keeping users informed, and helping them to understand what's really holding things up.

A Shopping List for Unified IT Management
I want to use this section to present a list of what I believe are the must-have features of a true unified management system. As you're evaluating solutions, make sure they offer these features and make sure the features operate in a way that makes sense for your environment's needs.

Workflow. Unified management solutions should offer workflows that can help automate responses and service management. Workflow construction should be as drag-and-drop as possible, involving as little programming as possible.
Agents. I know there's a huge divide between people who are fine with deploying agents, and those who hate the idea; I'd suggest looking for a solution that supports both models. Agentless data collection is fine in some instances, although it can offer less performance and coverage than an installed agent. I think a hybrid approach is probably best for most organizations, and unified monitoring solutions ought to support that.
Alarm integration. When a problem arises, a unified management solution should obviously tell the designated individuals; it should also open a Help desk ticket and automatically search for related alarms from the past. Doing so will help speed up the time to resolution. This kind of knowledge automation is really crucial.
Approvals. As I've pointed out, tickets aren't always for problems; sometimes they're for new work, like change requests. A unified management system should support a review/approval workflow for these requests so that IT can be taken out of its traditional gatekeeper role and instead simply work those tickets that have been approved for implementation by the business.

Discovery and deployment. A unified management solution should help you discover manageable nodes and services and deploy any necessary agents to monitor them. This discovery should happen more or less continuously, or at least be able to be run regularly, so that changes to your environment can be captured.
Routing. Tickets, whether for problems or for requests, should be automatically routed based on custom business rules that you can define. In other words, tickets should head straight to the correct implementer as quickly as possible.
Scheduling. A unified management system should have some kind of internal calendar that lets you schedule maintenance tasks. This functionality helps to resolve maintenance window conflicts and schedule work to happen at the right time.
Catalog. This is a key part of making a unified management solution part of a self-service, managed system. In addition, a catalog helps work toward bringing process compliance, such as ITIL compliance, into your environment. A catalog provides users with a list of orderable products, not unlike shopping at an online Web host. Users' purchases translate into tickets, which go through review/approval prior to being passed to IT for implementation.
Communications. Users need to be able to submit requests, and users and your team must be able to review them from a familiar place. A Web portal is the traditional way to enable this communication, but systems that can integrate via users' inboxes (which they're in all the time, anyway) are even better.
Interface. You can't have too many interfaces into a unified management system, and whatever solution you pick should offer both Web-based and mobile-friendly versions of its UI.
Metering. If you're monitoring actual paying customers, you'll need the ability to charge them for what they use. Even if you're just dealing with internal customers, being able to perform chargebacks for their IT resource consumption is going to be critical as business managers advance their management strategies. There's no reason for IT to be seen purely as overhead when resources can and should be tracked back to the business components that are consuming them.
SLAs. A unified management system should assist you in both defining and monitoring service level agreements (SLAs) based on actual historic trends.
Trends. A unified management solution should include a performance database that lets you track historical performance trends. This database can be used to help define and report on SLAs as well as perform capacity planning.
Surveys. Closing the loop with your end users is crucial because technical SLAs aren't the only way your success is being measured, whether you know it or not. Being able to poll users helps you define SLAs in their terms, creating more appropriate expectations.

Reports. Look for reports and dashboards that provide managerial and executive-level views of items such as workload, SLA compliance, and so forth. Heck, even dashboards that can be exposed to end users, helping them see that the environment is performing as it should, can go a long way toward helping IT be seen as more responsive and engaged with the business.
Visualization. Being able to visualize your environment can help make root cause analysis and problem resolution faster and easier.
Everything in one place. As I've written several times in this guide, a unified management system's primary value is unity, or the ability to get all your performance concerns into a single place, using a single set of metrics, alarms, identifiers, and so forth. This singular view helps to break down the traditional domain-based silos that IT is built around, and gets everyone focused on the root cause of a problem more quickly.
Knowledge retention. A unified management system should help your organization retain critical knowledge by turning Help desk tickets into an automated, searchable knowledge base.
Preloading information. When an alarm generates a ticket, that ticket should include whatever details the unified management system can provide: IP addresses, response times, and so forth. The more information included in the ticket, the less the responder has to go look up, and the sooner they can start working on resolving the problem.
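Several of the items above (alarm integration, routing, and preloading information) fit together naturally, so a rough sketch may help when you ask vendors how their implementation works. The Python below is an illustration only: the `Alarm` and `Ticket` records, the metric prefixes, and the queue names are hypothetical, and a real product would define its routing rules through its own interface rather than in code like this.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Alarm:
    node: str
    metric: str
    ip_address: str
    response_ms: float


@dataclass
class Ticket:
    summary: str
    details: Dict[str, object]  # preloaded facts so the responder need not look them up
    related: List[str]          # summaries of past tickets for the same node and metric
    assignee: str


# Custom business rules, checked in order (hypothetical prefixes and queues).
ROUTING_RULES: List[Tuple[Callable[[Alarm], bool], str]] = [
    (lambda a: a.metric.startswith("db."), "database-team"),
    (lambda a: a.metric.startswith("net."), "network-team"),
]
DEFAULT_QUEUE = "service-desk"


def route(alarm: Alarm) -> str:
    """Return the first queue whose rule matches, else the default queue."""
    for matches, queue in ROUTING_RULES:
        if matches(alarm):
            return queue
    return DEFAULT_QUEUE


def open_ticket(alarm: Alarm, history: List[Ticket]) -> Ticket:
    """Turn an alarm into a ticket that is preloaded, cross-referenced, and routed."""
    related = [
        past.summary for past in history
        if past.details.get("node") == alarm.node
        and past.details.get("metric") == alarm.metric
    ]
    return Ticket(
        summary=f"{alarm.metric} alarm on {alarm.node}",
        details={
            "node": alarm.node,
            "metric": alarm.metric,
            "ip_address": alarm.ip_address,
            "response_ms": alarm.response_ms,
        },
        related=related,
        assignee=route(alarm),
    )


# Usage: a second database alarm on the same node links back to the first ticket.
history: List[Ticket] = []
history.append(open_ticket(Alarm("sql01", "db.latency", "10.0.0.5", 640.0), history))
ticket = open_ticket(Alarm("sql01", "db.latency", "10.0.0.5", 850.0), history)
print(ticket.assignee, ticket.related)
```

Matching related tickets on node and metric is just one possible choice; the design question to put to a vendor is what their system treats as "related" and whether you can adjust it.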

This list obviously isn't comprehensive but provides a starting point. If a potential solution offers these features and meets your organization's specific needs, that solution is probably worth looking at in detail during an evaluation. Make sure you gain not only a checkmark on these features but also a detailed explanation of how they're implemented. Also, ensure that the implementation is one that will work within your organization's requirements.

Ways to Buy Your Unified IT
I want to briefly outline different approaches that vendors take for delivering unified management solutions. Let me emphasize up front that I don't regard any of these as right or wrong; there's merely what's right for you, which you'll need to decide on your own.

Typically, you'll find that solutions of this kind are priced based on the number of nodes you need to manage, possibly also incorporating the number of users in your organization. A node is typically defined as any manageable device: a router, a server, and so forth. Some vendors are more creative than others with this portion of their licensing model; don't let a complex model scare you off. In some cases, more complex license models are actually to your benefit because vendors are trying to precisely accommodate a wide range of scenarios. You should be more concerned about what you're licensing.

For example, at one end of the spectrum, you'll find what I call monolithic solutions. With these, you get and pay for every feature that the vendor offers, regardless of which ones you'll need right away. I think it's hugely important to make sure you're acquiring a solution that can do everything you want, although I'm not sure you necessarily want to pay for all of that up front. In some cases, you may want to implement a solution in a phased approach, licensing just the functionality you need for each phase, thus allowing yourself to kind of ramp up into the full licensing and functionality of a product. The nice thing about monolithic solutions is that they're often well integrated because everything is delivered to you in a single piece.

There are also pluggable frameworks. I tend to view big frameworks like HP OpenView as fitting into this kind of model. With these solutions, you buy a base product, then add in the various bits and pieces you need to speak to your environment. These models offer a ton of flexibility, of course, and if you're going with a big enough vendor, you should be able to find plug-ins for every bit of functionality you need. These solutions run the risk of becoming a massive do-it-yourself project, though, and the plug-ins aren't always as well integrated as you might like. Licensing can also be really, really complex because you're often licensing the plug-ins separately from the base framework.

Another model is the pay-as-you-go approach. With this model, the solution offers all the functionality you might ever need, but you don't switch it all on right away. Instead, you turn on the modules, or functionality, that you need immediately, and you just pay for that. As you add more responsibility to the solution, you pay a bit more. This setup is a bit more like a cloud model, where you can grow as large as you like but only pay for what you need right now. You're not typically dealing with plug-ins, or if you are, they're usually all delivered by the same solution vendor. I'm seeing more clients considering this approach.
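When comparing these models, it can help to run the numbers for your own phased rollout. The short Python sketch below assumes simple per-node pricing, as described earlier in this section; the module names and every price in it are invented purely for illustration and do not reflect any vendor's actual licensing.

```python
from typing import Dict, Iterable


def monolithic_cost(nodes: int, price_per_node: float) -> float:
    """Every feature is licensed up front for every node."""
    return nodes * price_per_node


def pay_as_you_go_cost(nodes: int, module_prices: Dict[str, float],
                       enabled: Iterable[str]) -> float:
    """Only the modules switched on right now are licensed, per node."""
    return nodes * sum(module_prices[name] for name in enabled)


# Hypothetical per-node prices for three modules.
MODULE_PRICES = {"monitoring": 10.0, "service_desk": 6.0, "catalog": 4.0}

nodes = 200
phase_one = pay_as_you_go_cost(nodes, MODULE_PRICES, ["monitoring"])
everything = pay_as_you_go_cost(nodes, MODULE_PRICES, MODULE_PRICES)
all_in = monolithic_cost(nodes, price_per_node=20.0)
print(phase_one, everything, all_in)
```

With these made-up figures the end state costs the same either way; what differs is how much you pay during the early phases, which is exactly the trade-off to raise with a vendor.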

The last thing you'll need to think about is where the solution will live. In this age of the cloud, you actually have a choice of hosting your monitoring and management solution in your own data center or simply purchasing it as a hosted service that lives in the vendor's data center. Either way, the vendor's agents get installed into your environment. I won't dig into the on-premise versus hosted debate; you probably know what's right for you, and you can certainly discuss that option with whatever solution vendors you're investigating. Regardless of which side of that debate you're on, I think it's nice to have a solution that offers both options.

Conclusion
Well, there you have it: unified management. The overall idea behind this book was simple: really focusing on the straightforward theme of "get everything in one place, and get everyone on one page." It's only revolutionary compared with the disjointed approach that our existing technology tools have more or less forced us into.

Of course, I don't expect you to just rush right out and start switching over to a new monitoring and management framework. These things can be done in small steps so that they create less impact on your organization and allow you to learn to use various techniques and features properly in an organic, rather than disruptive, fashion.

The goal should be there: Stop wasting time with the back-and-forth and instead get yourself onto a single pane of glass for your organization's top-level monitoring. Integrate that with a Help desk system that lets you keep everyone informed and gives you the metrics you need to analyze your IT performance objectively.

Good luck.
