
CON6018 - Best practices while maintaining large Oracle footprints

Case study of a Big Box Retailer


Sri Vuyyuru, Senior Consultant

PAGE 1

Oracle Open World 2015

About BIAS
Case study background
Problems faced with Large Oracle footprints
Best practices Managing Large Oracle footprints
Patching Management
Data Purging strategy
Proactive vs. Reactive

PAGE 2

Agenda

Who We Are

Founded in 2000
Distinguished Oracle Leader

Technology Momentum Award


Portal Blazer Award
Titan Award Red Stack + HW Momentum Awards
Excellence in Innovation Award

Management Team is Ex-Oracle


Locations: Headquartered in Atlanta; regional office in Washington, D.C.

~250 employees with 10+ years of Oracle experience on average


Inc. 500|5000 Fastest-Growing Private Company in the U.S. for the 6th time
Voted Best Place to Work in Atlanta for the 2nd year
33 Oracle Specializations spanning the entire stack

Offshore: Hyderabad and Chennai, India

PAGE 3

About BIAS Corporation

PAGE 4

Oracle created the OPN Specialized Program to showcase the Oracle partners who have achieved expertise in Oracle product areas and reached
specialization status through competency development, business results, expertise and proven success. BIAS is proud to be specialized in 33 areas of
Oracle products, which include the following:

Compliance
Incident Alerting vs. Long Term Incident Resolution
Monitoring Overload
Monitoring like it's 1999 (shell and cron)
Communication and coordination among various teams
Prioritization across various application teams

PAGE 5

Problems faced with Large Oracle footprints

Business Impact
Availability
Security
Risk
Operating costs
Manpower
SLAs
Efficiency

PAGE 6

Problems faced with Large Oracle footprints - Business Impact

Environment summary: 70 production and 150 non-production databases
Time spent in month 1: 100 to 110 hours per week
Total alerts: about 350 to 400 per week
Inefficient database monitoring

PAGE 7

Day One Common issues

PAGE 8

Day One Common issues


Category                       Avg # of hours spent per week
Tablespace Monitoring          40
Datafile sizing                20
Multiple Monitoring systems    20
ASMLIB Change Management       15
Database manual start/stop     10

This workload collectively represents Two and a Half FTEs!
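
The FTE figure follows from the weekly hours listed above. A minimal Python sketch of that arithmetic, assuming one FTE corresponds to a 40-hour week (the slides do not state the conversion factor, so `HOURS_PER_FTE` is an assumption):

```python
# Weekly hours per category, taken from the "Day One Common issues" table.
weekly_hours = {
    "Tablespace Monitoring": 40,
    "Datafile sizing": 20,
    "Multiple Monitoring systems": 20,
    "ASMLIB Change Management": 15,
    "Database manual start/stop": 10,
}

HOURS_PER_FTE = 40  # assumption: one full-time equivalent works 40 hours a week


def total_hours(hours_by_category):
    """Sum the weekly hours across all categories."""
    return sum(hours_by_category.values())


def fte_equivalent(hours, hours_per_fte=HOURS_PER_FTE):
    """Convert weekly hours into full-time equivalents."""
    return hours / hours_per_fte


total = total_hours(weekly_hours)
print(total, fte_equivalent(total))
```

Under that assumption the categories add up to 105 hours per week, which sits inside the 100 to 110 hours reported for month 1 and works out to roughly two and a half FTEs.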

Issue: A lot of time was spent troubleshooting tablespace alerts caused by a minor mistake in the monitoring script
How a small change in the script can make a huge difference in saving time and money
Alert!!!
host1:orcl:Cluster Database : Critical : Tablespace SYSTEM is 100 percent full

PAGE 9

Best Practices: Tablespace Monitoring improvements

EXAMPLE
-- TOTAL_SPACE here is the space currently allocated to the datafiles (BYTES),
-- so PERCENT_USED ignores any room the files still have to autoextend.
SELECT F.TABLESPACE_NAME,
       TO_CHAR((T.TOTAL_SPACE - F.FREE_SPACE), '999,999') "USED_MB",
       TO_CHAR(T.TOTAL_SPACE, '999,999') "TOTAL_MB",
       TO_CHAR((ROUND(((T.TOTAL_SPACE - F.FREE_SPACE) / T.TOTAL_SPACE) * 100)), '999') PERCENT_USED
  FROM (SELECT TABLESPACE_NAME,
               ROUND(SUM(BLOCKS * (SELECT VALUE / 1024
                                     FROM V$PARAMETER
                                    WHERE NAME = 'db_block_size') / 1024)) FREE_SPACE
          FROM DBA_FREE_SPACE
         GROUP BY TABLESPACE_NAME) F,
       (SELECT TABLESPACE_NAME,
               ROUND(SUM(BYTES / 1048576)) TOTAL_SPACE
          FROM DBA_DATA_FILES
         GROUP BY TABLESPACE_NAME) T
 WHERE F.TABLESPACE_NAME = T.TABLESPACE_NAME
   AND F.TABLESPACE_NAME = 'SYSTEM';

TABLESPACE_NAME  USED_MB  TOTAL_MB  PERCENT_USED
---------------  -------  --------  ------------
SYSTEM             1,382     1,390            99

PAGE 10

Best Practices: Tablespace Monitoring improvements

EXAMPLE
-- ALLOC_MB is the space currently allocated to the datafiles.
-- TOTAL_SPACE counts MAXBYTES for autoextensible datafiles, so PERCENT_USED
-- is measured against the size the tablespace can grow to.
SELECT F.TABLESPACE_NAME,
       TO_CHAR((T.ALLOC_MB - F.FREE_SPACE), '999,999') "USED_MB",
       TO_CHAR(T.TOTAL_SPACE, '999,999') "TOTAL_MB",
       TO_CHAR((ROUND(((T.ALLOC_MB - F.FREE_SPACE) / T.TOTAL_SPACE) * 100)), '999') PERCENT_USED
  FROM (SELECT TABLESPACE_NAME,
               ROUND(SUM(BLOCKS * (SELECT VALUE / 1024
                                     FROM V$PARAMETER
                                    WHERE NAME = 'db_block_size') / 1024)) FREE_SPACE
          FROM DBA_FREE_SPACE
         GROUP BY TABLESPACE_NAME) F,
       (SELECT TABLESPACE_NAME,
               SUM(BYTES) / 1048576 "ALLOC_MB",
               ROUND(SUM((CASE WHEN AUTOEXTENSIBLE = 'YES'
                               THEN GREATEST(BYTES, MAXBYTES)
                               ELSE BYTES
                          END) / 1048576)) TOTAL_SPACE
          FROM DBA_DATA_FILES
         GROUP BY TABLESPACE_NAME) T
 WHERE F.TABLESPACE_NAME = T.TABLESPACE_NAME
   AND F.TABLESPACE_NAME = 'SYSTEM';

TABLESPACE_NAME  USED_MB  TOTAL_MB  PERCENT_USED
---------------  -------  --------  ------------
SYSTEM             1,382    32,768             4
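
The difference between the two outputs comes down to which capacity the used space is divided by. A minimal Python sketch of that calculation, using the figures from the two outputs; the function names are illustrative and not part of the original script:

```python
def effective_capacity(allocated, max_size, autoextensible):
    """Capacity to measure against: the autoextend ceiling when the file
    can grow, otherwise the space currently allocated."""
    return max(allocated, max_size) if autoextensible else allocated


def percent_used(used_mb, capacity_mb):
    """Used space as a whole-number percentage of capacity."""
    return round(used_mb / capacity_mb * 100)


used_mb = 1382        # USED_MB in both outputs
allocated_mb = 1390   # TOTAL_MB when only allocated space is counted
max_mb = 32768        # TOTAL_MB when the autoextend maximum is counted

# Measured against allocated space, the tablespace looks almost full.
print(percent_used(used_mb, effective_capacity(allocated_mb, max_mb, False)))

# Measured against the autoextend maximum, it is nearly empty.
print(percent_used(used_mb, effective_capacity(allocated_mb, max_mb, True)))
```

The same 1,382 MB of used space reads as 99 percent against the allocated size and 4 percent against the autoextend maximum, which is why the first form of the check raises critical alerts that need no action.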

PAGE 11

Best Practices: Tablespace Monitoring improvements

EXAMPLE

Resolution: Deploying the upgraded script cut monitoring time by more than 50%
Business Impact: Lower operating costs and fewer resources needed for this monitoring going forward

PAGE 12

Best Practices: Tablespace Monitoring improvements

Issue: Improper sizing of data files leads to increased storage costs for the company
Incorrect sizing during deployment of new databases
"Let's get it done" attitude
Impacts other databases residing on the same server

PAGE 13

Best Practices: Datafile sizing strategy

EXAMPLE 1
Inefficient method of sizing a datafile
Autoextensible YES vs. NO

PAGE 14

Best Practices: Datafile sizing strategy

EXAMPLE 2
Efficient method of sizing a datafile

Tip: Set MAXSIZE to an appropriate value when using the autoextend option

PAGE 15

Best Practices: Datafile sizing strategy

Resolution:
Sizing: Generic vs. Application Specific
OEM can be helpful: use Information Publisher growth reports at regular intervals
Business Impact:
This approach reduces additional storage costs and improves the efficiency of database monitoring

PAGE 16

Best Practices: Datafile sizing strategy

Issue: Too many monitoring systems, improper handling, and a bad monitoring strategy for Oracle Enterprise Manager (OEM)
DBA team spending unnecessary time fielding the same alert multiple times

PAGE 17

Best Practices: Leveraging OEM

Resolution:
Importance of OEM
Host and Database validation
Migration of crontab scripts
Improve metric thresholds and review often
Business Impact: This provides a more efficient way of monitoring large Oracle footprints

PAGE 18

Best Practices: Leveraging OEM

Issue: Availability issues for RAC databases after updates of OS kernel versions
About ASMLIB drivers
Role of ASMLIB with respect to Linux
Business Impact
Production
Non Production
Improvements with Oracle Linux 6

PAGE 19

Best Practices: ASMLIB Change Management

PAGE 20

Deep Dive: ASMLIB Issue - Existing workflow

Simplified process streamlines the upgrade process and reduces risk!

PAGE 21

Deep Dive: ASMLIB Issue - Proposed workflow

Resolution: Avoid involving multiple parties and cut down the time it takes for the databases to become available to the application teams
The oracleasm kernel driver is built into the Unbreakable Enterprise Kernel for Oracle Linux 6 and does not need to be installed manually
Business Impact: This approach helps to maintain SLAs and reduces manual intervention from the DBA team

PAGE 22

Deep Dive: ASMLIB Issue - Proposed workflow

Issue: Databases unavailable for longer durations after servers are bounced
RAC vs. Non-RAC databases
Manual intervention was needed for some databases
Setup of scripts on standalone databases

PAGE 23

Best Practice: Automate Database Startup and Shutdown

Some caveats
dbstart, dbshut and dbora scripts should always be executable
Environment variables must be set properly
Don't forget to set the right entries in oratab
Be cautious while editing the startup/shutdown scripts
Resolution: After fixing the scripts, we improved the availability of the databases and reduced operating costs as well
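
The oratab caveat can be checked mechanically. A minimal Python sketch, assuming the standard oratab layout of `SID:ORACLE_HOME:Y|N` with `#` comment lines, where dbstart only brings up entries flagged `Y`; the sample SIDs and paths are hypothetical:

```python
def parse_oratab(text):
    """Return (sid, oracle_home, auto_start) tuples from oratab contents."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(":")
        if len(parts) < 3:
            continue  # skip malformed entries
        sid, home, flag = parts[0], parts[1], parts[2].strip().upper()
        entries.append((sid, home, flag == "Y"))
    return entries


def not_auto_started(entries):
    """SIDs that dbstart will skip because their flag is not Y."""
    return [sid for sid, _home, auto in entries if not auto]


# Hypothetical oratab contents for illustration only.
sample = """\
# sample oratab entries
orcl:/u01/app/oracle/product/11.2.0/dbhome_1:Y
test:/u01/app/oracle/product/11.2.0/dbhome_1:N
"""

print(not_auto_started(parse_oratab(sample)))
```

Running a check like this across standalone servers gives a quick list of databases that would still need a manual start after a reboot.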

PAGE 24

Best Practice: Automate Database Startup and Shutdown

Oracle Restart is available starting from 11.2
dbstart and dbshut scripts until 11g
Oracle Restart feature from 11gR2 onward

PAGE 25

Best Practice: Automate Database Startup and Shutdown

Oracle Restart feature (starting from 11gR2): Benefits
No scripts to deal with
CRSCTL is easy to use
The order of startups and shutdowns can be handled

PAGE 26

Best Practice: Automate Database Startup and Shutdown

Database servers not up to date with the latest patch set updates
Security has been a serious concern in the last few years
PSU: Not just fixes security vulnerabilities, but also increases database efficiency
Instance-specific issues, functionality issues, regression testing performed

PAGE 27

Patching Management: Compliance

Successful management of DB patching activity: be compliant and perform quarterly PSU patching
Business Impact: Reduces the risk of attacks from external sources and thereby increases security on the database front

PAGE 28

Patching Management: Compliance

Keep monitoring fast-growing tables and indexes
Index rebuilding and table shrinking should be done often
Recycle bin purging has to be done to reclaim free space
Storage availability and costs
Local
Backups
Tape
Clones
Business Impact : Decreases storage costs

PAGE 29

Data Purging strategy

Category                       Business Objective Achieved
Tablespace Monitoring          Decrease in operating costs and resources
Datafile sizing                Reducing storage costs
Multiple Monitoring systems    Improves efficiency of database monitoring
ASMLIB change management       Maintain SLAs and reduce manual intervention
Database manual start/stop     Increase high availability
Patching Management            Reduces risk and improves security
Data purging strategy          Decreases storage costs

PAGE 30

Summary - Takeaways

Category                       Avg # of hours spent per week
Tablespace Monitoring          20
Datafile sizing
Multiple Monitoring systems
ASMLIB Change Management
Database manual start/stop

FTEs Reduced from 2 to 1!

PAGE 31

Month 6: Best Practices Implemented

Time spent in month 6: 40 to 45 hours per week (reduced from 100 to 110 hours per week)
Total alerts: about 80 to 100 per week (reduced from 350 to 400 per week)
Increased company's performance and profitability
Strategy: Proactive vs. Reactive
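
The month 1 and month 6 figures can be turned into percentage reductions. A minimal Python sketch, assuming the midpoint of each reported range is representative (the slides give only the ranges):

```python
def midpoint(low, high):
    """Midpoint of a reported range."""
    return (low + high) / 2


def pct_reduction(before, after):
    """Percentage drop from before to after."""
    return (before - after) / before * 100


# Weekly hours: 100 to 110 in month 1, 40 to 45 in month 6.
hours_before = midpoint(100, 110)
hours_after = midpoint(40, 45)

# Weekly alerts: 350 to 400 in month 1, 80 to 100 in month 6.
alerts_before = midpoint(350, 400)
alerts_after = midpoint(80, 100)

print(round(pct_reduction(hours_before, hours_after), 1))
print(round(pct_reduction(alerts_before, alerts_after), 1))
```

On those midpoints, weekly hours fall by roughly 60 percent and weekly alerts by roughly three quarters.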

PAGE 32

Month 6: Best Practices Implemented

Contact info:

Sri Vuyyuru
Email: sri.vuyyuru@biascorp.com
LinkedIn: www.linkedin.com/in/srivuyyuru
Work: +1 770-685-6283
Cell: +1 404-398-5360

PAGE 33

QUESTIONS
