Dhananjay D.
ITIL Certified, Salesforce Certified Developer & Advance Administrator
The first thing to check in an AWR report is the "DB Time" metric. If it is far higher
than the elapsed time, it indicates that many sessions were active, often waiting for something.
Next, check the Instance Efficiency Percentages, which should ideally be above 90%.
Then come the Shared Pool Statistics.
The shared pool memory usage should be on the lower side.
Next, look at the Top 5 Timed Events table.
This shows the most significant waits contributing to DB Time.
Then the SQL Statistics can be checked.
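The DB Time check above comes down to one division. Here is a minimal sketch, assuming both values are read from the report header in minutes; the function name and the sample numbers are made up for illustration. DB Time divided by elapsed time gives the average number of sessions that were active during the interval.

```python
def average_active_sessions(db_time_min: float, elapsed_min: float) -> float:
    """Return DB Time / elapsed time, i.e. the average number of active sessions."""
    if elapsed_min <= 0:
        raise ValueError("elapsed time must be positive")
    return db_time_min / elapsed_min


# Hypothetical header values: 60 minutes elapsed, 480 minutes of DB Time.
aas = average_active_sessions(db_time_min=480.0, elapsed_min=60.0)
print(f"average active sessions: {aas:.1f}")
```

A value well above 1 only says that several sessions were active at once. Whether they were on CPU or waiting is what the Top 5 Timed Events table answers.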
Arunangsu S.
Product Manager at Tata Consultancy Services
Check for Top SQLs, Top CPU Utilizations, Top Memory reads
Sravanthi N.
Performance Testing and Engineering Architect
Please go through this link: http://prezi.com/glqm9zemzhup/interpreting-awr-reportstraight-to-the-goal/
It is a very helpful link that explains the terms used in AWR reports.
Amelia Monica D.
Senior Database Administrator (SME - Database Performance Tuning)
When it comes to performance issues, there are two places to look:
a. Inside the Database
b. Outside the Database
Inside the Database
=================
Performance at one particular period of time.
1. SGA and PGA Memory Advisory - gives a summary of the present SGA and
PGA values.
2. Segments by Logical Reads / Physical Reads / Row Lock Waits / ITL Waits.
3. File I/O Stats - Av Rd(ms) should be below 20. Values between 20 and 40 are usually OK.
Outside the Database
=================
OSWatcher will also help to troubleshoot problems outside the database. It bundles a
collection of UNIX utilities (top, ps, vmstat, iostat) and produces graphs (a picture
is worth a thousand words).
By correlating the OS logs with the database AWR report, we can resolve the database
performance issue.
Please go through the link below. Also use the Active Session History report to see
which sessions were active.
http://extraoracle.blogspot.in/2013/05/10-steps-to-analyze-awr-report-in-oracle.html
You can also run the query below against the Oracle DB to find which SQL IDs are taking
the most time to execute:
-- Count ASH samples per SQL_ID over the last 60 minutes, split into CPU, IO and other waits.
WITH sql_class AS
 (SELECT sql_id, state, COUNT(*) occur
    FROM (SELECT sql_id,
                 CASE WHEN session_state = 'ON CPU' THEN 'CPU'
                      WHEN session_state = 'WAITING' AND wait_class IN ('User I/O') THEN 'IO'
                      ELSE 'WAIT'
                 END state
            FROM v$active_session_history
           WHERE session_type IN ('FOREGROUND')
             AND sample_time BETWEEN TRUNC(sysdate, 'MI') - 60/24/60 AND TRUNC(sysdate, 'MI'))
   GROUP BY sql_id, state),
-- Rank each SQL_ID by its total number of samples (most active first).
ranked_sqls AS
 (SELECT sql_id, SUM(occur) sql_occur, RANK() OVER (ORDER BY SUM(occur) DESC) xrank
    FROM sql_class
   GROUP BY sql_id)
SELECT sc.sql_id, state, occur
  FROM sql_class sc, ranked_sqls rs
 WHERE rs.sql_id = sc.sql_id
 -- The ORDER BY is an addition here so that the busiest SQL_IDs appear first.
 ORDER BY rs.xrank, sc.sql_id, state;
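To see what the query is doing without a database at hand, here is a rough Python sketch of the same classify-and-rank logic. It is only an illustration: the sample rows and function names are invented, and ties are broken by first appearance rather than by Oracle's RANK().

```python
from collections import Counter


def classify(session_state, wait_class):
    """Mirror the CASE expression: CPU, IO (User I/O waits) or WAIT (everything else)."""
    if session_state == "ON CPU":
        return "CPU"
    if session_state == "WAITING" and wait_class == "User I/O":
        return "IO"
    return "WAIT"


def rank_sql(samples):
    """samples: iterable of (sql_id, session_state, wait_class) rows from ASH."""
    per_state = Counter()  # samples per (sql_id, state), like sql_class
    per_sql = Counter()    # samples per sql_id, like ranked_sqls
    for sql_id, session_state, wait_class in samples:
        per_state[(sql_id, classify(session_state, wait_class))] += 1
        per_sql[sql_id] += 1
    busiest_first = [sql_id for sql_id, _ in per_sql.most_common()]
    return [
        (sql_id, state, per_state[(sql_id, state)])
        for sql_id in busiest_first
        for state in ("CPU", "IO", "WAIT")
        if per_state[(sql_id, state)]
    ]


# Made-up ASH samples for two hypothetical SQL IDs.
rows = [
    ("a1", "ON CPU", None),
    ("a1", "WAITING", "User I/O"),
    ("b2", "WAITING", "Concurrency"),
    ("a1", "ON CPU", None),
]
for line in rank_sql(rows):
    print(line)
```

Each ASH sample stands for roughly one second of activity, so the counts can be read as approximate seconds spent in each state.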
AWR Report
I think as performance testers we should know the basics of analyzing an AWR report. In
this post I am going to explain the basic analysis of an AWR report.
AWR: Automatic Workload Repository
A very crucial part of the AWR report is SQL Statistics, which has the details of the top SQL queries
executed during the report interval.
By default AWR snapshots are taken hourly, so a report normally covers at least one hour
unless the snapshot interval is shortened or manual snapshots are taken. Instead of having one
report for a long period, like one report for 4 hours, it is better to have four reports, each for
one hour. The reason is that a long-duration report gives the average of all counters over that
period, which is not very useful.
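The averaging effect described above is easy to see with a few made-up numbers. The sketch below assumes four equal-length hourly reports, so the simple mean stands in for what a single 4-hour report would show; the values and the function name are hypothetical.

```python
def summarize(hourly_values):
    """Return (average over the whole period, worst single hour)."""
    average = sum(hourly_values) / len(hourly_values)
    return average, max(hourly_values)


# Hypothetical "DB CPU(s)" per second for four consecutive one-hour reports.
hourly_db_cpu = [2.0, 2.5, 11.5, 2.0]
average, peak = summarize(hourly_db_cpu)
print(f"4-hour view: {average}, worst hour: {peak}")
```

The third hour is clearly the problem, but in the 4-hour view it is smoothed into a number that looks harmless.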
It's always good to have two AWR reports: one from a good period (when the database was
performing well) and a second from when performance was poor. This way we can easily compare
the good and bad reports to find the culprit.
The first, top part of the AWR report shows the database details. In this part, cross-check the
database name, instance and database version against the database that has the performance
issue. This part also shows RAC=YES if it's a RAC database.
"DB CPU(s)" per second: First let's understand how DB CPU works. Suppose you
have 12 cores in the system. Then per wall clock second you have 12 seconds of CPU time to
work with.
So, if "DB CPU(s)" per second in this report > cores in Host Configuration, the environment is CPU
bound, and you either need more CPUs or need to check further whether this is happening all the
time or just for a fraction of the time. In my experience there are very few cases where the
system is CPU bound.
Ex: the machine has 12 cores and DB CPU(s) per second is 6.8. So, this is not a CPU bound case.
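The CPU bound check above is simple arithmetic, sketched here with the numbers from the example (12 cores, 6.8 DB CPU seconds per second). The function names are made up for illustration.

```python
def is_cpu_bound(db_cpu_per_sec: float, cores: int) -> bool:
    """Apply the rule from the text: CPU bound when DB CPU(s) per second exceeds the core count."""
    return db_cpu_per_sec > cores


def cpu_used_pct(db_cpu_per_sec: float, cores: int) -> float:
    """Share of the available CPU seconds per wall clock second that the database used."""
    return 100.0 * db_cpu_per_sec / cores


print(is_cpu_bound(6.8, 12))
print(f"{cpu_used_pct(6.8, 12):.1f}% of available CPU time")
```

The percentage is often more useful than the yes/no answer, because it shows how much headroom is left before the system gets close to CPU bound.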
Parses and Hard parses: If the ratio of hard parses to parses is high, the database is
performing too many hard parses. In that case look at parameters like cursor_sharing and at the
application level for use of bind variables, etc.
Top 5 Timed Foreground Events: First check the wait class. If the wait class is User I/O,
System I/O, Other, etc., this could be fine, but if the wait class is "Concurrency" then there
could be a serious problem. Next look at Time (s), which shows the total time the DB
spent waiting on the event, and then at Avg Wait (ms). If Time (s) is high but Avg Wait (ms) is
low, you can ignore it. If both are high, or Avg Wait (ms) is high, this has to be investigated
further.
SQL Ordered by Elapsed Time: Look for a query that has low executions and high Elapsed Time
per Exec (s); such a query could be a candidate for troubleshooting or optimization. Ex: a
query can have the maximum elapsed time but no executions. So you have to investigate
this.
An important point: if executions is 0, it doesn't mean the
query is not executing. This might be the case when the query was still executing when the
AWR snapshot was taken, which is why its completion was not covered in the report.
If a particular query is not performing well, I would suggest looking at the execution plan of the
query, the stats of the underlying tables, etc. In this case AWR won't help much.
2. Stick to Particular Time: "Database is performing slow" does not help to resolve
performance issues. We need a specific time, like "the database was slow yesterday from 1 PM
until 4 PM". Here, the DBA will get a report for these three hours.
3. Split Large AWR Report into Smaller Reports: Instead of having one report for a long
period, like one report for 4 hours, it is better to have four reports, each for one hour. This will
help to isolate the problem.
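As a small illustration of points 2 and 3, here is a sketch that splits a reported problem window into one-hour ranges, one per AWR report. The date and the function name are made up; the 1 PM to 4 PM window is the one from the example above.

```python
from datetime import datetime, timedelta


def hourly_windows(start: datetime, end: datetime):
    """Split [start, end) into consecutive windows of at most one hour each."""
    windows = []
    current = start
    while current < end:
        nxt = min(current + timedelta(hours=1), end)
        windows.append((current, nxt))
        current = nxt
    return windows


# Hypothetical incident: slow from 1 PM until 4 PM.
for begin, finish in hourly_windows(datetime(2024, 1, 1, 13, 0), datetime(2024, 1, 1, 16, 0)):
    print(begin.strftime("%H:%M"), "to", finish.strftime("%H:%M"))
```

Each printed range would then be matched to the begin and end snapshots closest to it when generating the report.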
In a RAC environment, generate one report for each instance. Once you have generated the AWR
report, it's time to analyze it. The AWR report is huge, and the area to look into
depends on the problem. Here I list the most common areas for a DBA to look into, which
will give a clear picture of the issue.
2. Host Configuration:
This gives you the host name, platform, CPUs, sockets, RAM, etc. The important thing to notice is
the number of cores in the system. In this example there are 12 cores.
4. Load Profile:
Here are a few important stats for a DBA to look into. First is "DB CPU(s)" per second. Before that let's
understand how DB CPU works. Suppose you have 12 cores in the system. Then per wall
clock second you have 12 seconds of CPU time to work with.
So, if "DB CPU(s)" per second in this report > cores in Host Configuration (#2),
the environment is CPU bound, and you either need more CPUs or need to check further whether this
is happening all the time or just for a fraction of the time. In my experience there are very few
cases where the system is CPU bound.
In this case, the machine has 12 cores and DB CPU(s) per second is 6.8. So, this is not a CPU
bound case.
The next stats to look at are Parses and Hard parses. If the ratio of hard parses to parses is high,
the database is performing too many hard parses. In that case look at parameters like cursor_sharing
and at the application level for use of bind variables, etc.
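The hard parse ratio is just the two Load Profile rates divided. A minimal sketch with made-up numbers follows; the function name is hypothetical, and since the text gives no cut-off for what counts as "high", none is hard-coded here.

```python
def hard_parse_ratio(parses_per_sec: float, hard_parses_per_sec: float) -> float:
    """Fraction of all parses that were hard parses."""
    if parses_per_sec <= 0:
        raise ValueError("parses per second must be positive")
    return hard_parses_per_sec / parses_per_sec


# Hypothetical Load Profile values: 200 parses/s, of which 50 are hard parses.
ratio = hard_parse_ratio(parses_per_sec=200.0, hard_parses_per_sec=50.0)
print(f"{ratio:.0%} of parses are hard parses")
```

Comparing this ratio between a good and a bad report is usually more telling than judging a single value in isolation.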
In Top 5 Timed Foreground Events, first check the wait class. If the wait class is User I/O,
System I/O, Other, etc., this could be fine, but if the wait class is "Concurrency" then there could
be a serious problem. Next look at Time (s), which shows the total time the DB spent waiting on the
event, and then at Avg Wait (ms). If Time (s) is high but Avg Wait (ms) is low, you can ignore it.
If both are high, or Avg Wait (ms) is high, this has to be investigated further.
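The triage rule above can be written down as a small decision function. This is only a sketch of the rule as stated: the text does not say what counts as "high", so both thresholds are parameters the caller must choose, and the values used in the example are invented.

```python
def triage(wait_class, time_s, avg_wait_ms, *, time_s_high, avg_wait_ms_high):
    """Classify one Top 5 event following the rule of thumb in the text."""
    if wait_class == "Concurrency":
        return "investigate: concurrency wait class"
    if avg_wait_ms >= avg_wait_ms_high:
        return "investigate: high average wait"
    if time_s >= time_s_high:
        return "ignore: high total time but fast individual waits"
    return "ignore: low total time and fast waits"


# Hypothetical thresholds and events, for illustration only.
limits = dict(time_s_high=1000.0, avg_wait_ms_high=20.0)
print(triage("User I/O", 5000.0, 2.0, **limits))
print(triage("Configuration", 8000.0, 450.0, **limits))
print(triage("Concurrency", 10.0, 1.0, **limits))
```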
In this example, most of the DB time is taken by DB CPU (64% of DB time). DB CPU taking most of
the time is a normal situation.
Let's take an example in which the event is "log file switch (checkpoint incomplete)", which has
high waits, a huge Time (s) and a large Avg Wait (ms), and the wait class is Configuration. Here
you have to investigate and resolve log file switch (checkpoint incomplete).
Host CPU, Instance CPU and Memory Statistics are self-explanatory. Next is RAC Statistics; I did not
find any issue in these stats most of the time.
This report shows the system was 62% and 70% idle at the time the report was taken, so there is no
resource crunch at the system level. But if you find a very high busy, user or sys %, which in turn
leads to a low idle %, investigate what is causing it. OS Watcher is a tool which can help in this direction.
Next, a very crucial part of the AWR report for a DBA is SQL Statistics, which has the details of the
top SQL queries executed during the report interval.
We will explore a few of them to understand how to analyze these reports. Let's start with
SQL Ordered by Elapsed Time.
In this report, look for a query that has low executions and high Elapsed Time per Exec (s); such a
query could be a candidate for troubleshooting or optimization. In this example the first query has
the maximum elapsed time but no executions. So you have to investigate this.
An important point: if executions is 0, it doesn't mean the query is not executing. This might be the
case when the query was still executing when the AWR snapshot was taken, which is why its completion
was not covered in the report.
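The executions = 0 case matters as soon as you compute elapsed time per execution yourself. A small sketch, with a made-up function name and numbers, that treats 0 executions as "still running" instead of dividing by zero:

```python
from typing import Optional


def elapsed_per_exec(elapsed_s: float, executions: int) -> Optional[float]:
    """Elapsed seconds per execution, or None when no execution completed in the interval."""
    if executions == 0:
        # The statement accumulated time but did not finish between the two snapshots.
        return None
    return elapsed_s / executions


print(elapsed_per_exec(1200.0, 4))
print(elapsed_per_exec(1200.0, 0))
```

A None here is a hint to check the next snapshot range, or the currently running sessions, rather than a sign that the query never ran.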
In SQL Ordered by CPU Time, look for the queries using the most CPU time. If a query shows
executions 0, this doesn't mean the query is not executing. It might be the same case as in
SQL Ordered by Elapsed Time: the query was still executing when the snapshot was taken.
There are many other stats in the AWR report which a DBA needs to consider. I have listed
only ten of them, but these are the most commonly used stats for any performance-related information.
See here for more notes on reading a STATSPACK/AWR report. Also try
www.statspackanalyser.com for a sample AWR analysis.
Reading the AWR Report
This section contains detailed guidance for evaluating each section of an AWR report. An
AWR report is very similar to a STATSPACK report, and it contains vital elapsed-time
information on what happened during a particular snapshot range. The data in an AWR or
STATSPACK report is the delta, or change, in the accumulated metrics between the begin and
end snapshots.
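The delta idea can be sketched in a few lines. The example below assumes two snapshots of cumulative counters taken one hour apart; the counter values and the function name are invented for illustration.

```python
def snapshot_delta(begin: dict, end: dict) -> dict:
    """Subtract the begin snapshot from the end snapshot for every shared counter."""
    return {name: end[name] - begin[name] for name in begin if name in end}


# Hypothetical cumulative counters at the begin and end snapshots.
begin_snap = {"physical reads": 1_000_000, "user commits": 50_000}
end_snap = {"physical reads": 1_450_000, "user commits": 53_600}

delta = snapshot_delta(begin_snap, end_snap)
elapsed_s = 3600  # one hour between the two snapshots
per_second = {name: value / elapsed_s for name, value in delta.items()}
print(delta)
print(per_second)
```

The per-second figures are the kind of rate the Load Profile section reports for the snapshot range.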
The main sections in an AWR report include:
Report Summary: This gives an overall summary of the instance during the snapshot period,
and it contains important aggregate summary information.
Cache Sizes (end): This shows the size of each SGA region after AMM has changed them.
This information can be compared to the original init.ora parameters at the end of the AWR
report.
Load Profile: This important section shows key rates expressed per second and per
transaction.
Instance Efficiency Percentages: With a target of 100%, these are high-level ratios for activity
in the SGA.
Shared Pool Statistics: This is a good summary of changes to the shared pool during the
snapshot period.
Top 5 Timed Events: This is the most important section in the AWR report. It shows the top
wait events and can quickly show the overall database bottleneck.
Wait Events Statistics Section: This section shows a breakdown of the main wait events in the
database including foreground and background database wait events as well as time model,
operating system, service, and wait classes statistics.
Wait Events: This AWR report section provides more detailed wait event information for
foreground user processes which includes Top 5 wait events and many other wait events that
occurred during the snapshot interval.
Background Wait Events: This section is relevant to the background process wait events.
Time Model Statistics: Time model statistics report how database-processing time is spent.
This section contains detailed timing information on particular components participating in
database processing.
Operating System Statistics: The stress on the Oracle server is important, and this section
shows the main external resources including I/O, CPU, memory, and network usage.
Service Statistics: The service statistics section gives information about how particular
services configured in the database are operating.
SQL Section: This section displays top SQL, ordered by important SQL execution metrics.
SQL Ordered by Elapsed Time: Includes SQL statements that took significant execution time during
processing.
SQL Ordered by CPU Time: Includes SQL statements that consumed significant CPU time during their processing.
SQL Ordered by Gets: These SQLs performed a high number of logical reads while retrieving data.
SQL Ordered by Reads: These SQLs performed a high number of physical disk reads while retrieving data.
SQL Ordered by Parse Calls: These SQLs experienced a high number of reparsing operations.
SQL Ordered by Sharable Memory: Includes SQL statement cursors which consumed a large amount of
SGA shared pool memory.
SQL Ordered by Version Count: These SQLs have a large number of versions in shared pool for some reason.
Instance Activity Stats: This section contains statistical information describing how the
database operated during the snapshot period.
Instance Activity Stats (Absolute Values): This section contains statistics that have absolute values not derived from
end and start snapshots.
Instance Activity Stats (Thread Activity): This report section reports a log switch activity statistic.
I/O Section: This section shows the all-important I/O activity for the instance, broken down
by tablespace and data file, and includes buffer pool statistics.
Tablespace IO Stats
File IO Stats
Buffer Pool Statistics
Advisory Section: This section shows details of the advisories for the buffer, shared pool,
PGA and Java pool.
Buffer Pool Advisory
PGA Aggr Summary: PGA Aggr Target Stats; PGA Aggr Target Histogram; and PGA Memory Advisory.
Shared Pool Advisory
Java Pool Advisory
Buffer Wait Statistics: This important section shows buffer cache waits statistics.
Enqueue Activity: This important section shows how enqueue operates in the database.
Enqueues are special internal structures which provide concurrent access to various database
resources.
Undo Segment Summary: This section gives a summary about how undo segments are used
by the database.
Undo Segment Stats: This section shows detailed history information about undo segment
activity.
Latch Activity: This section shows details about latch statistics. Latches are a lightweight
serialization mechanism that is used to single-thread access to internal Oracle structures.
Latch Sleep Breakdown
Latch Miss Sources
Parent Latch Statistics
Child Latch Statistics
Segment Section: This report section provides details about hot segments using the following
criteria:
Segments by Logical Reads: Includes top segments which experienced high number of logical reads.
Segments by Physical Reads: Includes top segments which experienced high number of disk physical reads.
Segments by Buffer Busy Waits: These segments have the largest number of buffer waits caused by their data
blocks.
Segments by Row Lock Waits: Includes segments that had a large number of row locks on their data.
Segments by ITL Waits: Includes segments that had high contention for the Interested Transaction List (ITL).
ITL contention can be reduced by increasing the INITRANS storage parameter of the table.
Dictionary Cache Stats: This section exposes details about how the data dictionary cache is
operating.
Library Cache Activity: Includes library cache statistics describing how shared library objects
are managed by Oracle.
SGA Memory Summary: This section provides summary information about various SGA
regions.
init.ora Parameters: This section shows the original init.ora parameters for the instance
during the snapshot period.