
Oracle LogMiner Based Replication

(A detailed discussion on how to use the LogMiner utility to replicate data)

______________________________________________________________________

Table of Contents

LogMiner Overview
    SCN
    Oracle's Redo Log
    LogMiner API
        Add Logfile
        Start Logmnr
        LogMiner View
Using LogMiner for replication
    Extraction Program
        The Dynamic table X$KTUXE
        Calling LogMiner to Extract Data
        Cleaning Up
        Mapping of Object Identifiers to known names
        Notes on Restoring via an import
        Extraction Trick Number 1
            Why you should drop logs
            How to drop unused logs
        Extraction Trick Number 2
            Keep all activity out of the database
            Extracted Data
    Export Process
    Import Process
    Posting Process
Appendix A - LogMiner Replication Flow
Appendix B - Definitions
    Redo Logs
    SCN

______________________________________________________________________

LogMiner Overview
LogMiner is a utility provided by Oracle Corporation to read the contents of the Oracle redo logs and recreate the SQL that was applied to the system. It is a set of APIs that lets a user query the contents of the online redo logs: the redo logs are presented as a view (logical table) in the database, and a user can issue standard SELECT statements to retrieve their contents.

SCN
All changes made to all objects in the database are tracked via the SCN. The SCN (system change number) is a sequential number generator that stamps all physical objects (control files, data files, and redo logs). This is the mechanism Oracle has chosen to enforce database integrity. Not only are the physical objects stamped with the SCN; the transactions recorded in the redo log are stamped as well. The other function of the SCN is backup and recovery, which is outside the scope of this document.

[Figure: An Oracle database instance and its three physical components (redo logs, control files, and data files), each stamped with the SCN. The SCN (System Change Number) is a counter applied to all the physical components; its main purpose is to enforce the integrity of the database. In this example, all physical components are stamped with the SCN value 10001.]
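To see the SCN in action, the current system-wide value can be queried directly. A minimal sketch, assuming Oracle 9i or later and EXECUTE privilege on DBMS_FLASHBACK (run it twice with some activity in between and the second value will be higher):

SELECT dbms_flashback.get_system_change_number AS current_scn FROM dual;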

Oracle's Redo Log


The function of the Oracle online redo log is to record ALL changes to the database. This includes data, indexes, the Oracle catalog, rollback segments, etc. Any and all changes to any data block in the database are recorded in the redo log. The redo logs are set up as a circular queue: as one log fills up, Oracle switches to the next, and once the final log is filled it starts writing to the first log again. Each redo record has an associated SCN value; the SCN is a sequential number stamped on the transaction in order to keep the database consistent.

[Figure: Overview of the redo logs. (1) As changes to the database occur, they are recorded in the redo log buffers, and the log-writer process (LGWR) writes them to the redo log. (2) As a log fills up, writing moves to the next log in the circle (logs 1 through 8). (3, 4) Each log has a starting SCN value that can be queried from the Oracle catalog.]

In the figure above, redo log 2 ends before SCN 9,999; redo log 3 starts at SCN 10,000 and ends at 15,000.

The Oracle database catalog records the starting SCN value for each redo log. These values are easily retrieved with the following query (NOTE: the SCN value is in the FIRST_CHANGE# column):
select group#, sequence#, status, first_change#, first_time from v$log;
The system will return rows similar to the results below. In this example, a total of nine (9) rows are returned. Group #5 is the current redo log being written to, and 48138 is the first SCN value associated with that redo log.

GROUP#  SEQUENCE#  STATUS    FIRST_CHANGE#  FIRST_TIME
------  ---------  --------  -------------  -------------------
     1          1  INACTIVE              1  2004-10-29:11:36:41
     2          2  INACTIVE          21236  2004-10-29:11:40:22
     3          3  INACTIVE          30786  2004-10-29:11:43:33
     4          4  ACTIVE            39148  2004-10-29:11:50:11
     5          5  CURRENT           48138  2004-10-29:11:54:10
     6          0  UNUSED                0
     7          0  UNUSED                0
     8          0  UNUSED                0
     9          0  UNUSED                0

To see the actual name of the file being written to, issue the following query from SQL*Plus:

select * from v$logfile


This returns the following rows:

GROUP#  STATUS  TYPE    MEMBER
------  ------  ------  ----------------------------------------
     1          ONLINE  /u01/oradata/DB01/redo01a.log
     1          ONLINE  /u02/oradata/DB01/redo01b.log
     2          ONLINE  /u02/oradata/DB01/redo02a.log
     2          ONLINE  /u03/oradata/DB01/redo02b.log
     3          ONLINE  /u03/oradata/DB01/redo03a.log
     3          ONLINE  /u01/oradata/DB01/redo03b.log
     4          ONLINE  /u01/oradata/DB01/redo04a.log
     4          ONLINE  /u02/oradata/DB01/redo04b.log
     5          ONLINE  /u02/oradata/DB01/redo05a.log
     5          ONLINE  /u03/oradata/DB01/redo05b.log
     6          ONLINE  /u03/oradata/DB01/redo06a.log
     6          ONLINE  /u01/oradata/DB01/redo06b.log
     7          ONLINE  /u01/oradata/DB01/redo07a.log
     7          ONLINE  /u02/oradata/DB01/redo07b.log
     8          ONLINE  /u02/oradata/DB01/redo08a.log
     8          ONLINE  /u03/oradata/DB01/redo08b.log
     9          ONLINE  /u03/oradata/DB01/redo09a.log
     9          ONLINE  /u01/oradata/DB01/redo09b.log

NOTE: Oracle writes to all members of a redo log group at the same time, so each group in this example has two mirrored files. In the example above, the two files belonging to group 5 are mirrored copies of each other.

For more information on redo logs, please go to:


http://download-west.Oracle.com/docs/cd/B10501_01/server.920/a96521/onlineredo.htm - 3848

LogMiner API
Since the redo logs record all changes to the database, and the LogMiner utility allows a user to mine the changes made to the database in the form of SQL, it is possible to reconstruct all SQL that was applied on a database. Oracle Corp. provides a set of APIs to control the LogMiner utility. The APIs are described below:

Add Logfile

When invoking the LogMiner API, the very first command you must issue is the add logfile command. This instructs LogMiner to attach a specific redo log, or a set of redo logs that you are interested in mining.
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE( -
  2    LOGFILENAME => '/u02/oradata/DB01/redo05a.log', -
  3    OPTIONS => DBMS_LOGMNR.NEW);

In this example, we are tying in the redo log file from the previous discussion on redo logs; in this case, it is our active (current) redo log.
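If more than one redo log needs to be mined, additional files can be attached to the same session with the ADDFILE option; a brief sketch (the file name is taken from the v$logfile listing above):

SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE( -
  2    LOGFILENAME => '/u03/oradata/DB01/redo06a.log', -
  3    OPTIONS => DBMS_LOGMNR.ADDFILE);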

Start Logmnr

The dbms_logmnr.start_logmnr call is the meat and potatoes of the utility. By passing a series of options to the API, you specify a range (either by date or SCN) that you want to process. You must have the redo logs for that SCN range attached; otherwise the utility returns an error.
SYS.DBMS_LOGMNR.START_LOGMNR(
    OPTIONS  => SYS.DBMS_LOGMNR.COMMITTED_DATA_ONLY,
    STARTSCN => 48138,
    ENDSCN   => 48238);

In this example, we are specifying the first SCN that appears in the attached redo log as the starting point. The end value 48238 is arbitrary for this example and will be discussed later.

LogMiner View

After issuing the START_LOGMNR API, the utility creates a view into the redo logs, V$LOGMNR_CONTENTS. This view (or logical table) allows standard SQL statements to be used to query the data.
SELECT a.scn, a.cscn,
       to_char(sysdate,'DD-MON-YYYY HH24:MI:SS'),
       substr(a.sql_redo, 1, length(sql_redo) - 1)
  FROM v$logmnr_contents a
 WHERE a.operation in ('INSERT', 'UPDATE', 'DELETE', 'COMMIT')
   AND (a.username != 'SYS' OR a.username is NULL)
   AND (a.seg_owner = 'seg_owner' or a.seg_owner is null)
   AND a.serial# not in (select serial# from post_stats)
   AND cscn > :hv_WorkingSCN
   AND cscn <= :hv_SYSSCN
   AND a.cscn in (
         SELECT distinct b.cscn
           FROM v$logmnr_contents b
          WHERE b.operation in ('INSERT', 'UPDATE', 'DELETE')
            AND (b.username != 'SYS' OR b.username is NULL)
            AND (b.seg_owner = 'seg_owner' or b.seg_owner is null)
            AND b.serial# not in (select serial# from post_stats)
            AND cscn > :hv_WorkingSCN
            AND cscn <= :hv_SYSSCN )
   AND a.scn < (
         select nvl(min(a.start_scnb), 999999999999)
           from v$transaction a, v$session b
          where a.ses_addr = b.saddr
            and b.username != 'SYS' )
 ORDER BY a.cscn, a.scn;

This is the current production SQL that the ADS replication utility uses to extract SQL.
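Stripped of the replication-specific filters, the core of the extraction is simply a select of SQL_REDO from the view; a minimal sketch (the schema name SCOTT is illustrative):

SELECT scn, operation, sql_redo
  FROM v$logmnr_contents
 WHERE seg_owner = 'SCOTT'
   AND operation IN ('INSERT', 'UPDATE', 'DELETE');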

For more information regarding Oracle's LogMiner utility, please visit:


http://download-west.Oracle.com/docs/cd/B10501_01/server.920/a96521/logminer.htm - 17869

______________________________________________________________________

Using LogMiner for replication:


Extraction Program
The first part of the replication process involves wrapping the above LogMiner APIs in a Pro*C++ program. The program continuously monitors the database for activity and then calls LogMiner to scan the redo logs for committed transactions.
The Dynamic table X$KTUXE

The first thing the replication program needs to know is whether there has been any activity in the system. Whenever there is activity, Oracle stores the system-wide SCN value in a dynamic (non-physical) table called X$KTUXE. The extract program stores this value in a variable and compares it with the value from the last time it ran. If there is a difference, then some kind of database activity has occurred.

[Figure: The variable lastSYSSCN stores the SCN value from the last time database activity took place. If any activity occurs, the value returned by sys.x$ktuxe increases and is stored in a new variable. If the two values are different, then some database activity has occurred and needs to be investigated.]

The process is as follows:


Step 1. Issue the query: select max(ktuxescnw * power(2, 32) + ktuxescnb) from sys.x$ktuxe;
Step 2. Save the value from Step 1 into a variable.
Step 3. Copy the variable from Step 2 into a variable labeled old_variable_name.
Step 4. Re-issue the query from Step 1 (after sleeping for n seconds).
Step 5. Save the query result into a variable.
Step 6. Compare the old value to the new value.
Step 7. If they differ, call LogMiner (something has changed); otherwise sleep n seconds and repeat (see the sketch below).
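The production extractor implements this loop in Pro*C++; as a rough illustration only, the same polling logic can be sketched in PL/SQL. The sketch assumes SELECT access to sys.x$ktuxe, EXECUTE privilege on DBMS_LOCK, and SERVEROUTPUT enabled; the 5-second sleep is an arbitrary choice of n.

-- Minimal polling-loop sketch (not the production code)
DECLARE
  old_scn  NUMBER := 0;
  new_scn  NUMBER;
BEGIN
  LOOP
    -- Steps 1, 4, 5: read the current system-wide SCN from x$ktuxe
    SELECT MAX(ktuxescnw * POWER(2, 32) + ktuxescnb)
      INTO new_scn
      FROM sys.x$ktuxe;

    -- Steps 6, 7: compare the old and new values
    IF new_scn <> old_scn THEN
      DBMS_OUTPUT.PUT_LINE('Activity detected, SCN is now ' || new_scn);
      -- this is the point at which the extractor would call LogMiner
      old_scn := new_scn;
    END IF;

    DBMS_LOCK.SLEEP(5);  -- sleep n seconds and repeat
  END LOOP;
END;
/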

Calling LogMiner to Extract Data

The extract program (lg_ext) senses that some sort of database activity has occurred by comparing the current value from the x$ktuxe table with the value from the last time it ran. Before calling LogMiner, the extract program checks the current system-wide SCN value (from x$ktuxe) against the Oracle catalog (v$log) to see whether the new changes might be on another (new) redo log. If so, it calls the LogMiner ADD_LOGFILE API to attach the new redo log, and then issues the SELECT statement against the LogMiner view.

[Figure: The system-wide SCN from x$ktuxe has advanced to 65,000. Log 5 starts at SCN 48,138 and log 6 at SCN 64,000; the last SCN processed (55,123) falls within log 5. In this example, some activity is now occurring on redo log 6, which we will need to attach to the LogMiner utility.]
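A hedged sketch of that catalog check: any online redo log whose first SCN falls between the last SCN processed and the new system-wide SCN has not yet been attached (the bind variable names are illustrative).

SELECT group#, first_change#
  FROM v$log
 WHERE first_change# >  :last_working_scn
   AND first_change# <= :sys_scn
 ORDER BY first_change#;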

After all required logs have been attached, the extract program calls LogMiner to start scanning for data, from the last SCN we wrote out to the current system-wide SCN.

[Figure: Range to scan. The range runs from LastWorkingSCN (the last time we printed out any committed SQL) up to the starting SCN of any currently running transaction (from v$transaction). If there are no active transactions, the range can extend all the way to SYSSCN (from x$ktuxe).]

Some things to note: the LogMiner view is named V$LOGMNR_CONTENTS; when issuing a query against this view, it is possible to filter the data via the predicate; and, in terms of replication, we are only interested in objects that are owned by the schema owner.

The call to LogMiner is as follows:


Step 1. Call LogMiner with the SCN range from the last time LogMiner extracted information to the current system-wide SCN value (see the sketch after this list).
Step 2. If LogMiner returns rows, print those rows to a queue file; otherwise do nothing and wait for system activity again.
Step 3. If rows were printed, record the last SCN value that was printed to a variable in the program and to disk (every committed transaction has an SCN associated with it).
Step 4. The lg_exp program knows how far into the file it has processed. If the file has grown, it continues to ship the SQL to the target system.
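Putting the pieces together, a hedged sketch of the calls for the example above (the file name comes from the earlier v$logfile listing; 55,123 is the example's LastWorkingSCN and 65,000 the current system-wide SCN):

BEGIN
  -- attach the newly active redo log (log 6) to the existing LogMiner session
  DBMS_LOGMNR.ADD_LOGFILE(
    LOGFILENAME => '/u03/oradata/DB01/redo06a.log',
    OPTIONS     => DBMS_LOGMNR.ADDFILE);

  -- scan committed changes from the last SCN written out up to the current SCN
  DBMS_LOGMNR.START_LOGMNR(
    OPTIONS  => DBMS_LOGMNR.COMMITTED_DATA_ONLY,
    STARTSCN => 55123,
    ENDSCN   => 65000);
END;
/

The extractor then issues the V$LOGMNR_CONTENTS query shown earlier and writes any returned SQL to the queue file.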

Cleaning Up

After rows have been printed, the program compares the last SCN value it processed against the SCN values of the online redo logs. If a redo log is no longer needed, the program calls the LogMiner API to detach it from the session. This is done to minimize the amount of I/O LogMiner performs when extracting data, since LogMiner scans all attached redo logs.

[Figure: "The last work I did was at SCN 64,300; I no longer need log 5." Log 5 starts at SCN 48,138 (and contains SCN 55,123) while log 6 starts at SCN 64,000, so once work has reached 64,300 log 5 can be detached.]
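A hedged sketch of detaching the log that is no longer needed; in the 9.2 API the same ADD_LOGFILE procedure is used with the REMOVEFILE option (the file name is from the earlier v$logfile listing):

BEGIN
  -- detach log 5: everything it contains is below the last processed SCN
  DBMS_LOGMNR.ADD_LOGFILE(
    LOGFILENAME => '/u02/oradata/DB01/redo05a.log',
    OPTIONS     => DBMS_LOGMNR.REMOVEFILE);
END;
/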

Mapping of Object Identifiers to known names

In its internal workings, Oracle uses identifiers (numbers) to work with objects. Whenever an object is created, it is assigned a sequentially generated number that is used internally by the database. In order to extract readable SQL from the database, LogMiner needs to map the object ID to the name of the object. This mapping is done when LogMiner is first started. The following illustrates the concept.

Create a table
create table foobar (foo number(1));

Run This query


select object_id, object_name, object_type from sys.dba_objects where object_name = 'FOOBAR';

Returns the results


 OBJECT_ID OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ ------------------
      6658 FOOBAR                         TABLE

Drop and re-create the table


drop table foobar; create table foobar (foo number(1));

Re-Run this query


select object_id, object_name, object_type from sys.dba_objects where object_name = 'FOOBAR';

Note the change in the object ID


 OBJECT_ID OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ ------------------
      6659 FOOBAR                         TABLE

[Figure: When an object is dropped and recreated, it is assigned a new object ID. On startup, LogMiner maps the object ID to the known name via the Oracle catalog. Here TB_TABLENAME is object ID 12345 before a drop and recreate (via restore) and object ID 43210 afterwards.]

In this example, LogMiner is not stopped. A restore is done, which drops and recreates the objects (with new object IDs). LogMiner only knows that object ID 12345 = TB_TABLENAME; therefore, any changes made after the restore are lost.
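One way to refresh the mapping is simply to end the LogMiner session after the restore and start it again, so the object IDs are re-read from the current online catalog. A hedged sketch using the 9.2 API (the DICT_FROM_ONLINE_CATALOG option makes the dictionary source explicit; the SCN range is omitted for brevity):

BEGIN
  DBMS_LOGMNR.END_LOGMNR;
  DBMS_LOGMNR.START_LOGMNR(
    OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG +
               DBMS_LOGMNR.COMMITTED_DATA_ONLY);
END;
/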

Notes on Restoring via an import

The LogMiner replication program remembers the last SCN at which it did any work. This value is stored in a file, so when the program starts it knows where in the redo log to start scanning. Stopping and starting isn't enough: if you restore from an export and don't re-initialize (re-install) the replication program, all data imported via the restore will be extracted and replicated.

[Figure: "Object ID 43210 = TB_TABLENAME, and the last time I did work was at SCN 55,123." The drop and recreate of the object (via restore) happens at SCN 56,000, changing TB_TABLENAME from object ID 12345 to 43210; all changes in the range from SCN 56,000 onward (for example, up to 58,000) are replicated.]

In this example, the last time replication did any work was at SCN 55,123. On startup, any changes from 55,123 to 56,000 will be ignored (since the object ID in the redo log doesn't match any object in the current catalog), and all changes from 56,000 onward will be replicated. If the export file is also used on the target system, data will be applied twice to the target: once from the export and again from the replicated data originating from the primary.

Extraction Trick Number 1

With LogMiner, it is possible to sustain heavy I/O if you have too many redo logs attached. If possible, only use the currently active redo log for processing; when a redo log is no longer needed, drop it.

Why you should drop logs

When calling the START API, it appears that the LogMiner utility will use every redo log that is currently attached, which creates more I/O and longer scan times.

How to drop unused logs

Query the v$log catalog table, which tells you the first SCN value of each redo log:
SELECT a.first_change#, a.group#, b.member
  FROM SYS.V_$LOG a, SYS.V_$LOGFILE b
 WHERE a.first_change# >= ( SELECT nvl(max(first_change#), 0)
                              FROM sys.v_$log
                             WHERE first_change# < (THE LAST SCN YOU PROCESSED) )
   AND b.group# = a.group#;

Extraction Trick Number 2

Keep all activity out of the database

Extracted Data

In the examples that Oracle provides for LogMiner, the SQL extracted from the redo logs is usually staged into another table. The problems with this are:
a. You have to make sure that the table does not generate redo information.
b. Even if you create the table with NOLOGGING, inserts into any table still increment the SCN.
Therefore, you should always extract data to a flat file, as sketched below.
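As a rough illustration only, the extracted SQL can be written to a flat file from SQL*Plus by spooling the query output instead of inserting it into a staging table (the production extractor writes its queue file from Pro*C++; the spool path is illustrative):

SET HEADING OFF FEEDBACK OFF PAGESIZE 0 LINESIZE 4000 TRIMSPOOL ON
SPOOL /u01/replication/queue/extract.sql
SELECT sql_redo
  FROM v$logmnr_contents
 WHERE operation IN ('INSERT', 'UPDATE', 'DELETE', 'COMMIT');
SPOOL OFF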

Export Process
The export process runs on all replicated nodes and can start independently of all other replication processes. The extractor process, when extracting, writes data to a queue file; when the queue has grown to a certain size, it closes the file and starts writing to a different queue file. The export process continuously watches the size of the queue file. If it has changed, the export process reads the contents of the file in the data directory and transfers them over a TCP socket connection. After receiving confirmation that the information has been received, it increments its recorded offset into the queue.

Import Process
The import process is a simple TCP socket server that accepts connections on a specified port. When a connection is made, it waits for data to be shipped over; when it receives data, it writes the content to a queue file on the target system. The import process can start independently of all other replication processes.

Posting Process
The posting process is a simple C++ program that scans the queue files that have been modified by the import process. The process behaves similarly to the export process, waiting for the data queue file to grow. Once it determines that the queue has changed, the poster prepares the SQL (dynamic SQL must be prepared before it can be executed) and executes the DML, as sketched below. The post process can start independently of all other replication processes.
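The prepare-and-execute step can be illustrated with a hedged PL/SQL sketch (the production poster is a C++ program using dynamic SQL; the statement shown reuses the foobar table from the earlier example and is purely illustrative):

DECLARE
  v_sql VARCHAR2(4000);
BEGIN
  -- v_sql would be populated from the next statement in the data queue file
  v_sql := 'insert into foobar (foo) values (1)';
  EXECUTE IMMEDIATE v_sql;   -- prepare and execute the replicated DML
  COMMIT;
END;
/

______________________________________________________________________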

Appendix A - LogMiner Replication Flow

[Figure: LogMiner replication flow. On the source side, lg_ext (extractor) reads changes from the source DB and writes them to a data queue (with a temp queue); lg_exp (exporter) ships the queue contents over IP sockets to lg_imp (importer) on the target side, which writes them to the target data queue (with its own temp queue); lg_post (poster) then reads that queue and applies the SQL to the target DB.]

______________________________________________________________________

Appendix B - Definitions
Redo Logs
Every Oracle database has a set of two or more redo log files. The set of redo log files is collectively known as the redo log for the database. A redo log is made up of redo entries (also called redo records). The primary function of the redo log is to record all changes made to data. If a failure prevents modified data from being permanently written to the datafiles, then the changes can be obtained from the redo log, so work is never lost. To protect against a failure involving the redo log itself, Oracle allows a multiplexed redo log so that two or more copies of the redo log can be maintained on different disks. The information in a redo log file is used only to recover the database from a system or media failure that prevents database data from being written to the datafiles. For example, if an unexpected power outage terminates database operation, then data in memory cannot be written to the datafiles, and the data is lost. However, lost data can be recovered when the database is opened, after power is restored. By applying the information in the most recent redo log files to the database datafiles, Oracle restores the database to the time at which the power failure occurred.

SCN
The SCN (System Change Number) is a system-wide sequential counter that is used to synchronize all physical components of the Oracle database. Each physical component (datafile headers, control files, redo logs, and redo log entries) records the SCN. Making sure all physical components have the same SCN ensures the integrity of the physical structure of the database, which in turn ensures the integrity of the data. For example, the database compares the SCN of the control files to the SCN of the datafiles; if the SCNs aren't the same, the database assumes that the datafiles were restored from a previous backup and prompts that further recovery is needed.

Oracle 9i Database Concepts Chapter 1
