Zero Data Loss Recovery Appliance Deep Dive:
Direct from Development
Timothy Chien
Principal Product Manager
Oracle
Raymond Guzman
Consultant Member of Technical Staff
Oracle
Fernando Simon
DBA Manager
Brazilian Justice Tribunal of Santa Catarina
Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
Agenda
1 Today’s Backup & Recovery Challenges
2 Recovery Appliance Architecture
3 Virtual Full Backup & Real-Time Redo Transport
4 Tape Backup & Replication
5 Managing Time and Space
6 Customer Case Studies
7 Summary / Q&A



All-Purpose Storage HW & SW Not Designed for Database
Treat Databases as Just Files to Periodically Copy

• Data Loss Exposure: Lose all data since last backup
• Daily Backup Window: Large performance impact on production
• Poor Database Recoverability: Many files are copied but protection state of database is unknown
• Many Systems to Manage: Scale by deploying more backup appliances


Oracle Deduplication & Validation Challenges
Understanding the backup stream is critical to identifying redundant data

1. File System Data: File system data format does not change in the backup process
– Ideal for storage deduplication technologies
– Deduplication claims of 10 – 50x
2. Oracle Database (RMAN Backups): RMAN passes a pre-packaged backup set
– Backup stream / block format is Oracle proprietary
– RMAN backup stream is largely opaque to external backup applications and storage
– Full backups (production overhead) needed for dedupe
– Minimal deduplication (< 6x) vs. file system data
3. Storage technologies cannot validate Oracle blocks nor interpret transactional change data
Fundamentally Different Approach to Protect Business Critical Database Data
Zero Data Loss Recovery Appliance


Recovery Appliance Unique Benefits for Business and I.T.

• Eliminate Data Loss: Real-time redo transport provides instant protection of ongoing transactions
• Minimal Impact Backups: Production databases only send changes. All backup and tape processing offloaded
• Database Level Recoverability: End-to-end reliability, visibility, and control of databases - not disjoint files
• Cloud-Scale Protection: Easily protect all databases in the data center using massively scalable service


Zero Data Loss Recovery Appliance Overview

Delta Push
• DBs access and send only changes
• Minimal impact on production
• Real-time redo transport instantly protects ongoing transactions

Protects all DBs in Data Center
• Petabytes of data
• Oracle 10.2-12c, any platform
• No expensive DB backup agents

Delta Store
• Stores validated, compressed DB changes on disk
• Fast restores to any point-in-time using deltas
• Built on Exadata scaling and resilience
• Enterprise Manager end-to-end control

[Diagram labels: Protected Databases, Recovery Appliance, Offloads Tape Backup, Replicates to Remote Recovery Appliance]
Key Architecture Components
• Based on Exadata X5-2 HC with embedded 2-node RAC+ASM database
– Serves as central RMAN Recovery Catalog, storing all backup metadata
• Delta Store ‘backup data’ configured in separate ASM disk group, stored using compressed format
• Pre-bundled Oracle Secure Backup tape software
– 16Gb QLogic Fiber Cards supported for tape connectivity
• RMAN + Recovery Appliance Backup Module on protected databases
• EM Cloud Control 12c Release 4 + Recovery Appliance plug-in

[Diagram labels: Protected Databases, Recovery Catalog, Delta Store]




Space-Efficient “Virtual” Full Backups
No More Full Backups: Incremental Forever Architecture

• After one-time full backup, incrementals used to create virtual full database backups on a daily basis
• Pointer-based representation of physical full backup as of incremental backup time
• Virtual backups typically 10x space efficient
• Enables long backup history to be kept with the smallest possible space consumption
• “Time Machine” for database

[Diagram: Protected Databases send Day 0 Full, Day 1 Incr ... Day N Incr to the Delta Store, which presents Day 1 Virtual Full ... Day N Virtual Full]


Backup & Redo (Delta Push) Workflow
Disk/Tape/Replica Backup Lifecycle

[Diagram, redo path: Redo Blocks → Remote File Server Process → Validate → Redo Staging Area → Archived Log Backups → Compress + Write to Disk (Archived Log Backup Sets)]
[Diagram, backup path: Data Blocks → RMAN Incrementals via Recovery Appliance Backup Module → HTTP Servlet → Backup Staging (FLASH) → Validate → Index Blocks (Create Virtual Full) → Compress + Write to Disk (Incremental Backup Sets) in the Delta Store]
[Diagram, lifecycle: Full/Incremental Backup Sets and Archived Log Backup Sets → Tape and Replica Appliance]


Real Time Redo Transport

• Redo transferred from memory buffers on database server – very low impact
• If redo connection is lost, the appliance ‘buttons up’ the received redo and creates a ‘partial archived log backup’
– Preserves recovery until the last change received by the appliance (i.e. 0-1 second RPO)
• When connection is reinstated, gap detection process on the appliance automatically fetches missing archived logs from protected databases

[Diagram labels: Protected Databases, Storage Location, Partial Archived Log Backup, Archived Log Backups, Fetch Missing Archived Logs]
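The gap detection step can be pictured with a short Python sketch. It is only an illustration of the idea, not the appliance's actual implementation: the function name `find_missing_sequences` and the use of plain integers as archived log sequence numbers are assumptions made for this example.

```python
def find_missing_sequences(received, first_expected, last_generated):
    """Return the archived log sequence numbers the appliance has not received.

    received       -- sequence numbers already stored on the appliance
    first_expected -- lowest sequence number the appliance should hold
    last_generated -- highest sequence number archived on the protected database
    All three inputs are hypothetical stand-ins for catalog metadata.
    """
    have = set(received)
    return [seq for seq in range(first_expected, last_generated + 1)
            if seq not in have]


# Logs 104 and 105 were archived while the redo connection was down.
received = [101, 102, 103, 106]
missing = find_missing_sequences(received, first_expected=101, last_generated=106)
print(missing)  # [104, 105]
```

The returned list is what a fetch step would then request from the protected database.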


Fast Restore to Any Point-in-Time
No Load On Production Servers to Merge Old Backups

• Directly restore any virtual full backup
• All blocks referenced from virtual full are efficiently retrieved
• Eliminates production server overhead of traditional restore and merge of incrementals
• Supported by the scalability and performance of the underlying Exadata hardware architecture

[Diagram: RESTORE DATABASE TO DAY ‘N’ retrieves the Day ‘N’ Full Backup from the Delta Store, which holds Day 0 Full, Day 1 Incr ... Day N Incr from the Protected Databases]
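A point-in-time restore combines the newest backup taken at or before the target time with the redo generated after it. The Python sketch below shows that selection under simple assumptions: backup times are hypothetical hour offsets, and `plan_restore` is an illustrative helper rather than an RMAN or appliance interface.

```python
import bisect


def plan_restore(virtual_full_times, target_time):
    """Pick the newest virtual full at or before target_time.

    virtual_full_times -- sorted list of backup times (hypothetical hour offsets)
    Returns (base_backup_time, redo_range), where redo_range is the span of
    redo that still has to be applied after restoring the base backup.
    """
    index = bisect.bisect_right(virtual_full_times, target_time)
    if index == 0:
        raise ValueError("no backup exists at or before the target time")
    base = virtual_full_times[index - 1]
    return base, (base, target_time)


# Daily virtual fulls at hours 0, 24, 48 and 72; restore to hour 60.
base, redo_range = plan_restore([0, 24, 48, 72], target_time=60)
print(base, redo_range)  # 48 (48, 60)
```

Because every day has its own virtual full, the base backup is never more than one backup interval older than the target, so only a short span of redo remains to be applied.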


Restore Workflow

[Diagram, request: “RESTORE DATABASE” and “RECOVER DATABASE” via RMAN + RA Backup Module → HTTP Servlet]
[Diagram, sources: Virtual Full Backups and Archived Log Backups in the Delta Store; Full Backups and Archived Log Backups on Tape; Full Backups on the Replica Appliance]
[Diagram, delivery: Validated & Re-assemble Physical Full Backup → Full Backup Sets and Archived Log Backup Sets Prepared for Network Transmission]




Tape Backup Data Flow
• Create Media Manager
– SBT library pathname
– Number of total tape drives
– Number of dedicated tape drives for restore
• Create Attribute Set (media family, copies, etc.)
• Create Copy to Tape Job Template
– Media Manager, Attribute Set
– Protection Policy or Protected Database
– FULL / INCREMENTAL / ARCHIVED LOG / ALL
• Queue / schedule tape job for CTDB at TIME ‘T1’
– FULL – Most recent full backup of CTDB prior to TIME ‘T1’ (assembled from corresponding virtual full)
– INCREMENTAL – Incremental backups of CTDB relative to FULL backup
– ARCHIVED LOG – Archived log backups of CTDB relative to FULL backup

[Diagram: CTDB Day 0 Full, Day 1 Incr and Day 1 Virtual Full; Full, Incremental & Archived Log Backups copied to the Tape Library]
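The selection rules for a tape job queued at TIME ‘T1’ can be sketched in Python as follows. This is a rough model only: the `Backup` record, the integer timestamps, and the reading of "relative to FULL backup" as "taken after the selected full and no later than T1" are assumptions for illustration, not the appliance's documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str  # "FULL", "INCREMENTAL" or "ARCHIVED LOG"
    time: int  # completion time as a hypothetical integer timestamp


def select_for_tape(backups, t1, job_type="ALL"):
    """Choose the backups a copy-to-tape job queued at time t1 would copy."""
    fulls = [b for b in backups if b.kind == "FULL" and b.time < t1]
    if not fulls:
        return []
    base = max(fulls, key=lambda b: b.time)  # most recent full prior to t1
    selected = []
    if job_type in ("FULL", "ALL"):
        selected.append(base)
    for kind in ("INCREMENTAL", "ARCHIVED LOG"):
        if job_type in (kind, "ALL"):
            # Assumption: "relative to FULL" means newer than the base full.
            selected += [b for b in backups
                         if b.kind == kind and base.time < b.time <= t1]
    return selected


ctdb = [
    Backup("FULL", 10), Backup("INCREMENTAL", 12), Backup("ARCHIVED LOG", 13),
    Backup("FULL", 20), Backup("INCREMENTAL", 22), Backup("ARCHIVED LOG", 23),
    Backup("INCREMENTAL", 40),
]
chosen = [(b.kind, b.time) for b in select_for_tape(ctdb, t1=30)]
print(chosen)  # [('FULL', 20), ('INCREMENTAL', 22), ('ARCHIVED LOG', 23)]
```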


Replication Data Flow
• When a Protection Policy is added to Replication Server on Upstream Appliance, all future incremental and archived log backups for its protected databases will be queued for replication
• Level 0 backup is first replicated
• Each level 1 incremental is received by downstream and generates a new virtual full
• New virtual full records on downstream are periodically synchronized to upstream catalog
• Number of concurrent replication streams can be adjusted, based on network consumption / environment

[Diagram: Upstream Appliance sends Full (Level 0) Backup, Incremental Backups and Archived Log Backups over HTTP to the Downstream Appliance Storage Location, which holds Virtual Full Backups and Archived Log Backups]




Policy-Based Cloud-Scale Database Protection

Recovery Appliance Protection Policies
• Standardized recovery window, tape retention, replication policies

Gold Policy – Customer Critical
Disk: 35 days
Tape: 90 days

Silver Policy – Internal Critical
Disk: 10 days
Tape: 45 days

Bronze Policy - Test/Dev
Disk: 3 days
Tape: 30 days

[Diagram labels: Tape, Replica; Replica Recovery Appliance also Policy-Based]
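One way to picture policy-based protection is as a small lookup from database to policy to retention settings. The Python sketch below uses the Gold, Silver and Bronze values from the slide. The database names and the `retention_for` helper are hypothetical and do not correspond to any appliance API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionPolicy:
    name: str
    disk_days: int  # recovery window kept on disk
    tape_days: int  # retention on tape


# Retention values taken from the slide.
POLICIES = {
    "gold": ProtectionPolicy("Gold - Customer Critical", 35, 90),
    "silver": ProtectionPolicy("Silver - Internal Critical", 10, 45),
    "bronze": ProtectionPolicy("Bronze - Test/Dev", 3, 30),
}

# Hypothetical assignment of databases to policies.
DATABASES = {"orders": "gold", "hr": "silver", "sandbox": "bronze"}


def retention_for(db_name):
    """Return (disk_days, tape_days) for a database via its policy."""
    policy = POLICIES[DATABASES[db_name]]
    return policy.disk_days, policy.tape_days


print(retention_for("orders"))   # (35, 90)
print(retention_for("sandbox"))  # (3, 30)
```

Changing a policy entry changes the retention of every database assigned to it, which is the point of standardizing on a few policies instead of per-database settings.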
Real-Time Database Recoverability & Space Monitoring
Enterprise Manager Provides Key Metrics At Your Fingertips

Recovery Window Goal: 10 Days
Current Recovery Window: 6 Days
Projected Space Needed for Recovery Window Goal: 2.69 TB

Current RPO: < 1 sec !


Reserved Space:
Used Space: 2.57.9
TBTB
Deduplication Ratio: 10:1



Delta Store – Incremental Backups
• The backup strategy to Recovery Appliance starts with an initial full (INCREMENTAL LEVEL 0) backup.
• After the LEVEL 0 backup, only incremental backups are needed thereafter, consisting of just the changed database blocks relative to the previous day’s incremental.

[Diagram]
Day 0 Full Backup: A0 B0 C0 D0 E0
Day 1 Incremental Backup: C1 D1
Day 2 Incremental Backup: D2
Day 3 Incremental Backup: B3 C3


Delta Store – Virtual Full Backup Creation
• A virtual full backup is created in the Recovery Appliance catalog for each received RMAN incremental backup.
• It operates at a data file level and appears as a normal LEVEL 0 backup in RMAN LIST/REPORT & catalog queries.

[Diagram]
Day 0 Full Backup: A0 B0 C0 D0 E0
Day 1 Incremental Backup: C1 D1 (yields Day 1 Virtual Full Backup)
Day 2 Incremental Backup: D2 (yields Day 2 Virtual Full Backup)
Day 3 Incremental Backup: B3 C3 (yields Day 3 Virtual Full Backup)
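The block labels in the diagram (A0 ... E0, C1, D1, D2, B3, C3) make it easy to model a virtual full as a set of pointers to the newest version of each block. The Python sketch below is a simplified illustration of that idea. The dictionaries and the `build_virtual_fulls` function are assumptions for the example and say nothing about the Delta Store's on-disk format.

```python
def build_virtual_fulls(full, incrementals):
    """Derive one virtual full per backup day.

    full         -- dict mapping block name to its Day 0 version
    incrementals -- list of dicts, one per day, holding only changed blocks
    Each virtual full maps every block to the newest stored version, so it
    references existing blocks instead of copying them.
    """
    virtual_fulls = [dict(full)]
    for changed in incrementals:
        latest = dict(virtual_fulls[-1])
        latest.update(changed)
        virtual_fulls.append(latest)
    return virtual_fulls


day0 = {"A": "A0", "B": "B0", "C": "C0", "D": "D0", "E": "E0"}
incrs = [{"C": "C1", "D": "D1"}, {"D": "D2"}, {"B": "B3", "C": "C3"}]

vfs = build_virtual_fulls(day0, incrs)
print(vfs[3])  # {'A': 'A0', 'B': 'B3', 'C': 'C3', 'D': 'D2', 'E': 'E0'}

stored = len(day0) + sum(len(i) for i in incrs)
referenced = sum(len(v) for v in vfs)
print(stored, referenced)  # 10 20
```

In this toy case four restorable full backups reference 20 blocks while only 10 are physically stored. The real ratio depends on the change rate and compression.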


Delta Store – Virtual Full Backup Optimization
• OPTIMIZE task is a background process that runs on the most recent virtual full backup to ‘optimize’ its read I/O access for future restore operations.
• Virtual full blocks are reordered into contiguous sets for fewer & larger read I/O requests, which is more efficient than many smaller read I/O requests.

[Diagram]
Day 0 Full Backup: A0 B0 C0 D0 E0
Day 1 Incremental Backup: C1 D1 (Day 1 Virtual Full Backup)
Day 2 Incremental Backup: D2 (Day 2 Virtual Full Backup)
Day 3 Incremental Backup: B3 C3 (Day 3 Optimized Virtual Full Backup)
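The benefit of reordering can be illustrated by counting how many read requests are needed to fetch a virtual full's blocks in order. In the Python sketch below, block positions are hypothetical integers assigned in arrival order (Day 0 blocks at 0-4, C1 D1 at 5-6, D2 at 7, B3 C3 at 8-9), and each run of consecutive positions counts as one read. The actual OPTIMIZE task is not described at this level in the slides.

```python
def count_read_requests(offsets):
    """Count read requests, treating each run of consecutive offsets as one."""
    requests = 0
    previous = None
    for offset in offsets:
        if previous is None or offset != previous + 1:
            requests += 1  # a gap or backward jump starts a new request
        previous = offset
    return requests


# Day 3 virtual full = A0, B3, C3, D2, E0 at their arrival-order positions.
scattered = [0, 8, 9, 7, 4]
# Same five blocks after being rewritten into one contiguous set.
optimized = [10, 11, 12, 13, 14]

print(count_read_requests(scattered))  # 4
print(count_read_requests(optimized))  # 1
```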




Very Large Oracle Customer
Backup & Recovery for 16,000 Enterprise Databases
• RMAN backups to local SAN/NAS “dump” storage, 2 weeks retention.
• Disk backups swept weekly by Third Party SW to Backup Appliance for 30 day retention
– Replicated to a remote Backup Appliance residing at the bunker site
– Backups are copied to physical tape to meet > 30 day retention needs.
• Pain Point #1: Very large & costly “dump” storage deployed across 16,000 databases.
• Pain Point #2: Inability to coordinate Third Party SW sweep schedule with RMAN backups being fully completed, resulting in incomplete backups and a 50% restore failure rate
• Pain Point #3: Non-Oracle integrated tape backup strategy, with > 48 hours RTO due to multiple stages in restore process
• Pervasive Risk: Recovery Failure + Data Loss = Non-Compliance, $$ Penalties, Bad Press



Recovery Appliance X5 Full Rack

POC Details
- Concurrent 200 DB backups complete in 8 hours: Done in 2 ½ hours
- Copy virtual full backup of 200 DBs to tape in 7 days: Done in 2 days
- Report real-time RPO of < 1 second on 160 x 11.2 DBs: ZERO redo transport lag
- Restore 2 DBs while rest (198 DBs) are backing up: Done in 2 hours

[Diagram: 160 x 11.2 DBs on Exadata X3-2 and X2-2; 40 x 11.1/10.2 DBs on X4800 + ZFS SA]


Customer Case Study
Fernando Simon
DBA Manager
Brazilian Justice Tribunal of Santa Catarina



Brazilian Justice Tribunal of Santa Catarina
• Justice Tribunal of Santa Catarina (TJSC)
– Civil and Criminal Court
• Santa Catarina in numbers:
– Nearly 7 million residents
– 295 cities
– 6th largest GDP by region in Brazil
• TJSC in numbers:
– 13,000 employees
– Internal and public services
• Courts, Judicial Processes, Appeals, Taxes
– Nearly 14 million judicial cases recorded since inception of Oracle-based system
• 723,000 just in 2015
– All judicial processes are recorded in digital format..fully paperless
– 24x7 services for Legal Appeals requirements


Brazilian Justice Tribunal of Santa Catarina
• TJSC and Oracle:
– Oracle user since 1998
– Nearly 40 production databases
– 15 critical databases:
• 30TB total volume
• Judicial Process, Appeals, Taxes, HR
• 24x7 access
– Major Database:
• More than 8,000 internal users by day
• More than 70,000 web access by day
• 17TB size and 100,000 IOPS
• DSS/OLTP workloads
• TJSC and Engineered Systems:
– 4 Exadatas: 1 Half V2 (HP), 1 Half X2 (HP), 1 Full X4 (HP), 1 Full X5 (EF)
• Among the first OLTP users of Exadata
– 2 Recovery Appliances
– No outages since 2010 (1st Exadata deployment, 2nd in all of Brazil)
– All databases run on Exadata with IORM (catPlan and dbPlan), Database Resource Manager, RAC Services and Instance Caging


Brazilian Justice Tribunal of Santa Catarina
• Local Partner Samaia IT
– System Integrator
– Platinum Partner
– Winner of the bidding
– Concepts & implementation of the project
• Samaia IT projects with TJSC:
– 4 Exadata racks
– 2 Recovery Appliances
– 2 StorageTek SL150 tape libraries



Brazilian Justice Tribunal of Santa Catarina
• TJSC and Other HA Infrastructure:
– Data Center Bunker (started in 2014)
– Redundant Network Switches
• Cisco Nexus 7000
• 1 and 10Gbps network
– Redundant Storage
• EMC VNX 5500, VNX 5400
– Redundant Media Management
• 2 StorageTek SL150 (8 LTO-6 Drives)
– HP Blades for Applications and VMware

– You don’t want to remain in jail just because of an Oracle Database outage, right?


Brazilian Justice Tribunal of Santa Catarina
• TJSC and Backup/Recovery – without Recovery Appliance
– EMC Networker
• 8.0.2
• Two Node Clustered Media Servers
• 1Gbps communication network
– Data Domain
• DD670 configured as VTL
• 50TB usable capacity
• 2 Gbps SAN network
• Shared: Databases and File Servers
• Problems:
– Low deduplication for databases
• Even after following EMC best practices
– Slow, nearly 3 days to backup most critical 17 TB database
– Large DBA effort to manage backups
• Needed to expire old backups from DD every day, due to space constraints (no additional savings from deduplication).
– Nearly 3 days to recover 17 TB database
– Difficult to deliver adequate RPO and RTO due to network and HW constraints


Brazilian Justice Tribunal of Santa Catarina

• TJSC and Backup/Recovery – without Recovery Appliance

[Diagram labels: Cloud Control 12.1.0.4, File Servers, Other Databases, Exadata X2, Exadata X4, EMC Networker, EMC Data Domain, StorageTek SL150]
[Diagram annotations: Full Backup Every Week, 72 hours over 1Gbps Network; Offload Every Week]


Brazilian Justice Tribunal of Santa Catarina
Before Recovery Appliance..50, 65, 90+ hours to complete full backups

50+ HRS!

65+ HRS!

90+ HRS!!



Brazilian Justice Tribunal of Santa Catarina
Data Domain Cleaning Task – 1 day+ to Complete



Brazilian Justice Tribunal of Santa Catarina

• TJSC and Backup/Recovery – NOW WITH RECOVERY APPLIANCE

[Diagram labels: Cloud Control 12.1.0.4, File Servers, Other Databases, Exadata X2, Exadata X5, EMC Networker, EMC Data Domain, StorageTek SL150]
[Diagram annotations: 12 hours over 10Gbps Network for Initial Full Backup; 23 minutes for Incremental Backup!]


Brazilian Justice Tribunal of Santa Catarina

Backups Now Only Take ~20 Minutes


12:1 Deduplication Achieved



Brazilian Justice Tribunal of Santa Catarina
• Backup/Recovery Benefits with Recovery Appliance
– Reduce RTO and RPO
• High Protection and Zero Data Loss
– High Deduplication Factor
– Incremental Forever
• New level 0 (virtual full) after only 23 minutes incremental backup time
• Nearly 200X improvement in backup performance
• Fast 10 Gbps Network
– Facilitated DB Migration from Exadata X4 to X5
– 2 Recovery Appliances less expensive than expanding Data Domain to support same backup requirements
• The major benefit for us was Incremental Forever and Virtual Full
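The "nearly 200X" figure is roughly consistent with two numbers shown on the earlier diagrams, assuming it compares the 72-hour weekly full backup with the 23-minute incremental that now produces a new virtual full. The slides do not state which figures were compared, so the short Python check below rests on that assumption.

```python
# Figures from the before/after diagrams. Pairing them is an assumption.
before_minutes = 72 * 60  # weekly full backup: 72 hours over the 1Gbps network
after_minutes = 23        # incremental backup that yields a new virtual full

speedup = before_minutes / after_minutes
print(round(speedup))  # 188
```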
Brazilian Justice Tribunal of Santa Catarina

• TJSC and Backup/Recovery – UPCOMING FULL MAA + RECOVERY APPLIANCE

[Diagram labels: Primary Site, DR Site, Cloud Control 12.1.0.4, Other Databases, Exadata X2, Exadata X4, Exadata X5, Data Guard, Replication for non-Data Guard Databases]
Brazilian Justice Tribunal of Santa Catarina
• TJSC and Recovery Appliance
– WHY:
• Key Features: Virtual Full / Incremental Forever
• Easy / measurable growth as required
• Impossible to sustain Data Domain
– Lower Deduplication, Lower Performance
• Reduce the complexity of backup infrastructure
– No more effort to tune and control every VTL of Data Domain
• Oracle on Oracle, ONE support organization for entire environment
• TJSC:
– www.tjsc.jus.br
• https://www.youtube.com/user/canaltjsc


Summary / Q&A



Summary
Recovery Appliance Designed for Oracle Data Protection

• Leverages proven Oracle Backup & Recovery + Data Guard + Robust, Scalable Exadata Hardware Platform
– Protect database transactional changes in real-time
– Reduce backup storage and production overhead via incremental-forever
– Recover to any point-in-time with standard RMAN commands
– Offload backups to tape via user-configured policies
– Monitor space usage based on database recovery window goals
– Gather real-time data protection status across the entire Oracle enterprise via EM
• For more information: oracle.com/recoveryappliance


