GeoSynchrony
Version Number 5.4.1
Product Guide
P/N 302-001-064
REV 02
Copyright 2015 EMC Corporation. All rights reserved. Published in the USA.
Published February, 2015
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without
notice.
The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect
to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
EMC2, EMC, EMC Centera, EMC ControlCenter, EMC LifeLine, EMC OnCourse, EMC Proven, EMC Snap, EMC SourceOne, EMC Storage
Administrator, Acartus, Access Logix, AdvantEdge, AlphaStor, ApplicationXtender, ArchiveXtender, Atmos, Authentica, Authentic
Problems, Automated Resource Manager, AutoStart, AutoSwap, AVALONidm, Avamar, Captiva, Catalog Solution, C-Clip, Celerra,
Celerra Replicator, Centera, CenterStage, CentraStar, ClaimPack, ClaimsEditor, CLARiiON, ClientPak, Codebook Correlation
Technology, Common Information Model, Configuration Intelligence, Connectrix, CopyCross, CopyPoint, CX, Dantz, Data Domain,
DatabaseXtender, Direct Matrix Architecture, DiskXtender, DiskXtender 2000, Document Sciences, Documentum, eInput, E-Lab,
EmailXaminer, EmailXtender, Enginuity, eRoom, Event Explorer, FarPoint, FirstPass, FLARE, FormWare, Geosynchrony, Global File
Virtualization, Graphic Visualization, Greenplum, HighRoad, HomeBase, InfoMover, Infoscape, InputAccel, InputAccel Express, Invista,
Ionix, ISIS, Max Retriever, MediaStor, MirrorView, Navisphere, NetWorker, OnAlert, OpenScale, PixTools, Powerlink, PowerPath,
PowerSnap, QuickScan, Rainfinity, RepliCare, RepliStor, ResourcePak, Retrospect, RSA, SafeLine, SAN Advisor, SAN Copy, SAN
Manager, Smarts, SnapImage, SnapSure, SnapView, SRDF, StorageScope, SupportMate, SymmAPI, SymmEnabler, Symmetrix,
Symmetrix DMX, Symmetrix VMAX, TimeFinder, UltraFlex, UltraPoint, UltraScale, Unisphere, Viewlets, Virtual Matrix, Virtual Matrix
Architecture, Virtual Provisioning, VisualSAN, VisualSRM, VMAX, VNX, VNXe, Voyence, VPLEX, VSAM-Assist, WebXtender, xPression,
xPresso, YottaYotta, the EMC logo, and the RSA logo, are registered trademarks or trademarks of EMC Corporation in the United States
and other countries. Vblock is a trademark of EMC Corporation in the United States.
All other trademarks used herein are the property of their respective owners.
For the most up-to-date regulatory document for your product line, go to the technical documentation and advisories section on the
EMC online support website.
CONTENTS

Preface

Chapter 1    Introducing VPLEX
    VPLEX overview
    VPLEX product family
    Mobility
    Availability
    Collaboration
    Architecture highlights
    Features and benefits
    VPLEX Witness
    Non-disruptive upgrade (NDU)
    New features in this release

Chapter 2    VS2 Hardware
    The VPLEX cluster
    The VPLEX engine and directors
    VPLEX power supplies
    Hardware failure management and best practices
    Component IP addresses

Chapter 3

Chapter 4    VPLEX Software
    GeoSynchrony
    Management interfaces
    Provisioning with VPLEX
    Consistency groups
    Cache vaulting

Chapter 5

Appendix A    VS1 Hardware
    VS1 cluster configurations
    VS1 engine
    VS1 IP addresses and component IDs
    VS1 Internal cabling

Glossary

Index
FIGURES
51  RecoverPoint architecture  97
52  RecoverPoint configurations  99
53  VPLEX Local and RecoverPoint  100
54  VPLEX Local and RecoverPoint Remote - remote site is independent VPLEX cluster  101
55  VPLEX Local and RecoverPoint remote - remote site is array-based splitter  101
56  VPLEX Metro and RecoverPoint local replication  102
57  VPLEX Metro and RecoverPoint local and remote replication - local site is located at one cluster of the VPLEX  102
58  VPLEX Metro and RecoverPoint local and remote replication - remote site is array-based splitter  103
59  Shared VPLEX splitter  104
60  Shared RecoverPoint RPA cluster  104
61  Replication with VPLEX Local and CLARiiON  105
62  Replication with VPLEX Metro and CLARiiON  105
63  Support for Site Recovery Manager  106
64  MetroPoint Two-site Topology  108
65  Basic MetroPoint Three-site Topology  109
66  RecoverPoint Local Replication for VPLEX Distributed Volumes at Site A and Site B and RecoverPoint Remote Replication at Site C  110
67  Bi-directional replication for volumes in different consistency groups and local volumes  111
68  Active Source Site A Failure  112
69  Remote Site C Failure  113
70  MetroPoint Four-Site Configuration  116
71  MetroPoint four-site configuration for a single consistency group  117
72  Site A and Site B failure showing failover to Site C  118
73  During failover, Site C and Site D become the production sites  119
74  MetroPoint - two local copies  123
75  Source failover  124
76  Production failover to the remote copy  125
77  Recover production  126
78  VS1 single-engine cluster  127
79  VS1 dual-engine cluster  128
80  VS1 quad-engine cluster  129
81  VS1 engine  130
82  IP addresses in cluster-1  131
83  IP addresses in cluster-2 (VPLEX Metro or Geo)  132
84  Ethernet cabling - VS1 quad-engine cluster  134
85  Serial cabling - VS1 quad-engine cluster  135
86  Fibre Channel cabling - VS1 quad-engine cluster  136
87  AC power cabling - VS1 quad-engine cluster  137
88  Ethernet cabling - VS1 dual-engine cluster  138
89  Serial cabling - VS1 dual-engine cluster  139
90  Fibre Channel cabling - VS1 dual-engine cluster  140
91  AC power cabling - VS1 dual-engine cluster  141
92  Ethernet cabling - VS1 single-engine cluster  142
93  Serial cabling - VS1 single-engine cluster  142
94  Fibre Channel cabling - VS1 single-engine cluster  142
95  AC power cabling - VS1 single-engine cluster  143
96  Fibre Channel WAN COM connections - VS1  143
97  IP WAN COM connections - VS1  144
TABLES
PREFACE
As part of an effort to improve and enhance the performance and capabilities of its
product line, EMC from time to time releases revisions of its hardware and software.
Therefore, some functions described in this document may not be supported by all
revisions of the software or hardware currently in use. Your product release notes provide
the most up-to-date information on product features.
If a product does not function properly or does not function as described in this
document, please contact your EMC representative.
About this guide
This document provides a high-level description of the VPLEX product and GeoSynchrony 5.4 features.
Audience
This document is part of the VPLEX system documentation set and introduces the VPLEX product and its features. It provides information for customers and prospective customers to understand VPLEX and how it supports their data storage strategies.
Related documentation
Related documents (available on EMC Online Support) include:
A caution contains information essential to avoid data loss or damage to the system or
equipment.
IMPORTANT
An important notice contains information essential to operation of the software.
Typographical conventions
EMC uses the following type style conventions in this document: Normal, Bold, Italic, Courier, Courier bold, and Courier italic, plus [ ], { }, and ... in command syntax.
Courier is used for system output (such as error messages or scripts), and for URLs, complete paths, filenames, prompts, and syntax when shown outside of running text.
Courier bold is used for specific user input (such as commands).
Your comments
Your suggestions will help to improve the accuracy, organization, and overall quality of the
user publications. Send your opinions of this document to:
DPAD.Doc.Feedback@emc.com
CHAPTER 1
Introducing VPLEX
This chapter introduces the EMC VPLEX product family. Topics include:
VPLEX overview....................................................................................................... 13
VPLEX product family .............................................................................................. 15
Mobility .................................................................................................................. 17
Availability.............................................................................................................. 19
Collaboration .......................................................................................................... 20
Architecture highlights ............................................................................................ 20
Features and benefits .............................................................................................. 22
VPLEX Witness ........................................................................................................ 22
Non-disruptive upgrade (NDU) ................................................................................ 24
New features in this release .................................................................................... 25
VPLEX overview
EMC VPLEX federates data that is located on heterogeneous storage arrays to create
dynamic, distributed and highly available data centers.
Use VPLEX to:
Move data non-disruptively between EMC and third-party storage arrays without
any downtime for the host.
VPLEX moves data transparently, and the virtual volumes retain the same identities
and the same access points to the host. There is no need to reconfigure the host.
Protect data in the event of disasters or failure of components in your data centers.
With VPLEX, you can withstand failures of storage arrays, cluster components, an
entire site failure, or loss of communication between sites (when two clusters are
deployed) and still keep applications and data online and available.
With VPLEX, you can transform the delivery of IT to a flexible, efficient, reliable, and
resilient service.
Mobility: VPLEX moves applications and data between different storage installations:
Within the same data center or across a campus (VPLEX Local)
Within a geographical region (VPLEX Metro)
Across even greater distances (VPLEX Geo)
Collaboration: VPLEX provides efficient real-time data collaboration over distance for
Big Data applications.
Size VPLEX to meet your current needs. Grow VPLEX as your needs grow.
A VPLEX cluster includes one, two, or four engines.
Add an engine to an operating VPLEX cluster without interrupting service.
Add a second cluster to an operating VPLEX cluster without interrupting service.
VPLEX's scalable architecture ensures maximum availability, fault tolerance, and
performance.
Every engine in a VPLEX cluster can access all the virtual volumes presented by VPLEX.
Every engine in a VPLEX cluster can access all the physical storage connected to VPLEX.
VPLEX pools the storage resources in multiple data centers so that the data can be
accessed anywhere. With VPLEX, you can:
Replace your tedious data movement and technology refresh processes with VPLEX's
patented simple, frictionless two-way data exchange between locations.
Create an active-active configuration for the active use of resources at both sites.
Provide instant access to data between data centers.
Combine VPLEX with virtual servers to enable private and hybrid cloud computing.
[Figure: The VPLEX product family - VPLEX Local; VPLEX Metro (AccessAnywhere at synchronous distances); VPLEX Geo (AccessAnywhere at asynchronous distances)]
VPLEX Local
Standardizes LUN presentation and management using simple tools to provision and
allocate virtualized storage devices.
Improves storage utilization using pooling and capacity aggregation across multiple
arrays.
VPLEX Metro
VPLEX Metro consists of two VPLEX clusters connected by inter-cluster links with not more
than 5 ms Round Trip Time (RTT).1 VPLEX Metro:
Transparently relocates data and applications over distance, protects your data center
against disaster, and enables efficient collaboration between sites.
Manages all of your storage in both data centers from one management interface.
Mirrors your data to a second site, with full access at near-local speeds.
Provides higher availability.
Metro clusters can be placed up to 100 km apart, allowing them to be located at
opposite ends of an equipment room, on different floors, or in different fire
suppression zones; all of which might be the difference between riding through a local
fault or fire without an outage.
Availability: Applications must keep running in the presence of data center failures.
Collaboration: Applications in one data center need to access data in the other data
center.
Improve utilization and availability across heterogeneous arrays and multiple sites.
VPLEX Geo
VPLEX Geo consists of two VPLEX clusters connected by inter-cluster links with not more
than 50 ms RTT.1 VPLEX Geo provides the same benefits as VPLEX Metro to data centers at
asynchronous distances:
1. Refer to VPLEX and vendor-specific white papers for confirmation of latency limitations.
Increased resiliency
Efficient collaboration
Simplified management
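The Metro and Geo latency bounds above can be turned into a simple deployment check. The sketch below is illustrative only and not an EMC tool; the function and constant names are ours, and the RTT limits are the ones cited in this guide (5 ms for Metro, 50 ms for Geo).

```python
# Illustrative sketch (not an EMC tool): choose a VPLEX deployment type from the
# measured inter-cluster round-trip time, using the RTT bounds cited in this guide.

METRO_MAX_RTT_MS = 5.0   # synchronous distances (VPLEX Metro)
GEO_MAX_RTT_MS = 50.0    # asynchronous distances (VPLEX Geo)

def deployment_for_rtt(rtt_ms: float) -> str:
    """Return the VPLEX product that tolerates the given inter-cluster RTT."""
    if rtt_ms < 0:
        raise ValueError("RTT cannot be negative")
    if rtt_ms == 0:
        return "VPLEX Local"          # single data center, no inter-cluster link
    if rtt_ms <= METRO_MAX_RTT_MS:
        return "VPLEX Metro"
    if rtt_ms <= GEO_MAX_RTT_MS:
        return "VPLEX Geo"
    return "unsupported"              # beyond Geo's asynchronous limit

print(deployment_for_rtt(3.2))   # VPLEX Metro
print(deployment_for_rtt(20.0))  # VPLEX Geo
```

As the footnote notes, confirm actual latency limits against VPLEX and vendor-specific white papers before planning a deployment.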
Deploy VPLEX to meet your current high-availability and data mobility requirements.
Add engines or a second cluster to scale VPLEX as your requirements increase. You can do
all the following tasks without disrupting service:
Upgrade GeoSynchrony.
Mobility
VPLEX mobility allows you to move data located on either EMC or non-EMC storage arrays
simply and without disruption. Use VPLEX to simplify the management of your data center
and eliminate outages to migrate data or refresh technology.
Combine VPLEX with server virtualization to transparently move and relocate virtual
machines and their corresponding applications and data without downtime.
Relocate, share, and balance resources between sites, within a campus or between data
centers.
[Figure: VPLEX mobility - AccessAnywhere between VPLEX Local cluster locations]
Use the storage and compute resources available at either of the VPLEX cluster locations
to automatically balance loads.
Move data between sites, over distance, while the data remains online and available
during the move. No outage or downtime is required.
VPLEX federates both EMC and non-EMC arrays, so even if you have a mixed storage
environment, VPLEX provides an easy solution.
Extent migrations move data between extents in the same cluster. Use extent migrations
to:
Move extents from a hot storage volume shared by other busy extents.
Device migrations move data between devices (RAID 0, RAID 1, or RAID C devices built on
extents or on other devices) on the same cluster or between devices on different clusters.
Use device migrations to:
Note: Up to 25 local and 25 distributed migrations can be in progress at the same time.
Any migrations beyond those limits are queued until an ongoing migration completes.
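The concurrency rule in the note above can be sketched as a small scheduler: up to 25 local and 25 distributed migrations run at once, and further requests queue until a running migration of the same kind completes. The class and method names below are ours, not part of any VPLEX interface.

```python
# Illustrative sketch of the migration concurrency limit described above.
from collections import deque

LIMIT_PER_KIND = 25  # per the guide: 25 local and 25 distributed migrations

class MigrationScheduler:
    def __init__(self):
        self.active = {"local": set(), "distributed": set()}
        self.queued = {"local": deque(), "distributed": deque()}

    def start(self, name: str, kind: str) -> str:
        """Start a migration, or queue it if 25 of its kind are already running."""
        if len(self.active[kind]) < LIMIT_PER_KIND:
            self.active[kind].add(name)
            return "in-progress"
        self.queued[kind].append(name)
        return "queued"

    def complete(self, name: str, kind: str) -> None:
        """Finish a migration and promote the oldest queued one of that kind."""
        self.active[kind].remove(name)
        if self.queued[kind]:
            self.active[kind].add(self.queued[kind].popleft())

sched = MigrationScheduler()
states = [sched.start(f"mig-{i}", "local") for i in range(26)]
print(states[24], states[25])              # in-progress queued
sched.complete("mig-0", "local")
print("mig-25" in sched.active["local"])   # True
```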
Availability
VPLEX redundancy provides reduced Recovery Time Objective (RTO) and Recovery Point
Objective (RPO).
VPLEX features allow the highest possible resiliency in the event of an outage. Figure 5
shows a VPLEX Metro configuration where storage has become unavailable at one of the
cluster sites.
[Figure 5: VPLEX Metro configuration - AccessAnywhere between Cluster A and Cluster B with storage unavailable at one cluster site]
Collaboration
Collaboration increases utilization of passive data recovery assets and provides
simultaneous access to data.
[Figure 6: Distributed data collaboration - AccessAnywhere enables concurrent read/write access to data across locations]
VPLEX AccessAnywhere enables multiple users at different sites to work on the same data
while maintaining consistency of the dataset.
Traditional solutions support collaboration across distance by shuttling entire files
between locations using FTP. This is slow and contributes to network congestion for large
files (or even small files that move regularly). One site may sit idle waiting to receive the
latest data. Independent work results in inconsistent data that must be synchronized, a
task that becomes more difficult and time-consuming as your data sets increase in size.
AccessAnywhere supports co-development that requires collaborative workflows such as
engineering, graphic arts, video, educational programs, design, and research.
VPLEX provides a scalable solution for collaboration.
Architecture highlights
A VPLEX cluster consists of:
A management server.
The management server has a public Ethernet port, which provides cluster
management services when connected to the customer network.
[Figure: VPLEX architecture - hosts (HP, Oracle (Sun), Microsoft, Linux, IBM, VMware) connected through Brocade or Cisco fabrics to VPLEX]
VPLEX conforms to established World Wide Name (WWN) guidelines that can be used for
zoning.
VPLEX supports EMC storage and arrays from other storage vendors, such as HDS, HP, and
IBM.
VPLEX provides storage federation for operating systems and applications that support
clustered file systems, including both physical and virtual server environments with
VMware ESX and Microsoft Hyper-V.
VPLEX supports network fabrics from Brocade and Cisco.
Refer to the EMC Simple Support Matrix, EMC VPLEX and GeoSynchrony, available at
http://elabnavigator.EMC.com under the Simple Support Matrix tab.
Features and benefits
Mobility
Availability:
Resiliency: Mirror across arrays within a single data center or between data
centers without host impact. This increases availability for critical
applications.
Distributed cache coherency: Automate sharing, balancing, and failover of I/O
across the cluster and between clusters whenever possible.
Advanced data caching: Improve I/O performance and reduce storage array
contention.
Collaboration
VPLEX:
Federates the storage volumes into hierarchies of VPLEX virtual volumes with
user-defined configuration and protection levels.
Presents virtual volumes to production hosts in the SAN through the VPLEX front-end.
For VPLEX Metro and VPLEX Geo products, presents a global, block-level directory for
distributed cache and I/O between VPLEX clusters.
VPLEX Witness
VPLEX Witness helps multi-cluster VPLEX configurations automate the response to cluster
failures and inter-cluster link outages.
VPLEX Witness is an optional component installed as a virtual machine on a customer
host.
The customer host must be deployed in a separate failure domain from either of the
VPLEX clusters to eliminate the possibility of a single fault affecting both a cluster and
VPLEX Witness.
VPLEX Witness connects to both VPLEX clusters over the management IP network, as
illustrated in Figure 8.
VPLEX Witness observes the state of the clusters, and thus can distinguish between an
outage of the inter-cluster link and a cluster failure. VPLEX Witness uses this information
to guide the clusters to either resume or suspend I/O.
In VPLEX Metro configurations, VPLEX Witness provides seamless zero RTO fail-over for
synchronous consistency groups.
In VPLEX Geo configurations, VPLEX Witness can be useful for diagnostic purposes.
Note: VPLEX Witness works in conjunction with consistency groups. VPLEX Witness
guidance does not apply to local volumes and distributed volumes that are not members
of a consistency group. VPLEX Witness does not automate any fail-over decisions for
asynchronous consistency groups (VPLEX Geo configurations).
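Because the witness sits in a third failure domain and can see both clusters, it can tell a link outage apart from a cluster failure. The decision table below is our simplified illustration of that idea, not VPLEX Witness's actual logic; real behavior also depends on consistency-group preference rules such as the preferred cluster.

```python
# Illustrative decision table (ours, simplified): a witness that observes both
# clusters distinguishes an inter-cluster link outage from a cluster failure and
# guides each cluster to resume or suspend I/O.

def witness_guidance(witness_sees_c1: bool, witness_sees_c2: bool,
                     clusters_see_each_other: bool) -> dict:
    if clusters_see_each_other:
        # No partition: both clusters keep serving I/O.
        return {"cluster-1": "resume", "cluster-2": "resume"}
    if witness_sees_c1 and witness_sees_c2:
        # Both clusters alive but the inter-cluster link is down: allow one
        # side (here, an assumed preferred cluster-1) to continue.
        return {"cluster-1": "resume", "cluster-2": "suspend"}
    if witness_sees_c1:
        return {"cluster-1": "resume", "cluster-2": "suspend"}  # cluster-2 failed
    if witness_sees_c2:
        return {"cluster-1": "suspend", "cluster-2": "resume"}  # cluster-1 failed
    return {"cluster-1": "suspend", "cluster-2": "suspend"}     # witness isolated

print(witness_guidance(True, True, False))
```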
High availability compared to disaster recovery
Traditionally, highly available (HA) designs are deployed within a data center, while
disaster recovery (DR) functionality is deployed between data centers.
When VPLEX Metro active/active replication technology is used in conjunction with VPLEX
Witness, the line between local high availability and long-distance disaster recovery is not
clear. With VPLEX Metro and VPLEX Witness, high availability is stretched beyond the data
center walls.
Note: VPLEX Witness has no effect on failure handling for distributed volumes that are
outside of consistency groups or for volumes that are in asynchronous consistency
groups. Witness also has no effect on distributed volumes in synchronous consistency
groups when the preference rule is set to no-automatic-winner.
See High Availability with VPLEX Witness on page 69 for more information on VPLEX
Witness including the differences in how VPLEX Witness handles failures and recovery.
Storage, application, and host upgrades
VPLEX enables the easy addition or removal of storage, applications, and hosts.
When VPLEX encapsulates back-end storage, the block-level nature of the coherent cache
allows the upgrade of storage, applications, and hosts.
You can configure VPLEX so that all devices within VPLEX have uniform access to all
storage blocks.
Increase engine count
When capacity demands increase, VPLEX supports hardware upgrades for single-engine
VPLEX systems to dual-engine and dual-engine to quad-engine VPLEX systems.
These upgrades also increase the availability of front-end and back-end ports in the data
center.
Software upgrades
VPLEX provides redundancy at multiple levels:
Ports
Paths
Directors
Engines
This redundancy allows GeoSynchrony on VPLEX Local and Metro to be upgraded without
interrupting host access to storage; no service window or application disruption is
required.
On VPLEX Geo configurations, the upgrade script ensures that the application is
active/passive before allowing the upgrade.
Note: You must upgrade the VPLEX management server software before upgrading
GeoSynchrony. Management server upgrades are non-disruptive.
Simple support matrix
MetroPoint
MetroPoint provides the highest levels of protection and redundancy against full site
failures, regional-wide failures, data corruption, and other such events. It combines
the full value of VPLEX and RecoverPoint into a 3-site or 4-site topology for continuous
availability, disaster recovery, and continuous data protection. The MetroPoint
topology provides users full RecoverPoint protection for their VPLEX Metro
configuration, maintaining replication even when one site of the Metro is down.
MetroPoint protects customer applications and data from any failure scenario,
including:
Any single component failure
A complete site failure
A regional failure that affects two sites
Data unavailability due to user errors
Data corruption from viruses
In a MetroPoint two-site configuration, RecoverPoint Local Replication is added to both
sides of a VPLEX Metro to provide operational recovery for the same application.
Although this is not a typical MetroPoint configuration, it still requires the latest
software that supports MetroPoint on both VPLEX and RecoverPoint. The configuration
can be updated in the future by adding a third remote site for full MetroPoint
protection and disaster recovery.
In a MetroPoint three-site configuration, data replication to the remote site continues
even if one of the local sites fails completely, and local replication is allowed on both
local sites at the same time for maximum protection.
In a MetroPoint four-site configuration, MetroPoint provides continuous availability for
production applications running in two separate regions. Each region provides
disaster recovery protection for the other region.
The VPLEX Administration Guide provides more information on configuring and
managing MetroPoint.
CHAPTER 2
VS2 Hardware
This chapter provides a high-level overview of the major hardware components in a VS2
VPLEX and how hardware failures are managed to support uninterrupted service.
Topics include:
The VPLEX cluster .................................................................................................... 27
The VPLEX engine and directors .............................................................................. 32
VPLEX power supplies ............................................................................................. 34
Hardware failure management and best practices ................................................... 35
Component IP addresses ......................................................................................... 41
Note: See Appendix A, VS1 Hardware, for information about VS1 hardware.
1, 2, or 4 VPLEX engines
Each engine contains two directors.
Dual-engine or quad-engine clusters also contain:
1 pair of Fibre Channel switches for communication between directors.
2 uninterruptible power supplies (UPS) for battery power backup of the Fibre
Channel switches and the management server.
A management server.
Ethernet or Fibre Channel cabling and respective switching hardware that connects the
distributed VPLEX hardware components.
I/O modules provide front-end and back-end connectivity between SANs and to
remote VPLEX clusters in VPLEX Metro or VPLEX Geo configurations.
Note: In the current release of GeoSynchrony, VS1 and VS2 hardware cannot co-exist in a
cluster, except in a VPLEX Local cluster during a non-disruptive hardware upgrade from
VS1 to VS2.
Management server
Each VPLEX cluster has one management server.
You can manage both clusters in VPLEX Metro and VPLEX Geo configurations from a single
management server.
The management server:
Forwards VPLEX Witness traffic between directors in the local cluster and the remote
VPLEX Witness server.
Redundant internal network IP interfaces connect the management server to the public
network. Internally, the management server is on a dedicated management IP network
that provides accessibility to all major components in the cluster.
Each Fibre Channel switch is powered by a UPS, and has redundant I/O ports for
intra-cluster communication.
The Fibre Channel switches do not connect to the front-end hosts or back-end storage.
1, 2, or 4 VPLEX engines
A VPLEX cluster can have 1 (single), 2 (dual), or 4 (quad) engines.
Note: The placement of components shown for single-engine and dual-engine clusters
allows for non-disruptive addition of engines to scale the cluster to a larger configuration.
Table 2 describes the major components of a VPLEX cluster and their functions.
Table 2: Major components of a VPLEX cluster (Component - Description)
Engine - Contains two directors.
Director - Contains:
Five I/O modules (IOMs), as identified in Figure 14 on page 33
Management module for intra-cluster communication
Two redundant 400 W power supplies with built-in fans
CPU
Solid-state disk (SSD) that contains the GeoSynchrony operating environment
RAM
Management server - Provides:
Management interface to a public IP network
Management interfaces to other VPLEX components in the cluster
Event logging service
Power subsystem - One SPS assembly (two SPS modules) provides backup power to each
engine in the event of an AC power interruption. Each SPS module maintains power for
two five-minute periods of AC loss while the engine shuts down.
Uninterruptible Power Supply (UPS) (dual-engine or quad-engine cluster only) - One UPS
provides battery backup for Fibre Channel switch A and the management server, and a
second UPS provides battery backup for Fibre Channel switch B. Each UPS module
maintains power for two five-minute periods of AC loss while the engine shuts down.
[Figure: VS2 single-engine cluster - management server, laptop tray, Engine 1 (Director A with SPS 1A, Director B with SPS 1B)]
[Figure: VS2 dual-engine cluster - management server, laptop tray, Engine 1 (Directors A/B with SPS 1A/1B), Engine 2 (Directors A/B with SPS 2A/2B)]
[Figure: VS2 quad-engine cluster - management server, laptop tray, Engines 1 through 4 (each with Directors A/B and SPS modules nA/nB)]
Figure 14 shows a VPLEX engine and its two directors, Director A and Director B,
including management modules A and B and the reserved I/O modules A4 and B4.
Depending on the cluster topology, slots A2 and B2 contain one of the following
I/O modules (IOMs) (both IOMs must be the same type):
8 Gb/s Fibre Channel
10 Gb/s Ethernet
Filler module (VPLEX Local only)
The GeoSynchrony operating system runs on the VPLEX directors, and supports:
Virtual-to-physical translations
Industry-standard Fibre Channel ports connect to host initiators and storage devices.
WAN connectivity
For VPLEX Metro and Geo configurations, dual inter-cluster WAN links connect the two
clusters.
For Fibre Channel connections, IOMs A2 and B2 contain four Fibre Channel ports.
For IP connections, IOMs A2 and B2 contain two Ethernet ports.
The inter-cluster link carries unencrypted user data. To protect the security of the data,
secure connections are required between clusters.
AC power connection
Connect your VPLEX cluster to two independent power zones to assure a highly available
power distribution configuration.
Figure 15 shows AC power supplied from independent power distribution units (PDUs):
[Figure 15: AC power from independent customer PDUs - customer PDU 1 (power zone B, black) and customer PDU 2 (power zone A, gray) feed the power distribution panels (PDPs) at the rear of the EMC cabinet; labels on the customer power lines record the PDU, panel, and circuit breaker numbers]
Standby power supplies
Each engine is connected to two standby power supplies (SPS) that provide battery
backup to each director.
SPSs have sufficient capacity to ride through transient site power failures or to vault their
cache when power is not restored within 30 seconds.
A single standby power supply provides enough power for the attached engine to ride
through two back-to-back 5-minute losses of power.
Refer to Protection from power failure on page 78.
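The ride-through-or-vault policy above can be sketched as a simple rule: a director stays on SPS battery through a transient outage, and vaults its cache if AC power is not restored within 30 seconds. This is our illustration of the stated policy, with simulated timings and names of our own choosing.

```python
# Illustrative sketch of the SPS policy described above (names are ours).

VAULT_AFTER_SECONDS = 30  # per the guide: vault if power is not restored in 30 s

def action_on_power_loss(outage_seconds: float) -> str:
    """What a director does for an AC outage of the given duration."""
    if outage_seconds < VAULT_AFTER_SECONDS:
        return "ride through on SPS battery"
    return "vault cache and shut down"

print(action_on_power_loss(12))   # ride through on SPS battery
print(action_on_power_loss(45))   # vault cache and shut down
```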
Uninterruptible power supplies
Dual and quad engine clusters include two uninterruptible power supplies, UPS-A and
UPS-B.
In the event of a power failure:
UPS-A provides power to the management server and Fibre Channel switch A.
UPS-B provides power to Fibre Channel switch B.
The two UPS units provide sufficient power to support the Fibre Channel switches and
management server for two back-to-back 5-minute power outages.
Power and environmental monitoring
GeoSynchrony monitors the overall health of the VPLEX cluster and the environment for
the VPLEX cluster hardware.
Power and environmental conditions are monitored at regular intervals. Any changes to
the VPLEX power or hardware health are logged.
Conditions that indicate a hardware or power fault generate a call-home event.
Storage array failures
VPLEX makes it easy to mirror the data of a virtual volume between two or more storage
volumes using a RAID 1 device.
When a mirror is configured, a failed array or planned outage does not interrupt service.
I/O continues on the healthy leg of the device. When the failed/removed array is restored,
the VPLEX system uses the information in logging volumes to synchronize the mirrors.
Only the changed blocks are synchronized, thereby minimizing inter-cluster traffic.
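The changed-block resynchronization described above can be sketched as follows. The structures here (a simple block map per mirror leg and a set of logged block numbers standing in for the logging volume) are illustrative assumptions, not VPLEX internals.

```python
def resync(source, target, changed_blocks):
    """Copy only the blocks recorded in the change log from the healthy
    leg (source) to the restored leg (target)."""
    for lba in sorted(changed_blocks):
        target[lba] = source[lba]
    return len(changed_blocks)

# Blocks 1 and 2 were written while one mirror leg was unavailable.
source = {0: "A", 1: "B2", 2: "C2"}
target = {0: "A", 1: "B1", 2: "C1"}
copied = resync(source, target, {1, 2})
```

Because only the logged blocks are copied, a short outage costs a short resynchronization, regardless of the total size of the mirrored volume.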
Figure 16 shows a virtual volume mirrored between two arrays.
For critical data, mirror data on two or more storage volumes that are located on
separate arrays.
For the best performance, storage volumes at each leg of the distributed device should
be the same size and hosted on arrays with the same performance characteristics.
VPLEX communications use redundant paths so that communication continues through
port failures. This redundancy allows multipathing software to redirect I/O around failed
paths.
VPLEX has its own multipathing logic that maintains redundant paths to back-end storage
from each director. This allows VPLEX to continue uninterrupted during failures of the
back-end ports, back-end fabric, and the array ports that connect the physical storage to
VPLEX.
The small form-factor pluggable (SFP) transceivers used for connectivity to VPLEX are field
replaceable units (FRUs).
Front end:
Ensure that there is a path from each host to at least one front-end port on director A
and at least one front-end port on director B.
When the VPLEX cluster has two or more engines, ensure that the host has at least one
A-side path on one engine and at least one B-side on a separate engine.
For maximum availability, each host can have a path to at least one front-end port on
every director.
Use multi-pathing software on the host servers to ensure timely response and
continuous I/O in the presence of path failures.
Ensure that each host has a path to each virtual volume through each fabric.
Ensure that the fabric zoning provides hosts redundant access to the VPLEX front-end
ports.
Back end:
Ensure that the logical unit number (LUN) mapping and masking for each storage
volume presented from a storage array to VPLEX presents the volume from at least two
array ports, on at least two different fabrics, and from different controllers.
Ensure that the LUN connects to at least two different back-end ports of each director
within a VPLEX cluster.
Active/passive arrays must have one active and one passive port zoned to each
director, and zoning must provide VPLEX with the redundant access to the array ports.
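The back-end checks above can be expressed as a small validation sketch. A "path" here is modeled as a tuple of (array port, fabric, controller, director back-end port); the port names and the tuple layout are invented for illustration.

```python
def lun_paths_ok(paths):
    """True if the LUN is presented from at least two array ports, on at
    least two fabrics, from different controllers, and reaches at least
    two different director back-end ports."""
    return (len({p[0] for p in paths}) >= 2 and   # array ports
            len({p[1] for p in paths}) >= 2 and   # fabrics
            len({p[2] for p in paths}) >= 2 and   # controllers
            len({p[3] for p in paths}) >= 2)      # director back-end ports

good = [("SPA0", "fab1", "SPA", "B0-FC00"),
        ("SPB1", "fab2", "SPB", "B0-FC01")]
bad = [("SPA0", "fab1", "SPA", "B0-FC00"),   # one fabric, one controller:
       ("SPA1", "fab1", "SPA", "B0-FC01")]   # not redundant
```

A check of this shape makes the single points of failure explicit: the second configuration fails because all paths share one fabric and one controller.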
One COM I/O module is used for intra-cluster and inter-cluster connectivity.
Each I/O module is a serviceable FRU. The following sections describe the behavior of the
system.
Front end I/O module
Failure of a front end I/O module causes all paths that are connected to the failed module
to fail. VPLEX automatically sends a call-home notification.
Follow the guideline described in Best practices: Fibre Channel ports on page 36 to
ensure that hosts have a redundant path to their data.
During the removal and replacement of an I/O module, the affected director resets.
Back end I/O module
Failure of a back end I/O module causes all paths that are connected to the failed module
to fail. VPLEX automatically sends a call-home notification.
Follow the guidelines described in Best practices: Fibre Channel ports on page 36 to
ensure that each director has a redundant path to each storage volume through a separate
I/O module.
During the removal and replacement of an I/O module, the affected director resets.
Local COM I/O module
Failure of the local COM I/O module of a director causes the director to reset and stops all
service provided from that director.
Follow the guidelines described in Best practices: Fibre Channel ports on page 36 to
ensure that each host has redundant access to its virtual storage through multiple
directors.
During the removal and replacement of a local I/O module, the affected director resets. If
best practices are followed, the reset of a single director does not cause the host to lose
access to its storage.
Director failure
Failure of a director causes the loss of all service from that director. The second director in
the engine continues to service I/O.
VPLEX clusters containing two or more engines benefit from the additional redundancy
provided by the additional directors.
Each director within a cluster is capable of presenting the same storage.
Follow the guidelines described in Best practices: Fibre Channel ports on page 36 to
allow a host to ride through director failures by placing redundant paths to their virtual
storage through ports provided by different directors.
The combination of multipathing software on the hosts and redundant paths through
different directors of the VPLEX system allows the host to ride through the loss of a
director.
Each director is a serviceable FRU.
Intra-cluster IP management network failure
In Metro and Geo configurations, VPLEX clusters are connected by a pair of private local IP
subnets between the directors and the management server. These subnets also connect
the VPLEX Witness server (if it is deployed) and the directors.
Failure on one of these subnets can result in the inability of some subnet members to
communicate with other members on that subnet.
Because the subnets are redundant, failure of one subnet results in no loss of service or
manageability.
Note: Failure of a single subnet may result in loss of connectivity between the director and
VPLEX Witness.
Intra-cluster Fibre Channel switch failure
Dual and quad engine clusters include a pair of dedicated Fibre Channel switches for
intra-cluster communication between the directors within the cluster.
Two redundant Fibre Channel fabrics are created. Each switch serves a different fabric.
Failure of a single Fibre Channel switch results in no loss of processing or service.
Inter-cluster WAN links
Best practices
In VPLEX Metro and VPLEX Geo configurations, the clusters are connected through
redundant WAN links that you provide.
When configuring the inter-cluster network:
Latency must be less than 5 milliseconds (ms) round-trip time (RTT) for a VPLEX Metro,
and less than 50 ms RTT for a VPLEX Geo.
Switches supporting the WAN links must be configured with a battery backup UPS.
Every WAN port on every director must be able to connect to a WAN port on every
director in the other cluster.
Logically isolate VPLEX Metro/Geo traffic from other WAN traffic using VSANs or
LSANs.
SPS/UPS failures
Each standby power supply (SPS) is a field replaceable unit (FRU) and can be replaced with
no disruption.
SPS batteries support two sequential outages (not greater than 5 minutes) without data
loss. The recharge time for an SPS is up to 5.5 hours.
Each uninterruptible power supply (UPS) is a FRU and can be replaced with no disruption.
UPS modules support two sequential outages (not greater than 5 minutes) to the Fibre
Channel switches in a multi-engine cluster. The recharge time for a UPS to reach 90%
capacity is 6 hours.
Note: While the batteries can support two 5-minute power losses, the VPLEX Local, VPLEX
Metro, or VPLEX Geo cluster vaults after a 30-second power loss to ensure that there is
enough battery power to complete the cache vault.
In releases prior to Release 5.1, on all configurations, vaulting is triggered if all the
following conditions are met:
AC power is lost (due to power failure or faulty hardware) in power zone A from
engine X.
AC power is lost (due to power failure or faulty hardware) in power zone B from
engine Y.
(X and Y are the same in a single-engine configuration, but may or may not be the
same in dual- or quad-engine configurations.)
Both conditions persist for more than 30 seconds.
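The trigger conditions above amount to a simple predicate: power lost in both zones (on any engine), with both losses persisting past the threshold. The following sketch is illustrative; the parameter names are invented.

```python
def vault_triggered(zone_a_lost_secs, zone_b_lost_secs, threshold=30):
    """Vault when AC power is lost in power zone A (on some engine) AND
    in power zone B (on some engine, not necessarily the same one), and
    both conditions have persisted longer than `threshold` seconds.
    A value of None means power in that zone is still present."""
    return (zone_a_lost_secs is not None and
            zone_b_lost_secs is not None and
            zone_a_lost_secs > threshold and
            zone_b_lost_secs > threshold)
```

Losing only one zone, or losing both only briefly, does not vault: the redundant zone or the SPS batteries ride through the transient.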
Release 5.1:
In a VPLEX Local or VPLEX Metro configuration, vaulting is triggered if all the following
conditions are met:
AC power is lost (due to power failure or faulty hardware) or becomes unknown
in the minimum number of directors required for the cluster to be operational.
The condition persists for more than 30 seconds.
VPLEX Witness failure
If VPLEX Witness is deployed, failure of the VPLEX Witness has no impact on I/O as long as
the two clusters stay connected with each other.
If a cluster fails or an inter-cluster network partition occurs while VPLEX Witness is down,
there will be data unavailability on all surviving clusters.
Best practice
Best practice is to disable VPLEX Witness (while the clusters are still connected) if its
outage is expected to be long, and then revert to using preconfigured detach rules.
Once VPLEX Witness recovers, re-enable it.
Refer to the EMC VPLEX CLI Guide for information about the commands to disable and
enable VPLEX Witness.
VPLEX management server failure
I/O processing of the VPLEX directors does not depend on the management servers. Thus,
in most cases failure of a management server does not interrupt the I/O processing.
VPLEX Witness traffic traverses the management server. If the management server fails in
a configuration where VPLEX Witness is deployed, then the VPLEX Witness cannot
communicate with the cluster.
In this state, failure of the remote VPLEX cluster results in data unavailability. Failure of
only the inter-cluster network has no effect: the remote cluster continues I/O processing
regardless of preference because it is still connected to VPLEX Witness.1
Best practice
Best practice is to disable VPLEX Witness (while the clusters are still connected) if its
outage is expected to be long, and then revert to using preconfigured detach rules.
When the management server is replaced or repaired, use the cluster-witness enable CLI
command to re-enable VPLEX Witness.
Refer to the EMC VPLEX CLI Guide for information about the commands to disable and
enable VPLEX Witness.
Component IP addresses
This section provides details about the IP addresses that are used to connect the
components within a VPLEX cluster.
1. This description applies only to synchronous consistency groups with a rule setting that identifies
a specific preference.
IP addresses - cluster-1
Cluster IP Seed = 1
Enclosure IDs = engine numbers
Management server:
Mgt A port: 128.221.252.33
Mgt B port: 128.221.253.33
Service port: 128.221.252.2
FC switch B: 128.221.253.34
Engine 1:
Director 1A, A side: 128.221.252.35
Director 1A, B side: 128.221.253.35
Director 1B, A side: 128.221.252.36
Director 1B, B side: 128.221.253.36
Engine 2:
Director 2A, A side: 128.221.252.37
Director 2A, B side: 128.221.253.37
Director 2B, A side: 128.221.252.38
Director 2B, B side: 128.221.253.38
Engine 3:
Director 3A, A side: 128.221.252.39
Director 3A, B side: 128.221.253.39
Director 3B, A side: 128.221.252.40
Director 3B, B side: 128.221.253.40
Engine 4:
Director 4A, A side: 128.221.252.41
Director 4A, B side: 128.221.253.41
Director 4B, A side: 128.221.252.42
Director 4B, B side: 128.221.253.42
IP addresses - cluster-2
Cluster IP Seed = 2
Enclosure IDs = engine numbers
Management server:
Mgt A port: 128.221.252.65
Mgt B port: 128.221.253.65
Service port: 128.221.252.2
FC switch B: 128.221.253.66
Engine 1:
Director 1A, A side: 128.221.252.67
Director 1A, B side: 128.221.253.67
Director 1B, A side: 128.221.252.68
Director 1B, B side: 128.221.253.68
Engine 2:
Director 2A, A side: 128.221.252.69
Director 2A, B side: 128.221.253.69
Director 2B, A side: 128.221.252.70
Director 2B, B side: 128.221.253.70
Engine 3:
Director 3A, A side: 128.221.252.71
Director 3A, B side: 128.221.253.71
Director 3B, A side: 128.221.252.72
Director 3B, B side: 128.221.253.72
Engine 4:
Director 4A, A side: 128.221.252.73
Director 4A, B side: 128.221.253.73
Director 4B, A side: 128.221.252.74
Director 4B, B side: 128.221.253.74
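The director addresses in the two listings follow a regular pattern tied to the Cluster IP Seed: each cluster's host numbers are offset by 32 × (seed − 1). The sketch below reproduces that pattern; it is an observation from the address listings above, not a documented formula.

```python
def director_ip(seed, engine, director, side):
    """Compute a director management IP from the Cluster IP Seed.
    side 'A' uses the 128.221.252 subnet; side 'B' uses 128.221.253.
    Directors within a cluster occupy consecutive host numbers
    starting at 35 for director 1A."""
    subnet = 252 if side == "A" else 253
    host = (35 + 32 * (seed - 1)          # cluster offset
            + 2 * (engine - 1)            # two directors per engine
            + (1 if director == "B" else 0))
    return f"128.221.{subnet}.{host}"
```

For example, director 3A's B-side address in cluster-2 works out to 128.221.253.71, matching the listing above.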
CHAPTER 3
VPLEX Software
This chapter describes the major components of VPLEX software. Topics include:
GeoSynchrony......................................................................................................... 45
Management interfaces........................................................................................... 46
Provisioning with VPLEX .......................................................................................... 49
Data caching ........................................................................................................... 54
Consistency groups ................................................................................................. 56
Cache vaulting ........................................................................................................ 60
GeoSynchrony
GeoSynchrony is the operating system that runs on the VPLEX directors. GeoSynchrony
runs on both VS1 and VS2 hardware.
GeoSynchrony provides the following storage features:
Storage volume encapsulation
RAID 0
RAID-C
RAID 1
Distributed RAID 1
Extents
Storage volumes can be broken into extents and devices can be created from
these extents.
Considerations: Use extents when LUNs from a back-end storage array are
larger than the desired LUN size for a host. This provides a convenient way of
allocating what is needed while taking advantage of the dynamic thin
allocation capabilities of the back-end array.
Migration
Global Visibility
The presentation of a volume from one VPLEX cluster where the physical
storage for the volume is provided by a remote VPLEX cluster.
Considerations: Use Global Visibility for AccessAnywhere collaboration
between locations. The cluster without local storage for the volume will use
its local cache to service I/O but non-cached operations incur remote
latencies to write or read the data.
Management interfaces
In VPLEX Metro and VPLEX Geo configurations, both clusters can be managed from either
management server.
Inside VPLEX clusters, management traffic traverses a TCP/IP based private management
network.
In VPLEX Metro and VPLEX Geo configurations, management traffic traverses a VPN tunnel
between the management servers on both clusters.
Web-based GUI
The GUI supports most of the VPLEX operations, and includes EMC Unisphere for VPLEX
online help to assist new users in learning the interface.
VPLEX operations that are not available in the GUI are supported by the Command Line
Interface (CLI), which provides full functionality.
VPLEX CLI
Other commands are arranged in a hierarchical context tree, and can be executed only
from the appropriate location in the context tree.
Example 1 shows a CLI session that performs the same tasks as shown in Figure 19.
Example 1 Claim storage using the CLI:
In the following example, the claimingwizard command finds unclaimed storage volumes,
claims them as thin storage, and assigns names from a CLARiiON hints file:
VPlexcli:/clusters/cluster-1/storage-elements/storage-volumes> claimingwizard --file /home/service/clar.txt --thin-rebuild
Found unclaimed storage-volume VPD83T3:6006016091c50e004f57534d0c17e011 vendor DGC: claiming and naming clar_LUN82.
Found unclaimed storage-volume VPD83T3:6006016091c50e005157534d0c17e011 vendor DGC: claiming and naming clar_LUN84.
Claimed 2 storage-volumes in storage array car
Claimed 2 storage-volumes in total.
VPlexcli:/clusters/cluster-1/storage-elements/storage-volumes>
The EMC VPLEX CLI Guide provides a comprehensive list of VPLEX commands and detailed
instructions on using those commands.
VPLEX Element Manager API
VPLEX Element Manager API uses the Representational State Transfer (REST) software
architecture for distributed systems such as the World Wide Web. It allows software
developers and other users to use the API to create scripts to run VPLEX CLI commands.
VPLEX Element Manager API supports all VPLEX CLI commands that can be executed from
the root context.
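A script driving the Element Manager API would issue HTTPS requests to the management server. The sketch below only builds such a request without sending it; the management server address, URI layout, and header-based credentials shown are assumptions for illustration, not the documented contract.

```python
import json
import urllib.request

def build_cli_request(mgmt_server, command, username, password, args=""):
    """Build (but do not send) an HTTPS request that would run a VPLEX
    CLI command through the REST-based Element Manager API. The
    /vplex/<command> path and Username/Password headers are assumed."""
    url = f"https://{mgmt_server}/vplex/{command}"
    body = json.dumps({"args": args}).encode()
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    req.add_header("Username", username)   # assumed authentication scheme
    req.add_header("Password", password)
    return req

req = build_cli_request("10.6.209.91", "version", "service", "password")
```

In a real script the request would be sent with urllib.request.urlopen (or an HTTP library of your choice) and the JSON response parsed for the command output.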
SNMP
The VPLEX SNMP agent:
Runs on the management server and fetches performance-related data from individual
directors using a firmware-specific interface.
Runs on port 161 of the management server and uses the UDP protocol.
VPLEX MIBs are located on the management server in the /opt/emc/VPlex/mibs directory.
LDAP/AD
VPLEX administrators can choose to configure their user accounts using either:
An external OpenLDAP or Active Directory server (which integrates with Unix using
Service for UNIX 3.5 or Identity Management for UNIX or other authentication service).
OpenLDAP and Active Directory users are authenticated by the server.
Usernames/passwords that are created on an external server are fetched from the
remote system onto the VPLEX system when they are used for the first time. They are
stored on the VPLEX system after the first use.
Customers who do not want to use an external LDAP server for maintaining user accounts
can create user accounts on the VPLEX system itself.
Call-home
Provisioning with VPLEX
VPLEX offers two ways to provision storage: EZ provisioning and advanced provisioning.
All provisioning features are available in the Unisphere for VPLEX GUI.
Integrated storage
Integrated storage refers to storage created through the VPLEX Integrated Services feature.
This feature requires the use of Array Management Providers (AMPs) to leverage
functionality on the array, specifically on storage pools. If your array functionality includes
storage pools and the array is supported for use with VPLEX, you can integrate the array
with VPLEX and provision storage from pools on the array through VPLEX.
Before provisioning from storage pools, you must first register the AMP that manages the
array. Your VPLEX system can include AMPs that manage some or all of the arrays in your
environment. An array must be integrated with VPLEX in order to provision storage from
pools on the array. Note that you can also provision from storage volumes on an
integrated array.
Note: In this release, VIAS uses the Storage Management Initiative - Specification (SMI-S)
provider to communicate with the arrays that support integrated services to enable
provisioning.
For more information about registering AMPs and provisioning from storage pools, refer to
the provisioning chapter of the VPLEX Administration Guide.
Other storage
Other storage refers to storage from arrays that are not integrated with VPLEX through
AMPs. Because VPLEX cannot access functionality on the array, you cannot use array
functionality such as storage pools. Therefore, you can only provision from storage
volumes discovered on the array. There are two ways to provision from storage volumes:
EZ-Provisioning and advanced provisioning.
EZ-Provisioning
EZ-Provisioning uses the Provision from Storage Volumes wizard to guide you through
provisioning from selected storage volumes on the array.
The wizard eliminates the individual steps required to claim storage, create extents, create
devices, and then create virtual volumes on those devices. The wizard creates one volume
at a time which uses the full capacity of the selected storage volume. You can also use this
wizard to preserve existing data on a storage volume and make that data visible to a host
through a virtual volume.
Refer to Provision from storage volumes in the VPLEX Online Help for more information.
Advanced provisioning
Advanced provisioning allows you to slice (use less than the entire capacity) storage
volumes into extents. You can then use one or more of these extents to create devices,
and then virtual volumes on these devices. Advanced provisioning requires you to perform
each of these steps individually, in the order listed in the step-by-step instructions. Use
this method when you want to slice storage volumes and perform other advanced
provisioning tasks such as creating complex devices. The GUI also provides wizards that
allow you to create each of the required storage objects.
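The advanced-provisioning steps above can be sketched as a small pipeline: slice a storage volume into extents, build a device from extents, then create a virtual volume on the device. The object names and dictionary shapes below are illustrative, not VPLEX CLI objects.

```python
def create_extents(volume_blocks, extent_size):
    """Slice a claimed storage volume into fixed-size extents,
    returned as (start_block, end_block) ranges; the final extent
    may be shorter than extent_size."""
    return [(start, min(start + extent_size, volume_blocks))
            for start in range(0, volume_blocks, extent_size)]

# Slice a 1000-block storage volume into 256-block extents, build a
# device on two of them, then a virtual volume on the device.
extents = create_extents(1000, 256)
device = {"geometry": "raid-0", "extents": extents[:2]}
virtual_volume = {"supporting-device": device}
```

Slicing is what lets one large back-end LUN supply several host-sized volumes instead of committing its full capacity to a single volume.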
For more information about types of devices, refer to About devices on page 51.
Distributed storage
Distributed storage refers to storage objects that are created by using storage from both
clusters.
To view distributed storage in the Unisphere for VPLEX GUI, hover over Provision Storage >
Distributed, and then select a distributed object to view. You can also access distributed
storage from the Distributed Storage section in the navigation pane on the left when an
individual object screen opens.
Traditional (thick) provisioning anticipates future growth and thus allocates storage
capacity beyond the immediate requirement.
Traditional rebuilds copy all the data from the source to the target.
Thin provisioning allocates storage capacity only as the application needs it, when it
writes.
Thin provisioning optimizes the efficiency with which available storage space is used.
By default, VPLEX treats all storage volumes as thickly provisioned volumes. VPLEX can
claim storage volumes that are thinly provisioned on the array by using the thin-rebuild
attribute.
If a target is thinly provisioned, VPLEX reads the storage volume and does not write
unallocated blocks to the target, preserving the target's thin provisioning.
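A thin-aware rebuild of this kind can be sketched as follows: read every source block, but skip writing blocks that are entirely zero (treated here as unallocated). The block representation is an illustrative assumption.

```python
def thin_rebuild(source_blocks, write_block, block_size=512):
    """Copy a volume to a thinly provisioned target, skipping all-zero
    (unallocated) blocks so the target stays thin. Returns the number
    of blocks actually written."""
    zero = b"\x00" * block_size
    written = 0
    for lba, block in enumerate(source_blocks):
        if block != zero:
            write_block(lba, block)
            written += 1
    return written

target = {}
blocks = [b"\x00" * 512, b"data".ljust(512, b"\x00"), b"\x00" * 512]
written = thin_rebuild(blocks, lambda lba, blk: target.__setitem__(lba, blk))
```

Only the one allocated block is written; the two zero blocks never touch the target, so the target array does not allocate space for them.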
About extents
Figure 20 shows the layering: devices built on extents, and extents created on storage
volumes.
About devices
Devices combine extents or other devices into a device with specific RAID techniques such
as mirroring or striping.
Figure 21 shows the layering: a virtual volume on a top-level device, devices built on
extents, and extents on storage volumes.
A complex device has more than one component, combined by using a specific RAID
technique. The components can be extents or other devices (both simple and complex).
A top-level device consists of one or more child devices.
VPLEX supports the following RAID types:
RAID-1 - Mirrors data using at least two devices to duplicate the data. RAID-1 does not
stripe.
RAID-1 improves read performance because either extent can be read at the same
time.
Use RAID-1 for applications that require high fault tolerance, without heavy emphasis
on performance.
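The RAID-1 behavior described above can be sketched in a few lines: every write goes to all healthy legs, and a read can be served by any healthy leg. The class below is an illustrative model, not VPLEX internals.

```python
class Raid1Device:
    """Minimal RAID-1 model: write to every healthy leg; serve a read
    from any healthy leg that holds the block."""
    def __init__(self, legs):
        self.legs = legs                      # each leg: dict of lba -> data
        self.healthy = [True] * len(legs)

    def write(self, lba, data):
        for i, leg in enumerate(self.legs):
            if self.healthy[i]:
                leg[lba] = data

    def read(self, lba):
        for i, leg in enumerate(self.legs):
            if self.healthy[i] and lba in leg:
                return leg[lba]
        raise IOError("no healthy leg holds this block")

mirror = Raid1Device([{}, {}])
mirror.write(0, "payload")
mirror.healthy[0] = False                     # one leg fails...
survivor_read = mirror.read(0)                # ...I/O continues on the other
```

This is the property the storage-array failure section relies on: with a mirror in place, losing one leg does not interrupt service.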
A device's storage capacity is not available until you create a virtual volume on the device
and export that virtual volume to a host. You can create only one virtual volume per device.
Device visibility
Distributed devices
Distributed devices have their underlying storage arrays located at both clusters in a
VPLEX Metro or VPLEX Geo.
Distributed devices support virtual volumes that are presented to a host through a storage
view. From the host, the virtual volumes appear as single volumes located on a single
array.
Distributed devices are present at both clusters for simultaneous active/active read/write
access. VPLEX AccessAnywhere ensures consistency of the data between the clusters.
VPLEX distributed devices enable distributed data centers. Some of the benefits of
implementing a distributed data center include:
Increased availability - Both data centers can serve production workloads while
providing high availability for the other data center.
Increased asset utilization - Resources at both data centers are actively used, rather
than sitting idle at a passive site.
Increased performance/locality of data access - Data need not be read from the
production site as the same data is read/write accessible at both sites.
You can configure up to 8000 distributed devices in a VPLEX system. That is, the total
number of distributed virtual volumes plus the number of top level local devices must not
exceed 8000.
Mirroring
RAID-1 data is mirrored using at least two extents to duplicate the data. Read performance
is improved because either extent can be read at the same time.
VPLEX manages mirroring between heterogeneous storage arrays for both local and
distributed mirroring.
Local mirroring
Local mirroring (mirroring on VPLEX Local systems) protects RAID-1 virtual volumes within
a data center.
VPLEX RAID-1 devices provide a local full-copy RAID 1 mirror of a device independent of
the host and operating system, application, and database.
Remote mirroring
Distributed mirroring (VPLEX Metro and VPLEX Geo) protects distributed virtual volumes
by mirroring them between two VPLEX clusters.
About virtual volumes
If needed, you can expand the capacity of a virtual volume. The supported expansion
methods are:
Storage-volume
Concatenation
Data caching
VPLEX uses advanced EMC data caching to improve I/O performance and reduce storage
array contention. The type of data caching used for distributed volumes depends on the
VPLEX configuration.
VPLEX Local and Metro configurations have round-trip latencies of 5 ms or less, and
use write-through caching.
VPLEX Geo configurations support round-trip latencies of greater than 5 ms, and use
write-back caching.
Write-through caching
In write-through caching, a director writes to the back-end storage in both clusters before
acknowledging the write to the host.
Write-through caching maintains a real-time synchronized mirror of a virtual volume
between the two VPLEX clusters providing a recovery point objective (RPO) of zero data
loss and concurrent access to the volume through either of the clusters.
In the VPLEX user interface, write-through caching is known as synchronous cache mode.
Write-back caching
In write-back caching, a director stores the data in its cache and also protects it at another
director in the local cluster before acknowledging the write to the host.
At a later time, the data is written to the back-end storage. Write-back caching provides
an RPO that can be as short as a few seconds.
This type of caching is performed on VPLEX Geo configurations, where the latency is
greater than 5 ms.
In the VPLEX user interface, write-back caching is known as asynchronous cache mode.
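The difference between the two cache modes comes down to what must be durable before the host sees an acknowledgment. The sketch below is an illustrative model; the list-based "back ends" and "caches" are assumptions, not VPLEX structures.

```python
def host_write(mode, data, backends, local_cache, peer_cache):
    """Return 'ack' once the write is durable enough for the given mode."""
    if mode == "write-through":
        for backend in backends:      # write back-end storage in both
            backend.append(data)      # clusters first: RPO is zero
        return "ack"
    if mode == "write-back":
        local_cache.append(data)      # store in the director's cache and
        peer_cache.append(data)       # protect it on a second director
        return "ack"                  # back-end write happens later
    raise ValueError(mode)

b1, b2, cache, peer = [], [], [], []
host_write("write-through", "w1", [b1, b2], cache, peer)
host_write("write-back", "w2", [b1, b2], cache, peer)
```

After the write-back acknowledgment, "w2" exists only in director memory; those are the dirty cache pages that cache vaulting must preserve through a power failure.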
Logging volumes
Logging volumes track blocks that are written during an inter-cluster link outage. After the
inter-cluster link or leg is restored, the VPLEX system uses the information in the logging
volumes to synchronize the mirrors by sending only the changed blocks across the link.
Logging volumes also track changes during loss of a volume when that volume is one of
the mirror legs in a distributed device.
For VPLEX Metro and Geo configurations, logging volumes are required at each cluster
before a distributed device can be created.
VPLEX Local configurations and systems that do not have distributed devices do not
require logging volumes.
Back-end load balancing
VPLEX uses all paths to a LUN in a round-robin fashion thus balancing the load across all
paths.
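Round-robin selection of this kind can be sketched in a couple of lines; the port names below are invented for illustration.

```python
from itertools import cycle

# Paths from one director to a back-end LUN (hypothetical port names).
paths = ["A0-FC00", "A0-FC01", "A1-FC00", "A1-FC01"]
next_path = cycle(paths)

def issue_io(block):
    """Send each I/O down the next path in round-robin order."""
    return next(next_path)

chosen = [issue_io(block) for block in range(8)]   # cycles through all paths
```

Eight I/Os visit each of the four paths exactly twice, so no single path carries a disproportionate share of the load.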
Slower storage hardware can be dedicated for less frequently accessed data and
optimized hardware can be dedicated to applications that require the highest storage
response.
Consistency groups
VPLEX consistency groups aggregate volumes to enable the application of a common set
of properties to the entire group.
Consistency groups aggregate up to 1,000 virtual volumes into a single entity that can be
managed as easily as an individual volume.
If all storage for an application with rollback capabilities is in a single consistency group,
the application can recover from a complete cluster failure or inter-cluster link failure with
little or no data loss. Data loss, if any, is determined by the application's data access
pattern and the consistency group's cache mode.
All consistency groups guarantee a crash-consistent image of their member virtual
volumes. In the event of a director, cluster, or inter-cluster link failure, consistency groups
prevent possible data corruption.
There are two types of consistency groups:
Synchronous consistency groups
Synchronous consistency groups provide a convenient way to apply rule sets and other
properties to a group of volumes in a VPLEX Local or VPLEX Metro system.
Synchronous consistency groups simplify configuration and administration on large
systems. VPLEX supports up to 1024 synchronous consistency groups.
Synchronous consistency groups: visibility
Local visibility - The local volumes in the consistency group are visible only to the local
cluster.
Global visibility - The local volumes in the consistency group have storage at one
cluster, but are visible to both clusters.
Local visibility
Local consistency groups that have the visibility property set to the local cluster can read
and write only to their local cluster.
Global visibility
Global visibility allows both clusters to receive I/O from the cluster that does not have a
local copy.
Any reads that cannot be serviced from local cache are transferred across the link. This
allows the remote cluster to have instant on demand access to the consistency group.
Figure 25 shows a local consistency group with global visibility.
Asynchronous consistency groups
Asynchronous consistency groups provide a convenient way to apply rule sets and other
properties to distributed volumes in a VPLEX Geo.
VPLEX supports up to 16 asynchronous consistency groups.
In asynchronous cache mode, write order fidelity is maintained by batching I/O between
clusters into packages called deltas that are exchanged between clusters.
Each delta contains a group of writes that were initiated in the same window of time.
Each asynchronous consistency group maintains its own queue of deltas. Before a delta is
exchanged between clusters, data within the delta can vary by cluster. After a delta is
exchanged and committed, data is the same on both clusters.
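The delta life cycle above can be modeled simply: writes land in the open delta and in the local cluster's image, and after the delta is exchanged and committed, both clusters hold the same data for those blocks. The class below is an illustrative model; the names are invented.

```python
class AsyncConsistencyGroup:
    """Model of write-order-fidelity deltas: writes batch into the open
    delta; exchange-and-commit makes both cluster images match."""
    def __init__(self):
        self.open_delta = {}                  # lba -> data, current window
        self.local_image, self.remote_image = {}, {}

    def write(self, lba, data):
        self.open_delta[lba] = data
        self.local_image[lba] = data          # local cluster sees it now

    def exchange_and_commit(self):
        self.remote_image.update(self.open_delta)
        self.open_delta = {}                  # start the next time window

group = AsyncConsistencyGroup()
group.write(7, "new")
differ_before = group.local_image != group.remote_image
group.exchange_and_commit()
```

Between the write and the commit, the clusters legitimately differ; that window of uncommitted (dirty) data is exactly what is at risk if the clusters lose contact.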
If access to the back end array is lost while the system is writing a delta, the data on the
disk is no longer consistent and requires automatic recovery when access is restored.
Asynchronous cache mode can give better performance, but there is a higher risk of losing
data if:
There is an inter-cluster link partition and both clusters are actively writing. Instead of
waiting for the link to be restored, the user chooses to accept a data rollback in order
to reduce the recovery time objective (RTO).
Asynchronous consistency groups: active vs. passive
Detach rules
Most I/O workloads require specific sets of virtual volumes to resume on one cluster and
remain suspended on the other cluster during failures.
VPLEX includes two levels of detach rules that determine which cluster continues I/O
during an inter-cluster link failure or cluster failure.
Device-level detach rules determine which cluster continues for an individual device.
Consistency group-level detach rules determine which cluster continues for all the
member volumes of a consistency group.
If a consistency group has a detach-rule configured, the rule applies to all volumes in the
consistency group, and overrides any rule-sets applied to individual volumes.
In the event of connectivity loss with the remote cluster, the detach rule defined for each
consistency group identifies a preferred cluster (if there is one) that can resume I/O to the
volumes in the consistency group.
In a VPLEX Metro configuration, I/O proceeds on the preferred cluster and is suspended on
the non-preferred cluster.
In a VPLEX Geo configuration, I/O proceeds on the active cluster only when the remote
cluster has no dirty data in cache.
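The Metro and Geo behaviors just described reduce to a small decision rule. The function below is a hedged sketch with invented names, not VPLEX logic.

```python
def io_continues(config, at_cluster, preferred, remote_has_dirty_data=False):
    """True if I/O proceeds at `at_cluster` after connectivity to the
    remote cluster is lost, per the detach-rule behavior above."""
    if config == "metro":
        return at_cluster == preferred        # preferred cluster resumes
    if config == "geo":
        # The active cluster proceeds only when the remote cluster
        # holds no dirty data in cache.
        return at_cluster == preferred and not remote_has_dirty_data
    raise ValueError(config)
```

For example, in a Metro configuration the preferred cluster resumes and the other suspends, while in a Geo configuration even the preferred cluster suspends if the remote side still held dirty data.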
In cases where the configured winner of a detach (manual, rule, or VPLEX Witness
direction) cannot present a consistent image for the consistency group at that cluster, the
detach is prevented. A consistency group may not be able to present a consistent image
at a given cluster if one or more of its virtual volumes does not have a healthy leg at that
cluster.
Asynchronous
consistency groups:
active-cluster-wins
detach rule
If both clusters were active at the time of the failure, I/O suspends because the cache image is inconsistent on both clusters and must be rolled back to a point where both clusters had a consistent image before I/O can continue.
Application restart is required after the rollback. If both clusters were passive and have no dirty data at the time of the failure, the cluster that was most recently the active cluster proceeds with I/O after the failure.
Regardless of the detach rule configured for an asynchronous consistency group, as long as the remote cluster has dirty data, the local cluster suspends I/O when it observes loss of connectivity with the remote cluster.
This allows the administrator to stop or restart the application before exposing it to the rolled-back, time-consistent data image (if a rollback is required).
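The suspend rule above reduces to a small predicate. The following is a minimal sketch, assuming illustrative names (this is not a VPLEX API):

```python
# Minimal sketch of the suspend rule above. The function name and parameters
# are illustrative assumptions, not VPLEX internals.

def io_state_on_link_loss(is_preferred: bool, remote_has_dirty_data: bool) -> str:
    """What the local cluster does with I/O after losing the remote cluster."""
    if remote_has_dirty_data:
        # Suspend regardless of preference, so the administrator can stop or
        # restart the application before a rolled-back image is exposed.
        return "suspend"
    # No dirty data at the remote: the detach-rule preference decides.
    return "continue" if is_preferred else "suspend"
```

Note that dirty data at the remote cluster forces a suspension even on the preferred cluster; preference only matters in the clean case.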
Note: VPLEX Witness does not guide failover behavior for asynchronous consistency
groups. In Geo configurations (asynchronous consistency groups), VPLEX Witness
observations can be used to diagnose problems, but not to automate failover.
Cache vaulting
VPLEX uses the individual director's memory systems to ensure durability of user data and
critical system configuration data.
If a power failure on a VPLEX Geo cluster (using write-back cache mode) occurs, then the
data in cache memory might be at risk. Each VPLEX director copies its dirty cache data to
the local solid state storage devices (SSDs) using a process known as cache vaulting.
Dirty cache pages are pages in a director's memory that have not been written to back-end
storage but were acknowledged to the host. Dirty cache pages include the copies
protected on a second director in the cluster. These pages must be preserved in the
presence of power outages to avoid loss of data that has already been acknowledged to
the host.
After each director vaults its dirty cache pages, VPLEX shuts down the director's firmware.
Note: Although there is no dirty cache data in VPLEX Local or VPLEX Metro configurations,
vaulting quiesces all I/O when data is at risk due to power failure. This minimizes risk of
metadata corruption.
When power is restored, VPLEX initializes the hardware and the environmental system,
checks the data validity of each vault, and unvaults the data. In VPLEX Geo configurations,
cache vaulting is necessary to safeguard the dirty cache data under emergency conditions.
Vaulting can be used in two scenarios:
Data at risk due to power failure: VPLEX monitors all components that provide power
to the VPLEX cluster. If VPLEX detects AC power loss that would put data at risk, it
takes a conservative approach and initiates a cluster wide vault if the power loss
exceeds 30 seconds.
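The power-loss rule above amounts to a threshold check. The sketch below is illustrative only; the names, structure, and timestamp convention are assumptions:

```python
# Illustrative sketch of the vault trigger: a cluster-wide vault begins only
# when AC power loss persists past the threshold (30 seconds in the text).

POWER_LOSS_VAULT_THRESHOLD_S = 30.0

def should_vault(power_loss_started_at, now):
    """True when an ongoing AC power loss has exceeded the vault threshold.

    power_loss_started_at is None while power is healthy, otherwise the
    timestamp (in seconds) at which the loss was first detected.
    """
    if power_loss_started_at is None:
        return False                  # power is healthy; nothing to do
    return (now - power_loss_started_at) > POWER_LOSS_VAULT_THRESHOLD_S
```

A brief power dip under the threshold is ridden through without vaulting; only a sustained loss triggers the conservative cluster-wide vault.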
Note: Power failure of the UPS (in dual and quad engine configurations) does not
currently trigger any vaulting actions.
When performing maintenance activities on a VPLEX Geo system, service personnel must not remove power from one or more engines if doing so would cause both directors in those engines to lose power, unless both directors have been shut down and are no longer monitoring power. Failure to do so will lead to data unavailability in the affected cluster.
To avoid unintended vaults, always follow official maintenance procedures.
CHAPTER 4
Integrity and Resiliency
This chapter describes how VPLEX's high availability and redundancy features provide
robust system integrity and resiliency. Topics include:
SAN outages
To achieve high availability, you must create redundant host connections and supply hosts with multipathing drivers.
Note: In the event of a front-end port failure or a director failure, hosts without redundant
physical connectivity to a VPLEX cluster and without multipathing software installed could
be susceptible to data unavailability.
Cluster
VPLEX is a true cluster architecture: all devices are always available, and I/O that enters the cluster from anywhere can be serviced by any node within the cluster, while cache coherency is maintained for all reads and writes.
As you add more devices to the cluster, you get the added benefits of more cache,
increased processing power, and more performance.
A VPLEX cluster provides N-1 fault tolerance, which means any device failure or any component failure can be sustained, and the cluster continues to operate as long as one device survives.
This highly available and robust architecture is capable of sustaining even multiple failures while the cluster continues to provide virtualization and storage services.
A VPLEX cluster (either VS1 or VS2) consists of redundant hardware components.
A single engine supports two directors. If one director in an engine fails, the second director in the engine continues to service I/O. Similarly, if a VPLEX cluster contains multiple engines, VPLEX can handle more than one failure without disrupting any services as long as quorum (defined by set rules) is not lost. See Quorum for more information.
All hardware resources (CPU cycles, I/O ports, and cache memory) are pooled.
Two-cluster configurations (Metro and Geo) offer true high availability. Operations
continue and data remains online even if an entire site fails.
VPLEX Metro configurations provide a high availability solution with zero recovery point
objective (RPO).
VPLEX Geo configurations provide near-zero RPO, and failover is still automated.
Quorum
Quorum refers to the minimum number of directors required for the cluster to service and
maintain operations.
Different quorum rules apply in different situations. The rules that allow a booting cluster to become operational and start servicing I/O are called gaining quorum. The rules that allow an operational cluster experiencing director failures to continue servicing operations and I/O are called maintaining quorum. When a cluster must stop servicing operations and I/O, it is said to be losing quorum. These rules are described below:
Quorum maintenance - An operational VPLEX cluster continues operating in the following scenarios:
Director failures
If fewer than half of the operational directors with quorum fail, the cluster continues to operate.
If exactly half of the operational directors with quorum fail, the remaining directors check the operational status of the failed directors over the management network and, if they can confirm the failure, remain alive.
After recovering from such a failure, a cluster can tolerate further similar director failures until only one director remains. In a single-engine cluster, a maximum of one director failure can be tolerated.
Intra-cluster communication failure
If there is a split down the middle, that is, half of the operational directors with quorum lose communication with the other half while both halves are still running, the directors determine operational status over the management network. The half that contains the director with the lowest UUID keeps running; the half without the lowest UUID stops operating.
Quorum loss - An operational VPLEX cluster seeing failures stops operating in the
following scenarios:
If more than half of the operational directors with quorum fail.
If half of the operational directors with quorum fail, and the directors are unable to determine the operational status of the other half of the directors (whose membership includes the low UUID).
In a dual or quad engine cluster, if all of the directors lose contact with each other.
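The maintaining-quorum rules above can be summarized in a small decision function. This is a simplified sketch with assumed names; the real logic also involves the management-network check and lowest-UUID tie-breaking described above:

```python
# Simplified sketch of the maintaining-quorum rules (assumed names).

def cluster_keeps_quorum(n_directors: int, n_failed: int,
                         mgmt_confirms_failure: bool) -> bool:
    """Whether an operational cluster keeps quorum after director failures."""
    if n_failed < n_directors / 2:
        return True                   # fewer than half failed: keep running
    if n_failed == n_directors / 2:
        # Exactly half failed: survivors must confirm the failures over the
        # management network to remain alive.
        return mgmt_confirms_failure
    return False                      # more than half failed: quorum is lost
```

For example, a quad-engine cluster with eight directors rides through three failures outright, needs management-network confirmation at four, and loses quorum beyond that.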
Path redundancy
Path redundancy is critical for high availability. This section describes how VPLEX delivers
resilience using multiple paths:
Different ports
Front-end ports on all directors can provide access to any virtual volume in the cluster.
Include multiple front-end ports in each storage view to protect against port failures.
When a director port fails, the host multipathing software seamlessly fails over to another
path through a different port, as shown in Figure 26.
Combine multipathing software with redundant volume presentation for continuous data availability in the presence of port failures.
Back-end ports, local COM ports, and WAN COM ports provide similar redundancy for
additional resilience.
Different directors
Each VPLEX engine includes redundant directors. Each director can service I/O for any
other director in the cluster due to the redundant nature of the global directory and cache
coherency.
If one director in the engine fails, the other director immediately takes over the I/O processing from the host.
In Figure 27, Director A has failed, but Director B services the host I/O that was previously
being serviced by Director A.
Present the virtual volumes through each director so that all directors except one can fail without causing data loss or unavailability.
Configure paths through both an A director and a B director to ensure continuous I/O during non-disruptive upgrade of VPLEX.
Connect VPLEX directors to both Fibre Channel fabrics (if used) for the front-end
(host-side), and the back-end (storage array side). Isolate the fabrics.
Redundant connections from the directors to the fabrics and fabric isolation allows
VPLEX to ride through failures of an entire fabric with no disruption of service.
Connect hosts to both fabrics and use multi-pathing software to ensure continuous
data access during failures.
Different engines
In VPLEX Geo, directors in the same engine serve as protection targets for each other. If a single director in an engine goes down, the remaining director uses another director in the cluster as its protection pair. Simultaneously losing both directors of an engine in an active cluster, though very rare, could result in loss of crash consistency. However, the loss of two directors in different engines can be handled as long as other directors can serve as protection targets for the failed directors.
In a VPLEX Metro, multi-pathing software plus volume presentation on different engines
yields continuous data availability in the presence of engine failures.
Site distribution
When two VPLEX clusters are connected together with VPLEX Metro or Geo, VPLEX gives
you shared data access between sites. VPLEX can withstand a component failure, a site
failure, or loss of communication between sites and still keep the application and data
online and available.
VPLEX Metro ensures that if a data center goes down, or even if the link to that data center
goes down, the other site can continue processing the host I/O.
In Figure 30, despite a site failure at Data Center B, I/O continues without disruption in
Data Center A.
Install the optional VPLEX Witness on a server in a separate failure domain to provide
further fault tolerance in VPLEX Metro configurations.
See High Availability with VPLEX Witness on page 69.
VPLEX Witness works in conjunction with consistency groups. VPLEX Witness guidance
does not apply to local volumes and distributed volumes that are not members of a
consistency group.
VPLEX Witness capabilities vary depending on whether the VPLEX is a Metro (synchronous
consistency groups) or Geo (asynchronous consistency groups).
In Metro systems, VPLEX Witness provides seamless zero recovery time objective
(RTO) fail-over for storage volumes in synchronous consistency groups.
Combine VPLEX Witness and VPLEX Metro to provide the following features:
High availability for applications in a VPLEX Metro configuration leveraging
synchronous consistency groups (no single points of storage failure).
Fully automatic failure handling of synchronous consistency groups in a VPLEX
Metro configuration (provided these consistency groups are configured with a
specific preference).
Better resource utilization.
Figure 31 on page 70 shows a high level architecture of VPLEX Witness. The VPLEX Witness
server must reside in a failure domain separate from cluster-1 and cluster-2.
Figure 31 shows VPLEX Witness deployed in Failure Domain #3, connected over the IP management network to Cluster 1 (Failure Domain #1) and Cluster 2 (Failure Domain #2), which are joined by inter-cluster networks A and B.
The VPLEX Witness server must be deployed in a failure domain separate from both of the VPLEX clusters. This deployment enables VPLEX Witness to distinguish between a site outage and an inter-cluster link outage, and to provide the correct guidance.
Witness installation
considerations
It is important to deploy the VPLEX Witness server VM in a failure domain separate from either of the cluster domains.
A failure domain is a set of entities affected by the same set of faults. The scope of the failure domain depends on the set of fault scenarios that must be tolerated in a given environment. For example:
If the two clusters are deployed on different floors of the same data center, deploy the VPLEX Witness server VM on a separate floor.
If the two clusters are deployed in two different data centers, deploy the VPLEX Witness server VM in a third data center.
In all cases, the VPLEX Witness server VM should be protected by a firewall.
The VPLEX Witness software includes a client on each of the VPLEX clusters. VPLEX
Witness does not appear in the CLI until the client has been configured.
Failures in Metro
systems: without
VPLEX Witness
VPLEX Witness does not guide consistency groups with the no-automatic-winner detach rule. This discussion applies only to synchronous consistency groups with the winner cluster-name delay seconds detach rule.
Synchronous consistency groups use write-through caching. Host writes to a distributed
volume are acknowledged back to the host only after the data is written to the back-end
storage at both VPLEX clusters.
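Write-through caching as described above can be sketched as follows. This is a minimal illustration, assuming Python lists stand in for the back-end arrays:

```python
# Sketch of write-through caching: the host write is acknowledged only after
# the data is written to back-end storage at both clusters.

def write_through(data: bytes, backends: dict) -> str:
    """Persist the write at every cluster's array, then acknowledge the host."""
    for storage in backends.values():
        storage.append(data)          # write to this cluster's back-end storage
    return "ack"                      # the host sees the ack only after both writes

backends = {"cluster-1": [], "cluster-2": []}
status = write_through(b"block-0", backends)
```

Because the acknowledgment is withheld until both clusters have the data on back-end storage, a synchronous consistency group never has dirty data to lose, which is why Metro failover can have zero RPO.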
In Figure 32, the winner cluster-name delay seconds detach rule designates cluster-1 as
the preferred cluster. That is, during an inter-cluster link outage or a cluster failure, I/O to
the device leg at cluster-1 continues, and I/O to the device leg at cluster-2 is suspended.
Three common types of failures that illustrate how VPLEX responds without VPLEX Witness are described below:
Figure 32 illustrates these three scenarios (Scenario 1, Scenario 2, and Scenario 3), each showing Cluster 1 and Cluster 2.
Failures in Metro
systems: with VPLEX
Witness
When VPLEX Witness is deployed in a VPLEX Metro configuration, failure of the preferred
cluster (Scenario 3) does not result in data unavailability for distributed devices that are
members of (synchronous) consistency groups.
Instead, VPLEX Witness guides the surviving cluster to continue I/O, despite its
designation as the non-preferred cluster. I/O continues to all distributed volumes in all
synchronous consistency groups that do not have the no-automatic-winner detach rule.
Host applications continue I/O on the surviving cluster without any manual intervention.
When the preferred cluster fails in a Metro configuration, VPLEX Witness provides
seamless zero RTO failover to the surviving cluster.
The VPLEX Witness Server VM connects to the VPLEX clusters over the management IP
network.
The deployment of VPLEX Witness adds a point of failure to the VPLEX deployment. This
section describes the impact of failures of the VPLEX Witness Server VM and the network
connections between the VM and the clusters.
Note: This discussion applies only to VPLEX Witness in VPLEX Metro configurations.
Failures of connections between the cluster and the VPLEX Witness VM are managed as
follows:
Local Cluster Isolation - The preferred cluster loses contact with both the remote cluster
and the VPLEX Witness.
The preferred cluster is unable to receive guidance from VPLEX Witness and
suspends I/O.
VPLEX Witness guides the non-preferred cluster to continue I/O.
Remote Cluster Isolation - The preferred cluster loses contact with the remote cluster and
the non-preferred cluster loses contact with the VPLEX Witness. The preferred cluster is
connected to VPLEX Witness.
The preferred cluster continues I/O as it is still in contact with the VPLEX Witness.
The non-preferred cluster suspends I/O, as it is neither in contact with the other
cluster, nor can it receive guidance from VPLEX Witness.
Inter-Cluster Partition - Both clusters lose contact with each other, but still have access to
VPLEX Witness. VPLEX Witness preserves the detach rule failure behaviors.
I/O continues on the preferred cluster.
If the preferred cluster cannot proceed because it has not fully synchronized, the
cluster suspends I/O.
Overriding the detach rule results in a zero RTO.
Loss of Contact with VPLEX Witness - The clusters are still in contact with each other, but
one or both of the clusters has lost contact with VPLEX Witness.
There is no change in I/O.
The cluster(s) that lost connectivity with VPLEX Witness sends a call-home
notification.
If either cluster fails or if the inter-cluster link fails when VPLEX Witness is down,
VPLEX experiences data unavailability in all surviving clusters.
When the VPLEX Witness observes a failure and provides guidance, it sticks to this
governance until both clusters report complete recovery. This is crucial in order to avoid
split-brain and data corruption.
As a result, you may have a scenario where cluster-1 is isolated.
Because cluster-2 has previously received guidance from VPLEX Witness to proceed, it continues to proceed even while it is isolated. In the meantime, if cluster-1 reconnects with the VPLEX Witness server, the VPLEX Witness server tells cluster-1 to suspend. In this case, because of event timing, cluster-1 is connected to VPLEX Witness but suspended, while cluster-2 is isolated but proceeding.
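The sticky-guidance behavior above can be modeled as a tiny state machine. The class and method names are illustrative assumptions, not VPLEX internals:

```python
# Tiny state machine modeling the "sticky guidance" rule: once issued,
# guidance holds until both clusters report complete recovery.

class WitnessGuidance:
    def __init__(self):
        self.guidance = {}            # cluster name -> "proceed" or "suspend"

    def on_failure(self, proceeding, suspending):
        """Record the guidance issued when a failure is first observed."""
        self.guidance = {proceeding: "proceed", suspending: "suspend"}

    def on_reconnect(self, cluster):
        # A reconnecting cluster receives the existing guidance, not a fresh
        # decision; this is what avoids split-brain.
        return self.guidance.get(cluster, "proceed")

    def on_full_recovery(self):
        self.guidance = {}            # both clusters recovered: guidance resets
```

This reproduces the event-timing scenario described above: a cluster can be connected to the Witness yet suspended, while the other is isolated yet proceeding.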
VPLEX Witness in
VPLEX Geo
configurations
The value of VPLEX Witness is different in VPLEX Geo configurations than it is in VPLEX Metro configurations. The value also varies depending on the release of GeoSynchrony.
VPLEX Witness/
GeoSynchrony 5.0.1
For systems running GeoSynchrony 5.0.1, clusters in Geo configurations do not comply with VPLEX Witness guidance. Instead, the clusters operate according to the detach rules applied to each asynchronous consistency group.
Information displayed in the VPLEX Witness CLI context helps to determine the nature of a
failure (cluster failure or inter-cluster link outage). Administrators can use this information
to make manual fail-over decisions.
VPLEX Witness/
GeoSynchrony 5.1
For VPLEX Geo systems, VPLEX Witness automates the response to failure scenarios that
do not require data rollback.
Manual intervention is required if data rollback is needed.
No data rollback is required after a cluster failure - VPLEX Witness guides the surviving cluster to allow I/O to all distributed volumes in all the asynchronous consistency groups configured with the active-cluster-wins detach rule.
I/O continues at the surviving cluster, even if it is the non-preferred cluster.
When no rollback is required, VPLEX Witness automates failover with zero RTO and zero RPO.
No data rollback is required after an inter-cluster link outage - VPLEX Witness guides the preferred cluster to continue I/O to all distributed volumes in all asynchronous consistency groups configured with the active-cluster-wins detach rule.
I/O is suspended at the non-preferred cluster.
VPLEX Witness automates failover with zero RTO and zero RPO.
Higher availability
Combine VPLEX Witness with VMware and cross cluster connection to create even higher
availability.
See Chapter 5, VPLEX Use Cases, for more information on the use of VPLEX Witness with VPLEX Metro and VPLEX Geo configurations.
ALUA
Asymmetric Logical Unit Access (ALUA) routes I/O for a LUN that is directed to a non-active or failed storage processor to the active storage processor, without changing the ownership of the LUN.
Each LUN has two types of paths:
Active/optimized paths are direct paths to the storage processor that owns the LUN.
Active/optimized paths are usually the optimal path and provide higher bandwidth
than active/non-optimized paths.
Active/non-optimized paths are indirect paths to the storage processor that does not
own the LUN through an interconnect bus.
I/Os that traverse through the active/non-optimized paths must be transferred to the
storage processor that owns the LUN. This transfer increases latency and has an
impact on the array.
VPLEX detects the different path types and performs round robin load balancing across
the active/optimized paths.
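The path-selection behavior described above, round-robin restricted to live active/optimized paths, can be sketched as follows. The path records are illustrative assumptions:

```python
# Sketch of ALUA-aware path selection: round-robin over live active/optimized
# paths, falling back to any live path only when no optimized path survives.

from itertools import cycle

def make_path_picker(paths):
    optimized = [p for p in paths
                 if p["state"] == "active/optimized" and p["alive"]]
    candidates = optimized or [p for p in paths if p["alive"]]
    rr = cycle(candidates)            # round-robin iterator over the chosen set
    return lambda: next(rr)

paths = [
    {"name": "spa-0", "state": "active/optimized", "alive": True},
    {"name": "spa-1", "state": "active/optimized", "alive": True},
    {"name": "spb-0", "state": "active/non-optimized", "alive": True},
]
pick = make_path_picker(paths)
chosen = [pick()["name"] for _ in range(4)]
```

Keeping I/O on the optimized paths avoids the interconnect-bus transfer and the extra latency described above; the non-optimized path is only a fallback.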
VPLEX supports all three flavors of ALUA:
Explicit ALUA - The storage processor changes the state of paths in response to commands (for example, the Set Target Port Groups command) from the host (the VPLEX back end).
The storage processor must be explicitly instructed to change a path's state.
If the active/optimized path fails, VPLEX issues the instruction to transition the
active/non-optimized path to active/optimized.
There is no need to failover the LUN.
Implicit ALUA - The storage processor can change the state of a path without any
command from the host (the VPLEX back end).
If the controller that owns the LUN fails, the array changes the state of the
active/non-optimized path to active/optimized and fails over the LUN from the failed
controller.
On the next I/O after a path's state changes, the storage processor returns a Unit Attention (Asymmetric Access State Changed) to the host (the VPLEX back end).
VPLEX then re-discovers all the paths to get the updated access states.
Implicit/explicit ALUA - Either the host or the array can initiate the access state
change.
Metadata volumes
At system startup, VPLEX reads the metadata and loads the configuration information onto
each director.
When you make changes to the system configuration, VPLEX writes these changes to the
metadata volume.
If VPLEX loses access to the metadata volume, the VPLEX directors continue
uninterrupted, using the in-memory copy of the configuration. VPLEX blocks changes to
the system until access is restored or the automatic backup meta-volume is activated.
Meta-volumes experience high I/O only during system startup and upgrade.
I/O activity during normal operations is minimal.
Best practices
Configure the metadata volume for each cluster with multiple back-end storage volumes provided by different storage arrays of the same type.
Use the data protection capabilities provided by these storage arrays, such as RAID 1, to ensure the integrity of the system's metadata.
Create backup copies of the metadata whenever configuration changes are made to the system.
Perform regular backups of the metadata volumes on storage arrays that are separate from the arrays used for the metadata volume.
Logging volumes
Logging volumes track changes during a loss of connectivity, or during loss of a volume when that volume is one mirror in a distributed device.
After the inter-cluster link or leg is restored, the VPLEX system uses the information in logging volumes to synchronize the mirrors by sending only changed blocks across the link.
If no logging volume is accessible, then the entire leg is marked as out-of-date. A full
re-synchronization is required once the leg is reattached.
The logging volumes on the continuing cluster experience high I/O during:
Incremental synchronization - When the network or cluster is restored, VPLEX reads the logging volume to determine which writes to synchronize to the reattached volume.
Stripe logging volumes across several disks to accommodate the high level of I/O that
occurs during and after outages.
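The changed-block tracking described above can be sketched with a set standing in for the logging volume's bitmap. This is illustrative only; the real structure is an on-disk bitmap per mirror leg:

```python
# Sketch of changed-block tracking for incremental mirror synchronization.

def record_write(log, block):
    log.add(block)                    # mark the block as changed while detached

def resync_blocks(log):
    """Blocks to copy to the reattached leg: only those written during the outage."""
    changed = sorted(log)
    log.clear()                       # the log resets once the legs match again
    return changed

log = set()
for blk in (7, 3, 7, 42):             # writes that landed during the outage
    record_write(log, blk)
to_copy = resync_blocks(log)
```

Repeated writes to the same block cost nothing extra, which is why an incremental synchronization sends far less data across the link than the full re-synchronization required when no logging volume is accessible.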
Global cache
Memory systems of individual directors ensure durability of user and critical system data.
The method used to protect user data depends on cache mode:
Asynchronous systems (write-back cache mode) ensure data durability by storing user
data into the cache memory of the director that received the I/O, and then placing a
protection copy of this data on another director in the local cluster before
acknowledging the write to the host.
This ensures that the data is protected in two independent memories. The data is later
destaged to back-end storage arrays that provide the physical storage media.
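The protect-then-acknowledge sequence above can be sketched as follows, with assumed names and Python lists standing in for director cache memory:

```python
# Sketch of write-back protection: the write lands in the receiving director's
# cache and in a protection copy on a second director before the host is
# acknowledged; destaging to the array happens later.

def write_back(data, receiving, protecting):
    receiving["cache"].append(data)   # dirty page in the receiving director
    protecting["cache"].append(data)  # protection copy in a second director
    return "ack"                      # two independent memories now hold the data

def destage(director, array):
    array.extend(director["cache"])   # later: flush dirty pages to the array
    director["cache"].clear()
```

Until destaging completes, the pages are dirty, which is exactly the data that vaulting must preserve through a power loss.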
If a power failure lasts longer than 30 seconds in a VPLEX Geo configuration, then each
VPLEX director copies its dirty cache data to the local solid state storage devices (SSDs).
Known as vaulting, this process protects user data in cache if that data is at risk due to
power loss. After each director vaults its dirty cache pages, VPLEX shuts down the
director's firmware.
When operations resume, if any condition is not safe, the system does not resume normal
status and calls home for diagnosis and repair.
EMC Customer Support then communicates with the VPLEX system and restores normal system operations.
Under normal conditions, the SPS batteries can support two consecutive vaults. This
ensures the system can resume I/O after the first power failure, and still vault a second
time if there is another power failure.
Performance
monitoring
VPLEX performance monitoring provides a customized view into the performance of your
system. You decide which aspects of the system's performance to view and compare.
VPLEX supports several general categories of performance monitoring, including:
Current load monitoring allows administrators to watch CPU load during upgrades, I/O
load across the inter-cluster WAN link, and front-end against the back-end load during
data mining or back up.
Both the CLI and GUI support current load monitoring.
Long term load monitoring collects data for capacity planning and load balancing.
Both the CLI and GUI support long term load monitoring.
Performance
monitoring: GUI
The GUI's performance monitoring dashboard provides a customized view into the performance of the VPLEX system.
You decide which aspects of the system's performance to view and compare.
For additional information about the statistics available through the Performance
Dashboard, see the EMC Unisphere for VPLEX online help available in the VPLEX GUI.
Performance
monitoring: CLI
Create a monitor, and then add a file sink to send output to a specified directory on the management server.
Note: SNMP statistics do not require a monitor or monitor sink. Use the snmp-agent
configure command to configure and start the SNMP agent. For more information about
monitoring with SNMP, refer to the EMC VPLEX Administration Guide.
Security features
The VPLEX management server operating system (OS) is based on a Novell SUSE Linux
Enterprise Server 11 distribution.
The operating system has been configured to meet EMC security standards by disabling or
removing unused services, and protecting access to network services through a firewall.
Security features include:
IPv6 support
IPSec VPN inter-cluster link in VPLEX Metro and VPLEX Geo configurations
IPSec VPN to connect each cluster of a VPLEX Metro or VPLEX Geo to the VPLEX
Witness server
The WAN-COM inter-cluster link carries unencrypted user data. To ensure privacy of the
data, establish an encrypted VPN tunnel between the two sites.
For more information about security features and configuration, see the EMC VPLEX Security Configuration Guide.
CHAPTER 5
VPLEX Use Cases
This section describes examples of VPLEX configurations. Topics include:
Technology refresh.................................................................................................. 83
Mobility .................................................................................................................. 85
Collaboration .......................................................................................................... 87
VPLEX Metro HA ...................................................................................................... 88
Redundancy with RecoverPoint ............................................................................... 96
MetroPoint............................................................................................................ 106
Technology refresh
In typical IT environments, migrations to new storage arrays (technology refreshes) require
that the data that is being used by hosts be copied to a new volume on the new array. The
host must then be reconfigured to access the new storage. This process requires
downtime for the host.
Migrations between heterogeneous arrays can be complicated and may require additional
software or functionality. Integrating heterogeneous arrays in a single environment is
difficult and requires a staff with a diverse skill set.
Figure 37 shows the traditional view of storage arrays with servers attached to the
redundant front end and storage (Array 1 and Array 2) connected to a redundant fabric at
the back end.
When VPLEX is inserted between the front-end and back-end redundant fabrics, VPLEX
appears as the target to hosts and as the initiator to storage.
This abstract view of storage is very helpful while replacing the physical array that is
providing storage to applications.
With VPLEX, because the data resides on virtual volumes, it can be copied nondisruptively
from one array to another without any downtime. There is no need to reconfigure the host;
the physical data relocation is performed by VPLEX transparently and the virtual volumes
retain the same identities and the same access points to the host.
In Figure 38, the virtual disk is made up of the disks of Array A and Array B. The site
administrator has determined that Array A has become obsolete and should be replaced
with a new array. Array C is the new storage array. Using Mobility Central, the
administrator:
Adds Array C array into the VPLEX cluster.
Assigns a target extent from the new array to each extent from the old array.
Instructs VPLEX to perform the migration.
VPLEX copies data from Array A to Array C while the host continues its access to the virtual
volume without disruption.
After the copy from Array A to Array C is complete, Array A can be decommissioned.
Because the virtual machine is addressing its data to the abstracted virtual volume, its
data continues to flow to the virtual volume without any need to change the address of the
data store.
Although this example uses virtual machines, the same is true for traditional hosts. Using
VPLEX, the administrator can move data used by an application to a different storage array
without the application or server being aware of the change.
This allows you to change the back-end storage arrays transparently, without interrupting
I/O.
VPLEX makes it easier to replace heterogeneous storage arrays on the back-end.
Mobility
Use VPLEX to move data between data centers, relocate a data center or consolidate data,
without disrupting host application access to the data.
The source and target arrays can be in the same data center (VPLEX Local) or in different data centers separated by up to 5 ms of latency (VPLEX Metro) or 50 ms of latency (VPLEX Geo).
With VPLEX, source and target arrays can be heterogeneous.
When you use VPLEX to move data, the data retains its original VPLEX volume identifier
during and after the mobility operation. No change in volume identifiers eliminates
application cut over. The application continues to use the same storage, unaware that it
has moved.
There are many types of data mobility, and many reasons to move data.
With VPLEX, you no longer need to spend significant time and resources preparing to move data and applications, and you don't have to accept a forced outage and restart the application after the move is completed.
Instead, a move can be made instantly between sites, over distance, and the data remains
online and available during the move without any outage or downtime.
Considerations before moving the data include the business impact, type of data to be
moved, site locations, total amount of data, and schedules.
The VPLEX GUI helps you to easily move the physical location of virtual storage while
VPLEX provides continuous access to this storage by the host.
To move storage:
Display and select the extents (for extent mobility) or devices (for device mobility) to
move.
Throughout the process, the volume retains its identity, and continuous access is
maintained to the data from the host.
There are three types of mobility jobs:
Table 4 Types of data mobility operations
Extent - Moves data from a source extent to a target extent.
Device - Moves data from a source device to a target device.
Batch - Moves data using a migration plan file. Create batch migrations to automate routine tasks. Use batched extent migrations to migrate arrays within the same cluster where the source and destination have the same number of LUNs and identical capacities. Use batched device migrations to migrate to dissimilar arrays and to migrate devices between clusters in a VPLEX Metro or Geo.
All components of the system (virtual machine, software, volumes) must be available and
in a running state.
Data mobility can be used for disaster avoidance, planned upgrade, or physical movement
of facilities.
Mobility moves data from a source extent or device to a target extent or device.
When a mobility job is started, VPLEX creates a temporary RAID 1 device above each
source device or extent that is to be migrated.
The target extent or device becomes a mirror leg of the temporary device, and
synchronization between the source and the target begins.
The data mobility operation is non-disruptive. Applications using the data continue to
write to the volumes during the mobility operation. New I/Os are written to both legs of the
device.
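The temporary RAID 1 mechanism can be sketched as follows. This is an illustrative model with invented names, not VPLEX code: the target becomes a mirror leg, a background sync copies existing blocks, and new host writes land on both legs, so the target is consistent when the sync finishes:

```python
# Sketch of the temporary RAID 1 used during a mobility job (illustrative only).

class MigrationMirror:
    def __init__(self, source, target):
        self.source, self.target = source, target
        self.synced = 0                       # background-copy progress marker

    def host_write(self, block_no, data):
        self.source[block_no] = data          # new I/O is written to both legs
        self.target[block_no] = data

    def sync_step(self):
        if self.synced < len(self.source):    # copy one more existing block
            self.target[self.synced] = self.source[self.synced]
            self.synced += 1

source = ["a", "b", "c", "d"]
target = [None] * 4
m = MigrationMirror(source, target)
m.sync_step()                 # background copy of block 0
m.host_write(2, "C2")         # host keeps writing during the migration
while m.synced < len(source):
    m.sync_step()
assert target == source       # target is now an identical mirror leg
```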
The following rules apply to mobility operations:
The target extent/device must be the same size or larger than the source
extent/device.
The target device cannot be in use (no virtual volumes created on it).
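The two preconditions above can be expressed as a simple check. This is a hedged sketch; the function and parameter names are invented for illustration:

```python
# Illustrative check of the mobility rules quoted above (names invented):
# the target must be at least as large as the source, and must not already
# back a virtual volume.

def can_migrate(source_blocks, target_blocks, target_in_use):
    if target_in_use:
        return False                         # no virtual volumes on the target
    return len(target_blocks) >= len(source_blocks)

assert can_migrate([0] * 4, [0] * 8, target_in_use=False)
assert not can_migrate([0] * 4, [0] * 2, target_in_use=False)  # target too small
assert not can_migrate([0] * 4, [0] * 4, target_in_use=True)   # target in use
```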
You can control the transfer speed of the data mobility operation.
Higher transfer speeds complete the operation more quickly, but have a greater impact on
host I/O.
Slower transfer speeds have less impact on host I/O, but take longer to complete.
You can change the transfer speed of a job while the job is in the queue or in progress. The
change takes effect immediately.
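The speed/impact trade-off can be modeled as the number of blocks moved per cycle. The sketch below is illustrative only (the job structure and numbers are invented); note how a mid-job speed change applies to all remaining cycles:

```python
# Illustrative model of transfer speed: a larger per-cycle chunk finishes the
# job in fewer cycles but takes more bandwidth from host I/O in each cycle.

def cycles_to_complete(total_blocks, blocks_per_cycle):
    return -(-total_blocks // blocks_per_cycle)   # ceiling division

job = {"remaining": 1000, "blocks_per_cycle": 10}

def run_cycles(job, n):
    for _ in range(n):
        job["remaining"] -= min(job["blocks_per_cycle"], job["remaining"])

run_cycles(job, 20)            # slow setting: 200 blocks moved in 20 cycles
job["blocks_per_cycle"] = 100  # speed change takes effect immediately
run_cycles(job, 8)             # fast setting: the remaining 800 blocks move
assert job["remaining"] == 0
```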
Starting in GeoSynchrony 5.0, the thinness of a thinly provisioned storage volume is retained through a mobility operation. Prior to 5.0, you had to specify that rebuilds should be thin at the time you provisioned the thin volume.
Refer to the EMC VPLEX CLI Guide or the online help for more information on thin
provisioning of volumes.
Collaboration
If you require distributed data collaboration, VPLEX can provide a significant advantage.
Traditional collaboration across distance required files to be saved at one location and then sent to another site. This process is slow and incurs bandwidth costs for large files, or even for small files that move regularly. It also reduces productivity, as sites sit idle while they wait to receive the latest changes. Independent work quickly leads to version-control problems, because multiple people working at the same time are unaware of each other's most recent changes. Merging independent work is time-consuming and costly, and it grows more complicated as the dataset gets larger.
Current applications for information sharing over distance are not suitable for collaboration in Big Data environments. Transferring hundreds of gigabytes or terabytes of data across a WAN is inefficient, especially if you need to modify only a small portion of a huge dataset, or use it for analysis.
VPLEX enables multiple users at different sites to work on the same data, and maintain
consistency in the dataset when changes are made.
With VPLEX, the same data can be accessed by all users at all times, even if users are at
different sites. The data is shared and not copied, so that a change made in one site
shows up immediately at the other site.
Unlike other solutions, VPLEX does not ship the entire file back and forth. It sends only the changed updates as they are made, greatly reducing bandwidth costs and offering significant savings over other solutions.
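The delta-shipping idea can be sketched simply. This is an illustrative model, not the VPLEX wire protocol; the block layout and function names are invented:

```python
# Sketch of why shipping deltas beats shipping whole files (illustrative):
# only blocks that changed since the last sync cross the WAN.

def delta(old_blocks, new_blocks):
    """Return {index: data} for blocks that differ between the two copies."""
    return {i: b for i, (a, b) in enumerate(zip(old_blocks, new_blocks)) if a != b}

def apply_delta(blocks, changes):
    for i, data in changes.items():
        blocks[i] = data

site_a = ["x"] * 1000
site_b = list(site_a)            # both sites start in sync
site_a[7] = "edited"             # a small change to a large dataset
changes = delta(site_b, site_a)
assert len(changes) == 1         # one block crosses the WAN, not 1000
apply_delta(site_b, changes)
assert site_b == site_a
```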
With VPLEX's Federated AccessAnywhere, the data remains consistent, online, and always available.
Figure 41 Collaborate over distance with AccessAnywhere: enable concurrent read/write access to data across locations
Deploy VPLEX to enable real collaboration between teams located at different sites.
Collaboration using local consistency groups
Collaboration over asynchronous distances
For distributed data collaboration over greater distances, configure VPLEX Geo
asynchronous consistency groups with member volumes mirrored at both clusters.
At asynchronous distances, latency between clusters can be up to 50ms RTT.
VPLEX Metro HA
VPLEX Metro High Availability (HA) configurations consist of a VPLEX Metro system
deployed in conjunction with VPLEX Witness. There are two types of Metro HA
configurations:
VPLEX Metro HA can be deployed in places where the clusters are separated by 5 ms
latency RTT or less.
VPLEX Metro HA combined with Cross Connect between the VPLEX clusters and hosts
can be deployed where the clusters are separated by 1 ms latency RTT or less.
Metro HA (without cross-connect)
Combine VPLEX Metro HA with host failover clustering technologies such as VMware HA to
create fully automatic application restart for any site-level disaster.
VPLEX Metro/VMware HA configurations:
Significantly reduce the Recovery Time Objective (RTO). In some cases, RTO can be
eliminated.
Ride through any single component failure (including the failure of an entire storage
array) without disruption.
Eliminate the requirement to stretch the Fibre Channel fabric between sites. You can maintain fabric isolation between the two sites.
Note: The VPLEX clusters in VPLEX Metro HA configuration must be within 5 ms RTT
latency.
In this deployment, virtual machines can write to the same distributed device from either
cluster and move between two geographically disparate locations.
If you use VMware Distributed Resource Scheduler (DRS) to automate load distribution on
virtual machines across multiple ESX servers, you can move a virtual machine from an ESX
server attached to one VPLEX cluster to an ESX server attached to the second VPLEX
cluster, without losing access to the underlying storage.
Metro HA without cross-connect failure management
This section describes the failure scenarios for VPLEX Metro HA without cross-connect.
VMware restarts the virtual machines that were running at the site where the outage occurred, redirecting I/O to the surviving cluster. VMware can restart them because the second VPLEX cluster has continued I/O without interruption.
If a virtual machine is located at the non-preferred cluster, the storage associated with
the virtual machine is suspended.
Most guest operating systems will fail. The virtual machine will be restarted at the
preferred cluster after a short disruption.
Metro HA with cross-connect
VPLEX Metro HA with cross-connect (VPLEX's front-end ports are cross-connected) can be deployed where the VPLEX clusters are separated by 1 ms latency RTT or less.
VPLEX Metro HA combined with cross-connect eliminates RTO for most of the failure
scenarios.
Metro HA with cross-connect failure management
This section describes how VPLEX Metro HA with cross-connect rides through failures of
hosts, storage arrays, clusters, VPLEX Witness, and the inter-cluster link.
Host failure
If hosts at one site fail, then VMware HA restarts the virtual machines on the surviving
hosts. Since surviving hosts are connected to the same datastore, VMware can restart the
virtual machines on any of the surviving hosts.
Cluster failure
If a VPLEX cluster fails, VPLEX Witness detects the failure and enables all volumes on the surviving cluster. There is no disruption to I/O.
Storage array failure
If a storage array fails, I/O is disrupted only to local virtual volumes on the VPLEX cluster attached to the failed array.
Inter-cluster link failure
Failure of VPLEX Witness
If VPLEX Witness fails, both VPLEX clusters call home to report that VPLEX Witness is not reachable. There is no disruption to I/O. Although this failure causes no disruption to the clusters or the virtual machines, it makes the configuration vulnerable to a second failure of a major component. If a cluster or inter-cluster link failure occurs while VPLEX Witness is not available, distributed devices are suspended at both clusters. Therefore, if VPLEX Witness will be unavailable for an extended period, the best practice is to disable VPLEX Witness and allow the devices to use their configured detach rules.
The VPLEX splitter works with a RecoverPoint Appliance (RPA) to orchestrate the replication of data remotely, locally, or both.
The VPLEX splitter enables VPLEX volumes in a VPLEX Local or VPLEX Metro to mirror I/O to
a RecoverPoint Appliance.
RecoverPoint Appliances
The RecoverPoint Appliance manages all aspects of data protection. One RPA can manage
multiple replication sets (production volume and one or more replica volumes to which it
replicates), each with differing policies.
For redundancy, a minimum of two RPAs are installed at each site, located in the same
facility as the host and storage. Each site can have as many as eight RPAs.
The set of RPAs installed at each site is referred to as an RPA cluster. If one RPA in a cluster
fails, the functions provided by the failed RPA are automatically moved to one or more of
the remaining RPAs.
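The failover behavior within an RPA cluster can be sketched as a redistribution of replication sets. This is an illustrative model only; the RPA names, replication-set names, and round-robin policy are invented:

```python
# Hedged sketch: if one RPA in a cluster fails, its replication sets are
# redistributed across the surviving RPAs (names and policy are invented).

def redistribute(assignments, failed_rpa):
    """assignments maps rpa -> list of replication sets it manages."""
    orphaned = assignments.pop(failed_rpa, [])
    survivors = sorted(assignments)
    for i, rep_set in enumerate(orphaned):
        assignments[survivors[i % len(survivors)]].append(rep_set)
    return assignments

cluster = {"rpa1": ["cg1"], "rpa2": ["cg2"], "rpa3": ["cg3", "cg4"]}
redistribute(cluster, "rpa3")
assert "rpa3" not in cluster          # failed RPA's work moved to survivors
```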
In the case of remote replication or mixed local and remote replication, RPAs are required
at both sites. The RPAs at the production site transfer the split I/O to the replica site.
The RPAs at the replica site distribute the data to the replica storage.
In the event of a failover, these roles can be reversed. The same RPA can serve as the
production RPA for one RecoverPoint consistency group and as the replica RPA for another.
RecoverPoint Volumes
Recovery/Failover
Repository volume - A volume that is dedicated to RecoverPoint for each RPA cluster.
The repository volume serves all RPAs of the particular RPA cluster and the splitter
associated with that cluster. The repository volume stores configuration information
about the RPAs and RecoverPoint consistency groups. There is one repository volume
per RPA cluster.
Production volumes - Volumes that are written to by the host applications. Writes to
production volumes are split such that they are sent to both the normally designated
volumes and RPAs simultaneously. Each production volume must be exactly the same
size as the replica volume to which it replicates.
Journal volumes - Volumes that contain data waiting to be distributed to target replica
volumes and copies of the data previously distributed to the target volumes. Journal
volumes allow convenient rollback to any point in time, enabling instantaneous
recovery for application environments.
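The journal's any-point-in-time property can be sketched as a log of sequenced writes from which an earlier image is rebuilt. This is an illustrative model with invented names, not the RecoverPoint journal format:

```python
# Illustrative journal sketch: every write is kept with its sequence number,
# so the replica can be rolled back to any earlier point in time.

class Journal:
    def __init__(self):
        self.entries = []                     # (seq, block_no, data)

    def record(self, seq, block_no, data):
        self.entries.append((seq, block_no, data))

    def image_at(self, seq, size):
        """Rebuild the replica image as of sequence number `seq`."""
        image = [None] * size
        for s, block_no, data in self.entries:
            if s <= seq:
                image[block_no] = data
        return image

j = Journal()
j.record(1, 0, "v1")
j.record(2, 0, "v2")      # block 0 is overwritten later
j.record(3, 1, "w1")
assert j.image_at(1, 2) == ["v1", None]   # roll back to an earlier point
assert j.image_at(3, 2) == ["v2", "w1"]   # current image
```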
Logged (physical) access - Used for production recovery, failover, testing, and cloning
a replica.
Virtual (instant) access - Used for single file recovery or light testing. Used to gain
access to the replica data immediately, or when I/O performance is not important.
Virtual (instant) access with roll - Used for production recovery, failover, or processing
with a high write-rate. Used when the point-in-time image is far from the current
point-in-time (and would take too long to access in logged access mode).
Direct access - This access mode can be enabled only after logged access or virtual access with roll is enabled. Used for extensive processing with a high write-rate, when image access is needed for a long period of time (and the journal may not have the space to support all of the data written to the image access log in this time), and when it is not required to save the history in the replica journal (the replica journal is lost after direct access).
Note: In the current release, virtual (instant) access and virtual (instant) access with roll
are not supported by the VPLEX splitter.
A bookmark is a label applied to a snapshot so that the snapshot can be explicitly identified during the recovery process (during image access).
Bookmarks are created through the RecoverPoint GUI and can be created manually, by the
user, or automatically, by the system. Bookmarks created automatically can be created at
pre-defined intervals or in response to specific system events. Parallel bookmarks are
bookmarks that are created simultaneously across multiple consistency groups.
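A bookmark can be sketched as nothing more than a label attached to a snapshot identifier. This is an illustrative model; the label scheme and sequence numbers are invented:

```python
# Sketch: a bookmark is a label on a snapshot's sequence number, so that a
# point in time can be named and found again during image access.

bookmarks = {}

def bookmark(label, snapshot_seq):
    bookmarks[label] = snapshot_seq

bookmark("before-upgrade", 1042)           # manual bookmark, created by a user
for seq in range(1100, 1400, 100):         # automatic, at pre-defined intervals
    bookmark(f"interval-{seq}", seq)

assert bookmarks["before-upgrade"] == 1042
assert len(bookmarks) == 4                 # one manual + three automatic
```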
RecoverPoint configurations
RecoverPoint supports three replication configurations:
Local Replication
In a Local Replication, RecoverPoint continuously replicates data within the same site.
Every write is kept in the journal volume, allowing recovery to any point in time. By default,
snapshot granularity is set to per second, so the exact data size and contents are
determined by the number of writes made by the host application per second. If
necessary, the snapshot granularity can be set to per write.
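The two granularity settings can be contrasted with a small sketch. This is illustrative only; the write-stream shape and function names are invented:

```python
# Sketch of per-second vs per-write snapshot granularity: the same write
# stream yields one snapshot per write, or one combined snapshot per second.

def per_write(writes):
    return [[w] for w in writes]           # every write is its own snapshot

def per_second(writes):
    """writes are (timestamp_seconds, data); group them by whole second."""
    snaps = {}
    for ts, data in writes:
        snaps.setdefault(int(ts), []).append(data)
    return [snaps[s] for s in sorted(snaps)]

writes = [(0.1, "a"), (0.7, "b"), (1.2, "c")]
assert len(per_write(writes)) == 3        # one snapshot per write
assert len(per_second(writes)) == 2       # second 0 holds two writes
```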
Remote Replication
In these configurations, data is transferred between a local and a remote site over Fibre
Channel or a WAN. The RPAs, storage, and splitters are located at both the local and the
remote site.
By default, the replication mode is set to asynchronous, and the snapshot granularity is
set to dynamic, so the exact data size and contents are determined by the policies set by
the user and system performance. This provides application consistency for specific
points in time.
Synchronous replication is supported when the local and remote sites are connected
using Fibre Channel or IP.
Local and Remote Data Protection
In this configuration RecoverPoint replicates data to both a local and a remote site
simultaneously, providing concurrent local and remote data protection.
The local copy is normally used for operational recovery, while the remote copy is used for
disaster recovery.
RecoverPoint/VPLEX configurations
RecoverPoint can be configured on VPLEX Local or Metro systems as illustrated in the figures that follow.
Figure 54 VPLEX Local and RecoverPoint Remote - remote site is independent VPLEX cluster
Figure 55 VPLEX Local and RecoverPoint remote - remote site is array-based splitter
VPLEX Metro and RecoverPoint with both Local and Remote Replication
In VPLEX Metro/RecoverPoint with both local and remote replication configurations, I/O is:
Split on one VPLEX cluster to replica volumes located both at the cluster and at a
remote site.
Figure 57 VPLEX Metro and RecoverPoint local and remote replication - local site is located at one cluster of the VPLEX
Figure 58 VPLEX Metro and RecoverPoint local and remote replication - remote site is array-based splitter
This configuration supports unlimited points in time, with granularity up to a single write,
for local and distributed VPLEX virtual volumes.
RecoverPoint Appliances can (and for MetroPoint must) be deployed at each VPLEX
cluster in a Metro system. For MetroPoint replication, a different RecoverPoint cluster
must be viewing each exposed leg of the VPLEX distributed volume. The two
RecoverPoint clusters become the active and standby sites for the MetroPoint group.
If the VPLEX cluster fails, then the customers can recover to any point in time at the
remote replication site. Recovery at the remote site to any point in time can be
automated through integration with MSCE and VMware Site Recovery Manager (SRM).
See vCenter Site Recovery Manager support for VPLEX on page 106.
This configuration can simulate a disaster at the VPLEX cluster to test RecoverPoint
disaster recovery features at the remote site.
In the configuration depicted below, a host writes to VPLEX Local. Virtual volumes are
written to both legs of RAID 1 devices. The VPLEX splitter sends one copy to the usual
back-end storage, and one copy across a WAN to a CLARiiON array at a remote disaster
recovery site.
Figure 61
In the configuration depicted below, host writes to the distributed virtual volumes are written to both legs of the distributed RAID 1 volume. Additionally, a copy of the I/O is sent to the RPA. The RPA then distributes the data to the replica on the CLARiiON array at a remote disaster recovery site:
Figure 62
When an outage occurs in VPLEX Local or VPLEX Metro configurations, the virtual
machines can be restarted at the replication site with automatic synchronization to the
VPLEX configuration when the outage is over.
MetroPoint
VPLEX GeoSynchrony 5.4, along with RecoverPoint 4.1 SP1, supports a new topology for VPLEX Metro with RecoverPoint protection. This MetroPoint topology provides a 3-site or 4-site solution for continuous availability, operational and disaster recovery, and continuous data protection. MetroPoint also supports a 2-site topology with the ability to expand to a third remote site in the future.
The MetroPoint topology provides full RecoverPoint protection of both sides of a VPLEX
distributed volume across both sides of a VPLEX Metro configuration, maintaining
replication and protection at a consistency group level, even when a link from one side of
the VPLEX Metro to the replication site is down.
In MetroPoint, VPLEX Metro and RecoverPoint replication are combined in a fully
redundant manner to provide data protection at both sides of the VPLEX Metro and at the
replication site. With this solution, data is replicated only once from the active source site
to the replication site. The standby source site is ready to pick up and continue replication
even under a complete failure of the active source site.
MetroPoint combines the high availability of the VPLEX Metro with redundant replication
and data protection of RecoverPoint. MetroPoint protection allows for one production copy
of a distributed volume on each Metro site, one local copy at each Metro site, and one
remote copy for each MetroPoint consistency group. Each production copy can have
multiple distributed volumes.
MetroPoint offers the following benefits:
Efficient data transfer between VPLEX Metro sites and to the remote site.
Any-Point-in-Time operational recovery in the remote site and optionally in each of the
local sites. RecoverPoint provides continuous data protection with any-point-in-time
recovery.
Full support for all operating systems and clusters normally supported with VPLEX
Metro.
Two-site topology
RecoverPoint Local Replication can now be added to both sides of a VPLEX Metro to
provide redundant operational recovery for the same application. To achieve this, all of the
volumes associated with an application must be placed in a MetroPoint group, even
though a third site is not configured.
This two-site configuration still requires GeoSynchrony 5.4 and RecoverPoint 4.1 SP1, which support MetroPoint on both VPLEX and RecoverPoint. In the two-site topology, distributed devices are protected with an independent local copy at both Metro sites, which allows for independent replication at either side of a VPLEX Metro for a distributed volume. Replication continues at one site if an outage occurs at the other site.
With VPLEX Metro, production continues to be available from one site after a WAN COM failure or a VPLEX cluster failure when configured with VPLEX Witness. To access copy data, the administrator decides which copy to use. When the two VPLEX clusters are not in contact, it is recommended that administrators use the copy from the site where the production volume is accessible by the host. Regardless of which copy is used for recovery, replication history is maintained.
If recovery is done on the losing site during a VPLEX WAN partition, manual intervention is required to select which site's data to use after the WAN link is restored. The losing site will not be able to create bookmarks while the volumes rebuild, but replication history will not be lost.
Three-site topology
The MetroPoint three-site topology provides continuous availability between both sides of
the VPLEX Metro, with operational and disaster recovery to a third remote site.
This solution provides full RecoverPoint protection to the VPLEX Metro configuration,
maintaining replication even when one site of the Metro is down. RecoverPoint protects
distributed RAID 1 devices on both sides of the Metro. If there is a site failure, replication
automatically continues at a consistency group level from the surviving site, without
losing replication history.
Journal and repository volumes are local, not distributed. This enhances performance and
reduces load on the inter-cluster link resulting from high I/O traffic needed for journals.
The three-site MetroPoint topology features a VPLEX Metro (Site A and Site B) and a
remote Site C. Site C can be any valid array based splitter option including:
VPLEX Local
VPLEX Metro
VNX
VNX2
VMAX
The above figure illustrates the basic MetroPoint topology, with a VPLEX Metro system
consisting of VPLEX sites A and B, and a remote Site C. Each site is protected by a
RecoverPoint cluster. In this example, both production copies of the VPLEX distributed
volume are protected by RecoverPoint clusters at sites A and B. The remote replicating
RecoverPoint cluster is Site C.
One RecoverPoint cluster at Metro Site A is configured as the active source of remote
replication, while the other RecoverPoint cluster at Metro Site B is the standby source.
Both active and standby source sites maintain up-to-date journals. If the active source
cluster fails, the standby source site instantly switches over and becomes the active
source of remote replication after a short initialization period.
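The active/standby switchover can be sketched in a few lines. This is an illustrative model only; the site names and structure are invented, and the short initialization period is not modeled:

```python
# Sketch of active/standby source switchover for a MetroPoint group
# (illustrative; both sources keep up-to-date journals, so the standby can
# take over as the source of remote replication when the active source fails).

group = {"active": "site_a", "standby": "site_b"}

def on_source_failure(group, failed):
    """If the active source fails, the standby becomes the active source."""
    if failed == group["active"]:
        group["active"], group["standby"] = group["standby"], failed
    return group["active"]

assert on_source_failure(group, "site_a") == "site_b"   # standby takes over
```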
MetroPoint also allows for local copies of the distributed volume at each site in addition to
the remote copy. The local copies are independent of one another.
For each volume protected, a single consistency group spans the three related sites.
MetroPoint consistency groups are referred to as MetroPoint groups. The MetroPoint group
has a production copy on both Metro sites. The production copy at the active source site is
the only one that replicates to the remote site.
Figure 66 RecoverPoint Local Replication for VPLEX Distributed Volumes at Site A and Site B and
RecoverPoint Remote Replication at Site C
MetroPoint configurations must adhere to both VPLEX and RecoverPoint best practices for
the distances between sites. In a VPLEX Metro, the production copies should be within
synchronous distance. For RecoverPoint, the site for the remote copy can be within
synchronous or asynchronous distance.
Note: See the document, Implementation and Planning Best Practices for EMC VPLEX:
VS2 Hardware and GeoSynchrony v5.x Technical Note for best practices information.
Distributed volumes
RecoverPoint protects VPLEX distributed volumes on both Metro Sites A and B for
redundancy, and protects copies on the remote site.
Local (non-distributed) volumes
VPLEX local (non-distributed) volumes can be replicated by RecoverPoint from any Site A,
B, or C, and are protected by RecoverPoint Local Replication.
Local volumes can be replicated from Site C to Site A or Site B if the volumes are
independent from the VPLEX volumes that are distributed from Sites A and B.
Figure 67 Bi-directional replication for volumes in different consistency groups and local volumes
Distributed RAID 1 volumes in MetroPoint groups from Site A and Site B replicate to
remote Site C.
Local volumes on Site A, B, or C can replicate to a target volume on a site other than their
local site. For example, a local volume on Site C can replicate to Site B, or a local volume in
a different VPLEX consistency group in Site A can replicate to Site C. RecoverPoint
supports bi-directional replication for volumes in different consistency groups.
Failure scenarios
MetroPoint protects applications and data from any failure scenario including:
Data corruption
For disaster recovery, when creating MetroPoint groups, the best practice is to set the group policy for the preferred source of remote replication to the default Follow VPLEX bias rules. This provides added protection: RecoverPoint selects the side that the VPLEX consistency group declares as the winner site in the event of link failures in the VPLEX Metro system, preventing unnecessary swapping of the source of replication.
If a disaster occurs with either the RecoverPoint cluster or the VPLEX Metro that prevents the active source site from replicating, RecoverPoint can select the other site so that remote replication continues.
If Site A failed, operations would continue uninterrupted at Site B. Replication from Site A
would switch over to replicate from Site B to remote Site C without losing replication
history. Site B would now be the active source site until Site A is restored on a consistency
group level.
When recovering from a local copy after a switchover, the remote copy is temporarily
disabled. Both local copies are initialized when one is restored, and the remote copy will
be updated after recovery and a short initialization.
Asynchronous WAN replication load balancing is done using VPLEX bias rules. By default,
the active and standby source sites are based on VPLEX bias rules that are configured at
the consistency group level. Using this method, the active source site is the site that has
bias. Using the bias rules, replication traffic can be load balanced across the links at a
consistency group level. Load balancing between the two source sites improves resource
distribution and provides better high availability.
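The bias-based load balancing described above can be sketched as a per-group assignment. This is an illustrative model; the group names, site names, and load metric are invented:

```python
# Hedged sketch: the active source site for each consistency group follows
# that group's VPLEX bias rule, which spreads replication load across links.

groups = {"cg1": "site_a", "cg2": "site_b", "cg3": "site_a"}  # bias per group

def active_source(cg):
    return groups[cg]                      # the site with bias is the active source

def link_load(site):
    """Count how many groups replicate over this site's link."""
    return sum(1 for cg in groups if active_source(cg) == site)

assert active_source("cg2") == "site_b"
assert link_load("site_a") == 2 and link_load("site_b") == 1
```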
If the RecoverPoint cluster fails at the remote site (Site C), local replication continues on
distributed RAID 1 volumes on both sides of the Metro (Site A and Site B). There is no
copy, no replication, and no image access at Site C.
When the remote RecoverPoint cluster is restored, RecoverPoint automatically
resynchronizes and replication resumes at the remote site.
The MetroPoint topology is specifically designed to provide protection in the following
scenarios:
Data corruption
If the preferred RecoverPoint cluster is not at the VPLEX winning site, a production
switchover occurs. During the site departure, RecoverPoint at the VPLEX winning site
continues to replicate to the remote site.
The losing site does not receive any I/Os during the site departure and does not
maintain an up-to-date journal.
Once communication between the two VPLEX sites is restored, VPLEX automatically
resynchronizes both clusters according to the I/Os that occurred during the fracture.
When the VPLEX clusters are fully synchronized, if there is a local copy at the losing
site, RecoverPoint synchronizes the local copy with the winning site, then resumes
replicating to it.
The standby production copy becomes the active production copy and continues
replicating to the remote cluster.
When the failed VPLEX site resumes writing to storage and if it is set as the preferred
site, there is a production switchback and RecoverPoint at that cluster resumes
replicating to the remote cluster.
The standby production copy becomes the active production copy and continues
replicating to the remote cluster.
Data corruption
Data corruption can occur for various reasons. When data becomes corrupted:
Halt applications.
Note: See a complete list of production and copy disaster/failure scenarios in EMC
RecoverPoint: Deploying with VPLEX Technical Notes.
Four-site topology
The four-site MetroPoint topology is an advanced configuration that features two VPLEX
Metro systems in different regions. This protects applications running in separate regions
and allows for cross-regional and bi-directional replication. As with the three-site
MetroPoint topology, each consistency group in a region is configured to have an active
source production and a standby source production.
The target replication device for the consistency group must be a VPLEX Metro in the case
of four-site topology. Third party devices are supported if they are behind VPLEX. The
target volume can be either a local or distributed volume. If the target volume is a
distributed volume, it will be fractured. However, when a switchover to the remote copy is
executed, active-active access to the volume on both Metro sites becomes available
immediately, as the remote copy is now the production copy.
Data is not replicated to all four sites; at a consistency group level, it is replicated to a single target. If the target replica is a distributed RAID 1, it will be in a fractured state.
If production fails over across regions, the target fractured volume automatically starts rebuilding the mirror leg. Once the distributed volume is rebuilt and synchronized, this provides continuous availability between the two sites.
The four-site MetroPoint topology provides regional replication protection for two regions
or two VPLEX Metro systems.
In the four-site configuration, production applications running in Site A and Site B are
replicated to Site C for disaster protection. Distributed volumes at VPLEX Metro Site C and
Site D can be replicated to a single target volume at remote Site A or Site B (but not both).
Different applications, with volumes in different consistency groups, can run in Site C and
Site D as active and standby source sites and be protected in one remote site, Site A or
Site B.
Note: Data is not replicated to all four sites. Each MetroPoint group is replicated to a single
target. Different MetroPoint groups can have different targets.
If a regional disaster causes both sides of the Metro to fail, operations can fail over to the
disaster recovery Site C.
When the MetroPoint group fails over to Site C, the replication will be from Site C to Metro
Site A or Site B, whichever is currently defined as the winner site in VPLEX.
The production source is now in the remote site and one of the Metro sites becomes the
replica. The mirrored leg of the fractured DR1 is rebuilt at the remote site.
Figure 73 During failover, Site C and Site D become the production sites.
This figure illustrates that production sites (A and B) have failed over to the remote Site C,
which is now the production site with Site D. The fractured distributed RAID 1 volume on
Site C changes from a replica volume to a production volume, so Site D is rebuilt. The DR1
volume on Site A and Site B becomes fractured during the failover, as it is now the replica
copy at the remote site for production sites C and D.
In a recovery following a Site A and Site B failure, Site C restores the last active site (Site A or Site B), which automatically becomes the winner site. VPLEX rebuilds the non-winner site. Site A and Site B are restored as the production sites, and the DR1 volume at Site C becomes fractured again, as it is now the replica copy.
Upgrading to MetroPoint
Install a new MetroPoint configuration, or upgrade from existing VPLEX and/or
RecoverPoint configurations easily and non-disruptively.
IMPORTANT
Note that MetroPoint is a property that is specific to a configured consistency group.
Simply having a VPLEX Metro system with multiple RecoverPoint clusters configured for
protection does not provide MetroPoint protection to the system. For MetroPoint
protection, the MetroPoint property must be configured for each consistency group to be
protected.
All operating system platforms and clusters normally supported with VPLEX are fully
supported with MetroPoint. See the latest copy of VPLEX ESSM for specific details.
The following MetroPoint deployments are available:
New MetroPoint installation - Install VPLEX Metro and RecoverPoint and create
MetroPoint groups.
See the VPLEX procedures in the EMC SolVe Desktop for more information about installing
and upgrading to MetroPoint.
MetroPoint groups
After MetroPoint has been enabled on the system, create new MetroPoint groups or
convert existing RecoverPoint consistency groups to MetroPoint groups to enable
MetroPoint protection. All MetroPoint groups are protected at both sides of the VPLEX
Metro by RecoverPoint.
A MetroPoint group can contain the following copies:
When configured with two local copies, a MetroPoint group contains the maximum of
five copies.
MetroPoint groups and regular RecoverPoint consistency groups can co-exist in the same
system. However, regular RecoverPoint consistency groups are protected on one side only,
at the VPLEX preferred site.
The best practice is to select the default "Follow VPLEX bias rules". With this setting,
RecoverPoint selects the side that the VPLEX consistency group declares the winner in
the event of link failures in the VPLEX Metro system, preventing unnecessary swapping
of the replication source.
See the EMC RecoverPoint Administration Guide and VPLEX procedures in the SolVe
Desktop for more information on creating or converting a consistency group to a
MetroPoint group.
In Unisphere for VPLEX, ensure you have provisioned storage and created the
distributed devices you want to use.
Expose the storage in the VPLEX consistency groups to RecoverPoint and expose the
hosts through the storage views.
In Unisphere for RecoverPoint, ensure that the distributed device is exposed to the
RecoverPoint cluster local to each VPLEX Metro cluster, and that it belongs to a storage
view on both VPLEX Metro Site A and Metro Site B. RecoverPoint sees the same volume
on both Metro sites and uses it to filter distributed volumes.
Upgrade from a consistency group that is using local volumes by converting the local
volumes to VPLEX distributed volumes and creating a MetroPoint group.
consistency group. A mirror leg is added to create a VPLEX distributed device, which is
then placed back in a RecoverPoint enabled consistency group. Create replication sets for
the volumes in the RecoverPoint consistency group in the Unisphere for RecoverPoint user
interface, then create the MetroPoint group using the Protect Volumes Wizard in the same
user interface.
Failover
The following are the two types of failover:
Active source failover - This failover is unique to MetroPoint groups having two
production sources, one active and one standby for a given remote replication target.
During active source failover, the active source and the standby source exchange roles:
the active source becomes the standby source, and the standby source becomes the
active source.
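The role exchange can be sketched as a minimal model (illustrative Python only; the class and method names are hypothetical and are not a VPLEX or RecoverPoint API):

```python
# Illustrative model of MetroPoint active-source failover (not a real API).
# A MetroPoint group replicates to a remote target from one active source;
# the peer Metro site holds a standby source that tracks delta markers.

class MetroPointGroup:
    def __init__(self, active, standby):
        self.active = active    # site currently replicating to the remote copy
        self.standby = standby  # site tracking deltas, ready to take over

    def failover_active_source(self):
        """Exchange roles: the standby becomes the active source and
        resynchronizes its delta-marker information to the remote copy."""
        self.active, self.standby = self.standby, self.active
        return self.active

group = MetroPointGroup(active="Site A", standby="Site B")
group.failover_active_source()
assert group.active == "Site B" and group.standby == "Site A"
```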
Figure 75 illustrates source failover. If the active source site (Site A) fails, the best site
mechanism initiates a source failover. The delta marker information from the standby site
(Site B) is resynchronized to the remote site (Site C). Site B becomes the active source site.
You can also manually fail over the active source site. Manually failing over the active
production source depends on the group policy of the RecoverPoint consistency group
and the preferred cluster setting. The preferred cluster can be set either to a specific
RecoverPoint cluster, or to follow VPLEX bias rules.
If the preferred cluster is set to Follow VPLEX bias rules, the active source site follows
the VPLEX winner rules set for the consistency group. To swap from Site A to Site B,
change the VPLEX winner rules to specify that Site B wins instead of Site A; the
RecoverPoint active source then automatically swaps to the other VPLEX site.
The VPLEX SolVe Desktop provides more information on performing a manual switch over
of the active site.
Recover Production
The MetroPoint topology enables you to recover production from any of the copies
(remote, or local at either site) without losing the journal on either of the Metro sites.
Figure 77 illustrates the recover production flow. In this example, production is recovered
from the remote copy. The remote copy is moved to the image access state. The active
source site (Site A) is moved to FAIL-ALL, which stops host I/O and VPLEX synchronous
I/O distribution between Site A and Site B (DR1 fracture). In this state, only replication I/O
from the RPA is allowed on the DR1 leg at Site A. Replication to the local copy is paused.
The splitters at Site B are in the Marking On Hosts (MOH) state because of the mirror
fracture. An initialization is performed from the remote copy to Site A. The replication
direction is then reversed, and Site A is taken out of FAIL-ALL. Mirror recovery of Site B
starts with a VPLEX DR1 rebuild using Site A as the source. When mirror recovery ends,
the local copy is initialized. Finally, the sync-peer-splitter removes the redundant
information, and the information from Site B to Site C is synchronized.
APPENDIX A
VS1 Hardware
This appendix describes VPLEX VS1 hardware, IP addressing, and internal cabling. Topics
include:
[Figure: VS1 single-engine configuration: management server; Engine 1 with Director 1A, Director 1B, and SPS 1 (VPLX-000215)]
[Figure: VS1 dual-engine configuration: management server; Engines 1 and 2, each with Directors A and B and an SPS]
[Figure: VS1 quad-engine configuration: management server; Engines 1 through 4, each with Directors A and B and an SPS]
VS1 engine
[Figure: VS1 engine: Directors A and B, each with a management module, I/O modules IOM A0 through A5 and IOM B0 through B5 (front-end, back-end, local COM, and WAN COM ports), and power supplies A and B (VPLX-000343)]
Note: The WAN COM ports on IOMs A4 and B4 are used if the intercluster connections are
over Fibre Channel, and the WAN COM ports on IOMs A5 and B5 are used if the
intercluster connections are over IP.
For VPLEX Metro and Geo, the cluster ID for the first cluster that is set up is 1, and the ID
for the second cluster is 2.
[Figure: VS1 cluster 1 IP addresses: management server service port 128.221.252.2; Mgt A port 128.221.252.33; Mgt B port 128.221.253.33; FC switch B 128.221.253.34; Directors 1A through 4B at 128.221.252.35 through .42 on subnet A and 128.221.253.35 through .42 on subnet B (VPLX-000107)]
[Figure: VS1 cluster 2 IP addresses: management server service port 128.221.252.2; Mgt A port 128.221.252.65; Mgt B port 128.221.253.65; FC switch B 128.221.253.66; Directors 1A through 4B at 128.221.252.67 through .74 on subnet A and 128.221.253.67 through .74 on subnet B (VPLX-000108)]
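The addresses in the two preceding figures follow a regular pattern: each cluster's host numbers start at 32 × cluster ID + 1, on internal subnets 128.221.252.0 (A) and 128.221.253.0 (B). The sketch below reproduces that pattern for illustration; the formula is inferred from the documented addresses, not taken from the actual VPLEX address-generation code:

```python
# Reproduce the VS1 internal director addresses shown in the figures above.
# Inferred pattern (illustrative only): management ports sit at 32*cluster_id + 1,
# and director addresses follow two per engine, Director A before Director B.

def director_ips(cluster_id, engine, director):
    """Return the (subnet A, subnet B) addresses for a director.
    director is 'A' or 'B'; engines are numbered from 1."""
    base = 32 * cluster_id + 1          # 33 for cluster 1, 65 for cluster 2
    offset = base + 2 + 2 * (engine - 1) + (1 if director == 'B' else 0)
    return (f"128.221.252.{offset}", f"128.221.253.{offset}")

# Check against the addresses documented in the figures:
assert director_ips(1, 1, 'A') == ("128.221.252.35", "128.221.253.35")
assert director_ips(1, 4, 'B') == ("128.221.252.42", "128.221.253.42")
assert director_ips(2, 1, 'A') == ("128.221.252.67", "128.221.253.67")
assert director_ips(2, 4, 'B') == ("128.221.252.74", "128.221.253.74")
```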
The internal cabling figures that follow are organized by cluster size and cable type:
Quad-engine: Ethernet, serial, Fibre Channel, and AC power cabling
Dual-engine: Ethernet, serial, Fibre Channel, and AC power cabling
Single-engine: Ethernet, serial, Fibre Channel, AC power, and IP WAN COM cabling
[Figure: VS1 quad-engine Ethernet cabling: management server to Engines 1 through 4, green and purple cables, 20 in. to 71 in. (VPLX-000044)]
Serial cabling
[Figure: VS1 quad-engine serial cabling: 12 in. cables at each engine, 40 in. cables to UPS A and UPS B (VPLX-000065)]
[Figure: VS1 quad-engine Fibre Channel cabling: Engines 1 through 4, 79 in. (all 16 cables) (VPLX-000055)]
Note: All 16 Fibre Channel cables are light blue. However, the cables connected to Fibre
Channel switch A have blue labels, and the cables connected to switch B have orange
labels.
AC power cabling
[Figure: VS1 quad-engine AC power cabling: Engines 1 through 4 and SPS 1 through 4]
[Figure: VS1 dual-engine Ethernet cabling: management server to Engines 1 and 2, green and purple cables, 20 in. to 71 in. (VPLX-000043)]
Serial cabling
[Figure: VS1 dual-engine serial cabling: 12 in. cables at each engine, 40 in. cables to UPS A and UPS B (VPLX-000067)]
Fibre Channel cabling
[Figure: VS1 dual-engine Fibre Channel cabling: Engines 1 and 2, 79 in. (all 16 cables); eight cables for a quad-engine configuration are included for ease of upgrading, and are tied to the cabinet sidewalls (VPLX-000057)]
Note: All 16 Fibre Channel cables are light blue. However, the cables connected to Fibre
Channel switch A have blue labels, and the cables connected to switch B have orange
labels.
AC power cabling
[Figure: VS1 dual-engine AC power cabling: Engine 1 with SPS 1 and SPS 2 (VPLX-000042)]
[Figure: VS1 single-engine Ethernet cabling: management server to Engine 1, green 37 in. and purple 71 in. cables (VPLX-000052)]
[Figure: VS1 single-engine serial cabling: Engine 1, 12 in. cables (VPLX-000069)]
[Figure: VS1 single-engine Fibre Channel cabling: Engine 1, 39 in. (2 cables) (VPLX-000063)]
Note: Both Fibre Channel cables are light blue. However, the A side cable has a blue
label, and the B side cable has an orange label.
[Figure: VS1 single-engine AC power cabling: management server, Engine 1, and SPS (VPLX-000041)]
[Figure: VS1 Fibre Channel WAN COM connections: ports A4-FC02, A4-FC03, B4-FC02, and B4-FC03 in each cluster connect to intercluster COM SAN switches 1A, 1B, 2A, and 2B over ISL 1 and ISL 2 (ISL = inter-switch link) (VPLX-000317)]
[Figure: VS1 IP WAN COM connections: in each cluster, ports A5-GE00, A5-GE01, B5-GE00, and B5-GE01 connect to IP subnets A and B; the connections are the same from each engine in the cluster (VPLX-000368)]
GLOSSARY
This glossary contains terms related to VPLEX federated storage systems. Many of these
terms are used in the manuals.
A
AccessAnywhere
The breakthrough technology that enables VPLEX clusters to provide access to information
between clusters that are separated by distance.
active/active
A cluster with no primary or standby servers, because all servers can run applications and
interchangeably act as backup for one another.
Active Directory
active mirror
active/passive
array
asynchronous
B
bandwidth
backend port
bias
bit
block
block size
Bookmark
A label applied to a snapshot so that the snapshot can be explicitly called (identified)
during recovery processes (during image access).
Bookmarks are created through the CLI or GUI and can be created manually, by the user, or
automatically, by the system. Bookmarks created automatically can be created at
pre-defined intervals or in response to specific system events. Parallel bookmarks are
bookmarks that are created simultaneously across multiple consistency groups.
byte
C
cache
cache coherency
Managing the cache so that data is not lost, corrupted, or overwritten. With multiple
processors, data blocks may have several copies, one in the main memory and one in
each of the cache memories. Cache coherency propagates the blocks of multiple users
throughout the system in a timely fashion, ensuring that the data blocks do not have
inconsistent versions in the different processors' caches.
cluster
Two or more VPLEX directors forming a single fault-tolerant cluster, deployed as one to four
engines.
cluster ID
The identifier for each cluster in a multi-cluster deployment. The ID is assigned during
installation.
cluster deployment ID
A numerical cluster identifier, unique within a VPLEX cluster. By default, VPLEX clusters
have a cluster deployment ID of 1. For multi-cluster deployments, all but one cluster must
be reconfigured to have different cluster deployment IDs.
cluster IP seed
The VPLEX IP seed is used to generate the IP addresses used by the internal components
of the VPLEX. For more information about components and their IP addresses, refer to EMC
VPLEX Installation and Setup Guide. Cluster ID is used by the virtualization software (inter
director messaging, cluster identification).
clustering
Using two or more computers to function together as a single entity. Benefits include fault
tolerance and load balancing, which increases reliability and up time.
cache
Temporary storage for recent writes and recently accessed data. Disk data is read through
the cache so that subsequent read references are found in the cache.
COM
The intra-cluster communication (Fibre Channel), used for cache coherency and
replication traffic.
command line interface (CLI)
An interface that supports the use of typed commands to execute specific tasks.
consistency group
A VPLEX structure that groups together virtual volumes and applies the same detach and
failover rules to all member volumes. Consistency groups ensure the common
application of a set of properties to the entire group. Create consistency groups for sets of
volumes that require the same I/O behavior in the event of a link failure. There are two
types of consistency groups: synchronous and asynchronous.
continuity of operations
(COOP)
controller
D
data sharing
The ability to share access to the same data with multiple servers regardless of time and
location.
detach rule
Predefined rules that determine which cluster continues I/O when connectivity between
clusters is lost. A cluster loses connectivity to its peer cluster due to cluster partition or
cluster failure.
Detach rules are applied at two levels: to individual volumes, and to consistency groups. If
a volume is a member of a consistency group, the group detach rule overrides the rule set
for the individual volume. Note that all detach rules may be overridden by VPLEX Witness,
if VPLEX Witness is deployed.
device
A combination of one or more extents to which you add specific RAID properties. Local
devices use storage from only one cluster. In VPLEX Metro and Geo configurations,
distributed devices use storage from both clusters.
director
A CPU module that runs GeoSynchrony, the core VPLEX software. There are two directors (A
and B) in each engine, and each has dedicated resources and is capable of functioning
independently.
dirty data
The write-specific data stored in the cache memory that is yet to be written to disk.
disaster recovery (DR)
The ability to restart system operations after an error, preventing data loss.
discovered array
An array that is connected to the SAN and discovered by VPLEX.
disk cache
A section of RAM that provides cache between the disk and the CPU. RAM's access time is
significantly faster than disk access time. Therefore, a disk-caching program enables the
computer to operate faster by placing recently accessed data in the disk cache.
Distributed consistency
groups
The RecoverPoint consistency group is divided into four segments. Each segment runs on
one primary RPA and one to three secondary RPAs.
Distributed consistency groups enable a much higher throughput and IOPS rate,
regardless of the amount of data being replicated.
distributed device
Distributed devices have physical volumes at both clusters in a VPLEX Metro or VPLEX Geo
configuration for simultaneous active/active and read/write access using
AccessAnywhere.
distributed file system
Supports the sharing of files and resources in the form of persistent storage over a
network.
E
engine
Consists of two directors, management modules, and redundant power. Unit of scale for
VPLEX configurations. Single = 1 engine, dual = 2 engines, Quad = 4 engines per cluster.
Ethernet
A Local Area Network (LAN) protocol. Ethernet uses a bus topology, meaning all devices
are connected to a central cable, and supports data transfer rates of between 10 megabits
per second and 10 gigabits per second. For example, 100 Base-T supports data transfer
rates of 100 Mb/s.
event
A log message that results from a significant action initiated by a user or the system.
extent
All or a portion (range of blocks) of a storage volume.
F
failover
Automatically switching to a redundant or standby device, system, or data path upon the
failure or abnormal termination of the currently active device, system, or data path.
fault domain
A set of components that share a single point of failure. For VPLEX, the concept that every
component of a highly available system is separated, so that if a fault occurs in one
domain, it will not result in failure in other domains to which it is connected.
fault tolerance
Ability of a system to keep working in the event of hardware or software failure, usually
achieved by duplicating key system components.
Fibre Channel (FC)
A protocol for transmitting data between computer devices. Longer distance requires the
use of optical fiber; however, FC also works using coaxial cable and ordinary telephone
twisted pair media. Fibre Channel offers point-to-point, switched, and loop interfaces.
Used within a SAN to carry SCSI traffic.
Fibre Channel over IP (FCIP)
Combines Fibre Channel and Internet protocol features to connect SANs in geographically
distributed systems.
field replaceable unit (FRU)
A unit or component of a system that can be replaced on site, as opposed to returning the
system to the manufacturer for repair.
firmware
Software that is loaded on and runs from the flash ROM on the VPLEX directors.
front end port
VPLEX director port connected to host initiators (acts as a target).
G
geographically distributed system
A system that is physically distributed across two or more geographically separated sites.
The degree of distribution can vary widely, from different locations on a campus or in a city
to different continents.
H
hold provisioning
An attribute of a registered array that allows you to set the array as unavailable for further
provisioning of new storage.
host bus adapter (HBA)
An I/O adapter that manages the transfer of information between the host computer's bus
and memory system. The adapter performs many low-level interface functions
automatically or with minimal processor involvement to minimize the impact on the host
processor's performance.
I
input/output (I/O)
internet Fibre Channel
protocol (iFCP)
intranet
J
Journal volumes
Volumes that contain data waiting to be distributed to target replica volumes and copies
of the data previously distributed to the target volumes. Journal volumes allow convenient
rollback to any point in time, enabling instantaneous recovery for application
environments.
K
kilobit (Kb)
kilobyte (KB)
L
latency
LDAP
load balancing
local area network (LAN)
A group of computers and associated devices that share a common communications line
and typically share the resources of a single processor or server within a small geographic
area.
local device
A combination of one or more extents to which you add specific RAID properties. Local
devices use storage from only one cluster.
logical unit number (LUN)
Virtual storage to which a given server with a physical connection to the underlying
storage device may be granted or denied access. LUNs are used to identify SCSI devices,
such as external hard drives that are connected to a computer. Each device is assigned a
LUN number which serves as the device's unique address.
M
megabit (Mb)
megabyte (MB)
metadata
metavolume
MetroPoint consistency
group (Metro group)
Metro-Plex
Two VPLEX Metro clusters connected within Metro (synchronous) distances, approximately
60 miles or 100 kilometers.
mirroring
The writing of data to two or more disks simultaneously. If one of the disk drives fails, the
system can instantly switch to one of the other disks without losing data or service. RAID 1
provides mirroring.
mirroring services
miss
N
namespace
A set of names recognized by a file system in which all names are unique.
network
System of computers, terminals, and databases connected by communication lines.
network architecture
Design of a network, including hardware, software, method of connection, and the
protocol used.
network-attached
storage (NAS)
network partition
Non-distributed
consistency groups
Transfer data through one primary RPA that is designated by the user during group
creation. The policies applied by the consistency group can be modified at any time.
In the event of RPA failure, groups that transfer data through the failed RPA will move to
other RPAs in the cluster.
O
Open LDAP
P
parity checking
Checking for errors in binary data. Depending on whether the byte has an even or odd
number of bits, an extra 0 or 1 bit, called a parity bit, is added to each byte in a
transmission. The sender and receiver agree on odd parity, even parity, or no parity. If they
agree on even parity, a parity bit is added that makes each byte even. If they agree on odd
parity, a parity bit is added that makes each byte odd. If the data is transmitted incorrectly,
the change in parity will reveal the error.
partition
A subdivision of a physical or virtual disk, which is a logical entity only visible to the end
user and not to any of the devices.
production journal volumes
Volumes that hold system delta marking information.
production volumes
Volumes that are written to by the host applications. Writes to production volumes are
split such that they are sent to both the normally designated volumes and RPAs
simultaneously. Each production volume must be exactly the same size as the replica
volume to which it replicates.
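The even-parity scheme described under parity checking can be sketched in a few lines (an illustrative example, not part of any VPLEX component):

```python
# Even-parity example: the parity bit makes the total count of 1 bits even,
# so flipping any single bit in transit changes the parity and reveals the error.

def add_parity_bit(byte, even=True):
    """Append a parity bit to a data byte."""
    ones = bin(byte).count("1")
    parity = (ones % 2) if even else (1 - ones % 2)
    return (byte << 1) | parity

def check_parity(word, even=True):
    """Verify that the received word still has the agreed parity."""
    ones = bin(word).count("1")
    return ones % 2 == (0 if even else 1)

w = add_parity_bit(0b1011001)      # four 1 bits, so the parity bit is 0
assert check_parity(w)             # transmitted correctly
assert not check_parity(w ^ 0b10)  # one flipped bit: the check reveals it
```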
point-in-time (PIT)
R
RAID
The use of two or more storage volumes to provide better performance, error recovery, and
fault tolerance.
RAID 0
RAID 1
Also called mirroring, this has been used longer than any other form of RAID. It remains
popular because of simplicity and a high level of data availability. A mirrored array
consists of two or more disks. Each disk in a mirrored array holds an identical image of the
user data. RAID 1 has no striping. Read performance is improved since either disk can be
read at the same time. Write performance is lower than single disk storage. Writes must be
performed on all disks, or mirrors, in the RAID 1. RAID 1 provides very good data reliability
for read-intensive applications.
RAID leg
rebuild
The process of reconstructing data onto a spare or replacement drive after a drive failure.
Data is reconstructed from the data on the surviving disks, assuming mirroring has been
employed.
RecoverPoint Appliance
(RPA)
Hardware that manages all aspects of data protection for a storage group, including
capturing changes, maintaining the images in the journal volumes, and performing image
recovery.
RecoverPoint cluster
RecoverPoint
consistency groups
RecoverPoint site
RecoverPoint volumes
Repository volumes
Production volumes
Replica volumes
Journal volumes
redundancy
registered array
An array that is registered with VPLEX. Registration is required to make the array available
for services-based provisioning. Registration includes connecting to and creating
awareness of the array's intelligent features. Only VMAX and VNX arrays can be registered.
reliability
remote direct memory access (RDMA)
Allows computers within a network to exchange data using their main memories and
without using the processor, cache, or operating system of either computer.
replication set
When RecoverPoint is deployed, a production source volume and one or more replica
volumes to which it replicates.
replica journal volumes
Snapshots that are either waiting to be replicated or already distributed to the replica,
and bookmarks.
replica volumes
Volumes to which production volumes replicate. In prior releases, the replica volume had
to be exactly the same size as its production volume. In RecoverPoint (RP) 4.0 and
GeoSynchrony release 5.2, RecoverPoint supports a feature called Fake Size, where the
replica volume size can be larger than the production volume.
repository volume
A volume dedicated to RecoverPoint for each RPA cluster. The repository volume serves all
RPAs of the particular RPA cluster and the splitter associated with that cluster. The
repository volume stores configuration information about the RPAs and RecoverPoint
consistency groups. There is one repository volume per RPA cluster.
restore source
This operation restores the source consistency group from data on the copy target.
RPO
Recovery Point Objective. The time interval between the point of failure of a storage
system and the expected point in the past at which the storage system is capable of
recovering customer data. Informally, RPO is the maximum amount of data loss that can
be tolerated by the application after a failure. The value of the RPO is highly dependent
upon the recovery technique used. For example, RPO for backups is typically days; for
asynchronous replication, minutes; and for mirroring or synchronous replication, seconds
or instantaneous.
RTO
Recovery Time Objective. Not to be confused with RPO, RTO is the time duration within
which a storage solution is expected to recover from failure and begin servicing
application requests. Informally, RTO is the longest tolerable application outage due to a
failure of a storage system. RTO is a function of the storage technology. It may be
measured in hours for backup systems, minutes for remote replication, and seconds (or
less) for mirroring.
S
scalability
simple network
management protocol
(SNMP)
site ID
SLES
SUSE Linux Enterprise Server is a Linux distribution supplied by SUSE and targeted at the
business market.
Small Computer System Interface (SCSI)
A set of evolving ANSI standard electronic interfaces that allow personal computers to
communicate faster and more flexibly than previous interfaces with peripheral hardware
such as disk drives, tape drives, CD-ROM drives, printers, and scanners.
Snapshot/PIT
A point-in-time copy that preserves the state of data at an instant in time, by storing only
those blocks that are different from an already existing full copy of the data.
Snapshots are also referred to as point-in-time (PIT). Snapshots stored at a replica journal
represent the data that has changed on the production storage since the closing of the
previous snapshot.
splitter
storage area network
(SAN)
storage view
storage volume
stripe depth
striping
A technique for spreading data over multiple disk drives. Disk striping can speed up
operations that retrieve data from disk storage. Data is divided into units and distributed
across the available disks. RAID 0 provides disk striping.
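As an illustration of the striping just described, the sketch below (a hypothetical helper, not VPLEX code) maps a logical block to the disk and offset where RAID 0 would place it:

```python
# RAID 0 striping: consecutive stripe units rotate across the available disks.

def locate(block, num_disks, stripe_depth_blocks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    stripe_unit = block // stripe_depth_blocks   # which stripe unit the block is in
    disk = stripe_unit % num_disks               # units rotate across the disks
    row = stripe_unit // num_disks               # full stripes already laid down
    offset = row * stripe_depth_blocks + block % stripe_depth_blocks
    return disk, offset

# 4 disks, stripe depth of 2 blocks: blocks 0-1 land on disk 0, 2-3 on disk 1, ...
assert locate(0, 4, 2) == (0, 0)
assert locate(3, 4, 2) == (1, 1)
assert locate(8, 4, 2) == (0, 2)   # wraps back to disk 0 on the next row
```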
synchronous
Describes objects or events that are coordinated in time. A process is initiated and must
be completed before another task is allowed to begin.
For example, in banking, two withdrawals from a checking account that started at the
same time must not overlap; therefore, they are processed synchronously.
throughput
Tcl (Tool Command Language)
A scripting language often used for rapid prototypes and scripted applications.
transfer size
The size of the region in cache used to service data migration. The area is globally locked,
read at the source, and written at the target. Transfer size can be as small as 40 K, as large
as 128 M, and must be a multiple of 4 K. The default value is 128 K.
A larger transfer size results in higher performance for the migration, but may negatively
impact front-end I/O. This is especially true for VPLEX Metro migrations. Set a large
transfer-size for migrations when the priority is data protection or migration performance.
A smaller transfer size results in lower performance for the migration, but creates less
impact on front-end I/O and response times for hosts. Set a smaller transfer-size for
migrations when the priority is front-end storage response time.
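The constraints above (40 KB minimum, 128 MB maximum, multiple of 4 KB, default 128 KB) can be checked with a small sketch; the function name is illustrative, not a VPLEX CLI command:

```python
# Validate a migration transfer size against the documented constraints:
# between 40 KB and 128 MB, and a multiple of 4 KB (the default is 128 KB).

KB, MB = 1024, 1024 * 1024

def valid_transfer_size(size_bytes):
    return 40 * KB <= size_bytes <= 128 * MB and size_bytes % (4 * KB) == 0

assert valid_transfer_size(128 * KB)      # the default value
assert valid_transfer_size(40 * KB)       # smallest allowed value
assert not valid_transfer_size(42 * KB)   # not a multiple of 4 KB
assert not valid_transfer_size(256 * MB)  # larger than the 128 MB maximum
```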
transmission control
protocol/Internet
protocol (TCP/IP)
The basic communication language or protocol used for traffic on a private network and
the Internet.
U
uninterruptible power
supply (UPS)
A power supply that includes a battery to maintain power in the event of a power failure.
universal unique
identifier (UUID)
A 64-bit number used to uniquely identify each VPLEX director. This number is based on
the hardware serial number assigned to each director.
V
virtualization
A layer of abstraction that is implemented in software that servers use to divide available
physical storage into storage volumes or virtual volumes.
virtual volume
Unit of storage presented by the VPLEX front end ports to hosts. A virtual volume looks like
a contiguous volume, but can be distributed over two or more storage volumes.
W
wide area network (WAN)
write-back mode
write-through mode
A caching technique in which the completion of a write request is communicated only after
data is written to disk. This is almost equivalent to non-cached systems, but with data
protection.
INDEX
A
addresses of hardware components 130
C
cabling, internal
VPLEX VS1 dual-engine configuration 138
VPLEX VS1 quad-engine configuration 134
VPLEX VS1 single-engine configuration 142
consistency groups
asynchronous 58
synchronous 56
global visibility 57
local visibility 56
I
IP addresses 130
M
MetroPoint 106
Four-site topology 115
Three-site topology 108
Two-site topology 107
MetroPoint groups 120
Converting to a MetroPoint group 121
Creating a MetroPoint group 121
MetroPoint upgrades
Upgrading to MetroPoint 119
R
RecoverPoint
configurations 99
VPLEX Local and Local Protection 100
VPLEX Local and Local/Remote Protection 100
VPLEX Metro and RecoverPoint Local 101
VPLEX Metro and RecoverPoint with Local and Remote
Replication 102
V
VPLEX Witness
deployment 23