
INF-BCO2807

vSphere HA and Datastore Access Outages: Current Capabilities Deep Dive and Tech Preview

Smriti Desai, VMware, Inc. Keith Farkas, VMware, Inc.

#vmworldinf

Disclaimer

This session may contain product features that are currently under development.

This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.

Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new technologies or features discussed or presented have not been determined.

VMware Business Continuity Solutions


[Diagram: VMs on vSphere hosts at a local site and a failover site]

Local Availability
- vSphere High Availability (this talk)
- vSphere Fault Tolerance
- vMotion and Storage vMotion

Disaster Recovery
- vCenter Site Recovery Manager and vSphere Replication

Data Protection
- vSphere Data Protection
- Storage APIs for Data Protection

vSphere HA Recap

vSphere HA minimizes unplanned downtime
- Provides automatic VM recovery in minutes
- Protects against 3 types of failures
  - Infrastructure: host failures, VM crashes
  - Connectivity: host network isolated, datastore incurs PDL
  - Application: guest OS hangs/crashes, application hangs/crashes
- Does not require complex configuration
- Is OS- and application-independent

Talk Focus

Datastore accessibility outages occur infrequently but have a large cost

Loss of accessibility is due to
- Network or switch failure
- Array, NFS server, etc. misconfiguration

VM manageability and availability is affected
- Applications with vdisks on inaccessible datastores hang, crash, or fail
- You may not be able to manage VMs on the affected hosts
- vSphere HA's protection is impacted

Agenda and Objectives


The session has two major parts
1. Impact of datastore inaccessibility on HA failover workflows
2. Expanding HA protection against datastore inaccessibility

Objectives
- Learn how HA workflows are impacted by datastore accessibility
- Understand how vSphere 5.0/5.1 reduces the impact of inaccessibility
- Preview the future: protecting VMs against datastore inaccessibility

Agenda: Part 1
1. Impact of datastore inaccessibility on HA failover workflows
   - Architecture overview
   - Datastore usage
   - HA workflows and responses
2. Expanding HA protection against datastore inaccessibility

vSphere 5.0+ Architecture


HA Agent
- Called the Fault Domain Manager (FDM)
- Provides all the HA on-host functionality

Operation
- vCenter Server (VC) manages the cluster
- Failover operations are independent of VC
- FDMs communicate over the management network and datastores

[Diagram: vCenter Server managing a cluster of FDM hosts]

FDM Master and Slave Roles


Any FDM can be master, selected by election
All others assume the role of FDM slaves

The FDM master
- Monitors hosts and VMs
- Manages VM restarts after failures
- Reports cluster state to VC

The FDM slaves
- Forward critical state changes to the master
- Restart VMs when directed by the master
- Elect a new master when needed

Datastore Usage

Datastores are used by vSphere HA for two purposes
- As a communication channel between FDMs
- As persistent storage for configuration information

Both influence HA's response to cluster conditions

[Diagram: datastore communication and the persisted configuration are used in determining HA host states, which in turn influence and trigger HA's responses]

Datastores Used for Communication

Datastores are used when the management network is not available

Heartbeat datastores
- Used by a master to monitor a partitioned/isolated slave
- Enable a master to detect VM power state changes
- VC chooses two (by default) for each host
- Reselected after datastore accessibility changes

Home datastore of each VM
- Used by isolated slaves to determine if a master owns the VM
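The default of two heartbeat datastores per host can be tuned through the documented HA advanced option das.heartbeatDsPerHost. Below is a minimal pyVmomi sketch, not a definitive procedure; the vCenter address, credentials, and cluster name are placeholders.

    # Minimal pyVmomi sketch: change how many heartbeat datastores VC picks per host.
    # The vCenter address, credentials, and cluster name are placeholders.
    import ssl
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    si = SmartConnect(host="vc.example.com", user="administrator@vsphere.local",
                      pwd="password", sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    # Locate the cluster by name.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Cluster01")
    view.DestroyView()

    # das.heartbeatDsPerHost controls the number of heartbeat datastores (default 2).
    spec = vim.cluster.ConfigSpecEx(
        dasConfig=vim.cluster.DasConfigInfo(
            option=[vim.option.OptionValue(key="das.heartbeatDsPerHost", value="3")]))
    cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)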


Information Sources for HA Host States


The HA state reported for each host is derived using information from
- VC
- The master VC is communicating with
- The FDM on the host

State              Source
Election           FDM on the host
Running (Master)   VC
Connected (Slave)  Master
Unreachable        Master or VC
Isolated           FDM on the host, reported by master
Partitioned        Master
Dead               Master

Information Sources for HA Host States


The HA state reported for each host is derived using information from
- VC
- The master VC is communicating with
- The FDM on the host

[Repeat of the table above, with a callout noting which of these sources are determined using datastore communication (detailed on the next slide)]

How a Master Determines a Slave's State

[Decision flow]
- Slave connected to the master over the network → Connected (Slave)
- Not connected → Unreachable
- Unreachable, no response to pings, but datastore heartbeats present → Partitioned / Isolated*
- Unreachable, no response to pings, and no datastore heartbeats → Dead

* See slide notes
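A small illustrative Python sketch of the decision flow above (a simplification for exposition, not FDM code):

    # Illustrative simplification of how a master classifies a slave it manages.
    def classify_slave(connected_to_master, responds_to_pings, datastore_heartbeats):
        if connected_to_master:
            return "Connected (Slave)"
        # Not connected over the management network: the slave is at least Unreachable.
        if not responds_to_pings and datastore_heartbeats:
            return "Partitioned / Isolated"
        if not responds_to_pings and not datastore_heartbeats:
            return "Dead"
        return "Unreachable"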

Impact of datastore accessibility on responses

[Diagram, repeated: datastore communication and the persisted configuration feed HA host states, which influence and trigger HA's responses]

Network Isolated FDM - VM Isolation Workflow

[Workflow]
1. Determine VMs to power off / shut down
2. Home datastore accessible? No → (4); Yes → (3)
3. Master owns VM? Yes → (5); No → wait
4. Apply isolation response
5. Report VM power off

Network Isolated FDM - Home Datastore Inaccessibility

[Workflow repeated from the previous slide]

If all FDMs are isolated, all will apply their isolation responses
- VMs are not restarted until a master has access to the VM datastores

Best practices
- Use redundant management networks
- Reconfigure storage to reduce the likelihood of inaccessible datastores
- Use the "Leave powered on" isolation option (see the configuration sketch below)
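One way to apply the "leave powered on" best practice is to set the cluster's default isolation response. The following is a minimal pyVmomi sketch, not a definitive procedure; the vCenter address, credentials, cluster name, and isolation address are placeholders, and das.isolationaddress0 is one of the documented das.isolationaddressX advanced options.

    # Minimal pyVmomi sketch: default isolation response = "none" (leave powered on),
    # plus an additional isolation address. Names and addresses are placeholders.
    import ssl
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    si = SmartConnect(host="vc.example.com", user="administrator@vsphere.local",
                      pwd="password", sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Cluster01")
    view.DestroyView()

    das_config = vim.cluster.DasConfigInfo(
        defaultVmSettings=vim.cluster.DasVmSettings(isolationResponse="none"),
        option=[vim.option.OptionValue(key="das.isolationaddress0", value="192.168.1.1")])
    cluster.ReconfigureComputeResource_Task(
        spec=vim.cluster.ConfigSpecEx(dasConfig=das_config), modify=True)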

Network Isolated FDM - Heartbeat Datastore Accessibility

[Workflow repeated from the previous slides]

If the isolated host and the master have access to the heartbeat datastores, the master will attempt a failover on the power-off notification
Otherwise, the master will declare the host dead and start the failover immediately*
The same situation applies to partitioned hosts

* More info in backup slides

Host Declared Dead


The VM Failover Response


Host Dead: FDM Master's Workflow

[Workflow]
1. Host declared dead
2. Determine the VMs to be failed over
   Note: steps 2 to 7 apply any time a VM is to be restarted
3. Try to place each VM
4. Found a place? If no, wait for a capacity change, then retry placement
5. Restarted? If no, retry after a delay, then try again
End: the VM is restarted or an error is encountered

Impact of Datastore Accessibility: VMs to Failover

Case 1: the home datastore of a VM is not accessible on the master's host
- The master will proxy all accesses via a slave with access

Case 2: the master may not know the VM is protected
- Reason #1: the VM's home datastore is inaccessible
  - The VM can't be powered on in any case
  - The master will retry once the datastore is accessible
- Reason #2: partition with multiple masters, and the other master owns the VM
  - But the other master knows and will restart it if needed

Host Dead: FDM Master's Workflow

[Workflow repeated from above]

Impact of Datastore Accessibility: VM Restart

Case 1: host manageability impacted by datastore inaccessibility
- The master will retry failovers on another host after a timeout
- Could take a long time to restart failed VMs
- vSphere 5.0 and 5.1 enhancements significantly reduce the impact

Case 2: one of a VM's datastores is inaccessible on some/all hosts
- The master will retry, but could exhaust its 5 retries before success
- A future opportunity to enhance HA

Both are discussed next in part 2 of this session

Agenda: Part 2
1. Impact of datastore inaccessibility on HA failover workflows
2. Expanding HA protection against datastore inaccessibility
   - Technical direction
   - VM manageability and availability
   - Tech preview

Solution Approach for Inaccessible Datastores

Improve VM availability by ensuring
1. VMs are manageable
2. VMs that use the datastore are moved to healthy hosts
Address #1 by enhancing ESX, #2 by enhancing HA

[Diagram: vCenter Server managing VMware ESX hosts]

Types of Inaccessibility: PDL and APD

PDL and APD are ESX storage-device states that indicate inaccessibility

PDL (Permanent Device Loss): the device is permanently inaccessible
- E.g., caused by removing a LUN using array management software
- ESX infers the state from SCSI sense codes returned by an array, or from an iSCSI login reject (target is gone or access not authorized)
- The device must be recreated to restore normal operation

APD (All Paths Down): the device is possibly temporarily inaccessible
- E.g., caused by unplugging a network cable
- The device could become accessible at any time

ESX Enhancements for VM Manageability

Idea: if a datastore is under APD/PDL, fail I/Os quickly
- Impacted operations are notified faster, which allows other operations to proceed

ESX PDL support (vSphere 5.0)
- When under PDL, I/Os are failed immediately
ESX APD support (vSphere 5.1)
- When under APD, non-guest I/Os are failed immediately after a delay

[Timeline]
- T=0: APD detected
- T=140s (default): APD timeout declared; I/O fast-failing starts
- Datastore reachable again: APD cleared; normal I/O behavior resumes
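For reference, the APD behavior above is governed by the ESXi 5.1 host advanced settings Misc.APDHandlingEnable and Misc.APDTimeout. The sketch below only reads them with pyVmomi; the vCenter address, credentials, and host name are placeholders.

    # Minimal pyVmomi sketch: read a host's APD handling settings.
    # The vCenter address, credentials, and host name are placeholders.
    import ssl
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    si = SmartConnect(host="vc.example.com", user="administrator@vsphere.local",
                      pwd="password", sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    host = next(h for h in view.view if h.name == "esx01.example.com")
    view.DestroyView()

    adv = host.configManager.advancedOption
    for key in ("Misc.APDHandlingEnable", "Misc.APDTimeout"):
        for opt in adv.QueryOptions(key):
            print(opt.key, "=", opt.value)   # Misc.APDTimeout defaults to 140 seconds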

HA Enhancements for VM Availability

Technical direction
- Restart VMs with datastores under APD/PDL on a healthy host
- Response is fully configurable and automatic

vSphere 5.0 U1 introduced initial support for PDL
- Terminates a VM on the first guest-issued I/O to a PDL virtual disk
- Once a VM has been terminated, vSphere HA will restart it
- Enabled using advanced options; see the slide notes for details and the sketch below
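The 5.0 U1 mechanism referenced above is commonly described as requiring two settings: disk.terminateVMOnPDLDefault in /etc/vmware/settings on each host, and the HA advanced option das.maskCleanShutdownEnabled on the cluster. A minimal pyVmomi sketch for the cluster-side option follows; verify the option names against the slide notes and release documentation, and treat the connection details as placeholders.

    # Minimal pyVmomi sketch: set the cluster-side HA advanced option associated with
    # the 5.0 U1 PDL response. The host-side half (disk.terminateVMOnPDLDefault = TRUE)
    # is set in /etc/vmware/settings on each host. Connection details are placeholders.
    import ssl
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    si = SmartConnect(host="vc.example.com", user="administrator@vsphere.local",
                      pwd="password", sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Cluster01")
    view.DestroyView()

    spec = vim.cluster.ConfigSpecEx(
        dasConfig=vim.cluster.DasConfigInfo(
            option=[vim.option.OptionValue(key="das.maskCleanShutdownEnabled",
                                           value="true")]))
    cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)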

HA Enhancements for VM Availability: The Future

We are exploring a significant extension to this mechanism

Design goals
- Add support for APD
- Triggered by the PDL/APD declaration rather than by guest I/Os
- Full customization of responses (e.g., an event-only option)
- Full user interface and detailed reporting
- VM placement sensitive to datastore accessibility

VM Component Protection
Caveat: what follows is a prototype, and a feature based on it may look quite different, if and when we offer it

Protection Workflow: PDL


[Workflow]
1. Datastore inaccessible (PDL)
2. Determine the per-VM response
3. "No action" → end; "Failover" → terminate and restart the VM
4. End

Protection Workflow: APD


[Workflow]
1. Datastore inaccessible (APD)
2. Wait for the APD declaration
3. Wait for an optional delay
4. Determine the per-VM response ("No action" → end)
5. "Failover": could capacity be reserved?
   - Yes → terminate and restart the VM
   - No → end

Protection Workflow: APD


[Workflow]
Same as the previous slide, with one addition: if the APD condition clears while waiting, the per-VM response is consulted and the guest is reset if requested, rather than terminated and restarted

Combined Workflow: APD and PDL


[Combined workflow]
1. Datastore inaccessible
2. PDL branch: determine the per-VM response; "No action" → end, "Failover" → terminate and restart the VM
3. APD branch: wait for the APD declaration, then for an optional delay
   - If the APD clears while waiting: determine the per-VM response; restart the guest if requested
   - Otherwise: determine the per-VM response; "No action" → end
   - "Failover": could capacity be reserved? Yes → terminate and restart the VM; No → end

Demo Overview
[Diagram: ESX hosts connected through an FC switch and an Ethernet switch to a converged NFS/iSCSI storage array; an APD condition impacts host A's SAN connectivity]

- 2 VMs on NFS: OracleDBServer (OR), ExchangeServer (EX)
- 3 VMs on SAN: Webserver 1 (WS1), Webserver 2 (WS2), Ubuntu (UB)



Summary: Protection Against Datastore Inaccessibility

Several platform enhancements in recent years
- vSphere 5.0: PDL support
- vSphere 5.1: APD support
- vSphere 5.0 U1: HA restarts VMs if they fail during a PDL/APD

The future: HA recovering VMs impacted by PDL/APD
- Comprehensive: APD and PDL, covering all VM I/Os
- Configurable: various levels of VM remediation
- Usable: enabled with one click, detailed error reporting

Please send us your feedback on the proposed feature

Session Summary


Session Summary

vSphere HA provides organizations the ability to run their critical business applications with confidence
- Offers a solid, scalable foundation upon which to build a business
- Is simple to enable and manage

vSphere HA failure coverage was extended in 5.0 U1 to cover PDL

HA vision: full coverage of datastore accessibility outages
- Extend coverage of failures to more applications
- Extend HA coverage to multi-VM applications

Questions?




Additional vSphere HA 5.0+ Details


VM Protection Workflow Example


VM Protection Workflow: Power On

1. VM is off; VM protection is N/A
2. User powers on the VM
3. Host reports to VC that the VM is powered on
4. VC reports the VM is unprotected
5. VC tells the master to protect the VM
6. Master updates the protection list on disk
7. Master informs VC that it has done so
8. VC reports the VM is protected

Host Failure, Partition, and Isolation Responses


Applying Concepts: Host Failures - Network Partition

A master declares a host partitioned when:
- It can't communicate with the host over the network
- It can see the host's datastore heartbeats

[Diagram: ESX 1-4 split into two network partitions]

Results in:
- Another master being elected
- VC reporting one master's view of the cluster
- A VM running in the other partition will be monitored via the heartbeat datastores and restarted if it fails (restarted in the master's partition)

When the partition is resolved, all but one master abdicate

Applying Concepts: Host Failures - Host Dead

A master declares a host dead when:
- The master can't communicate with it over the network
  - The host is not connected to the master
  - The host does not respond to ICMP pings
- The master observes no datastore heartbeats

Results in:
- The master attempting to restart all VMs from the host
- Restarts occur on network-reachable hosts and on the master's own host

[Diagram: cluster of ESX 1-4]

Troubleshooting


Troubleshooting vSphere HA 5.0


HA issues proactive warnings about possible future conditions
- VMs not protected after powering on
- Management network discontinuities
- Isolation addresses stop working

HA host states provide granularity into error conditions

All HA conditions are reported via events; config issues/alarms for some
- Event descriptions describe the problem and the actions to take
- All event messages contain "vSphere HA", so searching for HA issues is easier
- HA alarms are more fine-grained and auto-clearing (where appropriate)

The 5.0+ troubleshooting guide discusses the likely top issues, e.g.,
- Implications of each of the HA host states
- Topics on heartbeat datastores, failovers, admission control
- Will be updated periodically

HA Agent Logging
HA 5.0+ writes operational information to a single log file called fdm.log
A configurable number of historical copies are kept to assist with debugging

The file contains a record of, for example,
- Inventory updates relating to VMs, the host, and datastores received from the host management agent (hostd)
- Processing of configuration updates sent to a master by vCenter Server
- Significant actions taken by the HA agent, such as protecting a VM or restarting a VM
- Messages sent by a slave to a master and by a master to a slave

Default location
- ESXi 5.0+: /var/log/fdm.log (historical copies in /var/run/log)
- Earlier ESX versions: /var/log/vmware/fdm (all files in the same directory)

Notes
- See the vSphere HA best practices guide for recommended log capacities
- HA log files are designed to assist VMware support in diagnosing problems, and the format may change at any time. Thus, for reporting, we recommend you rely on the vCenter Server HA-related events, alarms, config issues, and VM/host properties (see the sketch below)
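As an illustration of relying on vCenter Server events for reporting, the sketch below pulls recent events for a cluster with pyVmomi and keeps the ones whose text mentions "vSphere HA"; the vCenter address, credentials, and cluster name are placeholders.

    # Minimal pyVmomi sketch: list recent cluster events whose text mentions "vSphere HA".
    # The vCenter address, credentials, and cluster name are placeholders.
    import ssl
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    si = SmartConnect(host="vc.example.com", user="administrator@vsphere.local",
                      pwd="password", sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Cluster01")
    view.DestroyView()

    filter_spec = vim.event.EventFilterSpec(
        entity=vim.event.EventFilterSpec.ByEntity(entity=cluster, recursion="all"))
    for event in content.eventManager.QueryEvents(filter_spec):
        if "vSphere HA" in (event.fullFormattedMessage or ""):
            print(event.createdTime, event.fullFormattedMessage)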

Log File Format


The log file contains time-stamped rows
Many rows report the HA agent (FDM) module that logged the information, e.g.,

2011-06-01T05:48:00.945Z [FFFE2B90 info 'Invt' opID=SWI-a111addb] [InventoryManagerImpl::ProcessClusterChange] Cluster state changed to Startup

Noteworthy modules
- Cluster: responsible for cluster functions
- Invt: responsible for caching key inventory details
- Policy: responsible for deciding what to do on a failure
- Placement: responsible for placing failed VMs
- Execution: responsible for restarting VMs
- Monitor: modules responsible for periodic health checks
- FDM: responsible for communication with vCenter Server
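A small illustrative Python helper for splitting fdm.log rows of the form shown above into timestamp, level, module, and message. The regular expression is an assumption derived from that single example line, not a documented format.

    # Illustrative parser for fdm.log rows; the pattern is inferred from the example
    # line above and is not a documented or stable format.
    import re

    LINE = re.compile(
        r"^(?P<ts>\S+)\s+\[(?P<thread>\S+)\s+(?P<level>\w+)\s+'(?P<module>[^']+)'"
        r"(?:\s+opID=(?P<opid>\S+))?\]\s+(?P<message>.*)$")

    def parse_fdm_line(line):
        match = LINE.match(line)
        return match.groupdict() if match else None

    sample = ("2011-06-01T05:48:00.945Z [FFFE2B90 info 'Invt' opID=SWI-a111addb] "
              "[InventoryManagerImpl::ProcessClusterChange] Cluster state changed to Startup")
    print(parse_fdm_line(sample))   # module == 'Invt', level == 'info'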

Additional Datastore Details for HA 5.0+


- Heartbeating and heartbeat files
- Protected VM files
- File locations

Heartbeat Datastores (HB): Mechanisms

Used by a master for slaves not connected to it over the network

Determine if a slave is alive
- Relies on heartbeats issued to the slave's HB datastores
- Each FDM opens a file on each of its HB datastores for heartbeating purposes
- The files contain no information; on VMFS datastores, a file will have the minimum-allowed file size
- Files are named X-hb, where X is the (SDK API) moID of the host
- The master periodically reads the heartbeats of all partitioned/isolated slaves

Determine the set of VMs running on a slave
- An FDM writes a list of powered-on VMs into a file on each of its HB datastores
- The master periodically reads the files of all partitioned/isolated slaves
- Each poweron file contains at most 140 KB of info; on VMFS datastores, actual disk usage is determined by the file sizes supported by the VMFS version
- These files are named X-poweron, where X is the (SDK API) moID of the host

Location of Heartbeating and Protection Files


FDMs create a directory (.vSphere-HA) in the root of each relevant datastore
Within it, they create a subdirectory for each cluster using the datastore
Each subdirectory is given a unique name, called the Fault Domain ID:
  <VC uuid>-<cluster entity ID>-<8 random hex characters>-<VC hostname>
The entity ID is the number portion of the (SDK API) moID of the cluster

E.g., in /vmfs/volumes/clusterDS/.vSphere-HA/
  FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-9-d6bfc023-vc23/    (cluster 9)
  FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-17-ad9fd307-vc23/   (cluster 17)
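A small illustrative Python snippet that walks the layout described above (run where the datastores are mounted under /vmfs/volumes, e.g., an ESXi shell) and lists each fault domain's heartbeat and poweron files:

    # Illustrative: list the X-hb and X-poweron files under each datastore's
    # .vSphere-HA directory, following the layout described above.
    import glob
    import os

    for fdm_dir in glob.glob("/vmfs/volumes/*/.vSphere-HA/*/"):
        print("Fault Domain ID:", os.path.basename(os.path.dirname(fdm_dir)))
        files = glob.glob(os.path.join(fdm_dir, "*-hb")) + \
                glob.glob(os.path.join(fdm_dir, "*-poweron"))
        for path in sorted(files):
            print("  ", os.path.basename(path), os.path.getsize(path), "bytes")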
