Table of contents
Executive summary
Introduction
  Storage overview
High availability for SAP systems
  Solution objectives
  Multi-site cluster with FT and HA
  FT overview
  Selecting an appropriate solution
Validating an Always-on solution
  Solution options
  Always-on failover testing
FT performance considerations
  Optimizing FT-protected VM performance
Conclusion
Appendix: Setting up the solution
  Qualified hardware and software
  Configuring storage resources
  Configuring vSphere resources
  Configuring SAP systems
For more information
Executive summary
This white paper is one in a series on the virtualization of an SAP environment through a combination of HP storage systems, HP ProLiant servers, and VMware virtualization technology. It focuses on deploying two advanced VMware vSphere features, High Availability (HA) and Fault Tolerance (FT), so that virtualized SAP systems can take full advantage of the disaster tolerance delivered by HP StorageWorks P4000 G2 SAN storage.

HP has developed and validated a disaster-tolerant SAP solution featuring HA and FT, in conjunction with automatic, synchronous SAN storage replication. This solution was based on best practices for HP Converged Infrastructure and published maximum configurations for a vSphere environment. The combination of P4000 G2 SAN storage and HP ProLiant servers allows a disaster-tolerant SAP system to scale (see footnote 1) from an entry-level P4000 Virtual SAN Appliance Software (VSA) to P4300 G2 SAS Starter SAN and enterprise-class P4500 G2 SAN, and from a single HP ProLiant DL server to a multi-blade HP BladeSystem configuration, until the resource limits of FT-protected virtual machines (VMs) are reached.

Target audience: This paper is intended for SAP administrators wishing to learn more about virtualizing and protecting SAP systems using vSphere software in conjunction with an HP hardware platform. In general, however, the reader does not require experience with SAP software.

Testing performed in September 2010 is described.
Introduction
The combination of HP storage systems, HP servers, and SAP software in a VMware virtualized environment has been described in the following series of HP white papers:

- SAP system virtualization with HP ProLiant servers: an introduction
  This paper highlights virtualization as a business enabler and provides detailed information on how to use HP offerings (such as HP ProLiant servers, shared HP StorageWorks SAN storage, and HP Virtual Connect FlexFabric networking) to virtualize an SAP landscape with vSphere 4. In addition, guidelines for sizing a virtualized SAP landscape are presented.

- SAP system virtualization with HP ProLiant servers: total cost of ownership and return on investment with VMware vSphere
  This paper outlines the business case for virtualization, describing economic, environmental, and technological drivers. It also reviews a total cost of ownership (TCO) study concerning the virtualization of a legacy environment and outlines projected cost savings for the first three years of operation.

- HP LeftHand P4000 iSCSI SAN for SAP landscapes with HP BladeSystem and HP ProCurve: configuration and performance
  This paper, in conjunction with the VMware paper, SAP Solutions on VMware vSphere 4 - Best Practice Guidelines, outlines the capabilities and benefits of a P4000 G2 iSCSI SAN and provides sample configurations for environments featuring bare-metal and virtualized SAP servers. Performance testing validated that the I/O performance delivered by these configurations satisfies SAP system requirements.

The solution described in this paper was designed to enhance the availability of an SAP system protected by HA and FT by deploying a P4000 G2 SAN to provide multi-site storage. This solution is suitable for larger installations where many virtualized SAP systems can be deployed.
1 HP validated a disaster-tolerant SAP NetWeaver 7.0 system with Microsoft SQL Server 2008 on Microsoft Windows Server 2008.
Storage overview
Most SAP production systems need a high level of availability along with reasonable levels of performance. Thus, a suitable storage system is required whether there are thousands of interactive users connecting to a terabyte-sized SAP database or fewer than a hundred users connecting to a database that contains only several hundred gigabytes. Figure 1 outlines three sample configurations: an entry-level, software-only P4000 VSA; a P4300 G2 Starter SAN with ProLiant DL servers; and a P4500 G2 Multi-site SAN with HP BladeSystem. In the last configuration, the server blades can scale to accommodate a large number of virtualized SAP systems.
Attached to an HP BladeSystem, the HP StorageWorks P4500 G2 SAN is optimized to deliver scalability to a multi-blade environment supporting a predominantly Enterprise Resource Planning (ERP) random-access I/O workload. Features include:

- Highly dense storage that can dynamically scale performance and capacity as the infrastructure expands
- Clustered architecture that promotes high availability with no single point of failure (SPOF). Simplicity and performance are enhanced by high-speed storage paths, dense disk spindle counts, and no external storage switching

All P4000 G2 SANs are driven by built-in SAN/iQ software with features that include storage clustering, thin provisioning, snapshots, remote copy, and Network RAID for high availability.
Solution objectives
The objective of the high-availability solution described in this paper is to provide high availability for an SAP system at the VM level by eliminating the following SPOFs that have been identified in the SAP architecture:

- SAP database
  When each SAP work process starts, it makes a private connection to the database. If this connection is interrupted due to the failure of the database instance, the work process attempts to set up a new connection and changes to "database reconnect" state until the instance is restored. User sessions engaged in database activity receive SQL error messages; however, logged-on sessions are preserved on the application server.

- SAP Message Service
  Message Service is used to exchange and regulate messages between SAP instances. For example, it schedules batch jobs and determines to which instance a user will log on.

- SAP Enqueue Service
  Enqueue Service manages the locking of business objects at the SAP transaction level. Locks are set in a lock table that is stored in the shared memory of the host on which Enqueue Service is running. Failure of this service would have a considerable effect on the system because all transactions containing locks would have to be rolled back; in addition, any SAP updates being processed would fail and, depending on business requirements, may have to be manually applied via SAP transaction SM13 once Enqueue Service has been restored.

You can protect the database by allowing the instance to fail over to another physical machine in the event of a failure. In addition, isolating Message and Enqueue Services from the SAP Central Instance (CI) helps address the high-availability requirements of these particular SPOFs. The SAP Central Services (SCS) component is more lightweight than the CI and, if necessary, can be restarted much more quickly after a failure.
In the event of the failure of a database VM protected by HA, the SAP system is not regarded as down; however, transactions cannot continue until the database has restarted at the second site.
HA continuously monitors all vSphere hosts in the cluster and, in the event of a host failure, restarts the affected VMs on surviving hosts. FT gives you the ability to run two VMs (primary and secondary) simultaneously, in lockstep (see footnote 3). If the primary VM were to fail, the secondary VM would immediately take over, becoming the new primary VM and continuing processing at the point the original VM stopped. A new secondary VM is then created on the next available vSphere host. In the solution shown in Figure 2, FT enables the transparent failover of an SCS instance.
Note: The combination of HA and FT can be used to address SPOFs in a virtualized SAP environment. For more information, refer to the VMware white paper, SAP Solutions on VMware vSphere: High Availability.
An HP P4000 G2 multi-site SAN cluster extends the protection delivered by HA (vSphere host level) and FT (VM level) to the storage level. By combining the three components (HA, FT, and a P4000 G2 multi-site SAN cluster), an entire installation at one site can be protected by a second site and vice versa.
FT overview
FT establishes and maintains an active secondary VM that runs in lockstep with the primary VM. Although the secondary VM resides on a different vSphere host, it executes exactly the same sequence of instructions as the primary and receives the same inputs; their states are identical. Should the primary fail, the secondary is ready to take over at any time, without data loss or interruption of service. To summarize, both VMs are managed as a single unit but run on different physical hosts. Currently, FT only supports single-vCPU VMs, making this feature a viable solution for lighter components of the SAP architecture, such as SCS.
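
The lockstep idea described above can be illustrated with a toy model. The sketch below is not VMware's vLockstep implementation; it only assumes a deterministic machine whose state depends solely on the ordered inputs it receives, and all names (`ToyVM`, `run_in_lockstep`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToyVM:
    """Toy deterministic machine: its state depends only on the ordered inputs."""
    state: int = 0
    applied: int = 0

    def apply(self, value: int) -> None:
        # Any deterministic update would do; this one is arbitrary.
        self.state = (self.state * 31 + value) % 1_000_003
        self.applied += 1


def run_in_lockstep(inputs):
    """Feed the same input sequence to a primary and a secondary toy VM."""
    primary, secondary = ToyVM(), ToyVM()
    for value in inputs:
        primary.apply(value)    # primary handles the input
        secondary.apply(value)  # secondary replays the identical input
    return primary, secondary


primary, secondary = run_in_lockstep([3, 1, 4, 1, 5])
# Simulated primary failure: the secondary already holds the same state,
# so it can simply continue with the next input.
survivor = secondary
survivor.apply(9)
```

Because both toy machines apply the same inputs in the same order, their states match at every step, which is why the secondary can take over without replaying or recovering anything.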
3 Using what VMware refers to as vLockstep technology
4 Assuming SCS is installed in a VM with a single vCPU
Solution options
HP focused on Always-on solutions configured with HA and FT. The options described below are designed to balance complexity, cost, and SAP application availability.

Single VM
A two-tier SAP environment with the CI and database server deployed on a single VM is the simplest way to virtualize an SAP system. According to VMware, this single-VM solution can scale to as many as eight vCPUs (as shown in Table 1) and, thus, is able to support a reasonable number of SAP transactions.
Table 1. Configuration maximums
Item                  Maximum
VM
  vCPUs per VM        8
  RAM per VM          255 GB
FT
  vCPUs per FT VM     1
  RAM per FT VM       64 GB
  vDisks per FT VM    16
The single-VM approach allows an SAP system to be protected by HA. However, using FT to protect Enqueue and Message Services would impose a limit of a single vCPU on the VM, which would, in turn, restrict the number of transactions supported. Thus, for larger workloads, a different configuration is required.

Multiple VMs
To enhance scalability, you can use a three-tier SAP environment, separating the database from the CI. Moreover, since NetWeaver architecture would then allow you to separate Enqueue and Message Services from the CI via a lightweight SCS server, you can protect this SPOF by enabling FT, as shown in Figure 2.

Figure 2 Simplified view of a multi-site, replicated SAP system protected by HA and FT

The FT-protected VM is configured with a single vCPU and a theoretical maximum of 64 GB of memory. The database server is protected by HA, allowing it to be configured with up to eight vCPUs and a theoretical maximum of 255 GB of memory.
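
As a rough illustration of how the Table 1 maximums constrain the two options, the following sketch checks a planned VM against those limits. The limit values are taken from Table 1; the `VMSpec` type, the `violations` helper, and the VM names are hypothetical and are not part of any VMware or HP tool.

```python
from dataclasses import dataclass

# Maximums from Table 1 (vSphere configuration maximums cited in this paper).
LIMITS = {
    "vm": {"vcpus": 8, "ram_gb": 255},
    "ft": {"vcpus": 1, "ram_gb": 64, "vdisks": 16},
}


@dataclass
class VMSpec:
    name: str            # illustrative label only
    vcpus: int
    ram_gb: int
    vdisks: int = 1
    ft_protected: bool = False


def violations(spec: VMSpec) -> list:
    """Return a message for every Table 1 maximum the planned VM exceeds."""
    limits = LIMITS["ft"] if spec.ft_protected else LIMITS["vm"]
    problems = []
    for key, maximum in limits.items():
        value = getattr(spec, key)
        if value > maximum:
            problems.append(f"{spec.name}: {key}={value} exceeds maximum {maximum}")
    return problems


# A single-vCPU SCS VM fits under FT; a two-vCPU central instance does not.
print(violations(VMSpec("scs", vcpus=1, ram_gb=8, ft_protected=True)))
print(violations(VMSpec("ci", vcpus=2, ram_gb=16, ft_protected=True)))
```

The second call reports a vCPU violation, which mirrors the reasoning above: FT protection is practical only for the lightweight SCS VM, while larger VMs rely on HA.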
Figure 3 Overview of the tested configuration, showing the loss of an entire site with minimal interruption to SAP transaction processing
Losing the FT-protected VM
While SAP transactions were being processed, HP manually powered off the primary SCS VM, which had been running at Site A. However, there was a secondary SCS VM running at Site B; both SCS VMs had access to the same iSCSI storage, which was protected by the P4000 G2 multi-site SAN described in Appendix: Setting up the solution. As a result of the simulated VM failure, the SCS VM at Site B became the primary. Since there had been no disruption to the database VM, FT protection allowed SAP transactions to continue running seamlessly.
Losing the vSphere host
Figure 4 shows what happened when the vSphere host at Site A (172.16.2.18) was powered off. SAP dialog and SCS instances are both running on the host at Site B (172.16.2.19); however, the SCS VM is showing an alert because it cannot be protected by FT until you provide a secondary VM at some other site. The database VM, which is protected by HA, is in the process of failing over to Site B.
Figure 4 The SCS VM is running at Site B; following the failure of the host at Site A, the DB VM is in the process of failing over (see footnote 5)
After the HA failover, the database VM automatically restarted at Site B, as shown in Figure 5. Because they were protected by FT, Enqueue and Message Services remained active despite the failure of the SCS VM at Site A. As a result, SAP transactions that had been halted by the failure of the database VM at Site A were able to continue processing when it restarted at Site B.
Figure 5 SAP transaction processing continued after the database VM failed over to the surviving host at Site B (see footnote 6)
5 Refer to Appendix: Setting up the solution for details on vSphere resources.
6 Once the second VMware cluster member is back online, the FT-protected SCS secondary is spawned and the alert status shown in Figure 4 is cleared.
Losing storage
When power to the P4000 G2 SAN storage nodes at Site A was abruptly cut, all VMs now at Site B were able to continue running, thanks to synchronous storage replication between the sites. After noting that storage at Site A was no longer available, SAN/iQ Failover Manager functionality was able to maintain quorum (see footnote 7). Storage at Site B was still operational and was able to process I/Os for the VMs; however, the loss of a pair of storage nodes led to some degradation of storage performance.
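
The quorum arithmetic behind this result can be sketched as follows. The sketch assumes a simple majority rule and assumes the five votes noted for the tested environment were split as two storage-node votes at Site A and three votes at Site B (two storage nodes plus the Virtual Manager); the function and variable names are hypothetical.

```python
def has_quorum(votes_online: int, votes_total: int) -> bool:
    """Majority rule: strictly more than half of all votes must be reachable."""
    return votes_online > votes_total // 2


# Assumed vote distribution for the tested environment (see lead-in).
SITE_VOTES = {"A": 2, "B": 3}
TOTAL_VOTES = sum(SITE_VOTES.values())

# Site A loses power: only the Site B votes remain.
print(has_quorum(SITE_VOTES["B"], TOTAL_VOTES))  # True, 3 of 5

# Had Site B failed instead, Site A alone would not hold a majority.
print(has_quorum(SITE_VOTES["A"], TOTAL_VOTES))  # False, 2 of 5
```

Under these assumptions the site holding the fifth vote is the one that can survive alone, which is why the placement of that tie-breaking vote matters in a two-site design.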
Note: Since the failed vSphere host at Site A had also been running a Windows domain controller (SAP-FT-DC) and Domain Name System (DNS) services, the failure of storage at this site constituted a failure of the entire site. For convenience, the domain controller and DNS services in the tested environment were protected by HA.
The P4000 SAN/iQ central management console (CMC) reported that Site A's storage was offline, as shown in Figure 6.
Figure 6 Due to the protection provided by Network RAID, a storage failure at Site A caused no interruption to storage services
After completing these three test cases, when HP restored power to the affected VM, vSphere host, or storage nodes, Site A came back online with minimal effort.
FT performance considerations
Protecting a lightweight SCS instance with FT allows a production SAP system to scale to support higher transaction loads until the resource limits of the FT-protected VM are reached.
7 In the tested environment there was one Failover Manager (vote) for each of the four storage nodes. A Virtual Manager, the fifth vote, was deployed at Site B.
Inherent resource requirements
Enabling FT inherently requires additional resources because the secondary VM uses as much CPU and memory as the primary. If the secondary VM lags too far behind the primary, which may happen, for example, if the primary VM is CPU-bound and the secondary VM is not receiving enough CPU cycles, the hypervisor may slow the primary to allow the secondary to catch up.

Optimizing network traffic
When FT is first enabled, the live migration required to spawn a secondary instance may temporarily saturate the associated VMware vMotion network link. If this link is also being used for other operations, such as FT logging, the performance of those operations can be impacted. Thus, you should use separate, dedicated network interface cards (NICs) for FT logging and vMotion traffic.

FT-protected VMs that receive large amounts of network traffic or perform lots of disk reads can create a significant load on the NIC specified for logging traffic. In this case, you should implement a dedicated SCS VM that is separate from the database server.

To avoid saturating the network link used for FT logging traffic, limit the number of FT-protected VMs on each host or limit the disk-read and network-receive bandwidths of those VMs. Make sure that logging traffic is carried by at least a Gigabit-rated NIC.

Multiple FT-protected instances
If you are implementing multiple FT-protected SCS instances, distribute the primary VMs across available vSphere hosts. By spreading out the FT logging traffic, which is asymmetric (that is, the majority flowing from the primary VM to the secondary), you avoid saturating the logging NIC.
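
One simple way to spread primaries across hosts is a round-robin assignment, sketched below. This is only an illustration of the recommendation, not a VMware placement algorithm; the function name, VM names, and host names are hypothetical.

```python
def place_ft_pairs(vm_names, hosts):
    """Assign each FT pair a (primary, secondary) host, rotating primaries."""
    if len(hosts) < 2:
        raise ValueError("FT needs at least two hosts: primary and secondary must differ")
    placement = {}
    for index, name in enumerate(vm_names):
        primary = hosts[index % len(hosts)]
        # The secondary always lands on the next host, never on the primary's host.
        secondary = hosts[(index + 1) % len(hosts)]
        placement[name] = (primary, secondary)
    return placement


plan = place_ft_pairs(["scs-1", "scs-2", "scs-3"], ["host-a", "host-b"])
for vm, (primary, secondary) in plan.items():
    print(f"{vm}: primary on {primary}, secondary on {secondary}")
```

With two hosts, the primaries alternate between them, so the mostly one-directional logging traffic does not all originate from the same logging NIC.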
Conclusion
If you are running an SAP system on VMware software, you can successfully utilize HA to help satisfy the high-availability requirements of mission-critical SAP environments. FT can further enhance availability by protecting SPOFs such as the SCS instance. Combining FT-protection for the SCS with HA-protection for the SAP database helps create an SAP system with no SPOFs. Adding a P4000 G2 SAN to provide disaster-tolerant, replicated storage can further enhance data and application availability without introducing excessive levels of complexity. Since it is critical to understand how to set up and configure the underlying infrastructure, the appendix to this white paper outlines the setup of a VMware-protected SAP system that incorporates a P4000 G2 SAN multi-site architecture.
Server-side / Network
  vSphere hosts
    Two HP ProLiant BL460c G6 server blades, each configured with:
    - 72 GB RAM
    - Two 15,000 rpm SAS hard drives
    - One NC373m Dual Port Multifunction 1Gb NIC
  Two Ethernet switches

Software
  vSphere hosts
    On each BL460c G6 server blade:
    - VMware ESXi 4.0.0
    - HP StorageWorks P4000 SAN/iQ 8.5
  VMs
    - Windows Server 2008
    - NetWeaver 7.0
The tested P4300 G2 SAN complied with best practices presented in the HP white paper, Running VMware vSphere 4 on HP P4000 SAN Solutions.
While the SAP database server (SAP-FT-DB) and application dialog servers (SAP-FT-DI-1 and SAP-FT-DI-2) have been configured for HA, the characteristics of the SAP SCS instance, such as its single vCPU and FT protection (dark blue icon), are marked with red circles in Figure 9.
To protect a VM with FT, right-click its icon in the vSphere client and enable the FT feature; this spawns a secondary instance on a different vSphere cluster member.
Figure 10 SAP Management Console (sapmmc) showing the SAP system configuration
For more information on how the tested SAP systems were installed, refer to the SAP document, SAP Installation Guide: NetWeaver 7.0 ABAP on Windows: MS SQL Server, available at http://service.sap.com/instguides.
http://h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA3-0261ENW&cc=us&lc=en
- Other solution white papers, forums, and webinars
- HP StorageWorks P4000 G2 SAN solutions
- HP StorageWorks P4000 Virtual SAN Appliance Software

VMware
- VMware Fault Tolerance Recommendations and Considerations on VMware vSphere 4
- SAP Solutions on VMware vSphere 4 - Best Practice Guidelines
- Configuration maximums for VMware vSphere 4.1
- SAP Solutions on VMware vSphere: High Availability
- Performance Best Practices for VMware vSphere 4.0
http://www.vmware.com/resources/techresources/10040
SAP (SAP S- or C-user account required for the following)
- SAP Note 1409608 - Virtualization on Windows
- Installation Guide: NetWeaver 7.0 ABAP on Windows: MS SQL Server
http://service.sap.com
http://service.sap.com/instguides
Copyright 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. 4AA0-8259ENW, Created October 2010; Updated October 2010, Rev. 2