
IBM System Storage DS8800 Performance Whitepaper

December 2010

Andrew W. Lin
David Whitworth
Sonny E. Williams
Yan Xu

Document WP101799

Systems and Technology Group
2010, International Business Machines Corporation


IBM DS8800 Performance
Document: WP101799 IBM Corporation Page 2 of 33
Notices, Disclaimer and Trademarks
Copyright 2010 by International Business Machines Corporation.

No part of this document may be reproduced or transmitted in any form without written
permission from IBM Corporation. Product data has been reviewed for accuracy as of the date
of initial publication. Product data is subject to change without notice. This information may
include technical inaccuracies or typographical errors. IBM may make improvements and/or
changes in the product(s) and/or programs(s) at any time without notice. References in this
document to IBM products, programs, or services do not imply that IBM intends to make such
products, programs or services available in all countries in which IBM operates or does
business. THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS"
WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS
ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR
NON-INFRINGEMENT.

IBM shall have no responsibility to update this information. IBM products are warranted
according to the terms and conditions of the agreements (e.g., IBM Customer Agreement,
Statement of Limited Warranty, International Program License Agreement, etc.) under which
they are provided. IBM is not responsible for the performance or interoperability of any non-IBM
products discussed herein. The performance data contained herein was obtained in a
controlled, isolated environment. Actual results that may be obtained in other operating
environments may vary significantly. While IBM has reviewed each item for accuracy in a
specific situation, there is no guarantee that the same or similar results will be obtained
elsewhere. Statements regarding IBM's future direction and intent are subject to change or
withdrawal without notice, and represent goals and objectives only. The provision of the
information contained herein is not intended to, and does not, grant any right or license under
any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made,
in writing, to:


IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.


IBM, Enterprise Storage Server, ESCON, FICON, FlashCopy, System Storage, System z,
System p, z/OS, zEnterprise, Easy Tier, and DS8000 are trademarks of International Business
Machines Corporation in the United States, other countries, or both. Other company, product,
or service names may be trademarks or service marks of others.
Acknowledgements

The authors would like to thank the following colleagues for their comments and insight:
Lee La Frese IBM Systems & Technology Group, Tucson, AZ
Joseph Hyde IBM Systems & Technology Group, Tucson, AZ
Allen Marin IBM Systems & Technology Group, Boulder, CO
Vic Peltz IBM Systems & Technology Group, San Jose, CA
David Sacks IBM Systems & Technology Group, Chicago, IL
Christopher Sansone IBM Systems & Technology Group, Tucson, AZ
Leslie Sutton IBM Systems & Technology Group, Poughkeepsie, NY

A Note to the Reader

This White Paper assumes a familiarity with the general concepts of Enterprise Disk Storage
Systems. Readers unfamiliar with these topics should consult the References section at the
end of this paper.
Table of Contents

Executive Summary
1 Introduction
1.1 Audience
1.2 Overview of DS8800 Hardware Enhancements
2 Open Systems Performance
2.1 SPC-2 Benchmark Results
2.2 Maximum Throughput Benchmarks
2.3 OLTP Performance
3 Host Adapter Performance
4 Device Adapter Performance
5 Drive Performance
5.1 HDD Performance
5.2 SSD Performance
6 System z Performance
6.1 Maximum Throughput Benchmarks
6.2 OLTP Performance
7 Copy Services Performance
7.1 FlashCopy
7.2 Metro Mirror Establish
8 Conclusions
9 References
10 Appendix

Executive Summary

This paper describes the results of performance measurements conducted by the IBM
Enterprise Storage performance team in Tucson, AZ, using the IBM System Storage DS8800,
which is enhanced with POWER6+ server technology. The main objective of this paper is to
contrast the performance capacity of the DS8800 with that of the DS8700 and DS8300 models. In
addition to base functionality, performance comparisons are provided for many of the key
advanced features and functions offered by the DS8800, including:

Metro Mirror (synchronous mirroring)
FlashCopy (local subsystem copy)
FlashCopy SE (space-efficient local subsystem copy)
zHPF (System z high performance FICON I/O protocol)

Additionally, performance capability for both Fixed Block (FB) and Count Key Data (CKD) data
formats is included in this paper.

Compared to the DS8300, laboratory measurements show that the POWER6+ enhanced
DS8800 typically can achieve as much as a 200% performance improvement for sequential
bandwidth. Additionally, transaction processing workloads used in the laboratory
measurements achieved 25% or more improvement in I/O operations per second.

The DS8800 utilizes faster device and host adapters than previous DS8700 and DS8300
models. These enhanced adapters help improve throughput and reduce read miss latency,
especially with solid-state drive (SSD) technology. Laboratory measurements using SSDs in
the DS8800 have shown up to a 156% increase in IOPS (I/Os per second) compared to the
equivalent DS8300 configuration. When compared to SSDs on a DS8700, up to a 49%
increase in IOPS was achieved with a DS8800.

Finally, with a submission of 9,706 MBPS of aggregate sequential throughput, the DS8800
ranks #1 for the industry standard SPC-2 (Storage Performance Council [1]) benchmark. This is
200% faster than the DS8300 and 34% faster than the DS8700 on SPC-2 Aggregate
performance. Sequential bandwidth applications such as business intelligence, data
warehousing, video on demand, and critical batch processing workloads may observe a
significant elapsed time improvement.

[1] http://www.storageperformance.org
1 Introduction

The IBM System Storage DS8000 series is the flagship disk storage platform within the
IBM System Storage product portfolio. Introduced in October of 2010, the new DS8800 (IBM
2107 Model 951) represents the latest in this series of high-performance, high-capacity, flexible,
and resilient disk storage systems. The most visible change from the DS8300 and DS8700
models is the high density storage enclosure and frame design. The DS8800 provides storage
enclosure support for 24 small form factor (SFF), 2.5", 6 Gbps (gigabits per second) serial-
attached SCSI (SAS) drives in 2U (rack unit) of rack space. This new compact design enables
higher scalability, significant footprint reduction, and greater energy savings as compared to
previous enclosures which only supported 32 drives in 3.5U of rack space. The most notable
changes that enhance performance of the DS8800 include:

8 Gbps PCIe Host Adapter (HA)
8 Gbps PCIe RAID Device Adapters (DA)

Both the DS8800 and the DS8700 utilize the next generation IBM POWER6 processor within
their Central Electronics Complex (CEC) and have replaced the RIO-G I/O enclosure, used on
the DS8300, with the Peripheral Component Interconnect Express (PCIe) I/O enclosure.
Additionally, the POWER6 CEC in the DS8800 is based on the ultra-high frequency, dual-core
POWER6+ processor technology, which at 5.0 GHz is one of the industry leaders in
performance, scalability, and modularity.

The DS8800 delivers unprecedented performance and capacity growth while drawing upon the
rich design heritage of previous DS8000 Storage Systems. The DS8800 is a well-balanced
general purpose storage system that is equally at ease with bandwidth-intensive workloads and
I/O-intensive workloads with low I/O latency requirements. Compared to the performance of the
previous DS8000 system, the DS8700, the new processor, 8 Gbps host adapters, and 8 Gbps
device adapters help the DS8800 achieve a sequential read bandwidth performance
improvement of up to 20% and sequential write bandwidth performance improvement of up to
40%. Compared to the DS8300, the DS8800 achieves as much as a 200% bandwidth
improvement for sequential workloads. Additionally, transaction processing workloads achieve
as much as 25% or more performance improvement, compared to the DS8300 model.

The DS8800 is available with either a pair of POWER6+ 2-way processor complexes or a pair of
POWER6+ 4-way processor complexes. The measurement data in this paper reflects
performance of the POWER6+ 4-way model.
1.1 Audience

This technical paper was developed to assist IBM and IBM Business Partner field sales
representatives and technical specialists in understanding the performance characteristics of
the IBM 2107 Model 951 by contrasting the performance with that of the IBM 2107 Model 941
and the IBM 2107 Model 932. The IBM 2107 Model 932 is the DS8300, POWER5+ Turbo
model; the IBM 2107 Model 941 is the DS8700, POWER6 model. In this paper the IBM 2107
Model 951 will be referred to as the DS8800, the IBM 2107 Model 941 will be referred to as the
DS8700 while the IBM 2107 Model 932 will be referred to as the DS8300.
1.2 Overview of DS8800 Hardware Enhancements

The following hardware enhancements have been incorporated in the new DS8800:

CEC:

The DS8800 utilizes the POWER6+, 5.0 GHz dual-core processor in its CEC, as opposed to the
DS8700, which uses the POWER6, 4.7 GHz dual-core processor.

8 Gbps Host Adapters:

The DS8800 model offers enhanced connectivity with 4-port and 8-port fibre-channel (FC) /
FICON host adapters. The advanced host adapters support 8 Gbps, as opposed to 4 Gbps on
the DS8700 and DS8300, and offer up to 100% improvement in single-port throughput
performance and up to 400% improvement in single adapter (e.g. 4-port) throughput
performance. This can help deliver not only faster performance but also cost savings by
enabling a potential reduction in the number of host ports needed to support a given level of
performance.

The new host adapters support FICON attachment to FICON Express8 on zEnterprise 196
(z196) and System z10 (z10 EC, z10 BC).

8 Gbps Device Adapters:

The DS8800 offers 8 Gbps device adapters, as opposed to 2 Gbps on the DS8700 and
DS8300. These adapters are designed to provide improved IOPS performance, throughput, and
scalability. They are optimized for SSD technology and architected to support scalability growth over
the long term. These capabilities complement the POWER6+ server family to help provide
significant performance enhancements.

High-Density Storage Enclosure:

The DS8800 provides storage enclosure support for 24 SFF, 2.5", 6 Gbps SAS drives in 2U of
rack space, as opposed to 3.5", 3 Gbps on the DS8700 and DS8300. The smaller and more
efficient drives help improve the storage density for drives as compared to previous enclosures,
which support 32 drives in 3.5U of rack space.

Improved High-Density Frame Design:

The DS8800 can support a total of 1056 drives in three frames, as opposed to 1024 drives in
five frames on the DS8700 and DS8300, thereby supporting higher density
and helping to preserve valuable raised floor space in data center environments. The DS8800 is
designed to leverage best practices with hot/cold aisle data center design, drawing air for
cooling from the front of the rack and exhausting hot air at the rear of the rack. With this
improved cooling implementation, the reduced system footprint, and small form factor SAS-2
drives, a fully configured DS8800 consumes up to 40% less power than previous generations of
DS8000. The DS8800 base model supports up to 240 drives, with the first expansion frame
supporting up to 336 drives and the second expansion frame supporting up to 480 drives.
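
The frame capacities and enclosure densities quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the drive counts stated above, and the variable and function names are hypothetical.

```python
# Drive counts per frame as quoted in this section (DS8800).
DS8800_FRAMES = {"base": 240, "expansion_1": 336, "expansion_2": 480}


def drives_per_rack_unit(drives: int, rack_units: float) -> float:
    """Enclosure density: drives per U of rack space."""
    return drives / rack_units


total_drives = sum(DS8800_FRAMES.values())
new_density = drives_per_rack_unit(24, 2.0)   # DS8800 enclosure: 24 drives in 2U
old_density = drives_per_rack_unit(32, 3.5)   # previous enclosure: 32 drives in 3.5U

print(total_drives)              # 1056
print(round(new_density, 2))     # 12.0
print(round(old_density, 2))     # 9.14
print(round(total_drives / 3))   # 352 drives per frame on the DS8800
print(round(1024 / 5, 1))        # 204.8 drives per frame on the DS8700/DS8300
```

The three frame capacities sum to the 1056-drive maximum, and the new enclosure holds roughly 12 drives per rack unit compared with about 9 for the previous design.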
2 Open Systems Performance

The following section describes the results of various Open Systems performance
measurements and draws a comparison among the DS8800, DS8700 and the DS8300. A
detailed description of the configuration for these measurements can be found in Appendix B.
2.1 SPC-2 Benchmark Results

IBM has long been a strong proponent and supporter of the industry-standard benchmarks
developed by the Storage Performance Council for illustration of an objective measure of
storage system performance capabilities. An SPC-2 result for the DS8800 was published in
December of 2010. As of the publication date of this paper, the DS8800 holds the leading SPC-2
result for scale-up disk storage.

SPC-2 is designed to emulate applications that read and write large blocks of data in a
sequential manner. Examples of these classes of applications include archival, backup,
business intelligence, and video streaming. The SPC-2 benchmark consists of three workloads
as well as a composite score that aggregates all of the workload results into a single metric.

Figure 4 summarizes the SPC-2 results for the DS8800 and compares them to the published
SPC-2 benchmark results for DS8300 and DS8700.

[Figure 4 chart: Published SPC-2 Comparison, in MBPS, for the DS8300 (non-Turbo), DS8700, and DS8800. Series: Large File Processing, Large Database Query, Video on Demand, SPC-2 Aggregate.]
Figure 4: SPC-2 results for DS8300, DS8700 and DS8800

The DS8800 is 200% faster than the DS8300 and 34% faster than the DS8700 on SPC-2
Aggregate performance.
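
A figure such as "200% faster" is an improvement over a baseline, so it corresponds to a 3x throughput ratio rather than 2x. The following sketch, with hypothetical function names, converts the quoted percentages to ratios and derives the baseline throughput they imply for the 9,706 MBPS result cited in the Executive Summary. The derived baselines are back-of-envelope values computed from rounded percentages, not published SPC-2 results.

```python
def improvement_to_ratio(percent_faster: float) -> float:
    """'N% faster' means new = baseline * (1 + N/100)."""
    return 1.0 + percent_faster / 100.0


def implied_baseline(new_value: float, percent_faster: float) -> float:
    """Baseline implied by a new value and a quoted percent improvement."""
    return new_value / improvement_to_ratio(percent_faster)


DS8800_SPC2_MBPS = 9706.0  # aggregate figure quoted in the Executive Summary

for model, pct in (("DS8300", 200.0), ("DS8700", 34.0)):
    ratio = improvement_to_ratio(pct)
    base = implied_baseline(DS8800_SPC2_MBPS, pct)
    print(f"{model}: {ratio:.2f}x, implied baseline ~{base:.0f} MBPS")
```

The same conversion applies to the other percentage gains quoted throughout this paper.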
2.2 Maximum Throughput Benchmarks

For a benchmark utilizing 4KB reads or writes and a 100% cache hit ratio, the DS8800 shows
about a 5% improvement in throughput over the DS8700. Cache hit workloads show how fast
data can be moved to or from the system's cache. Since the system is retrieving the data from
cache, rather than the disk drives on the backend, this measurement is useful in determining the
performance difference in the system's host adapters and processors. The 5% gain achieved is
primarily a result of the upgrade from the POWER6 processor in the DS8700 to the POWER6+
processor in the DS8800. However, cache hit performance does not reflect the full picture of
the performance capabilities of a storage system, but rather provides a starting point to
understanding the gains that may be attainable by the system. Measurements illustrated in the
following sections will help complete that picture using workloads reflective of real production
systems.

Production systems virtually never exhibit 100% cache hits, but typically feature some
combination of disk and cache accesses. Therefore, the system's cache miss performance is
also of interest. With cache miss workloads, the DS8800 shows an improvement of about 5%
over the DS8700 for 4KB reads and writes. The 5% improvement is
mostly due to the upgraded POWER6+ processor. This balanced improvement in performance
between cache misses and cache hits for reads and writes is important. It suggests that typical
online transaction processing (OLTP) workloads will improve in a consistent manner when
migrated to DS8800 regardless of the workload mix.

Sequential workloads will generally stress the internal data paths and disk adapters in a storage
system. In Figure 1, we see that the new DS8800 device adapters provide substantial
improvements over the DS8700 and huge improvements over the DS8300's adapters.
Measurements showed a gain of 22% for Sequential Reads and 42% for Sequential Writes for
the DS8800 versus the DS8700. The DS8800 is 200% faster than the DS8300 on both
Sequential Reads and Writes.

[Figure 1 charts: bandwidth in GBPS for 64KB Sequential Reads and 64KB Sequential Writes. Series: DS8800, DS8700, DS8300.]
Figure 1: DS8800 vs. DS8700 vs. DS8300, Open systems Sequential IO

To realize the performance gains shown in Figure 1, a total of 8 DA-Pairs were used. With
normal configuration rules, this requires a system with at least 432 drives.
2.3 OLTP Performance

Online Transaction Processing workloads are designed to represent the type of mixed I/O
patterns seen in online applications. They are composed of a mixture of both reads and writes
with some cache hits and some cache misses. These workloads access data primarily in a
random fashion. Figure 2 compares the measured performance of the DS8800 against the
DS8700 and the DS8300 with a Database for Open systems (DBO) workload which represents
a typical OLTP environment. This workload is also referred to as 70/30/50 because it is
composed of 70% reads, 30% writes, and 50% read cache hits.

[Figure 2 chart: response time (ms) versus I/O rate (KIOPS). Series: DS8800 - 16 HA, 768 HDD; DS8700 - 32 HA, 992 HDD; DS8700 - 32 HA, 512 HDD; DS8300 - 16 HA, 512 HDD.]
Figure 2: DS8800 vs. DS8700 vs. DS8300, 4KB DB Open (70/30/50)

In Figure 2 we see the benefit from the upgraded processor resulting in better performance with
a DS8800 than with a DS8700 despite the DS8800 having fewer drives and host adapters than
the DS8700.

In Figure 3, we see a similar comparison using the 50/50/50 workload, which is much like the
70/30/50 OLTP workload but with a higher proportion of writes, i.e. 50% writes and
consequently 50% reads. The basic result is similar although in this case the DS8700's
additional drives provide slightly better response times than measured on the DS8800 at some
data points. The DS8700 tested had 992 drives and 32 host adapters while the DS8800 had
768 drives and 16 host adapters.
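
The workload names used in this section encode the I/O mix as read % / write % / read cache hit %. As a rough illustration, the sketch below breaks each mix into read hits, read misses, and writes as shares of total I/O. It assumes the hit percentage applies to reads only, as the description of the 70/30/50 workload states; the function name is hypothetical.

```python
def io_mix(read_pct: float, write_pct: float, read_hit_pct: float) -> dict:
    """Split a read/write/read-hit workload label into shares of total I/O."""
    if abs(read_pct + write_pct - 100.0) > 1e-9:
        raise ValueError("read and write percentages must sum to 100")
    read_hits = read_pct * read_hit_pct / 100.0
    return {
        "read_hit": read_hits,               # reads served from cache
        "read_miss": read_pct - read_hits,   # reads that go to the drives
        "write": write_pct,
    }


print(io_mix(70, 30, 50))  # {'read_hit': 35.0, 'read_miss': 35.0, 'write': 30}
print(io_mix(50, 50, 50))  # {'read_hit': 25.0, 'read_miss': 25.0, 'write': 50}
```

Under this reading, moving from 70/30/50 to 50/50/50 lowers the read-miss share from 35% to 25% of all I/Os while raising the write share from 30% to 50%.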







[Figure 3 chart: response time (ms) versus I/O rate (KIOPS). Series: DS8800 - 16 HA, 768 HDD; DS8700 - 32 HA, 992 HDD; DS8700 - 32 HA, 512 HDD; DS8300 - 16 HA, 512 HDD.]
Figure 3: DS8800 vs. DS8700 vs. DS8300, 4KB 50/50/50

3 Host Adapter Performance

The host adapter hardware in the DS8800 now features an 8 Gbps FCP and FICON
infrastructure as opposed to 4 Gbps on the DS8700/DS8300.

Figure 5 shows the measured gains in single port throughput performance of the DS8800 over
previous models. Here we see incremental performance gains of up to 108% for a single port
for read and write IOPS and an improvement up to 162% for a single HA. Significant bandwidth
performance increases are also shown in Figure 6. Single port bandwidth has increased at
least 100% to over 800 MBPS for both sequential read and write throughput. That increase
translates into at least a 255% improvement when comparing a single 8 Gbps HA on the
DS8800 as opposed to the DS8700's 4 Gbps HA.
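
To put the single-port bandwidth figure in context, the payload ceiling of a Fibre Channel link can be estimated from its signaling rate and line encoding. The sketch below assumes the standard values for 4 Gbps and 8 Gbps Fibre Channel (4.25 and 8.5 GBd with 8b/10b encoding). These values come from the Fibre Channel specifications rather than from this paper, the function name is hypothetical, and the estimate ignores frame and protocol overhead.

```python
def fc_payload_ceiling_mbps(line_rate_gbaud: float) -> float:
    """Approximate one-direction data rate in MB/s for an 8b/10b-encoded link."""
    encoded_bits_per_sec = line_rate_gbaud * 1e9
    # 8b/10b encoding carries 8 data bits in every 10 line bits.
    data_bits_per_sec = encoded_bits_per_sec * 8 / 10
    return data_bits_per_sec / 8 / 1e6  # bits -> bytes -> MB


for name, gbaud in (("4 Gbps FC", 4.25), ("8 Gbps FC", 8.5)):
    print(f"{name}: ~{fc_payload_ceiling_mbps(gbaud):.0f} MB/s per direction")
```

Under these assumptions, the measured single-port bandwidth of over 800 MBPS is close to what a single 8 Gbps link can carry in one direction.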

[Figure 5 chart: 4KB IOPS (KIOPS) for Single Port Read Hit, Single Port Write Hit, Single HA Read Hit, and Single HA Write Hit. Series: DS8800, DS8700, DS8300.]
Figure 5: DS8800 vs. DS8700 vs. DS8300, Open HA 4KB IOPS Performance

[Figure 6 chart: 64KB bandwidth (MBPS) for Single Port Seq Read, Single Port Seq Write, Single HA Seq Read, and Single HA Seq Write. Series: DS8800, DS8700, DS8300.]
Figure 6: DS8800 vs. DS8700 vs. DS8300, Open HA 64KB Bandwidth Performance

4 Device Adapter Performance

Solid-state drives can sustain significantly higher random IO rates than traditional spinning hard
disk drives (HDDs). When using SSDs with small block random workloads, the device adapters
are more likely to become the bottleneck than the drives themselves. Consequently,
measurements were taken with SSDs to demonstrate the random IO rate performance of the
DAs used in the DS8800. A detailed description of the configuration for these measurements
can be found in Appendix B.

Figure 7 shows read and write performance measured on both a single DA and on a DA pair.
Here we see significant gains in small block random I/O performance for both reads and writes
when comparing the DS8800 DA to the DS8700 and DS8300 DAs. The DA pair measurements
show a 37% gain in read throughput and a 49% gain in write throughput for the DS8800 versus
the DS8700 while the single DA measurements show a 71% and 80% gain in read and write
throughput respectively.

[Figure 7 chart: 4KB random IO rate (KIOPS) for Read and Write. Series: DA-Pair DS8800, DA-Pair DS8700, DA-Pair DS8300, Single DA DS8800, Single DA DS8700, Single DA DS8300.]
Figure 7: DS8800 vs. DS8700 vs. DS8300, Device Adapter with SSDs, 4KB Random IO

DA sequential performance is shown in Figure 8. The DA pair measurements show a 175%
gain in sequential read bandwidth and a 191% gain in sequential write bandwidth for the
DS8800 versus the DS8700. For a single DA, the gains for sequential read and write bandwidth
are 175% and 210% respectively. HDDs were used for these measurements. SSD
performance on these tests is about the same as with HDDs because the bottleneck is the DA's
bandwidth capability and not the drives.








[Figure 8 chart: 64KB sequential bandwidth (MBPS) for Read and Write. Series: DA-Pair DS8800, DA-Pair DS8700, DA-Pair DS8300, Single DA DS8800, Single DA DS8700, Single DA DS8300.]
Figure 8: DS8800 vs. DS8700 vs. DS8300, Device Adapter with SSDs, 64KB Sequential IO

5 Drive Performance

The DS8800 supports three SFF 2.5" SAS hard drive models: 146GB 15K RPM, 450GB 10K
RPM, and 600GB 10K RPM. It also supports SFF 2.5" 300GB SSDs. A detailed description of
the configuration for these measurements can be found in Appendix B.
5.1 HDD Performance

Figures 9-11 show the performance of a single rank with available RAID configurations of the
2.5" drives on a DS8800 and 15K RPM 3.5" fibre-channel drives on previous models. When
comparing the large form factor (LFF) DS8700 15K RPM drives against the small form factor
(SFF) DS8800 15K RPM drives for 4KB random reads, the DS8800 performed 13%-22% better,
while the DS8800 with the SFF 10K RPM drives performed 17%-26% worse. However, for typical
OLTP-like workloads, we expect the performance between SFF 15K and LFF 15K drives to be
similar. For 4KB random writes, the SFF 15K RPM and 10K RPM drives had 15%-20% and 26%-28%
less throughput, respectively, than the LFF 15K RPM drives in the DS8700. For sequential read
and write performance, both SFF 15K RPM and 10K RPM drives had equal or better
performance than that of the LFF 15K RPM drives, primarily aided by the DS8800's 8 Gbps
device adapters.










[Figure 9 charts: 4KB Random Reads and 4KB Random Writes, in IOPS, by RAID type (RAID-6, RAID-5, RAID-10). Series: 15K/2.5", 10K/2.5", 15K/3.5".]
Figure 9: 15K/10K RPM 2.5" vs. 15K RPM 3.5" Drives, 4KB Random Reads and Writes

[Figure 10 charts: Sequential Reads and Sequential Writes, in MBPS, by RAID type (RAID-6, RAID-5, RAID-10). Series: 15K/2.5", 10K/2.5", 15K/3.5".]
Figure 10: 15K/10K RPM 2.5" vs. 15K RPM 3.5" Drives, Sequential Reads and Writes

Figure 11 shows 4KB random read throughput and response time curves. The SFF 15K drives
had better response time and higher throughput than LFF 15K drives.

[Figure 11 chart: response time (ms) versus I/O rate (IOPS). Series: 15K/2.5", 10K/2.5", 15K/3.5".]
Figure 11: 15K/10K RPM 2.5" vs. 15K RPM 3.5" Drives, 4KB Random Reads, RAID-5 6+p

In general, the 15K 2.5" drives had better performance than 15K 3.5" drives, except for 4KB
Random Writes. The 10K 2.5" drives had excellent Sequential performance, but had lower
Random Read and Write performance as expected due to the lower rotational speed.

Array rebuild results are shown in Figure 12. Both 15K and 10K 2.5" drives had better rebuild
rates than 15K 3.5" drives, in line with their better sequential read and write performance.
The workload used for rebuild was an OLTP-like workload.

[Figure 12 chart: rebuild rate (MBPS) under No Workload, Moderate Workload, and Heavy Workload. Series: 15K/2.5", 10K/2.5", 15K/3.5".]
Figure 12: 15K/10K RPM 2.5" vs. 15K RPM 3.5" Drives, Rebuild Rate, RAID-5 6+p
5.2 SSD Performance

Solid-state drives can support significantly larger workloads than traditional spinning hard disk
drives. As shown in Figure 13, significant gains were seen in small block random I/O
performance for both reads and writes when comparing 2.5" SSDs in the DS8800 to 3.5" SSDs
in the DS8700. The improvement was about 71% in both read and write throughput.

[Figure 13 chart: IO rate (KIOPS) for 4KB Random Read and 4KB Random Write. Series: SFF 2.5", LFF 3.5".]
Figure 13: 2.5" 300GB SSDs on DS8800 vs. 3.5" SSDs on DS8700, 4KB Random IO


The sequential performance of SSDs is shown in Figure 14. The measurements here
show a gain of 27% in sequential read bandwidth for 2.5" SSDs versus 3.5" SSDs in the
DS8700. For sequential write bandwidth, the performance of 2.5" and 3.5" SSDs was
equivalent.


[Figure 14 chart: sequential bandwidth (MBPS) for Read Bandwidth and Write Bandwidth. Series: SFF and LFF SSDs.]
Figure 14: 2.5" 300GB SSDs on DS8800 vs. 3.5" SSDs on DS8700, Sequential IO

Figures 15 and 16 illustrate SSD performance for various random I/O patterns. The
measurements included random reads, random writes and a 50/50 mixture of reads and writes.
Figure 15 focuses on small block (4KB) I/O while Figure 16 covers large block I/O. As seen in
these charts, 2.5" SFF SSDs used on the DS8800 had better response time and throughput with
all the workloads and transfer sizes compared to the 3.5" LFF SSDs used on the DS8700.

[Figure 15 chart: volume response time (ms) versus I/O rate (IOPS). Series: SFF 2.5" 100% Read Miss, SFF 2.5" 50% Read/Write, LFF 3.5" 100% Read Miss, LFF 3.5" 50% Read/Write.]
Figure 15: 300GB/DS8800 vs. 146GB/DS8700, Single Rank (RAID-5) 4KB Random IO

[Figure 16 chart: volume response time (ms) versus I/O rate (IOPS). Series: SFF 2.5" 100% Read Miss, SFF 2.5" 50% Read/Write, LFF 3.5" 100% Read Miss, LFF 3.5" 50% Read/Write.]
Figure 16: 300GB/DS8800 vs. 146GB/DS8700, Single Rank (RAID-5)
Large Block Random IO (56 to 64KB)

An important observation from these results is that the performance improvement with SSDs for
large block writes is not as remarkable as seen with just reads or with small block I/O in general.
For example, while SSDs provide about 20 times the throughput of 15K RPM HDDs for 4KB
reads, the difference is only about 2 times for large block writes. This is a property of Enterprise
SSDs and not specific to the DS8000. Thus the best use cases for SSDs tend to be small block
I/Os that have a higher percentage of reads.

Figure 17 shows a single SSD array rebuild rate. The 2.5" SSDs in the DS8800 demonstrated
better rebuild rates than the 3.5" SSDs with or without host workload. The workload used during
rebuild was OLTP-like.

[Figure 17 chart: rebuild rate (MBPS) under No Workload, Moderate Workload, and Heavy Workload. Series: SFF and LFF SSDs.]
Figure 17: 2.5" 300GB SSDs on DS8800 vs. 3.5" SSDs on DS8700,
Rebuild Rate, RAID-5 6+p
6 System z Performance

The following section describes the results of various System z performance measurements and
draws a comparison between the DS8800 and both the DS8700 and the DS8300. A full
description of each workload used can be found in Appendix A. A detailed description of the
configuration for these measurements can be found in Appendix B. The term CKD below
refers to the data format of disk subsystems attached to System z hosts.

6.1 Maximum Throughput Benchmarks

The DS8800 showed improvement for 4KB Read and Write Hit benchmarks compared to the
DS8700 due to use of the new POWER6+ processors. Testing showed gains of 5% to 10%
from the DS8700 to the DS8800 for cache hits using both zHPF and traditional FICON protocol.
Cache miss benchmarks on the DS8800 also yielded gains over the DS8700. Testing of 4KB
Read and Write Miss performance showed a gain of up to about 14% in terms of IOPS using
both zHPF and FICON protocol.

The large block Sequential Read and Write benchmarks best exploit the system's internal PCIe
fabric as seen in Figure 18. For sequential performance, our testing showed a 15%
improvement for reads and a 7% improvement for writes on the DS8800 compared to the
DS8700.

[Figure 18 charts: bandwidth in GBPS for 6x27KB Sequential Reads and 6x27KB Sequential Writes. Series: DS8800, DS8700, DS8300. Chart annotations: "161% over DS8300" and "171% over DS8300".]
Figure 18: DS8800 vs. DS8700 vs. DS8300, System z Sequential IO
[Figure 19 chart: response time (ms) versus I/O rate (KIOPS); series: DS8800 zHPF - 16 HA, 384 HDD; DS8700 zHPF - 32 HA, 512 HDD; DS8800 FICON - 16 HA, 384 HDD; DS8700 FICON - 32 HA, 512 HDD]

6.2 OLTP Performance

Figure 19 compares the measured performance results of the DS8800 with that of the DS8700
for the Database for System z (DBz) workload. This workload is designed to be comparable to
online-transaction processing. With the DBz workload, we observed an improvement in
throughput of about 4% from the DS8700, which is achieved with fewer drives and host
adapters but the same number of host channels.

Figure 19: DS8800 vs. DS8700, CKD 4KB DBz

Figure 20 also shows the results for DBz for the DS8800 compared to the DS8300. The
performance improvement in this comparison is 24% using zHPF and 47% using FICON. The
POWER6+ processors in the DS8800 are the main contributors to the significant increase in
throughput compared to the DS8300.

[Figure 20 chart: response time (ms) versus I/O rate (KIOPS); series: DS8800 zHPF - 16 HA, 384 HDD; DS8300 zHPF - 16 HA, 512 HDD; DS8800 FICON - 16 HA, 384 HDD; DS8300 FICON - 16 HA, 512 HDD]

Figure 20: DS8800 vs. DS8300, CKD 4KB DBz

7 Copy Services Performance

The following section describes the results of various Copy Services performance
measurements and draws a comparison among the DS8800, the DS8700 and the DS8300. A
detailed description of each workload can be found in Appendix A. A detailed description of
the configuration for these measurements can be found in Appendix B.

7.1 FlashCopy

Both FlashCopy and Space Efficient FlashCopy performance were examined using a System z
(CKD) environment. For Open systems, we observed FlashCopy performance equal to or
better than that of System z. For an explanation of these two versions of FlashCopy,
please see Appendix C.

For single volume Background Copy, results from the DS8800 showed a 10% improvement
over the DS8700 and over 77% improvement when compared to the DS8300. A full-box
Background Copy with 8 DA-Pairs (with 48 ranks in DS8800 and 64 ranks in DS8700/DS8300)
was also executed, and the DS8800 showed a 50% improvement over the DS8700 and a 90%
improvement over the DS8300. Lab measurements show near-linear scaling from 2 DA-Pair to
8 DA-Pair configurations. The improvement of the DS8800 over the DS8700 benefited from faster
Device Adapters. Background Copy results are shown in Figure 21.
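
One way to quantify "near-linear scaling" is to divide the observed speedup by the ideal speedup implied by the increase in DA-Pairs. The sketch below is illustrative only: the function name and the sample rates are assumptions, not measurements from this paper.

```python
def scaling_efficiency(small_units, small_rate, large_units, large_rate):
    """Observed speedup divided by ideal (linear) speedup; 1.0 is perfectly linear."""
    ideal_speedup = large_units / small_units
    observed_speedup = large_rate / small_rate
    return observed_speedup / ideal_speedup


# Hypothetical data rates in GB/s for 2 and 8 DA-Pairs (NOT values from this paper).
efficiency = scaling_efficiency(small_units=2, small_rate=1.0,
                                large_units=8, large_rate=3.8)
print(f"scaling efficiency: {efficiency:.2f}")
```

A result close to 1.0 corresponds to what the text calls near-linear scaling; values well below 1.0 would indicate a bottleneck outside the Device Adapters.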

[Figure 21 charts: panel "Single Volume Copy", data rate (MBPS) for DS8800, DS8700, and DS8300; panel "Background Copy", bandwidth (GBPS) with 8 DA-Pairs and 2 DA-Pairs for DS8800, DS8700, and DS8300]

Figure 21: DS8800 vs. DS8700 vs. DS8300, FlashCopy Background Copy

Figure 22 shows the host performance results of running DBz at 60% of the maximum
throughput seen in section 6.2 with either FlashCopy or Space Efficient FlashCopy configured
on the DS8800. The Copy-Source-to-Target performance for this workload showed no
significant difference between the DS8800 and the DS8700 with 8 DA Pairs. With 6 DA Pairs,
the DS8700 performed at least 34% better than the DS8300. The chart also
indicates that for this workload on the DS8700, some incremental improvement in throughput
was observed comparing 8 DA Pairs with 6 DA Pairs when running with or without FlashCopy.
However, Space Efficient FlashCopy does not show benefit for this workload from additional DA
Pairs. On the DS8300 and DS8800, measurements with both 8 DA Pairs and 6 DA Pairs are not
available; however, it is expected that a similar amount of change in DBz throughput would be
observed with 8 DA Pairs versus 6 DA Pairs as was seen with the DS8700.

[Figure 22 chart: throughput (KIOPS) with No FlashCopy, Standard FlashCopy, and Space Efficient FlashCopy; series: DS8800 - 8 DA Pairs; DS8700 - 8 DA Pairs; DS8700 - 6 DA Pairs; DS8300 - 6 DA Pairs]

Figure 22: DS8800 vs. DS8700 vs. DS8300, FlashCopy 4KB DBz at 60% Max Throughput

Results of running a Sequential Write workload with FlashCopy active are shown in Figure 23.
With 8 DA Pairs, the Copy-Source-to-Target measured performance is 14% higher for
FlashCopy volumes and 9% higher for Space Efficient FlashCopy volumes on the DS8800 than
on the DS8700. Similar tests were done on the DS8700 and DS8300 with 6 DA Pairs; there, the
performance is 48% higher for FlashCopy volumes and 37% higher for Space Efficient
FlashCopy volumes on the DS8700 than on the DS8300. Figure 23 also shows performance
improvement on the DS8700 when using 8 DA Pairs versus 6 DA Pairs. Although
measurements are not available, it is likely that a similar improvement in Sequential Write
throughput would be seen when comparing 8 DA Pairs with 6 DA Pairs on both the DS8800 and
the DS8300.

Keep in mind that this is a very intensive, worst-case FlashCopy environment where each host
I/O causes data to be moved in the background because all of them are sequential writes to the
source volumes. Typical production environments would likely see much less of an effect on
host throughput due to FlashCopy. This also reinforces that Space Efficient FlashCopy is not a
good use case for workloads where the source volumes will be subjected to heavy sequential
writes.
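
The worst-case behavior described above can be illustrated with a toy copy-on-write model. This is a minimal sketch under stated assumptions, not the DS8800 implementation: the class name, the track granularity, and the counters are all hypothetical, and writes to the target (which also consume Repository space) are not modeled.

```python
class PointInTimeCopy:
    """Toy copy-on-write relationship between a source volume and its target."""

    def __init__(self, source_tracks):
        self.source = list(source_tracks)  # current contents of the source volume
        self.preserved = {}                # track number -> point-in-time data kept for the target
        self.copies = 0                    # number of source-to-target track copies performed

    def host_write(self, track, data):
        # The first update to a track forces the old data to be copied before the write lands.
        if track not in self.preserved:
            self.preserved[track] = self.source[track]
            self.copies += 1
        self.source[track] = data

    def read_target(self, track):
        # The target returns preserved data if it exists, otherwise the unchanged source track.
        return self.preserved.get(track, self.source[track])


original = [f"old-{t}" for t in range(8)]

# Sequential writes touch every track once, so every host write triggers a copy.
sequential = PointInTimeCopy(original)
for t in range(8):
    sequential.host_write(t, f"new-{t}")

# Repeated writes to one track trigger a copy only on the first update.
hot = PointInTimeCopy(original)
for i in range(8):
    hot.host_write(0, f"new-{i}")

print(sequential.copies, hot.copies)  # prints: 8 1
```

In this model the sequential-write case performs one copy per host write, while a workload that rewrites the same tracks pays the copy cost only once per track, which is why typical production workloads are expected to see a smaller effect.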

[Figure 23 chart: bandwidth (GBPS) with No FlashCopy, Standard FlashCopy, and Space Efficient FlashCopy; series: DS8800 - 8 DA Pairs; DS8700 - 8 DA Pairs; DS8700 - 6 DA Pairs; DS8300 - 6 DA Pairs]

Figure 23: DS8800 vs. DS8700 vs. DS8300, FlashCopy 6x27K Sequential Write

Figure 24 is intended to simulate a typical customer application. This test is designed to
validate that host sequential reads and background copies may coexist on a RAID rank without
one dominating the other. Run individually on the DS8800, single volume Background Copy
improved 9% and 77% over the DS8700 and DS8300, respectively, and single stream
sequential read improved 22% and 71%, respectively. When
running the two tasks concurrently, the performance improvement was 36% and 72% in
Background-Copy and 59% and 73% in Sequential Read respectively on the DS8800 as
compared to the same experiment measured on the DS8700 and DS8300.

[Figure 24 charts: data rate (MBPS) for DS8800, DS8700, and DS8300; panel "Independently": Single Volume Background Copy and Single Stream Sequential Read; panel "Concurrently": Background Copy and Sequential Read]

Figure 24: DS8800 vs. DS8700 vs. DS8300, Background Copy with
Concurrent Sequential Read

7.2 Metro Mirror Establish

Figure 25 shows the comparison of Metro Mirror establish bandwidth between the DS8800 and
the DS8700/DS8300. For these tests, the data link between the two DS8800s or
DS8700/DS8300s was a direct connection. With the new 8 Gbps Host Adapters, Metro Mirror
establish bandwidth of the DS8800 improved by over 70% compared with
measurements from the DS8700/DS8300. Good scaling of bandwidth with the number of
connected paths was seen for both the DS8800 and the DS8700/DS8300. This good scaling
would facilitate data migration from one DS8800 or DS8700/DS8300 to another when a
customer uses more Metro Mirror paths.

[Figure 25 chart: Metro Mirror establish bandwidth (MBPS) with 1 Path, 2 Paths, and 4 Paths; series: DS8800 and DS8700/DS8300]

Figure 25: DS8800 vs. DS8700/DS8300, Metro Mirror Establish Bandwidth

8 Conclusions

The DS8800 is the next chapter in IBM's flagship enterprise disk platform. Built on 50+ years of
enterprise class innovation, the new DS8800 enables much higher performance and scalability,
while preserving client investments in prior DS8000 models.

Additionally, the DS8800 illustrates IBM's focus on constant improvement of technology and
performance of its storage products. The DS8800 offers:

Faster processor speeds that result in unprecedented performance and capacity growth
while drawing upon the rich heritage of previous DS8000 Storage Systems.
Faster adapters and buses which enable the DS8800 to achieve world class sequential
bandwidth. The SPC-2 benchmark results are ranked #1 overall.
A well-balanced general purpose storage system that can effectively support bandwidth-
intensive workloads and I/O-intensive workloads.


9 References

[1] La Frese, L., Lin, A., Martin, J., Williams, S., and Xu, Y. IBM System Storage
DS8700 Performance Whitepaper. August 2010.

[2] La Frese, L., Sutton, L., and Whitworth, D. IBM System Storage DS8000 with SSDs:
An In-Depth Look at SSD Performance in the DS8000. April 2009.

[3] Roll, M. Understanding Storage Performance: Concepts, Issues and FAQ. 2006.

[4] Lin, A., and Peltz, V. IBM Global Mirror Performance Study. August 2009.

[5] La Frese, L., Hossain, K., Hyde, J., Lin, A., McNutt, B., Sansone, C., Sutton, L., Xu, Y.,
Zhang, Y. IBM System Storage DS8700 Performance with Easy Tier. May
2010.

10 Appendix

10.A Appendix A: Workload Characteristics

Read Hit (RH): 100% Random read requests to cache. A "read hit" test issues read
requests repeatedly to a small group of blocks or records. The number of affected blocks
is small enough to ensure that the entire set can be retained in cache at the same time.
Hence, all requests in the read hit test are serviced out of cache. Read Hit tests
generally give the highest I/O rate for a storage system.
Read Miss (RM): 100% Random read requests to disk. A "read miss" test issues read
requests at random across a storage area much larger than the available cache size.
This test is designed in such a way that the probability of finding the requested data in
cache is nearly zero. Read Miss tests usually serve engineering purposes and are not
typical of customer environments.
Write Hit (WH): 100% Random write requests to cache. A "write hit" test issues write
requests repeatedly to a small group of blocks or records. The number of written blocks
is small enough to ensure that the entire set can be retained in cache at the same time.
It is possible that the controller may defer all destaging until after the completion of a
"write hit" test. This allows throughput on the front end to be isolated and benchmarked.
These types of workloads are for engineering purposes and are not typical of customer
environments.
Write Miss (WM): 100% Random write requests to disk. A "write miss" test issues write
requests at random across a storage area much larger than the available cache size.
This test is designed in such a way that the probability of writing a block a second time,
before that block has been destaged from cache, is almost zero. For this reason, the
number of destage operations is approximately equal to the number of writes.
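
The difference between the hit and miss tests above comes down to the size of the accessed area relative to the cache. The following sketch replays uniformly random reads through a simple LRU cache to show both extremes. The LRU policy, the block counts, and the function name are illustrative assumptions and do not describe the DS8800 cache algorithms.

```python
import random
from collections import OrderedDict


def read_hit_ratio(working_set_blocks, cache_blocks, requests=20_000, seed=1):
    """Replay uniformly random reads through an LRU cache and return the hit ratio."""
    rng = random.Random(seed)
    cache = OrderedDict()  # block number -> None, ordered from least to most recently used
    hits = 0
    for _ in range(requests):
        block = rng.randrange(working_set_blocks)
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark the block as most recently used
        else:
            cache[block] = None            # stage the block into cache
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / requests


# "Read hit" style test: the accessed area fits entirely in cache.
hit_test = read_hit_ratio(working_set_blocks=100, cache_blocks=1_000)
# "Read miss" style test: the accessed area is far larger than the cache.
miss_test = read_hit_ratio(working_set_blocks=10_000_000, cache_blocks=1_000)

print(f"read hit test hit ratio:  {hit_test:.3f}")
print(f"read miss test hit ratio: {miss_test:.3f}")
```

With the small working set, only the first touch of each block misses, so the hit ratio approaches 1; with the very large area, the hit ratio is close to zero, matching the intent of the two test definitions.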
10.A.1 Open Workloads
70/30/50: An open workload that is similar to typical OLTP applications. It is
characterized by 70% reads, 30% writes, a 50% read hit ratio, an approximate destage
rate of 17% of all I/Os and a 4KB block transfer size.
50/50/50: An open workload that is similar to very write-intensive OLTP applications. It is
characterized by 50% reads, 50% writes, a 50% read hit ratio, an approximate destage
rate of 17% of all I/Os and a 4KB block transfer size.
Sequential: Open Sequential workloads provide for reading or writing data records in
sequential order, one after the other. They are either 100% reads or writes using 64KB
block data transfers to disk, similar to data warehouse scan/load operations. 256K and 1
MB large transfer block sizes have also been used, similar to video imaging operations.
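
The three-number workload names encode a read percentage, a write percentage, and a read hit ratio. As a rough worked example, the parameters above can be turned into an estimate of back-end operations per host I/O. The class and method names are hypothetical, and the estimate deliberately ignores any RAID write penalty, so it is a simplified sketch rather than a sizing method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    read_fraction: float     # share of host I/Os that are reads
    read_hit_ratio: float    # share of reads served from cache
    destage_fraction: float  # destages as a share of all host I/Os
    block_kb: int            # transfer size in KB

    def read_misses_per_io(self) -> float:
        # Reads that must be staged from disk, per host I/O.
        return self.read_fraction * (1.0 - self.read_hit_ratio)

    def backend_ops_per_io(self) -> float:
        # One stage per read miss plus one operation per destage (RAID penalty ignored).
        return self.read_misses_per_io() + self.destage_fraction


oltp = Workload("70/30/50", read_fraction=0.70, read_hit_ratio=0.50,
                destage_fraction=0.17, block_kb=4)
write_heavy = Workload("50/50/50", read_fraction=0.50, read_hit_ratio=0.50,
                       destage_fraction=0.17, block_kb=4)

for w in (oltp, write_heavy):
    print(f"{w.name}: {w.read_misses_per_io():.2f} read misses, "
          f"{w.backend_ops_per_io():.2f} back-end ops per host I/O")
```

For 70/30/50 this gives 0.35 read misses and about 0.52 back-end operations per host I/O under the stated simplification.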
10.A.2 System z workloads
DB z/OS: DB z/OS (formerly known as Cache Standard) is a System z workload that
simulates a typical OLTP environment on the mainframe. It is characterized by 75%
reads, 25% writes, a 4KB block transfer size and skewed I/O rates to different volumes.
DB z/OS has a cache read hit ratio that varies with the configuration's cache to
backstore ratio, but a frequently used value is 72%. The destage rate is not constant, but
common values are between 14 - 17% of all I/Os.
Cache Hostile: This workload is characterized by 67% reads, 33% writes, skewed I/O
and a 4KB block transfer size. It has a write destage rate of 50% and a destage rate of
18.3% of all I/Os. The cache read hit ratio is adjustable depending on testing
requirements and the cache/backstore ratio.
Cache Friendly: This workload is characterized by 83% reads, 17% writes, skewed I/O
and a 4KB block transfer size. It has a write destage rate of 50% and a destage rate of
7.5% of all I/Os. The cache read hit ratio is adjustable depending on testing
requirements and the cache/backstore ratio, but generally uses a value of 83%.
Sequential: These workloads are similar to typical batch processing. 100% Read or
100% Write, 6 x 27K transfers as indicated to/from disk with a sequential access pattern.
10.B Appendix B: DS8800 Hardware Configurations

10.B.1 Configuration for Open Systems Measurements
RAID-5 measurements were taken with a total of up to 768 146GB 15K RPM drives.
256 GB cache, 16 Host Adapters.
Host workloads were run on an IBM Power 770 host (AIX 6.1.4.0) with 16 8Gb Fibre
Channels.
10.B.2 Configuration for System z Measurements
RAID-5 measurements were taken with 384 146 GB 15K RPM drives.
256 GB cache, 16 Host Adapters.
Host workloads were run on a System z 2097 with 32 8Gb Fibre Channels.
10.B.3 Configuration for Drive Performance Measurements
2.5" 15K RPM HDDs are of size 146GB.
2.5" 10K RPM HDDs are of size 600GB.
3.5" 15K RPM HDDs are of size 300GB, 450GB, or 600GB.
2.5" SSDs are of size 300GB.
3.5" SSDs are of size 146GB or 600GB.
10.B.4 Configuration for FlashCopy
384 146 GB RPM drives across 8 DA-Pairs were used for source volumes and another
384 146 GB RPM drives across the same 8 DA-Pairs were used for target volumes.
256 GB cache, 16 Host Adapters.
Host workloads were run on a System z 2097 with 32 8Gb Fibre Channels.
10.C Appendix C: Definitions and Methodologies

Open system: Sometimes referred to as a distributed system; often attached to an
AIX/UNIX host and using the Fixed Block data format.
System z: Attached to a z/OS host and uses the CKD data format.
SCSI: Small Computer System Interface. A set of standards for physically connecting
and transferring data between computers and peripheral devices.
IOPS: input/output operations per second.
RAID-5: A popular RAID implementation that optimizes cost effective performance while
emphasizing use of available capacity through data striping. RAID-5 provides fault
tolerance for one failed disk drive. This scheme uses XOR parity for redundancy. Data is
striped across all drives in the array and parity is distributed across all the drives.
RAID-10: Combines two schemes: RAID-0 (data striping) and RAID-1 (mirroring).
Volume data is striped across several drives and the first set of disk drives is mirrored to
an identical set. Since redundancy is achieved through mirroring, there is no parity in
RAID-10. RAID-10 optimizes high performance while maintaining fault tolerance for disk
drive failures. It can tolerate at least one, and in most cases, multiple disk failures.
FlashCopy: Uses normal volumes as target volumes for FlashCopy. These target
volumes have the same size (or larger) as their corresponding source volumes.
Space Efficient FlashCopy (SEFC): Uses volumes formatted for SEFC as the target
volumes for FlashCopy. These volumes, known as Space Efficient volumes, have a
virtual size equal to the source volume size. However, physical space is not allocated for
Space Efficient volumes when the volumes are created and the FlashCopy initiated.
Instead, space is allocated in a Repository when the first update is made to original
tracks on the source volumes and the tracks are copied to the SEFC target volume.
Writes to the SEFC target will also consume Repository space. Space Efficient
FlashCopy can be a cost-effective method for replicating data locally.
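
The XOR parity scheme in the RAID-5 definition above can be sketched for a single stripe of a 6+P array. This is a minimal illustration: the helper name and strip contents are made up, and the rotation of parity across drives is not modeled.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of a 6+P array: six data strips plus one parity strip.
data = [bytes([i] * 8) for i in range(1, 7)]
parity = xor_blocks(data)

# Simulate losing one data strip and rebuilding it from the survivors plus parity.
lost = 3
survivors = [strip for i, strip in enumerate(data) if i != lost]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[lost])  # prints: True
```

Because XOR is its own inverse, any single missing strip in the stripe can be recovered the same way, which is why RAID-5 tolerates exactly one failed drive per array.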