Section Objectives
POWER5 Technology
POWER5
[Figure: POWER5 chip — two SMT cores, 1.9 MB L2 cache, on-chip L3 directory and memory controller, GX+ bus, chip-chip / MCM-MCM / SMP link fabric]
Enhanced memory subsystem
Improved performance
Simultaneous Multi-Threading
Hardware support for Shared Processor Partitions (Micro-Partitioning)
Dynamic power management
Compatibility with existing POWER4 systems
Enhanced reliability, availability, serviceability

Enhanced memory subsystem
Improved L1 cache design
– 4-way set associative d-cache
– New replacement algorithm (LRU vs. FIFO)
Larger L2 cache
– 1.9 MB, 10-way set associative
Improved L3 cache design
– 36 MB, 12-way set associative
– L3 on the processor side of the fabric
– Satisfies L2 cache misses more frequently
– Avoids traffic on the interchip fabric
On-chip L3 directory and memory controller
– L3 directory on the chip reduces off-chip delays after an L2 miss
– Reduced memory latencies
Improved pre-fetch algorithms
[Figure: POWER4 vs. POWER5 system structures — L2 and L3 caches, L3 directories, fabric controllers, and memory; in POWER5 the L3 cache moves to the processor side of the fabric and the memory controller moves on chip, enabling larger SMPs (64-way)]
What is it?
Why would I want it?
POWER4 pipeline
[Figure: POWER4 pipeline — branch pipeline, instruction-fetch pipeline (IF, IC, BP), and load/store pipeline (MP, ISS, RF, EA, DC, Fmt, WB, Xfer, CP)]
POWER4 instruction pipeline (IF = instruction fetch, IC = instruction cache, BP = branch predict, D0 = decode stage 0, Xfer = transfer, GD = group dispatch, MP = mapping, ISS = instruction issue, RF = register file read, EX = execute, EA = compute address, DC = data caches, F6 = six-cycle floating-point execution pipe, Fmt = data format, WB = write back, and CP = group commit)
Multi-threading evolution
[Figure: execution-unit occupancy (FX0, FX1, LS0, LS1, FP0, FP1, BFX, CRL; i-cache) across processor cycles for coarse-grained multi-threading (threads swap on long-latency events) and fine-grained multi-threading (threads alternate cycle by cycle)]
POWER5 pipeline
[Figure: POWER5 pipeline — same stage flow as the POWER4 pipeline: branch pipeline, instruction-fetch pipeline (IF, IC, BP), and load/store pipeline (MP, ISS, RF, EA, DC, Fmt, WB, Xfer, CP)]
POWER5 instruction pipeline (IF = instruction fetch, IC = instruction cache, BP = branch predict, D0 = decode stage 0, Xfer = transfer, GD = group dispatch, MP = mapping, ISS = instruction issue, RF = register file read, EX = execute, EA = compute address, DC = data caches, F6 = six-cycle floating-point execution pipe, Fmt = data format, WB = write back, and CP = group commit)
[Figure: single-threaded operation on the POWER4 pipeline — execution-unit occupancy (FX0, FX1, LS0, LS1, FP0, FP1, BFX, CRL; i-cache) across processor cycles]
Micro-Partitioning
Micro-Partitioning overview
Processor terminology
[Figure: processor terminology — installed physical processors divide into deconfigured, inactive (CUoD), dedicated, and shared capacity; shared capacity backs virtual processors (entitled capacity) in shared-processor partitions, and each processor presents logical processors when SMT is on; examples show shared-processor partitions with SMT off and on, and a dedicated-processor partition with SMT off]
• Capacity weight
– Dedicated memory
• Minimum of 128 MB, then 16 MB increments
– Physical or virtual I/O resources
Processing units
– 1.0 processing unit represents one physical processor
– Minimum requirement per partition: 0.1 processing units
Entitled processor capacity
– Commitment of capacity that is reserved for the partition
– Sets the upper limit of processor utilization for capped partitions
– Each virtual processor must be granted at least 1/10 of a processing unit of entitlement
Shared processor capacity is always delivered in terms of whole physical processors (1 physical processor = 1.0 processing units)
[Figure: example partitions with 0.5 and 0.4 processing units of entitled capacity]
Capped partition
– Not allowed to exceed its entitlement
Uncapped partition
– Allowed to exceed its entitlement
Capacity weight
– Used for prioritizing uncapped partitions
– Value 0–255
– Value of 0 referred to as a “soft cap”
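The entitlement and weight rules above can be expressed as a short validity check. This is an illustrative sketch, not an IBM tool; the function name and error strings are invented for this example.

```python
# Hypothetical validator for a shared-processor partition definition,
# encoding the Micro-Partitioning rules described above.

def validate_partition(entitlement, virtual_procs, capped, weight=128):
    """Return a list of rule violations (an empty list means valid)."""
    errors = []
    # A shared-processor partition needs at least 0.1 processing units.
    if entitlement < 0.1:
        errors.append("entitlement below 0.1 processing units")
    # Each virtual processor must be granted at least 1/10 processing unit.
    if entitlement < 0.1 * virtual_procs:
        errors.append("fewer than 0.1 processing units per virtual processor")
    # Capacity weight prioritizes uncapped partitions; valid range is 0-255.
    if not capped and not 0 <= weight <= 255:
        errors.append("capacity weight must be 0-255")
    return errors
```

For example, 0.5 processing units spread over six virtual processors violates the 1/10-per-virtual-processor rule, while the same entitlement over four virtual processors is fine.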
Uncapped (16 PPs / 16 VPs / 9.5 CE)
[Chart: processing units consumed (0–16) vs. elapsed time, minutes 1–30]
16 virtual processors. Uncapped. Can use all available resources. The workload requires 26 minutes to complete.
Uncapped (16 PPs / 12 VPs / 9.5 CE)
[Chart: processing units consumed (0–16) vs. elapsed time, minutes 1–30]
12 virtual processors. Even though the partition is uncapped, it can only use 12 processing units. The workload now requires 27 minutes to complete.
Capped (16 PPs / 12 VPs / 9.5 CE)
[Chart: processing units consumed (0–16) vs. elapsed time, minutes 1–30]
Capped. The partition cannot exceed its 9.5 processing units of entitled capacity.
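The scenarios above reduce to a simple rule for a partition's usable capacity. A minimal sketch, assuming idle capacity is available in the pool (the function name is illustrative):

```python
# How many processing units a partition can actually consume.
# Assumes the shared pool has idle capacity to give away.

def usable_capacity(physical, virtual, entitlement, capped):
    """Upper bound on processing units the partition can consume."""
    if capped:
        return entitlement          # may never exceed its entitlement
    # Uncapped: may exceed entitlement, but each virtual processor can
    # consume at most one physical processor's worth of capacity.
    return min(virtual, physical)

# 16 PPs / 16 VPs / 9.5 CE, uncapped -> all 16 processing units
# 16 PPs / 12 VPs / 9.5 CE, uncapped -> only 12 processing units
# 16 PPs / 12 VPs / 9.5 CE, capped   -> 9.5 processing units
```

This is why the 12-VP uncapped run tops out at 12 processing units even with 16 physical processors installed.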
Dynamic LPAR
[Figure: the HMC communicates over a VLAN with the POWER Hypervisor firmware to drive dynamic LPAR operations]
Enhanced distributed switch
[Figure: two POWER5 chips — each with two SMT cores, 1.9 MB L2 cache, L3 directory, and memory controller — connected by the chip-chip / MCM-MCM / SMP link fabric]
Virtual I/O
Dynamic LPAR
Capacity Upgrade on Demand
[Figure: client capacity growth — planned versus actual; POWER5 chips (SMT cores, 1.9 MB L2 caches, L3 directories, memory controllers, CPU 2 / CPU 3) with disk and LAN I/O]
… dispatch interval will receive half its CE at the start of the next dispatch interval.
Shared processor pool
[Figure: shared processor pool — multiple POWER5 chips (SMT cores, 1.9 MB L2 caches, L3 directories, memory controllers) joined by the chip-chip / MCM-MCM / SMP link fabric]
Affinity scheduling
Example
[Figure: physical processor 0 across two 10 ms POWER Hypervisor dispatch-interval passes, alternating virtual processors of LPAR 1 (VP 1) and LPAR 3 (VP 2, VP 0) with idle gaps]
LPAR1: capacity entitlement = 0.8 processing units; virtual processors = 2 (capped)
LPAR2: capacity entitlement = 0.2 processing units; virtual processors = 1 (capped)
LPAR3: capacity entitlement = 0.6 processing units; virtual processors = 3 (capped)
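For the example LPARs, each virtual processor's share of a 10 ms dispatch pass is the partition's entitlement divided among its virtual processors. A minimal sketch (the function name is invented for illustration):

```python
# Per-virtual-processor CPU time in one POWER Hypervisor dispatch pass.

DISPATCH_MS = 10  # length of one dispatch-interval pass in milliseconds

def vp_slice_ms(entitlement, virtual_procs):
    """Milliseconds of physical CPU each virtual processor is owed per pass."""
    return DISPATCH_MS * entitlement / virtual_procs

# LPAR1: 0.8 PU over 2 VPs -> 4.0 ms per VP per pass
# LPAR2: 0.2 PU over 1 VP  -> 2.0 ms per pass
# LPAR3: 0.6 PU over 3 VPs -> 2.0 ms per VP per pass
```

Because all three partitions are capped, these slices are also the most CPU time each virtual processor can receive per pass.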
Virtual SCSI
[Figure: VSCSI server adapters in the I/O server partition map physical disks (SCSI, FC) through the LVM to VSCSI client adapters and virtual disks (hdisks) in client partitions]
Dynamic LPAR operations allowed.
Virtual devices
– Are defined as LVs in the I/O server partition
– Normal LV rules apply
– Appear as real devices (hdisks) in the client partition
Performance considerations
Limitations
Implementation guideline
LVM mirroring
Multipath I/O
[Figure: a client partition mirrors its LVM data across two virtual disks (hdisks), each served by a different Virtual I/O Server through its own VSCSI server adapter and physical FC adapter, via the POWER Hypervisor]
This configuration protects virtual disks in a client partition against:
– Failure of one physical FC adapter in one I/O server
– Failure of one Virtual I/O Server
Physical disk is assigned as a …
Virtual Ethernet
How it works
[Flowchart: virtual Ethernet adapter frame delivery — if the destination MAC is in the table, the configured associated switch port exists, and the VLAN number matches the table, deliver the frame; otherwise pass it to the trunk adapter if one is defined; otherwise drop the packet]
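One plausible reading of the delivery decision, sketched in Python. The table layout (`mac_table`, `port_configured`, `vlans`) is a hypothetical stand-in for hypervisor state, not a real API:

```python
# Hypothetical sketch of the virtual Ethernet switch's delivery decision.

def forward(dest_mac, vlan, mac_table, trunk_defined):
    """Decide what the virtual switch does with a frame."""
    entry = mac_table.get(dest_mac)
    # Dest. MAC known, switch port configured, VLAN number matches -> deliver
    if entry and entry["port_configured"] and vlan in entry["vlans"]:
        return "deliver"
    # Otherwise hand the frame to the trunk adapter, if one is defined
    if trunk_defined:
        return "trunk"
    # No match and no trunk adapter: drop the packet
    return "drop"
```

A frame for an unknown MAC is dropped unless a trunk adapter can carry it toward the external network.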
Performance considerations
[Charts: TCP_STREAM throughput (Mb/s) of virtual LAN vs. Gigabit Ethernet, and throughput per 0.1 processing units of entitlement, for MTU 1500 / 9000 / 65394 in simplex (S) and duplex (D)]
– Virtual Ethernet adapter has higher raw throughput
– In-memory copy is more efficient at larger MTU
Limitations
Implementation guideline
IP Router
[Figure: routing between subnets — AIX server 1.1.1.10 on IP subnet 1.1.1.X, IP router with interfaces 1.1.1.1 and 2.1.1.1, Linux server 2.1.1.10 on IP subnet 2.1.1.X]
Performance considerations
[Chart: Virtual I/O Server performance — throughput and capacity entitlement for MTU 1500 and 9000, simplex and duplex]
Limitations
Implementation guideline
[Figure: AIX server 10.1.1.14 and Linux server 10.1.2.15 connected to the external network]
Checkpoint solutions: 1. b, 2. a, 3. d, 4. d, 5. b, 6. a
Unit Summary
Reference