Fundamentals of Computer Design
Kai Bu
kaibu@zju.edu.cn
http://list.zju.edu.cn/kaibu/comparch
Chapter 1
Outline
Classes of computers
Parallelism
Instruction Set Architecture
Trends
Dependability
Performance Measurement
5 Classes of Computers
PMD (Personal Mobile Device) Characteristics
Cost effectiveness
less expensive packaging;
absence of fan for cooling
Memory efficiency
optimize code size
Energy efficiency
battery power, heat dissipation
Desktop Computing
Largest market share
low-end netbooks: $x00
Desktop Characteristics
Price-Performance
combination of performance and price;
compute performance
graphics performance
the attribute that matters most to
customers, and hence to computer
designers
Servers
Provide large-scale and reliable file and
computing services (to desktops)
Constitute the backbone of large-scale
enterprise computing
Server Characteristics
Availability
against server failure
Scalability
the ability to scale up computing
capacity, memory, storage, and I/O
bandwidth in response to increasing
demand
Efficient throughput
more requests handled per unit time
Clusters/WSCs
Warehouse-Scale Computers
collections of desktop computers or servers
connected by local area networks
to act as a single larger computer
Characteristics
price-performance, power, availability
Embedded Computers
hide everywhere
Embedded vs Nonembedded
Dividing line
the ability to run third-party software
Embedded computers' primary goal
meet the performance need at a
minimum price,
rather than achieve higher performance
at a higher price
Outline
Classes of computers
Parallelism
Instruction Set Architecture
Trends
Dependability
Performance Measurement
Application Parallelism
DLP: Data-Level Parallelism
many data items being operated on at
the same time
TLP: Task-Level Parallelism
tasks of work created that can operate
independently and largely in parallel
Hardware Parallelism
Computer hardware exploits two kinds
of application parallelism in four major
ways:
Instruction-Level Parallelism
Vector Architectures and GPUs
Thread-Level Parallelism
Request-Level Parallelism
Hardware Parallelism
Instruction-Level Parallelism
exploits data-level parallelism
at modest levels via pipelining;
at medium levels via speculative
execution
Hardware Parallelism
Vector Architectures &
GPUs (Graphics Processing Units)
exploit data-level parallelism
apply a single instruction to a collection
of data in parallel
Hardware Parallelism
Thread-Level Parallelism
exploits either DLP or TLP
in a tightly coupled hardware model
that allows for interaction among
parallel threads
Hardware Parallelism
Request-Level Parallelism
exploits parallelism among largely
decoupled tasks specified by the
programmer or the OS
SISD
Single instruction stream, single data
stream: the uniprocessor
Can exploit instruction-level parallelism
SIMD
Single instruction stream, multiple data
stream
The same instruction is executed by
multiple processors using different
data streams.
Exploits data-level parallelism
Each processor has its own data
memory, but there is a single
instruction memory and control
processor.
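As a conceptual sketch (plain Python, not real vector hardware; the function names are my own), SIMD applies one logical operation across a whole collection of data, while SISD steps through it one element at a time:

```python
def sisd_add(xs, ys):
    """SISD style: one instruction stream processes one data item per step."""
    out = []
    for x, y in zip(xs, ys):
        out.append(x + y)        # one add, one pair of operands
    return out

def simd_add(xs, ys):
    """SIMD style (conceptually): a single 'add' applied to all lanes at once."""
    return [x + y for x, y in zip(xs, ys)]  # one logical operation, many data

print(sisd_add([1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
print(simd_add([1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
```

Both produce the same result; the difference is that SIMD hardware issues one instruction for all the lanes rather than one per element.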
MISD
Multiple instruction streams, single
data stream
No commercial multiprocessor of this
type yet
MIMD
Multiple instruction streams, multiple
data streams
Each processor fetches its own
instructions and operates on its own
data.
Exploits task-level parallelism
Outline
Classes of computers
Parallelism
Instruction Set Architecture
Trends
Dependability
Performance Measurement
ISA: Class
Most are general-purpose register
architectures with operands of either
registers or memory locations
Two popular versions
register-memory ISA: e.g., 80x86
many instructions can access
memory
load-store ISA: e.g., ARM, MIPS
only load or store instructions can
access memory
Size in bits
ASCII character: 8
Unicode character / half word: 16
Integer / word: 32
Long integer / double word: 64
Floating point: 32 (single precision),
64 (double precision),
80 (extended double precision)
MIPS64 Operations
Data transfer
Arithmetic logical
Control
Floating point
http://en.wikipedia.org/wiki/MIPS_architecture
MIPS Instruction Formats
All instructions start with a 6-bit opcode.
R-type:
three registers,
a shift amount field,
and a function field;
I-type:
two registers,
a 16-bit immediate value;
J-type:
a 26-bit jump target.
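The field widths above can be checked with a small sketch (Python; the helper name is my own) that slices a 32-bit R-type word into its fields:

```python
def decode_rtype(word):
    """Split a 32-bit MIPS R-type instruction into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # 6-bit opcode
        "rs":     (word >> 21) & 0x1F,  # first source register
        "rt":     (word >> 16) & 0x1F,  # second source register
        "rd":     (word >> 11) & 0x1F,  # destination register
        "shamt":  (word >> 6)  & 0x1F,  # 5-bit shift amount
        "funct":  word & 0x3F,          # 6-bit function field
    }

# add $8, $9, $10 encodes as opcode 0, funct 0x20:
word = (9 << 21) | (10 << 16) | (8 << 11) | 0x20
print(decode_rtype(word))
```

The six fields account for all 32 bits: 6 + 5 + 5 + 5 + 5 + 6.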
Computer Architecture
ISA
the actual programmer-visible
instruction set;
the boundary between software
and hardware
Organization
high-level aspects of computer design:
memory system, memory interconnect,
design of the internal processor
or CPU
Hardware
computer specifics:
logic design,
packaging technology
Outline
Classes of computers
Parallelism
Instruction Set Architecture
Trends
Dependability
Performance Measurement
Five Critical
Implementation Technologies
Integrated circuit logic technology
Semiconductor DRAM
Semiconductor flash
Magnetic disk technology
Network technology
Semiconductor DRAM
Capacity per DRAM chip doubles
roughly every 2 or 3 years
Semiconductor Flash
Electrically erasable programmable
read-only memory
Capacity per Flash chip doubles roughly
every two years
In 2011, 15 to 20 times cheaper per bit
than DRAM
Network Technology
Switches
Transmission systems
Performance Trends
Bandwidth/Throughput
the total amount of work done in a
given time;
Latency/Response Time
the time between the start and the
completion of an event;
DVFS: Dynamic Voltage-Frequency Scaling
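DVFS is motivated by the standard dynamic power relation for CMOS logic (the usual textbook form, stated here for reference): because voltage enters squared and a lower voltage also permits a lower frequency, scaling both down cuts dynamic power sharply.

```latex
P_{\text{dynamic}} \approx \tfrac{1}{2} \times C_{\text{load}} \times V^{2} \times f
```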
Trends in Cost
Cost of an Integrated Circuit
wafers are tested and chopped into
dies that are packaged
Die yield: the percentage of
manufactured devices that survives
the testing procedure
Trends in Cost
Example
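For reference, the cost model usually presented with these slides (the standard Hennessy-Patterson formulation; stated here since the slide equations did not survive extraction) is:

```latex
\text{Cost of IC} =
  \frac{\text{Cost of die} + \text{Cost of testing die} + \text{Cost of packaging and final test}}
       {\text{Final test yield}}
```

```latex
\text{Cost of die} =
  \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}}
\qquad
\text{Die yield} =
  \text{Wafer yield} \times
  \frac{1}{\left(1 + \text{Defects per unit area} \times \text{Die area}\right)^{N}}
```

Here N is a process-complexity factor; smaller dies yield better, which is why die area dominates the cost trend.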
Outline
Classes of computers
Parallelism
Instruction Set Architecture
Trends
Dependability
Performance Measurement
Dependability
SLA: service level agreements
System states: up or down
Service states
service accomplishment
failure
restoration
service interruption
Dependability
Two measures of dependability
Module reliability
Module availability
Dependability
Two measures of dependability
Module reliability
continuous service accomplishment
from a reference initial instant
MTTF: mean time to failure
MTTR: mean time to repair
MTBF: mean time between failures
MTBF = MTTF + MTTR
Dependability
Two measures of dependability
Module reliability
FIT: failures in time
failures per billion hours
MTTF of 1,000,000 hours
= 10^9 / 10^6
= 1000 FIT
Dependability
Two measures of dependability
Module availability
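In terms of the reliability measures above, module availability is quantified as the fraction of time the module delivers service:

```latex
\text{Module availability} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}}
```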
Dependability
Example
Dependability
Answer
Outline
Classes of computers
Parallelism
Instruction Set Architecture
Trends
Dependability
Performance Measurement
Measuring Performance
Execution time
the time between the start and the
completion of an event
Throughput
the total amount of work done in a
given time
Measuring Performance
Computer X and Computer Y
X is n times faster than Y
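"X is n times faster than Y" is defined by the ratio of execution times, which is the inverse ratio of performances:

```latex
n = \frac{\text{Execution time}_Y}{\text{Execution time}_X}
  = \frac{\text{Performance}_X}{\text{Performance}_Y}
```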
Quantitative Principles
Parallelism
Locality
temporal locality: recently accessed
items are likely to be accessed in the
near future;
spatial locality: items whose
addresses are near one another tend to
be referenced close together in time
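A small sketch of the two kinds of locality (plain Python; the function names are my own). Traversing a matrix row by row touches neighboring addresses in order (spatial locality), and caching a just-computed result serves repeated requests from recently used data (temporal locality):

```python
matrix = [[r * 4 + c for c in range(4)] for r in range(4)]

def sum_row_major(m):
    """Row-major traversal: consecutive accesses hit neighboring
    elements, exploiting spatial locality."""
    total = 0
    for row in m:              # each row is stored contiguously
        for x in row:
            total += x
    return total

def sum_cached(m, cache={}):
    """Memoized sum: repeated requests are served from the cache,
    exploiting temporal locality."""
    key = id(m)
    if key not in cache:       # first access: compute and remember
        cache[key] = sum_row_major(m)
    return cache[key]          # later accesses reuse the recent result

print(sum_row_major(matrix))   # 120
```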
Quantitative Principles
Amdahl's Law
Quantitative Principles
Amdahl's Law: two factors
1. Fraction_enhanced:
e.g., 20/60 if 20 seconds out of a
60-second program can be enhanced
2. Speedup_enhanced:
e.g., 5/2 if enhanced to 2 seconds
while originally 5 seconds
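The two factors combine as Speedup_overall = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced). A sketch in Python, plugging in the slide's two example numbers together (an illustrative pairing, not a worked example from the slides):

```python
def overall_speedup(fraction_enhanced, speedup_enhanced):
    """Amdahl's Law: overall speedup from enhancing a fraction of a task."""
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)

# Fraction_enhanced = 20/60, Speedup_enhanced = 5/2:
print(overall_speedup(20 / 60, 5 / 2))  # ~1.25
```

Note that the unenhanced fraction bounds the result: even with an infinite Speedup_enhanced, overall speedup here cannot exceed 1 / (40/60) = 1.5.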
Quantitative Principles
Example
Quantitative Principles
The Processor Performance Equation
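The equation factors CPU time into three terms, each attackable by a different layer of the design:

```latex
\text{CPU time}
  = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}
  = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}
```

Instruction count depends on the ISA and compiler, CPI (cycles per instruction) on the organization and ISA, and clock cycle time on the organization and implementation technology.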
Quantitative Principles
Example
Quantitative Principles
Example
Reading
Chapter 1.8, 1.10–1.13