
Department of Computer Science and Engineering

Subject Name: HIGH PERFORMANCE COMPUTING

Subject Code: CS T82

Prepared By:

Mr. J. Amudhavel, AP/CSE


Mr. B. Thiyagarajan, AP/CSE
Mr. D. Saravanan, AP/CSE
Verified by:

Approved by:

UNIT I
Introduction: Need of high speed computing - increase the speed of computers - history of parallel computers and recent parallel computers; solving problems in parallel - temporal parallelism - data parallelism - comparison of temporal and data parallel processing - data parallel processing with specialized processors - inter-task dependency. The need for parallel computers - models of computation - analyzing algorithms - expressing algorithms.

2 MARKS
1. What is high performance computing? (April 2014)

High Performance Computing most generally refers to the practice of aggregating computing power in a way that
delivers much higher performance than one could get out of a typical desktop computer or workstation in order to
solve large problems in science, engineering, or business.
2. Define Computing.

The process of utilizing computer technology to complete a task.


Computing may involve computer hardware and/or software, but must involve some form of a
computer system.
Most individuals use some form of computing every day whether they realize it or not.
3. What is Parallel Computing?

The simultaneous use of more than one processor or computer to solve a problem. Use of multiple
processors or computers working together on a common task.
Each processor works on its section of the problem
Processors can exchange information

4. Why do we need parallel Computing?


Serial computing is too slow
Need for large amounts of memory not accessible by a single processor
To compute beyond the limits of single PU systems:
achieve more performance;
Utilize more memory.
5. To be able to:
solve problems that can't be solved in a reasonable time with single PU systems;
solve problems that don't fit on a single PU system, or even on a single system.
6. So we can:
Solve larger problems
Solve problems faster
Solve more problems
7. Why parallel Processing?
Single core performance growth has slowed.
More cost-effective to add multiple cores.
8. Write Limits of Parallel Computing.
Theoretical upper limits:
Amdahl's Law (see the worked sketch below)
Gustafson's Law
Practical limits:
Load balancing
Non-computational sections
Other considerations:
Time to develop/rewrite code
Time to debug and optimize code
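As a worked illustration of the theoretical upper limit named above, Amdahl's law bounds the speedup when a fraction f of the work is inherently serial: S(k) = 1 / (f + (1 - f)/k). A minimal C sketch (the serial fraction 0.05 and the processor counts are assumed values, not taken from these notes):

#include <stdio.h>

/* Amdahl's law: speedup on k processors when a fraction f of the work is serial */
static double amdahl_speedup(double f, int k) {
    return 1.0 / (f + (1.0 - f) / k);
}

int main(void) {
    double f = 0.05;                 /* assumed serial fraction          */
    int counts[] = {2, 4, 16, 256};  /* assumed processor counts         */
    for (int i = 0; i < 4; i++)
        printf("k = %3d  speedup <= %.2f\n", counts[i], amdahl_speedup(f, counts[i]));
    return 0;
}

Even with an unlimited number of processors the speedup is capped at 1/f (here 20), which is why the non-computational (serial) sections listed above matter so much.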

9. How to Classify Shared and Distributed Memory?

Shared memory:
All processors have access to a pool of shared memory.
Access times vary from CPU to CPU in NUMA systems.
Example: SGI Altix, IBM P5 nodes.

Distributed memory:
Memory is local to each processor.
Data exchange is by message passing over a network.
Example: clusters with single-socket blades.

10. Define Hybrid system

A limited number, N, of processors have access to a common pool of shared memory


To use more than N processors requires data exchange over a network
Example: Cluster with multi-socket blades

11. Define Multi Core Systems.

Extension of hybrid model


Communication details increasingly complex
Cache access
Main memory access
QuickPath / HyperTransport socket connections
Node to node connection via network

12. Define Accelerated Systems

Calculations made in both CPU and accelerator


Provide abundance of low-cost flops
Typically communicate over PCI-e bus
Load balancing critical for performance

13. What is data parallelism? (April 2013, April 2014)

Data parallelism is a form of parallelization of computing across multiple processors in parallel computing environments. Data parallelism focuses on distributing the data across different parallel computing nodes. It contrasts with task parallelism, another form of parallelism.

14. Define Stored Program Concept.

Memory is used to store both program instructions and data


Program instructions are coded data which tell the computer to do something
Data is simply information to be used by the program
A central processing unit (CPU) gets instructions and/or data from memory, decodes the instructions and then
sequentially performs them.

15. What is Parallel Computing?


Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem. The
compute resources can include:
A single computer with multiple processors;
An arbitrary number of computers connected by a network;
A combination of both.

16. Why do we use parallel computing?


There are two primary reasons for using parallel computing:

Save time - wall clock time

Solve larger problems

17. Why do we need high speed parallel computing? (November 2014)

The traditional scientific paradigm is first to do theory (say on paper), and then lab experiments to confirm or deny the theory. The traditional engineering paradigm is first to do a design (say on paper), and then build a laboratory prototype. Both paradigms are being replaced by numerical experiments and numerical prototyping. There are several reasons for this:
Real phenomena are too complicated to model on paper (e.g., climate prediction).
Real experiments are too hard, too expensive, too slow, or too dangerous for a laboratory (e.g., oil reservoir simulation, large wind tunnels, overall aircraft design, galactic evolution, whole factory or product life cycle design and optimization, etc.).

18. How do we increase the speed of parallel computers?

We can increase the speed of computers in several ways:
By increasing the speed of the processing element, using faster semiconductor technology (advanced technology).
By architectural methods; in turn, the speed of the computer is increased by applying parallelism.
19. Use parallelism in a single processor.
Overlap the execution of a number of instructions by pipelining or by using multiple functional units.
Overlap the operation of different units.
20. Use parallelism in the problem.
Use a number of interconnected processors working cooperatively to solve the problem.

21. Define Temporal Parallelism.

Temporal means pertaining to time. Temporal parallelism uses parallelism within a single processor:
Overlap the execution of a number of instructions by pipelining or by using multiple functional units.
Overlap the operation of different units.

22. State the advantages and disadvantages of temporal parallelism.

Synchronization: every stage must take an identical amount of time.
Bubbles in the pipeline: bubbles (idle slots) are formed when a task is missing.
Fault tolerance: the pipeline does not tolerate the failure of a stage.
Inter-task communication: must be kept small compared to the task time.
Scalability: cannot be increased beyond the number of tasks in a job.

23. State the advantages and disadvantages of data parallelism.

Advantages:

No synchronization is required between processors.
No bubbles, as in a pipeline.
More fault tolerant.
No (or negligible) inter-processor communication.

Disadvantages:

The assignment of jobs is static.
The job must be partitionable into independent tasks.
The time to divide the jobs must be small compared to the time to do them.

24. Define Data Parallelism.

Data parallelism uses parallelism in the problem: a number of interconnected processors work cooperatively to solve the problem. The input is divided into a set of jobs, each job is given to one processor, and all processors work simultaneously.

25. Define Inter-task Dependency.

The following assumptions were made in assigning tasks to teachers:
The answer to a question is independent of the answers to other questions.
Teachers do not have to interact.
The same instructions are used to grade all answer books.
In general, however, tasks are inter-related: some tasks can be done independently and simultaneously, while others have to wait for the completion of previous tasks.

26. What is a parallel computer? (April 2013)

Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently.
27. Comparison between Temporal and Data Parallelism. (November 2014)

Temporal parallelism:
The job is divided into a set of independent tasks, which are assigned to the stages.
Tasks must take equal time.
Bubbles lead to idling of processors.
Task assignment is static.
Not tolerant to processor faults.
Efficient with fine grained tasks.

Data parallelism:
Full jobs are assigned for processing.
Jobs may take different times.
No bubbles.
Task assignment may be static, dynamic or quasi-dynamic.
Tolerant to processor faults.
Efficient with coarse grained tasks.

28. Specify the types of parallelism that can be seen in software. (November 2014)
Task parallelism
Data parallelism

11 MARKS
1. What is parallel computing? Explain.

Traditionally, software has been written for serial computation:

To be executed by a single computer having a single Central Processing Unit (CPU);

Problems are solved by a series of instructions, executed one after the other by the CPU. Only one
instruction may be executed at any moment in time.

In the simplest sense, parallel computing is the simultaneous use of multiple compute resources
to solve a computational problem.
The compute resources can include:
A single computer with multiple processors;
An arbitrary number of computers connected by a network;
A combination of both.
The computational problem usually demonstrates characteristics such as the ability to be:
Broken apart into discrete pieces of work that can be solved simultaneously;
Execute multiple program instructions at any moment in time;
Solved in less time with multiple compute resources than with a single compute resource.
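As a small, hedged sketch of the first characteristic (not part of the original notes), the C/OpenMP loop below breaks an array sum into discrete pieces of work that the threads solve simultaneously:

#include <stdio.h>
#include <omp.h>

int main(void) {
    enum { N = 1000000 };
    static double a[N];
    for (int i = 0; i < N; i++) a[i] = 1.0;

    double sum = 0.0;
    /* the iteration space is split into discrete pieces, one chunk per thread;
       the reduction clause combines the partial sums at the end */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %.1f using up to %d threads\n", sum, omp_get_max_threads());
    return 0;
}

Compiled with an OpenMP-capable compiler (for example gcc -fopenmp), the iterations are divided among the available threads and the partial sums are combined by the reduction clause.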

Parallel computing is an evolution of serial computing that attempts to emulate what has always been
the state of affairs in the natural world: many complex, interrelated events happening at the same
time, yet within a sequence. Some examples:
Planetary and galactic orbits
Weather and ocean patterns
Tectonic plate drift
Rush hour traffic in LA
Automobile assembly line
Daily operations within a business
Building a shopping mall
Ordering a hamburger at the drive through.

Traditionally, parallel computing has been considered to be "the high end of computing" and has been
motivated by numerical simulations of complex systems and "Grand Challenge Problems" such as:
weather and climate
chemical and nuclear reactions
biological, human genome
geological, seismic activity

mechanical devices - from prosthetics to spacecraft


electronic circuits
manufacturing processes

Today, commercial applications are providing an equal or greater driving force in the development of
faster computers. These applications require the processing of large amounts of data in sophisticated
ways. Example applications include:
parallel databases, data mining
oil exploration
web search engines, web based business services
computer-aided diagnosis in medicine
management of national and multi-national corporations
advanced graphics and virtual reality, particularly in the entertainment industry
networked video and multi-media technologies
collaborative work environments

Ultimately, parallel computing is an attempt to maximize the infinite but seemingly scarce
commodity called time.

2. Why do we need to use parallel computing? (April 2013)

There are two primary reasons for using parallel computing:


Save time - wall clock time
Solve larger problems

Other reasons might include:


Taking advantage of non-local resources - using available compute resources on a wide area
network, or even the Internet when local compute resources are scarce.
Cost savings - using multiple "cheap" computing resources instead of paying for time on a
supercomputer.
Overcoming memory constraints - single computers have very finite memory resources. For
large problems, using the memories of multiple computers may overcome this obstacle.

Limits to serial computing - both physical and practical reasons pose significant constraints to simply
building ever faster serial computers:
Transmission speeds - the speed of a serial computer is directly dependent upon how fast data
can move through hardware. Absolute limits are the speed of light (30 cm/nanosecond) and
the transmission limit of copper wire (9 cm/nanosecond). Increasing speeds necessitate
increasing proximity of processing elements.

Limits to miniaturization - processor technology is allowing an increasing number of


transistors to be placed on a chip. However, even with molecular or atomic-level components,
a limit will be reached on how small components can be.
Economic limitations - it is increasingly expensive to make a single processor faster. Using a
larger number of moderately fast commodity processors to achieve the same (or better)
performance is less expensive.
The future: during the past 10 years, the trends indicated by ever faster networks, distributed
systems, and multi-processor computer architectures (even at the desktop level) suggest that
parallelism is the future of computing.
3. What is the Need of high speed computing?
The traditional scientific paradigm is first to do theory (say on paper), and then lab experiments to
confirm or deny the theory.
The traditional engineering paradigm is first to do a design (say on paper), and then build a
laboratory prototype.
Both paradigms are being replaced by numerical experiments and numerical prototyping.
There are several reasons for this.

Real phenomena are too complicated to model on paper (e.g., climate prediction).

Real experiments are too hard, too expensive, too slow, or too dangerous for a laboratory (e.g., oil reservoir simulation, large wind tunnels, overall aircraft design, galactic evolution, whole factory or product life cycle design and optimization, etc.).

Scientific and engineering problems requiring the most computing power to simulate are commonly called "Grand Challenges". A Grand Challenge such as predicting the climate 50 years hence is estimated to require computers computing at the rate of 1 Tflop = 1 Teraflop = 10^12 floating point operations per second, and with a memory size of 1 TB = 1 Terabyte = 10^12 bytes. Here is some commonly used notation we will use to describe problem sizes:

1 Mflop = 1 Megaflop = 10^6 floating point operations per second


1 Gflop = 1 Gigaflop = 10^9 floating point operations per second
1 Tflop = 1 Teraflop = 10^12 floating point operations per second
1 MB = 1 Megabyte = 10^6 bytes
1 GB = 1 Gigabyte = 10^9 bytes
1 TB = 1 Terabyte = 10^12 bytes
1 PB = 1 Petabyte = 10^15 bytes

How do we increase the speed of computers?

We can increase the speed of computers in several ways:

By increasing the speed of the processing element, using faster semiconductor technology (advanced technology).
By architectural methods; in turn, the speed of the computer is increased by applying parallelism.

Use parallelism in a single processor:
Overlap the execution of a number of instructions by pipelining or by using multiple functional units.
Overlap the operation of different units.

Use parallelism in the problem:
Use a number of interconnected processors working cooperatively to solve the problem.

4. Describe the history of parallel computers. (November 2014)

A brief history of parallel computers is given below.


VECTOR SUPERCOMPUTERS:

Glory days: 1976-1990


Famous examples: Cray machines

Characterized by:
The fastest clock rates, because vector pipelines can be very simple.
Vector processing.
Quite good vectorizing compilers.
High price tag; small market share.
Not always scalable because of shared-memory bottleneck (vector processors need more data per
cycles than conventional processors). Vector processing is back in various forms: SIMD
extensions of commodity microprocessors (e.g. Intel's SSE), vector processors for game consoles
(Cell), multithreaded vector processors (Cray), etc.
Vector processors went down temporarily because of:
Market issues, price/performance, microprocessor revolution, commodity microprocessors.
Not enough parallelism for biggest problems. Hard to vectorize/parallelize automatically
Didn't scale down.

MPPs
Glory days: 1990-1996
Famous examples: Intel hypercubes and Paragon, TMC Connection Machine, IBM SP, Cray/SGI
T3E.
Characterized by:
Scalable interconnection network, up to 1000's of processors. We'll discuss these networks shortly
Commodity (or at least, modest) microprocessors.
Message passing programming paradigm.
Killed by:
Small market niche, especially as a modest number of processors can do more and more.
Programming paradigm too hard.
Relatively slow communication (especially latency) compared to ever-faster processors (this is
actually no more and no less than another example of the memory wall).
Today
A state of flux in hardware.
But more stability in software, e.g., MPI and OpenMP.
Machines are being sold, and important problems are being solved, on all of the following:
Vector SMPs, e.g., Cray X1, Hitachi, Fujitsu, NEC.
SMPs and ccNUMA, e.g., Sun, IBM, HP, SGI, Dell, hundreds of custom boxes.
Distributed memory multiprocessors, e.g., Cray XT3, IBM Blue Gene.
Clusters: Beowulf (Linux) and many manufacturers and assemblers.
A complete top-down view: At the highest level you have either a distributed memory
architecture with a scalable interconnection network, or an SMP architecture with a bus.
A distributed memory architecture may or may not provide support for a global memory
consistency model (such as cache coherence, software distributed shared memory, coherent
RDMA, etc.). On an SMP architecture you expect hardware support for cache coherence.
A distributed memory architecture can be built from SMP or even (rarely) ccNUMA boxes. Each
box is treated as a tightly coupled node (with local processors and uniformly accessed shared
memory). Boxes communicate via message passing, or (less frequently) with hardware or
software memory coherence schemes. Both on distributed and on shared memory architectures,
the processors themselves may support an internal form of task or data parallelism. Processors

may be vector processors, commodity microprocessors with multiple cores, or multiple threads
multiplexed over a single core, heterogeneous multicore processors, etc.
Programming: Typically MPI is supported over both distributed and shared-memory substrates for portability (there is a large existing base of code written and optimized in MPI). OpenMP and POSIX threads are almost always available on SMPs and ccNUMA machines. OpenMP implementations over distributed memory machines with software support for cache coherence also exist, but scaling these implementations is hard and is a subject of ongoing research.
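A minimal MPI sketch (illustrative only; it assumes an MPI implementation such as MPICH or Open MPI is installed) showing the message passing programming style referred to above, in which every process runs the same program and is identified by its rank:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's ID            */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes    */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

The same source runs unchanged on a shared-memory SMP or across the nodes of a cluster, which is the portability argument made above; it would typically be built with mpicc and launched with mpirun -np 4 ./a.out.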
Future
The end of Moore's Law?
Nanoscale electronics
Exotic architectures? Quantum, DNA/molecular.
5. Explain various methods for solving problems in parallel. (November 2014)

A simple job can be solved in parallel in many ways.


Method 1: Utilizing temporal parallelism
Consider 1000 candidates who appeared for an examination, with 4 questions in each answer book. If one teacher is to correct these answer books, the following instructions are given:

1. Take an answer book from the pile of answer books.


2. Correct the answer to Q1, namely A1.
3. Repeat step 2 for the answers to Q2, Q3 and Q4, namely A2, A3 and A4.
4. Add marks.
5. Put answer book in pile of corrected answer books.
6. Repeat steps 1 to 5 until no answer books are left.
To use temporal parallelism, ask 4 teachers to sit in a line and correct each answer book in a pipeline.
The first teacher corrects the answer to Q1, namely A1, of the first paper and passes the paper to the second teacher, who corrects A2, and so on.
Until the first few papers have filled the pipeline, some teachers are idle.
Time taken to correct A1 = time to correct A2 = time to correct A3 = time to correct A4 = 5 minutes, so the first answer book takes 20 min.
The total time taken to correct 1000 papers will be 20 + (999 * 5) = 5015 min, about 1/4 of the 20,000 min a single teacher would take.
Temporal means pertaining to time.

The method works correctly and efficiently if:

The jobs to be done are identical.
Each job can be divided into independent tasks.
The time for each task is the same.
The number of tasks per job is small compared to the total number of jobs.

Let the number of jobs = n
Time to do one job = p
Each job is divided into k tasks
Time for each task = p/k
Time to complete n jobs with no pipeline processing = np
Time to complete n jobs with pipeline processing by k teachers = p + (n - 1)(p/k) = p(k + n - 1)/k
Speedup due to pipeline processing = np / [p(k + n - 1)/k] = nk/(k + n - 1) = k / [1 + (k - 1)/n]
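A small C sketch (not part of the notes; the numbers are those of the example above: n = 1000 answer books, k = 4 teachers, p = 20 minutes per book) that evaluates these expressions:

#include <stdio.h>

int main(void) {
    double n = 1000, k = 4, p = 20;            /* jobs, pipeline stages, minutes per job */
    double t_serial   = n * p;                 /* 20000 min with one teacher             */
    double t_pipeline = p + (n - 1) * p / k;   /* 20 + 999*5 = 5015 min                  */
    double speedup    = t_serial / t_pipeline; /* = k / (1 + (k-1)/n), about 3.99        */
    printf("serial %.0f min, pipelined %.0f min, speedup %.2f\n",
           t_serial, t_pipeline, speedup);
    return 0;
}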

Problems encountered:

Synchronization: every stage must take an identical amount of time, otherwise faster teachers wait.
Bubbles in the pipeline: if a task is missing (for example, a question is unanswered), a bubble (idle slot) travels through the pipeline.
Fault tolerance: the method does not tolerate faults; if one teacher stops, the whole pipeline stalls.
Inter-task communication: the time to pass a paper between teachers must be small compared to the correction time.
Scalability: the number of stages cannot be increased beyond the number of tasks into which a job can be divided.

Method 2: Utilizing Data Parallelism

Divide the answer books into four piles and give one pile to each teacher.
Each teacher takes 20 min to correct an answer book and corrects 250 papers, so the time taken for the 1000 papers is 250 * 20 = 5000 min.
All four teachers correct their 250 papers simultaneously.
Let the number of jobs = n
Time to do one job = p
Let there be k teachers
Time to distribute the jobs to the k teachers = kq (q per teacher)
Time to complete n jobs by a single teacher = np
Time to complete n jobs by k teachers = kq + np/k

Speedup due to parallel processing = np / (kq + np/k) = knp / (k^2 q + np) = k / [1 + k^2 q/(np)]
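A corresponding C sketch (the value of q, the time to hand out one pile, is an assumed number since the notes do not give one) showing how the distribution overhead kq pulls the speedup slightly below the ideal value k:

#include <stdio.h>

int main(void) {
    double n = 1000, k = 4, p = 20;        /* jobs, teachers, minutes per job      */
    double q = 1;                          /* assumed minutes to hand out one pile */
    double t_serial   = n * p;             /* 20000 min                            */
    double t_parallel = k * q + n * p / k; /* distribution time + parallel grading */
    printf("speedup = %.3f (ideal would be %.0f)\n", t_serial / t_parallel, k);
    /* equivalently: speedup = k / (1 + k*k*q / (n*p)) */
    return 0;
}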
Advantages:

No synchronization is required between teachers.
No bubbles, as in a pipeline.
More fault tolerant.
No (or negligible) inter-teacher communication.

Disadvantages:

The assignment of jobs is static.
The job must be partitionable into independent piles.
The time to divide the jobs must be small compared to the time to do them.

METHOD 3: Combined Temporal and Data Parallelism

Combining methods 1 and 2 gives this method.
Two pipelines of teachers are formed, and each pipeline is given half of the total number of jobs.
This halves the time taken by a single pipeline and reduces the time to complete the set of jobs.
It is very efficient for numerical computing, in which a number of long vectors and large matrices are used as data and can be processed in parallel.
METHOD 4: Data Parallelism with Dynamic Assignment
A head examiner gives one answer book to each teacher.
All teachers correct their papers simultaneously.
A teacher who completes a paper goes to the head examiner for another one.
If a second teacher completes at the same time, he queues up in front of the head examiner.

Advantages:

Balancing of the work assigned to each teacher.

Teacher is not forced to be idle.

No bubbles

Overall time is minimized

Disadvantages:

Teachers have to wait in the queue.
The head examiner can become a bottleneck.
The head examiner is idle after handing out the papers.
It is difficult to increase the number of teachers.

If the speedup of a method is directly proportional to the number of processors (teachers), the method is said to scale well.
Let the total number of papers = n
Let there be k teachers
Time a teacher waits to get a paper = q
Time for each teacher to get, grade and return a paper = (q + p)
Total time to correct the papers by k teachers = n(q + p)/k
Speedup due to parallel processing = np / [n(q + p)/k] = kp/(q + p) = k / [1 + (q/p)]
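For instance, with the earlier figures p = 20 min per paper and k = 4 teachers, and an assumed waiting/handout time of q = 0.5 min (this value is not specified in the notes), the speedup is 4 / (1 + 0.5/20) = 4 / 1.025, which is about 3.9; the trip to the head examiner is what keeps it slightly below the ideal value of 4.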

METHOD 5: Data Parallelism with Quasi-Dynamic Scheduling

Method 4 can be made better by giving each teacher an unequal bunch of papers to correct. Teachers 1, 2, 3 and 4 may be given 7, 9, 11 and 13 papers respectively; when a teacher finishes a bunch, further papers are given. This randomizes the completion times and reduces the probability of a queue forming. The time to assign a bunch of jobs is much smaller than the time to actually do them. This method is in between a purely static and a purely dynamic schedule. The tasks are coarser grained in the sense that a bunch of jobs is assigned at a time, and its completion time is longer than if a single job were assigned.

Comparison between temporal and data parallelism: (April 2013)

Temporal parallelism:
The job is divided into a set of independent tasks, which are assigned to the stages.
Tasks must take equal time.
Bubbles lead to idling of processors.
Task assignment is static.
Not tolerant to processor faults.
Efficient with fine grained tasks.

Data parallelism:
Full jobs are assigned for processing.
Jobs may take different times.
No bubbles.
Task assignment may be static, dynamic or quasi-dynamic.
Tolerant to processor faults.
Efficient with coarse grained tasks.

Data parallel processing with specialized processors:

Data parallel processing is more fault tolerant, but it requires each teacher to be capable of correcting the answers to all questions with equal ease.

METHOD 6: Specialist Data Parallelism

There is a head examiner who dispatches answer papers to the teachers. We assume that teacher 1 (T1) grades answer A1, teacher 2 (T2) grades A2, and in general teacher i (Ti) grades the answer Ai to question Qi.
Procedure:
1. Give one answer book each to T1, T2, T3 and T4.
2. When a corrected answer paper is returned, check whether all questions are graded. If yes, add the marks and put the paper in the output pile.
3. If no, check which questions are not graded.
4. For each i, if Ai is ungraded, send the paper to teacher Ti if Ti is idle, or to any other idle teacher Tp.
5. Repeat steps 2, 3 and 4 until no answer paper remains in the input pile.

METHOD 7: Coarse Grained Specialist Temporal Parallelism

All teachers work independently and simultaneously at their own pace; a teacher who finishes early may end up spending a lot of time inefficiently waiting for the other teachers to complete their work.
Procedure:

The answer papers are divided into 4 equal piles and put in the in-trays of the teachers. Each teacher simultaneously repeats steps 1 to 5, four times.
For teachers Ti (i = 1 to 4) do in parallel:
1. Take an answer paper from the in-tray.
2. Grade answer Ai to question Qi and put the paper in the out-tray.
3. Repeat steps 1 and 2 till no papers are left.
4. Check if teacher ((i+1) mod 4)'s in-tray is empty.
5. As soon as it is empty, empty your own out-tray into the in-tray of that teacher.
METHOD 8: Agenda Parallelism
The answer book is thought of as an agenda of questions to be graded. All teachers are asked to work on the first item on the agenda, namely, grade the answer to the first question in all the papers. The head examiner gives one paper to each teacher and asks him to grade the answer A1 to Q1. When a teacher finishes, he is given another paper. This is a data parallel method with dynamic scheduling and fine grained tasks.
6. Briefly explain Inter-Task Dependency with an example.
The following assumptions were made in assigning tasks to teachers:
The answer to a question is independent of the answers to other questions.
Teachers do not have to interact.
The same instructions are used to grade all answer books.

In general, tasks are inter-related: some tasks can be done independently and simultaneously, while others have to wait for the completion of previous tasks. The inter-relations of the various tasks of a job may be represented graphically as a task graph.
Procedure: Recipe for Chinese vegetable fried rice:
T1: Clean and wash rice
T2: Boil water in a vessel with 1 teaspoon salt
T3: Put rice in boiling water with some oil and cook till soft
T4: Drain rice and cool
T5: Wash and scrape carrots
T6: Wash and string French beans
T7: Boil water with teaspoon salt in 2 vessels
T8: Drop carrots and French beans in boiling water
T9: Drain and cool carrots and French beans
T10: Dice carrots
T11: Dice French beans
T12: Peel onions and dice into small pieces
T13: Clean cauliflower and cut into small pieces
T14: Heat oil in an iron pan and fry the diced onion and cauliflower for 1 min in the heated oil
T15: Add the diced carrots and French beans to the above and fry for 2 min
T16: Add the cooled cooked rice, chopped onions and soya sauce to the above and stir-fry for 5 min
There are 16 tasks in this job, and there are constraints on the order in which they can be carried out: some can proceed simultaneously while others must wait. A graph showing the relationship among the tasks (the task graph) makes these constraints explicit.
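One plausible reading of the dependencies (a sketch only; the original figure is not reproduced here): T1 and T2 feed T3, which feeds T4; T5, T6 and T7 feed T8, which feeds T9, which feeds T10 and T11; T12 and T13 feed T14; T10, T11 and T14 feed T15; and T4 and T15 feed T16. Tasks with no path between them, such as T1 and T5, can be carried out simultaneously, while T16 must wait for every task that feeds into it.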

7. Explain the various computation models. (April 2013)

RAM
PRAM (parallel RAM)
Interconnection Networks
Combinatorial Circuits

Parallel and Distributed Computation

Many interconnected processors working concurrently, for example:
Connection Machine
Internet

Types of multiprocessing frameworks

Parallel

Distributed

Technical aspects

Parallel computers (usually) work in tight synchrony, share memory to a large extent and have a very fast and reliable communication mechanism between them.

Distributed computers are more independent; communication is less frequent and less synchronous, and the cooperation is limited.

Purposes

Parallel computers cooperate to solve (possibly difficult) problems more efficiently.

Distributed computers have individual goals and private activities; sometimes communication with other computers is needed (e.g., distributed database operations).

The RAM Sequential Model

RAM is an acronym for Random Access Machine

RAM consists of
A memory with M locations.

Size of M can be as large as needed.


A processor operating under the control of a sequential program which can

load data from memory

store data into memory

execute arithmetic & logical computations on data.


A memory access unit (MAU) that creates a path from the processor to an arbitrary
memory location.

RAM Sequential Algorithm Steps

A READ phase in which the processor reads a datum from a memory location and copies it into a register.

A COMPUTE phase in which a processor performs a basic operation on data from one or two of
its registers.

A WRITE phase in which the processor copies the contents of an internal register into a memory
location.
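As a tiny sketch (not from the notes), the single statement c = a + b decomposes into exactly these phases on the RAM:

#include <stdio.h>

int main(void) {
    int memory[3] = {7, 5, 0};   /* M[0] = a, M[1] = b, M[2] = c        */
    int r0, r1;

    r0 = memory[0];              /* READ phase: load a into a register  */
    r1 = memory[1];              /* READ phase: load b into a register  */
    r0 = r0 + r1;                /* COMPUTE phase: add the registers    */
    memory[2] = r0;              /* WRITE phase: store the result in c  */

    printf("c = %d\n", memory[2]);
    return 0;
}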

8. Explain the PRAM Model of Computation. (April 2014)

PRAM (Parallel Random Access Machine)

Let P1, P2, ..., Pn be identical processors.

Each processor is a RAM processor with a private local memory.

The processors communicate using m shared (or global) memory locations, U1, U2, ..., Um.
Allowing both local & global memory is typical in model study.

Each Pi can read or write to each of the m shared memory locations.

All processors operate synchronously (i.e. using same clock), but can execute a different sequence
of instructions.
Some authors inaccurately restrict PRAM to simultaneously executing the same sequence
of instructions (i.e., SIMD fashion)

Each processor has a unique index, called the processor ID, which can be referenced by the processor's program.
Often an unstated assumption for a parallel model

Each PRAM step consists of three phases, executed in the following order:
A read phase in which each processor may read a value from shared memory
A compute phase in which each processor may perform basic arithmetic/logical operations
on their local data.
A write phase where each processor may write a value to shared memory.

Note that this prevents reads and writes from being simultaneous.

The above requires a PRAM step to be sufficiently long to allow the processors to do different arithmetic/logic operations simultaneously.
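A toy sequential simulation (a sketch under assumed conventions, not part of the notes) of the three-phase PRAM step: it sums 8 values by pairwise reduction, and in every step all simulated processors first read, then compute and write, so reads and writes never overlap:

#include <stdio.h>

#define N 8   /* number of values = number of simulated processors */

int main(void) {
    int shared[N] = {3, 1, 4, 1, 5, 9, 2, 6};    /* shared memory cells U1..U8      */
    for (int stride = 1; stride < N; stride *= 2) {
        int reg[N];                              /* each Pi's private register      */
        /* READ phase: every active processor reads its partner's cell */
        for (int i = 0; i < N; i += 2 * stride)
            reg[i] = shared[i + stride];
        /* COMPUTE + WRITE phase: add and store back into the processor's own cell */
        for (int i = 0; i < N; i += 2 * stride)
            shared[i] = shared[i] + reg[i];
    }
    printf("sum = %d\n", shared[0]);             /* 31 */
    return 0;
}

Because no two processors ever access the same cell in the same phase, the algorithm is EREW, and it finishes in log2(8) = 3 PRAM steps instead of the 7 additions a single RAM processor would need.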

PRAM Memory Access Methods

Exclusive Read (ER): Two or more processors cannot simultaneously read the same memory
location.

Concurrent Read (CR): Any number of processors can read the same memory location
simultaneously.

Exclusive Write (EW): Two or more processors cannot write to the same memory location simultaneously.

Concurrent Write (CW): Any number of processors can write to the same memory location
simultaneously.

Variants for Concurrent Write

Priority CW: The processor with the highest priority writes its value into a memory location.

Common CW: Processors writing to a common memory location succeed only if they write the
same value.

Arbitrary CW: When more than one value is written to the same location, any one of these values
(e.g., one with lowest processor ID) is stored in memory.

Random CW: One of the processors is randomly selected to write its value into memory.
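A small C sketch (illustrative only; the function name and the example values are assumptions) of how a write conflict could be resolved under the Priority CW rule, where the processor with the smallest ID is taken to have the highest priority:

#include <stdio.h>

/* Resolve a concurrent write under the Priority CW rule:
   among all processors writing in this step, the one with the
   smallest ID (highest priority) wins. */
static int priority_cw(const int writer_id[], const int value[], int n_writers) {
    int best = 0;
    for (int i = 1; i < n_writers; i++)
        if (writer_id[i] < writer_id[best])
            best = i;
    return value[best];
}

int main(void) {
    int ids[]  = {5, 2, 7};     /* processors P5, P2, P7 all write location U1 */
    int vals[] = {40, 10, 99};
    printf("U1 = %d\n", priority_cw(ids, vals, 3));   /* P2 wins, so U1 = 10 */
    return 0;
}

Under Common CW this particular write would be illegal because the three values differ, and under Arbitrary CW any one of them could legitimately end up in U1.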

PONDICHERRY UNIVERSITY QUESTIONS


2 Marks:
1. What is a parallel computer? (April 2013) (Q. No. 26, Ref.Pg.No.6)
2. Define data parallelism. (April 2013) (Q. No. 13, Ref.Pg.No.4)
3. Compare temporal and data parallel processing. (November 2014) (Q. No. 27, Ref.Pg.No.7)
4. The greater the value of speedup, the better the parallel algorithm - justify. (November 2014)
5. What is high performance computing? (April 2014) (Q. No. 1, Ref.Pg.No.1)
6. What do you mean by data parallelism? (April 2014) (Q. No. 13, Ref.Pg.No.4)
7. Explain the need for high speed computing. (November 2014) (Q. No. 17, Ref.Pg.No.5)
8. Specify the types of parallelism that can be seen in software. (November 2014) (Q. No. 28, Ref.Pg.No.7)

11 Marks:
1. Explain about the models of Computation. (April 2013) (Q. No. 7, Ref.Pg.No.22)
2. Write a comparison of Temporal and Parallel Processing. (April 2013) (Q. No. 5, Ref.Pg.No.17)
3. Consider an examination paper has 4 questions to be answered and there are 1000 answer books.
Illustrate how data parallel processing with specialization processor is done for the above problem.
(November 2013)
4. Discuss the various abstract machine models for parallel computers in detail. (November 2013)
5. A. Compare Temporal and Data Parallelism. (April 2014) (Q. No. 5, Ref.Pg.No.17)
B. Compare BSP and PRAM.
6. Explain the PRAM Model of Computation. (April 2014) (Q. No. 8, Ref.Pg.No.20)
7. Discuss the history of past and present parallel Computers. (November 2014) (Q. No. 4,
Ref.Pg.No.11)
8. Discuss the various parallel computing models. (November 2014) (Q. No. 5, Ref.Pg.No.13)
