
Distributed Computing System

Over the past two decades, advancements in microelectronic technology have resulted in the
availability of fast, inexpensive processors, and advancements in communication technology have
resulted in the availability of cost-effective and highly efficient computer networks.

The net result of the advancement in these two technologies is that the price-performance ratio has now changed to favour the use of interconnected multiple processors in place of a single, high-speed processor.

Computer architectures consisting of interconnected multiple processors are basically of two types:

1. Tightly coupled systems: In these systems, there is a single system-wide primary memory (address space) that is shared by all the processors. If any processor writes, for example, the value 100 to the memory location x, any other processor subsequently reading from location x will get the value 100. Therefore, in these systems, any communication between the processors usually takes place through the shared memory.
2. Loosely coupled systems: In these systems, the processors do not share memory, and each processor has its own local memory. If a processor writes the value 100 to the memory location x, this write operation will only change the contents of its local memory and will not affect the contents of the memory of any other processor. Hence, if another processor reads the memory location x, it will get whatever value was there before in that location of its own local memory. In these systems, all physical communication between the processors is done by passing messages across the network that interconnects the processors.

Usually, tightly coupled systems are referred to as parallel processing systems, and loosely coupled systems are referred to as distributed computing systems, or simply distributed systems.
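To make the contrast concrete, the following minimal sketch mimics the two couplings on a single machine using Python's standard multiprocessing module: a Value object plays the role of the shared memory location x, and a Pipe plays the role of the interconnecting network. This is only a single-machine analogy for illustration, not a real multiprocessor.

```python
from multiprocessing import Process, Value, Pipe

def shared_writer(x):
    # Tightly coupled analogy: a write to the shared location x is
    # immediately visible to every other processor.
    x.value = 100

def message_writer(conn):
    # Loosely coupled analogy: the only way to communicate is to send
    # a message over the interconnecting channel.
    conn.send(100)
    conn.close()

if __name__ == "__main__":
    # Shared-memory (tightly coupled) case: one system-wide location x.
    x = Value("i", 0)
    p = Process(target=shared_writer, args=(x,))
    p.start(); p.join()
    print("shared x =", x.value)            # 100: the write is seen by all

    # Message-passing (loosely coupled) case: no shared memory at all.
    parent_end, child_end = Pipe()
    q = Process(target=message_writer, args=(child_end,))
    q.start()
    print("received =", parent_end.recv())  # 100 arrives only as a message
    q.join()
```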

A distributed system is a collection of autonomous computers linked by a computer network that appears to the users of the system as a single computer.

Some comments:
System architecture: the machines are autonomous; this means they are computers which, in
principle, could work independently;
The user's perception: the distributed system is perceived as a single system solving a certain problem (even though, in reality, we have several computers placed in different locations). By running distributed system software, the computers are enabled to:
- coordinate their activities
- share resources: hardware, software, data.

Examples of Distributed Systems

Network of workstations

Personal workstations + processors not assigned to specific users.


Single file system, with all files accessible from all machines in the same way and using the same path
name.
For a certain command the system can look for the best place (workstation) to execute it.

Automatic banking (teller machine) system

Primary requirements: security and reliability.


Consistency of replicated data.
Concurrent transactions (operations which involve accounts in different banks); simultaneous access.

Distributed Real-Time Systems

Synchronization of physical clocks


Scheduling with hard time constraints


Real-time communication
Fault tolerance

A Distributed Computing System is basically a collection of processors interconnected by a communication network in which each processor has its own local memory and other peripherals, and the communication between any two processors of the system takes place by message passing over the communication network.

Distributed Computing System Models

Various models are used for building Distributed Computing Systems. These models can be broadly classified into five categories: minicomputer, workstation, workstation-server, processor-pool, and hybrid. They are described below.

1. Minicomputer Model is a simple extension of the centralized time-sharing system. As shown in fig., a Distributed Computing System based on this model consists of a few minicomputers interconnected by a communication network. Each minicomputer usually has multiple users simultaneously logged on to it. For this, several interactive terminals are connected to each minicomputer. Each user is logged on to one specific minicomputer, with remote access to other minicomputers. The network allows a user to access remote resources that are available on some machine other than the one onto which the user is currently logged. The minicomputer model may be used when resource sharing (such as sharing of information databases of different types, with each type of database located on a different machine) with remote users is desired.
Example- ARPAnet is an example of a Distributed Computing System based on the minicomputer model.
2. Workstation Model: As shown in fig., a Distributed Computing System based on the workstation model consists of several workstations interconnected by a communication network. A company's office or a university department may have several workstations scattered throughout a building or campus, each workstation equipped with its own disk and serving as a single-user computer. It has often been found that in such an environment, at any one time, a significant proportion of the workstations are idle, resulting in the waste of large amounts of CPU time. Therefore, the idea of the workstation model is to interconnect all these workstations so that idle workstations may be used to process jobs of users who are logged onto other workstations and do not have sufficient processing power at their own workstation to get their jobs processed efficiently.
In this model, a user logs onto one of the workstations, called his or her home workstation, and submits jobs for execution. When the system finds that the user's workstation does not have sufficient processing power for executing the processes of the submitted jobs efficiently, it transfers one or more of the processes from the user's workstation to some other workstation that is idle and gets the processes executed there; finally, the result of execution is returned to the user's workstation. This model is not so simple to implement as it might appear at first sight, because several issues must be resolved. These issues are as follows:
How does the system find an idle workstation?
How is a process transferred from one workstation to get it executed on another workstation?


What happens to a remote process if a user logs onto a workstation that was idle
until now and was being used to execute a process of another workstation?

Three commonly used approaches for handling the third issue are as follows:

The first approach is to allow the remote process to share the resources of the workstation along with the logged-on user's own processes. This method is easy to implement, but it defeats the main idea of workstations serving as personal computers, because if remote processes are allowed to execute simultaneously with the logged-on user's own processes, the logged-on user does not get his or her guaranteed response.
The second approach is to kill the remote process. The main drawbacks of this method are that all processing done for the remote process gets lost and the file system may be left in an inconsistent state, making this method unattractive.
The third approach is to migrate the remote process back to its home workstation, so that its execution can be continued there. This method is difficult to implement because it requires the system to support a pre-emptive process migration facility.
Example- The Sprite system and an experimental system developed at Xerox PARC.
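As an illustration of the first issue above, the sketch below shows one plausible, purely hypothetical policy for locating an idle workstation: pick the least-loaded machine with no logged-on user. The Workstation record, the load figures, and the 10% idleness threshold are all invented for illustration; real systems such as Sprite use richer criteria.

```python
from dataclasses import dataclass

@dataclass
class Workstation:
    name: str
    cpu_load: float       # fraction of CPU in use, 0.0 - 1.0 (invented metric)
    user_logged_on: bool

IDLE_THRESHOLD = 0.10     # assumed cutoff below which a machine counts as idle

def find_idle_workstation(stations):
    """Return the least-loaded workstation that counts as idle, or None."""
    idle = [w for w in stations
            if not w.user_logged_on and w.cpu_load < IDLE_THRESHOLD]
    return min(idle, key=lambda w: w.cpu_load, default=None)

stations = [Workstation("ws1", 0.70, True),
            Workstation("ws2", 0.03, False),
            Workstation("ws3", 0.08, False)]
print(find_idle_workstation(stations))   # ws2 is chosen: idle and least loaded
```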
3. Workstation-Server Model is a network of personal workstations, each with its own disk and a local file system. A workstation with its own local disk is usually called a diskful workstation, and a workstation without a local disk is called a diskless workstation. With the proliferation of high-speed networks, diskless workstations have become more popular in network environments than diskful workstations, making the workstation-server model more popular than the workstation model for building Distributed Computing Systems. As shown in fig., a Distributed Computing System based on the workstation-server model consists of a few minicomputers and several workstations interconnected by a communication network.
Advantages:
In general, it is much cheaper to use a few minicomputers equipped with large, fast disks that are accessed over the network than a large number of diskful workstations, with each workstation having a small, slow disk.
Diskless workstations are also preferred to diskful workstations from a system maintenance point of view. Backup and hardware maintenance are easier to perform with a few large disks than with many small disks scattered all over a building or campus. Furthermore, installing a new release of software is easier when the software is to be installed on a few file server machines than on every workstation.
In the workstation-server model, since all files are managed by the file servers, users have the flexibility to use any workstation and access the files in the same manner irrespective of which workstation the user is currently logged onto. Note that this is not true of the workstation model, in which each workstation has its local file system, because different mechanisms are needed to access local and remote files.
In the workstation-server model, the request-response protocol described above is mainly used to access the services of the server machines. Therefore, unlike the workstation model, this model does not need a process migration facility, which is difficult to implement.


A user has guaranteed response time because workstations are not used for executing
remote processes. However, the model does not utilize the processing capability of
idle workstations.
Example- The V-System.
4. Processor-Pool Model is based on the observation that most of the time a user does not need any computing power, but once in a while he or she may need a very large amount of computing power for a short time. Therefore, unlike the workstation-server model, in which a processor is allocated to each user, in the processor-pool model the processors are pooled together to be shared by the users as needed. The pool of processors consists of a large number of microcomputers and minicomputers attached to the network. Each processor in the pool has its own memory to load and run a system program or an application program of the distributed computing system.
As shown in figure, in the pure processor-pool model, the processors in the pool have no terminals attached directly to them, and users access the system from terminals that are attached to the network via special devices. These terminals are either small diskless workstations or graphic terminals, such as X terminals. A special server (the run server) manages and allocates the processors in the pool to different users on a demand basis. When a user submits a job for computation, an appropriate number of processors are temporarily assigned to his or her job by the run server. For example, if the user's computation job is the compilation of a program having n segments, in which each of the segments can be compiled independently to produce separate relocatable object files, n processors from the pool can be allocated to this job to compile all the n segments in parallel. When the computation is completed, the processors are returned to the pool for use by other users. In the processor-pool model there is no concept of a home machine. That is, a user does not log onto a particular machine but to the system as a whole. This is in contrast to the other models, in which each user has a home machine onto which he or she logs on and runs most of his or her programs there by default. As compared to the workstation-server model, the processor-pool model allows better utilization of the available processing power of a distributed computing system.
Examples- Amoeba and the Cambridge Distributed Computing System.
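The compilation example above can be sketched as follows, with a process pool standing in for the pool of processors and a placeholder compile_segment function standing in for a real compiler. This is a single-machine analogy of the run server's allocation, not an actual distributed implementation.

```python
from concurrent.futures import ProcessPoolExecutor

def compile_segment(segment):
    # Stand-in for independently compiling one segment into a
    # separate relocatable object file.
    return f"{segment}.o"

segments = [f"segment{i}" for i in range(1, 6)]   # a job with n = 5 segments

if __name__ == "__main__":
    # The pool plays the role of the shared processors: n workers are
    # "allocated" to the job and "returned" when the with-block exits.
    with ProcessPoolExecutor(max_workers=len(segments)) as pool:
        objects = list(pool.map(compile_segment, segments))
    print(objects)   # ['segment1.o', ..., 'segment5.o'], compiled in parallel
```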
5. Hybrid Model: To combine the advantages of the workstation-server and processor-pool models, a hybrid model may be used to build a distributed computing system. The hybrid model is based on the workstation-server model but with the addition of a pool of processors. The processors in the pool can be allocated dynamically for computations that are too large for workstations or that require several computers concurrently for efficient execution. This model gives guaranteed response to interactive jobs by allowing them to be processed on the local workstations of the users. However, it is more expensive to implement than the workstation-server and processor-pool models.

Advantages of Distributed Computing Systems:
1. Inherently Distributed Applications: Distributed computing systems come into existence in some very natural ways. Several applications are inherently distributed in nature and require a distributed computing system for their realization. For instance, in an employee database of a nationwide organization, the data pertaining to a particular employee are generated at the employee's branch office, and in addition to the global need to view the entire database, there is a local need for frequent and immediate access to locally generated data at each branch office.


Examples: A worldwide airline reservation system, a computerized banking system in which a customer can deposit/withdraw money from his or her account from any branch of the bank, and a factory automation system controlling robots and machines all along an assembly line.

2. Information Sharing among Distributed Users: Information can be easily and efficiently shared by users working at different nodes of the system. For example, a project can be performed by two or more users who are geographically far from each other but whose computers are part of the same distributed computing system. Although the users are geographically separated from each other, they can work in cooperation.

Example: By transferring the files of the project, logging onto each other's remote computers to run programs, and exchanging messages by electronic mail to coordinate the work.

3. Resource Sharing: Information is not the only thing that can be shared in a distributed computing system. Sharing of software resources such as software libraries and databases, as well as hardware resources such as printers, hard disks, and plotters, can also be done in a very effective way among all the computers and users of a single distributed computing system. For example, in a distributed computing system based on the workstation-server model, the workstations may have no disk or only a small disk (10-20 megabytes) for temporary storage, and access to permanent files on a large disk can be provided to all the workstations by a single file server.
4. Better Price-Performance Ratio: With the rapidly increasing power and falling price of microprocessors, combined with the increasing speed of communication networks, distributed computing systems potentially have a much better price-performance ratio than a single large centralized system. For example, we saw how a small number of CPUs in a distributed computing system based on the processor-pool model can be effectively used by a large number of users from inexpensive terminals, giving a fairly high price-performance ratio as compared to either a centralized time-sharing system or a personal computer.
5. Shorter Response Times and Higher Throughput: Distributed computing systems are expected to have better performance than single-processor centralized systems. The two most commonly used performance metrics are the response time and the throughput of user processes. The multiple processors of a distributed computing system can be utilized properly to provide shorter response times and higher throughput than a single-processor centralized system.
6. Higher Reliability: Reliability refers to the degree of tolerance against errors and component failures in a system. A reliable system prevents loss of information even in the event of component failures. The multiplicity of storage devices and processors in distributed computing systems allows the maintenance of multiple copies of critical information within the system and the execution of important computations redundantly to protect them against catastrophic failures.
7. Extensibility and Incremental Growth: A key property of distributed computing systems is that they are capable of incremental growth. That is, it is possible to gradually extend the power and functionality of a distributed computing system by simply adding additional resources to the system as and when the need arises.


8. Better Flexibility in Meeting Users' Needs: Different types of computers are usually suitable for performing different types of computations. For example, computers with ordinary power are suitable for ordinary data processing jobs, whereas high-performance computers are more suitable for complex mathematical computations.

Design Issues with Distributed Systems

Design issues that arise specifically from the distributed nature of the application:
Transparency
Communication
Performance & scalability
Heterogeneity
Openness
Reliability & fault tolerance
Security

Transparency
How to achieve the single system image?
How to "fool" everyone into thinking that the collection of machines is a "simple" computer?

Access transparency
- Local and remote resources are accessed using identical operations.

Location transparency
- Users cannot tell where hardware and software resources (CPUs, files, databases) are located; the name of the resource shouldn't encode its location.

Migration (mobility) transparency
- Resources should be free to move from one location to another without having their names changed.

Replication transparency
- The system is free to make additional copies of files and other resources (for purpose of
performance and/or reliability), without the users noticing.

Example: several copies of a file may exist; on a given request, the copy closest to the client is accessed.
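A toy sketch of this replica-selection idea, assuming invented server names and latencies: the client names only the file, and the system transparently picks the copy with the lowest latency.

```python
# Each file name maps to its replicas as (server, latency-in-ms) pairs.
# The servers and latencies here are purely illustrative.
replicas = {
    "report.txt": [("server-A", 120), ("server-B", 15), ("server-C", 60)],
}

def open_file(name):
    """Resolve a file name to the replica with the lowest latency."""
    server, _ = min(replicas[name], key=lambda r: r[1])
    return server        # the user never sees which copy was chosen

print(open_file("report.txt"))   # server-B, the closest copy
```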

Concurrency transparency
- The users will not notice the existence of other users in the system (even if they access the same
resources).

Failure transparency
- Applications should be able to complete their task despite failures occurring in certain
components of the system.

Performance transparency
- Load variation should not lead to performance degradation. This could be achieved by automatic
reconfiguration as response to changes of the load; it is difficult to achieve.


Communication
Components of a distributed system have to communicate in order to interact. This implies
support at two levels:
1. Networking infrastructure (interconnections & network software).
2. Appropriate communication primitives and models and their implementation:
Communication primitives:
- send
- receive
- remote procedure call (RPC)
communication models
- Client-server communication: implies a message exchange between two processes:
the process which requests a service and the one which provides it;
- Group multicast: the target of a message is a set of processes, which are members of a given group.
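The sketch below illustrates the send and receive primitives and the client-server model with a single request-response exchange over a local TCP socket. The port number and message contents are arbitrary choices for the example.

```python
import socket
import threading

# Set up the server socket first so the client cannot connect too early.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("localhost", 5000))
srv.listen(1)

def serve_one_request():
    # Server side: the process which provides the service.
    conn, _ = srv.accept()
    with conn:
        request = conn.recv(1024)              # receive primitive
        conn.sendall(b"reply to " + request)   # send primitive

threading.Thread(target=serve_one_request).start()

# Client side: the process which requests the service.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect(("localhost", 5000))
    cli.sendall(b"service request")            # send the request
    print(cli.recv(1024))                      # receive b'reply to service request'

srv.close()
```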

Performance and Scalability


Several factors influence the performance of a distributed system:
The performance of individual workstations.
The speed of the communication infrastructure.
Extent to which reliability (fault tolerance) is provided (replication and preservation of coherence
imply large overheads).
Flexibility in workload allocation: for example, idle processors (workstations) could be allocated automatically to a user's task.

Scalability
The system should remain efficient even with a significant increase in the number of users and
resources connected:
- cost of adding resources should be reasonable;
- Performance loss with increased number of users and resources should be controlled;
- Software resources should not run out (number of bits allocated to addresses, number of
entries in tables, etc.)

Heterogeneity
Distributed applications are typically heterogeneous:
- different hardware: mainframes, workstations, PCs, servers, etc.;
- different software: UNIX, MS Windows, IBM OS/2, real-time OSs, etc.;
- unconventional devices: teller machines, telephone switches, robots, manufacturing systems, etc.;
- diverse networks and protocols: Ethernet, FDDI, ATM, TCP/IP, Novell Netware, etc.

Openness
One of the important features of distributed systems is openness and flexibility:
- every service is equally accessible to every client (local or remote);
- it is easy to implement, install and debug new services;
- users can write and install their own services.
Key aspect of openness:
- Standard interfaces and protocols (like Internet communication protocols)
- Support of heterogeneity (by adequate middleware, like CORBA)


Reliability and Fault Tolerance


One of the main goals of building distributed systems is improvement of reliability.

Availability: If machines go down, the system should work with the reduced amount of resources. There should be a very small number of critical resources, where critical resources are those which have to be up in order for the distributed system to work.
Key pieces of hardware and software (critical resources) should be replicated: if one of them fails, another one takes over (redundancy). Data in the system must not be lost, and copies stored redundantly on different servers must be kept consistent.
The more copies kept, the better the availability, but keeping consistency becomes more difficult.

Fault-tolerance is a main issue related to reliability: the system has to detect faults and act in a
reasonable way:
mask the fault: continue to work with possibly reduced performance but without loss of data/
information.
fail gracefully: react to the fault in a predictable way and possibly stop functioning for a short period, but without loss of data/information.
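The following toy sketch shows both ideas at once, with in-memory objects standing in for real servers: a write is applied to every replica to keep the copies consistent, and a read masks the failure of one replica by falling back to the next.

```python
class Server:
    """A stand-in for a remote server holding one copy of the data."""
    def __init__(self, name):
        self.name, self.store, self.up = name, {}, True
    def write(self, key, value):
        if self.up:
            self.store[key] = value
    def read(self, key):
        if not self.up:
            raise ConnectionError(self.name + " is down")
        return self.store[key]

replicas = [Server("s1"), Server("s2"), Server("s3")]

def replicated_write(key, value):
    for s in replicas:                 # keep all copies consistent
        s.write(key, value)

def fault_tolerant_read(key):
    for s in replicas:                 # mask the fault: try the next replica
        try:
            return s.read(key)
        except ConnectionError:
            continue
    raise RuntimeError("all replicas failed")

replicated_write("balance", 100)
replicas[0].up = False                 # simulate a crash of s1
print(fault_tolerant_read("balance"))  # 100, served by s2 despite the fault
```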

Security
Security of information resources:
1. Confidentiality
Protection against disclosure to unauthorised persons.
2. Integrity
Protection against alteration and corruption
3. Availability
Keep the resource accessible; the appropriate use of resources by different users has to be guaranteed. Distributed systems should allow communication between programs/users/resources on different computers, but this free access comes with security risks.

Prepared by: Nandini Sharma (CSE Deptt.)
