You are on page 1of 4

PERFORMANCE EVALUATION OF PEER-TO-PEER APPROACH TO

RESOURCE DISCOVERY FOR LARGE SCALE GRID ENVIRONMENTS

Sivadon Chaisiri1 and Putchong Uthayopas2

ABSTRACT: Grid environment usually consists of large number of shared resources distributed
across multiple locations. Hence, the problem of resource discovery can be profoundly complex due
to the size and complexity of grid system. Recently, peer-to-peer system has been widely used due to
its massive resource sharing environment. The merging of the capability of both systems has a
potential to create a very large scale sharing environment with rich functionality. In this work, the
architecture for building grid networks based on peer-to-peer approach is proposed. Furthermore,
some resource discovery algorithms based on the architecture are explored, and evaluated by building
a simulation. The results of this work will lead to the development of the effectively large scale grid
network in the future.

KEYWORDS: peer-to-peer computing, grid computing, resource discovery algorithm

1. INTRODUCTION

Grid environment consists of a large set of resources, which are distributed among geographical
locations (Foster 1999) (Foster 2001). In large set of shared resources, a basic problem is locating
resources when they are represented only by a set of desired attributes (i.e. a Unix computer with
physical memory of 256 MB) rather than a globally unique identifier (such Internet address).
Moreover, it is impossible to assign the unique name to resources that change their attributes
dynamically (e.g. CPU load and available memory). The complexity and size of grid environments
even make it more challenging to discover the preferred resources in an appropriate amount of time.

Recently, peer-to-peer system is emerging as a new sharing mechanism that widely used around the
world (Alfred 2003). Millions of resources are shared among massive number of computers that can
freely join or leave the system. The dynamism and scale of peer-to-peer system seems to be what is
needed by the grid system. Therefore, the convergence of both systems (peer-to-peer and grid) can
potentially create a large distributed system that is scalable and efficient. Nevertheless, the problem of
discovering resources under this system is still existed.

In this work, a proposed architecture for that combines the benefit of grid and peer-to-peer system are
presented. Furthermore, several resource discovery algorithms also are proposed for this architecture.
Then, the proposed architecture is simulated to evaluate the effectiveness of these proposed
algorithms.

2. PROPOSED ARCHITECTURE

In this work, resources (e.g. computers, printers, sensors etc.) that connected together using the
network are termed “peer”. There are 4 types of peer as follows.

1
Master Student ,High Performance Computing and Networking Center, Kasetsart University
2
Assistant Professor, Department of Computer Engineering, Kasetsart University
• Edge Peer is a peer that shares resources to other peers.
• Super Peer is a peer that keeps a set of locations of edge peers and known super peers.
Furthermore, it has a duty to propagate queries (for resources) received from any peers to its
known peers. We call peers that known by other as “neighbored peers”.
• VO Peer is a peer that forms a virtual organization. Moreover, it defines the policy for
managing peers inside the VO. To join a specific VO by any peers, they need to join via
registering with the VO peer that manages the VO.
• Global Peer is a peer that provides the naming directory of VO peers that need to advertise
their existence to other peers that need to join with.

Grid system proposes the term “virtual organization (VO)” for managing shared resources based on
the defined policy. To merge the capability of peer-to-peer to the grid system for building a
systematic sharing environment, we must provide the concept of virtual organization as well.
However, the conventional peer-to-peer systems do not provide the facility for defining global policy.
Although, some literature such as (Iamnitchi 2001) has also proposed the approach and structure for
merging both grid and peer-to-peer systems, the building of VO has not fully addressed. Our work is
different from since we propose the components for building the grid environment that also address
the problem of building a virtual organization as well. In addition, we show that the propose
algorithms can easily scale to as many as 100,000 peers while other works usually address the system
of only up to 5,000 peers. In the next section, several algorithms for resources discovery will be
proposed.

3. ALGORITHMS

Most of the resource discovery algorithms (included our proposed algorithms) for peer-to-peer
systems are based on the widely used flooding strategy. This strategy is to flood queries from one peer
to its neighbored peers. Firstly, a user submits queries from his peer (called a requester) to other
neighbored peers. When a neighbored peer receives the queries, it will process the queries over its
own local shared resource. If any results are found, it will send a response message (included
information for accessing to the resources) back to requester. The neighbored peer also forwards the
queries to its own neighbored peers. To prevent infinite loop of flowing queries in the network, every
message of query is handled by time to live (TTL) that specifies how many hops the message may
take. When a peer receives a query, it decrements the TTL, and if the TTL is greater than 0, it still
forwards the query to all its neighbored peers. Until the TTL is 0, the query is stopped forwarding.

The advantage of this flooding strategy lies on the fact that it does not require any centralized servers.
However, this strategy produces large amount of messages in order to propagate the queries to peers.
This it will degrade the overall performance of network severely.

In this work, 3 resource discovery algorithms are proposed as follows:

• Algorithm-1 (named “Query-Filter Algorithm) this algorithm is designed to improve the old
flooding algorithm that we labeled as Algorithm-0 (named “Dumb Flooding Algorithm”).
The query-filter algorithm improves the dumb flooding algorithm by reducing the bursting of
produced messages. Each query attaches the information about routed peers (the peers that
routed by queries). That is, when a query passes each peer, the query will attach information
to inform that it has already passed this peer. Consequently, before propagating queries to
others by a peer, it must check the routed peers from the information. If it found that the
query had to pass their neighbored peers, they will not propagate it to them; otherwise, they
continuously flood it to them. Furthermore, every peer also filters seen queries by checking
from the identifier number of every query. That is, each peer must store identifier numbers of
received queries. If it found the seen queries, it will not propagate them to others.

• Algorithm-2 (named “Backing Links Algorithm”) this algorithm improves on the query-
filter algorithm. The algorithm is done when a preferred resource is found, the response
message that included links (URLs for connecting peers) of all routed peers will be returned
to each others (included the requester who send the queries) for connecting them as
neighbored peers. We assumed that the requester should discover preferred resources faster if
it knows a lot of neighbor peers.

• Algorithm-3 (named “Backing Resource Links Algorithm”) this algorithm improves the
backing links algorithm by inserting additional information to the response message. The
additional information is about the method for accessing the preferred resource. We assumed
that if peers know about resources of many peers, the performance of resource discovery
should be speeded up.

4. SIMULATION AND EVALUATION

We build the simulated environment for evaluate the proposed resource discovery algorithms. The
simulation is implemented by HyperSim (Sugree 2003). We simulated 4 sharing environments
composed with different numbers of peers: 100, 1,000, 10,000, and 100,000 peers. In each
environment, every VO consists of one selected VO peer to manage a policy. Each peer shares only
one unique resource. Every peer is assumed to be connected via 1 Mbps network.

Table 1. Configuration of Simulated Environments

Environment Number of Number of Number of Number of


No. Peers VO Super peers/VO Peers/VO
1 100 1 1 100
2 1,000 2 2 500
3 10,000 4 4 2,500
4 100,000 8 8 12,500

3000000
Num b er of p ro d uced m essag es

We have 2 metrics for testing the algorithms. The first10000 metric is the number of produced messages
Average Response Time (ms.)

2500000
Algorithm-0 9000 Algorithm-0
originated from flooding queries to peers. For testing this8000
Algorithm-1 metric, we simulate situations that there is a
Algorithm-1
peer2000000 Algorithm-2
submits a query for finding out a preferred resource.
7000 Algorithm-2
We observe the number of produced
Algorithm-3 6000 Algorithm-3
messages
1500000 within 10 seconds by setting TTL = 12 hops.5000 From the figure 1., we test two algorithms
(dumb flooding
1000000
and query-filter algorithms) and redo this testing for 100 times on each environment.
4000
3000
Then we calculate the average of the number of produced messages. We found that the number of
500000 2000
peers will dramatically affect to the message production, 1000especially produced by the dumb flooding
0
algorithm. But our three proposed algorithms reduce tons of0 produced messages.
100 1000 10000 100000 100 1000 10000 100000
Number of peers Number of peers

Figure 1: Number of messages produced by the Figure 2: Average response time taken by the
algorithms simulated on each environment algorithms simulated on each environment
In each simulated environment, we test by submitting one query for every 5 seconds. We submit
totally 20 queries and record the average response time calculated from the 20 queries. We redo this
testing for 100 times for each environment. Then we recalculate the average response time of 100
values (recorded from each environment).

The results of the simulation are as shown in Figure 1 and 2. In Figure 1, the number of messages
generated from each is shown. It can be clearly seen that number of message for algorithm 1, 2, and 3
are much less than the basic dump flooding algorithm. But it is hard to distinguish between algorithm
1, 2 and 3 since the number of message is practically at the same level. In Figure 2, the average
response time for querying for each algorithm are shown. A few facts can be observed. First, each
improvement results in the faster query time. The query time of the backing resource links algorithm
is shortest and much better than basic dump flooding algorithm. Moreover, as the number of peers are
increased for 10 times, the average response time increases at a very slow rate. This means that
algorithm-3 has a potential to scale to a very large system size.

5. CONCLUSIONS AND FUTURE WORK

From the simulation, we found that our proposed resource discovery algorithms have a great potential
to be used on a large scale resources sharing environments. The simulation shows that the proposed
algorithms can reduce the number of messages produced effectively. This results in a much less
average response time. In the future work, we need to implement a real system based on our proposed
architecture. Many implementation issues such as buffering, added latency needed to address to create
a fast and scalable infrastructure.

6. REFERENCES

Alfred, W. L. (2003) “The Future of Peer-to-Peer Computing.” Communications of The ACM,


September 2003/Vol.46 No. 9, pages 57 – 61, 2003
Foster, I., Kesselman, C., and Tuecke, S. (2001) “The anatomy of the Grid: Enabling scalable virtual
organization.” The International Journal of Supercomputing Applications, 2001
Foster, I., Kesselman, C., and Tuecke, S. (1999) “The Grid: Blueprint for a New Computing
Infrastructure.” Morgan Kaufmann, 1999.
Iamnichi, A., and Foster, I. (2001) “On Fully Decentralized Resource Discovery in Grid
environments.” The International Workshop on Grid Computing, 2001
Sugree, P., Putchong, U., and Voratas, K. (2003 )“Fast Simulation Model for Grid Scheduling using
HyperSim”, In Proceedings of Winter Simulation Conference, New Orlean, 2003.

You might also like