Abstract - The objective of this paper is to offer alternatives for process scheduling and to compare their performance. The resulting environments were evaluated by running parallel medical image processing applications and comparing their execution times.

I. INTRODUCTION

Obtaining load indexes is not straightforward, given the dynamic and non-deterministic nature of the applications executed; good load indexes, however, can improve the global performance observed in the system. More traffic than expected can be generated, hindering the performance of the system as a whole; this depends on how frequently load information is collected to calculate the index.

A distributed parallel computational system is usually composed of processing elements that are heterogeneous in both configuration and architecture. This characteristic affects the results of the load indexes and consequently the load scheduling, which may or may not improve the final performance of the platform as a whole.

As there was no existing literature on the subject, Branco [3] proposed a performance index that provides information on the workload and operating state of each system element involved in the process, considering different kinds of heterogeneity.

This information shows the importance of process scheduling in distributed parallel environments, as well as the obstacles to achieving good performance.

Thus, the objective of this article is to offer alternatives for process scheduling based on load index collection and to compare their performance. To that end, the message-passing library had to be instrumented so that its process scheduling could rely on load and performance indexes. The resulting environments were evaluated by running parallel medical image processing applications and comparing their execution times.

II. PERFORMANCE AND LOAD INDEXES

A load index is a metric that quantifies the workload submitted to a system resource [9][15]; its objective is to indicate whether the resource analyzed is idle, moderately loaded, or overloaded [3].

A resource is idle when its workload is nonexistent or very small, and it is then able to receive load. A resource is moderately loaded when its workload is regular, and it can still receive more work. When a resource receives a workload that exceeds a given limit, it is overloaded: it must transfer load and cannot receive any.

A load index can be defined as a non-negative integer variable that is zero when the resource is idle and whose value grows as the load on the resource increases [9][15][3].

The quality of the load index is related to the performance of process scheduling, since the index is used by the scheduling algorithm. Therefore, the way system load information is collected and used, and how often it is collected, influence the efficiency of load balancing.

In distributed parallel computational systems, it is of utmost importance to take communication into account. If the frequency with which loads are observed in the processing elements does not match the real need, undesired results are likely.

If the load information is updated too often, the interconnection network is overloaded and the overall performance of the system is reduced; if it is updated too rarely, load balancing works with stale information and is carried out incorrectly, which also hinders performance. Therefore, the purpose of the load index is to predict future load behavior from current and past behavior and to provide this information to the scheduler so that it can carry out load balancing [9][25][3].

Many kinds of load index can be mentioned, among them the instantaneous CPU queue length, the average CPU queue length over a given period, CPU utilization, response time, normalized response time, amount of available memory, context switch rate, and others [9][25][3]. On the whole, they can be distributed into groups based on the size of the
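
As a minimal sketch of the three states described in this section, the Python fragment below classifies a non-negative integer load index. The function names and the two thresholds (`idle_max`, `overload_limit`) are hypothetical: the text only states that an idle resource has a nonexistent or very small workload and that an overloaded one exceeds a given limit.

```python
from enum import Enum


class LoadState(Enum):
    IDLE = "idle"
    MODERATE = "moderate"
    OVERLOADED = "overloaded"


def classify_load(load_index, idle_max=1, overload_limit=8):
    """Map a non-negative integer load index to one of the three states.

    idle_max and overload_limit are illustrative thresholds, not values
    taken from the paper.
    """
    if not isinstance(load_index, int) or load_index < 0:
        raise ValueError("load index must be a non-negative integer")
    if load_index <= idle_max:
        return LoadState.IDLE          # zero or very small workload
    if load_index > overload_limit:
        return LoadState.OVERLOADED    # workload went over the limit
    return LoadState.MODERATE          # regular workload


def can_receive_load(load_index, idle_max=1, overload_limit=8):
    """Idle and moderately loaded resources may receive more load."""
    state = classify_load(load_index, idle_max, overload_limit)
    return state is not LoadState.OVERLOADED
```

A scheduler built on this idea would send new work only to resources for which `can_receive_load` is true; how the index itself is computed is outside the scope of the sketch.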
V. IMPLEMENTATION

Distributed systems must provide an improvement in performance, mainly with respect to execution time, resource utilization, communication over the interconnection network, and mixed hardware and software.

Given the advantages of mobile agents in distributed environments, it is believed that using them to collect load and performance indexes can reduce network traffic and, consequently, improve the overall performance of the system in question.

To evaluate the performance of process scheduling that uses load indexes collected by mobile agents in distributed computational systems, the following is necessary: to change the scheduling carried out by the message-passing library JPVM (Java Parallel Virtual Machine) [8], whose default is round-robin; to develop a mobile agent (PAgent, Performance Agent) using the Java programming language in the μCode (muCode) environment [22]; to develop the resource monitor (PRM, Performance Resource Monitor), which has been incorporated into the library; and to compare the resulting scheduling with round-robin scheduling [19].

When the modified JPVM daemon is started on each machine that composes the system, the PRM is started. The PRMs are responsible for collecting the raw load of the processor, memory, disk, and network through the dstat command (a Linux package installed to evaluate the performance of machine resources). They are also responsible for calculating the performance and load indexes of these resources, with the help of standardized benchmarks to deal with environment heterogeneity, if any.

When a machine is added to the distributed parallel virtual environment (as shown in Figure 1), a message is sent requesting that the μCode server software be started on a given port. After all the machines have been added, the PAgent is started by the master machine and travels through the environment. The PAgent is thus responsible for collecting the performance and load indexes calculated by the PRMs and for deciding which machines can take part as load receivers in the load distribution process.

Figure 1: Distributed Parallel Environments using JPVM PAgent

Parallel medical image processing requires hardware resources in both quality and quantity, i.e., strong computational power. Data loss and loss of precision are not acceptable, and processing must complete in a short time [1]. This has made medical image processing an interesting application for evaluating the real gain achieved when it is parallelized [18].

The basic requirement of a parallel image processing system is an infrastructure composed essentially of adequate data distribution and communication functions that can efficiently execute any image algorithm [1].

Smoothing and edge detection techniques were chosen in order to observe the cost/benefit ratio of applying distributed parallel computing technology, comparing scheduling with and without the help of mobile agents.

Smoothing techniques are used to reduce noise and to remove small details from an image before segmentation [12][20]. A common smoothing technique is the median filter, which replaces the value of a given pixel with the median value of its neighborhood. The median is the central value obtained when the pixels of the neighborhood are sorted.

Edge detection is another example of an algorithm that uses neighborhood-based operations. It is mainly used when one wants to know the size and shape of the objects represented in the image [21].

The edge detection process uses the concept of the gradient to enhance points that differ strongly from their neighborhood. For an image f(x,y), where x and y are the spatial coordinates (line and column), the gradient of f at coordinates (x, y) is defined by equation 2 and its magnitude by equation 3.

(2)  $\nabla f = \begin{bmatrix} \dfrac{\partial f}{\partial x} \\ \dfrac{\partial f}{\partial y} \end{bmatrix}$
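
The two neighborhood operations described in this section can be illustrated with a short sketch in plain Python (chosen here for brevity; the paper's own implementation is not shown). It assumes a grayscale image stored as a list of rows, a 3x3 neighborhood, and the usual 3x3 Sobel kernels; the function names and the treatment of border pixels are assumptions of the sketch. `median_filter` replaces each interior pixel with the median of its neighborhood, and `sobel_magnitude` approximates the two partial derivatives of the gradient and combines them as the square root of the sum of their squares.

```python
import math
from statistics import median

# Sobel kernels. Following the text, x is the line (row) coordinate and y the
# column coordinate, so KERNEL_DX differences rows and KERNEL_DY columns.
KERNEL_DX = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
KERNEL_DY = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))


def neighborhood(image, row, col):
    """Return the 3x3 block of pixels centered on (row, col)."""
    return [image[r][col - 1:col + 2] for r in range(row - 1, row + 2)]


def median_filter(image):
    """Replace each interior pixel by the median of its 3x3 neighborhood.

    Border pixels are copied unchanged (an assumption of this sketch).
    """
    result = [list(line) for line in image]
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            block = neighborhood(image, row, col)
            result[row][col] = median(p for line in block for p in line)
    return result


def sobel_magnitude(image):
    """Approximate the gradient magnitude at each interior pixel.

    Border pixels are left at 0.0 (an assumption of this sketch).
    """
    result = [[0.0] * len(image[0]) for _ in image]
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            block = neighborhood(image, row, col)
            dx = sum(KERNEL_DX[i][j] * block[i][j]
                     for i in range(3) for j in range(3))
            dy = sum(KERNEL_DY[i][j] * block[i][j]
                     for i in range(3) for j in range(3))
            result[row][col] = math.hypot(dx, dy)
    return result
```

On an image containing a single bright pixel, `median_filter` removes the outlier, while `sobel_magnitude` returns zero on flat regions and large values across an intensity step.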

(3)  $\nabla f = \operatorname{mag}(\nabla f) = \left[ \left( \dfrac{\partial f}{\partial x} \right)^{2} + \left( \dfrac{\partial f}{\partial y} \right)^{2} \right]^{1/2}$

The gradient was implemented using Sobel operators [21], which compute the approximate absolute value of the gradient at each point of the image analyzed; areas of high spatial frequency yield high values and correspond to the edges of the image [12].

These algorithms were selected according to the resources they use. The median filter is memory-intensive, because it uses a sub-vector, while the Sobel operator is processor-intensive, because it carries out successive multiplications.

The algorithms are parallelized by dividing each image into parts that are distributed among the processors, so that many parts of the same image are processed at the same time. The master host divides and distributes the data to the other hosts; each host processes the part it receives and sends it back to the master, which recomposes the image.

VI. RESULTS

Two distributed parallel environments were set up, one homogeneous and one heterogeneous. The homogeneous environment consisted of seven identical IBM Pentium 4 2.7 GHz machines with 512 MB of memory. The heterogeneous environment consisted of seven machines: a Pentium 4 2.66 GHz with 1024 MB of memory, three IBM Pentium 4 2.7 GHz machines with 512 MB of memory, two Pentium 4 1600 MHz machines with 256 MB of memory each, and a Pentium 3 733 MHz with 256 MB of memory. All the machines were connected by a 100 Mbps Ethernet network and ran Linux kernel 2.6.

To analyze the performance of the environments, the execution times of the parallel medical image processing algorithms were compared. First, the standard JPVM message-passing library was used, and then the JPVM

the image sort and a larger neighborhood indicates a better processing result.

Each algorithm was executed 15 times, and the average execution times were obtained for the following scenarios:
(I) Homogeneous environment using JPVM;
(II) Homogeneous environment using instrumented JPVM (with process scheduling using mobile agents);
(III) Heterogeneous environment using JPVM; and
(IV) Heterogeneous environment using instrumented JPVM (with process scheduling using mobile agents).

Figures 2 and 3 show that the average execution times of the Sobel operator and median filter algorithms were close in the homogeneous environment, whether using the standard JPVM or the instrumented JPVM.

[Plot: execution time (ms) versus processes (M 4P, M 7P, M 11P, M 14P), homogeneous environment; series: JPVM - Standard and JPVM - PAgent.]
Figure 2: Comparison of average execution time of the median filters in scenarios I and II.

[Plot: execution time (ms), homogeneous environment; series: JPVM - Standard and JPVM - PAgent.]