

Automatic Counting of Leukocytes in Thick Blood Smears

Article · October 2014


4 authors, including:

Djimeli TSAJIO Alain Bernard Daniel Tchiotsop


Université de Dschang IUT FOTSO Victor-University of Dschang

Réné Tchinda
Université de Dschang


All content following this page was uploaded by Daniel Tchiotsop on 17 March 2016.



International Journal of Computer Application Issue 4, Volume 5 (Sep - Oct 2014)
Available online on http://www.rspublication.com/ijca/ijca_index.htm ISSN: 2250-1797

Automatic Counting of Leukocytes in Thick Blood Smears

A. Djimeli#1, D. Tchiotsop#2, P. Nagabhushan#3, and R. Tchinda#4

#1 (Laboratory of Electronic and Signal Processing, Laboratoire d'Automatique et d'Informatique
Appliquée (LAIA), Department of Physics, Faculty of Sciences, UDs, Cameroon)
#2 (Department of Electrical Engineering, Laboratoire d'Automatique et d'Informatique
Appliquée (LAIA), IUT-FV, University of Dschang (UDs), Cameroon)
#3 (Department of Studies in Computer Science, University of Mysore, Mysore-570006, India)
#4 (Laboratoire d'Ingénierie des Systèmes Industriels et de l'Environnement (LISIE), IUT-FV,
University of Dschang (UDs), Cameroon)

ABSTRACT
Microscopic examination of blood samples is one of the diagnostic procedures available for
recognizing various diseases. The World Health Organization recommends blood smears as the
gold standard for malaria diagnosis in endemic countries. Manual blood smear analysis is
widespread, but its procedures are dated and time consuming. Computerized image processing
methods can overcome these difficulties. Many automatic blood smear analysis methods are
presented in the literature, but only a few of them focus on thick blood smears. Because
good-quality images were used for the techniques presented, leukocyte counting was trivial.
However, the quality of thick smear images obtained from laboratories is sometimes below the
World Health Organization standard. These noisy thick smear images are complex and are most
often the result of bad staining conditions. The accuracy reported in the literature may differ
if such noisy thick blood films are used for testing, because global thresholding is no longer
applicable. In this work we present an algorithm for counting leukocytes, including in complex
thick blood images, since parasitemia estimation based on the leukocyte count is more reliable.
Curve approaches for image segmentation are presented. A curve approach for local threshold
selection and clump splitting was used for leukocyte segmentation. First-order statistical
features were used in hierarchical clustering for class separation, and a back-propagation
neural network of 35 neurons was trained for class selection. Leukocyte recognition depends on
the local properties of the image. Our resized images were 1500 × 1250 pixels, and experimental
results show a counting accuracy of 97.98%. The execution time is about 60 seconds in automatic
mode and less than 45 seconds in semi-automatic mode, where the operator is invited to click on
one leukocyte. Experimental errors occurred mainly on badly stained images.

Key words: Curve, Leukocytes count, Segmentation, Statistical pattern recognition, Thick
blood smears

Corresponding Author: DJIMELI TSAJIO Alain B.

R S. Publication, rspublicationhouse@gmail.com Page 176



1. INTRODUCTION
Background
Microscopic examination of blood samples is one of the diagnostic procedures available for
recognizing various diseases. For the diagnosis of malaria, the World Health Organization
recommends thin and thick blood smear analysis as the gold standard in endemic countries
[1,2,3]. For a fixed number of microscope fields, thick films allow the microscopist to examine
a larger number of red cells for the presence of parasites, so low parasitemias are more
readily identified on thick films. For these reasons the thick blood smear is mainly used to
estimate parasitemia, and the thin blood smear is mainly used to identify species. There
are two methods for parasitemia estimation in thick blood smears. In the first, the thick film
is examined using the ×100 oil immersion objective and leukocytes are counted on a tally
counter until 100 are recorded; a reference value of 8000 leukocytes per microliter of blood is
assumed. The second, a scoring scheme using one to four plus signs (+), is also in use [4]. It
is less precise than the first, and it can be used when leukocytes are difficult to count in
certain complex images. Blood smear analysis is widespread, but its procedure is dated and time
consuming, because a skilled laboratory technician must use a microscope to visualize and count
the elements needed to estimate the parasitemia. An automated procedure for the analysis and
processing of microscopic images is therefore needed to support medical activity. Many works
in the literature [5,6,7] focus on computer vision for thin blood smear analysis, but few
address thick films. Works on the automatic analysis of thick blood smears [8,9,10] used smear
images acquired in good conditions, and the leukocyte count was not presented because it seemed
trivial. However, the accuracy obtained may differ when complex images are used.
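To make the role of the leukocyte count concrete, the following minimal sketch shows how a parasite density could be derived from it. The text only gives the reference value of 8000 leukocytes per microliter; the proportional rule used here is the usual microscopy convention and is an assumption, as are the function and parameter names.

```python
def parasites_per_microliter(parasites_counted, leukocytes_counted,
                             leukocytes_per_ul=8000):
    """Estimate parasite density from counts made on the same fields.

    Assumes the usual proportional rule:
    density = parasites counted * reference leukocytes per microliter
              / leukocytes counted.
    The default of 8000 is the reference value quoted in the text.
    """
    if leukocytes_counted <= 0:
        raise ValueError("at least one leukocyte must be counted")
    return parasites_counted * leukocytes_per_ul / leukocytes_counted


# Hypothetical counts: 50 parasites seen while 100 leukocytes were tallied.
density = parasites_per_microliter(50, 100)
```

An error in the leukocyte count propagates directly to the estimate, which is why a reliable automatic count matters.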

Complexity of thick blood films for leucocytes counting


The distribution of leukocytes in the thick film varies with the thickness and the part of the
film examined. Blood smears that are too thin or too thick cause problems, such as an apparent
increase in white blood cells. Correct staining and timing, following the rules of [11], are
therefore critical for a good-quality outcome, as staining depends on the density of elements.
Macroscopically, a properly prepared and stained blood film should be pink in the thin part of
the slide and show a purple/blue tint in the thicker parts [12]. Microscopically, the red blood
cells should be pink and the nuclei of the white blood cells more purple than blue. There
should be no or minimal precipitation, and staining should be uniform throughout the slide
[12]. However, smear images obtained from laboratories do not always match this description.
Some slide images are too dark or too pale, too blue or too pink. The background of other slide
images appears blue and sometimes contains stain deposits. Such images are complex, and the
threshold techniques presented in the literature for parasitemia estimation in thick blood
smears are not applicable to them. The reasons for the bad quality of thick smear images are
given in [12]. In practice, in some of these images leukocytes are located on a dim background
and it becomes difficult to differentiate the region of interest (ROI) from the background.
Other images have artifacts dimmer than the ROI. Deposits with the same colours as the ROI, and
sometimes within the same size range, can also be observed. The dim background and artifacts
shown in fig 1 are examples of such elements. Fig 2 illustrates images where leukocytes cannot
be recognized because of bad staining conditions. That is why the final decision of a
computer-aided medical system must be taken by the operator.
The elements of interest for leukocyte counting are the leukocyte nuclei. Unlike in thin blood
films, they cannot be distinguished by standard colour in thick blood films, because the
elements are lysed. There are five types of leukocyte nuclei with different shapes and sizes in
thick blood smears: lymphocytes, monocytes and basophils appear as round or nearly round
fragments, while neutrophils, eosinophils and sometimes basophils appear as several nearby
fragments.

Fig 1: Artifact in thick blood smear.

(a) (b)

Fig 2: Thick film with unrecognized leukocytes

While analyzing images from our database, we observed that clumps formed by touching or
overlapping leukocytes were uncommon in well-stained images but frequent in complex images, so
they must be taken into account. For all these reasons, the thresholding techniques used in the
literature for leukocyte counting are not applicable to complex images.

Objective
Counting leukocytes is useful for parasitemia estimation. The aim of this work is to provide
an algorithm for counting leukocytes in thick smear images, including complex thick blood
films. Our approach includes the location, segmentation, classification and counting of
leukocytes. The rest of the paper is organized as follows: section 2 presents the model and
describes the choice of signal and image processing tools, section 3 presents the experimental
analyses of the proposed methods, and section 4 concludes the paper.

2. MATERIALS AND METHODS


2.1 Choice of signal and image processing tools
2.1.1 Colour sub-band selection
A digital image can be considered as a large array of sampled points. Each point, also called
a pixel, has a particular quantized brightness. In a binary image, each pixel is either black
or white. For a grayscale image, there are $2^B$ different possible intensities, where B (most
of the time equal to 8) is the number of bits of the displayed image. Any motorized digital
camera mounted on the microscope can be used for image acquisition; the system presented in
[13] is an example of such an advanced system. The input image obtained is a colour image, in
which each pixel has a particular colour. In the RGB colour system, the colour of a pixel
describes the amount of red, green and blue in it. A colour image can be seen as three
grayscale images: the red sub-band, the green sub-band and the blue sub-band. Because it is
easier to work on one image than on a combination of three, a grayscale image is mainly used.
In our case, all the visual information needed for leukocyte segmentation is found in gray
level images. Moreover, the red sub-band seems to preserve leukocyte shapes better than the
other sub-bands or their combination, and was chosen as the gray level working sub-band for
image segmentation. However, all colour sub-bands are needed for feature extraction in order to
discriminate leukocytes from other elements.

2.1.2 Contrast stretching


In this work, contrast stretching is computed globally to locate ROIs and locally to segment
elements. Image enhancement is a technique that seeks to improve the visual appearance of an
image, or to convert it to a form better suited for analysis by a human or a machine [14].
Contrast stretching is an image enhancement technique that attempts to improve the contrast of
an image by stretching the range of intensity values it contains to span a desired range of
values. It changes the distribution and range of the digital numbers assigned to the pixels of
an image in order to accentuate details that are difficult to observe. Let image I1, with
minimum and maximum intensity values (Min1, Max1), be stretched to image I2, with minimum and
maximum intensity values (Min2, Max2). Contrast stretching consists of finding the linear
mapping that takes the intensities of image 1 to those of image 2 [14]. We have chosen
Min2 = 0 and Max2 = 255, because we want dark regions to become darker and bright regions to
become brighter.
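As an illustration, the linear mapping described above can be sketched as follows. This is a minimal sketch, not the authors' implementation: the image is assumed to be a list of rows of integer gray levels, and the function name is hypothetical.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly map gray levels from (Min1, Max1) to (out_min, out_max).

    image: list of rows, each a list of integer gray levels (assumed layout).
    """
    flat = [p for row in image for p in row]
    in_min, in_max = min(flat), max(flat)
    if in_max == in_min:
        # A constant image has no contrast to stretch.
        return [[out_min for _ in row] for row in image]
    span_in = in_max - in_min
    span_out = out_max - out_min
    return [[round((p - in_min) * span_out / span_in + out_min) for p in row]
            for row in image]


# Gray levels 50..200 are spread over the full range 0..255.
stretched = stretch_contrast([[50, 100], [150, 200]])
```

Applying the same function to a small window around a region gives the local variant mentioned in the text.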

2.1.3 Image thresholding

Image segmentation is the first step in image analysis for further pattern recognition. It is a
critical and essential component of image analysis. It is considered one of the most difficult
tasks in image processing and determines the quality of the final result of the analysis. Image
segmentation is the process of partitioning the image into mutually exclusive components or
regions, where the intersection between any two regions is empty. Segmentation is a partition
of the image I into connected subsets $(I_1, I_2, \ldots, I_n)$ such that
$\bigcup_{i=1}^{n} I_i = I$ with $I_i \cap I_j = \emptyset$ $(i \neq j)$.
As a result of segmentation, the digital image is binary. Typically the two colours used for a
binary image are black for the background and white for the foreground.
Thresholding is a popular technique in image segmentation, with simpler computation than the
other techniques [15]. When an image consists only of an object and a background, the best way
to pick a threshold is to search the histogram, assuming it is bimodal, and find a gray level
that separates the two peaks. Global thresholding and local thresholding are two different
approaches to thresholding. Local thresholding was preferred because complex images are not
bimodal, and we made the assumption that the ROI is locally bimodal. We do not use the
histogram for automatic threshold selection in this work; another approach is proposed, based
on the brightness variations along the lines and rows of the ROI. Global thresholding is also
performed to locate the ROIs.


2.1.4 Curve analyses


Signal theory is used in most applications of daily life, and particularly in image processing.
A signal is a function that conveys information about the behavior or attributes of a
phenomenon that varies in time or in space. A gray level image is a two-dimensional array of
columns and rows. An individual column/row of the gray level image gives the variation of the
brightness value (amplitude) with the space coordinate (pixel position). Because our analysis
is based on the graph representation of the column/row signal, we refer to the method as the
curve approach. Curve profiles are widely used for handwriting and optical character
recognition. The projection of the bounding boxes onto the horizontal or vertical line collects
significant information along the projection direction [16,17,18].

Curve smoothing
The variation of gray level values along a line/row of the image may contain sharp, isolated
discontinuities of very short spatial extent, which must be eliminated. The mean filter is used
for curve smoothing as it is simple, intuitive and easy to implement [19]. Mean filtering
consists of replacing each value in the signal with the average of its neighbors, including
itself.
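A minimal sketch of mean filtering on a one-dimensional line signal is given below. The window radius of one point on each side and the truncation of the window at the signal ends are assumptions made for illustration; the function name is hypothetical.

```python
def mean_filter(signal, radius=1):
    """Replace each value by the mean of its neighbors, including itself.

    radius=1 gives a 3-point window (an assumption); the window is
    truncated at both ends of the signal.
    """
    n = len(signal)
    smoothed = []
    for j in range(n):
        window = signal[max(0, j - radius):min(n, j + radius + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed


# An isolated spike of height 9 is spread over its neighbors and flattened.
smoothed = mean_filter([0, 0, 9, 0, 0])
```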

Curve approaches for leukocytes segmentation

(a) (b) (c)

Fig 3: (a) Signal of one line of a thick smear image; (b) red points represent the points
retained for threshold value 45, with loss of leukocyte L2; (c) red points represent the points
retained, with leukocytes L3, L4 and L5 belonging to the same region.

Let the signal S(i, j) represent the brightness along line i. Thresholding S(i, j) with the
threshold value t consists of classifying each point as a pixel of interest or not according to
whether S(i, j) > t. In fig 3 the points of interest are the local minimums of S(i, j),
representing leukocytes L1, L2, L3 and L4. In a global view of the signal, leukocytes L3 and L4
appear as neighboring regions, but when examined locally they are not close. A practical
problem occurs when both the object and the background span a broad range of gray level values.
In these cases the histogram is no longer bimodal and a global threshold algorithm sometimes
gives bad results: thresholding at too high a level results in a loss of information (value
t = 45), while thresholding at low levels can give rise to objectionable background clutter
(value t = 60). Local iterative thresholding can be a solution for this segmentation task if
local criteria for leukocyte location are defined.


In the image, the threshold value corresponds to the set of points lying between two regions
of different intensity. In the transform domain, these points are high-frequency components,
known as contours. Edge segmentation techniques consist of detecting object boundaries using
edge detection algorithms such as Roberts, Sobel, Prewitt, Marr-Hildreth, Canny, Zero Crossing
and Gaussian [20,21,22]. Fig 4 shows the edge points detected on S(i, j). The Prewitt and Sobel
edge detectors (a and b) detect only one contour pixel. These edge detectors give non-closed
contours and are therefore not recommended for the segmentation of the entire image. The Canny
edge detector (c) detects many more edge points than are needed for segmentation. Moreover, if
contours were to be detected with prior knowledge of the image, leukocyte L5 and the second
part of leukocyte L2 would be over-segmented. For these reasons, edge detection techniques are
not realistic for the automatic segmentation of leukocytes in thick films using the curve
approach.

(a) (b) (c)


Fig 4: Red colour shows only one point of contour detected for Sobel and Prewitt edge detector
(a) and (b). (c) red colour shows more than 15 points of contour detected.

A closed contour can be obtained using an active contour [23]. An active contour, or snake, is
a set of pairs of points in a signal that are allowed to change location until a predefined
condition is met. It can be used to segment an object by treating every line of the image as a
signal. Let the initial set of points be the minimums of the signal. These points recursively
change position to minimize the energy function until the predefined condition is met. The red
dotted points in Fig 5 are such points for depth = 13. The snake is a promising technique for
segmentation using the curve approach, provided that local criteria for leukocyte location are
defined.

Fig 5: Good approach for leukocytes segmentation. Red points are potential candidate points for
depth=13

Curve growing and curve splitting are promising techniques within the curve approach for
leukocyte segmentation. Curve growing can be seen as a segmentation technique in which an
initial point is iteratively merged with its neighbors according to similarity constraints
[24,25,26]. For leukocyte segmentation, the initial points can be taken as the signal
minimums. The process iteratively merges points until the predefined condition is met. The
difficulties with this technique are its computational expense and the difficulty of finding
good minimums. The concept of curve splitting is to break the signal into a set of disjoint
groups of points that are internally coherent. This technique can be seen as the opposite of
curve growing.

Curve approach for local threshold selection


The first derivative of the line/row signal is the rate of change of the gray value amplitude
g with the space coordinate x, and is defined as dg/dx. This derivative is interpreted as the
slope of the tangent to the signal at each point. As the x-interval between adjacent points is
constant, the digital first derivative is computed as:
$$\frac{dg}{dx} = \frac{Y_{j+1} - Y_{j-1}}{2\,\Delta X} \qquad (1)$$
where $\Delta X = 1$ is the difference between the spatial coordinate X values of adjacent data
points and $Y_j$ is the gray level value of the current point j.
Fig 6 shows, in blue, the gray level variation of a line signal crossing one leukocyte (a) and
the background (b), and in green the corresponding slope. The points of the signal located at
the minimum and maximum of the slope curve are the points being sought, called contour points.
Leukocyte nuclei can be segmented directly by taking the points between the contour points as
points of interest, or by using the gray level value of the contour points as the local
threshold value. It can also be observed in fig 6 (b) that the slope of the background signal
is nearly flat; the amplitude of the slope can therefore help to validate the signal, since a
signal with a flat slope belongs to the background.
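The slope computation and the search for contour points can be sketched as follows. This is a minimal sketch under stated assumptions: the signal is a list of gray levels crossing a single dark nucleus on a brighter background, ΔX is 1, and the function names are hypothetical.

```python
def central_slope(signal, dx=1):
    """Central difference (Y[j+1] - Y[j-1]) / (2 * dx) for j = 1 .. n-2.

    Element k of the result is the slope at signal index k + 1.
    """
    return [(signal[j + 1] - signal[j - 1]) / (2 * dx)
            for j in range(1, len(signal) - 1)]


def contour_points(signal):
    """Return the signal indices of the steepest descent and steepest rise.

    Assumes one dark object on a brighter background, so the slope minimum
    marks the entry into the nucleus and the slope maximum marks the exit.
    """
    s = central_slope(signal)
    entry = s.index(min(s)) + 1
    exit_ = s.index(max(s)) + 1
    return entry, exit_


# Hypothetical line signal crossing one nucleus (dark dip in the middle).
line = [100, 100, 98, 80, 40, 30, 30, 42, 85, 99, 100, 100]
entry, exit_ = contour_points(line)
interior = list(range(entry, exit_ + 1))  # points between the contour points
```

The gray levels `line[entry]` and `line[exit_]` are the values that could serve as a local threshold for this line.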

(a) region of leukocytes nucleus. (b) Region of background.

Fig 6: Signal of gray level value variation in the image in blue colour and the corresponding slope
in green colour.

Curve approach for clump splitting


A clump of leukocytes is a group of touching or overlapping individual leukocytes. Clump
splitting aims to separate touching or overlapping leukocytes into individual elements, to
avoid rejecting the clump, which would lower the counting rate. Columns/rows of the smear image
that cross a clump contain two or more nearby minimums of interest. The aim of clump splitting
using the curve approach is to isolate each minimum of interest in the curve and to locate the
contour points. Because points whose gray level value is less than the mean do not belong to
the points of interest, they are first replaced by the maximum value. Valleys that separate two
minimums of interest are also detected, and their values in the signal are replaced by the
maximum of the signal. Valleys of the signal are points where the slope amplitude passes from a
positive value to a negative value. Points that delimit a leukocyte peak in the curve are
called frontier points, and the points near a valley are some of these points. Contour points,
which are the points being sought and correspond to the minimum and maximum of the slope curve,
are then easily detected. Fig 7 details the clump splitting process: (a) shows the smoothed
signal and the corresponding slope; (b) shows the valley detected and the frontier points
marked; (c) shows the contour points detected and the points of interest marked in red.

(a) (b) (c)

Fig 7: Detail for clump splitting process. (a) Smoothed signal in blue and corresponding slope
in green. (b) Valley delimitation and point frontier detection. (c) Contour points and Interesting
points detection.
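The valley detection step of clump splitting can be sketched in one dimension as follows. This is only an illustration under assumptions: a valley is located where the central-difference slope passes from positive to negative, the crossing is placed midway between the last positive and the first negative slope sample, and the function names are hypothetical.

```python
def central_slope(signal):
    """Central difference; element k is the slope at signal index k + 1."""
    return [(signal[j + 1] - signal[j - 1]) / 2
            for j in range(1, len(signal) - 1)]


def separating_points(signal):
    """Signal indices where the slope passes from positive to negative."""
    s = central_slope(signal)
    found = []
    last_positive = None
    for k, value in enumerate(s):
        if value > 0:
            last_positive = k
        elif value < 0 and last_positive is not None:
            # Midpoint of the sign change, shifted back to signal indexing.
            found.append((last_positive + k) // 2 + 1)
            last_positive = None
    return found


def mask_separators(signal):
    """Replace each separating point by the maximum of the signal."""
    masked = list(signal)
    top = max(signal)
    for j in separating_points(signal):
        masked[j] = top
    return masked


# Two touching nuclei (minima at indices 2 and 6) separated at index 4.
clump = [90, 40, 20, 40, 70, 40, 20, 40, 90]
split = mask_separators(clump)
```

After masking, each minimum is isolated by full-height points on both sides, so its contour points can be searched independently.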

2.1.5 Binary morphological operators


In this work, morphological operators are combined with binary images or their complements
to obtain non-overlapping segmented elements. In mathematical morphology theory, images
are treated as sets, and morphological transformations derived from Minkowski addition
and subtraction are defined to extract features from images [27]. Morphology relates to the
structure or form of objects. Morphological operations work with two images: the original
binary image and a structuring element. Each structuring element has a shape, which can be
thought of as a parameter of the operation. The most fundamental morphological operations are
dilation and erosion. Based on these, two compound operations named opening and closing are
defined. Dilation and erosion are performed by laying the structuring element B on the image A
and sliding it across the image in a manner similar to convolution. If the origin of the
structuring element coincides with a 'white' pixel in the image, there is no change. If the
origin of the structuring element coincides with a 'black' pixel in the image, the dilation of
A by B (A ⊕ B) makes black all the pixels of the image covered by the structuring element, and
the erosion of A by B (A ⊖ B) changes the pixel from 'black' to 'white' if at least one of the
'black' pixels of the structuring element falls over a white pixel in the image. The opening
of A by B, ((A ⊖ B) ⊕ B), is an erosion followed by a dilation. It can be used to eliminate all
pixels in regions that are too small to contain the structuring element. The closing,
((A ⊕ B) ⊖ B), consists of a dilation followed by an erosion and can be used to fill holes and
small gaps.
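The four operations can be sketched on small binary images as follows. This is a minimal sketch, not the authors' code: foreground pixels are coded 1 and background 0 (an assumption, whichever colour is displayed), the structuring element is assumed square, odd-sized and symmetric, and pixels outside the image are treated as background.

```python
def dilate(img, se):
    """A ⊕ B: every foreground pixel stamps the structuring element."""
    h, w, r = len(img), len(img[0]), len(se) // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if se[dy + r][dx + r] and 0 <= yy < h and 0 <= xx < w:
                        out[yy][xx] = 1
    return out


def erode(img, se):
    """A ⊖ B: keep a pixel only if the whole element fits in the foreground."""
    h, w, r = len(img), len(img[0]), len(se) // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            fits = True
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    if not se[dy + r][dx + r]:
                        continue
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    if not inside or not img[yy][xx]:
                        fits = False
            out[y][x] = 1 if fits else 0
    return out


def opening(img, se):
    """Erosion followed by dilation: removes elements smaller than se."""
    return dilate(erode(img, se), se)


def closing(img, se):
    """Dilation followed by erosion: fills holes and small gaps."""
    return erode(dilate(img, se), se)


SE = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
# A 3x3 element plus one isolated noise pixel in the top-right corner.
noisy = [[0, 0, 0, 0, 1],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
cleaned = opening(noisy, SE)  # the isolated pixel is removed
```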

2.1.6 First order statistic features


First order statistic features are used in classifiers to measure how similar the elements in
the image are. The histogram is a graph showing the number of pixels in an image at each
intensity value found in that image. For a grayscale image there are $2^B$ different possible
intensities, where B (most of the time 8) is the number of bits of the displayed image. The
probability distribution, P(a), is the probability that a brightness chosen from the region is
less than or equal to a given brightness value a.
The median brightness of a given region is the gray level value K that gives:
$$P(K) = 0.5 \qquad (2)$$
The mean is defined as:
$$m_a = \sum_{i=B_{min}}^{B_{max}} i\,P(i) \qquad (3)$$
The standard deviation is defined as:
$$s_a^2 = \frac{\sum_{i=B_{min}}^{B_{max}} \left(i - P(i)\right)^2}{B_{max} - B_{min}} \qquad (4)$$
The coefficient of variation $C_v$ of a given region is computed as:
$$C_v = \frac{s_a \times 100}{m_a} \qquad (5)$$
where $s_a$ is the standard deviation and $m_a$ is the mean.
The entropy is defined as:
$$Ent = \sum_{i=B_{min}}^{B_{max}} P(i)\,\log_2 P(i) \qquad (6)$$
The granulometry area Are is the number of pixels of the region.
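The features listed above can be computed from the gray levels of a region as in the sketch below. It is an illustration under stated assumptions, not the authors' implementation: it uses the textbook definitions (per-level probabilities from the normalized histogram, variance taken around the mean, entropy with a leading minus sign), which may differ in normalization and sign from the printed equations, and all names are hypothetical.

```python
import math
from collections import Counter


def first_order_stats(pixels):
    """First order statistics of a region given as a flat list of gray levels."""
    n = len(pixels)
    # Normalized histogram: probability of each gray level in the region.
    prob = {level: count / n for level, count in Counter(pixels).items()}
    mean = sum(level * p for level, p in prob.items())
    variance = sum((level - mean) ** 2 * p for level, p in prob.items())
    std = math.sqrt(variance)
    # Lower median: smallest gray level K whose cumulative probability >= 0.5.
    median = sorted(pixels)[(n - 1) // 2]
    return {
        "median": median,
        "mean": mean,
        "std": std,
        "cv": 100 * std / mean if mean else float("inf"),
        "entropy": -sum(p * math.log2(p) for p in prob.values()),
        "area": n,
    }


# Hypothetical region with two equally frequent gray levels.
stats = first_order_stats([10, 10, 20, 20])
```

Computing the median per colour sub-band yields the values used later as classification features.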

2.1.7 Hierarchical clustering operations

Hierarchical clustering operations are needed to separate the elements in the image into two
classes. ROIs can be modeled as points $P_j$ in the Cartesian plane. The clusters depend on the
choice of the distance function and of the distance value D > 0. The distance function measures
the distance between points, and the linkage function measures the distance between clusters.
The Euclidean distance was preferred in this work because, among the Manhattan, Maximum and
Mahalanobis distances, it is the most used in biological data analysis. The Euclidean distance
between point $P_i$ and point $P_j$ is computed as:
$$d_{i,j} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \qquad (7)$$
where $P_j = (x_j, y_j)$, $1 \le j \le M$, and M is the number of ROIs.
The distance between two clusters containing points $P_j$ and $P_k$ is computed as:
$$d_{C_{j,k}} = \min d_{j,k} \qquad (8)$$
Two points belong to the same cluster if the distance between them is less than D, and two
clusters are combined into the same cluster if the distance between them is less than
D [28,29].
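With the minimum-distance linkage of equation (8), cutting the hierarchy at distance D amounts to grouping points that are connected by chains of pairwise distances below D. The sketch below illustrates this with a small union-find structure; it is an assumption-laden illustration with hypothetical names, not the authors' implementation.

```python
import math


def single_linkage_clusters(points, max_dist):
    """Group points whose minimum-linkage distance is below max_dist.

    points: list of (x, y) tuples. Returns sorted lists of point indices.
    """
    parent = list(range(len(points)))

    def find(a):
        # Follow parent links to the cluster representative.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < max_dist:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical feature points forming two well separated groups.
pts = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
clusters = single_linkage_clusters(pts, max_dist=1.5)
```

Points 0 and 2 are two units apart, yet they end up in the same cluster because point 1 links them, which is the chaining behaviour expected from a minimum linkage.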

2.2 Model building


The global architecture of our system starts with the construction of the over-segmented
image, which aims to locate probable ROIs. Global artifact removal follows the over-segmented
image computation. For each region of the over-segmented image, local threshold selection is
performed to obtain the locally segmented image. A signal validation step is needed for both
the local threshold selection and the clump splitting steps. Bigger segmented regions are
split, and the recognition and counting pre-processing is performed. Class separation using
hierarchical clustering yields two classes. Class selection using a neural network is then
performed to test the validity of the classes. The system architecture is presented in fig 8.

Fig 8: System architecture. Blocks of the diagram: over-segmented image; artifacts removal;
local threshold selection and segmentation (with curve validation); clump splitting (with curve
validation); recognition and counting pre-processing; class separation; class selection and
count.

2.2.1 Over-segmented image

For the leukocyte counting task, only leukocyte nuclei are objects of interest, but thick film
images also contain other elements such as clear background, dim background, artifacts and the
black region outside the oil immersion field. Regions outside the oil immersion field are
replaced by the mean value, so that the ROIs remain the darker regions. Because it has been
observed that some non-leukocyte nuclei deteriorate rapidly when the threshold value decreases,
the image is thresholded successively using the threshold values 0.7 × L, 0.5 × L and 0.2 × L,
where L is the gray value of the black region outside the oil immersion field after contrast
stretching. After each thresholding, elements smaller than 0.5 times the reference leukocyte
size are deleted, and elements smaller than twice the reference leukocyte size are captured in
the over-segmented image. Only elements coming out of the bigger regions of the previous
thresholding step are considered. Bigger elements are captured in the over-segmented image
after the last thresholding step.

2.2.2 Artifacts removal

Some big stain deposits, artifacts and parasites are darker than the elements of interest. They
are seen as clumps of leukocytes, and their presence in the processing chain increases the
execution time. It has been observed that the mean gray value of leukocytes in the blue
sub-band is close to the mean value of the background in the same sub-band, while the mean gray
value of some stain deposits, artifacts and parasites is smaller. The minimum difference
obtained experimentally was 39, and we add a margin of 4 in our algorithm.

2.2.3 Curve validation and local threshold selection

The first computation on the curve is three successive filtering operations with the mean
filter, where each signal value is replaced by the mean of its four/two neighbors including
itself. After smoothing, the slope is computed and the difference between its maximum and
minimum values is taken. The signal is valid if this difference is greater than or equal to 10.
This value was determined experimentally by comparing values obtained from low-contrast
leukocytes. When the signal is valid, the contour points correspond to the minimum and maximum
of the slope. The amplitudes of these points in the signal are candidates for the local
threshold computation. When n signals are valid, the local threshold value is the mean of the
n/2 smallest candidates. The flowchart of the local threshold selection process is given in
fig 9.
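The validation and threshold selection rules can be sketched as follows. Several choices here are assumptions made only for illustration and are not confirmed by the text: a 3-point mean window, one candidate per valid signal (the lower of the two contour amplitudes), and at least one candidate kept when n/2 rounds down to zero. All function names are hypothetical.

```python
def smooth(signal, passes=3, radius=1):
    """Apply the mean filter several times (3-point window is an assumption)."""
    out = list(signal)
    for _ in range(passes):
        out = [sum(out[max(0, j - radius):j + radius + 1]) /
               len(out[max(0, j - radius):j + radius + 1])
               for j in range(len(out))]
    return out


def slope(signal):
    """Central difference; element k is the slope at signal index k + 1."""
    return [(signal[j + 1] - signal[j - 1]) / 2
            for j in range(1, len(signal) - 1)]


def threshold_candidate(signal, min_range=10):
    """Return a candidate gray level, or None if the signal is not valid."""
    smoothed = smooth(signal)
    s = slope(smoothed)
    if not s or max(s) - min(s) < min_range:
        return None  # flat slope: the signal belongs to the background
    j_min = s.index(min(s)) + 1
    j_max = s.index(max(s)) + 1
    # Assumption: keep the lower of the two contour amplitudes.
    return min(smoothed[j_min], smoothed[j_max])


def local_threshold(signals):
    """Mean of the n/2 smallest candidates among the n valid signals."""
    candidates = sorted(c for c in map(threshold_candidate, signals)
                        if c is not None)
    if not candidates:
        return None
    kept = candidates[:max(1, len(candidates) // 2)]
    return sum(kept) / len(kept)


# One hypothetical line crossing a nucleus and one pure background line.
nucleus_line = [100, 100, 98, 80, 40, 30, 30, 42, 85, 99, 100, 100]
background_line = [100] * 12
t = local_threshold([nucleus_line, background_line])
```

The background line is rejected by the validity test, so only the line crossing the nucleus contributes to the threshold.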

2.2.4 Clump splitting

Bigger elements, with a size greater than 1.3 times the average leukocyte size, are either
artifacts or groups of touching or overlapping individual leukocytes. Clump splitting starts by
placing the corresponding clump area in an enlarged bounding box. This small image is then
stretched. For valid line signals, valleys are detected if any, frontier points are marked and
contour points are extracted. Points between contour points are replaced by 1 and the others
are replaced by 0, to obtain the logical line corresponding to the valid signal. The
concatenation of the logical lines gives the horizontal split image. The same procedure is
applied to the rows to obtain the vertical image. A point-by-point multiplication of the
horizontal image with the vertical image, followed by a filling operation, gives the split
image. If a split region is bigger than the normal leukocyte size, it is assumed to be an
artifact. The flowchart of the clump splitting is given in fig 10.

2.2.5 Recognition and counting pre-processing

The aim of the recognition and counting pre-processing is to prepare the data so that
recognition and counting become easier. It is a filtering operation based on the size of
elements. This operation is performed on the segmented image to remove elements that are too
small, and on the over-segmented image to make sure that each over-segmented element
corresponds to exactly one leukocyte or non-leukocyte.
Elements resulting from the segmentation and clump splitting processes are selected according
to their size. An over-segmented element is valid if the elements it yields after segmentation
and clump splitting are bigger than 0.65 times and less than 1.5 times the average leukocyte
size. When the segmented elements coming from a single over-segmented region are bigger
than or equal to 1.5 times the average leukocyte size, this over-segmented region is first
replaced by the segmented elements bigger than 0.65 times the average leukocyte size. The
remaining elements are enlarged using a mathematical morphology operation and moved to the
over-segmented image.
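
The size test used above can be written compactly. The following Python sketch only covers the comparison with the 0.65 and 1.5 factors quoted in the text; element sizes are assumed to be pixel areas, and the function name `filter_by_size` is hypothetical.

```python
def filter_by_size(areas, avg_size, low=0.65, high=1.5):
    """Split element areas into kept and rejected lists by relative size."""
    kept, rejected = [], []
    for area in areas:
        # Keep elements strictly between low and high times the average size.
        if low * avg_size < area < high * avg_size:
            kept.append(area)
        else:
            rejected.append(area)
    return kept, rejected
```

With an average leukocyte size of 100 pixels, elements of 70 and 149 pixels are kept, while elements of 50, 150 and 200 pixels are set aside for the further handling described in the text.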

2.2.6 Class separation using hierarchical clustering

The Euclidean distance was used in a hierarchical clustering dendrogram for the recognition
process. The first-order statistical feature sets F1, F2 and F3 were used:
$F_1 = (K_R - K_G,\ K_R,\ C_v)$ (9)
$F_2 = (K_R - K_G,\ K_R,\ K_B)$ (10)
$F_3 = (K_R - K_G,\ Ent)$ (11)
where $K_i$ is the element's median value in sub-band $i$, $C_v$ is the coefficient of
variation, and $Ent$ is the entropy of the region.
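
As an illustration of equations (9) to (11), the sketch below computes the three feature sets for one region in Python. It rests on assumptions that the text does not settle: the coefficient of variation is taken as standard deviation over mean, the entropy as the Shannon entropy of the gray-level histogram, and both are computed on a gray-level version of the region. The function names and the `gray` argument are hypothetical.

```python
from collections import Counter
from math import log2
from statistics import mean, median, pstdev


def entropy(values):
    """Shannon entropy (in bits) of the histogram of the values."""
    total = len(values)
    counts = Counter(values)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def feature_sets(red, green, blue, gray):
    """Return F1, F2 and F3 for one region given as lists of pixel values."""
    k_r, k_g, k_b = median(red), median(green), median(blue)
    c_v = pstdev(gray) / mean(gray)  # assumed: std / mean of gray levels
    ent = entropy(gray)              # assumed: entropy of gray levels
    return {
        "F1": (k_r - k_g, k_r, c_v),
        "F2": (k_r - k_g, k_r, k_b),
        "F3": (k_r - k_g, ent),
    }
```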


  [α, β] ← Coordinates of the minimum of the intersection of Ir with the region C;
  IrC ← Square window image of Ir where (α, β) is the middle;
  (αC, βC) ← Middle of IrC;
  IrC ← Stretched IrC;
  S1 ← Get diagonal signal 1;
  S2 ← Get diagonal signal 2;
  S3 ← Get line signal S(αC, :);
  S4 ← Get line signal S(αC + 15, :);
  S5 ← Get line signal S(αC − 15, :);
  S6 ← Get row signal S(:, βC);
  S7 ← Get row signal S(:, βC + 15);
  S8 ← Get row signal S(:, βC − 15);
  V ← [S1, S2, S3, S4, S5, S6, S7, S8];
  C ← 0;
  T ← [];
  While C ≠ 8:
      C ← C + 1;
      S ← V(C);
      Sl ← Slope of S;
      If Val = 1:
          X1, X2 ← Coordinates of the minimum and maximum of Sl;
          L1 ← [S(X1), S(X2)];
          T ← [T, L1];
  If T = ∅:
      Remove region C from Ir;
  Else:
      Lam ← Mean of the smaller half of the elements in T;
  End

Fig 9: Flow chart of the local threshold selection process.

  Input image: Ir;
  Over-segmented image: OverS;
  Image of the region to split: K1;
  Average leukocyte size: Siz;
  Enlarge region K1;
  Get coordinates of the bounding box;
  IrC ← Ir(coordinates);
  [L1, L2] ← Size of IrC;
  Hor ← Binary zeros image with same size as IrC;
  Ver ← Hor;
  C = 0;
  While C ≠ L1:
      C ← C + 1;
      S ← Get line signal IrC(C, :);
      S ← Mean filtering of S;
      Sl ← Slope of S;
      If Val = 1:
          Put values less than the mean to maximum;
          Search for valleys and put their values to maximum;
          Get frontier points;
          Get contour points;
          Compute binary line signal;
          Hor(C, :) ← Binary line signal;
  C = 0;
  While C ≠ L2:
      C ← C + 1;
      S ← Get transposed row signal IrC(:, C);
      S ← Mean filtering of S;
      Sl ← Slope of S;
      If Val = 1:
          Put values less than the mean to maximum;
          Search for valleys and put their values to maximum;
          Get frontier points;
          Get contour points;
          Compute binary line signal;
          Ver(:, C) ← Transposed binary line signal;
  Split image ← Ver .× Hor;
  Apply morphological operation to split image;
  Remove elements less than 0.18 × Siz;
  Move elements less than 1.3 × Siz to replace the split region in OverS;
  End

Fig 10: Flow chart of the clump splitting process.


We used the dendrogram function of MATLAB with the default distance D and the Euclidean
distance. For each set of features, we obtain two classes of elements. The mean feature value of
each class is computed, and the median value in the red sub-band helps us choose the right
class, as leukocyte regions are expected to be darker. The classes obtained from the different
feature sets were complementary. A logical AND between the binary vectors marking the
positions of the selected elements for each set of features gives the positions of the classified
elements.
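
The class separation described above can be sketched in Python as follows. This is not the MATLAB code used in the paper: single linkage is assumed for the hierarchical clustering (the text does not state the linkage), the tree is simply cut at two clusters, and the function names are hypothetical. The darker class is taken as the one with the smaller mean of the red medians, and the masks of the feature sets are combined with a logical AND.

```python
from math import dist
from statistics import mean


def two_classes(points):
    """Single-linkage agglomerative clustering stopped at two clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 2:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters


def darker_class_mask(points, k_r):
    """Boolean mask of the class whose mean red median is the smallest."""
    clusters = two_classes(points)
    darker = min(clusters, key=lambda c: mean(k_r[i] for i in c))
    return [i in darker for i in range(len(points))]


def combine(masks):
    """Logical AND of the masks obtained from the different feature sets."""
    return [all(flags) for flags in zip(*masks)]
```

An element is retained only if every feature set places it in the darker class.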

2.2.7 Class selection using Neural Network

An Artificial Neural Network, or simply Neural Network, is a computational model inspired by
biological neural networks. Input neurons, which are not interconnected, are arranged to
receive the inputs. In a multilayer Neural Network, there are one or more hidden layers of
interconnected neurons. The result of the network is collected at the output layer. For class
selection, we define a three-layer back-propagation neural network. The 51 data samples used
for training and validation of the network were collected from two well-stained images and
four categories of complex images. A hidden layer of 35 neurons seems to be stable for the set
of features F4.
$F_4 = (K_R - K_G,\ K_R,\ K_B,\ C_v)$ (12)
The network has been trained to give an output value of 0 for a non-leukocyte element and 2
for a leukocyte element. For class selection, the network receives the mean value of the
non-leukocyte class obtained from hierarchical clustering in order to determine whether it is a
one-class or a two-class problem.

3. Experimental analysis
Fig 11 shows the details of the process. (a) shows the original image taken as an example. This
image has a clump in the upper left corner and no black artifact. (b) presents the gray scale
image in which the dim background outside the oil immersion field was replaced by the mean
value. (c) gives the over-segmented image. (d) is the segmented image after clump splitting.
For this example, no element was small enough to be rejected in the recognition and counting
pre-processing step. (e) shows the modified over-segmented image, in which every element
corresponds to one element of the segmented image. Image (f) shows the upper left region of
the modified over-segmented image, where the association of elements in the clump can be
observed. It can also be observed that one nucleus situated at the boundary was automatically
removed, because our algorithm does not take boundary elements into account.
For the recognition process, elements are classified into two classes using hierarchical
clustering with the sets of features F1, F2 and F3. Table 1 shows features F1, F2 and F3 for the
example of fig 11 (a). Table 2 shows the corresponding classes. The mean of the median values
of each class helps to identify the better class for each set of features. These mean values are
given in the last two lines of table 1. In this example, class two is the better class for the three
sets of features because it has the smallest mean median value. The result of class separation is
shown in table 3. It was difficult to use the hierarchical clustering algorithm for the whole
recognition process because the distance of the good class alternates from smaller to bigger
depending on the image used. Moreover, a hierarchical clustering algorithm will give two
classes even in the case of a one-class problem.
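
The choice of the better class can be checked directly on the published values. The short Python sketch below uses the red median column of table 1 and the classes of F1 from table 2, computes the mean red median of each class, and keeps the class with the smaller mean. The variable and function names are hypothetical.

```python
from statistics import mean

# Red median K_R of the 18 elements (table 1) and their classes for F1 (table 2).
k_r = [70, 34, 62, 72, 35, 42, 28, 35, 49, 52, 89, 63, 88, 54, 58, 53, 39, 73]
classes_f1 = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 2]


def class_means(values, classes):
    """Mean of the values of each class label."""
    return {label: mean(v for v, c in zip(values, classes) if c == label)
            for label in sorted(set(classes))}


means = class_means(k_r, classes_f1)
better_class = min(means, key=means.get)  # the darker class has the smaller mean
```

The two means are 88.5 for class 1 and 51.1875 for class 2, in agreement with the last two lines of table 1, so class 2 is selected.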


(a) Original image. (b) Gray scale image in which the black background was replaced by the
mean. (c) Over-segmented image. (d) Segmented image. (e) Modified over-segmented image.
(f) Upper left corner of the modified over-segmented image.

Fig 11: Details of the segmentation process.


Table 1: Feature values of the segmented regions.

                 F1                            F2                            F3
N°               K_R-K_G   K_R       C_v       K_R-K_G   K_R       Ent       K_R-K_G   K_R       K_B
1                30.0000   70.0000   8.3333    30.0000             1.8858    30.0000   70.0000   193.0642
2                11.0000   34.0000   3.0303    11.0000             6.0202    11.0000   34.0000   190.5049
3                5.0000    62.0000   2.3810    5.0000              8.0865    5.0000    62.0000   202.2457
4                14.0000   72.0000   1.6667    14.0000             14.0332   14.0000   72.0000   201.9865
5                4.0000    35.0000   2.1739    4.0000              7.0328    4.0000    35.0000   195.2746
6                14.0000   42.0000   2.2222    14.0000             8.6845    14.0000   42.0000   195.0773
7                11.0000   28.0000   4.0000    11.0000             3.0362    11.0000   28.0000   183.1157
8                12.0000   35.0000   2.3810    12.0000             7.4815    12.0000   35.0000   189.5971
9                17.0000   49.0000   2.4390    17.0000             8.0100    17.0000   49.0000   195.5218
10               18.0000   52.0000   2.2727    18.0000             8.2994    18.0000   52.0000   195.6497
11               30.0000   89.0000   5.2632    30.0000             3.4713    30.0000   89.0000   209.2000
12               13.0000   63.0000   3.5714    13.0000             4.8387    13.0000   63.0000   202.5665
13               30.0000   88.0000   4.7619    30.0000             3.3256    30.0000   88.0000   209.0637
14               -1.0000   54.0000   2.7027    -1.0000             7.6735    -1.0000   54.0000   200.7141
15               11.0000   58.0000   1.4085    11.0000             12.9580   11.0000   58.0000   204.7921
16               5.0000    53.0000   2.5641    5.0000              5.3080    5.0000    53.0000   202.9596
17               10.0000   39.0000   1.7241    10.0000             8.5671    10.0000   39.0000   192.9501
18               26.0000   73.0000   3.8462    26.0000             3.6657    26.0000   73.0000   209.3169

Mean of class 1  30.0000   88.5000   5.0125    29.0000   80.0000   3.0871    30.0000   70.0000   193.0642
Mean of class 2  12.5000   51.1875   2.9198    10.2857   48.2857   7.8592    13.5294   54.4706   198.8551

Table 2: Classes obtained.

N°              1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18
Classes of F1   2  2  2  2  2  2  2  2  2  2  1  2  1  2  2  2  2  2
Classes of F2   1  2  2  2  2  2  2  2  2  2  1  2  1  2  2  2  2  1
Classes of F3   1  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2

The neural network was trained using four badly stained images and two well-stained images
with the feature set F4. The data obtained were also used to validate the network, and 35
neurons in the hidden layer seem to be stable. This trained network gives a poor leukocyte
counting rate because there are cases where leukocyte feature values in one image are close to
non-leukocyte values in other images. For example, all leukocytes of fig 12 are seen as
non-leukocytes when their distances are compared with the distances in the previous example.
The statistical parameters of one of them were added as element number 19 to the previous
example, and the dendrogram of fig 13 was drawn. This dendrogram shows how far the added
element is from the others. We conclude that leukocyte classification depends on local
properties of the image. However, the Neural Network gives a good classification for the 20
complex images of our database when the mean values of class 1 and class 2 are used as input.
For the example presented, we obtained 0.4182 and 1.8563 for class one and class two
respectively. The neural network response shows that class 1 is a non-leukocyte class because
the score obtained (0.4182 < 1) is less than the value expected.
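
The decision applied to the network response can be illustrated with the two scores quoted above. This Python sketch assumes a decision threshold of 1, the midpoint between the training targets 0 (non-leukocyte) and 2 (leukocyte), which matches the comparison 0.4182 < 1 made in the text. The function name `select_classes` is hypothetical.

```python
def select_classes(scores, threshold=1.0):
    """Label each class from the network response for its mean features."""
    return {label: "leukocyte" if score >= threshold else "non-leukocyte"
            for label, score in scores.items()}


# Network responses reported for the example of fig 11 (a).
labels = select_classes({1: 0.4182, 2: 1.8563})
```

Here class 1 is labeled as non-leukocyte and class 2 as leukocyte, so the example is treated as a two-class problem.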

Fig 12: Example of a thick smear image.

Fig 13: Dendrogram of the elements of fig 11 (a), where one leukocyte of fig 12 was added as
element number 19.

Our algorithm performs well for well-stained images, with a counting rate of 100%. The image
processing techniques presented in this work were also tested on 20 images selected among
badly stained images and images that cannot be segmented with a global threshold. The
number of leukocytes per image varied from one to eighteen. The presented methods achieve a
counting rate of 97.98%. In semi-automatic mode, the operator is invited to click on one
leukocyte to capture the average leukocyte size. In automatic mode, the algorithm searches for
and recognizes one leukocyte, gets the average size and starts the computation again, as the
leukocyte size is important for segmentation and counting. The execution time is less than one
minute for automatic mode and less than 40 minutes for semi-automatic mode. Our images
were resized to 1500 × 1250 pixels. A computer with a 2.5 GHz CPU, 2.99 GB of RAM and
MATLAB version 7.0.0.19920 (R14) was used for the computation and evaluation of the
presented techniques.


4. Conclusion
The aim of the presented work was the automatic counting of leukocytes in complex thick
blood smear images. Curve approaches for image segmentation were presented. Curve
approaches for local threshold selection and for clump splitting were proposed. First-order
statistical features were used for classification. A combination of a hierarchical clustering
dendrogram and a neural network was used for element classification because the
classification of leukocytes in some complex images seems to be a contextual problem. A
two-class hierarchical clustering dendrogram with Euclidean distance was used to classify
elements, and a back-propagation neural network with 35 hidden neurons was trained for class
selection. The system was tested using complex images, that is, images stained in bad
conditions or images that cannot be segmented with a global threshold. Our algorithm achieves
a counting rate of 100% for well-stained images and 97.98% for complex images. Bad results
were mainly obtained for badly stained images.

5. Acknowledgements
The authors would like to thank the Government of India, which made this part of the research
in India possible through the C V RAMAN International Fellowship for African Researchers
program. The authors would also like to thank the staff of the Department of Studies in
Computer Science of the University of Mysore for their kind support.

6. References
[1]World Health organization, World Malaria Report 2008, WHO/HTM/GMP/2008.1,
http://whqlibdoc.who.int/publications/2008/9789241563697_eng.pdf
[2]O. GAYE, G. MC LAUGHLIN, M. DIOUF, S. DIALLO, Etude Comparative de Cinq
Méthodes de Diagnostic Biologique du Paludisme: La Goutte Epaisse, la méthode QBC, la
Sonde à ADN, la PCR et le PARASIGHT F TEST, Médecine d'Afrique Noire,244-248.
[3]E. HERNANDEZ, J-J. DE PINA, R. FABRE, E. GARRABE PHENON, J-D. CAVALLO,
Evaluation du Test Optimal dans le Diagnostic des Accès palustres d‟Importation, Med. Trop.,
6, 2001, pp 153-157.
[4]B. M. Greenwood and J. R. M. Armstrong, Comparison of two simple methods for determining
malaria parasite density, Trans R Soc Trop Med Hyg, 85 (2), 1991, 186-188.
[5]Tek FB, Dempster AG, Kale I. Computer vision for microscopy diagnosis of malaria. Malar J,
8(153) 2009, pp 1-14.
[6] J. Somasekar, B. Eswara Reddy, E. Keshava Reddy and Ching-Hao Lai, An Image Processing
Approach for Accurate Detarnination of Parasitemia in Peripheral Blood Smear Images, IJCA
Special Issue on “Novel Aspects of Digital Imaging Applications” (DIA) (1), 2011, pp 23-27.
[7]K. M. Khatri, V.R. Ratnaparkhe, S.S. Agrawal, A.S. Bhalchandra, Image Processing Approach
for Malarial Parasite Identification, IJCA, Proc. GTETC-IP (1) 2014, pp 5-7.
[8]Miguel Angel Luengo-Oroz, Asier Arranz, John Frean, Crowdsourcing Malaria Parasite
Quantification: An Online Game for Analyzing Images of Infected Thick Blood Smears, J Med
Internet Res, 14(6), 2012.
[9]Frean JA. Reliable enumeration of malaria parasites in thick blood films using digital image
analysis. Malar J, 8(1), 2009.

R S. Publication, rspublicationhouse@gmail.com Page 192


International Journal of Computer Application Issue 4, Volume 5 (Sep - Oct 2014)
Available online on http://www.rspublication.com/ijca/ijca_index.htm ISSN: 2250-1797

[10] Prescott WR, Jordan RG, Grobusch MP, Chinchilli VM, Kleinschmidt I, Borovsky J, et al.
Performance of a malaria microscopy image analysis slide reading device, Malar J, 11(1),
2012.
[11] World Health Organization. Basic Malaria Microscopy: Part I. Learner's Guide. Geneva:
World Health Organization; 2010.
[12] Berend Houwen, Blood Film Preparation and Staining Procedures, Laboratory
Hematology, 6, pp 1-7.
[13] Kaewkamnerd et al., An automatic device for detection and classification of malaria
parasite species in thick blood film, BMC Bioinformatics 13(17), 2012, S18.
[14] Jaspreet Kaur, Amita Choudhary, Comparison of Several Contrast Stretching Techniques
on Acute Leukemia Images, International Journal of Engineering and Innovative Technology
Vol. 2(1), 2012, pp 332- 335.
[15] SANG UK LEE and SEOK YOON CHUNG, A Comparative Performance Study of
Several Global Thresholding Techniques for Segmentation, Computer Vision, Graphics, and
Image Processing, 52, 1990, pp 171-190.
[16] T. Achint, A. Rusu and V. Govindaraju, Synthetic Handwriting CAPTCHAs, Pattern
Recognition, Vol. 42, 2009, pp 3365-3373.
[17] Réjean Plamondon, and Claudio M. Privitera, The Segmentation of Cursive Handwriting:
An Approach Based on Off-Line Recovery of the Motor-Temporal Information, IEEE
Transactions on Image Processing, VOL. 8(1), 1999, pp 80-91.
[18] Mohammed Javed, P. Nagabhushan, B.B. Chaudhuri, Extraction of Projection Profile,
Run-Histogram and Entropy Features Straight from Run-Length Compressed Text-Documents,
In Pattern Recognition (ACPR), 2013, pp 813-817.
[19] Rupinderpal Kaur and Rajneet Kaur, Survey of De-noising Methods Using Filters and
Fast Wavelet Transform, International Journal of Advanced Research in Computer Science
and Software Engineering, Vol. 1(2), 2013, pp 133-136.
[20] Rashmi, Mukesh Kumar, and Rohini Saxena, Algorithm and Technique on Various Edge
Detection: A Survey, Signal & Image Processing : An International Journal (SIPIJ) Vol.4(3),
2013, pp 65-73.
[21] N. Senthilkumaran and Rajesh, Edge Detection Techniques for Image Segmentation – A
Survey of Soft Computing Approaches, Int. J. of Recent Trends In Engineering and
Technology, 1(2), 2009, pp 250-254.
[22] A. Djimeli, D. Tchiotsop, R. Tchinda, Analysis of Interest Points of Curvelet Coefficients
Contributions of Microscopic Images and Improvement of Edges, Signal & Image Processing :
An International Journal (SIPIJ) Vol. 4(2), 2013, pp 1-9.
[23] M. Kass, A.D.T. Witkin, “Snakes: active contour models”. International Journal of
Computer Vision, 1, 1988, pp 321-331.
[24] Seerha and Rajneet Kaur, Review on Recent Image Segmentation Techniques,
International Journal on Computer Science and Engineering, Vol. 5(2), 2013, pp 105-112.
[25] Nikhil R. Pal and Sankar K. Pal, A Review on Image Segmentation Techniques, Pattern
Recognition, Vol. 26(9), 1993, pp 1277-1294.
[26] Dzung L. Phamy, Chenyang Xu, Jerry L. Prince, A Survey of Current methods in Medical
Image segmentation, Annual Review of Biomedical Engineering, Vol. 2, 2000, pp 315- 337.
[27] Suman Rani, Deepti Bansal and Beant Kaur, Detection of Edges Using Mathematical
Morphological Operators, Open Transactions on Information Processing, Vol. 1(1), 2014, pp
17- 26.

R S. Publication, rspublicationhouse@gmail.com Page 193


International Journal of Computer Application Issue 4, Volume 5 (Sep - Oct 2014)
Available online on http://www.rspublication.com/ijca/ijca_index.htm ISSN: 2250-1797

[28] Fukunaga K., Introduction to statistical pattern recognition. (London: Academic Press
Limited; 1990).
[29] Andrew R. Webb, QineiQ Ltd., Malvem, Statistical Pattern Recognition, (Second Edition,
John Wiley & SoNS LTD, 2012).
