
Available online at www.sciencedirect.com

ScienceDirect
Procedia Computer Science 00 (2015) 000–000
www.elsevier.com/locate/procedia

The 6th International Conference on Ambient Systems, Networks and Technologies (ANT 2015)

Hybrid Bio-MER: A Novel Hybrid Bio-Inspired Emotion Recognition
System Based on CSO-GA-PSO augmented with SVM and ANN

T V Vivek a,*, Guddeti Ram Mohana Reddy b,†

a,b Department of Information Technology, National Institute of Technology Karnataka, Surathkal, Mangalore, India

Abstract

The study of human emotions plays a vital role in social networking, e-commerce, education, entertainment and
related fields. Although many emotion recognition systems have been developed, none has been well optimized to
handle large datasets for both training and classification. In real-time applications, moreover, human-computer
interaction must be fast and accurate to be effective. With these goals in mind, we propose a novel hybrid
bio-inspired algorithm that combines Cat Swarm Optimization (CSO) [4] with Particle Swarm Optimization (PSO) [3]
and the Genetic Algorithm (GA) [5] for optimal feature extraction. We further introduce a two-stage classification
method that applies a Support Vector Machine (SVM) [6] followed by an Artificial Neural Network (ANN) [18] for
better accuracy. Experimental results on the CK+ dataset [1] demonstrate that the proposed Hybrid Bio-MER system,
using CSO-GA-PSO with SVM and ANN, improves accuracy by 10.5% over an MER system with SVM.
To test performance in a real-time application, we demonstrate an E-Learning [2] environment that evaluates the
facial characteristics of students in a teaching-learning setting. The classified basic emotions, namely Happy, Sad,
Disgust, Anger, Surprise and Neutral, are used as an indication of the student's interest in the subject.

© 2015 The Authors. Published by Elsevier B.V.


Peer-review under responsibility of the Conference Program Chairs.

Keywords: Emotion Recognition; Bioinspired Algorithms; Cat Swarm Optimization; Genetic Algorithm; Particle Swarm
Optimization; Support Vector Machines; Artificial Neural Networks; Hybrid Bioinspired Algorithms; E-Learning.

* Corresponding author.
E-mail address: vivekam101@gmail.com

† Corresponding author.
E-mail address: profgrmreddy@nitk.ac.in

1877-0509 © 2015 The Authors. Published by Elsevier B.V.


2 Author name / Procedia Computer Science00 (2015) 000–000

1. Introduction

Emotions detected from facial and audio features are key to human-computer interaction. As mentioned in [19],
an emotion is a short-term feeling, a mood is long-term, and a personality is very long-term; that is, a mood can be
defined as a persistent emotional state, and if a person exhibits a particular mood for a prolonged period, it is
considered his or her behavior. Emotion recognition from different media suffers severe performance degradation and
instability because of the large variation between the training database and the data seen in real-time
applications. The training stage has two main steps: feature extraction and classification. The feature set
extracted from multimedia is so large that it adversely affects the classification stage. Accuracy can be
improved by using better features for classification. We aim to solve this NP-hard feature subset selection problem
in emotion recognition through effective use of bio-inspired heuristic algorithms. Accuracy can be improved
further with better classification.
The main challenge is to choose the best heuristic algorithm to use in conjunction with the classifier for emotion
recognition. Bio-inspired approaches iteratively improve either a population of solutions (as in evolutionary
algorithms and swarm-intelligence-based algorithms) or a single solution (e.g. Tabu Search), and mostly
employ randomization and local search to solve a given optimization problem. The size and relevance of the features
in the feature vector play a vital role in an emotion recognition system. Time complexity is the other vital factor:
emotions change frequently, and effective interaction is not possible without timely classification.
To address these problems, we propose a novel Hybrid Bio-inspired Multimodal Emotion Recognition
System (Hybrid Bio-MER) based on the combination of bio-inspired algorithms, namely CSO [4], PSO [3] and
GA [5], with SVM [6] and ANN [18]. Another way to improve accuracy is to reduce the classifier error rate.
With that in mind, we apply a divide-and-conquer approach to classification. In the first layer,
a binary SVM [6] separates the high-energy classes (Surprise, Happy and Anger) from the
low-energy classes (Sad, Disgust and Neutral). In the second layer, an ANN [18] classifies the group
obtained from the first layer into its individual emotions.
The core idea behind the proposed method is to combine swarm intelligence algorithms with machine learning
algorithms in order to reduce the complexity of the emotion recognition system (Bio-MER). To that end,
we propose a novel hybrid bio-inspired algorithm using CSO with PSO and GA for optimal feature extraction.
To reduce the classification error rate, we introduce a multiple-classifier scheme. A binary SVM classifier at
the top divides the six basic emotions into two groups, one comprising anger, surprise and happiness and the
other comprising sadness, neutral and disgust. In the next stage, an ANN classifies each group into its
respective emotions.

The following are the key contributions of the proposed work:

• To the best of our knowledge, this is the first approach that uses a hybrid bio-inspired algorithm to
optimize the feature extraction step in combination with SVM and ANN classifiers, thus reducing the
computational complexity of training on feature vectors.
• To the best of our knowledge, this is the first paper on hybrid classifiers in which the proposed Bio-MER
system uses two layers of classifiers, i.e. SVM and ANN, in a tree form for emotion detection.
• To the best of our knowledge, this is the first approach in which a new E-Learning system is devised with a
hybrid bio-inspired algorithm and its efficiency tested in real time for six basic emotions.

The rest of this paper is organized as follows: Section 2 reviews related work; Section 3 presents the
proposed methodology; Section 4 discusses experimental results and analysis. Finally, concluding remarks
and future directions are given in Section 5.

2. Related Work

Several studies have addressed emotion identification using various media, classifiers and datasets.
Yan Zhang et al. [7] proposed a CSO variant called vibration mutation cat swarm optimization (VMCSO), which
aims to increase diversity in the global search. They compared results on benchmark functions and showed a
good improvement in accuracy. Yuanmei Wen and Yanyu Chen [8] used a support vector machine (SVM) model with
modified parallel cat swarm optimization (MPCSO) to forecast the next-day cooling load in a district cooling system
(DCS). Eigenvalues are extracted from the data, and Principal Component Analysis (PCA) is used to
reduce the complexity of the data sequence. Maysam Orouskhani et al. [9] proposed a new CSO algorithm,
Average-Inertia Weighted CSO (AICSO). They introduced an inertia weight as a new parameter in the position update
equation and used a new form of the velocity update equation in the tracing mode of the algorithm. They concentrated
on convergence rather than on divergence, which was the focus of Yan Zhang et al. [7]. In another approach,
Pei-wei Tsai et al. [10] investigated a parallel structure for cat swarm optimization and compared it experimentally
with Particle Swarm Optimization (PSO); the parallel CSO converged quickly on small datasets and gave good results.
Further, Pei-wei Tsai et al. [11] introduced an enhanced parallel cat swarm optimization (EPCSO) method
for solving numerical optimization problems under the conditions of a small population size and few iterations.
The Taguchi method is widely used in industry for optimizing product and process conditions; by adopting it in the
tracing mode of the PCSO method, they improved accuracy and computation time. Y. S. Ong et al. [12]
presented an evolutionary algorithm hybridized with a gradient-based optimization technique in the spirit of
Lamarckian learning for efficient design optimization, employing local surrogate models that approximate the
outputs of a computationally expensive Euler solver. Another important contribution in this area comes from Israa
Hadi et al. [13], who introduced an algorithm based on Hybrid Cat Swarm Optimization (HCSO) to reduce the
number of search locations in the Block Matching (BM) process. Their simulations indicate that the proposed
method gives better results than other BM algorithms in terms of accuracy and computation time.

3. Proposed Methodology

3.1. Facial Features

We used a Constrained Local Model (CLM) tracker as the facial recognizer. It is provided by Saragih et al. [14]
and extracts features based on locality and shape constraints. It works in two steps: model building
and the search process. Model building itself has two parts, shape model building and patch model building. The
shape model is created using Principal Component Analysis (PCA) and gives the mean shape and the shape constraints.
The mean shape is used to initialize a new shape. CLM patch model creation is very similar to that of the Active
Appearance Model, except that it uses a set of patches of different features instead of triangular patches. A linear
SVM is used to train the patch model, using the MUCT database [15], which contains over 3700 faces. For face
detection we used the Viola-Jones algorithm [16], which is based on Haar-like features, the integral image, the
AdaBoost algorithm and a cascade classifier. In the facial emotion detection module, we capture images from live
video and identify 66 feature points as (x, y) coordinates.

3.2. Hybrid Bio-Inspired Machine Learning Algorithm

As shown in Figure 1, visual features are extracted and processed by the proposed hybrid bio-inspired
system. In the training phase, we used 66 feature points from each face in the CK+ dataset; the proposed
algorithm, with an SVM classifier as the fitness function, identified the 15 relevant features that gave the best
accuracy. The modified CSO algorithm, with divergence provided by GA and convergence by a PSO-GA combination, is
described below. The relevant features are then used to classify faces into the six basic emotions; the details of
the proposed hybrid algorithm are shown in Algorithm 1.
The proposed hybrid algorithm is a modification of a recent swarm intelligence algorithm, CSO, which
was developed based on the common behavior of cats. In this version, the location of each particle is
represented as a vector x_i = (x_i1, x_i2, x_i3, …, x_in), where each bit x_ij (with j in {1, …, n}) takes the
binary value 0 or 1. For our problem, x_ij represents a feature and the whole vector is the feature set. Cats are
very observant and spend most of their time observing their surroundings rather than chasing things, which
wastes energy. In the algorithm, this behavior is represented by the seeking mode and the tracing mode.
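
As an illustration of this encoding, the sketch below represents a cat's position as a list of 66 bits and uses it to select entries from a feature vector. The helper names (`random_position`, `selected_indices`, `apply_mask`) and the dummy feature values are hypothetical and are not taken from the paper's implementation.

```python
import random

N_FEATURES = 66  # one bit per facial feature in the training set


def random_position(rng, n=N_FEATURES):
    """Return a random binary position vector (1 = feature kept)."""
    return [rng.randint(0, 1) for _ in range(n)]


def selected_indices(position):
    """Indices of the features switched on in this position."""
    return [j for j, bit in enumerate(position) if bit == 1]


def apply_mask(feature_vector, position):
    """Keep only the feature values whose bit is set."""
    return [v for v, bit in zip(feature_vector, position) if bit == 1]


rng = random.Random(0)
position = random_position(rng)
sample = [float(j) for j in range(N_FEATURES)]  # dummy feature values
subset = apply_mask(sample, position)
```

With this representation, the size of a feature subset is simply the number of set bits, which is what the search tries to keep small.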

Algorithm 1: Hybrid Bio-Inspired Algorithm using CSO-GA-PSO-SVM

Input: Training feature set [each feature vector contains 66 facial features]
Output: Feature vector indices [the indices of the features that provide the best accuracy]

1: Randomly initialize each cat's position and speed
2: For each cat, until the required accuracy is obtained or the termination requirement is satisfied
3: Check whether to choose the cat's current characteristic set
4: Derive characteristic subsets for the cat
5: Compute the SVM fitness value for the subsets found in Step 4
6: Execute the modified CSO algorithm
7: end
8: Return the optimal characteristic subset
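
A minimal Python sketch of the outer loop of Algorithm 1 follows. It is not the authors' code: `fitness`, `seek` and `trace` are hypothetical callables standing in for the SVM-based evaluation and the two modified CSO modes, lower fitness is assumed to be better, and MR is assumed to be the probability that a cat is placed in tracing mode in a given iteration.

```python
import random


def hybrid_cso(fitness, seek, trace, n_cats=10, n_bits=66,
               mixed_ratio=0.02, max_iter=100, target=None, seed=0):
    """Search for a binary feature mask that minimises `fitness`.

    fitness(mask) -> float          (hypothetical; lower is better)
    seek(mask, rng) -> mask         (hypothetical seeking-mode move)
    trace(mask, best, rng) -> mask  (hypothetical tracing-mode move)
    """
    rng = random.Random(seed)
    # Step 1: random initial positions.
    cats = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_cats)]
    best = list(min(cats, key=fitness))
    best_fit = fitness(best)

    for _ in range(max_iter):
        # Stop early once the required quality is reached.
        if target is not None and best_fit <= target:
            break
        for k in range(n_cats):
            # Assumed role of MR: chance that this cat traces.
            if rng.random() < mixed_ratio:
                cats[k] = trace(cats[k], best, rng)
            else:
                cats[k] = seek(cats[k], rng)
            fit = fitness(cats[k])
            if fit < best_fit:
                best, best_fit = list(cats[k]), fit

    # Step 8: indices of the selected features and their fitness.
    return [j for j, bit in enumerate(best) if bit == 1], best_fit


# Toy usage: the fitness simply counts the selected features.
def toy_fitness(mask):
    return sum(mask)


def toy_seek(mask, rng):
    moved = list(mask)
    moved[rng.randrange(len(moved))] = 0  # switch one feature off
    return moved


def toy_trace(mask, best, rng):
    return list(best)  # jump to the best position found so far


indices, score = hybrid_cso(toy_fitness, toy_seek, toy_trace,
                            n_cats=4, n_bits=8, max_iter=50)
```

In a real run the toy callables would be replaced by the SVM cross-validation fitness and the seeking and tracing moves described in the following subsections.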

Figure 1: Framework of Hybrid Bio-MER system using Facial features



3.2.1 Seeking mode: The cat's behavior while resting

This mode decides the best position to move to in the next iteration. We use the GA mutation operation here for
good divergence. The mode has four main parameters: seeking memory pool (SMP), mutation probability,
counts of dimensions to change (CDC) and self-position consideration (SPC). The modified seeking-mode
process is described by the following steps:

Step 1: Make j copies of the present position of cat k, where j = SMP. If SPC is true, let j = SMP − 1
and retain the present position as one of the candidates.
Step 2: For each copy, according to CDC, randomly apply mutation to its feature set of 66 bits. Each bit
indicates whether a feature is present or not.
Step 3: Calculate the fitness values (FS) of all candidate points.
Step 4: If the FS values are not all equal, calculate the selection probability of each candidate point using
Equation (1); otherwise set the selection probability of every candidate point to 1.
Step 5: Randomly pick the point to move to from the candidate points, and replace the position of cat k.

$P_i = \frac{|FS_i - FS_b|}{FS_{max} - FS_{min}}$  (1)

If the goal of the fitness function is to find the minimum solution, FS_b = FS_max; otherwise FS_b = FS_min.
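
The five steps can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: it assumes that CDC is the fraction of bits considered in each copy, that each considered bit is flipped with the mutation probability, that the fitness is minimised (so FS_b = FS_max), that the selection probability takes the usual CSO form |FS_i − FS_b| / (FS_max − FS_min), and that the final pick is weighted by those probabilities.

```python
import random


def seeking_mode(position, fitness, rng, smp=5, cdc=0.8,
                 mutation_prob=0.1, spc=True):
    """One seeking-mode move for a cat with a binary position."""
    n = len(position)
    # Step 1: SMP candidates; with SPC one slot keeps the current position.
    n_copies = smp - 1 if spc else smp
    candidates = [list(position) for _ in range(n_copies)]

    # Step 2: in each copy, visit CDC * n bits and flip each one with
    # probability `mutation_prob` (assumed reading of the two parameters).
    n_dims = max(1, int(round(cdc * n)))
    for cand in candidates:
        for j in rng.sample(range(n), n_dims):
            if rng.random() < mutation_prob:
                cand[j] = 1 - cand[j]
    if spc:
        candidates.append(list(position))

    # Step 3: fitness of every candidate.
    fs = [fitness(c) for c in candidates]
    fs_max, fs_min = max(fs), min(fs)

    # Step 4: selection probabilities, minimisation case (FS_b = FS_max);
    # if all fitness values are equal every candidate gets probability 1.
    if fs_max == fs_min:
        probs = [1.0] * len(candidates)
    else:
        probs = [abs(f - fs_max) / (fs_max - fs_min) for f in fs]

    # Step 5: random pick, weighted by the probabilities (assumption).
    chosen = rng.choices(candidates, weights=probs, k=1)[0]
    return chosen, probs


rng = random.Random(1)
current = [1] * 66
new_position, probs = seeking_mode(current, fitness=sum, rng=rng)
```

Here `sum` is only a stand-in fitness that counts selected features; the candidate with the lowest fitness always receives probability 1 and the worst candidate receives probability 0.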

3.2.2 Tracing mode: Running after a target

Tracing mode is the other mode in CSO and models the moving behaviour of cats. We use a PSO+GA combination for
better convergence: the PSO subtraction and addition operators are used along with the GA operators mutation,
crossover and selection. Particle movement follows Equation (2):

$X_k(t+1) = P_k(t) - X_k(t) + [X_t(t)]'$  (2)

The subtraction operator represents the crossover operation between two individuals, $[X_t(t)]'$ represents a
random mutation operation on $X_t(t)$, and the addition operator represents the selection operation among the
individuals. Here $P_k(t)$ represents the position of the global best particle. We experimented with adaptive
convergence rates based on SVM accuracy: we applied a 90% convergence rate until the cross-validation accuracy
reached 70% and then reduced it to 85% to increase divergence. The whole process is illustrated in Figure 2(a).
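
One possible reading of Equation (2) on binary positions is sketched below. The paper does not specify the crossover type or which individuals enter the selection, so the uniform crossover, the bit-flip mutation, the inclusion of the current position in the selection and the assumption that lower fitness is better are all illustrative choices, and the function names are hypothetical.

```python
import random


def uniform_crossover(a, b, rng):
    """Child takes each bit from parent a or parent b with equal chance."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(bits, rng, prob=0.1):
    """Flip each bit independently with probability `prob`."""
    return [1 - b if rng.random() < prob else b for b in bits]


def tracing_mode(position, global_best, fitness, rng,
                 crossover_prob=0.9, mutation_prob=0.1):
    """One tracing-mode move; lower fitness is assumed to be better."""
    # "Subtraction": crossover between the global best and the cat.
    if rng.random() < crossover_prob:
        crossed = uniform_crossover(global_best, position, rng)
    else:
        crossed = list(position)
    # "[X]'": random mutation of the cat's position.
    mutated = mutate(position, rng, mutation_prob)
    # "Addition": selection of the fittest among the individuals.
    return min([crossed, mutated, list(position)], key=fitness)


rng = random.Random(2)
cat = [1] * 66
best = [0] * 66
moved = tracing_mode(cat, best, fitness=sum, rng=rng)
```

Because the current position takes part in the selection, a tracing move in this sketch never makes a cat worse; dropping it from the list would give a more exploratory variant.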

3.3 Evaluation Function

Each particle position represents a feature vector, and each particle is evaluated with the SVM classifier to
assess the quality of the feature set it represents. The fitness of a particle x_i is calculated by applying
10-fold cross validation (10FCV) to measure the classification accuracy of an SVM trained with this feature
subset. In 10FCV, the dataset is divided into 10 subsets. Each time, one of the 10 subsets is used as the test set
and the other 9 subsets are combined to form the training set. The average error across all 10 trials is then
computed. The complete fitness function is given in Equation (3):

$\mathrm{fitness}(x) = \beta \cdot \frac{100}{\mathrm{accuracy}} + \gamma \cdot \#\mathrm{features}$  (3)

where $\beta$ and $\gamma$ are weights set to 0.75 and 0.25 respectively, so that the 10FCV accuracy takes priority
over the subset size. The objective is to maximize the accuracy and minimize the subset size.
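
The evaluation can be sketched as follows, assuming Equation (3) has the form fitness = β · (100 / accuracy) + γ · #features, so that lower fitness values are better. The fold splitter is a plain contiguous split and `evaluate_fold` is a hypothetical callable that would train the SVM on the training indices and return the accuracy in percent on the test indices; neither is taken from the paper's code.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cv_accuracy(evaluate_fold, n_samples, k=10):
    """Mean accuracy in percent over the k folds.

    evaluate_fold(train, test) -> accuracy in percent (hypothetical).
    """
    scores = [evaluate_fold(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


def fitness(accuracy_percent, n_features, beta=0.75, gamma=0.25):
    """Assumed form of Equation (3); lower values are better."""
    return beta * (100.0 / accuracy_percent) + gamma * n_features


folds = list(k_fold_indices(25, k=10))
good = fitness(accuracy_percent=90.0, n_features=15)
bad = fitness(accuracy_percent=60.0, n_features=40)
```

Under this form, a subset with higher accuracy and fewer features (`good`) scores lower than a larger, less accurate one (`bad`), which matches the stated objective.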

3.4 Classification Model

As shown in Figure 2(b), we use a binary SVM in combination with CSO for the initial classification. It separates
the emotions into two classes: Anger, Joy and Surprise in one class, and Disgust, Neutral and Sadness in the other.
The class assigned by the binary SVM is then further classified into an individual emotion by an ANN in the second stage.
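
The routing logic of this two-stage scheme can be sketched as below. The three stage callables are hypothetical stand-ins for the trained binary SVM and the second-stage ANN; using one second-stage model per group is an assumption, since the paper does not say whether one or two networks are trained.

```python
HIGH_ENERGY = ("Anger", "Joy", "Surprise")
LOW_ENERGY = ("Disgust", "Neutral", "Sadness")


class TwoStageClassifier:
    """Binary first stage routes a sample to a group-specific second stage."""

    def __init__(self, binary_stage, high_stage, low_stage):
        self.binary_stage = binary_stage  # x -> "high" or "low" (SVM stand-in)
        self.high_stage = high_stage      # x -> label in HIGH_ENERGY (ANN stand-in)
        self.low_stage = low_stage        # x -> label in LOW_ENERGY (ANN stand-in)

    def predict(self, x):
        group = self.binary_stage(x)
        if group == "high":
            return self.high_stage(x)
        if group == "low":
            return self.low_stage(x)
        raise ValueError(f"unexpected group: {group!r}")


# Toy stand-ins operating on a two-number feature vector.
def toy_binary(x):
    return "high" if x[0] > 0.5 else "low"


def toy_high(x):
    return HIGH_ENERGY[int(x[1] * 3) % 3]


def toy_low(x):
    return LOW_ENERGY[int(x[1] * 3) % 3]


clf = TwoStageClassifier(toy_binary, toy_high, toy_low)
label = clf.predict([0.9, 0.1])
```

Each second-stage model only has to separate three emotions instead of six, which is the divide-and-conquer idea behind the design.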

3.5 Dataset

The Cohn-Kanade AU-Coded Expression Database Version 2, referred to as CK+, includes both posed and non-
posed (spontaneous) expressions and additional types of metadata. For posed expressions, the number of sequences
is increased from the initial release by 22% and the number of subjects by 27%. For action unit and expression
recognition, a support vector machine (SVM) classifier with leave-one-out subject cross-validation was used.

Figure 2: (a) Modified CSO Flow; (b) Classification Model

4. Experimental Results and Discussion

4.1. Experimental Setup

The proposed algorithm is coded in Python, using threads for parallel implementation. The multilabel SVM
classifier is implemented using the LibSVM toolkit [17]. The RBF kernel function is used in the SVM because it
gives better results than the linear and polynomial kernels. All experiments were carried out on a PC running
Linux (Ubuntu 14.04, kernel 3.13.0-32.57) with an Intel i5-3230M 2.6 GHz processor and 8 GB of RAM. For comparison
with the individual bio-inspired algorithms, we ran CSO, PSO and GA individually and in combination 10 times over
each dataset to reach a statistically meaningful conclusion.
The real-time experiment was conducted with 20 postgraduate students from the IT Department, NITK Surathkal,
Mangalore, India, in an E-Learning environment. The system acts as feedback for E-Learning providers, indicating
how a student is performing in a particular subject during the teaching-learning process. Students are provided
with a web interface and a webcam. Facial signals are captured while the student is in the classroom. These signals
are processed by the proposed system, and the corresponding emotion is displayed based on its duration. At the end,
a questionnaire is given to each of the 20 students to assess the student's performance. The experimental results
are promising and have motivated us to carry out further research.

4.2. Parameter Settings

The parameters used in the proposed hybrid algorithm are given in Table I. These parameters were selected after
several test evaluations on each dataset, repeated until a good balance between solution quality and computational
effort was reached.

Table I. Parameters Used in Proposed Hybrid Algorithm

Parameter                 Value or Range
SMP                       5
Mutation Probability      10%
CDC                       80%
MR                        2%
Cross Over Probability    90%

SMP: Seeking Memory Pool
CDC: Counts of dimensions to change
MR: Mixed Ratio to decide seeking mode and tracing mode
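
For reference, the settings of Table I can be collected in a small configuration object, sketched below. The class name `HybridParams` and the helper `dimensions_to_change` are hypothetical, and reading CDC as the fraction of the bits touched in each seeking-mode copy is an assumption based on its usual meaning in CSO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridParams:
    """Parameter values from Table I, expressed as fractions."""
    smp: int = 5                          # seeking memory pool
    mutation_probability: float = 0.10
    cdc: float = 0.80                     # counts of dimensions to change
    mr: float = 0.02                      # mixed ratio (seeking vs tracing)
    crossover_probability: float = 0.90

    def dimensions_to_change(self, n_bits: int) -> int:
        """Number of bits touched per copy for an n_bits-long position."""
        return max(1, round(self.cdc * n_bits))


params = HybridParams()
n_dims = params.dimensions_to_change(66)
```

Freezing the dataclass keeps the settings fixed for the duration of a run, which makes repeated runs of the same configuration easier to compare.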

4.3. Results and Discussion

Figure 3(a) compares the proposed method against other bio-inspired algorithms and shows an improvement by a
clear margin. We ran each algorithm for 100 iterations for comparison purposes. To find the best possible result
from each algorithm, we ran each algorithm 10 times on the same dataset. We analyzed the computational cost of
each algorithm with the same setup; the results are shown in Figure 3(b). CSO and its combinations incur a higher
computational cost because the seeking mode creates duplicate cats to explore neighbouring positions, and
evaluating these duplicates is computationally expensive. PSO and GA have no such mode. The proposed hybrid method
is slightly more computationally expensive than the other CSO combinations because divergence is created using the
mutation process and convergence using the PSO-GA combination, which involves crossover, mutation and selection
operations. Finally, with the selected optimal features, we classified the facial characteristics into the six
basic emotions: Anger, Happy, Surprise, Sad, Disgust and Neutral.

Figure 3: (a) Cross validation accuracy obtained using various combinations of bio-inspired algorithms;
(b) Time taken by each bio-inspired algorithm for 100 iterations

5. Conclusion and Future Work

In this work, we developed a novel hybrid bio-inspired algorithm for emotion recognition using CSO-GA-PSO-SVM.
We tested the system in person-dependent and person-independent scenarios, and the results are
encouraging. We used the CK+ dataset, taking 75% of the data for training and the remainder for testing. The system
gives an average accuracy of 93.8% using visual cues alone, an improvement of 10.5% in accuracy over an
ER system with CSO-SVM alone. We also demonstrated the performance of the proposed algorithm in a real-time
scenario. Emotions in collaborative workplaces such as offices, schools, colleges and universities are
generally expressed in the form of text, gestures and body movements. In future, we will extend the work on the
hybrid bio-inspired system to consider these multimodal features, thereby widening its scope.

References

1. Lucey, Patrick, et al. "The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified
expression." Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on. IEEE, 2010.
2. Zhu, Aiqin, and Qi Luo. "Study on speech emotion recognition system in E-learning." Human-Computer Interaction. HCI Intelligent
Multimodal Interaction Environments, Springer Berlin Heidelberg, 2007. 544-552.
3. Binitha, S., and S. Siva Sathya. "A survey of bio inspired optimization algorithms." International Journal of Soft Computing and
Engineering 2.2 (2012): 137-151.
4. Chu, Shu-Chuan, Pei-Wei Tsai, and Jeng-Shyang Pan. "Cat swarm optimization." PRICAI 2006: Trends in Artificial Intelligence. Springer
Berlin Heidelberg, 2006. 854-858.
5. Back, Thomas. "Evolutionary algorithms in theory and practice, 1996."
6. Tong, Simon, and Daphne Koller. "Support vector machine active learning with applications to text classification." The Journal of Machine
Learning Research 2 (2002): 45-66.
7. Y. Zhang and Y. Ma, “Cat Swarm Optimization with a Vibration Mutation Strategy”, International Journal of Machine Learning and
Computing, vol. 4, no. 6, (2014) December, pp. 510-514.
8. Y. Zhang and Y. Ma, “Cat Swarm Optimization with a Vibration Mutation Strategy”, International Journal of Machine Learning and
Computing, vol. 4, no. 6, (2014) December, pp. 510-514.
9. Y. Wen and Y. Chen, “Modified Parallel Cat Swarm Optimization in SVM Modeling for Short-term Cooling Load Forecasting”, Journal of
Software, vol. 9, no. 8, (2014) August, pp. 2093-2104.
10. P. W. Tsai, J. S. Pan, S. M. Chen, B. Y. Liao and S. P. Hao, “Parallel Cat Swarm Optimization”, In Proceedings of the 7th International
Conference on Machine Learning and Cybernetics, (2008), pp. 3328-3333.
11. P. W. Tsai, J.-S. Pan, S.-M. Chen and B.-Y. Liao, “Enhanced parallel cat swarm optimization based on the Taguchi method”, vol. 39, no. 7,
(2012) June 1, pp. 6309–6319.
12. Y. S. Ong, K. Y. Lum and P. B. Nair, “Hybrid evolutionary algorithm with Hermite radial basis function interpolants for computationally
expensive adjoint solvers”, Computational Optimization and Applications, Springer US, vol. 39, no. 1, (2008) January, pp. 97-119.
13. Hadi, Israa, and Mustafa Sabah. "An Enhanced Video Tracking Technique Based on Nature Inspired Algorithm." International Journal of
Digital Content Technology and its Applications (JDCTA) 8.3 (2014): 32-42.
14. Saragih, Jason M., Simon Lucey, and Jeffrey F. Cohn. "Face alignment through subspace constrained mean-shifts." Computer Vision, 2009
IEEE 12th International Conference on. IEEE, 2009.
15. Milborrow, Stephen, John Morkel, and Fred Nicolls. "The MUCT landmarked face database." Pattern Recognition Association of South
Africa 201.0 (2010).
16. Viola, Paul, and Michael J. Jones. "Robust real-time face detection." International journal of computer vision 57.2 (2004): 137-154.
17. Chang, Chih-Chung, and Chih-Jen Lin. "LIBSVM: a library for support vector machines." ACM Transactions on Intelligent Systems and
Technology (TIST) 2.3 (2011): 27.
18. Dayhoff, Judith E., and James M. DeLeo. "Artificial neural networks." Cancer 91.S8 (2001): 1615-1635.
19. Lucey, Patrick, et al. "The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified
expression." Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on. IEEE, 2010.
