

Data Replication in Cloud Systems: A Survey

Article · July 2017


DOI: 10.4018/ijissc.2017070102


All content following this page was uploaded by Sean Eom on 20 July 2017.



International Journal of Information Systems and Social Change
Volume 8 • Issue 3 • July-September 2017

Data Replication in Cloud Systems: A Survey
Khaoula Tabet, University Larbi Tebessi, Department of Mathematics and Computer Science, Tebessa, Algeria
Riad Mokadem, Paul Sabatier University, Toulouse, France
Mohamed Ridda Laouar, University Larbi Tebessi, Department of Mathematics and Computer Science, Tebessa, Algeria
Sean Eom, Southeast Missouri State University, Management Information Systems, Cape Girardeau, MO, USA

ABSTRACT

This paper presents a survey of data replication strategies in cloud systems. Based on the survey and
reviews of existing classifications, we propose another classification of replication strategies based on
the following five dimensions: (i) static vs. dynamic, (ii) reactive vs. proactive workload balancing,
(iii) provider vs. customer centric, (iv) optimal number vs. dynamic adjustment of the replica factor
and (v) objective function. Ideally, a good replication strategy must simultaneously consider multiple
criteria: (i) the reduction of access time, (ii) the reduction of the bandwidth consumption, (iii) the
storage resource availability, (iv) a balanced workload between replicas and (v) a strategic placement
algorithm including an adjusted number of replicas. Therefore, selecting a data replication strategy
is a classic example of multiple criteria decision making problems. The taxonomy we present can
be a useful guideline for IT managers to select the data replication strategy for their organization.

Keywords
Classification, Cloud Computing, Data Replication Strategies, Multiple Criteria Decision Making

INTRODUCTION

With the increasing globalization of contemporary business organizations, distributed databases and
their management have become one of the key areas in database research. A distributed database is
a single logical database scattered across multiple computers. Basically, there are two options for
distributing a database: data partitioning or data replication. Data replication is one of the important
decisions in organizations (Km & Eom, 2016). It refers to the creation of identical copies of data
(replicas). Data partitioning is another strategy for distributing a database that breaks a table into
subsets of records (horizontal partitioning) or subsets of columns (vertical partitioning).
This paper presents a survey of data replication strategies in cloud systems. Data replication
improves data availability, response time, and fault tolerance, and reduces network traffic. It is frequently
used in: (i) DBMS (Pérez, García-Carballeira, Carretero, Calderón, & Fernández, 2010), (ii) parallel
and distributed systems (Loukopoulos, Lampsas, & Ahmad, 2005; Benoit, Rehn-Sonigo, & Robert,
2008), (iii) mobile systems (Tos, Mokadem, Hameurlain, Ayav, & Bora, 2016) and (iv) large scale
systems, including P2P (Xhafa, Kolici, Potlog, Spaho, Barolli, & Takizawa, 2012) and data Grid


Copyright © 2017, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.



systems (Mansouri, Azad, & Chamkori, 2014). Many proposed replication strategies aim to answer
the following questions:

• What data should be replicated?
• When should the data be replicated?
• Where should the new replicas be placed?
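As an illustration only, the three questions can be answered by a simple threshold-based rule, sketched below in Python. The function name, the access-log format, and the popularity threshold are assumptions made for this example; they are not taken from any specific strategy surveyed here.

```python
from collections import Counter

def plan_replication(access_log, existing, threshold):
    """Sketch of a threshold-based dynamic replication decision.

    access_log: list of (file, site) requests observed in one period.
    existing:   dict mapping each file to the set of sites holding a copy.
    threshold:  hypothetical popularity threshold (requests per period).
    Returns a list of (file, target_site) replica creations.
    """
    per_file = Counter(f for f, _ in access_log)
    per_file_site = Counter(access_log)
    plan = []
    for f in sorted(per_file):
        # What / when: replicate only files whose request count reaches the threshold.
        if per_file[f] < threshold:
            continue
        holders = existing.get(f, set())
        candidates = [
            (n, s) for (g, s), n in per_file_site.items()
            if g == f and s not in holders
        ]
        if candidates:
            # Where: the site without a copy that issued the most requests
            # (ties are broken by site name to keep the result deterministic).
            _, site = max(candidates)
            plan.append((f, site))
    return plan
```

For example, `plan_replication([("a", "s1"), ("a", "s2"), ("a", "s2"), ("b", "s1")], {"a": {"s1"}}, 3)` selects file `a` for replication at site `s2`, while file `b` stays below the threshold and is left alone.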

Data replication is a necessary tool for effectively managing a database in distributed database
environments. Most of the works in the literature have classified replication strategies based on the
following criteria: (i) static vs. dynamic classification (Chervenak, Deelman, Foster et al., 2002;
Čibej, Slivnik, & Robič, 2005), (ii) centralized vs. decentralized replication (Sashi & Thanamani,
2011; Amjad, Sher, & Daud, 2012; Grace & Manimegalai, 2014), (iii) server vs. client replication
(Doğan, 2009; Steen & Pierre, 2010), (iv) objective function based classification (Mokadem &
Hameurlain, 2015), and (v) system architecture based classification (Tos, Mokadem, Hameurlain
et al., 2015). However, the existing replication strategies are not adapted to cloud systems. They
aim to obtain the best performance without taking the profit of cloud providers or the satisfaction
of tenant requirements into account. Creating as many replicas as possible in clouds may not be
economically feasible. Hence, replication strategies in such environments should ensure both the
tenant's Quality of Service (QoS) and the economic profitability of the provider.
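The tension between tenant QoS and provider cost can be made concrete with a small sketch. It assumes, purely for illustration, that replicas fail independently and that each replica is available with the same probability p, so that n replicas give an availability of 1 - (1 - p)^n. The function names and the numeric values are hypothetical.

```python
def availability(p, n):
    """Probability that at least one of n replicas is available,
    assuming independent failures and per-replica availability p."""
    return 1.0 - (1.0 - p) ** n

def min_replicas(p, target):
    """Smallest number of replicas whose combined availability meets the target."""
    if not (0.0 < p < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    n = 1
    while availability(p, n) < target:
        n += 1
    return n

# Hypothetical tenant requirement: 99.9% availability on nodes that are up 95% of the time.
needed = min_replicas(0.95, 0.999)
```

Under these assumptions, any replica beyond `needed` adds storage and update cost for the provider without being required by the tenant's availability objective, which is the economic argument made above.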
This paper presents a survey of data replication strategies in cloud systems. We propose another
classification of replication strategies based on the following five dimensions:

• Static vs. dynamic (Ghemawat, Gobioff, & Leung, 2003; Bai, Jin, Liao et al., 2013);
• Reactive vs. proactive workload balancing (Silvestre, Monnet, Krishnaswamy et al., 2012;
Hussein & Mousa, 2014);
• Provider-centric vs. customer-centric (Sakr & Liu, 2012; Sousa & Machado, 2012);
• Minimal blocking probability (Xue, Shen, & Guo, 2015) and energy efficiency and bandwidth
consumption (Boru, Kliazovich, Granelli et al., 2015);
• Objective function (Bonvin, Papaioannou, & Aberer, 2011; Kirubakaran, Valarmathy, &
Kamalanathan, 2013; Tos et al., 2016).

The rest of this paper is organized as follows. Section 2 reviews existing classifications and how
dynamic replication can be classified with respect to multiple criteria. Section 3 proposes a new
taxonomy of data replication strategies in cloud systems with respect to several dimensions.
Section 4 discusses some important factors that impact the performance of these strategies. The final
section outlines future research directions.

REVIEW OF EXISTING REPLICATION STRATEGY CLASSIFICATION

Most of the replication strategies proposed for cloud systems aim to increase data availability, decrease
data access time, reduce access latency, increase the reliability of the cloud system, and minimize energy
consumption. Finding an optimal number of replicas in the cloud is an important part of strategy
formulation, since it avoids unnecessary replications that can generate overhead in the system
(Boru et al., 2015). Although storage cost is becoming less important, some replication strategies
assume unlimited storage, which is not realistic, while others assume a fixed and limited amount of
storage. Reducing access latency is also an important objective of replication strategies since it reduces
the job execution time (Xue et al., 2015). Most replication strategies require a tradeoff between
different criteria that impact system performance. A significant number of the replication strategies
that have been proposed in the literature are classified according to the following attributes.
