
6th IFIP/IEEE International Workshop on Broadband Convergence Networks

Telco Clouds and Virtual Telco:


Consolidation, Convergence, and Beyond
Peter Bosch, Alessandro Duminuco, Fabio Pianese, Thomas L. Wood
Alcatel-Lucent Bell Labs
Service Infrastructure Research Domain
{peter.bosch, alessandro.duminuco, fabio.pianese, tom.wood}@alcatel-lucent.com
Abstract: In this position paper we introduce Virtual Telco, a comprehensive effort to simplify the management of deployed telecommunication services by using a Cloud computing approach. The main objective of Virtual Telco is to replace the costly dedicated hardware implementing several centralized control plane functions and other services with distributed solutions that may be allocated on demand over a pool of dependable, dynamically contracted computing and networking resources that are easy to manage. Virtual Telco relies on state-of-the-art techniques in the domains of virtualization and distributed systems to meet the challenging criteria of reliability, scalability, and timeliness required by present and future telecommunication standards and services. As a representative example of a Virtual Telco application, we propose the case of the distributed mobility management entity (MME) for next-generation LTE cellular networks.

Index Terms: Control Plane Design, Scalability, Localization
I. INTRODUCTION
The enormous technological success of packet-switching networks and open protocols, embodied by the Internet and its fundamental TCP/IP network stack, has been challenging with increasing vigor both the historical assumptions and the standard design practices of the telecommunication industry. The long-lasting controversy between two opposite ideologies, on one hand the Internet's "end-to-end principle" [1], on the other the "intelligent network" [2], has recently dwindled in intensity: the middle ground of "network convergence" has fostered a global agreement inside the telco industry. Convergence roughly meant that telephony applications would start to be built on top of unreliable, best-effort networks, while maintaining intact the logical organization of the traditional control plane. This principle drove the design and implementation of extensible logical architectures such as the "IP Multimedia Subsystem" (IMS) [3], in order to guarantee interoperability between existing telephony standards and Internet-based VoIP.
Today, a strategic objective of the operators is leveraging the competitive advantage provided by the ownership of their networks against the emergence of Internet-based service and content providers. The operators' response to this threat has been a unanimous attempt to reposition themselves as providers of value-added services. Exploiting the flexibility of the new IMS architecture, they began to tightly integrate their legacy infrastructure with general-purpose server farms that ran their own services and related back-end applications. However, dedicated infrastructure is still required to run their core telecommunication functions, which increases complexity and imposes high maintenance costs. In an era of steadily declining operational margins, this situation is clearly not sustainable.
This paper presents our vision on the evolution of the role of telecommunication operators as providers of computing and networking infrastructures. We suggest the Telco Cloud model as a solution to hardware consolidation and management issues. Telco Clouds offer innovative options for implementing telco core functions in a cost-effective and dependable way. We introduce our Virtual Telco initiative as an example of the profitable synergies and cost reductions operators could obtain from running their core telephony services on Telco Clouds. Finally, we describe the Cloud-based implementation of a representative service, the mobility management entity (MME) of an LTE cellular network.
A. Telco Clouds and hardware consolidation
Telecommunication networks are no longer built of specialized boxes and interconnections. After a decade of experience maintaining heterogeneous pools of general-purpose hardware, middleware frameworks, and applications, operators are now fully aware of the processes involved in running a large-scale computing infrastructure. The remaining source of network and inventory management issues is a lack of consolidation in both computing platforms and management processes. This situation motivates our expectation that operators will increasingly adopt a Cloud Computing approach to hardware management in order to streamline their operations.
In this Telco Cloud scenario, operators will manage pools of hardware resources installed in key locations at both the core and edges of their networks. Telco Clouds can provide a stable, standardized environment for deploying application software, which will therefore help mitigate validation and testing requirements upon hardware or networking upgrades. Moreover, the emergence of Telco Clouds enables service providers to offer a new class of dependable hosting services that aggregate network, storage, and processing resources with strong service-level guarantees. Telco Clouds have the potential to expand the telecommunication industry's traditional service portfolio: Cloud-enabled operators will be able to leverage their hardware and network infrastructures to host their customers' applications.
The key competitive advantage of a Telco Cloud operator lies in its leverage upon its own network's management plane, which exposes networking QoS features normally not available in the case of Internet-based Cloud Computing providers. Thanks to their unique position as the gatekeepers and managers of last-mile, aggregation, and collect networks, Telco Cloud providers can offer the lowest latency to computing resources, while exposing an unmatched level of control to third parties interested in high-performance hosting for their virtualized service infrastructure.
Established phone operators have long experience and a well-known track record as providers of reliable services. Their know-how will be instrumental in order to engage in service-level agreements about the quality of service offered to their customers, who will be able to obtain a complete end-to-end compute and networking solution to host their applications. When combined with the appropriate resource management layer, Telco Cloud providers can become a new kind of Infrastructure-as-a-Service (IaaS) provider, with a different value proposition compared to the so-called over-the-top IaaS providers: a service that combines reliable, guaranteed network connectivity with elastic compute and storage resources.
B. A case for Virtual Telco
In what could be seen as the reverse of Amazon's transition from Internet-based retailer to over-the-top provider of computing services, Telco Cloud operators will be able to use the infrastructure they own and manage on behalf of their ordinary customers to also support their own internal applications, core services, and mission-critical applications. This opens interesting possibilities, such as running a large part of the phone network functions inside the Telco Cloud.
When analyzed from a networking perspective, the telephone system can be seen as an application for building virtual circuits that are optimized for voice communication between a set of given end-points. The staple principle of its architecture is the historical separation between control and data plane: the former performs access control and determines the path data needs to travel over the network, while the latter just delivers the data as instructed.

We use the name Virtual Telco for this further step on the road to convergence between telephone services and network-based computing infrastructure. There are several factors that make the case for Virtual Telco appealing:
1) Localization. A Virtual Telco deployment can leverage present operator facilities, such as real estate (central offices, regional offices, data centers) and resources (networks, dark fiber), that traditional telephone operators own and manage all over the territory.
2) Flexibility. Thanks to their fine-grained control over networks, Virtual Telco providers can better optimize the service for customers with a wide range of advanced requirements, such as: low access latency, high availability and reliability, geographically diverse replication, and elasticity in computing resource allocation.
3) Consolidation. As stated above, the uniform virtualization-based platform offered by Virtual Telco promises a consistent simplification in the way the operators' hardware pools are currently managed.
Turning the control plane into a distributed application is feasible, although the main challenge for the designer is upholding the same reliability and availability figures that are prescribed by telecommunication standards. The architectures currently in production are based on high-availability techniques and often rely on two-way redundant hardware and built-in support for transparent component fail-over.

Implementing control plane functions as distributed applications over a large pool of commodity hardware requires careful engineering choices in order to successfully exploit the processing throughput, locality, and scalability that are made available in the Virtual Telco scenario.
C. Summary
The present work is structured as follows. Section 2 enumerates the challenges of moving existing applications onto the cloud in an attempt to make them more scalable and resilient. Section 3 focuses on the feasibility of transforming the infrastructure required by the control plane of a cellular network into a set of distributed services running over a Cloud. We then present the architecture of our Virtual Telco prototype, implementing the Mobility Management Entity (MME) of an LTE cellular backplane. Section 4 discusses related work, and conclusions are drawn in Section 5.
II. MOVING SERVICES TO THE CLOUD
Telco providers run a large number of applications for their business needs. The services they provide are diverse and address a wide spectrum of technical issues under variable constraints. We can roughly group applications into the following four categories by their general requirements, trying to abstract from the specific cases:

- Database storage with ACID transaction processing: stateful, requires reliability, availability, consistency. Examples: database back-ends, accounting and billing.
- Web services, dynamic generation of user content: stateful, limited by computing capacity and network or storage I/O, require availability and scalability.
- Distribution of stored data, live media streaming: mostly stateless, limited by upload network capacity; these services require scalability and are often sensitive to latency from transmission and buffering delays.
- Interactive data processing and high-speed messaging: stateful, limited by network bandwidth, computing capacity, and I/O speed, sensitive to latency; these services need availability and sustained throughput.
We can observe how even the general requirements of these four application categories, while overlapping, are different and often conflicting. As these needs cannot be easily addressed with a single, all-encompassing approach, system designers are usually forced to deal with them in a service-dependent way. Moreover, as user demand increases, the architectural blueprint of a service limits the performance gain achievable with a move to a Cloud-based deployment:
- Centralized applications cannot scale beyond the hardware resources of the machine that runs the service.
- Stateless distributed applications can usually be scaled linearly by increasing the allocation of hardware and networking resources. Their scalability limits are usually determined by the presence of non-parallelizable sections, as expressed by Amdahl's argument (see the sketch after this list).
- Distributed applications that share state among instances are subject to additional trade-offs between consistency, availability, and partition tolerance [4].
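To make the second point concrete, the following sketch evaluates the Amdahl bound on speedup for a hypothetical service whose request handling contains a 5% non-parallelizable section (the fraction is purely illustrative):

```python
def amdahl_speedup(serial_fraction: float, instances: int) -> float:
    """Upper bound on speedup when a fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / instances)

# Illustrative only: a service with a 5% serial section cannot exceed a 20x
# speedup, no matter how many instances are added.
for n in (2, 8, 32, 128):
    print(n, round(amdahl_speedup(0.05, n), 1))
```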
In this section, we summarize the most common ways to migrate existing telco applications to Cloud environments. There are basically two cases:

1) An application is centralized and cannot be reimplemented or modified. In this case, external mechanisms need to be employed to partition the user activity among different instances of the unmodified software.
2) An application is distributed and deployed in a fixed configuration. The group communication protocol must ensure reliability in case of failure of individual replicas.
A. Legacy architectures and virtualization
Oftentimes, external factors dictate that a service cannot be modified or re-implemented in order to be migrated to a Telco Cloud. Without modifying the architecture of an application, it is in general not possible to scale it beyond a single machine, effectively limiting the maximum size of the system to the capacity of the most powerful machine available. If the application is stateless and thus parallelizable (e.g. media content distribution, static web content delivery), standard load-balancing techniques such as DNS round-robin or application-layer redirects can be applied.
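As a minimal sketch of such external partitioning, the snippet below spreads new sessions over a pool of unmodified instances in round-robin order; the instance addresses and session identifiers are placeholders chosen for the example:

```python
import itertools

# Hypothetical pool of identical, unmodified application instances.
INSTANCES = ["10.0.0.1:5060", "10.0.0.2:5060", "10.0.0.3:5060"]
_next_instance = itertools.cycle(INSTANCES)

def assign_instance(session_id: str) -> str:
    """Pick the next instance in round-robin order for a new session."""
    target = next(_next_instance)
    print(f"session {session_id} -> {target}")
    return target

assign_instance("call-001")
assign_instance("call-002")
```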
Adding reliability and dependability to centralized architectures in a Cloud setting is a challenging problem. While general-purpose high-availability mechanisms such as state checkpointing and network activity logging can be combined to ensure that the correct flow of execution of a replicated application can be recovered after the primary fails [5], the achievable performance is usually limited by the connectivity between the replicas. When checkpointing is used without a customized hardware fail-over configuration (as is the case in Cloud deployments), the additional network latency negatively affects high-throughput applications, while the steady data flows between replicas might contribute to network congestion. Finally, additional layer-2 management mechanisms may be required to transparently reconfigure the network connectivity upon replica fail-over.
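The following toy sketch illustrates the checkpoint-and-buffer pattern referred to above (in the spirit of Remus [5], not its actual implementation): externally visible output is held back until the state that produced it has reached the backup, which is exactly where the extra inter-replica latency manifests itself:

```python
import time
from dataclasses import dataclass, field

@dataclass
class ToyPrimary:
    """Stand-in for a protected service: accumulates state, buffers its output."""
    state: dict = field(default_factory=dict)
    pending_output: list = field(default_factory=list)

    def handle_request(self, key, value):
        self.state[key] = value
        self.pending_output.append(f"reply({key}={value})")

@dataclass
class ToyBackup:
    state: dict = field(default_factory=dict)

    def apply(self, snapshot):
        time.sleep(0.005)          # stands in for replica-link latency
        self.state.update(snapshot)

def checkpoint_epoch(primary, backup):
    """One epoch: snapshot state, replicate it, then release buffered output."""
    snapshot = dict(primary.state)          # capture a consistent checkpoint
    held = primary.pending_output[:]        # hold output until replication completes
    primary.pending_output.clear()
    backup.apply(snapshot)                  # output is only safe after this returns
    for packet in held:
        print("released:", packet)

primary, backup = ToyPrimary(), ToyBackup()
primary.handle_request("ue-42", "attached")
checkpoint_epoch(primary, backup)
```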
B. Distributed architectures
The main issue that needs to be overcome in order to effectively deploy distributed architectures on a Telco Cloud is the management of shared and replicated state, a well-known and fundamental theme in both theoretical and practical computer science [6]. Generally speaking, stateless services can be parallelized and scale linearly with the number of instances devoted to the task, while stateful services require coordination among replicas and thus incur overhead in latency and message exchange. Stateful distributed services can be made elastic and reliable by properly combining membership management, group communication, replica fail-over, and distributed consensus schemes [7][8][9][10].

Table I
TERMINOLOGY

Acronym | Meaning
LTE     | Long Term Evolution 3/4G cellular network
MME     | Mobility Management Entity
UE      | User Equipment (cellular terminal)
eNB     | Evolved Node B (base station with controller)
SGW     | Service Gateway (interface to the IMS / phone system)
TA      | Tracking Area (scope for initial UE paging attempt)
HSS     | Home Subscriber Server
DNS     | Internet Domain Name System
DHT     | Distributed Hash Table
Distributed applications can be further extended using feedback (and other known automatic control techniques) in order to dynamically reconfigure themselves, e.g. to replace or reboot failed replicas, reduce the number of active instances, or deploy more instances on new nodes to follow an external increase in service demand.
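A minimal sketch of such a feedback rule is shown below; the target utilization, the deadband, and the way load is measured are assumptions made for the example, not parameters of a specific product:

```python
def desired_instances(current: int, avg_load: float,
                      target: float = 0.6, deadband: float = 0.1,
                      minimum: int = 2) -> int:
    """Simple proportional rule over per-instance utilization (0.0-1.0)."""
    if avg_load > target + deadband:          # overloaded: add capacity
        return current + max(1, round(current * (avg_load / target - 1)))
    if avg_load < target - deadband:          # underused: shrink, but keep a floor
        return max(minimum, round(current * avg_load / target))
    return current                            # within the deadband: do nothing

# Illustrative run: 10 instances at 90% average load grow to 15.
print(desired_instances(current=10, avg_load=0.9))
```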
III. DMME: A P2P MOBILITY MANAGER
To provide a concrete case of how demanding telecommunication applications can be distributed in Virtual Telco, we will now describe the design of a typical control plane service for the LTE cellular system.¹ The role of an LTE mobility management entity (MME) is to keep track of the location (tracking area, TA) and associated state of a cellular phone (user equipment, UE) as it moves through the cellular network, always guaranteeing its reachability in case of a network-initiated voice or data connection.
Because of power management concerns, LTE UEs spend most of the time in low-power mode with their transceiver turned off. UEs listen at regular intervals to the beacons sent by the local base station (eNB) and explicitly notify the MME of changes in their TA. The MME is in charge of keeping up to date all the state relative to each UE while it is idle. When a call is made to the UE, the MME performs paging, i.e. it contacts all the eNBs of the last known TA in which the UE has been detected, before widening the scope of the search and eventually giving up.
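A simplified sketch of this two-stage paging logic is shown below; the data structures and the page() callback are illustrative placeholders rather than the 3GPP-defined interfaces:

```python
from typing import Callable, Dict, List, Optional

def page_ue(ue_id: str,
            last_ta: str,
            enbs_by_ta: Dict[str, List[str]],
            neighbor_tas: Dict[str, List[str]],
            page: Callable[[str, str], bool]) -> Optional[str]:
    """Return the eNB that answered the page, or None if the UE was not found."""
    # Stage 1: page all eNBs of the last known tracking area.
    # Stage 2: widen the search to neighboring tracking areas.
    search_order = [last_ta] + neighbor_tas.get(last_ta, [])
    for ta in search_order:
        for enb in enbs_by_ta.get(ta, []):
            if page(enb, ue_id):      # page() stands in for the actual signaling exchange
                return enb
    return None                       # give up: UE considered unreachable
```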
Our architecture, DMME, implements in a distributed, fault-tolerant, and elastic fashion the functionality of an MME and is intended for deployment in Telco Cloud environments. In the following, we illustrate the rationale behind the design of DMME, present the architecture of our system, and describe the techniques used to achieve reliability and flexibility under strict latency constraints.

¹ For the reader's convenience, the abbreviations used throughout the rest of this paper are summarized in Table I.
Figure 1. Placement of resources in a Telco Cloud
A. Assumptions
A Telco Cloud environment entails the presence of small managed clusters of networked hardware at both the edge and the core of the telephone system. A cloud-wide management system exists that enables the allocation of compute resources in the desired clusters. Cluster locality can be exposed to the Telco Cloud applications via well-known abstractions, such as network coordinates [11], or other specialized addressing schemes (e.g. as in [12]).
Figure 1 presents a rough sketch of the placement of Telco Cloud infrastructure in the operator's current network facilities. We also assume that the other elements of the telephone system are configured to make use of one of the nearest instances of DMME in the Telco Cloud.
B. A localized state machine approach
Mobility in a cellular network requires maintaining UE state in a well-known location. In legacy architectures, this is usually the regional office, a centralized hub covering a large number of users (tens to hundreds of millions) in a geographically contiguous environment (e.g. large capital cities, multiple neighboring states, whole countries).

A number of previous studies show that the majority of the UE population spends most of the time in a fixed location. When mobility occurs, it is usually limited to a small scope (nearby eNBs, same city) rather than occurring over random locations and large distances [13][14]. Although regional offices are suitable for hosting the operator's main subscriber database (Home Subscriber Server, HSS), they may be suboptimal for intensive message processing and dynamic state management, such as in the case of the MME, because of the communication latency they introduce.
We argue that a locality-driven approach to user mobility has several advantages. First, by pushing the management of the control plane toward the edge of the access network, it becomes possible to dimension the processing capacity based on local demand. For instance, a shared pool of DMME replicas can be installed in a central office to satisfy the aggregate needs of a user community. In the extreme case, by associating an instance of DMME to a single eNB, the maximum load to be sustained by the control plane becomes determined by the capacity of the eNB. Second, by bounding the number of users served by each DMME instance, it is possible to reach high levels of overall availability without having to rely on expensive hardware failover mechanisms. For instance, a centralized MME that has to be 99.999% available can only afford to be offline for less than 6 minutes per year, as its failure affects the whole pool of users of the cellular network. Moving to a distributed architecture without single points of failure transforms the availability issue into a partial availability problem. In a distributed MME system, where each instance serves only a small subset of the total user pool, the disruption introduced by independent failures of individual instances is smaller and bound to the local scope of the affected instances. Therefore, a sufficient level of reliability can be reached using off-the-shelf hardware components running standard software recovery and replica management techniques.
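The downtime budget quoted above follows directly from the availability target, and the partial-availability argument can be illustrated with a deliberately simple model in which each instance serves an equal, independent share of the users:

```python
MINUTES_PER_YEAR = 365 * 24 * 60

# Five-nines availability leaves roughly 5.3 minutes of downtime per year.
downtime_budget_min = (1 - 0.99999) * MINUTES_PER_YEAR
print(f"centralized MME downtime budget: {downtime_budget_min:.1f} min/year")

# Toy partial-availability view: if each of many instances serves an equal,
# independent share of the users and is itself only 99.9% available, the
# expected fraction of users left unserved at any instant is still just 0.1%,
# and each outage is confined to that instance's local scope.
per_instance_availability = 0.999
print(f"expected unserved fraction: {1 - per_instance_availability:.1%}")
```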
C. System architecture of DMME

The DMME is composed of two logically separate entities, as schematically represented in Figure 2:

- A reliable object store (ROS), which hosts a current snapshot of all UE-related state. It behaves as an ordinary distributed key-value store, with additional support for a protocol to grant exclusive write access to UE data records. The implementation of the ROS is outside the scope of this document, as it is based on known data replication and consistency techniques.
- A set of stateless message processors (MP) that can retrieve the appropriate UE state from the ROS and execute the actions dictated by the MME state machine and standard protocols (parsing messages, updating state variables, sending responses to the UE, eNB, and service gateway, etc.).

Figure 2. Architecture of the DMME: MP and ROS
Message processors implement all the methods and data structures required by the MME specification. In order to retrieve the appropriate UE state from the ROS, the MPs rely on a lookup function that returns a pointer to the ROS instance where the UE state is kept. The lookup function plays a multiplexing and load-balancing role.
In this architecture, flexibility and resilience to churn are provided by the use of structured overlays as both inter-entity communication facilities and membership management schemes. Due to the intended deployment scenario as part of a managed system, the use of relatively expensive techniques such as O(1) DHTs [15] is acceptable because of their low maintenance footprint. One-hop DHTs also have the advantage of providing lower lookup latencies than ordinary DHTs with O(log(n)) routing iterations, which is a desirable property as a lookup is required before MME message processing can be initiated at MP instances with stale local UE context.
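A minimal sketch of the lookup role is given below, using a consistent-hashing ring over a fully known ROS membership to approximate the one-hop property; the node names and key format are illustrative only:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class RosLookup:
    """Maps a UE identifier to the ROS instance responsible for its state."""

    def __init__(self, ros_nodes):
        # Place every ROS instance on a hash ring; full membership is known
        # locally, so a lookup needs no routing hops (the one-hop property).
        self._ring = sorted((_hash(n), n) for n in ros_nodes)

    def lookup(self, ue_id: str) -> str:
        ring_keys = [h for h, _ in self._ring]
        idx = bisect.bisect(ring_keys, _hash(ue_id)) % len(self._ring)
        return self._ring[idx][1]

# Illustrative usage with placeholder node addresses.
lookup = RosLookup(["ros-a.example", "ros-b.example", "ros-c.example"])
print(lookup.lookup("imsi-208011234567890"))
```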
D. The DMME state access protocol

A simple transaction-based protocol is used by MP instances to access the UE data stored in the ROS. The MPs perform state retrieval upon receipt of a message from a UE or from a network service gateway (SGW). As the MPs are stateless, messages from users without a local context result in a lookup query to the ROS. After a successful lookup, a message exchange ensues with the ROS to obtain a copy of the UE state. If the lookup fails, a new user context is created locally, which will subsequently be placed in the ROS.
Concurrent access to the same user data by different MPs is always possible, since the triggering messages (network events) are originated by independent entities, such as eNB nodes during handover or SGWs paging the UE, and can furthermore be arbitrarily delayed by the network. Leases are used to guarantee mutual exclusion on an MP's ownership of each UE state record: this is required for a correct execution of the MME protocol, in order to prevent multiple state machines from simultaneously serving the same UE, which would lead to unpredictable results. A lease can either be explicitly terminated by the active MP, or it can expire due to a timeout. The last MP instance to obtain a lease on a UE state is charged with the execution of periodic maintenance tasks while the UE remains idle, until another instance intervenes.
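The sketch below captures this transaction pattern from the point of view of a single MP: acquire the lease and run the state machine locally, or forward the event to the current lease holder. The ros object's method names and the forward callback are assumptions made for the example, not the actual DMME interfaces:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LeaseReply:
    granted: bool
    state: dict                     # current UE record (always returned)
    owner: Optional[str] = None     # MP currently holding the lease, if any

def run_mme_state_machine(state: dict, event: dict) -> dict:
    """Placeholder for the standard MME protocol logic."""
    return {**state, "last_event": event.get("type")}

def handle_event(mp_id: str, ue_id: str, event: dict, ros, forward) -> None:
    """Process a network event for a UE under the lease discipline."""
    reply: LeaseReply = ros.acquire_lease(ue_id, requester=mp_id)
    if reply.granted:
        # This MP now owns the UE record: run the MME state machine locally.
        new_state = run_mme_state_machine(reply.state, event)
        ros.write_back(ue_id, new_state)
        ros.release_lease(ue_id, holder=mp_id)    # or let the lease expire
    else:
        # Another MP is serving this UE (e.g. an ongoing call): forward the
        # event so a single state machine keeps handling the whole exchange.
        forward(reply.owner, ue_id, event)
```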
Figure 3 shows two simplified message exchanges between UEs, MME replicas, and the ROS. The users are moving between two eNBs, each of which is associated with a different DMME instance. The picture on the top represents the state migration process triggered by the reception of a network event by a new DMME instance while the UE is not involved in a call (MME state is ECM_IDLE). The new DMME instance requests the appropriate UE state from the ROS, along with a lease to operate on it. As the UE state is ECM_IDLE and no other DMME instance is operating on it, the lease is granted and instance MME2 is enabled to run the MME protocol on behalf of the user. The picture on the bottom shows a case in which state migration is not allowed by the protocol, as a network event is already in progress (MME state is ECM_CONNECTED). When the new instance receives a message, it recovers the UE state from the ROS and learns of the ongoing activity. Therefore, the message is forwarded to the DMME replica that is currently holding the lease on the UE state (MME1), which executes the proper actions and directly responds to the appropriate UE.
The state migration protocol described above helps reduce the MME response latency, as it preserves the locality of the processing for users remaining in a cell for the entire duration of an event. As mobility is a geographically localized phenomenon, the impact of forwarding messages to the same MP instance for the entire duration of a call has only a marginal effect on the processing latency. Moreover, by redirecting all the UE state maintenance related to a network event to the same DMME replica, our scheme preserves the consistency of the transient local context required for features such as billing and lawful interception.
E. Deploying the DMME on the Telco Cloud

We are evaluating a number of scenarios for the deployment of the DMME in a Telco Cloud environment. The two most interesting scenarios we identified are the following:

a) Deploy at the Edge: In this scenario, the DMME message processing instances are executed on the eNB hardware. This configuration, coupled with a distributed and locality-aware ROS data storage scheme, is interesting as it makes a cellular network deployment self-sustaining: the capacity of the control plane can grow linearly with the number of users supported by the data plane. Another interesting aspect lies in the failure model of this scenario: hardware failures would involve both the data and the control plane, thus limiting the disruption from a single failure to the local scope of a cell.
b) Deploy throughout the Cloud: In this scenario, the DMME instances can be dynamically instantiated as processes on the Telco Cloud equipment. While the processing locality afforded by this deployment scheme is lower, it forgoes the technical constraints posed by the limits of existing eNB capabilities and allows fine-grained control by the operator over the resources that need to be allocated in the network to support the predictable daily variation in user activity [14].
We are presently building a testbed to emulate a large-scale deployment of DMME nodes under realistic user activity. We plan to perform experiments both with synthetic user traces, generated by established mobility and behavioral models, and with real data obtained from measurements in a production CDMA2000/EVDO cellular network. We look forward to testing the behavior of our system in both scenarios, up to a scale of 10^6 simultaneous users.
Figure 3. Two sketches of the DMME state access protocol: (top) UE mobility during the ECM_IDLE state, where state migration to a new DMME instance is transparently triggered by a subsequent network event; (bottom) UE mobility during the ECM_CONNECTED state, where the ROS notifies the new DMME instance of the existing lease to prevent state migration and the event is forwarded to the DMME instance currently holding the lease.
IV. RELATED WORK
In recent years, the subjects of Cloud computing and virtualization have spurred considerable interest in the telecommunication industry. A number of architectures have been proposed to integrate parts of the telephone control plane as virtualized functions [16]. Sandstone [12] introduced a distributed key-value database architecture with locality features that make it suitable to support a large-scale IMS deployment. Sandstone itself is inspired by previous work on locality-aware distributed hash tables such as [17] and [18]. Our design for DMME extends this approach toward high-throughput interactive signaling applications.
The state machine approach provides a powerful metaphor that streamlines the description and analysis of complex system behaviors [10]. For instance, state machines are well suited to the implementation of highly reliable systems if used in conjunction with algorithms to solve consensus. The properties of state machines, together with the fact that they can be automatically verified, have proved invaluable in countless applications, ranging from fault-tolerant distributed storage systems [7] to control plane functions for advanced telephone services [19]. In this paper we described a generic architecture based on state machine checkpointing that implements mobility-driven migration of processing.
Finally, the idea of pushing the management of cellular mobility toward the edge of the network has been previously proposed in small-scale scenarios, such as self-sustaining cellular networks based on sparsely deployed femtocells [20], where the scale of the system is rather small and where the capabilities of the hardware involved are limited and non-elastic. Our DMME architecture scales this approach up to larger systems thanks to its reliance on a structured overlay and the flexibility of its delegation mechanism.
V. CONCLUSIONS AND FUTURE WORK
In this paper, we presented our vision for Virtual Telco, a further step toward convergence between the telephone and IP-based networks. The transition to Virtual Telco is enabled by the adoption of Telco Clouds, collections of managed hardware clusters controlled by operators and installed in key locations at the edge of their networks. Telco Clouds provide the consolidation required for a manageable infrastructure that operators can leverage both for servicing their customers and for supporting their internal applications.
We explored the requirements for generic applications to benefit from their deployment in a Telco Cloud environment. While legacy applications generally require costly mechanisms to introduce scalability and replication for achieving fault tolerance, properly architected distributed applications can be built that take advantage of the parallelism and network locality offered by the Telco Cloud. We presented one such application, the distributed MME (DMME), a typical example of a control plane application with high availability and scalability requirements.
We implemented a prototype version of DMME that is being deployed on our internal testbed infrastructure. We plan to evaluate and improve the design of DMME by observing its behavior under a variety of user activity and mobility patterns. Furthermore, we are considering optimization mechanisms to improve the locality of state storage, a fundamental requirement for interactive applications that need reliability under extremely low response latencies.
REFERENCES

[1] J. H. Saltzer, D. P. Reed, and D. D. Clark, "End-to-end arguments in system design," ACM Trans. Comput. Syst., vol. 2, no. 4, pp. 277-288, 1984.
[2] W. D. Ambrosch, A. Maher, and B. Sasscer, eds., The Intelligent Network: A Joint Study by Bell Atlantic, IBM and Siemens. Springer-Verlag, 1989.
[3] G. Camarillo and M.-A. Garcia-Martin, The 3G IP Multimedia Subsystem (IMS): Merging the Internet and the Cellular Worlds. John Wiley & Sons, 2006.
[4] S. Gilbert and N. Lynch, "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services," SIGACT News, vol. 33, no. 2, pp. 51-59, 2002.
[5] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and A. Warfield, "Remus: High availability via asynchronous virtual machine replication," in NSDI, 2008.
[6] S. Mullender, ed., Distributed Systems (2nd Ed.). New York, NY, USA: ACM Press / Addison-Wesley Publishing Co., 1993.
[7] T. D. Chandra, R. Griesemer, and J. Redstone, "Paxos made live: an engineering perspective," in PODC '07: Proc. of the 26th Annual ACM Symposium on Principles of Distributed Computing, (New York, NY, USA), pp. 398-407, ACM, 2007.
[8] C. Diot, W. Dabbous, and J. Crowcroft, "Multipoint communication: a survey of protocols, functions, and mechanisms," IEEE Journal on Selected Areas in Communications, vol. 15, pp. 277-290, Apr. 1997.
[9] B. W. Lampson, "How to build a highly available system using consensus," in WDAG '96: Proc. of the 10th International Workshop on Distributed Algorithms, (London, UK), pp. 1-17, Springer-Verlag, 1996.
[10] F. B. Schneider, "Implementing fault-tolerant services using the state machine approach: a tutorial," ACM Comput. Surv., vol. 22, no. 4, pp. 299-319, 1990.
[11] F. Dabek, R. Cox, F. Kaashoek, and R. Morris, "Vivaldi: A decentralized network coordinate system," in Proc. of ACM SIGCOMM, 2004.
[12] G. Shi, J. Chen, H. Gong, L. Fan, H. Xue, Q. Lu, and L. Liang, "Sandstone: A DHT based carrier grade distributed storage system," in Proc. ICPP '09, pp. 420-428, Sep. 2009.
[13] E. Halepovic and C. Williamson, "Characterizing and modeling user mobility in a cellular data network," in PE-WASUN '05, (New York, NY, USA), pp. 71-78, ACM, 2005.
[14] H. Zang and J. C. Bolot, "Mining call and mobility data to improve paging efficiency in cellular networks," in MobiCom '07, (New York, NY, USA), pp. 123-134, ACM, 2007.
[15] J. Risson, A. Harwood, and T. Moors, "Stable high-capacity one-hop distributed hash tables," in ISCC '06, (Washington, DC, USA), pp. 687-694, IEEE Computer Society, 2006.
[16] M. Matuszewski and M. A. Garcia-Martin, "A distributed IP multimedia subsystem (IMS)," in WoWMoM 2007, IEEE Intl. Symposium on a World of Wireless, Mobile and Multimedia Networks, pp. 1-8, Jun. 2007.
[17] M. J. Freedman and D. Mazieres, "Sloppy hashing and self-organizing clusters," in IPTPS, pp. 45-55, 2003.
[18] P. Ganesan, K. Gummadi, and H. Garcia-Molina, "Canon in G major: designing DHTs with hierarchical structure," in Proc. of the 24th Intl. Conference on Distributed Computing Systems, pp. 263-272, 2004.
[19] R. Viswanathan and T. L. Wood, "Portable call agent: A model for rapid development and emulation of network services," Bell Labs Tech. J., vol. 12, no. 4, pp. 159-172, 2008.
[20] N. Thompson, P. Zerfos, R. Sombrutzki, J.-P. Redlich, and H. Luo, "100% organic: design and implementation of self-sustaining cellular networks," in HotMobile '08, (Napa Valley, California), pp. 27-32, 2008.