An Overview of the Amazon PaaS

Platform-as-a-Service
One of the key characteristics of cloud computing is
abstraction: hiding low-level complexity through automation so
developers can focus on applications, which, at the end of the
day, are what really matter to the business.
Initially, this abstraction focused on compute, network and
storage infrastructure, so-called infrastructure as a service
(IaaS), which removed the time and complexity of configuring
and provisioning infrastructure as the basis for deploying
software.
But now these abstractions have moved up the stack to
encompass OS and middleware platforms (application servers,
portal, message queues, etc.), which developers have
traditionally set up by hand.

PaaS is quickly becoming the preferred style for software development and
delivery. Amazon has built upon its IaaS foundation to create a full-feature PaaS
offering for application logic, integration, caching and database management.

Platform as a service (PaaS) enables the full realization of
application-centric computing by abstracting away all of the complexity below the
application tier. This enables improved business agility through faster deployment
of applications and application changes.
Today, developers are requesting that their middleware be delivered in this same
as-a-service model.

All rights reserved. Copyright 2012, Transcend Computing.

Although this concept means different things to different people, common
characteristics include:
- Middleware software has been rewritten to take advantage of the elastic and
resilient nature of modern computing (cheap servers with lots of memory,
commodity operating systems, massive storage, horizontal scaling, replicated
data, etc.).
- Metered pricing (pay-per-use) is preferred over traditional CPU licensing.
- The platforms auto-scale, auto-heal, auto-patch and auto-configure.
- The functions of the platform can be called remotely over an IP-based
network with an API (HTTP, JSON, REST, XML, etc.).
- Application lifecycle management (ALM) tools such as version control, build
management and deployment management are available as a service and
integrated into the platform services.

In addition to changes in the characteristics of the platform itself, there may also be
changes in how the software is delivered. In many cases PaaS services are hosted by
a public cloud provider which is responsible for infrastructure including servers,
networking, power, data centers, etc. While these traditional PaaS providers were
typically third-party managed hosting or cloud service providers, today large IT
shops also deliver common shared platform-as-a-service offerings. In this context,
IT is a managed service provider in its own right, beholden to similar (or often more
demanding) service level guarantees as a public PaaS provider.
The most highly contested attribute of PaaS is multi-tenancy, which describes the
level and degree of computational sharing. Typically, low-sharing environments
(data centers, servers, platforms, etc.) see lower efficiencies. Due to reduced
purchasing power, their operators have a weaker bargaining position and often must
pay higher prices. Also, the lack of scale economies means that resources (like people
and machines) yield lower utilization rates than you may find in a large shared
environment. A downside of a multi-tenant environment is that a neighbor can use
up large amounts of resources, negatively impacting your performance. Think of it
like living in a condo: lawn service may be one of many valued conveniences, but it's
the opposite of convenient when your neighbors use up the hot water! There are
tradeoffs to shared environments.
The Amazon offering has multiple levels of tenancy, implemented at various layers
of their stack:
- At the infrastructure layer, a user can reserve a complete server and place
their preferred platforms on the server. This is called dedicated instances.
- Also at the infrastructure layer, a user can put their platforms on regular EC2
instances where the sharing is at the hypervisor layer.
- A variation of the previous model is where AWS uses EC2 instances for
sharing but locks down the hypervisor and maintains control over it. This is
used in several of their PaaS services (RDS, ElastiCache, etc.).
- A final type of tenancy is when the computing model is completely hidden
from the user. In this paper, we refer to this approach as encapsulated. In
these cases, Amazon is responsible for the availability, scalability, security
and other non-functional concerns of the platform.

Some purists may argue that the only kind of PaaS is one that is fully encapsulated.
However, we have found that it is beneficial to have choices. For example, a
service that provisions servers and platforms and exposes some of their details is
great when you need to interact directly with the component. It allows developers
to use existing engines like MySQL and Memcached. That said, it puts a larger
burden on the developer to maintain the scaling, availability, data backups and so
on.

Support Services
There are a number of services that don't technically fall into the PaaS category, nor
are they naturally part of IaaS. Typically, these crosscutting services intersect with
other services and apply some added functional behavior or management value.
Amazon examples include:
- CloudWatch: This is the Amazon monitoring service, which is used to
collect data on the health of the other services, record the data and, if
necessary, trigger events and alarms so that new actions can be taken.
Amazon has also built agents for its existing services (RDS, SNS, etc.) to
capture their health and report the findings to CloudWatch. This data is
available to any user who provisions a platform service.
- CloudFormation: This is the Amazon orchestrated provisioning service,
which is used to launch entire environments in a predictable and repeatable
manner. For example, one might use CloudFormation to provision a
multi-tiered application by giving the service a template that describes all of the
components and their interdependencies. A single template might launch a
load balancer, four compute instances, two databases, set up the host names,
define auto-scaling properties, and so on. CloudFormation isn't a traditional
piece of developer middleware, but it is commonly used to provision PaaS
services as part of multi-tier application architectures.
- Autoscaling: As the name implies, autoscaling is a service that is used to
increase or decrease the amount of computing resources applied to a task.
The service uses data held in CloudWatch (the monitor) to determine if a
server is overloaded. When this is the case, the autoscaling service can
launch new servers and attach load balancers to those servers to redirect
incoming traffic. Conversely, when load decreases the servers are spun down.
- Identity & Access Management: Security is another crosscutting concern
that affects IaaS and PaaS elements. All of the AWS services are integrated
into the IAM service and make extensive use of its policy system.
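To make the CloudFormation description more concrete, the following is a minimal sketch of how a template for the multi-tier example (one load balancer, four compute instances, two databases) might be assembled as a JSON document. The function name, logical resource names, parameter names and instance sizes are all illustrative assumptions, and the result has not been validated against the CloudFormation service. Host names and auto-scaling properties are left out for brevity.

```python
import json


def build_template(web_count=4, db_count=2):
    """Return a CloudFormation-style template as a plain dict (illustrative only)."""
    resources = {}

    # Compute tier: one EC2 instance resource per web server.
    web_names = [f"WebServer{i}" for i in range(1, web_count + 1)]
    for name in web_names:
        resources[name] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "WebImageId"},  # hypothetical parameter
                "InstanceType": "m1.small",        # illustrative size
            },
        }

    # Load balancer in front of the web servers; "Ref" expresses the
    # dependency between the balancer and each instance.
    resources["LoadBalancer"] = {
        "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
        "Properties": {
            "AvailabilityZones": {"Fn::GetAZs": ""},
            "Listeners": [
                {"LoadBalancerPort": "80", "InstancePort": "80", "Protocol": "HTTP"}
            ],
            "Instances": [{"Ref": name} for name in web_names],
        },
    }

    # Database tier: RDS instances with placeholder settings.
    for i in range(1, db_count + 1):
        resources[f"Database{i}"] = {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {
                "Engine": "MySQL",
                "DBInstanceClass": "db.m1.small",  # illustrative size
                "AllocatedStorage": "5",
                "MasterUsername": {"Ref": "DbUser"},
                "MasterUserPassword": {"Ref": "DbPassword"},
            },
        }

    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative multi-tier stack (sketch, not validated)",
        "Parameters": {
            "WebImageId": {"Type": "String"},
            "DbUser": {"Type": "String"},
            "DbPassword": {"Type": "String", "NoEcho": "true"},
        },
        "Resources": resources,
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```

The point of the sketch is that the whole environment is described as data: the same document can be submitted repeatedly to obtain the same set of components.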

Infrastructure Services
Amazon Web Services is perhaps best known for its IaaS offerings, including
compute, network and storage (EC2, Route 53, Security Groups, Virtual Private
Cloud and Elastic Load Balancer). Although these services are not in the scope of
this paper, it is worth noting that most large systems developed today use a
combination of IaaS and PaaS elements to solve the problem.

The Amazon PaaS Services

Amazon's platform services can be categorized according to their contribution
relative to the application architecture:
1. Application Logic-as-a-Service
2. Database-as-a-Service
3. Caching-as-a-Service
4. Integration-as-a-Service

Application Logic-as-a-Service
Today, application logic is typically written by hand in modern programming
languages like Ruby, Java, PHP or C#. Each language also has frameworks or
libraries that are used to accelerate development. For example, the Rails framework
remains popular for Ruby developers while Java developers commonly use servlet
engines or Spring containers. It is common for a PaaS solution to embrace the use of
multiple programming languages and multiple frameworks; Amazon is no different.
The primary service used to host and execute application logic is Elastic Beanstalk.
This service originally focused exclusively on running applications that were written
for the Java Virtual Machine and could be executed inside of an Apache Tomcat
servlet engine. The service allows a user to upload a .war file (a pre-packaged
servlet) and the Beanstalk service takes care of things like managing the JVM,
patching Tomcat, adjusting configuration files, auto-scaling the service according to
an SLA, managing the dev/test/stage/prod environments (roll forward and roll
back) and controlling multiple versions of the user's software. Beanstalk applications
will often use the other platform services for integration, persistence, security, etc.
More recently, the Beanstalk service was extended to support PHP. In this scenario,
the unit of deployment is the source code not a compiled unit (like the Java .war
file). To make source code transfer simple, Beanstalk also added support for the Git
version control system. Development teams that are already using Git can continue
to do so and copy their source branches to the Beanstalk service. From here, the
source files are picked up and can be executed. Developers that are using an
alternative version control system like SVN or CVS will need to take an extra step of
bridging their current system with Git.
Current criticisms of Elastic Beanstalk include the lack of additional language
support (Ruby, Node.js, C#, etc.), the lack of a continuous build environment like
Hudson/Jenkins and the lack of integrated testing frameworks for functional testing,
regression testing, stress testing, etc. Despite the limitations, a growing number of
third parties are filling the gaps and Amazon is continuing to release updates at a
frantic pace.

Database-as-a-Service
Amazon offers three native choices for databases, each with its own advantages
and disadvantages. The earliest offering was SimpleDB. This solution was
introduced as a simple way to store information persistently by using key/value
pairs. SimpleDB's claim to fame is that it really is easy to use, mostly because it
doesn't have many of the more complicated features developers have come to
expect in database management systems. It does excel from an administrative
perspective. For example, data is automatically replicated and backed up for the
user. The design of SimpleDB embraces encapsulated horizontal scalability, enabling
applications to generate massive loads against the database without ever worrying
about the number of CPUs, memory or other physical resources that are provisioned
behind the scenes.
Although SimpleDB satisfied many needs, most business applications used a
relational database. Amazon responded with Relational Database Service (RDS).
Unlike SimpleDB, RDS is not an encapsulated horizontally scaling system as this
would require significant changes to the underlying database engines. Instead, RDS
gives the users the ability to self-service provision a database and configure it to
their needs. The service currently supports most of the popular editions and
versions of MySQL and Oracle. Users can specify configuration settings for
their database including the size of the machine (CPUs and memory), backup &
restore options, the ability to auto-patch the database engine, the publishing of
monitoring data and high availability features like the auto-recovery of a database
system in a remote data center if the original goes down.
The third database service offered by Amazon is DynamoDB. This offering is
considered a NoSQL database, which means that it doesn't rely on SQL for data
definition (create table, etc.) or for data manipulation (select * where). Instead,
DynamoDB offers a schema-less database management system. Many view this
offering as a replacement for SimpleDB because it has a superset of the functionality
while being delivered in the same encapsulated, horizontally scalable manner.
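The idea of a schema-less store can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class and method names are hypothetical and do not correspond to the DynamoDB API. It shows that items in the same table are looked up by key and may carry entirely different attributes, with no table definition or SQL involved.

```python
class SchemalessTable:
    """Toy in-memory table: items share a key attribute and nothing else."""

    def __init__(self, key_name):
        self.key_name = key_name
        self._items = {}

    def put_item(self, item):
        # The only structural requirement is that the key attribute is present.
        if self.key_name not in item:
            raise ValueError(f"item is missing key attribute {self.key_name!r}")
        self._items[item[self.key_name]] = dict(item)

    def get_item(self, key):
        # Return a copy so callers cannot mutate stored state by accident.
        item = self._items.get(key)
        return dict(item) if item is not None else None


if __name__ == "__main__":
    users = SchemalessTable(key_name="user_id")
    users.put_item({"user_id": "u1", "name": "Ada", "email": "ada@example.com"})
    users.put_item({"user_id": "u2", "name": "Lin", "last_login": 1340000000})
    print(users.get_item("u1"))
    print(users.get_item("u2"))
```

Because there is no fixed column list, adding a new attribute to one item requires no change to the table or to other items.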

Caching-as-a-Service
High-speed caching has become a mainstay in modern computing architectures. A
properly implemented caching layer will both significantly reduce latency and
increase data throughput.
Amazon offers an implementation of a clustered cache by wrapping one of the most
popular open source solutions, Memcached. Users are able to launch a cache via
self-service provisioning (API or portal). The Memcached software is exposed to the
developer and commands can be issued directly against it.
The ElastiCache service offers the ability to associate the caching software with
various types of EC2 compute services (number of CPUs, amount of memory to
dedicate, etc.). Arrays of instances are combined to create a horizontal scaling effect.
A set of nodes that are associated together is known as a cache cluster, which can be
managed as a single unit from a scaling and availability perspective.
For instance, if a caching node locks up or goes down, the ElastiCache service will
automatically replace it with a new node. If the cache is overloaded,
alerts can be defined to grow the size of the cluster. Finally, the ElastiCache service
manages the patching and maintenance of the Memcached software. Software
updates are applied according to user-specified parameters, typically associated
with off-peak or after-hours maintenance windows.
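The way an application might use such a cluster can be sketched with plain dictionaries standing in for the Memcached nodes. All names below are hypothetical, and the simple modulo hashing is an assumption made for brevity (production Memcached clients commonly use consistent hashing instead). The sketch shows keys being spread across nodes, a read that falls back to the database only on a cache miss, and the effect of a node being replaced with an empty one.

```python
import hashlib


class CacheCluster:
    """Toy cache cluster: each node is a dict, keys are spread by hashing."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}

    def _node_for(self, key):
        # Simplified placement: hash the key and pick a node by modulo.
        names = sorted(self.nodes)
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return names[int(digest, 16) % len(names)]

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)

    def set(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def replace_node(self, name):
        # A replacement node starts empty; its cached entries are gone.
        self.nodes[name] = {}


def read_through(cache, key, load_from_database):
    """Return the cached value, loading and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_database(key)
        cache.set(key, value)
    return value


if __name__ == "__main__":
    cluster = CacheCluster(["node-a", "node-b", "node-c"])
    slow_lookup = lambda key: f"row for {key}"  # stands in for a database query
    print(read_through(cluster, "user:1", slow_lookup))  # miss, loads from database
    print(read_through(cluster, "user:1", slow_lookup))  # hit, served from cache
```

Since a replaced node comes back empty, the application must always be able to rebuild a value from the system of record; the cache only saves repeated work.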

Integration-as-a-Service
Amazon Web Services currently offers two types of integration services for
system-to-system decoupling and messaging. At this time, there is no mechanism to do
payload transformations or protocol mediation. The current services are Simple
Notification Service (pub/sub communication) and Simple Queue Service (message
queue).
A key principle of system design is decoupling of modules via messaging. AWS
provides an event-based mechanism that allows a publisher to create a topic of
interest and then publish messages related to the topic. Multiple users (or
systems) can subscribe to the topic and receive a copy of any published messages.
Simple Notification Service (SNS) provides pub/sub (publication/subscription)
capabilities inside the AWS cloud. The service is an encapsulated, horizontally
scalable offering. Amazon does not indicate which message libraries it uses behind
the service interface to provide the functionality. Developers can call the service via
SOAP- or REST-based commands and specify their delivery protocol of choice
(HTTP, HTTPS, SMTP, SQS or SMS). After a message has been placed on a topic, the
SNS service sends the message to all subscribers.
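The topic-based fan-out described above can be sketched in a few lines. This is an in-memory illustration only: the class and method names are hypothetical and are not the SNS API, and subscribers are plain Python callables rather than HTTP, SMTP or SQS endpoints.

```python
class TopicBroker:
    """Toy pub/sub broker: every subscriber receives a copy of each message."""

    def __init__(self):
        self._topics = {}

    def create_topic(self, name):
        # Creating a topic that already exists leaves its subscribers untouched.
        self._topics.setdefault(name, [])
        return name

    def subscribe(self, topic, deliver):
        # 'deliver' is any callable that accepts the message.
        self._topics[topic].append(deliver)

    def publish(self, topic, message):
        # Fan out: call each subscriber once with the same message.
        subscribers = self._topics[topic]
        for deliver in subscribers:
            deliver(message)
        return len(subscribers)


if __name__ == "__main__":
    broker = TopicBroker()
    orders = broker.create_topic("orders")
    broker.subscribe(orders, lambda m: print("billing received:", m))
    broker.subscribe(orders, lambda m: print("shipping received:", m))
    broker.publish(orders, "order-42 created")
```

The publisher never refers to billing or shipping directly; it only knows the topic, which is what keeps the modules decoupled.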
In its current state, SNS does not offer guaranteed delivery notification by
confirming receipt of individual messages, nor does it provide guarantees on the
timeliness of delivery. SNS should be viewed as an Internet-scale pub/sub delivery
system that provides best-effort service levels. It should not be used in instances
where guaranteed delivery (at least once, exactly once, not more than once) is
required, such as in financial transactions, unless additional guarantees are built
around the core service. The service is considered massively scalable and does
provide high availability by offering intra-region redundancy and redundant
replication of temporarily persisted objects. The service leverages other AWS
services such as CloudWatch for monitoring, CloudFormation for orchestration and
Identity & Access Management for fine-grained access control.
A second integration service offered by AWS is Simple Queue Service (SQS), which
allows developers to separate two modules from a load-over-time perspective. For
example, if module A were to receive significant load in a short period of time, work
requests can be placed in a queue. Module B could then pull items off the queue and
begin processing them in order. A common scenario is when modules have different
owners (other companies, siloed applications, etc.) and the modules need to
communicate. Normally, the modules would be forced to communicate and process
loads at the same pace. The message queue enables the two modules to work at
different speeds: work requests accumulate in the queue until the slower module
is ready to process them.
Using the SQS service, developers can use either a SOAP or RESTful interface to
create, delete and inspect queues as well as to add or remove items from a queue.
Messages can also be batched, allowing a group of messages to be processed
together. Each message is locked while it's being processed. This prevents multiple
consumers from accidentally processing the same item on a queue. The SQS service
is an encapsulated, horizontally scalable offering; while this design enables massive
scaling, a downside is that the distributed design makes it more difficult to manage
the state of messages across nodes in the cloud.
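The locking behavior can be sketched with a small in-memory queue. SQS implements the lock as a visibility timeout: a received message is hidden from other consumers for a period and reappears if it is not deleted in time. The class below is a simplified single-process model of that idea; its names, the default timeout and the injectable clock are assumptions for illustration and are not the SQS API.

```python
import time
import uuid


class SimpleQueue:
    """Toy queue where a received message is hidden until deleted or timed out."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock
        self._messages = {}  # message id -> {"body": ..., "visible_at": ...}

    def send(self, body):
        message_id = str(uuid.uuid4())
        # New messages are visible immediately.
        self._messages[message_id] = {"body": body, "visible_at": float("-inf")}
        return message_id

    def receive(self):
        """Return (message_id, body) for the oldest visible message, or None."""
        now = self._clock()
        for message_id, message in self._messages.items():
            if message["visible_at"] <= now:
                # Lock the message by hiding it from other consumers for a while.
                message["visible_at"] = now + self._timeout
                return message_id, message["body"]
        return None

    def delete(self, message_id):
        # The consumer deletes the message once processing has succeeded.
        self._messages.pop(message_id, None)


if __name__ == "__main__":
    queue = SimpleQueue(visibility_timeout=30.0)
    queue.send("resize image 17")
    first = queue.receive()
    print("consumer 1 got:", first)
    print("consumer 2 got:", queue.receive())  # nothing visible while locked
    queue.delete(first[0])
```

If a consumer crashes before deleting the message, the lock simply expires and another consumer can pick the work up, which is why consumers in this style of design should tolerate seeing the same message more than once.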
AWS has chosen not to implement advanced queuing capabilities like FIFO
(first-in-first-out) or priority queues. The assumption is that if users want these
features, they will extend the core service to include finer-grained management of
message arrivals and departures. The decision to make SQS a massively scalable, highly
available system may have contributed to the decision to not support existing
protocols like STOMP or AMQP. Although neither SQS nor SNS tries to meet JMS (Java
Message Service) requirements, they do satisfy several of the API mandates and
libraries are available.
Simple Workflow Service (SWF) is a recent addition to the AWS PaaS suite.
Architects often break large complex applications into multiple smaller modules.
These modules are then called one at a time based on the results (or state) of the
prior call. SWF manages the distributed state and facilitates the execution of
multi-step applications. With workflow as part of the service name, many think this
offering is human workflow or BPM; this isn't the case. It could serve as the engine
for a traditional BPM solution but in its current low-level form, it would be
cumbersome to build end-user applications with it. Instead, it should be used as the
coordinator of distributed execution of system tasks with dependencies,
concurrency, pre-defined ordering and ordering based on state.
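The coordination role described here can be sketched as a small dependency-driven task runner. This is a local, single-process illustration using Python's standard `graphlib` module (Python 3.9 or later); the function name and the task and dependency structures are hypothetical and are not the Simple Workflow API. Each task receives the results of earlier tasks, so later steps can act on prior state.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run tasks in dependency order.

    tasks: mapping of task name -> callable taking the results dict so far.
    dependencies: mapping of task name -> set of task names it must follow.
    Returns (execution order, results by task name).
    """
    sorter = TopologicalSorter(
        {name: dependencies.get(name, ()) for name in tasks}
    )
    sorter.prepare()
    results = {}
    order = []
    while sorter.is_active():
        # Every task returned here has all prerequisites complete, so a real
        # coordinator could dispatch the whole batch concurrently.
        for name in sorted(sorter.get_ready()):
            results[name] = tasks[name](results)
            order.append(name)
            sorter.done(name)
    return order, results


if __name__ == "__main__":
    tasks = {
        "validate": lambda r: "ok",
        "charge": lambda r: f"charged after {r['validate']}",
        "reserve": lambda r: f"reserved after {r['validate']}",
        "ship": lambda r: (r["charge"], r["reserve"]),
    }
    dependencies = {
        "charge": {"validate"},
        "reserve": {"validate"},
        "ship": {"charge", "reserve"},
    }
    order, results = run_workflow(tasks, dependencies)
    print(order)
```

In this example the charge and reserve steps become ready together once validation completes, while shipping waits for both.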

Summary findings
While the Amazon cloud is best known for the original EC2 infrastructure services,
the majority of the recent releases have been in the platform services space. This is
consistent with the growing belief that IaaS is necessary, but not sufficient; the real
value in enabling application-centric computing models comes from innovations in
the PaaS space.
Although Amazon doesn't publish revenue figures on its cloud offering, many
have developed models that project impressive usage and growth rates. Advanced
users are increasingly expanding the breadth of the platform services they rely upon
because of their convenience, accessibility and low price.
Although we can't substantiate it with data, Transcend believes that Amazon
currently has the largest PaaS offering when measured by annual revenue, total
number of users or total compute hours.
By virtually any measure, the AWS PaaS offering is a market leader. Perhaps more
importantly, Amazon has demonstrated a strong commitment to this space and a
desire to innovate and lead at progressively higher layers of the stack. Based on its
impressive vision and unrivaled ability to execute, we believe Amazon will parlay its
IaaS dominance into a similar position of strength in PaaS.

About Transcend
Transcend Computing is an innovator in Amazon Compatible Environments (ACE)
for public, private and hybrid cloud computing. Transcend was formed to help
developers, enterprises and managed service providers to capitalize on the
momentum of Amazon Web Services.
StackStudio is a visual, drag-and-drop online development environment for
assembling multi-tier application topologies using the Amazon CloudFormation
format. Application stacks assembled with StackStudio are ready to run on Amazon
Web Services (AWS) and on other public and private ACE platforms.
These stacks can then be shared with other developers in StackPlace, an open
social architecture community sponsored by Transcend Computing. StackPlace
allows developers to create, contribute, consume and collaborate on
ACE-compatible application topologies.
