
CLOUD COMPUTING AND DATA CENTERS
Cloud Computing - Some terms
• The term “cloud” is used as a metaphor for the Internet
• Concept generally incorporates combinations of the following
– Infrastructure as a service (IaaS)
– Platform as a service (PaaS)
– Software as a service (SaaS)
• Not to be confused with
– Grid Computing – a form of distributed computing
• Cluster of loosely coupled, networked computers acting in concert to perform
very large tasks
– Utility Computing – packaging of computing resources, such as computing
power and storage, as a metered service
– Autonomic computing – self-managed
Grid Computing
• Share Computers and data
• Evolved to harness inexpensive computers in Data center to solve variety of problems
• Harness power of loosely coupled computers to solve a technical or mathematical problem
• Used in commercial applications for drug discovery, economic forecasting, seismic
analysis, and back-office processing
• Small to big
– Can be confined to a corporation
– Large public collaboration across many companies and networks
• Most grid solutions are built on
– Computer Agents
– Resource Manager
– Scheduler
• Compute grids
– Batch up jobs
– Submit the job to the scheduler, specifying requirements and SLA (specs) required for
running the job
– Scheduler matches specs with available resources and schedules the job to be run
– Farms could be as large as 10K cpus
• Most financial firms have grids like this
• Grids lack automation, agility, simplicity and SLA guarantees
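The matching step described above (a scheduler comparing a job's specs with available resources) can be sketched as follows. This is a minimal illustration and not the scheduler of any particular grid product; the names `Node`, `Job`, and `schedule`, and the choice of first-fit matching, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpus: int
    free_mem_gb: int


@dataclass
class Job:
    name: str
    cpus: int      # CPUs the job requires
    mem_gb: int    # memory the job requires


def schedule(job: Job, nodes: list[Node]) -> Optional[str]:
    """First-fit: return the first node that satisfies the job's specs."""
    for node in nodes:
        if node.free_cpus >= job.cpus and node.free_mem_gb >= job.mem_gb:
            # Reserve the resources so later jobs see the reduced capacity.
            node.free_cpus -= job.cpus
            node.free_mem_gb -= job.mem_gb
            return node.name
    return None  # no node currently meets the specs; the job stays queued


# Hypothetical two-node farm and three batched jobs.
farm = [Node("n1", 4, 16), Node("n2", 16, 64)]
jobs = [Job("risk", 8, 32), Job("etl", 4, 8), Job("big", 32, 8)]
placements = [schedule(job, farm) for job in jobs]
```

A real scheduler would also weigh priorities, queues, and SLA deadlines; first-fit is used here only because it is the simplest policy that shows the matching idea.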
Utility Computing
• More related to cloud computing
– Applications, storage, computing power and network
• Requires cloud like infrastructure
• Pay by the use model
– Similar to electric service at home
• Pay for extra resources when needed
– To handle expected surge in demand
– Unanticipated surges in demand
• Better economics
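The "better economics" of the pay-by-use model can be illustrated with a small calculation. The usage profile and the rate below are hypothetical numbers chosen for the example, and `metered_cost` and `fixed_cost` are illustrative helper names.

```python
def metered_cost(hourly_usage: list[int], rate_per_unit_hour: float) -> float:
    """Pay-per-use: charge only for the units actually consumed each hour."""
    return sum(hourly_usage) * rate_per_unit_hour


def fixed_cost(hourly_usage: list[int], rate_per_unit_hour: float) -> float:
    """Owned capacity: sized for the peak and paid for in every hour."""
    return max(hourly_usage) * len(hourly_usage) * rate_per_unit_hour


# Hypothetical day: 2 servers for 20 hours, then a 4-hour surge needing 10.
usage = [2] * 20 + [10] * 4
pay_per_use = metered_cost(usage, rate_per_unit_hour=0.5)
peak_sized = fixed_cost(usage, rate_per_unit_hour=0.5)
```

Under these assumed numbers the metered bill covers 80 server-hours while peak-sized capacity pays for 240, which is the gap the utility model aims to close.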
Cloud computing – History
• Evolved over a period of time
• Roots traced back to Application Service Providers in
the 1990’s
• Parallels to SaaS
• Evolved from Utility computing and is a broader concept
Cloud computing
• A much broader concept
• Encompasses
– IaaS, PaaS, SaaS
• Dynamic provision of services/resource pools in a co-ordinated fashion
– On demand computing – No waiting period
– Location of resource is irrelevant
• May be relevant from performance perspective, data locality
• Applications run somewhere on the cloud
– Web applications fulfill these for end user
– However, for application developers and IT
• Allows them to develop, deploy, and run applications that can easily grow
capacity (scalability), work fast (performance), and offer good reliability
• Without concern for the nature and location of underlying infrastructure
– Activate, retire resources
– Dynamically update infrastructure elements without affecting the business
Clouds Versus Grids

• Clouds and Grids are distinct


• Cloud
– Full private cluster is provisioned
– Individual user can only get a tiny fraction of the total resource pool
– No support for cloud federation except through the client interface
– Opaque with respect to resources
• Grid
– Built so that individual users can get most, if not all of the resources in a
single request
– Middleware approach takes federation as a first principle
– Resources are exposed, often as bare metal
• These differences mandate different architectures for each
Cloud Mythologies
• Cloud computing infrastructure is just a web service interface to
operating system virtualization.
• Cloud computing imposes a significant performance penalty over
“bare metal” provisioning.
– “I won’t be able to run a private cloud because my users will
not tolerate the performance hit.”
• Clouds and Grids are equivalent
– “In the mid 1990s, the term grid was coined to describe
technologies that would allow consumers to obtain computing
power on demand.”
Commercial clouds
Cloud Anatomy

• Application Services(services on demand)


– Gmail, Google Calendar
– Payroll, HR, CRM, etc.
– IBM Lotus Live
• Platform Services (resources on demand)
– Middleware, Integration, Messaging, Information, Connectivity, etc.
– AWS, IBM Virtual images, Boomi, CastIron, Google App Engine
• Infrastructure Services (physical assets as services)
– IBM Blue house, VMware, Amazon EC2, Microsoft Azure Platform, Sun Parascale and more
2010 Gartner “IT Hype Cycle” for Emerging Technologies
[Figure: Gartner hype cycle charts for 2007, 2008, 2009, and 2010]
Cloud Computing - layers

Layers Architecture
What is a Cloud?

• Users: Individuals, Corporations, Non-Commercial
• Cloud Middleware
– Storage Provisioning
– OS Provisioning
– Network Provisioning
– Service (apps) Provisioning
– SLA (monitor), Security, Billing, Payment
• Resources: Services, Storage, Network, OS
Why cloud computing
• Data centers are notoriously underutilized, often idle 85% of the
time
– Over provisioning
– Insufficient capacity planning and sizing
– Improper understanding of scalability requirements etc
• Industry analysts, including thought leaders from Gartner, Forrester, and IDC,
agree that this new model offers significant advantages for fast-paced startups,
SMBs and enterprises alike.
• Cost effective solutions to key business demands
• Move workloads to improve efficiency
• Note: SMB above means small and medium-sized businesses. The acronym is shared
with Server Message Block (SMB, also known as Common Internet File System, CIFS),
an application-layer network protocol mainly used to provide shared access to files,
printers, serial ports, and miscellaneous communications between nodes on a network.
That protocol also provides an authenticated inter-process communication mechanism,
and most of its usage involves computers running Microsoft Windows, where it is
often known as "Microsoft Windows Network".
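To make the underutilization point above concrete, the arithmetic behind "idle 85% of the time" can be sketched as follows. The hourly cost, the fleet size, and the 60% consolidation target are assumed values for illustration, and the function names are hypothetical.

```python
def cost_per_useful_hour(hourly_cost: float, busy_percent: int) -> float:
    """Effective cost of one hour of real work on a mostly idle server."""
    return hourly_cost * 100 / busy_percent


def servers_after_consolidation(servers: int, busy_percent: int,
                                target_percent: int) -> int:
    """Servers needed if the same work runs at a higher target utilization."""
    busy_equivalents = servers * busy_percent        # in percent-servers
    return -(-busy_equivalents // target_percent)    # integer ceiling


# A server busy 15% of the time (idle 85%) at an assumed 1.0 per hour.
before = cost_per_useful_hour(1.0, 15)
# An assumed fleet of 100 such servers, consolidated to 60% utilization.
needed = servers_after_consolidation(100, 15, 60)
```

With these assumptions each useful hour costs roughly 6.7 times the nominal rate, and the same work fits on a quarter of the machines.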
How do they work?
• Public clouds are opaque
– What applications will work well in a cloud?
• Many of the advantages offered by Public Clouds appear useful for “on
premise” IT
– Self-service provisioning
– Flexible resource allocation
• What extensions or modifications are required to support a wider variety of
services and applications?
– Data assimilation
– Multiplayer gaming
– Mobile devices
Cloud computing -
Characteristics
• Agility – On demand computing infrastructure
– Linearly scalable – challenge
• Reliability and fault tolerance
– Self healing – Hot backups, etc
– SLA driven – Policies on how quickly requests are processed
• Multi-tenancy – Several customers share infrastructure without compromising
the privacy and security of each customer’s data
• Service-oriented – compose applications out of loosely coupled services. One
service failure will not disrupt other services. Expose these services as APIs
• Virtualized – decoupled from underlying hardware. Multiple applications can run in
one computer
• Data, Data, Data
– Distributing, partitioning, security, and synchronization
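The multi-tenancy and data-partitioning characteristics above can be combined in one small sketch: keys are scoped by tenant and spread over partitions with a stable hash. `TenantStore` and `partition_for` are hypothetical names, and hashing with SHA-256 is just one reasonable choice, not something the slides prescribe.

```python
import hashlib


def partition_for(tenant: str, key: str, num_partitions: int) -> int:
    """Map a tenant-scoped key to a partition using a stable hash."""
    digest = hashlib.sha256(f"{tenant}/{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class TenantStore:
    """Shared storage in which every entry is keyed by (tenant, key)."""

    def __init__(self, num_partitions: int):
        self.partitions = [dict() for _ in range(num_partitions)]

    def put(self, tenant: str, key: str, value) -> None:
        p = partition_for(tenant, key, len(self.partitions))
        self.partitions[p][(tenant, key)] = value

    def get(self, tenant: str, key: str):
        p = partition_for(tenant, key, len(self.partitions))
        # A tenant can only reach entries stored under its own name.
        return self.partitions[p].get((tenant, key))


store = TenantStore(num_partitions=4)
store.put("acme", "invoice-1", "A")
store.put("globex", "invoice-1", "G")
```

Two tenants can use the same key without seeing each other's data; real systems add authentication, encryption, and rebalancing on top of this basic idea.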
Public, Private and Hybrid
clouds
Public clouds
• Open for use by general public
– Exist beyond firewall, fully hosted and managed by the vendor
– Individuals, corporations and others
– Amazon Web Services and Google App Engine are examples
• Offers startups and SMBs quick setup, scalability, flexibility,
and automated management. The pay-as-you-go model helps
startups to start small and go big
• Security and compliance?
• Reliability concerns hinder the adoption of cloud
– Amazon S3 services were down for 6 hours
Public Clouds (Now)

• Large scale infrastructure available on a rental basis


– Operating System virtualization (e.g. Xen, kvm) provides CPU isolation
– “Roll-your-own” network provisioning provides network isolation
– Locally specific storage abstractions
• Fully customer self-service
– Service Level Agreements (SLAs) are advertised
– Requests are accepted and resources granted via web services
– Customers access resources remotely via the Internet
• Accountability is e-commerce based
– Web-based transaction
– “Pay-as-you-go” and flat-rate subscription
– Customer service, refunds, etc.
Private Clouds
• Within the boundaries (firewall) of the organization
• All advantages of public cloud with one major difference
– Reduce operation costs
– Has to be managed by the enterprise
• Fine grained control over resources
• More secure, as they are internal to the organization
• Schedule and reshuffle resources based on business demands
• Ideal for apps related to tight security and regulatory concerns
• Development requires hardware investments and in-house expertise
• Cost could be prohibitive and cost might exceed public clouds
Clouds and SOA

• SOA enabled cloud computing to become what it is today
• Physical infrastructure, like SOA, must be discoverable, manageable, and governable
• REST (Representational State Transfer) is widely used; it is an architectural style
rather than a protocol
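As a rough illustration of the REST style mentioned above, the sketch below maps a uniform set of verbs (GET, POST, DELETE) onto a collection of resources. It is an in-memory stand-in, not a web server; the `/vms` path, the `VmResource` class, and the status codes chosen are assumptions for the example.

```python
class VmResource:
    """In-memory stand-in for a collection resource such as /vms."""

    def __init__(self):
        self._vms = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body=None):
        """Return (status_code, payload) for a uniform-interface request."""
        parts = [p for p in path.split("/") if p]
        if parts[:1] != ["vms"] or len(parts) > 2:
            return 404, None
        if len(parts) == 1:                      # the collection: /vms
            if method == "GET":
                return 200, list(self._vms.values())
            if method == "POST":
                vm = {"id": self._next_id, **(body or {})}
                self._vms[vm["id"]] = vm
                self._next_id += 1
                return 201, vm
            return 405, None
        if not parts[1].isdigit():               # one item: /vms/<id>
            return 404, None
        vm_id = int(parts[1])
        if vm_id not in self._vms:
            return 404, None
        if method == "GET":
            return 200, self._vms[vm_id]
        if method == "DELETE":
            del self._vms[vm_id]
            return 204, None
        return 405, None


api = VmResource()
created = api.handle("POST", "/vms", {"cpus": 2})
fetched = api.handle("GET", "/vms/1")
deleted = api.handle("DELETE", "/vms/1")
missing = api.handle("GET", "/vms/1")
```

Each resource is addressed by its path and manipulated only through the shared verbs, which is the property that makes such interfaces easy to discover and govern.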
Clouds for Developers
• Ability to acquire, deploy, configure and host
environments
• Perform development unit testing, prototyping and full
product testing
Open Source Cloud Infrastructure

• Simple
– Transparent => need to “see” into the cloud
– Scalable => complexity often limits scalability
– Secure => limits adoptability
• Extensible
– New application classes and service classes may require new features
– Clouds are new => need to extend while retaining useful features
• Commodity-based
– Must leverage extensive catalog of open source software offerings
– New, unstable, and unsupported infrastructure design is a barrier to uptake,
experimentation, and adoption
• Easy
– To install => system administration time is expensive
– To maintain => system administration time is really expensive
Microsoft and Amazon face
challenges
• Globus/Nimbus
– Client-side cloud-computing interface to Globus-enabled TeraPort cluster
– Based on the Globus Virtual Workspace Service
– Shares upsides and downsides of Globus-based grid technologies
• Enomalism (now called ECP)
– Start-up company distributing open source
– REST APIs (Representational State Transfer, or REST, is a
style of software architecture for distributed hypermedia
systems such as the World Wide Web)
• Reservoir
– European open cloud project
– Many layers of cloud services and tools
– Ambitious and wide-reaching but not yet accessible as an implementation
• Eucalyptus
– Cloud Computing on Clusters
– Amazon Web Services compatible
– Supports kvm and Xen
• OpenNebula

• Joyent
– Based on JavaScript
Open Source Cloud Ecosystem -
Tools
• RightScale
– Startup focused on providing client tools as SaaS hosted in AWS
– Uses the REST interface
• Canonical
– Ubuntu 9.10 (Karmic Koala)
– Includes KVM and Xen Hypervisors
Open Source Cloud Anatomy

• Extensibility
– Simple architecture and open internal APIs
• Client-side interface
– Amazon’s AWS interface and functionality (familiar and testable)
• Networking
– Virtual private network per cloud
– Must function as an overlay => cannot supplant local networking
• Security
– Must be compatible with local security policies
• Packaging, installation, maintenance
– system administration staff is an important constituency for uptake
Open Source Cloud Anatomy (contd.)

• Private clouds are really hybrid clouds


– Users want private clouds to export the same APIs as the public clouds
• In the Enterprise, the storage model is key
– Scalable “blob” storage doesn’t quite fit the notion of “data file.”
• Cloud Federation is a policy mediation problem
– No good way to translate SLAs in a cloud allocation chain
– “Cloud Bursting” will only work if SLAs are congruent
• Customer SLAs allow applications to consider cost as first-class principle
– Buy the computational, network, and storage capabilities that are required
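The claim above that “Cloud Bursting” only works if SLAs are congruent can be sketched as a placement check. The `Sla` fields, the `is_congruent` rule, and the `place` function are hypothetical; a real federation would compare far richer policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    min_availability: float   # e.g. 0.999
    max_latency_ms: int
    data_region: str


def is_congruent(required: Sla, offered: Sla) -> bool:
    """The offered SLA must be at least as strong on every term."""
    return (offered.min_availability >= required.min_availability
            and offered.max_latency_ms <= required.max_latency_ms
            and offered.data_region == required.data_region)


def place(load: int, private_capacity: int, required: Sla, public_offer: Sla) -> dict:
    """Keep work in the private cloud; burst the overflow only if SLAs match."""
    overflow = max(0, load - private_capacity)
    private = load - overflow
    if overflow and is_congruent(required, public_offer):
        return {"private": private, "public": overflow, "rejected": 0}
    return {"private": private, "public": 0, "rejected": overflow}


required = Sla(0.999, 100, "eu")
good_offer = Sla(0.9995, 80, "eu")
weak_offer = Sla(0.99, 80, "eu")
burst = place(120, 100, required, good_offer)
blocked = place(120, 100, required, weak_offer)
```

When the public offer is weaker than what the application was promised, the overflow has nowhere to go, which is the policy mediation problem the slide describes.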
Open Source Clouds contd.
Eucalyptus (Elastic Utility Computing Architecture Linking Your
Programs To Useful Systems)
Clouds and Virtualization
• Operating System virtualization (Xen, KVM, VMware, Hyper-V)
is only apparent for IaaS
– AppEngine = BigTable
• Hypervisors virtualize CPU, Memory, and local device access as
a single virtual machine (VM)
• IaaS Cloud allocation is
– Set of VMs
– Set of storage resources
– Private network
• Allocation is atomic
• SLA – A service level agreement (SLA) is a negotiated agreement between
two parties, where one is the customer and the other is the service provider.
This can be a legally binding formal or informal “contract”.
– The SLA records a common understanding about services, priorities,
responsibilities, guarantees, and warranties
• Monitoring
• KVM – Kernel-based Virtual Machine
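The statement above that an IaaS allocation (a set of VMs, storage, and a private network) is atomic can be read as "all or nothing". A minimal sketch of that reading follows; the `Pool` class and the resource names are assumptions for the example, and a real allocator would also need locking for concurrent requests.

```python
class Pool:
    """Free capacity of a hypothetical IaaS cloud."""

    def __init__(self, vms: int, storage_gb: int, networks: int):
        self.free = {"vms": vms, "storage_gb": storage_gb, "networks": networks}

    def allocate(self, request: dict) -> bool:
        """Grant every requested resource or none of them."""
        # Check everything first so a partial grant never becomes visible.
        if any(self.free[kind] < amount for kind, amount in request.items()):
            return False
        for kind, amount in request.items():
            self.free[kind] -= amount
        return True


pool = Pool(vms=4, storage_gb=100, networks=1)
first = pool.allocate({"vms": 2, "storage_gb": 50, "networks": 1})
# The second request fits on VMs and storage but no network is left,
# so nothing at all is taken from the pool.
second = pool.allocate({"vms": 2, "storage_gb": 10, "networks": 1})
```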
Cloud Infrastructure
• Network operations center

• Physical Infrastructure
Cloud Infrastructure (contd.)
• Physical Security

• Cooling
Cloud Infrastructure (contd.)
• Power infrastructure, Network Cabling, Fire
safety
Cloud computing open issues
• Governance
– Security, Privacy and control
– SLA guarantees
– Ownership and control
– Compliance and auditing
• Sarbanes-Oxley Act
• Reliability
– Good service provider with 99.999% availability
• Cloud independence – Vendor lock-in?
– Cloud provider goes out of business
• Data Security
• Cloud lock-in and loss of control
– Plan for moving data along with Cloud provider
• Cost?
• Simplicity?
• Tools
• Controls on sensitive data?
– Out of business
• Big and small
– Scalability and cost outweigh reliability for small
businesses
– Big businesses may have a problem
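The 99.999% availability figure in the reliability item above translates into a downtime budget that is easy to compute. The sketch below uses a 365-day year; the function names are illustrative, and the six-hour outage is just an example duration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Downtime allowed per year at a given availability level."""
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


def availability_after_outage(outage_minutes: float) -> float:
    """Yearly availability (in percent) if this were the only outage."""
    return 100.0 * (1 - outage_minutes / MINUTES_PER_YEAR)


five_nines = downtime_minutes_per_year(99.999)    # about 5.3 minutes
three_nines = downtime_minutes_per_year(99.9)     # about 8.8 hours
six_hour_outage = availability_after_outage(6 * 60)
```

A single six-hour outage already drops a year's availability to about 99.93%, well short of five nines.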
Battle in the cloud
• Amazon Web Services
• Google App Engine
– Free up to 500 MB
• Free for small scale applications?
• Universities?
– Pay when you scale
• GoGrid
• .. Some more Hosting companies
– HP, IBM, Oracle (+Sun) and Dell
• Cloud computing entails building massive distributed systems
– They use replicated data, shared relational databases, parallelism
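– Brewer’s “CAP theorem”: a distributed system cannot guarantee Consistency, Availability,
and Partition tolerance all at once, so clouds sacrifice Consistency for Availability and
Performance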

• Cloud providers believe this theorem

• Key to scalability is decoupling (modules should communicate as little as possible
with one another)
– Loosest possible synchronization (any synchronized mechanism is a risk)

• We gave up on consistency too easily

Long ago, we knew how to build reliable, consistent distributed systems.
• A consistent distributed system will often have many
components, but users observe behavior
indistinguishable from that of a single-component
reference system
• Cloud providers reason this way:
– Systems that make guarantees put those guarantees first and
struggle to achieve them
– For example, any reliability property forces a system to
retransmit lost messages, use acks, etc
– But modern computers often become unreliable as a symptom
of overload… so these consistency mechanisms will make
things worse, by increasing the load just when we want to ease
off!
• So consistency (of any kind) is a “root cause” for
meltdowns, oscillations (up and down), thrashing
• Thus application developers are urged to not assume consistency
and to avoid anything that will break if inconsistency occurs
• To reintroduce consistency we need
– A scalable model
• Should this be the Isis model?
– A high-performance implementation
• Can handle massive replication for individual objects
• Massive numbers of objects
• Won’t melt down under stress
• Not prone to instabilities or resource exhaustion problems

• Note: Intermediate System to Intermediate System (IS-IS) is a protocol
used by network devices (routers) to determine the best route for
datagrams through a packet-switched network. It appears unrelated to the
Isis model asked about above, which concerns consistent replication
rather than routing.
The protocol is defined in ISO/IEC 10589:2002 as an international standard
within the Open Systems Interconnection (OSI) reference design.
IS-IS is not an Internet Standard; however, the IETF republished the
protocol in RFC 1142 for the Internet community.
Cloud Programming
Message Passing
Not only fundamental to distributed systems, but also a
better parallel programming model
– Performance / correctness isolation
– Well-defined points of interaction
– Scalable model
More difficult to use
– Little language support
• Erlang integrates messages with pattern matching
Message passing libraries
• Fundamental mismatch: asynchronous messaging is a stranger in a synchronous world
Open problems
– Control structures for asynchronous messages
– Communications contracts
– Integration of messages in type system and memory model
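The message-passing model described above can be sketched with two queues and a worker thread that shares no state with its caller. This uses Python's standard `queue` and `threading` modules as a stand-in for a message-passing library; the message kinds ("square" and the unknown "cube") are made up for the example.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Process messages until a None sentinel arrives; share no state."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        kind, payload = msg
        if kind == "square":
            outbox.put(("result", payload * payload))
        else:
            # Unknown messages are reported back rather than crashing silently.
            outbox.put(("error", f"unknown message kind: {kind}"))


inbox, outbox = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()
for message in [("square", 3), ("cube", 2), None]:
    inbox.put(message)
thread.join()
replies = [outbox.get() for _ in range(2)]
```

The queues are the only points of interaction, which is what gives the isolation listed above; the explicit dispatch on `kind` also hints at why language support such as Erlang's pattern matching is attractive.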
Distribution

Distributed systems are rich source of difficult problems


– Replication
– Consistency
– Quorum
Well studied field produced good solutions
– Outsider’s perspective: research is focused on fundamental problems and
used in real systems
How can these techniques be incorporated into programming
model?
– Libraries
– Language integration
– New models
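Replication, consistency, and quorum, listed above, fit together in the classic quorum rule: with N replicas, a write quorum W and a read quorum R, requiring R + W > N (and 2W > N) guarantees that every read overlaps the latest successful write. The sketch below is a simplified, single-process illustration; `QuorumRegister` is a hypothetical name and failures are simulated by passing in the list of reachable replicas.

```python
class QuorumRegister:
    """A single replicated value with versioned quorum reads and writes."""

    def __init__(self, n: int, w: int, r: int):
        if r + w <= n or 2 * w <= n:
            raise ValueError("need R + W > N and 2W > N so quorums overlap")
        self.n, self.w, self.r = n, w, r
        self.replicas = [(0, None)] * n   # (version, value) per replica

    def write(self, value, reachable: list[int]) -> bool:
        """Write to the reachable replicas; succeed only with W of them."""
        if len(reachable) < self.w:
            return False
        version = max(self.replicas[i][0] for i in reachable) + 1
        for i in reachable:
            self.replicas[i] = (version, value)
        return True

    def read(self, reachable: list[int]):
        """Return the value with the highest version among R replicas."""
        if len(reachable) < self.r:
            raise RuntimeError("read quorum not available")
        newest = max((self.replicas[i] for i in reachable), key=lambda vv: vv[0])
        return newest[1]


reg = QuorumRegister(n=3, w=2, r=2)
ok1 = reg.write("v1", reachable=[0, 1])
seen1 = reg.read(reachable=[1, 2])      # overlaps the write at replica 1
ok2 = reg.write("v2", reachable=[1, 2])
seen2 = reg.read(reachable=[0, 1])      # replica 0 is stale, replica 1 is not
ok3 = reg.write("v3", reachable=[0])    # only one replica: no quorum
```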
Availability
Services must be highly available
– Blackberry outage gets national media attention
– Affect millions of people simultaneously
– Service can become part of national infrastructure
High availability is a challenge
– Starts with design and engineering
– Hard to eliminate all “single points of failure”
– Murphy’s law rules (Anything that can go wrong, will go wrong )
Programming models provide little support for systematic error handling
– Disproportionate fraction of software defects in error handling code
• Run in inconsistent state
• Difficult to test
– Erlang has systematic philosophy of fail and notify
– Could lightweight transactions simplify rollback ?
Performance
Performance is system-level concern
– Goes far beyond the code running on a machine
– Most performance tools focus on low-level details
– Automatic optimization (compilers) is very granular
Current approach is wasteful and uncertain
– Build, observe, tweak, overprovision, pray
Performance should be specified as part of behavior
– SLAs as well as pre-/post-conditions
Need scalability
– Grow by adding machines, not rewriting software
Architecture should be the starting point
– Model and simulate before building a system
Adaptivity
– Systems need to be introspective and capable of adapting behavior to load
– e.g., simplify home page when load spikes, defer low-priority tasks, provision more
machines, …
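The adaptivity examples above (simplify the home page, defer low-priority tasks, provision more machines) can be expressed as a simple policy that escalates with load. The thresholds of 70%, 85%, and 95% are arbitrary values assumed for the sketch, and `adapt` is a hypothetical function name.

```python
def adapt(load: float, capacity: float) -> list[str]:
    """Return the adaptations to apply at the current utilization."""
    utilization = load / capacity
    actions = []
    if utilization > 0.70:
        actions.append("defer low-priority tasks")
    if utilization > 0.85:
        actions.append("serve simplified home page")
    if utilization > 0.95:
        actions.append("provision more machines")
    return actions


calm = adapt(load=50, capacity=100)
busy = adapt(load=80, capacity=100)
spike = adapt(load=99, capacity=100)
```

The point is only that the system observes its own load and changes behavior; a production policy would add hysteresis so that actions do not flap around a threshold.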
Application Partitioning

Static partition of functionality between client and server


– Difficult to support range of clients with different architectures and
capabilities
– Difficult to adapt to changing constraints (e.g., battery)
– Move computation to data, particularly when communications
constrained
– Code mobility
• Exists in data center (VMs)
Currently, client and server are two fundamentally different
applications
– Evolution around interfaces
• Single program model, compiled for server and client
High-Level Abstractions

Map-reduce and dataflow abstractions simplify large-scale
data analysis in data centers
– Convenient way to express problems
– Hide complex details (distribution, failure, restart)
– Allow optimization (speculation)
Need more abstractions for wider range of problems
– Not appropriate for services
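The map-reduce abstraction mentioned above is easiest to see in the standard word-count example. The sketch below runs in a single process, so it shows only the programming model (map, shuffle, reduce) and none of the distribution, failure handling, or speculation that a real framework hides; all function names are illustrative.

```python
from collections import defaultdict


def map_phase(document: str):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = word_count(["the cloud", "the grid and the cloud"])
```

Because each `map_phase` call looks at one document and each `reduce_phase` call at one key, a framework is free to run them on different machines and to rerun any that fail.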
Conclusive outcomes

Cloud computing is more than VMs, data centers, web services, …
– New way of performing computation
New opportunity to correct problems with existing
computing models
– Cost, complexity, reliability, …
Challenges and Barriers
Current
• Balancing Security and Usability
– User Validation
– Virtualization; servers, firewalls, networks
– Access
• Business processes
– Flexible funding; credit cards, speeding MIPR process
• Cultural inertia
– Sharing the vision
• Controlling expectations
– “Why can’t it…..”
Future
• Security optimization
– “Shared” accreditation
– Validation of customer applications
– Integrating Software as a Service
– Accessing federated and shared services
• Business streamlining
– Each Service and Agency has unique processes
– Funding hurdles; Procurement $ versus Operating $
A Working Definition of Cloud Computing
• Cloud computing is a model for enabling convenient, on-
demand network access to a shared pool of configurable
computing resources (e.g., networks, servers, storage,
applications, and services) that can be rapidly provisioned and
released with minimal management effort or service provider
interaction. (NIST definition)
• This cloud model promotes availability and is composed of five essential
characteristics, three service models, and four deployment models.

• The National Institute of Standards and Technology (NIST), known between 1901 and 1988
as the National Bureau of Standards (NBS), is a measurement standards laboratory
which is an agency of the United States Department of Commerce.
5 Essential Cloud Characteristics

• On-demand self-service
• Broad network access
• Resource pooling
– Location independence
• Rapid elasticity
• Measured service
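Two of the characteristics above, rapid elasticity and measured service, can be sketched together: capacity follows the load, and the bill follows the capacity actually used. The per-instance capacity, the instance limits, and the hourly load figures are assumed values, and `desired_instances` is a hypothetical name.

```python
import math


def desired_instances(requests_per_sec: float, capacity_per_instance: float,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Rapid elasticity: size the pool to the current load, within limits."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Hypothetical load over three consecutive hours, in requests per second.
hourly_load = [100, 950, 300]
pool_sizes = [desired_instances(load, capacity_per_instance=100) for load in hourly_load]

# Measured service: usage is metered in instance-hours.
instance_hours = sum(pool_sizes)
```

The pool grows for the busy hour and shrinks afterwards, so the metered total is far below what a pool fixed at the peak size would have consumed.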
3 Cloud Service Models

• Cloud Software as a Service (SaaS)


– Use provider’s applications over a network
• Cloud Platform as a Service (PaaS)
– Deploy customer-created applications to a cloud
• Cloud Infrastructure as a Service (IaaS)
– Rent processing, storage, network capacity, and other fundamental
computing resources

• To be considered “cloud” they must be deployed on top of cloud
infrastructure that has the key characteristics
Service Model Architectures
[Figure: service model stacks, each built on cloud infrastructure]
• Software as a Service (SaaS) architectures
– SaaS on PaaS on IaaS
– SaaS on PaaS
– SaaS directly on cloud infrastructure
• Platform as a Service (PaaS) architectures
– PaaS on IaaS
– PaaS directly on cloud infrastructure
• Infrastructure as a Service (IaaS) architectures
– IaaS on cloud infrastructure
4 Cloud Deployment Models

• Private cloud
– enterprise owned or leased
• Community cloud
– shared infrastructure for specific community
• Public cloud
– Sold to the public, mega-scale infrastructure
• Hybrid cloud
– composition of two or more clouds
Common Cloud Characteristics

• Cloud computing often leverages:


– Massive scale
– Homogeneity
– Virtualization
– Low cost software
– Geographic distribution
– Service orientation
– Advanced security technologies
The NIST Cloud Definition Framework
• Deployment Models
– Private Cloud, Community Cloud, Public Cloud, Hybrid Clouds
• Service Models
– Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)
• Essential Characteristics
– On Demand Self-Service, Broad Network Access, Rapid Elasticity, Resource Pooling,
Measured Service
• Common Characteristics
– Massive Scale, Resilient Computing, Homogeneity, Geographic Distribution,
Virtualization, Service Orientation, Low Cost Software, Advanced Security
Cloud Computing Security
Security is the Major Issue
Analyzing Cloud Security

• Some key issues:


– trust, multi-tenancy, encryption, compliance
• Clouds are massively complex systems, but they can be reduced
to simple primitives that are replicated thousands of
times and to common functional units
• Cloud security is a tractable problem
– There are both advantages and challenges
General Security Advantages

• Shifting public data to an external cloud reduces the
exposure of the internal sensitive data
• Cloud homogeneity makes security auditing/testing
simpler
• Clouds enable automated security management
• Redundancy / Disaster Recovery
General Security Challenges

• Trusting vendor’s security model


• Customer inability to respond to audit findings
• Obtaining support for investigations
• Indirect administrator accountability
• Proprietary implementations can’t be examined
• Loss of physical control
Security Relevant Cloud
Components
• Cloud Provisioning Services
• Cloud Data Storage Services
• Cloud Processing Infrastructure
• Cloud Support Services
• Cloud Network and Perimeter Security

• Elastic Elements: Storage, Processing, and Virtual Networks
Provisioning Service

• Advantages
– Rapid reconstitution of services
– Enables availability
• Provision in multiple data centers / multiple instances

• Challenges
– Impact of compromising the provisioning service
Data Storage Services
• Advantages
– Data fragmentation and dispersal
– Automated replication
– Provision of data zones (e.g., by country)
– Encryption at rest and in transit
– Automated data retention
• Challenges
– Isolation management / data multi-tenancy
– Storage controller
• Single point of failure / compromise?
– Exposure of data to foreign governments
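The first two storage advantages above, data fragmentation and dispersal plus automated replication, can be sketched as follows. This is a toy placement scheme assumed for illustration (round-robin over node names); real storage services use more elaborate placement and usually encrypt fragments as well.

```python
def disperse(data: bytes, nodes: list[str], fragment_size: int, replicas: int) -> dict:
    """Split data into fragments and place each on `replicas` distinct nodes."""
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    placement = {}
    for start in range(0, len(data), fragment_size):
        frag_no = start // fragment_size
        holders = [nodes[(frag_no + r) % len(nodes)] for r in range(replicas)]
        placement[frag_no] = (data[start:start + fragment_size], holders)
    return placement


def reassemble(placement: dict, failed: set) -> bytes:
    """Rebuild the data as long as every fragment has a surviving holder."""
    parts = []
    for frag_no in sorted(placement):
        fragment, holders = placement[frag_no]
        if all(holder in failed for holder in holders):
            raise RuntimeError(f"fragment {frag_no} lost")
        parts.append(fragment)
    return b"".join(parts)


placement = disperse(b"cloud-data", ["a", "b", "c"], fragment_size=4, replicas=2)
recovered = reassemble(placement, failed={"a"})
```

With three nodes and two replicas, no single node holds every fragment and any one node can fail without data loss; losing two nodes breaks at least one fragment in this layout.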
Cloud Processing Infrastructure

• Advantages
– Ability to secure masters and push out secure images
• Challenges
– Application multi-tenancy
– Reliance on hypervisors
– Process isolation / Application sandboxes
Cloud Support Services

• Advantages
– On demand security controls (e.g., authentication, logging,
firewalls…)
• Challenges
– Additional risk when integrated with customer applications
– Needs certification and accreditation as a separate application
– Code updates
Cloud Network and Perimeter Security

• Advantages
– Distributed denial of service protection
– VLAN capabilities
– Perimeter security (IDS, firewall, authentication)
• Challenges
– Virtual zoning with application mobility
Cloud Security Advantages
Part 1
• Data Fragmentation and Dispersal
• Dedicated Security Team
• Greater Investment in Security Infrastructure
• Fault Tolerance and Reliability
• Hypervisor Protection Against Network Attacks
• Possible Reduction of C&A Activities (Access to Pre-
Accredited Clouds)
Cloud Security Advantages
Part 2
• Simplification of Compliance Analysis
• Data Held by Unbiased Party (cloud vendor assertion)
• Low-Cost Disaster Recovery and Data Storage
Solutions
• On-Demand Security Controls
• Real-Time Detection of System Tampering
• Rapid Re-Constitution of Services
• Advanced Honeynet Capabilities
Cloud Security Challenges
Part 1
• Data dispersal and international privacy laws
– EU Data Protection Directive and U.S. Safe Harbor program
– Exposure of data to foreign government and data subpoenas
– Data retention issues
• Need for isolation management
• Multi-tenancy
• Logging challenges
• Data ownership issues
• Quality of service guarantees
Cloud Security Challenges
Part 2
• Dependence on secure hypervisors
• Attraction to hackers (high value target)
• Security of virtual OSs in the cloud
• Possibility for massive outages
• Encryption needs for cloud computing
– Encrypting access to the cloud resource control interface
– Encrypting administrative access to OS instances
– Encrypting access to applications
– Encrypting application data at rest
• Public cloud vs internal cloud security
• Lack of public SaaS version control
Additional Issues
• Issues with moving sensitive data to the cloud
– Privacy impact assessments
• Using SLAs to obtain cloud security
– Suggested requirements for cloud SLAs
– Issues with cloud forensics
• Contingency planning and disaster recovery for cloud
implementations
Secure Migration Paths
for Cloud Computing
The ‘Why’ and ‘How’ of Cloud Migration

• There are many benefits that explain why to migrate to clouds
– Cost savings, power savings, green savings,
increased agility in software deployment
• Cloud security issues may drive and define
how we adopt and deploy cloud computing
solutions
Balancing Threat Exposure and
Cost Effectiveness
• Private clouds may have less threat exposure than
community clouds which have less threat exposure than
public clouds.
• Massive public clouds may be more cost effective than
large community clouds which may be more cost
effective than small private clouds.
• Don’t strong security controls mean that I can adopt
the most cost effective approach?
Cloud Migration and Cloud Security
Architectures
• Clouds typically have a single security architecture but have many
customers with different demands
– Clouds should attempt to provide configurable security mechanisms
• Organizations have more control over the security architecture of
private clouds followed by community and then public
– This doesn’t say anything about actual security
• Higher sensitivity data is likely to be processed on clouds where
organizations have control over the security model
Putting it Together

• Most clouds will require very strong security controls


• All models of cloud may be used for differing tradeoffs
between threat exposure and efficiency
• There is no one “cloud”. There are many models and
architectures.
• How does one choose?
Migration Paths for
Cloud Adoption
• Use public clouds
• Develop private clouds
– Build a private cloud
– Procure an outsourced private cloud
– Migrate data centers to be private clouds (fully virtualized)
• Build or procure community clouds
– Organization wide SaaS
– PaaS and IaaS
– Disaster recovery for private clouds
• Use hybrid-cloud technology
– Workload portability between clouds
Possible Effects of
Cloud Computing
• Small enterprises use public SaaS and public clouds and minimize
growth of data centers
• Large enterprise data centers may evolve to act as private clouds
• Large enterprises may use hybrid cloud infrastructure software to
leverage both internal and public clouds
• Public clouds may adopt standards in order to run workloads from
competing hybrid cloud infrastructures
Cloud Standards Mission

• Provide guidance to industry and government for the
creation and management of relevant cloud computing
standards allowing all parties to gain the maximum value
from cloud computing
NIST and Standards

• NIST wants to promote cloud standards:


– We want to propose roadmaps for needed standards
– We want to act as catalysts to help industry formulate their
own standards
• Opportunities for service, software, and hardware providers
– We want to promote government and industry adoption of
cloud standards
Goal of NIST Cloud Standards Effort

• Fungible clouds
– (mutual substitution of services)
– Data and customer application portability
– Common interfaces, semantics, programming models
– Federated security services
– Vendors compete on effective implementations
• Enable and foster value add on services
– Advanced technology
– Vendors compete on innovative capabilities
A Model for Standardization
and Proprietary Implementation

• Advanced features: Proprietary Value Add Functionality
• Core features: Standardized Core Cloud Capabilities
Proposed Result

• Cloud customers knowingly choose the correct mix for
their organization of
– standard portable features
– proprietary advanced capabilities
A proposal: A NIST Cloud
Standards Roadmap
• We need to define minimal standards
– Enable secure cloud integration, application portability, and data
portability
– Avoid over specification that will inhibit innovation
– Separately addresses different cloud models

Towards the Creation of
a Roadmap (I)
• Thoughts on standards:
– Usually more service lock-in as you move up the SPI stack
(IaaS -> PaaS -> SaaS)
– IaaS is a natural transition point from traditional enterprise
datacenters
• Base service is typically computation, storage, and networking
– The virtual machine is the best focal point for fungibility
– Security and data privacy concerns are the two critical barriers to
adopting cloud computing
Towards the Creation of
a Roadmap (II)
• Result:
– Focus on an overall IaaS standards roadmap as a first major
deliverable
– Research PaaS and SaaS roadmaps as we move forward
– Provide visibility, encourage collaboration in addressing these
standards as soon as possible
– Identify common needs for security and data privacy standards
across IaaS, PaaS, SaaS
A Roadmap for IaaS

• Needed standards
– VM image distribution (e.g., DMTF OVF)
– VM provisioning and control (e.g., EC2 API)
– Inter-cloud VM exchange (e.g., ??)
– Persistent storage (e.g., Azure Storage, S3, EBS, GFS, Atmos)
– VM SLAs (e.g., ??) – machine readable
• uptime, resource guarantees, storage redundancy
– Secure VM configuration (e.g., SCAP)
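The idea above of a machine-readable VM SLA covering uptime, resource guarantees, and storage redundancy might look like the following. No standard format is implied (the slide itself leaves the example as "??"); the JSON field names and the `violations` helper are assumptions made for this sketch.

```python
import json

# Hypothetical SLA document; field names are illustrative, not a standard.
SLA_JSON = """
{
  "uptime_percent": 99.9,
  "vcpus": 2,
  "memory_gb": 4,
  "storage_replicas": 3
}
"""


def violations(sla: dict, measured: dict) -> list[str]:
    """Return the SLA fields whose measured value falls below the guarantee."""
    return [field for field, guaranteed in sla.items()
            if measured.get(field, 0) < guaranteed]


sla = json.loads(SLA_JSON)
report = violations(sla, {
    "uptime_percent": 99.95,
    "vcpus": 2,
    "memory_gb": 4,
    "storage_replicas": 2,   # one replica short of the guarantee
})
```

Once the SLA is data rather than prose, both provider and customer can check it automatically, which is what makes SLAs comparable across clouds.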
A Roadmap for PaaS and SaaS
• More difficult due to proprietary nature
• A future focus for NIST

• Standards for PaaS could specify


– Supported programming languages
– APIs for cloud services
• Standards for SaaS could specify
– SaaS-specific authentication / authorization
– Formats for data import and export (e.g., XML schemas)
– Separate standards may be needed for each application space
Security and Data Privacy Across IaaS,
PaaS, SaaS

• Many existing standards


• Identity and Access Management (IAM)
– IdM federation (SAML, WS-Federation, Liberty ID-FF)
– Strong authentication standards (HOTP, OCRA, TOTP)
– Entitlement management (XACML)
• Data Encryption (at-rest, in-flight), Key Management
– PKI, PKCS, KEYPROV (CT-KIP, DSKPP), EKMI
• Records and Information Management (ISO 15489)
• E-discovery (EDRM)
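Among the strong authentication standards listed above, HOTP (RFC 4226) and TOTP (RFC 6238) are compact enough to sketch with the Python standard library. The code follows the published algorithms for the SHA-1 case only; the function names are this sketch's own, and the secret used below is the test secret from the RFCs, not something to use in practice.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): HOTP over the current time step."""
    return hotp(secret, unix_time // step, digits)


rfc_secret = b"12345678901234567890"   # test secret from the RFC appendices
first_code = hotp(rfc_secret, 0)
time_code = totp(rfc_secret, unix_time=59, digits=8)
```

Server and client share the secret and derive the same short code from a counter or from the clock, so a captured code is useless once the counter or time step has moved on.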
Cloud Computing Publications
Planned NIST Cloud Computing Publication

• NIST is planning a series of publications on cloud computing

• NIST Special Publication to be created in FY09


– What problems does cloud computing solve?
– What are the technical characteristics of cloud computing?
– How can we best leverage cloud computing and obtain security?
Cloud Resources, Case Studies, and
Security Models
Thoughts on Cloud Computing

• Galen Gruman, InfoWorld Executive Editor, and Eric Knorr,
InfoWorld Editor in Chief
– “A way to increase capacity or add capabilities on the fly
without investing in new infrastructure, training new
personnel, or licensing new software.”
– “The idea of loosely coupled services running on an agile,
scalable infrastructure should eventually make every enterprise
a node in the cloud.”
Thoughts on Cloud Computing
• Tim O’Reilly, CEO O’Reilly Media
• “I think it is one of the foundations of the next generation of
computing”
• “The network of networks is the platform for all computing”

• “Everything we think of
as a computer today is
really just a device that
connects to the big
computer that we are
all collectively
Thoughts on Cloud Computing

• Dan Farber, Editor in Chief CNET News


• “We are at the beginning of the age of planetary computing.
Billions of people will be wirelessly interconnected, and the only
way to achieve that kind of massive scale usage is by massive
scale, brutally efficient cloud-based infrastructure.”
Core objectives of Cloud Computing

• Amazon CTO Werner Vogels


• Core objectives and principles that cloud computing
must meet to be successful:
– Security
– Scalability
– Availability
– Performance
– Cost-effective
– Acquire resources on demand
– Release resources when no longer needed
– Pay for what you use
– Leverage others’ core competencies
– Turn fixed cost into variable cost
A “sunny” vision
of the future
• Sun Microsystems CTO Greg Papadopoulos
– Users will “trust” service providers with their data like they trust
banks with their money
– “Hosting providers [will] bring ‘brutal efficiency’ for utilization,
power, security, service levels, and idea-to-deploy time” –CNET
article
– Becoming cost ineffective to build data centers
– Organizations will rent computing resources
– Envisions grid of 6 cloud infrastructure providers linked to 100
regional providers
Foundational Elements of Cloud
Computing


Primary Technologies
• Virtualization
• Grid technology
• Service Oriented Architectures
• Distributed Computing
• Broadband Networks
• Browser as a platform
• Free and Open Source Software

Other Technologies
• Autonomic Systems
• Web 2.0
• Web application frameworks
• Service Level Agreements

Consumer Software Revolution
Web 2.0

• Is not a standard but an evolution in using the WWW


• “Don’t fight the Internet” – CEO Google, Eric Schmidt
• Web 2.0 is the trend of using the full potential of the web
– Viewing the Internet as a computing platform
– Running interactive applications through a web browser
– Leveraging interconnectivity and mobility of devices
– The “long tail” (profits in selling specialized small market goods)
– Enhanced effectiveness with greater human participation
• Tim O'Reilly: “Web 2.0 is the business revolution in the computer
industry caused by the move to the Internet as a platform, and an
attempt to understand the rules for success on that new platform.”
Enterprise Software Revolution
Software as a Service (SaaS)

• SaaS is hosting applications on the Internet as a service
(both consumer and enterprise)
• Jon Williams, CTO of Kaplan Test Prep on SaaS
– “I love the fact that I don't need to deal with servers, staging, version
maintenance, security, performance”
• Eric Knorr with Computerworld says that “[there is an]
increasing desperation on the part of IT to minimize
application deployment and maintenance hassles”
Three Features of
Mature SaaS Applications
• Scalable
– Handle growing amounts of work in a graceful manner
• Multi-tenancy
– One application instance may be serving hundreds of companies
– Opposite of multi-instance where each customer is provisioned their own server
running one instance
• Metadata driven configurability
– Instead of customizing the application for a customer (requiring code changes),
one allows the user to configure the application through metadata
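
The multi-tenancy and metadata-driven configurability points above can be sketched in a few lines of Python. This is only an illustration: the tenant names, the settings keys, and the functions `tenant_config` and `format_price` are hypothetical, not taken from any real SaaS product. The idea is that one shared code path serves every tenant, and per-tenant behavior comes from stored metadata rather than from code changes.

```python
# Hypothetical shared defaults and per-tenant overrides.
# In a real system these would likely live in a database, not in code.
DEFAULTS = {"currency": "USD", "date_format": "%Y-%m-%d", "logo": "default.png"}

TENANT_METADATA = {
    "acme": {"currency": "EUR", "logo": "acme.png"},
    "globex": {"date_format": "%d/%m/%Y"},
}


def tenant_config(tenant_id: str) -> dict:
    """Merge a tenant's metadata overrides onto the shared defaults."""
    overrides = TENANT_METADATA.get(tenant_id, {})
    return {**DEFAULTS, **overrides}


def format_price(tenant_id: str, amount: float) -> str:
    """Single application code path; output varies only through metadata."""
    cfg = tenant_config(tenant_id)
    return f"{amount:.2f} {cfg['currency']}"


if __name__ == "__main__":
    print(format_price("acme", 10))    # uses the tenant's own currency setting
    print(format_price("globex", 10))  # falls back to the shared default
```

Adding a new customer in this sketch means adding a metadata entry, not deploying a new application instance.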
SaaS Maturity Levels
• Level 1: Ad-Hoc/Custom
• Level 2: Configurable
• Level 3: Configurable,
Multi-Tenant-Efficient
• Level 4: Scalable,
Configurable, Multi-
Tenant-Efficient

Source: Microsoft MSDN Architecture Center


Utility Computing

• “Computing may someday be organized as a public
utility” - John McCarthy, MIT Centennial in 1961
• Huge computational and storage capabilities available
from utilities
• Metered billing (pay for what you use)
• Simple to use interface to access the capability (e.g.,
plugging into an outlet)

Service Level Agreements (SLAs)

• Contract between customers and service providers specifying the
level of service to be provided
• Contains performance metrics (e.g., uptime, throughput,
response time)
• Problem management details
• Documented security capabilities
• Contains penalties for non-performance
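
As a worked example of an uptime metric, the sketch below converts an uptime percentage into the downtime it allows over a billing period and checks an observed outage total against it. The function names and the 30-day period are assumptions made for illustration; real SLAs define their own measurement windows and exclusions.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in the period at a given uptime target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


def sla_met(uptime_percent: float, observed_downtime_minutes: float,
            days: int = 30) -> bool:
    """True if the observed downtime stays within the SLA allowance."""
    return observed_downtime_minutes <= allowed_downtime_minutes(uptime_percent, days)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        print(target, round(allowed_downtime_minutes(target), 1), "min per 30 days")
```

Under these assumptions a 99.9% target leaves roughly 43 minutes of downtime in a 30-day month, which is the figure a penalty clause would be measured against.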

Autonomic System Computing

• Complex computing systems that manage themselves


• Decreased need for human administrators to perform lower level tasks
• Autonomic properties: Purposeful, Automatic, Adaptive, Aware
• IBM’s 4 properties: self-healing, self-configuration, self-optimization,
and self-protection

IT labor costs are 18 times that of equipment costs.


The number of computers is growing at 38% each year.

Grid Computing
• Distributed parallel processing across a network
• Key concept: “the ability to negotiate resource-sharing
arrangements”
• Characteristics of grid computing
– Coordinates independent resources
– Uses open standards and interfaces
– Quality of service
– Allows for heterogeneity of computers
– Distribution across large geographical boundaries
– Loose coupling of computers

Platform Virtualization
• “[Cloud computing] relies on separating your applications from
the underlying infrastructure” - Steve Herrod, CTO at VMware
• Host operating system provides an abstraction layer for running
virtual guest OSs
• Key is the “hypervisor” or “virtual machine monitor”
– Enables guest OSs to run in isolation from other OSs
– Run multiple types of OSs
• Increases utilization of physical servers
• Enables portability of virtual servers between physical servers
• Increases security of physical host server

Web Services

• Web Services
– Self-describing and stateless modules that perform discrete
units of work and are available over the network
– “Web service providers offer APIs that enable developers to
exploit functionality over the Internet, rather than delivering
full-blown applications.” - InfoWorld
– Standards based interfaces (WS-I Basic Profile)
• e.g., SOAP, WSDL, WS-Security
• Enabling state: WS-Transaction, Choreography
– Many loosely coupled interacting modules form a single
logical system (e.g., LEGO bricks)
Service Oriented Architectures

• Service Oriented Architectures


– Model for using web services
• service requestors, service registry, service providers
– Use of web services to compose complex, customizable,
distributed applications
– Encapsulate legacy applications
– Organize stovepiped applications into collective integrated
services
– Interoperability and extensibility

Web application frameworks
• Coding frameworks for enabling dynamic web sites
– Streamline web and DB related programming operations (e.g., web services
support)
– Creation of Web 2.0 applications
• Supported by most major software languages
• Example capabilities
– Separation of business logic from the user interface (e.g., Model-view-controller
architecture)
– Authentication, Authorization, and Role Based Access Control (RBAC)
– Unified APIs for SQL DB interactions
– Session management
– URL mapping
• Wikipedia maintains a list of web application frameworks

Free and Open Source Software

• External ‘mega-clouds’ must focus on using their massive
scale to reduce costs
• Usually use free software
– Proven adequate for cloud deployments
– Open source
– Owned by provider
• Need to keep per server cost low
– Simple commodity hardware
• Handle failures in software

Public Statistics on Cloud
Economics

Cost of Traditional Data Centers

• 11.8 million servers in data centers


• Servers are used at only 15% of their capacity
• 800 billion dollars spent yearly on purchasing and maintaining
enterprise software
• 80% of enterprise software expenditure is on installation and
maintenance of software
• Data centers typically consume up to 100 times more energy per square foot
than a typical office building
• Average power consumption per server quadrupled from 2001 to
2006.
• Number of servers doubled from 2001 to 2006

Energy Conservation and Data Centers

• A standard 9,000-square-foot data center costs $21.3 million to build, with
$1 million in electricity costs per year
• Data centers consume 1.5% of our Nation’s electricity
(EPA)
– 0.6% worldwide in 2000 and 1% in 2005
• Green technologies can reduce energy costs by 50%
• IT produces 2% of global carbon dioxide emissions

Cloud Economics

• Estimates vary widely on possible cost savings


• “If you move your data centre to a cloud provider, it will cost a tenth
of the cost.” – Brian Gammage, Gartner Fellow
• Use of cloud applications can reduce costs by 50% to 90% - CTO
of Washington D.C.
• IT resource subscription pilot saw 28% cost savings - Alchemy Plus
cloud (backing from Microsoft)
• Preferred Hotel
– Traditional: $210k server refresh and $10k/month
– Cloud: $10k implementation and $16k/month
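
The Preferred Hotel figures above can be compared over time with a small calculation. This sketch assumes the monthly costs stay constant and that the $210k refresh is paid once; both are simplifications, and the function names are illustrative. All amounts are in thousands of dollars.

```python
def cumulative_cost_k(upfront_k: float, monthly_k: float, months: int) -> float:
    """Total spend (in $k) after a number of months."""
    return upfront_k + monthly_k * months


def break_even_months(trad_upfront_k: float, trad_monthly_k: float,
                      cloud_upfront_k: float, cloud_monthly_k: float) -> float:
    """Month count at which both options have cost the same amount."""
    monthly_gap = cloud_monthly_k - trad_monthly_k
    if monthly_gap <= 0:
        raise ValueError("cloud is not more expensive per month; no break-even point")
    return (trad_upfront_k - cloud_upfront_k) / monthly_gap


if __name__ == "__main__":
    # Figures from the slide: traditional $210k + $10k/month, cloud $10k + $16k/month.
    for months in (12, 24, 36):
        trad = cumulative_cost_k(210, 10, months)
        cloud = cumulative_cost_k(10, 16, months)
        print(months, "months:", "traditional", trad, "cloud", cloud)
    print("break-even at about", round(break_even_months(210, 10, 10, 16), 1), "months")
```

Under these assumptions the cloud option is cheaper for roughly the first 33 months; after that the higher monthly fee outweighs the avoided server refresh, unless another refresh falls due.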

Cloud Economics

• George Reese, founder Valtira and enStratus


– Using cloud infrastructures saves 18% to 29% before
considering that you no longer need to buy for peak capacity

Cloud Computing Case Studies
and Security Models

Google Cloud User:
City of Washington D.C.
• Vivek Kundra, CTO for the District (now OMB e-gov administrator)
• Migrating 38,000 employees to Google Apps
• Replace office software
– Gmail
– Google Docs (word processing and spreadsheets)
– Google video for business
– Google sites (intranet sites and wikis)
• “It's a fundamental change to the way our government operates by moving to the
cloud. Rather than owning the infrastructure, we can save millions.”, Mr. Kundra

• 500,000+ organizations use Google Apps


• GE moved 400,000 desktops from Microsoft Office to Google Apps and then
migrated them to Zoho for privacy concerns

Are Hybrid Clouds in our Future?

• OpenNebula
• Zimory
• IBM-Juniper Partnership
– “demonstrate how a hybrid cloud could allow enterprises to
seamlessly extend their private clouds to remote servers in a
secure public cloud...”
• VMware vCloud
– “Federate resources between internal IT and external clouds”

vCloud Initiative

• Goal:
– “Federate resources between internal IT and external
clouds”
– Application portability
– Elasticity and scalability, disaster recovery, service level
management
• vServices provide APIs and technologies

Microsoft Azure Services

Source: Microsoft Presentation, A Lap Around Windows Azure, Manuvir Das

Windows Azure Applications,
Storage, and Roles

[Diagram: load balancer (LB), n Web Role instances, m Worker Role instances, and Cloud Storage (blob, table, queue)]

Source: Microsoft Presentation, A Lap Around Windows Azure, Manuvir Das

Case Study: Facebook’s Use of Open Source
and Commodity Hardware (8/08)
• Jonathan Heiliger, Facebook's vice president of technical operations
• 80 million users + 250,000 new users per day
• 50,000 transactions per second, 10,000+ servers
• Built on open source software
– Web and App tier: Apache, PHP, AJAX
– Middleware tier: Memcached (Open source caching)
– Data tier: MySQL (Open source DB)
• Thousands of DB instances store data in distributed fashion (avoids
collisions of many users accessing the same DB)
• “We don't need fancy graphics chips and PCI cards,” he said. “We need
one USB port and optimized power and airflow. Give me one CPU, a little
memory and one power supply. If it fails, I don't care. We are solving the
redundancy problem in software.”
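
One common way to spread user data across many database instances is to hash a user ID to a shard number. The sketch below shows that general technique only; it is an assumption for illustration and not a description of Facebook's actual partitioning scheme. The function name `shard_for_user` and the shard count are hypothetical.

```python
import hashlib


def shard_for_user(user_id: int, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards) using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


if __name__ == "__main__":
    num_shards = 8  # illustrative; the slide mentions thousands of DB instances
    counts = [0] * num_shards
    for user_id in range(10_000):
        counts[shard_for_user(user_id, num_shards)] += 1
    print(counts)  # users end up spread across all shards
```

Because the mapping is deterministic, every request for a given user goes to the same instance, and different users mostly land on different instances. A plain modulo scheme like this reshuffles most users when the shard count changes, which is why production systems often use a lookup table or consistent hashing instead.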

Case Study: IBM-Google Cloud

• “Google and IBM plan to roll out a worldwide network of
servers for a cloud computing infrastructure” – InfoWorld
• Initiatives for universities
• Architecture
– Open source
• Linux hosts
• Xen virtualization (virtual machine monitor)
• Apache Hadoop (file system)
– “open-source software for reliable, scalable, distributed computing”
– IBM Tivoli Provisioning Manager

Case Study: Amazon Cloud
• Amazon cloud components
– Elastic Compute Cloud (EC2)
– Simple Storage Service (S3)
– SimpleDB
• New Features
– Availability zones
• Place applications in multiple locations for failovers
– Elastic IP addresses
• Static IP addresses that can be dynamically remapped to point
to different instances (not a DNS change)

Amazon Cloud Users:
New York Times and Nasdaq (4/08)
• Both companies used Amazon’s cloud offering
• New York Times
– Didn’t coordinate with Amazon, used a credit card!
– Used EC2 and S3 to convert 15 million scanned news articles to PDF (4TB data)
– Took 100 Linux computers 24 hours (would have taken months on NYT computers)
– “It was cheap experimentation, and the learning curve isn't steep.” – Derek Gottfrid,
New York Times
• Nasdaq
– Uses S3 to deliver historic stock and fund information
– Millions of files showing price changes of entities over 10-minute segments
– “The expenses of keeping all that data online [in Nasdaq servers] was too high.” –
Claude Courbois, Nasdaq VP
– Created lightweight Adobe AIR application to let users view data

Case Study:
Salesforce.com in Government
• 5,000+ Public Sector and Nonprofit Customers use Salesforce Cloud
Computing Solutions

• President Obama’s Citizen’s Briefing Book Based on Salesforce.com Ideas
application
– Concept to Live in Three Weeks
– 134,077 Registered Users
– 1.4 M Votes
– 52,015 Ideas
– Peak traffic of 149 hits per second

• US Census Bureau Uses Salesforce.com Cloud Application


– Project implemented in under 12 weeks
– 2,500+ partnership agents use Salesforce.com for 2010 decennial census
– Allows projects to scale from 200 to 2,000 users overnight to meet peak periods with no capital
expenditure

Case Study:
Salesforce.com in Government
• New Jersey Transit Wins InfoWorld 100 Award for its Cloud
Computing Project
– Uses Salesforce.com to run its call center, incident management, complaint tracking,
and service portal
– 600% More Inquiries Handled
– 0 New Agents Required
– 36% Improved Response Time

• U.S. Army uses Salesforce CRM for Cloud-based Recruiting


– U.S. Army needed a new tool to track potential recruits who visited its Army
Experience Center.
– Uses Salesforce.com to track all core recruitment functions, allowing the Army to
save time and resources.

Questions?
• Peter Mell
• NIST, Information Technology Laboratory
• Computer Security Division

• Tim Grance
• NIST, Information Technology Laboratory
• Computer Security Division

Contact information is available from:
http://www.nist.gov/public_affairs/contact.htm

What is this buzzword?

Hype?
The hype

Cluster Computing
Cloud Computing
Grid Computing 
Data Centers

• Large server and storage farms


– Used by enterprises to run server applications
– Used by Internet companies
• Google, Facebook, YouTube, Amazon…
– Sizes can vary depending on needs
Data Center Architecture

• Traditional: applications run on physical servers


– Manual mapping of apps to servers
• Apps can be distributed
• Storage may be on a SAN or NAS
– IT admins deal with “change”
• Modern: virtualized data centers
– Apps run inside virtual servers; VMs are mapped onto physical
servers
– Provides flexibility in mapping from virtual to physical
resources
Virtualized Data Centers

• Resource management is simplified


– Applications can be started from preconfigured VM images /
appliances
– Virtualization layer / hypervisor permits resource allocations to
be varied dynamically
– VMs can be migrated without application down-time
Workload Management

• Internet applications => dynamic workloads


• How much capacity to allocate to an application?
– Incorrect workload estimate: over- or under-provision capacity
– Major issue for internet facing applications
• Workload surges / flash crowds cause overloads
• Long-term incremental growth (workload doubles every few
months for many newly popular apps)
– Traditional approach: IT admins estimate peak workloads and
provision sufficient servers
• Flash-crowd => react manually by adding capacity
– Time scale of hours: lost revenue, bad publicity for application
Dynamic Provisioning

• Track workload and dynamically provision capacity


• Monitor -> Predict -> Provision
• Predictive versus reactive provisioning
– Predictive: predict future workload and provision
– Reactive: react whenever capacity falls short of demand
• Traditional data centers: bring up a new server
• Borrow from the free pool or reclaim an under-used server
• Virtualized data center: exploit virtualization to speed up
application startup time
– How is this done?
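
The Monitor -> Predict -> Provision loop can be sketched as follows. This is a minimal illustration, assuming a moving-average predictor, a fixed per-server capacity, and a fixed headroom factor; the class name `Provisioner` and all parameter values are hypothetical, and real systems typically use more sophisticated prediction.

```python
import math
from collections import deque


class Provisioner:
    """Monitor -> Predict -> Provision with a moving-average predictor."""

    def __init__(self, capacity_per_server: float, window: int = 3,
                 headroom: float = 1.2):
        self.capacity_per_server = capacity_per_server  # requests/s per server (assumed)
        self.history = deque(maxlen=window)             # most recent workload samples
        self.headroom = headroom                        # safety margin over the prediction

    def monitor(self, observed_load: float) -> None:
        """Record the latest observed workload (requests/s)."""
        self.history.append(observed_load)

    def predict(self) -> float:
        """Predict the next interval's load as the mean of recent samples."""
        return sum(self.history) / len(self.history)

    def provision(self) -> int:
        """Number of servers needed for the predicted load plus headroom."""
        target = self.predict() * self.headroom
        return max(1, math.ceil(target / self.capacity_per_server))


if __name__ == "__main__":
    p = Provisioner(capacity_per_server=100)
    for load in (100, 200, 300, 900):  # the last sample imitates a surge
        p.monitor(load)
        print("load", load, "-> servers", p.provision())
```

A moving average lags behind sudden surges, which is why a purely predictive scheme is usually paired with a reactive one that adds capacity as soon as demand exceeds what is provisioned.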
Energy Management in Data
Centers
• Energy: major component of operational cost of data
centers
– Large data centers have energy bills of several million dollars.
– Where does it come from?
• Power for servers and cooling
• Data centers also have a large carbon footprint
• How to reduce energy usage?
• Need energy-proportional systems
– Energy proportionality: energy use proportional to load
– But: current hardware not energy proportional
Energy Management
• Many approaches possible
• Within a server:
– Shut-down certain components (cores, disks) when idling or at low loads
– Use dynamic voltage and frequency scaling (DVFS) for the CPU
• Most effective: shut down servers you don’t need
– Consolidate workload onto a smaller number of servers
– Turn others off
• Thermal management: move workload to cooling or move
cooling to where workloads are
– Requires sensors and intelligent cooling systems
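
The benefit of consolidation follows from servers not being energy proportional: an idle server still draws a large share of its peak power. The sketch below uses a simple linear power model, P(u) = P_idle + (P_peak - P_idle) * u. The wattages (150 W idle, 250 W peak) and the 80% utilization cap are assumed values for illustration, not measurements from the slides.

```python
import math


def server_power_watts(utilization: float, idle_watts: float = 150.0,
                       peak_watts: float = 250.0) -> float:
    """Linear power model: idle draw plus a load-dependent part."""
    return idle_watts + (peak_watts - idle_watts) * utilization


def cluster_power_watts(num_servers: int, utilization: float) -> float:
    """Total draw of identical servers that all run at the same utilization."""
    return num_servers * server_power_watts(utilization)


def consolidate(num_servers: int, utilization: float,
                max_utilization: float = 0.8) -> tuple:
    """Pack the same total load onto as few servers as the cap allows.

    Returns (servers_needed, utilization_per_server).
    """
    total_load = num_servers * utilization
    needed = max(1, math.ceil(total_load / max_utilization))
    return needed, total_load / needed


if __name__ == "__main__":
    before = cluster_power_watts(10, 0.2)
    needed, util = consolidate(10, 0.2)
    after = cluster_power_watts(needed, util)
    print("before:", round(before), "W on 10 servers")
    print("after: ", round(after), "W on", needed, "servers")
```

With these assumed numbers, ten servers at 20% load draw about 1700 W, while the same work on three servers draws about 650 W once the other seven are turned off.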
Container-based Data Centers

• Modular design
• No expensive buildings needed
• Plug and play: plug power, network, cooling vent
Example: Container DC

• Courtesy: Dan Reed, Microsoft


– Talk at NSF workshop
• Benefits of Microsoft Generation 4 data center
– Scalable
– Plug and play
– Pre-assembled
– Rapid deployment
– Reduced construction
Cloud Computing

• Data centers that rent servers/ storage


• Cloud: virtualized data center with self-service web
portal
• Anyone with a “credit card” can rent servers
• Automated allocation of servers

• Use virtualized architecture


• Examples: Amazon EC2, Azure, New servers
Cloud Models

• Private clouds versus Public Clouds


– Who owns and runs the infrastructure?

• What is being rented?


– Infrastructure as a service (rent bare-bones servers)
– Platform as a service (Google App Engine)
– Software as a service (Gmail, online backup, Salesforce.com)
Pricing and Usage Model

• Fine-grain pricing model


– Rent resources by the hour or by I/O
– Pay as you go (pay for only what you use)
• Can vary capacity as needed
– No need to build your own IT infrastructure for peak needs
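
A small calculation shows why pay-as-you-go can beat provisioning for peak even when the rented hourly rate is higher. The demand profile and both hourly rates below are made-up values for illustration, and the function names are hypothetical.

```python
def peak_provisioned_cost(hourly_demand: list, cost_per_server_hour: float) -> float:
    """Own enough servers for the peak hour and pay for all of them every hour."""
    return max(hourly_demand) * len(hourly_demand) * cost_per_server_hour


def pay_as_you_go_cost(hourly_demand: list, cost_per_server_hour: float) -> float:
    """Rent exactly the servers needed in each hour."""
    return sum(hourly_demand) * cost_per_server_hour


if __name__ == "__main__":
    # Servers needed in each of six hours; one hour has a surge (assumed profile).
    demand = [2, 2, 2, 10, 2, 2]
    owned = peak_provisioned_cost(demand, cost_per_server_hour=0.10)
    rented = pay_as_you_go_cost(demand, cost_per_server_hour=0.15)
    print("provisioned for peak:", round(owned, 2))
    print("pay as you go:       ", round(rented, 2))
```

In this example the owned fleet is paid for 60 server-hours but only 20 are used, so renting comes out at about half the cost despite the 50% higher hourly rate. With a flat demand profile the comparison would go the other way.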
Amazon EC2 Case Study

• Virtualized servers
– Different sizes / instances
• Storage: Simple storage service (S3)
– Elastic block service (EBS)
• Many other services
– Simple DB
– Database service
– Virtual private cloud
