Storage with Amazon S3 and Amazon Glacier

Darryl S. Osborne AWS Storage Specialist Solutions Architect

26 October 2016

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Agenda

AWS Storage is a Platform


Amazon S3 - Object storage
Amazon Glacier - Archive storage
Data transfer options
Content distribution
Case studies
Q&A
Storage is a platform: AWS Storage Maturity

- File: Amazon EFS
- Block: Amazon EBS, Amazon EC2 Instance Store
- Object: Amazon S3, Amazon Glacier
- Data Transfer: AWS Direct Connect, AWS Snowball, ISV Connectors, Amazon Kinesis Firehose, S3 Transfer Acceleration, Storage Gateway
Object storage is foundational
- Compute: EC2, Lambda
- Analytics: EMR, Data Pipeline, Kinesis
- Object: Amazon S3, Amazon Glacier
- Content Delivery: CloudFront, Elastic Transcoder
- Database: RDS, DynamoDB, Redshift
Value in Every GB

Scale

Durability

Cloud Data Migration

Lifecycle Management

Broad Integration with other AWS services


Amazon S3 Object storage
What is Amazon S3
Highly durable object storage for all types of data

- Internet-scale storage: grow without limits
- Built-in redundancy: designed for 99.999999999% durability
- Low price per GB per month: no commitment, no up-front cost
- Benefit from AWS's massive security investments
Key Features of Amazon S3

Data management
- Cost monitoring and controls
- Lifecycle management

Data protection
- Versioning
- Cross-region replication

Ease of use
- Programmatic access using AWS SDKs
- REST APIs
- Management Console, AWS CLI

Security
- Multi-factor authentication delete
- Flexible access control mechanisms
- Time-limited access to objects
- Access logs
- Multiple client-side and server-side encryption options

Event notifications
- Delivered using SQS, SNS, or Lambda
- Enable you to trigger workflows, alerts, or other processing
Innovation for Amazon S3

- Event notifications
- Cross-region replication
- VPC endpoint for Amazon S3
- Amazon CloudWatch & AWS CloudTrail support
- Amazon S3 bucket limit increase
- Read-after-write consistency in all regions
Innovation for Amazon S3, continued

- Amazon S3 Standard-IA
- Transfer Acceleration
- Lifecycle policy: expired object delete marker
- Lifecycle policy: incomplete multipart upload expiration
Choice of storage class on Amazon S3

- S3 Standard: active data
- S3 Standard - Infrequent Access: infrequently accessed data
- Glacier: archive data


Storage tiered to your requirements
Lifecycle tiers
- Hot data (S3): active and/or temporary data
- Warm data (S3-IA): infrequently accessed data
- Cold data (Glacier): archive and compliance data

Properties
- Durable: 99.999999999%
- Available: S3 99.99%, S3-IA 99.9%
- Performant: low latency, high throughput
- Scalable: elastic capacity, no preset limits

S3 features
- Common namespace: define storage class per object
- Secure: SSE, client encryption, IAM integration
- Event notifications: SQS, SNS, and Lambda
- Versioning: keep multiple copies automatically
- Cross-region replication
Storage tiered to your requirements
Lifecycle tiers with pricing
- Hot data (S3): active and/or temporary data; $0.03/GB per month; minimum object size > 0 KB; minimum storage duration 0 days
- Warm data (S3-IA): infrequently accessed data; $0.0125/GB per month plus $0.01/GB retrieval; minimum object size 128 KB; minimum storage duration 30 days
- Cold data (Glacier): archive and compliance data; $0.007/GB per month plus $0.01/GB retrieval above 5%; minimum object size > 0 KB; minimum storage duration 90 days; 3-5 hour retrieval

Properties
- Durable: 99.999999999%
- Available: S3 99.99%, S3-IA 99.9%
- Performant: low latency, high throughput
- Scalable: elastic capacity, no preset limits
User Generated Content Example

Use case
User files become dormant days after upload. The access pattern is usually at least 90% writes and at most 10% reads.

Benefits
Lower costs with minimal integration.

Assuming a 90/10 access ratio on S3-IA:
$0.0125/GB + $0.001 (retrievals) = $0.0135/GB

Lifecycle: S3 -> S3-IA -> Glacier
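The blended S3-IA figure on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes that the 90/10 ratio means about 10% of the stored data is read back each month, and it ignores request and data transfer charges. The function name is illustrative, and the prices are the ones quoted in these slides.

```python
def blended_cost_per_gb(storage_price, retrieval_price, read_fraction):
    """Monthly cost per stored GB when `read_fraction` of the data is read back."""
    return storage_price + retrieval_price * read_fraction


# Prices quoted in the slides, in USD.
S3_STANDARD_STORAGE = 0.03   # per GB per month
S3_IA_STORAGE = 0.0125       # per GB per month
S3_IA_RETRIEVAL = 0.01       # per GB retrieved

# Assumption: 10% of the stored data is retrieved each month.
ia_cost = blended_cost_per_gb(S3_IA_STORAGE, S3_IA_RETRIEVAL, read_fraction=0.10)

print(f"S3-IA blended cost: ${ia_cost:.4f}/GB per month")
print(f"Saving versus S3 Standard: {1 - ia_cost / S3_STANDARD_STORAGE:.0%}")
```

Raising `read_fraction` shows how quickly retrieval charges erode the saving when data turns out to be read more often than expected.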
Active Archive Example

Use case
Data reads from archive are infrequent but require immediate response. Data is archived for future reference or compliance and often resides on tape. The optimal tier for deep archives is Glacier. S3-IA can be an intermediate phase into Glacier.

Customer value
Improve access to valuable content, reduce costs and improve durability.

Example applications
- Digital media archives
- Intermediate log archives for Big Data Analytics

Lifecycle: Active Data (S3) -> On-Demand Reads (S3-IA) -> Deep Archive (Glacier)
Enterprise Backup Example

Use case
Backup and archive on-premises data or EC2 data volumes to AWS directly from backup applications or through a gateway (SGW).

Customer value
Reduce costs, simplify management, infinite scale compared to on-prem tape/disk.

Lifecycle: Active Backup (S3-IA) -> Long-term Backup (Glacier)
Amazon S3 Versioning

- Preserve, retrieve, and restore every version of every object stored in your bucket
- S3 automatically adds new versions and preserves deleted objects with delete markers
- Easily control the number of versions kept by using lifecycle expiration policies
- Easy to turn on in the AWS Management Console

Example: with versioning enabled, each PUT of Key = photo.gif is kept as its own version (ID = 111111, then ID = 121212).
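The behaviour described on this slide can be illustrated with a toy in-memory model. This is not the S3 API: the class, its method names, and the version ID format are all hypothetical, and the sketch only shows the idea that every PUT adds a version and a delete adds a marker rather than erasing data.

```python
import itertools


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (illustration only, not S3)."""

    def __init__(self):
        self._versions = {}             # key -> list of (version_id, body), newest last
        self._ids = itertools.count(1)

    def put(self, key, body):
        """Store a new version and return its ID; older versions are kept."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        """Add a delete marker (body None) instead of removing any data."""
        return self.put(key, None)

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one when version_id is given."""
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is None:
                raise KeyError(key)     # missing, or hidden by a delete marker
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not None:
                return body
        raise KeyError((key, version_id))


bucket = VersionedBucket()
first = bucket.put("photo.gif", b"original")
second = bucket.put("photo.gif", b"edited")
bucket.delete("photo.gif")

# The delete marker hides the key, but both earlier versions can still be read.
print(bucket.get("photo.gif", first), bucket.get("photo.gif", second))
```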
Amazon S3 Event Notifications

Delivers notifications to Amazon SNS, Amazon SQS, or AWS Lambda when events occur in S3.

Diagram: S3 events are sent as notifications to an SNS topic, an SQS queue, or a Lambda function.
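As a minimal sketch of the Lambda path, the handler below reads the bucket name and object key from each record of an S3 notification. The sample event is trimmed to the fields used here, and the bucket and key values are made up. Object keys arrive URL-encoded in these events, so the sketch decodes them.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Return (event name, bucket, key) for each record in an S3 notification."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        key = unquote_plus(s3["object"]["key"])   # keys are URL-encoded in the event
        results.append((record["eventName"], bucket, key))
    return results


# Hypothetical, trimmed-down event for local experimentation.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/my+photo.gif", "size": 1024},
            },
        }
    ]
}

print(handler(sample_event, None))
```

A real function would replace the returned tuples with the actual workflow step, such as transcoding the uploaded file or raising an alert.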
Amazon S3 Cross-region Replication
Automated, fast, and reliable asynchronous replication of data across AWS regions

Use cases:
- Compliance: store data hundreds of miles apart
- Lower latency: distribute data to regional customers
- Security: create remote replicas managed by separate AWS accounts

How it works:
- Only replicates new PUTs. Once S3 is configured, all new uploads into a source bucket will be replicated
- Entire bucket or prefix based
- 1:1 replication between any 2 regions
- Versioning required

Diagram: Source (Virginia) -> Destination (Oregon)
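A replication setup of the kind described above could be expressed as a configuration document along these lines. This is a sketch, not a complete setup: the role ARN and bucket names are placeholders, the helper name is hypothetical, and the dictionary follows the prefix-based rule shape that could be passed as `ReplicationConfiguration` to boto3's `put_bucket_replication`. Both buckets would need versioning enabled first.

```python
import json


def replication_config(role_arn, destination_bucket, prefix=""):
    """Build a one-rule replication configuration (1:1, source to destination)."""
    return {
        "Role": role_arn,   # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-new-objects",
                "Prefix": prefix,          # empty prefix means the entire bucket
                "Status": "Enabled",
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


# Placeholder values for illustration only.
config = replication_config(
    role_arn="arn:aws:iam::123456789012:role/example-replication-role",
    destination_bucket="example-destination-oregon",
    prefix="photos/",
)
print(json.dumps(config, indent=2))
```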
Amazon S3 VPC Endpoint (VPCE)

Prior to S3 VPCE
- Public IP on EC2 instances and IGW
- Private IP on EC2 instances and NAT

Using S3 VPCE
- Access S3 using the S3 private endpoint (VPCE) without using NAT instances or gateways
- Increased security
Amazon S3 Data Encryption Options

- Client-side encryption (use AWS SDKs): you manage the encryption keys and never send them to AWS
- Server-side encryption (SSE) with Amazon S3 managed keys: check-the-box to encrypt your data at rest. Keys managed by S3
- SSE with customer provided keys: you manage your encryption keys and provide them for PUTs and GETs
- SSE with AWS Key Management Service managed keys: keys managed centrally in AWS KMS with permissions and auditing of usage

For more details watch Encryption and Key Management in AWS:
https://www.youtube.com/watch?v=uhXalpNzPU4
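At the REST API level, the three server-side options differ mainly in the request headers sent with a PUT. The sketch below builds those headers without contacting AWS. The helper names are hypothetical and the KMS key ID is a placeholder. With customer provided keys, the key itself travels with each request, which is why the same key must be supplied again on GET.

```python
import base64
import hashlib
import os


def sse_s3_headers():
    """SSE with Amazon S3 managed keys."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(key_id):
    """SSE with AWS KMS managed keys; key_id identifies the KMS key."""
    return {
        "x-amz-server-side-encryption": "aws:kms",
        "x-amz-server-side-encryption-aws-kms-key-id": key_id,
    }


def sse_c_headers(key):
    """SSE with a customer provided 256-bit key, sent base64-encoded with its MD5."""
    if len(key) != 32:
        raise ValueError("expected a 256-bit (32-byte) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


customer_key = os.urandom(32)   # in practice, a key you store and manage yourself
for name in sse_c_headers(customer_key):
    print(name)
```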
Amazon S3 Availability & Usage
Amazon S3 holds trillions of objects and regularly peaks at millions of requests per second.

Available in 14 regions today and 4 new regions coming soon.
Amazon S3 Capacity Pricing
Traditional storage: 1 PB raw storage -> 800 TB usable storage -> 600 TB allocated storage -> 400 TB application data

Amazon S3: pay only for what you use!


Amazon S3 Price

- Pay only for what you use.
- There is no minimum fee.
- We charge less where our costs are less, and prices are based on the location of your Amazon S3 bucket.

Estimate your monthly bill using the AWS Simple Monthly Calculator.
Amazon Glacier Archive storage
What is Amazon Glacier
Archival storage for infrequently accessed data

Amazon Glacier is optimized for infrequent retrieval
- 3-5 hour retrieval latency
- 5% free tier on retrievals

Even lower cost than Amazon S3; same high durability
- $0.007 per GB/month
- $86 per TB/year

Stop managing physical media
- Replace tape libraries, VTLs
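The yearly figure follows from the monthly per-GB price if a terabyte is counted as 1,024 GB. That unit convention is an assumption inferred from the numbers, not something the slide states. A short check:

```python
GLACIER_PRICE_PER_GB_MONTH = 0.007   # USD, as quoted in the slides
GB_PER_TB = 1024                     # assumption: binary terabyte


def yearly_cost_per_tb(price_per_gb_month, gb_per_tb=GB_PER_TB):
    """Convert a per-GB monthly price into a per-TB yearly price."""
    return price_per_gb_month * gb_per_tb * 12


print(f"${yearly_cost_per_tb(GLACIER_PRICE_PER_GB_MONTH):.2f} per TB per year")
```

With 1,000 GB per TB the same price works out to about $84, so the slide's rounded $86 matches the 1,024 GB convention.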
Key Features of Amazon Glacier

Vault inventory
- Inventory all archives
- Available as JSON or CSV

Access controls
- Integrated with AWS IAM
- Supports MFA device access

Ease of use
- Programmatic access using AWS SDKs
- REST APIs
- Management Console, AWS CLI

Integrated lifecycle management
- Integrated with Amazon S3 Lifecycle policies
- Establish auto-archive rules for Amazon S3 objects

Data retrieval policies
- Define data retrieval limits and cost ceiling
- Example: Free Tier Only, Max Retrieval Rate

Tagging support
- Tag vaults for cost management
- Filter cost reports based on tags
Innovation for Amazon Glacier

- Audit Logs
- Vault Lock
- Vault Access Policies


Three Ways to Ingest Data with Amazon Glacier

- Direct Glacier API/SDK: direct access to Glacier for deep archives
- S3 lifecycle integration: move older data to less expensive archive tier
- Third party tools and gateways: integrate existing backup and archive applications using an IT-friendly interface
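For the S3 lifecycle route, the rule that moves older data to cheaper tiers is a small configuration document. The sketch below builds one as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`. The prefix, rule ID, helper name, and day thresholds are illustrative assumptions, not values from the slides.

```python
import json


def archive_rule(prefix, ia_after_days=30, glacier_after_days=90, abort_mpu_days=7):
    """Build one lifecycle rule: S3 Standard -> S3-IA -> Glacier for a key prefix."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("the Glacier transition must come after the S3-IA transition")
    return {
        "ID": "archive-older-data",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Also clean up multipart uploads that were started but never completed.
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_mpu_days},
    }


lifecycle = {"Rules": [archive_rule("logs/")]}
print(json.dumps(lifecycle, indent=2))
```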
Data Transfer Options
AWS Data Ingest Options

- AWS Direct Connect
- AWS Snowball
- ISV Connectors
- Amazon Kinesis Firehose
- S3 Transfer Acceleration
- Storage Gateway
Content Distribution
Amazon CloudFront Edge Locations

AWS provides full-site, or media asset, delivery via a worldwide content delivery network (CDN) called Amazon CloudFront.
Single origin storage for content distribution

- Amazon S3 can be used as durable origin for global content distribution
- Provides single origin for multiple CDNs, such as Amazon CloudFront
- Data transfer out of Amazon S3 into Amazon CloudFront is free!
- Optimal for serving static web assets such as images, videos and HTML

Diagram: one Amazon S3 bucket serving as the origin for multiple edge locations.
Case Studies
SoundCloud: Audio Transcoding

- World's leading social sound platform
- Audio files must be transcoded and stored in multiple formats
- Stores petabytes of data
- Transcoded files served from Amazon S3 via Amazon CloudFront
- Originals moved to Amazon Glacier for cost savings
Druva InSync SaaS: Endpoint Data Protection

Druva inSync Cloud relies on:


- Amazon EC2
- Amazon S3
- Amazon DynamoDB
Amazon Storage Partner Ecosystem
- Backup/DR
- Archive
- Gateway/NAS
- File System
- Data Management
- Content and Acceleration
- Sync & Share


What's next?
Getting started with S3 and Glacier:
http://aws.amazon.com/s3/getting-started/
http://aws.amazon.com/glacier/getting-started/

Pricing:
http://aws.amazon.com/s3/pricing/
http://aws.amazon.com/glacier/pricing/
http://calculator.s3.amazonaws.com/index.html

AWS YouTube channel:
https://www.youtube.com/user/AmazonWebServices/playlists
Q&A
darrylo@amazon.com

Learn more at:
http://aws.amazon.com/s3/
http://aws.amazon.com/glacier/