
Advanced data management

Jiaheng Lu
Department of Computer Science
Renmin University of China


www.jiahenglu.net

Course purpose
Taught in English
The objective is to expose graduate students to exciting data management topics

Course contents
Cloud computing and cloud data management
XML data management
Column-store database
Data processing in bioinformatics

Lecturer
Academic experience

2006.9 ~ 2008.6 University of California, Irvine, postdoctoral researcher
2002.8 ~ 2006.8 National University of Singapore, PhD candidate
1998.9 ~ 2001.1 Shanghai Jiao Tong University, Master's candidate

University of California, Irvine

Postdoctoral research

Data integration in medical system [US patent]

Approximate string search [ICDE08]


National University of Singapore



Course grading
Report 30%
Google App Engine 30%
In-class presence and quiz 40%

Any questions or comments?

2012/3/20

Cloud computing

Why do we use cloud computing?

Case 1:
Write a file
Save
Computer down, file is lost
Files are always stored in the cloud, never lost

Why do we use cloud computing?

Case 2:
Use IE --- download, install, use
Use QQ --- download, install, use
Use C++ --- download, install, use
Get the service from the cloud

What is cloud and cloud computing?


Cloud: on-demand resources or services delivered over the Internet, with the scale and reliability of a data center.

What is cloud and cloud computing?


Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.

The architecture of cloud computing system

Characteristics of cloud computing

Virtual. Software, databases, Web servers, operating systems, storage and networking are provided as virtual servers.

On demand. Add and remove processors, memory, network bandwidth and storage as needed.
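
The "on demand" characteristic can be illustrated with a minimal sketch. The function name `instances_needed`, the load figures and the per-instance capacity of 100 requests per second are all hypothetical values chosen for illustration; real providers use their own scaling policies.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance):
    """Return how many instances the current load calls for (at least one).

    Both arguments are hypothetical inputs for this sketch.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    return max(1, math.ceil(requests_per_second / capacity_per_instance))


# Load rises and falls; the number of instances follows it up and down.
for load in (50, 400, 1200, 90):
    count = instances_needed(load, capacity_per_instance=100)
    print(load, "req/s ->", count, "instance(s)")
```

The point of the sketch is only that capacity is a function of current demand, so resources are added when load grows and released when it shrinks.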

Types of cloud service


SaaS: Software as a Service
PaaS: Platform as a Service
IaaS: Infrastructure as a Service

SaaS

Software delivery model

No hardware or software to manage
Service delivered through a browser
Customers use the service on demand
Instant scalability

SaaS

Examples

Your current CRM package is not managing the load, or you simply don't want to host it in-house. Use a SaaS provider such as Salesforce.com.

Your email is hosted on an Exchange server in your office and it is very slow. Outsource this using Hosted Exchange.

PaaS

Platform delivery model

Platforms are built upon infrastructure, which is expensive
Estimating demand is not a science!
Platform management is not fun!

PaaS

Examples

You need to host a large file (5 MB) on your website and make it available to 35,000 users for only two months. Use CloudFront from Amazon.

You want to start storage services on your network for a large number of files and you do not have the storage capacity. Use Amazon S3.
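
To see why the first example is a hosting problem at all, it helps to work out the data transfer involved. The sketch below uses the slide's own figures (a 5 MB file, 35,000 users) and assumes, purely for illustration, that each user downloads the file once and that 1 GB = 1000 MB. The function name is hypothetical.

```python
FILE_SIZE_MB = 5          # from the slide, read as megabytes
USERS = 35_000            # from the slide
DOWNLOADS_PER_USER = 1    # assumption: each user fetches the file once


def total_transfer_gb(file_size_mb, users, downloads_per_user=1):
    """Total outbound transfer in gigabytes, taking 1 GB = 1000 MB."""
    return file_size_mb * users * downloads_per_user / 1000


print(total_transfer_gb(FILE_SIZE_MB, USERS, DOWNLOADS_PER_USER), "GB")
```

Under these assumptions the total is 175 GB over the two months, a short-lived burst that is cheaper to rent capacity for than to build for.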

IaaS

Computer infrastructure delivery model

A platform virtualization environment
Computing resources, such as storage and processing capacity
Virtualization taken a step further

IaaS

Examples

You want to run a batch job but you don't have the infrastructure necessary to run it in a timely manner. Use Amazon EC2.

You want to host a website, but only for a few days. Use Flexiscale.

Cloud computing and other computing techniques

An Industry Transformed
Delgo www.delgo.com

http://www.boxofficemojo.com/

Shrek, Delgo, and Others

Why did DreamWorks use this?
Upsides?
Downsides?

Grid Computing & Cloud Computing

They share a lot in common: intention, architecture and technology.
They differ in programming model, business model, compute model, applications and virtualization.

Grid Computing & Cloud Computing

The problems are mostly the same:

manage large facilities;


define methods by which consumers discover, request and use resources provided by the central facilities;

implement the often highly parallel computations that execute on those resources.

Grid Computing & Cloud Computing

Virtualization

Grid: does not rely on virtualization as much as Clouds do; each individual organization maintains full control of its resources.

Cloud: virtualization is an indispensable ingredient for almost every Cloud.

Any questions or comments?


Google App Engine

Google App Engine


Does one thing well: running web apps

Simple app configuration

Scalable

Secure

App Engine Does One Thing Well


App Engine handles HTTP(S) requests, nothing else

Think RPC: request in, processing, response out
Works well for the web and AJAX; also for other services

App configuration is dead simple

No performance tuning needed
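
The "request in, processing, response out" model can be sketched with a plain WSGI application, the standard Python interface between web servers and web apps. This is a minimal illustration using only the standard library, not App Engine-specific code; the `call` helper and the `/quiz` path are hypothetical and exist only to drive the app without a network socket.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Request in, processing, response out."""
    path = environ.get("PATH_INFO", "/")                # request in
    body = ("Hello from " + path).encode("utf-8")       # processing
    start_response("200 OK", [                          # response out
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call(app, path):
    """Hypothetical helper: invoke the app once and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)   # fills in a minimal valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call(application, "/quiz"))
```

The handler keeps no state between calls, which is what lets a platform run many copies of it side by side.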

App Engine Architecture


[Figure: App Engine architecture. Labels: req/resp; Python VM process; app; stdlib; R/O FS; stateless APIs (urlfetch, mail, images); stateful APIs (memcache, datastore)]
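
One common way an app combines the two stateful services named in the architecture, memcache and datastore, is to check the cache first and fall back to the persistent store on a miss. The sketch below shows that pattern with hypothetical in-memory stand-ins (`FakeCache`, `FakeDatastore`); it does not use the real App Engine APIs, and the slides do not prescribe this pattern.

```python
class FakeDatastore:
    """Hypothetical stand-in for a persistent store; counts its reads."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._rows.get(key)


class FakeCache:
    """Hypothetical stand-in for an in-memory cache."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value


def lookup(key, cache, store):
    """Return the value for key, reading the store only on a cache miss."""
    value = cache.get(key)
    if value is None:
        value = store.get(key)
        if value is not None:
            cache.set(key, value)   # remember it for later requests
    return value


store = FakeDatastore({"course": "Advanced data management"})
cache = FakeCache()
lookup("course", cache, store)   # miss: goes to the store
lookup("course", cache, store)   # hit: served from the cache
print(store.reads)
```

Because request handlers themselves are stateless, any state that must survive between requests lives in services like these rather than in the process.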

How to use Google App Engine

Download Java 6
Download Eclipse and the Google plugin
Register a user account with Google
Create an application (Python, Java) and upload the code

In-class quiz

Please answer all questions

You may be requested to answer a question later. Your performance will affect your final score.

Study Google App Engine


http://code.google.com/intl/en/appengine/docs/java/gettingstarted/
