
CLOUD COMPUTING AND SERVICE-ORIENTED ARCHITECTURE

Anup Kumar, Professor


Alok Srivastava, Senior Consultant
Mobile Information Network and Distributed Systems (MINDS)
University of Louisville
and
Microsoft Corporation
THE CLOUD
The Cloud Phenomenon

“The Cloud” is emerging at the convergence of three major trends:

• Service Orientation (SOA)
• Virtualization
• Standardization of computing through the Internet (Web 2.0)
Distributed Application
Architecture
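[Figure: a distributed application spanning a client tier, business tier 1, business tier 2, and a data tier. Labeled components: WAP server, Web server, EJB container, and business objects (BO).]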

Features of a Distributed Application
• Reusability and simple integration
  – Different clients can reuse a business object and interact with it through well-defined interfaces
  – New objects can be developed quickly and easily from an existing library
• Performance
  – Provides better performance in most cases and better use of resources
  – Allows workload to be distributed across multiple computers
• Reliability and scalability
  – Reliability is achieved by replicating and distributing objects
  – Scalability is achieved by replicating objects on different servers
• Maintainability
  – Business components are modified and redeployed through a centralized management tool
Era of Cooperative Computing
 Technological evolution that accelerates the ongoing Internet revolution
 Platform-independent, Internet-based RPC
 Evolving to new distributed computing model for the Internet
 Key value: simplicity
 Allows semantic integration, access to existing business processes and
transactions

 Business revolution
 New product/service distribution models
 New Internet revenue models
 Tighter linkage to customers and partners
 Webs of value, changes to competitive structure

Characteristics of Web Services
• Web services are self-describing
  – Use an XML-based description language
  – Clients can discover Web services and invoke their functionality
  – New Web services can be created from existing Web services
• Web services provide
  – Interoperability among distributed applications that span different hardware and software
  – HTTP and XML-based communication
  – Flexible integration of Web and component-based development
Web Service Ingredients
[Figure: the Web Service Developer publishes the service and its description to the Web Service Registry. The Service Client performs a lookup in the registry to get the location of the service, then performs the service against the Web Service and obtains results.]

Web Services Architecture
• Web Service Developer
  – Builds services and makes them available over the Internet through a global registry
  – Services can be written in any language, on any platform
  – Provides a service description for clients
• Service Client
  – Locates a service and, based on its service description, invokes the required Web service
  – The service user could be a regular application or a Web client
  – Could be developed in any language
• Web Service Registry
  – Allows for registration and lookup of services
  – Brings the Web service provider and the Web service client together
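The three roles can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real registry protocol; the names `ServiceDescription`, `ServiceRegistry`, `publish`, `lookup`, and the `add` service are hypothetical and a plain callable stands in for a network endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ServiceDescription:
    """What a developer publishes: a name, a summary, and a location."""
    name: str
    summary: str
    endpoint: Callable[..., object]  # stands in for a network address


class ServiceRegistry:
    """Hypothetical in-memory registry that brings providers and clients together."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        self._services[description.name] = description

    def lookup(self, name: str) -> ServiceDescription:
        return self._services[name]


# Developer: builds a service and registers its description.
registry = ServiceRegistry()
registry.publish(ServiceDescription("add", "Adds two numbers", lambda a, b: a + b))

# Client: locates the service, then invokes it based on the description.
found = registry.lookup("add")
result = found.endpoint(2, 3)
print(found.summary, "->", result)
```

The client only needs the registry and the service name; where and how the service is implemented stays hidden behind the description.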
Evolution of Computing

 Network Computing
 Network is computer (client - server)
 Separation of Functionalities

 Cluster Computing
 Tightly coupled computing resources:
CPU, storage, data, etc. Usually connected within a LAN
 Managed as a single resource
 Commodity, Open source
Evolution of Computing

 Grid Computing
 Resource sharing across several domains
 Decentralized, open standards
 Global resource sharing

 Utility Computing
 Don’t buy computers, lease computing power
 Upload, run, download
 Ownership model
Applications on the Web

The Cloud
Cloud Computing

Gartner defines cloud computing as a style of computing where massively scalable IT-enabled capabilities are delivered as a service to external customers using Internet technologies.
Platform Continuum

On-Premises Servers
• Bring your own machines, connectivity, software, etc.
• Complete control
• Complete responsibility
• Static capabilities
• Upfront capital costs for the infrastructure

Hosted Servers
• Renting machines, connectivity, software
• Less control
• Fewer responsibilities
• Lower capital costs
• More flexible
• Pay for fixed capacity, even if idle

Cloud Platform
• Shared, multi-tenant infrastructure
• Virtualized & dynamic
• Scalable & available
• Abstracted from the infrastructure
• Higher-level services
• Pay as you go
Types of Cloud Services

IaaS: Infrastructure as a Service

PaaS: Platform as a Service

SaaS: Software as a Service


Cloud Computing Myths
Myth Description
1 Cloud computing is an architecture or an infrastructure.
2 Every vendor will have a different cloud.
3 Software as a service (SaaS) is a cloud.
4 Cloud computing is a brand-new revolution.
5 All remote computing is cloud computing.
6 The Internet and the Web are the cloud.
7 Everything will be the cloud.
8 The cloud eliminates private networks.
Observations
• Cloud is an abstraction for a relationship between the consumers and providers of services
• Vendors will all feed services into the one public cloud – those who can subscribe can access cloud services
• SaaS and infrastructure/platform as a service (IaaS/PaaS) vendors will become cloud services
• Availability of technology to be used by masses of people who care about what they can do with the technology, rather than how the technology is implemented (additional layer of abstraction)
• Not everything will become cloud computing, because many projects will require a level of privacy, performance or uniqueness that cannot be supported through the public cloud (S+S strategy is key)
Cloud v/s Non-Cloud
Cloud Computing Providers

• Amazon
  – Elastic Compute Cloud (EC2)
  – Simple Storage Service (S3)
• Google
  – Google App Engine
• Microsoft
  – Azure platform
• Salesforce
Major Types of Cloud

• Compute and Data Cloud
  – Amazon Elastic Compute Cloud (EC2), Google MapReduce, Science clouds
  – Provide a platform for running science code
• Host Cloud
  – Google AppEngine
  – High availability, fault tolerance, and robustness for web capability
Cloud Computing Example –
Google AppEngine
 Google AppEngine API
 Python runtime environment
 Datastore API
 Images API
 Mail API
 Memcache API
 URL Fetch API
 Users API
 A free account can use up to 500 MB storage,
enough CPU and bandwidth for about 5 million page
views a month
 http://code.google.com/appengine/
THE CLOUD IMPACT
Interaction Models for Computing

 Personal Computer
 One to One

 Client/Server
 One to Many

 Cloud Computing
 Many to Many
Cloud Computing Advantages

• Separation of infrastructure maintenance duties from application development
• Separation of application code from physical resources
• Ability to use external assets to handle peak loads
• Ability to scale to meet user demands quickly
• Sharing capability among a large pool of users, improving overall utilization
What Powers Cloud Computing?

• Commodity Hardware
  – Performance: a single machine is not interesting
  – Reliability
    – Even the most reliable hardware will still fail: fault-tolerant software is needed
    – Fault-tolerant software enables the use of commodity components
  – Standardization: use standardized machines to run all kinds of applications
What Powers Cloud Computing?
• Infrastructure Software
  – Distributed storage: Distributed File System (GFS)
  – Distributed semi-structured data system: BigTable
  – Distributed data processing system: MapReduce
What are the common issues across all of this software?
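To make the MapReduce item concrete, here is a single-process sketch of the map, shuffle, and reduce phases using word counting. It only illustrates the programming model; a real system distributes each phase across many machines and tolerates failures, which this sketch does not attempt. All function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as a framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


documents = ["the cloud", "the grid and the cloud"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

Because each call to `map_phase` depends only on its own document, and each key is reduced independently, both phases can be spread over commodity machines.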
Why Does Cloud Matter?

Business Value
• Business logic

Service “glue” and operations
• Expand to new locale
• Perform live upgrade for new feature
• Apply OS patches
• Diagnose service failures
• Add storage capacity
• Handle increase in traffic
• Respond to hardware failures

Datacenter
Components of Cloud Implementation
What are the Ingredients?
• An operating system for the cloud
• Framework for reducing the complexity of internet-scale applications
• Designed to provide services for scalable & reliable operations
• A set of services running the datacenters
Features of Operating System for the Cloud

• Hardware abstraction across multiple servers
• Distributed, scalable, available storage
• Deployment, monitoring, and maintenance
• Automated service management, load balancers, DNS
• Programming environments
• Interoperability
• Designed for utility computing
Benefits of Cloud OS?

 OS Takes care of your service in the cloud


 Deployment
 Availability
 Patching
 Hardware Configuration

 You worry about writing the service


Features of Cloud OS

 Automated Service Management

 Computation Resource Allocation

 Storage Resource Allocation

 User friendly Development Environment


Cloud Components?

Compute Storage

Developer
SDK
What Could be Compute Resources?

Compute
• .NET 3.5 SP1, Unix
• Server 2008 – 64bit
• Web Role
• IIS7 Web Sites (ASP.NET, FastCGI)
• Web Services (WCF)
• Worker Role
• Stateless Servers
• Http(s)

[Figure labels: Compute, Storage, Developer Tools]
What Could be Storage Resources?

Storage
• Durable, scalable, available
• Blobs, Clobs
• Tables
• Queues
• REST interfaces
  – Can be used without compute

[Figure labels: Compute, Storage, Developer Tools]
Check List for Cloud Computing Framework?

• All of the hardware
  – Hardware load balancers
  – Servers
  – Networks
  – DNS
• Monitoring
• Automated service management

[Figure labels: Compute, Storage, Developer Tools]
Service Models
• Describes your Service
<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="CloudService1" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebRole">
    <ConfigurationSettings>
      <Setting name="AccountName"/>
    </ConfigurationSettings>
    <LocalStorage name="scratch" sizeInMB="50"/>
    <InputEndpoints>
      <!-- Must use port 80 for http and port 443 for https when running in the cloud -->
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </InputEndpoints>
  </WebRole>
  <WorkerRole name="WorkerRole">
    <ConfigurationSettings>
      <Setting name="AccountName"/>
      <Setting name="TableStorageEndpoint"/>
    </ConfigurationSettings>
  </WorkerRole>
</ServiceDefinition>
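As an illustration of how a tool might read such a service definition, the sketch below parses an abridged copy of the XML with Python's standard `xml.etree.ElementTree` and summarizes the roles it declares. The `summarize` helper and the shape of its result are assumptions made for this example; they are not part of any Azure tooling.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the service definition shown on the slide.
NS = {"sd": "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"}

# Abridged copy of the slide's definition (XML declaration and some settings omitted).
SERVICE_DEFINITION = """
<ServiceDefinition name="CloudService1"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebRole">
    <InputEndpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </InputEndpoints>
  </WebRole>
  <WorkerRole name="WorkerRole">
    <ConfigurationSettings>
      <Setting name="AccountName"/>
      <Setting name="TableStorageEndpoint"/>
    </ConfigurationSettings>
  </WorkerRole>
</ServiceDefinition>
"""


def summarize(root: ET.Element) -> dict:
    """Hypothetical helper: map each role name to its kind, endpoints, and settings."""
    summary = {}
    for role in root:
        kind = role.tag.split("}")[1]  # strip the "{namespace}" prefix
        endpoints = [
            (e.get("protocol"), int(e.get("port")))
            for e in role.findall(".//sd:InputEndpoint", NS)
        ]
        settings = [s.get("name") for s in role.findall(".//sd:Setting", NS)]
        summary[role.get("name")] = {
            "kind": kind, "endpoints": endpoints, "settings": settings,
        }
    return summary


roles = summarize(ET.fromstring(SERVICE_DEFINITION))
print(roles)
```

Note that only the web role declares an input endpoint, which matches the later statement that worker roles accept no inbound network connections.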
Service Architecture

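[Figure: inside the cloud datacenter, traffic from the Internet reaches your service through a load balancer (LB). The service consists of Web Site instances (ASPX, ASMX, WCF) and Worker Service instances connected by a queue. Storage (queues, blobs, tables) sits behind its own load balancer.]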
Cloud Service Lifecycle

• Create service package
  – Binaries + Content + Service Metadata
• Deploy via web portal
• Add & remove capacity via web portal
• Deployed across domains
• Upgrade with zero downtime
Automated Service Management
• You tell us what, we take care of how
• What
  – Service metadata
• How
  – Metadata describes service
  – No OS footprint
  – Service is copied to instances
  – Instances are copied to physical hardware
  – Physical hardware is booted from VHD
  – All patching is performed offline
Cloud Service Monitoring

• Cannot attach a debugger to the cloud
• Event logs
  – Retrieve logs via web portal
• Detailed consumption reporting

Design Considerations
 Scalability and availability are the design points
 Storage may not be a relational database
 Stateless
 Stateless front ends, store state in storage
 Data structures to decouple components
 Instrumentation capability for your application
 Once you are on - stay on
 Easy patching & updates
AZURE SERVICES PLATFORM
The Roadmap

Fall 2008: First CTP
Spring 2009: Updated CTPs
Summer 2009: Pricing & SLA Confirmation
Fall 2009: Commercial Availability

Azure Services Platform

A Look Inside Azure Services
Platform
[Figure: Azure Services Platform building blocks beneath “Your Applications”: Service Bus, Workflow, Access Control, Database, Analytics, Reporting, Identity, Contacts, Devices, …; Compute, Storage, Manage, …; Interoperability.]



WINDOWS AZURE
Windows Azure Design Goals
Windows Azure Should Provide-
 The same facilities that a desktop OS
provides, but on a set of connected servers:
 Abstract execution environment
 Shared file system
 Resource allocation
 Programming environments

 And more: Utility computing


 24/7 operation
 Pay for what you use
 Simpler, transparent administration
Additional Design Goals
 Automated service management
 You define the rules and provide your code
 The platform follows the rules: deploys, monitors, and
manages your service

 A powerful service hosting environment


 All of the hardware: servers; load balancers; …
 Virtualized and direct execution

 Scalable, available cloud storage


 Blobs, tables, queues, …

 A rich, familiar developer experience


 Visual Studio, Eclipse support, …
Windows Azure: Realized

• The Azure™ Services Platform (Azure) is an internet-scale cloud services platform (an application platform as a service)
• Hosted in Microsoft data centers (global)
• Provides an operating system and a set of developer/deployment services
• Open architecture provides choices to build:
  – Web applications
  – Applications running on connected devices, PCs, servers
  – Hybrid solutions offering the best of online and on-premises
Windows Azure
A Closer Look
Running Applications
Web Role

• Web farm that handles requests from the internet
• IIS7 hosted web core
  – Hosts ASP.NET
  – XML-based configuration of IIS7
  – Integrated managed pipeline
  – Supports SSL
• Windows Azure code access security policy for managed code

[Figure: Public Internet, Load Balancer, Web Role, Storage Services]

Worker Role

• No inbound network connections
• Can read requests from queue in storage
• Windows Azure specific CAS policy for managed code

[Figure: Public Internet, Worker Role, Storage Service]
Azure Service Architecture
[Figure: Public Internet, Load Balancer, Web Role, Worker Role, Storage Service]
Serving Dynamic Content
[Figure: Public Internet, Load Balancer, Web Role, Worker Role, Storage Service]
Background Tasks
[Figure: Public Internet, Load Balancer, Web Role, Worker Role, Storage Service]
Azure Storage
Windows Azure Storage Abstractions
• Blobs – provide a simple interface for storing named files along with metadata for the file
• Tables – provide structured storage. A table is a set of entities, which contain a set of properties
• Queues – provide reliable storage and delivery of messages for an application
Blob Storage Concepts
Key concepts: account, container, blob, and blocks

Account
  Container: Pictures
    Blob: IMG001.JPG
    Blob: IMG002.JPG
  Container: Movies
    Blob: MOV1.AVI
      Block AAAA
      Block AAAB
      Block AAAC
Blob As A List Of Blocks

• Blob
  – Consists of a list of blocks
• Properties of blocks
  – Each block is identified by a Block ID
    – Up to 64 bytes, scoped by blob name
  – Blocks are immutable
  – A block is up to 4 MB
  – Blocks do not have to be the same size
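The block model can be sketched locally, without any storage service. The example below splits a payload into blocks, gives each a Block ID, and reassembles the blob from an ordered block list. The ID scheme (a base64-encoded counter) and the helper names are assumptions for illustration; only the 4 MB and 64-byte limits come from the slide.

```python
import base64
from typing import Dict, List, Tuple

MAX_BLOCK_SIZE = 4 * 1024 * 1024  # a block is up to 4 MB
MAX_BLOCK_ID_BYTES = 64           # a Block ID is up to 64 bytes


def split_into_blocks(data: bytes, block_size: int = MAX_BLOCK_SIZE) -> List[Tuple[str, bytes]]:
    """Cut data into (block_id, content) pairs; the last block may be smaller."""
    if not 0 < block_size <= MAX_BLOCK_SIZE:
        raise ValueError("block size must be between 1 byte and 4 MB")
    blocks = []
    for index, start in enumerate(range(0, len(data), block_size)):
        # Hypothetical ID scheme: a zero-padded counter, base64 encoded.
        block_id = base64.b64encode(f"block-{index:06d}".encode()).decode()
        if len(block_id) > MAX_BLOCK_ID_BYTES:
            raise ValueError("block id too long")
        blocks.append((block_id, data[start:start + block_size]))
    return blocks


def assemble(block_list: List[str], uploaded: Dict[str, bytes]) -> bytes:
    """Rebuild the blob by concatenating uploaded blocks in block-list order."""
    return b"".join(uploaded[block_id] for block_id in block_list)


payload = bytes(range(256)) * 10                  # 2560 bytes of sample data
blocks = split_into_blocks(payload, block_size=1000)
uploaded = dict(blocks)                           # stands in for blocks held by the service
block_list = [block_id for block_id, _ in blocks]
restored = assemble(block_list, uploaded)
print(len(blocks), len(restored))
```

Since blocks are immutable and identified by ID, a failed upload can be retried block by block rather than restarting the whole blob.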
Queues

• Simple asynchronous dispatch queue
• Create and delete queues
• Message:
  – Retrieved at least once
  – Max size 8 KB
• Operations:
  – put
  – get
  – delete
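"Retrieved at least once" has a practical consequence: a message that a worker takes but never deletes is delivered again. The toy queue below illustrates this with put, get, and delete. The visibility timeout and the explicit `now` parameter (a logical clock) are assumptions added to make the behavior testable; the class is not the Azure queue API.

```python
import itertools
from typing import Dict, Optional, Tuple

MAX_MESSAGE_BYTES = 8 * 1024  # max message size from the slide


class SimpleQueue:
    """Toy in-memory queue with put/get/delete and a visibility timeout."""

    def __init__(self, visibility_timeout: int = 30) -> None:
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        # message id -> (body, time at which the message becomes visible again)
        self._messages: Dict[int, Tuple[bytes, int]] = {}

    def put(self, body: bytes) -> int:
        if len(body) > MAX_MESSAGE_BYTES:
            raise ValueError("message exceeds 8 KB")
        message_id = next(self._ids)
        self._messages[message_id] = (body, 0)
        return message_id

    def get(self, now: int) -> Optional[Tuple[int, bytes]]:
        """Return the oldest visible message and hide it until the timeout expires."""
        for message_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[message_id] = (body, now + self.visibility_timeout)
                return message_id, body
        return None

    def delete(self, message_id: int) -> None:
        self._messages.pop(message_id, None)


queue = SimpleQueue(visibility_timeout=30)
queue.put(b"resize IMG001.JPG")

first = queue.get(now=0)     # worker 1 takes the message, then crashes
hidden = queue.get(now=10)   # still invisible to other workers
second = queue.get(now=31)   # timeout expired: the message is delivered again
queue.delete(second[0])      # worker 2 finishes and deletes it
remaining = queue.get(now=100)
print(first, hidden, second, remaining)
```

Workers should therefore be idempotent: processing the same message twice must be harmless.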
Queue Storage Concepts
Account, queue and message

Account
  Queue: Thumbnail Jobs
    Message: 128x128, http://
    Message: 256x256, http://
  Queue: Indexing Jobs
    Message: http://…
    Message: http://…
Tables

• Entities and properties (rows & columns)
• Tables scoped by account
• Designed for billions+
• Scale-out using partitions
  – Partition key & row key
  – Operations performed on partitions
  – Efficient queries
  – No limit on number of partitions
• Use ADO.NET Data Services
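A small in-memory sketch shows why the partition key and row key together make lookups and partition-scoped queries cheap. The `Table` class, its methods, and the sample keys are hypothetical and do not model the ADO.NET Data Services interface.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Entity = Dict[str, object]


class Table:
    """Toy table: entities are grouped by partition key, then addressed by row key."""

    def __init__(self) -> None:
        self._partitions: Dict[str, Dict[str, Entity]] = defaultdict(dict)

    def insert(self, partition_key: str, row_key: str, properties: Entity) -> None:
        self._partitions[partition_key][row_key] = dict(properties)

    def get(self, partition_key: str, row_key: str) -> Optional[Entity]:
        """Point lookup: both keys identify exactly one entity."""
        return self._partitions.get(partition_key, {}).get(row_key)

    def query_partition(self, partition_key: str) -> List[Tuple[str, Entity]]:
        """Partition scan: touches a single partition, sorted by row key."""
        return sorted(self._partitions.get(partition_key, {}).items())


photo_index = Table()
photo_index.insert("sunset", "IMG002", {"owner": "alice"})
photo_index.insert("sunset", "IMG001", {"owner": "bob"})
photo_index.insert("beach", "IMG003", {"owner": "alice"})

sunsets = photo_index.query_partition("sunset")
print([row_key for row_key, _ in sunsets])
```

Because each partition is self-contained, different partitions can live on different servers, which is what allows scale-out.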
Table Storage Concepts
Account, table and entity

Account
  Table: Users
    Entity: Name=…, hash=…
    Entity: Name=…, hash=…
  Table: PhotoIndex
    Entity: Tag=…, id=…
    Entity: Tag=…, id=…
Concurrent Updates

[Figure: Client A and Client B both hold version 1 of the entity (Ch9, Jan-2). Client A sends its update (rating 5) with If-Match: 1; the update succeeds and the stored version becomes 2. Client B sends its update (rating 4) with If-Match: 1 and gets Error: 412.]

• Use standard HTTP mechanisms: ETag and If-Match
  – Get entity: get system-maintained version as ETag
  – Update entity locally: change rating
  – Send update with version check: If-Match with ETag
  – Success if version matches; the version is updated on Client A
  – Precondition failed (412) if version does not match
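The version check can be simulated in memory to show both outcomes. In this sketch an integer stands in for the ETag, and `EntityStore`, `PreconditionFailed`, the key, and the starting rating of 3 are assumptions made for the example.

```python
from typing import Dict, Tuple


class PreconditionFailed(Exception):
    """Raised when the If-Match value does not equal the stored version."""
    status = 412


class EntityStore:
    """Toy store that keeps a version number (standing in for an ETag) per entity."""

    def __init__(self) -> None:
        self._entities: Dict[str, Tuple[int, dict]] = {}

    def put_new(self, key: str, entity: dict) -> None:
        self._entities[key] = (1, dict(entity))

    def get(self, key: str) -> Tuple[int, dict]:
        etag, entity = self._entities[key]
        return etag, dict(entity)  # return a copy so clients edit locally

    def update(self, key: str, entity: dict, if_match: int) -> int:
        current_etag, _ = self._entities[key]
        if if_match != current_etag:
            raise PreconditionFailed(f"expected version {current_etag}, got {if_match}")
        new_etag = current_etag + 1
        self._entities[key] = (new_etag, dict(entity))
        return new_etag


store = EntityStore()
key = "Ch9/Jan-2"
store.put_new(key, {"rating": 3})

etag_a, entity_a = store.get(key)   # Client A reads version 1
etag_b, entity_b = store.get(key)   # Client B reads version 1

entity_a["rating"] = 5
new_etag_a = store.update(key, entity_a, if_match=etag_a)  # succeeds

entity_b["rating"] = 4
try:
    store.update(key, entity_b, if_match=etag_b)           # stale version
    status_b = 200
except PreconditionFailed as error:
    status_b = error.status

print(new_etag_a, status_b, store.get(key))
```

After the 412, Client B would typically re-read the entity, reapply its change, and retry with the new version.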
Desktop And Related Azure Concepts

Desktop                      Windows Azure
EXE                          Service package
Application configuration    Service configuration
Manifest                     Service definition
DLL                          Service role
Windows forms library        Web role
Windows service              Worker role
Local data stores            Internet data stores
Windows Azure
[Figure: Your Service, DNS, load balancers (LB), Web Portal (API), Fabric Controller]
Service Deployment
[Figure: Your Service, Service Model, DNS, load balancers (LB), Web Portal (API), config, Fabric Controller]
Service Scaling
[Figure: multiple Service instances, DNS, load balancers (LB), Web Portal (API), Fabric Controller, Model]
Service Monitoring & Recovery
[Figure: Service instances with a failure marker (!), DNS, load balancers (LB), Web Portal (API), Fabric Controller, Model]
Service Models & Roles

Bid Now Service

Role       Purpose              Endpoint     Instances
Web A      Main Web             port 80      100
Web B      Admin                port 8081    2
Worker X   Image Resize                      2
Worker Y   Auction Processing                25
Worker Z   Notifications                     10

SQL Azure
Deployment
[Figure: Web Portal (API), DB Script, SQL Azure, TDS]
SQL Azure
Accessing databases
[Figure: Your App reaches SQL Azure over TDS; Web Portal (API). Caption: Change Connection String]
Database Replicas
[Figure: single database (DB), multiple replicas, single primary: Replica 1, Replica 2, Replica 3]
SQL Azure
Database Monitoring & Recovery
[Figure: Web Portal (API), Your App, SQL Azure, TDS, failure marker (!)]
DEPLOYMENT AUTOMATION IN
WINDOWS AZURE
Deploying A Service Manually

 Resource allocation
 Machines must be chosen to host roles of the service
 Fault domains, update domains, resource utilization, hosting environment, etc.
 Procure additional hardware if necessary
 IP addresses must be acquired
 Provisioning
 Machines must be set up
 Virtual machines created
 Applications configured
 DNS setup
 Load balancers must be programmed
 Upgrades
 Locate appropriate machines
 Update the software/settings as necessary
 Only bring down a subset of the service at a time
 Maintaining service health
 Software faults must be handled
 Hardware failures will occur
 Logging infrastructure is provided to diagnose issues

This is ongoing work…you’re never done


Windows Azure Automation

• Fabric Controller (FC)
  – Maps declarative service specifications to available resources
  – Manages service life cycle starting from bare metal
  – Maintains system health and satisfies SLA
• What’s special about it
  – Model-driven service management
  – Enables utility-model shared fabric
  – Automates hardware management

[Figure: “What” is needed goes to the Fabric Controller, which must “Make it happen” on the Fabric (load balancers, switches).]
Fabric Controller
• Owns all the data center hardware
• Uses the inventory to host services
• Similar to what a per-machine operating system does with applications
• The FC provisions the hardware as necessary
• Maintains the health of the hardware
• Deploys applications to free resources
• Maintains the health of those applications
What You Describe In Your Service Model…

 The topology of your service


 The roles and how they are connected
 Attributes of the various components
 Operating system features required
 Configuration settings
 Describe exposed interfaces
 Required characteristics
 How many fault/update domains you need
 How many instances of each role
Fault/Update Domains
• Allows you to specify what portion of your service can be offline at a time
• Fault domains are based on the topology of the data center
  – Switch failure
  – Statistical in nature
• Update domains are determined by what percentage of your service you will take out at a time for an upgrade
• You may experience outages for both at the same time
• System considers fault domains when allocating service roles
  – Example: Don’t put all roles in same rack
• System considers update domains when upgrading a service

[Figure: Fault domains. Allocation is across fault domains.]
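One simple way to picture the two kinds of domain is a round-robin placement. The sketch below is an assumed policy for illustration only, not the Fabric Controller's actual allocation algorithm; the function names are hypothetical.

```python
from typing import List, Tuple

# (instance index, fault domain, update domain)
Placement = List[Tuple[int, int, int]]


def place_instances(instance_count: int, fault_domains: int, update_domains: int) -> Placement:
    """Spread instances so that each update domain spans several fault domains."""
    return [
        (i, i % fault_domains, (i // fault_domains) % update_domains)
        for i in range(instance_count)
    ]


def rolling_upgrade(placement: Placement, update_domains: int) -> List[List[int]]:
    """List, per upgrade step, the instances taken offline together."""
    return [
        [instance for instance, _, ud in placement if ud == domain]
        for domain in range(update_domains)
    ]


def survivors_of_fault(placement: Placement, failed_fault_domain: int) -> List[int]:
    """Instances still running after one fault domain (e.g. a rack) fails."""
    return [instance for instance, fd, _ in placement if fd != failed_fault_domain]


placement = place_instances(instance_count=6, fault_domains=2, update_domains=3)
steps = rolling_upgrade(placement, update_domains=3)
alive = survivors_of_fault(placement, failed_fault_domain=0)
print(placement)
print(steps, alive)
```

With six instances, two fault domains, and three update domains, an upgrade takes down two instances per step and a rack failure leaves three running. If both happen at once, fewer remain, which is the "outages for both at the same time" case.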
Windows Azure Service Lifecycle
Goal is to automate life cycle as much as possible

Stage                 What happens                                                       Who
Coding & Modeling     New services and updates                                           Developer
Provisioning          Desired configuration                                              Developer/Deployer
Deployment            Mapping and deploying to actual hardware; network configuration    Automated
Maintain goal state   Monitor; react to events                                           Automated
DEVELOPING WITH AZURE
Consistent, Familiar
Development Experience
 Visual Studio
 Templates
 Debugging

 .NET platform
 .NET, IIS7, WCF

 “The cloud on your desktop”


 Complete offline cloud simulation
 Like Cassini (web development server)
Not Just Websites

• Cloud services aren’t just websites
• Many other types of work for the cloud
  – Bulk file conversion
  – Heavy analytics
• Even websites can offload asynchronous work
• We need a more complex architecture
Publishing Your Service To The Cloud

1. Write code on your laptop


2. Upload your package to the web portal
3. Push “deploy”

4. Monitor, upgrade, scale…


Debugging in the Cloud

 Debugging the cloud really means logging

 Simple logging API today

 More functionality over time


Scalable Web Application
Parallel Processing
Application
Web Application with
Background Processing
Using Cloud Storage From
Locally Hosted Application
AZURE SERVICES
Azure Services
Approach to Azure Services
Provide a Flexible Services Platform with Internet Scale
Simple scenarios are simple – complex scenarios are possible
Services hosted in Microsoft’s data centers
Designed for high availability & scalability

Base it on Internet Standards


Multiple protocol support including HTTP, REST, SOAP, AtomPub
Broad investment in open, community-based access to Azure services

Extend Your Existing Investments


Familiar tools, languages, and frameworks with .NET and Visual Studio
Provides the choice to build on-premises, cloud, or hybrid solutions
Integrate with existing assets such as AD and premises applications
.NET Services address common infrastructure challenges in creating
distributed applications

.NET SERVICES
.NET Service
Access Control
Access Control
Challenges
 Lots of identity providers, many vendors,
protocol variability – tricky to get it all right
 Access checks strewn throughout applications
 Hard to be agile, compliant, and flexible
Approach
 Federate a wide-range of identity providers and
technologies – pluggable too
 Factor out access control logic into manageable
collection of rules
The Access Control Pattern
0. Trust exchanged; secrets, certs
1. Define access control rules
2. Send token (initial claims; e.g. identity)
3. Map input claims to output claims based on access control rules
4. Return token (output claims from 3)
5. Send token w/ request
6. Check for claims

[Figure: the participants are Your Access Control Project, Your App (Relying Party), and the User (Application).]
Access Control Capabilities
• A hosted security token service
• The output security token contains claims computed from claims in incoming tokens
• Define and manage rules to map claims to claims
• Create and manage scopes; e.g. URLs
• Create and manage claim types
• Create and manage signing and encryption keys
• Create and manage rules within an application scope
• Rules can be chained; e.g. Bob → Manager, Manager → Edit – enables RBAC or more
• Manage permissions on scopes; e.g. delegation
• Standards based – works with Java, Ruby, PHP, …
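Rule chaining can be illustrated with a small fixed-point computation: keep applying rules until no new claims appear. The rule representation (a dictionary from input claim to output claims) and the function name are assumptions for this sketch, not the Access Control service's rule format.

```python
from typing import Dict, Set


def compute_output_claims(input_claims: Set[str], rules: Dict[str, Set[str]]) -> Set[str]:
    """Apply mapping rules repeatedly until no new claims are produced."""
    derived: Set[str] = set()
    seen = set(input_claims)
    frontier = list(input_claims)
    while frontier:
        claim = frontier.pop()
        for produced in rules.get(claim, ()):
            if produced not in seen:
                seen.add(produced)
                derived.add(produced)
                frontier.append(produced)  # a derived claim may trigger further rules
    return derived


# The chained example from the slide: Bob -> Manager, Manager -> Edit.
rules = {"Bob": {"Manager"}, "Manager": {"Edit"}}

bob_claims = compute_output_claims({"Bob"}, rules)
stranger_claims = compute_output_claims({"Alice"}, rules)
print(bob_claims, stranger_claims)
```

The application then only checks for the `Edit` claim; who holds which role is managed entirely in the rules.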
Service Bus
Service Bus
Challenges
 You want to make it easy and secure for partners to
integrate with your application
 But you don’t always know ahead of time the
characteristics or scale of the integration
 Plus partners and customers have devices and
services running behind firewalls
Approach
 Provide a highly-available “Service Bus” based on
standard Internet protocols
The Service Bus Pattern
[Figure: Applications, Workflows, … sit on top of the Service Bus building blocks: Service Registry, Application Messaging Patterns, Federated Identity and Access Control, and Connectivity Fabric. The bus connects Your Clients (Desktop, Web, RIAs, …), On-Premises Services (ESB, Corp Service, …), and Cloud Services (Storage, Billing, Compute, …).]
Service Bus Capabilities
• Connectivity Fabric
  – NAT / firewall traversal
  – Mobile & intermittently connected receivers
• Application Messaging
  – Bi-directional / peer-to-peer communication
  – Publish and subscribe – multicast to receivers through a stable URI
  – Cloud buffering – web integration, “queues”, …
• Service Registry
  – Stable URIs for services
  – Discovery – supports Atom pub, …
• Service Bus Workflows
  – Simple hosted message processing activities
  – Conditional behavior, fire events, transform messages, send mail, …
Service Bus Scenarios

1. Create a custom, peer-to-peer Instant Messenger application in ~20 lines
2. Pop a “toast” when you have a new customer order
3. “Slingbox” your videos from home
4. Easy, secure, web-based sharing from mobile devices
5. Integrate and orchestrate corporate billing and fulfillment systems
Workflow
Workflow Service
Overview
 Framework Support
 Supports WF 3.5
 Future releases to support WF 4.0 and WF 4.0+
 Activities
 New Activities for Azure Services Platform
 Supports a subset of WF out-of-box activities
 Workflow Designer
 Use existing tools
 Deploy, Run, Manage
 Management Portal for easy access
 Management APIs for rich automation
The Live Framework lets applications access Live Services data,
optionally synchronizing that data across desktops and devices

LIVE FRAMEWORK AND SERVICES


Live Services

… are a set of building blocks for handling user data and application resources, which can connect your application to hundreds of millions of users.
Live Framework
Live Framework
Inside Live Operating Environment
[Figure: components of the Live Operating Environment: Live Framework Programming Model, Script Engine, App Engine, Sync Engines, Formatters, API Throttler, Resource Manager, Scheduler, Analytics, FS Manager, Auth/AuthZ, Cache, P2P Comms, Notifications, HTTP Comms, Device Mgmt, Service Proxies, Live Operating Environment Engine, (Full/Min) CLR, Host (Web/desktop).]
Accessing Data
Using A Mesh
Windows Live Services

WL Hotmail WL ID WL Messenger Live Search WL Spaces WL Alerts

Live Search WL Sky Drive WL Events


Live.com WL Photo Gallery
Maps
WL Mail

WL Calendar
Live Gadgets WL Expo
WL for Mobile WL Writer WL Gallery
WL Agents

WL Favorites WL Family Safety


WL Contacts WL Toolbar
WL OneCare WL QnA

Mesh Services
Distributed Data Mining
Data Mining Process Model

• Understanding the problem definition
• Understanding the data
• Preparation of the data
• Data mining
• Evaluation of the discovered knowledge
• Using the discovered knowledge

Distributed Data Mining (DDM)
• Definition
  – Distributed Data Mining (DDM) deals with the problem of mining distributed, possibly multi-party data using distributed algorithms.

[Figure: four data sets (Data Set 1 to Data Set 4), each exposed through its own Web Service and connected over the Internet.]

Detailed Architecture
[Figure: Data Mining Clients connect to an execution framework for hierarchical Web-component-based services (components C1 to C4). The Registry offers DM Registration (using DMRL) and a DM Discovery Service (R1, R2). Below are Algorithm Providers (Aggregate Component DMCL; Core Web Service) and Data Providers.]

Client: The user of the distributed data mining framework.

Registry: Holds a database in which all algorithm service providers and data providers keep their details.
Registration
[Figure: data sites (Site 1, Site 2, …, Site n) each register their Data Service (Data Service 1 … n) with the RegistryServer; algorithm sites (Site 1, Site 2, …, Site n) each register their Algorithm Service (Algorithm Service 1 … n) with the same RegistryServer.]
Selecting Dataset and Algorithm Service Information

1. The client requests to view the available dataset information; the ShowDataSets JSP/servlet shows the available datasets' information.
2. The client selects dataset(s) and requests the available algorithms; the ShowAlgorithms JSP/servlet shows the available algorithms for the selected dataset(s).

[Figure: both servlets obtain service info from the Registry, which lists Data Service 1 … n and Algorithm Service 1 … n.]
Mining Process (Centralized)
3. Collect the datasets from the selected Data Site(s) (fetchData)
4. Combine the selected datasets (combineData)
5. Fetch the selected algorithm from the remote site (fetchAlgo)
6. Process the combined dataset with the selected algorithm (executeAlgo)

[Figure: the client sends the selected algorithm and dataset information to WebMineServlet. The servlet (a) requests each data or algorithm service URL from the Registry and (b) uses that URL to fetch Data 1, Data 2, and Algorithm 1 from Data Service 1, Data Service 2, and Algorithm Service 1. The results after mining are returned to the client through displayResults.]
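The centralized steps can be sketched as a small pipeline. The function names mirror the labels in the figure (fetchData, combineData, fetchAlgo, executeAlgo) in Python naming style; the in-memory dictionaries standing in for remote sites, the sample numbers, and the toy "algorithm" (a mean) are all assumptions for illustration.

```python
from typing import Callable, Dict, List

# Stand-ins for remote data sites and algorithm sites (hypothetical contents).
DATA_SITES: Dict[str, List[int]] = {"Data 1": [4, 8], "Data 2": [15, 16, 23]}
ALGORITHM_SITES: Dict[str, Callable[[List[int]], float]] = {
    "Algorithm 1": lambda rows: sum(rows) / len(rows),  # toy mining step: the mean
}


def fetch_data(names: List[str]) -> List[List[int]]:
    """Step 3: collect the datasets from the selected data sites."""
    return [DATA_SITES[name] for name in names]


def combine_data(datasets: List[List[int]]) -> List[int]:
    """Step 4: combine the selected datasets into one."""
    return [row for dataset in datasets for row in dataset]


def fetch_algo(name: str) -> Callable[[List[int]], float]:
    """Step 5: fetch the selected algorithm from its remote site."""
    return ALGORITHM_SITES[name]


def execute_algo(algo: Callable[[List[int]], float], data: List[int]) -> float:
    """Step 6: process the combined dataset with the selected algorithm."""
    return algo(data)


def mine_centralized(dataset_names: List[str], algorithm_name: str) -> float:
    combined = combine_data(fetch_data(dataset_names))
    return execute_algo(fetch_algo(algorithm_name), combined)


result = mine_centralized(["Data 1", "Data 2"], "Algorithm 1")
print(result)
```

In this model all data travels to one place before mining, so network cost grows with the size of the datasets.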

Mining Process (Cloud-Distributed)
3. Fetch the selected algorithm from the remote site (fetchAlgo)
4. Send the algorithm to the selected Data Site(s) (sendAlgo)
5. Process the data, distributed, at the selected Data Site(s) (executeAlgo)

[Figure: the client sends the selected algorithm and dataset information to WebMineServlet. The servlet looks up service URLs in the Registry (a), uses them (b) to fetch Algorithm 1 from Algorithm Service 1, and sends Algorithm 1 to Data Service 1 and Data Service 2, where it runs against Data 1 and Data 2.]
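In the distributed variant the algorithm moves to the data instead of the reverse. The sketch below reuses the figure's labels (fetchAlgo, sendAlgo, executeAlgo) in Python naming style; the `DataSite` class, the sample numbers, and the toy algorithm (a per-site count and sum) are assumptions for illustration. How partial results are merged depends on the algorithm and is not covered here.

```python
from typing import Callable, Dict, List, Tuple

Summary = Tuple[int, int]  # (row count, sum of values) computed at one site
Algorithm = Callable[[List[int]], Summary]


class DataSite:
    """Stand-in for a remote data site; its rows never leave the object."""

    def __init__(self, name: str, rows: List[int]) -> None:
        self.name = name
        self._rows = rows

    def execute_algo(self, algo: Algorithm) -> Summary:
        """Step 5: run the received algorithm against the local data."""
        return algo(self._rows)


ALGORITHM_SITES: Dict[str, Algorithm] = {
    "Algorithm 1": lambda rows: (len(rows), sum(rows)),  # toy local mining step
}


def fetch_algo(name: str) -> Algorithm:
    """Step 3: fetch the selected algorithm from its remote site."""
    return ALGORITHM_SITES[name]


def send_algo(algo: Algorithm, sites: List[DataSite]) -> Dict[str, Summary]:
    """Step 4: send the algorithm to each selected site and gather its result."""
    return {site.name: site.execute_algo(algo) for site in sites}


sites = [DataSite("Data 1", [4, 8]), DataSite("Data 2", [15, 16, 23])]
partials = send_algo(fetch_algo("Algorithm 1"), sites)
print(partials)
```

Only the algorithm and the small per-site summaries cross the network, which is the main appeal when datasets are large or cannot be shared between parties.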

On Demand Cloud Computing
Collaborative Computing
THANK YOU!

Anup Kumar and Alok Srivastava
