“Advanced Database Systems”

Course outline

Introduction
 Objectives
 A DataBase Management System (DBMS)
 Structure of a DBMS
 File and data models – an overview
 Evolution of Database Technology
 Advanced database contents/topics
Centralized database system related issues
Distributed database system related issues
Introduction to Database system adapted architectures
Modern database systems
Database Security and Authorization
Database development applications
Recapitulative mini project
“Advanced Database Systems”- Introduction

Objectives
 To understand the fundamental concepts and advanced technology underlying
database systems:
 Database management system roles and advanced database fundamentals
 Modern database systems:
 Object-relational databases
 Spatial databases
 Active databases
 Etc.
 Physical data organizations
 Query processing and optimization
 Transaction management
 concurrency control
 crash recovery
 Security and authorization
 Distributed database system issues
 Parallel and system architectures
 OLAP, data mining and data warehouse

 To gain hands-on experience with database application systems


 developing a small application system using MS SQLSERVER
 Java/Vis…ualBasic and database-backed web-sites


A DataBase Management System (DBMS)


 What is DBMS?
 Need for information management
 A very large, integrated collection of data.
 Models real-world enterprise.
 A DBMS contains information about a particular enterprise
 A Database Management System (DBMS) is a software package designed to store and
manage databases.
 A DBMS provides an environment that is both convenient and efficient to use

 Why Use a DBMS?
 Data independence and efficient access.
 Data integrity and security.
 Uniform data administration.
 Concurrent access, recovery from crashes.
 Replication control
 Reduced application development time.

 Purpose of Database Systems
Database management systems were developed to handle the following
issues/difficulties of typical file-processing systems supported by conventional
operating systems:
 Difficulty in accessing data
 Data redundancy and inconsistency
 Data isolation – multiple files and formats


Structure of a DBMS (1/2)


 A typical DBMS has a layered architecture.
 The figure does not show the concurrency control and recovery components.
 This is one of several possible architectures; each system has its own variations.

[Figure: DBMS layers, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]
Structure of a DBMS (2/2)

The enclosed figure shows the structure (with some simplification) of a typical DBMS based on the relational data model.

[Figure: Web Forms, Application Front Ends and the SQL Interface issue SQL Statements to the Query Evaluation Engine (Parser, Optimizer, Plan Executor, Operator Evaluator). Below it sit Files and Access Methods, the Buffer Manager and the Disk Space Manager, alongside the Transaction Manager and Lock Manager (concurrency control) and the Recovery Manager. Together these form the DBMS, which operates on the Database: Index Files, Data Files and the System catalog.]

File and data models – an overview


 Data Models
 A data model is a collection of concepts for describing data.
 A schema is a description of a particular collection of data, using a given data
model.
 History of File Organizations:
 Sequential search
 Indexed sequential
 B-tree
 Hashing

 Classification of Database Models:
 Entity-relationship
 Network
 Hierarchical
 Relational
 Object-oriented
 Deductive
 Object-relational
 Semi-structured data (XML)

Various database models provide logical and physical data independence to separate simple logical database structures and complicated physical file structures.
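
To make the data model and schema definitions on this slide concrete, here is a minimal sketch that declares a schema in the relational data model with Python's built-in sqlite3 module. The Students table, its columns and the sample rows are hypothetical and chosen only for illustration; the course itself names MS SQL Server as its platform.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema: a description of one collection of data (Students),
# expressed in the relational data model. Names are hypothetical.
conn.execute(
    "CREATE TABLE Students (sid INTEGER PRIMARY KEY, name TEXT NOT NULL, gpa REAL)"
)

# The rows are an instance of that schema.
conn.executemany(
    "INSERT INTO Students (sid, name, gpa) VALUES (?, ?, ?)",
    [(1, "Ada", 3.9), (2, "Edgar", 3.4)],
)

# SQLite keeps the schema text in its catalog table, sqlite_master.
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'Students'"
).fetchone()[0]
rows = conn.execute("SELECT name FROM Students WHERE gpa > 3.5").fetchall()
```

The query is written against the logical schema only; how SQLite lays the rows out in its file is hidden from the application, which is the data independence the slide refers to.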

Evolution of Database Technology


1960s: Hierarchical (IMS) & network (CODASYL) DBMS.

1970s: Relational data model, relational DBMS implementation.

1980: RDBMS rules the earth

1985: Advanced data models (extended-relational, OO, deductive, etc.).
Application-oriented DBMS (spatial, scientific, engineering, etc.).

1990s: ORDB, OLAP, Data mining, data warehousing, multimedia databases, and network databases.

2000s: Databases for XML, bioinformatics, stream data and sensor network data


Advanced database contents/topics


 Centralized database system related issues
 Storage and File Structure
 Indexing and Hashing
 Query Processing
 Transaction Fundamentals
 Concurrency Control Techniques
 Database Recovery
 Distributed database system related issues
 Distributed Databases “Fundamentals”
 Distributed Transactions-Commit Protocols
 Distributed Databases Concurrency Control
 Heterogeneous Distributed Databases
 Advanced and modern database systems
 Introduction to Database system adapted architectures
 Modern database systems
 Object-Oriented Databases
 Spatial and Geographic Databases
 Data Mining and Data Warehousing: Concepts and Techniques
 Database Security and Authorization
 Database development applications
 Recapitulative mini project
Centralized database system related issues

File Organizations and indexing


 Storage and File Structure
 Physical Storage Media - introduces the different types of storage media and technologies;
topics covered are:
 Volatile and nonvolatile storage
 Storage devices and Magnetic Disks
 Performance Measures
 Introduction to the RAID technology
 Storage and File Organization - this section covers the following topics:
 Storage Access: Buffer Manager, Buffer-Replacement Policies
 File Organization: Organization of Records in Files, Unordered files, Ordered files, Hashed files
 Data-Dictionary Storage
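
The buffer manager and buffer-replacement policies listed above can be illustrated with a small sketch. It assumes the least recently used (LRU) policy, which is only one of several possible policies, and it leaves out pinning and dirty-page write-back. The names BufferPool and read_page are hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer manager with LRU replacement (hypothetical names)."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity      # number of frames in the pool
        self.read_page = read_page    # callback that fetches a page from "disk"
        self.frames = OrderedDict()   # page_id -> contents, least recent first
        self.disk_reads = 0

    def get(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        # Miss: evict the least recently used page if the pool is full.
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)
        self.disk_reads += 1
        self.frames[page_id] = self.read_page(page_id)
        return self.frames[page_id]


pool = BufferPool(capacity=2, read_page=lambda pid: f"page-{pid}")
for pid in [1, 2, 1, 3, 2]:
    pool.get(pid)
```

With two frames, the access sequence 1, 2, 1, 3, 2 causes four disk reads: the second request for page 1 is a hit, while page 2 has been evicted by the time it is requested again.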

 Indexing and Hashing


 Introduction
 Basic Concepts
 Ordered Indexes-Clustered / Unclustered
 Multi-level Indexes
 Index Update: Deletion / Insertion
 B+-Tree Index Files, B-Tree Index Files
 Example of a B+-tree - Insert / Delete
 Static Hashing / Dynamic Hashing
 Example of Hash Index
 Hashing vs. Other Schemes
 Grid Files
 Bitmap Indices
 Index Definition in SQL


Query Processing
Basic Steps in Query Processing – an overview
Measures of Query Cost
Query Processing- Several algorithms
Selection Operation
Join Operation: different algorithms to implement joins
Nested-loop join - Block nested-loop join
Indexed nested-loop join
Other Operations – an overview
Query Optimization using Heuristics
Query tree, Graph tree
Transformation of Relational Expressions
Equivalence Rules
Pushing Selections, Join Ordering, etc.
Choice of Evaluation Plans
Structure of Query Optimizers
Evaluation of Expressions – Materialization, Pipelining
Statistics for Cost Estimation
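
The three join algorithms named on this slide can be compared with a short sketch. Relations are modelled as Python lists of tuples, and the "index" is a dictionary built on the fly; in a real system the index would already exist and the cost would be measured in block transfers. The relation names and sample data are hypothetical.

```python
def nested_loop_join(r, s, key_r, key_s):
    """Tuple-at-a-time nested-loop join: scans s once per tuple of r."""
    return [(tr, ts) for tr in r for ts in s if tr[key_r] == ts[key_s]]


def block_nested_loop_join(r, s, key_r, key_s, block_size):
    """Block nested-loop join: scans s once per block of r."""
    out = []
    for start in range(0, len(r), block_size):
        block = r[start:start + block_size]
        for ts in s:
            for tr in block:
                if tr[key_r] == ts[key_s]:
                    out.append((tr, ts))
    return out


def indexed_nested_loop_join(r, s, key_r, key_s):
    """Indexed nested-loop join: probes an index on s for each tuple of r."""
    index = {}
    for ts in s:
        index.setdefault(ts[key_s], []).append(ts)
    return [(tr, ts) for tr in r for ts in index.get(tr[key_r], [])]


students = [(1, "Ada"), (2, "Edgar"), (3, "Grace")]
enrolled = [(1, "DB"), (3, "OS"), (3, "DB")]
a = nested_loop_join(students, enrolled, 0, 0)
b = block_nested_loop_join(students, enrolled, 0, 0, block_size=2)
c = indexed_nested_loop_join(students, enrolled, 0, 0)
```

All three return the same set of matching pairs; they differ in how often the inner relation is scanned, which is what the measures of query cost capture.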


Transaction and concurrency control


 Transaction Fundamentals
 Transaction Concept
 Transaction - ACID properties
 Transaction States
 Concurrent Executions – Schedules
 Scheduling Transactions – Serializability
 Serializability - Conflict Serializability
 Testing for conflict Serializability - Precedence graph
 Serializability - View Serializability
 Recoverability: Why is recovery needed? Cascading rollback
 Concurrency Control – an overview
 Levels of Consistency - Levels of Consistency in SQL-92
 Transaction Definition in SQL
 Database Concurrency Control Techniques
 Purpose of Concurrency Control
 Lock-Based Protocols - Pitfalls and serializability issues
 The Two-Phase Locking (2PL) Protocol
 Timestamp-Based Protocols - Recoverability and Cascade Freedom
 Deadlock Handling
 Deadlock Prevention Strategies
 Deadlock Detection – graph based strategy, Deadlock Recovery
 Locking and Insert, Delete Operations
 Other protocols and schemes - an overview
 Graph-Based Protocols
 Validation-Based Protocol
 Granularity of data items, Intention Lock Modes
 Multi-version Schemes, Index Locking Protocol
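
The precedence-graph test for conflict serializability listed under Transaction Fundamentals can be sketched as follows. A schedule is represented as a list of (transaction, action, item) triples; this representation and the function names are assumptions made for illustration.

```python
def precedence_graph(schedule):
    """Build edges Ti -> Tj for conflicting operations where Ti's comes first.

    schedule is a list of (transaction, action, item), action being 'R' or 'W'.
    """
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges


def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph is acyclic."""
    edges = precedence_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    # Repeatedly remove nodes that have no incoming edge (topological sort).
    while nodes:
        free = {n for n in nodes if not any(v == n for _, v in edges)}
        if not free:
            return False  # every remaining node has a predecessor: a cycle exists
        nodes -= free
        edges = {(u, v) for u, v in edges if u not in free}
    return True


serial_like = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
```

The first schedule yields the single edge T1 -> T2 and is serializable; the second yields both T1 -> T2 and T2 -> T1, so its graph has a cycle and the test fails.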


Database Recovery
 Database Recovery – an overview
 Failure Classification
 Algorithms/techniques and Storage Structures

 Data Access

 Recovery and Atomicity

 Log-Based Recovery
 Deferred Database Modification
 Immediate Database Modification

 Checkpoints – an overview
 Checkpoints recovery steps - example
 Recovery With Concurrent Transactions

 Buffer Management - Log Record Buffering

 Failure with Loss of Nonvolatile Storage

 Shadow Paging
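
As a rough illustration of log-based recovery with immediate database modification, the sketch below undoes the writes of uncommitted transactions and redoes those of committed ones. It assumes serial execution, no checkpoints, and a hypothetical log format of ("start", T), ("write", T, item, old, new) and ("commit", T) records.

```python
def recover(log, db):
    """Redo committed transactions and undo uncommitted ones.

    db is a dict holding the possibly inconsistent state found after a crash.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Undo pass: scan backward, restoring old values of uncommitted writes.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old, _ = rec
            db[item] = old
    # Redo pass: scan forward, reapplying new values of committed writes.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _, new = rec
            db[item] = new
    return db


log = [
    ("start", "T1"),
    ("write", "T1", "A", 100, 50),
    ("write", "T1", "B", 200, 250),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "C", 700, 600),
]
# State on disk at crash time: T1's write to B had not reached disk,
# while T2's uncommitted write to C had.
crashed = {"A": 50, "B": 200, "C": 600}
recovered = recover(log, dict(crashed))
```

After recovery, T1's effects are fully present and T2's are fully absent, which is the atomicity guarantee the slide pairs with recovery.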

Distributed database system related issues

Distributed Databases “Fundamentals”


 Distributed Database concepts
 Distributed Database System – an overview
 Data Distribution – Advantages and benefits
 Types of Distributed Databases
 Heterogeneous and Homogeneous Databases
 Distributed DBMS Architectures
 Distributed Data Storage
 Data Replication
 Data Fragmentation
 Distributed Catalog Management
 Data transparency - Naming of Data Items
 Transparency and updates


Distributed Transactions and Concurrency control


 Distributed Transactions- Commit Protocols
 Distributed Transactions - Overview
 System Failure Modes
 Commit Protocols - Two Phase Commit Protocol (2PC)
 Phase 1: Obtaining a Decision
 Phase 2: Recording the Decision
 Handling of Failures
 Recovery and Concurrency Control
 Alternative Models - Persistent messaging systems
 Error Conditions with Persistent Messaging
 Persistent Messaging and Workflows
 Implementation of Persistent Messaging
 Annex - Three Phase Commit (3PC), Transactional Workflows
 Distributed Databases Concurrency Control
 Concurrency Control – an overview
 Centralized: Single-Lock-Manager Approach
 Distributed Lock Manager
 Primary copy
 Majority protocol
 Biased protocol
 Quorum consensus
 Time-stamping
 Replication with Weak Consistency
 Multi-master Lazy Replication
 Deadlock Handling
 Prevention strategies
 Centralized Approach
 Distributed Query Processing
 Simple Join Processing / Possible strategies, Semijoin Strategy
 Join Strategies that Exploit Parallelism
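
The two phases of the Two Phase Commit Protocol (2PC) listed above can be sketched as a small simulation. It covers only the failure-free message flow plus a site voting to abort; timeouts, coordinator failure and recovery from the logs are left out. The Participant class and the function name are hypothetical.

```python
class Participant:
    """A site taking part in the distributed transaction (hypothetical class)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []

    def prepare(self):
        # Phase 1: write the vote to the local log, then send it.
        vote = "ready" if self.can_commit else "abort"
        self.log.append(vote)
        return vote

    def finish(self, decision):
        # Phase 2: record the coordinator's decision locally.
        self.log.append(decision)


def two_phase_commit(participants):
    coordinator_log = ["prepare"]
    # Phase 1: obtaining a decision. Commit only if every site votes ready.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    # Phase 2: recording the decision, then informing every site.
    coordinator_log.append(decision)
    for p in participants:
        p.finish(decision)
    return decision, coordinator_log


sites_ok = [Participant("S1"), Participant("S2")]
sites_bad = [Participant("S1"), Participant("S2", can_commit=False)]
ok_decision, _ = two_phase_commit(sites_ok)
bad_decision, _ = two_phase_commit(sites_bad)
```

A single abort vote is enough to abort the whole transaction, and every site, including those that voted ready, records the same final decision.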

Heterogeneous Distributed Databases


 Motivation, Database Integration
 DataBase Management Systems “DBMS”– An overview
 Old-School Approaches
 Database integration - Problem Dimensions
 Important Types of Integrated Database Systems
 Multidatabases
 Federated Databases Schema Architecture
 Handling Integration Conflicts
 A Generic Federated DBMS Architecture
 Wrapper- Architecture & Tasks, Example
 Schema Integration
 Query Processing
 Web as a Loosely Coupled Federated Database
 The Mediator - Mediator Systems
 Data Warehousing

Database System Architectures
 Centralized Systems

 Client-Server Systems
 Transaction Servers
 Data Servers

 Parallel DBMS
 Interconnection Network Architectures
 Architecture Issue: Shared What?
 Parallel DBMS Techniques and different Types
 Data Partitioning
 Parallel processing

Object-Oriented Databases (1/2)
 Motivation
 Introduction – Motivating Example
 Why Object Databases “ODBs”?
 Need for Complex Data Types
 Engineering Database Design-overview
 Database Design Process
 Logical/Physical database design
 Object-oriented concepts
 ODBs are more Natural & Direct
 Object-oriented terminologies – an overview
 Investigation and analysis - RDBs vs. ORDBs, RDBs vs. ODBs, etc.
 The Object-Oriented Data Model - An example of a class in UML
 Object-Oriented Data Modelling - rapid overview
 OO Data Modelling: Example

Object-Oriented Databases (2/2)
 Object Persistence - Introduction
 Persistent Programming Languages
 Specifying Object Persistence via Naming and Reachability
 Persistent Objects – Storage and Access issues
 Persistent C++ Systems
 Object Persistence – Using database
 The ODMG Standard for Object Databases
 OODBMS Features, OODB Evolution, Strategies for building OODBMS
 ODMG Object Database Standard, and Components
 An Architecture for OODBMS
 Object Model, Object Definition Language (ODL)
 Mapping Class Diagrams into ODL
 ODMG-OQL for querying the database
 Simple OQL Queries, Retrieving Objects – an example
 Database Entry Points
 Retrieving data from multiple objects
 Unnesting and Nesting Collections
 Objects network sample - OQL vs. SQL
 ODMG- Binding languages
 ODMG Types
 C++ ODL, OMT, OQL
 Persistent Java Systems
 Exercise Reconsidering the University schema…
 Object vs. Relational
 Seminar - ODB Design (Self-Study)
Spatial and Geographic Databases
 Fundamentals of GIS - Overview
 Spatial and Geographic Data(bases)
 Why Study GIS? What is GIS? What’s in a GIS?

 GIS vs. Other Systems - How GIS differs from Related Systems

 GIS System-Architecture and Components


 GIS Spatial Data Model
 GIS Spatial and Attribute Data
 How a GIS Organizes Spatial Data?
 Raster and Vector data Model - Spaghetti & Topologic Vector Data Model
 Representing Surfaces – DEM, TIN, Contour Lines (isolines)
 File Formats for Raster and Vector data models

 Spatial Database Management?


 Querying Data & Indexing of Spatial Data

 Sources of Geographic Data

Data Mining and Data Warehousing
 Motivation
 Evolution of Database Technology – overview
 Why Data Mining? — Potential Applications
 What Is Data Mining? Data Mining: A KDD Process
 Data Mining: On What Kind of Data?
 What is a Data Warehouse?
 Data Warehouse vs. other systems, OLTP vs. OLAP
 Conceptual Modeling of Data Warehouses
 Defining a Snowflake Schema in Data Mining Query Language (DMQL)
 Multi-Tiered Architecture - Approaches to Building OLAP Server
 Indexing OLAP Data: Bitmap Index
 Data Warehouse Back-End Tools and Utilities
 From OLAP to On-Line Analytical Mining (OLAM), An OLAM Architecture
 Data Mining Functionalities
 Are All the “Discovered” Patterns Interesting?
 Market-Basket Data: typical case
 Frequent Pairs in SQL, A-Priori Algorithm
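
The A-Priori idea for frequent pairs in market-basket data can be sketched in two passes. This version is written in Python rather than SQL, handles pairs only, and uses hypothetical baskets and a hypothetical support threshold.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Two-pass A-Priori restricted to pairs of items."""
    # Pass 1: count single items and keep the frequent ones.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    # Pass 2: count only pairs whose members are both frequent.
    pair_counts = Counter()
    for basket in baskets:
        candidates = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(candidates, 2))
    return {p: c for p, c in pair_counts.items() if c >= min_support}


baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]
result = frequent_pairs(baskets, min_support=2)
```

The pruning in pass 2 relies on the fact that a pair can be frequent only if both of its items are frequent, so infrequent items never generate candidate pairs.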
Database Security and Authorization
 Introduction to DB Security
 Access Controls
 Database Security and the DBA
 Discretionary Access Control
 The privileges at the account/relation levels
 Granting and revoking of relation privileges
 Views and Security
 Propagation of Privileges
 Role-Based Authorization
 Mandatory Access Control
 Access Control for Multilevel Security
 Multilevel Relations
 Discretionary Access Control vs. Mandatory Access Control
 Introduction to Statistical Database Security
Database Application Development
 Database Programming
 Embedded SQL
 Dynamic SQL
 Embedded SQL in Java

 Database APIs: Alternative to embedding


 Embedded SQL in Java using SQLJ - an example
 Database Stored Procedures
 SQL Persistent Stored Modules (SQL/PSM)

 Client-Server and Modern Database Architectures


 Client-Server Computing
 Two-Tier Architecture
 Multiple-Tier Architecture

 Active Database Concepts and Triggers – an introduction


 Generalized (ECA) Model for Active DB
 Database Triggers context
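
The Event-Condition-Action (ECA) model for active databases maps directly onto a trigger definition. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for the course's DBMS; the Account and Audit tables and the trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Account (id INTEGER PRIMARY KEY, balance REAL);
    CREATE TABLE Audit (account_id INTEGER, old_balance REAL, new_balance REAL);

    -- Event:     an update of Account.balance
    -- Condition: the new balance is negative
    -- Action:    record the change in Audit
    CREATE TRIGGER log_overdraft
    AFTER UPDATE OF balance ON Account
    WHEN NEW.balance < 0
    BEGIN
        INSERT INTO Audit VALUES (OLD.id, OLD.balance, NEW.balance);
    END;

    INSERT INTO Account VALUES (1, 100.0);
    UPDATE Account SET balance = 40.0 WHERE id = 1;   -- condition false, no action
    UPDATE Account SET balance = -10.0 WHERE id = 1;  -- condition true, action fires
    """
)
audit_rows = conn.execute("SELECT * FROM Audit").fetchall()
```

Only the second update satisfies the condition, so the Audit table ends up with a single row. Trigger syntax differs between systems, so the statement would need adapting for another DBMS.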
Mini-Projects, Recapitulation
 University Education Information System
 Object Oriented database design

 Relational / Object mapping

 Relational / Object Replication issues

 Private University distributed Information System


 Many relational sites, repository central site

 Database design issues

 Issues related to data sharing

 SQL queries for internet access

 Website implementing and using the SQL queries

