Professional Documents
Culture Documents
Savings account and customer records are kept in permanent system les.
Application programs are written to manipulate les to perform the following tasks:
1
• Find an account balance.
• Generate monthly statements.
2. Development of the system proceeds as follows:
Multiple users
2
• Every user of the system should be able to access only the data they are permitted to
see.
• E.g. payroll people only handle employee records, and cannot see customer
accounts; tellers only
access account data and cannot see payroll data.
• Difficult to enforce this with application programs.
Integrity problems
These problems and others led to the development of database management systems
3
Figure 1.1: The three levels of data abstraction
• Highest level.
• Describes part of the database for a particular group of users.
• Can be many different views of a database.
• E.g. tellers in a bank get a view of customer accounts, but not of pay roll data.
1. Data models are a collection of conceptual tools for describing data, data relationships,
data semantics and data constraints. There are three different groups:
4
Describe data at the conceptual and view levels.
• Entity-relationship model.
• Object-oriented model.
• Binary model.
• Semantic data model.
• Infological model.
• Functional data model.
2. At this point, we'll take a closer look at the entity-relationship (E-R) and object-
oriented models.
5
• Another essential element of the E-R diagram is the mapping cardinalities, which
express the number of entities to which another entity can be associated via a
relationship set.
We'll see later how well this model works to describe real world situations.
1. The object-oriented model is based on a collection of objects, like the E-R model.
6
• Under most data models, changing the interest rate entails changing code in
application programs.
• In the object-oriented model, this only entails a change within the pay-interest
method.
2. Unlike entities in the E-R model, each object has its own unique identity, independent
of the values it
contains:
7
name street city number
name balance
900 55
556 100000
647 105366
801 10533
8
• Data are represented by collections of records.
• Relationships among data are represented by links.
• Organization is that of an arbitrary graph.
• Figure 1.4 shows a sample network database that is the equivalent of the relational
database of Figure 1.3.
The Hierarchical Model
The relational model does not use pointers or links, but relates records by the values they
contain. This allows a formal mathematical foundation to be defined.
• Unifying model.
• Frame memory.
3. We will not cover physical models.
• Physical scheme
• Conceptual scheme
• Subscheme (can be many)
1.5 Data Independence
1. The ability to modify a scheme de nition in one level without a ecting a scheme de
nition in a higher level is called data independence.
• The ability to modify the physical scheme without causing application programs to
be rewritten
• Modi cations at this level are usually to improve performance
Logical data independence
• The ability to modify the conceptual scheme without causing application programs
to be rewritten
• Usually done when logical structure of database is altered
3. Logical data independence is harder to achieve as the application programs are usually
heavily dependent on the logical structure of the data. An analogy is made to abstract data
types in programming languages.
10
• deletion of information in the database
• modi cation of information in the database
2. A DML is a language which enables users to access and manipulate data.
procedural: the user speci es what data is needed and how to get it
2. Databases typically require lots of storage space (gigabytes). This must be stored on
disks. Data is moved between disk and main memory (MM) as needed.
3. The goal of the database system is to simplify and facilitate access to data.
Performance is important. Views provide simpli cation.
• Interaction with the file manager: Storing raw data on disk using the le system
usually provided by a conventional operating system. The database manager must
translate DML statements into low-level le system commands (for storing,
retrieving and updating data in the database).
• Integrity enforcement: Checking that updates in the database do not violate
consistency constraints (e.g. no bank account balance below $25) .
• Security enforcement: Ensuring that users only have access to information they are
permitted to see.
11
• Backup and recovery: Detecting failures due to power failure, disk crash, software
errors, etc., and restoring the database to its state before the failure.
• Concurrency control: Preserving data consistency when there are concurrent users.
5. Some small database systems may miss some of these features, resulting in simpler
database managers. (For example, no concurrency is required on a PC running MS-DOS.)
These features are necessary on larger systems.
• Scheme definition: the creation of the original database scheme. This involves
writing a set of definitions in a DDL (data storage and de nition language),
compiled by the DDL compiler into a set of tables stored in the data dictionary.
• Storage structure and access method definition: writing a set of definitions
translated by the data storage and de nition language compiler.
• Scheme and physical organization modification: writing a set of definitions used
by the DDL compiler to generate modi cations to appropriate internal system
tables (e.g. data dictionary). This is done rarely, but sometimes the database
scheme or physical organization must be modified.
• Granting of authorization for data access: granting di erent types of authorization
for data access to various users.
• Integrity constraint specification: generating integrity constraints. These are
consulted by the database manager module whenever updates occur.
1.10 Database Users
1. The database users fall into several categories:
Application programmers are computer professionals interacting with the system through
DML calls embedded in a program written in a host language (e.g. C, PL/1, Pascal).
12
• These are sometimes called fourth-generation languages.
• They often include features to help generate forms and display data.
Sophisticated users interact with the system without writing programs.
Naive users are unsophisticated users who interact with the system by using permanent
application programs (e.g. automated teller machine).
2. Components include:
File manager manages allocation of disk space and data structures used to represent
information on disk.
13
Figure 1.6: Database system structure.
Database manager: The interface between low-level data and application programs and
queries.
Query processor translates statements in a query language into low-level instructions the
database manager understands. (May also attempt to nd an equivalent but more ecient
form.)
DDL compiler converts DDL statements to a set of tables containing metadata stored in a
data dictionary.
14
In addition, several data structures are required for physical system implementation:
The E-R (entity-relationship) data model views the real world as a set of basic objects
(entities) and relationships among these objects.
It is intended primarily for the DB design process by allowing the speci cation of an
enterprise scheme.
• branch, the set of all branches of a particular bank. Each branch is described by the
attributes branch-name, branch-city and assets.
• customer, the set of all people having an account at the bank. Attributes are
customer-name, S.I.N., street and customer-city.
• employee, with attributes employee-name and phone-number.
• account, the set of all accounts created and maintained in the bank. Attributes are
account number and balance.
• transaction, the set of all account transactions executed in the bank. Attributes are
transaction number, date and amount
2.2 Relationships & Relationship Sets
• A relationship is an association between several entities.
A relationship set is a set of relationships of the same type.
Formally it is a mathematical relation on n 2 (possibly non-distinct) sets.
If E1;E2....En are entity sets, then a relationship set R is a subset of
if{(e1,e2. . . .. en) |e1∈ E1; e2 ∈ E2; . . . ∈ En}
where (e1; e2; . . .; en) is a relationship.
For example, consider the two entity sets customer and account. (Fig. 2.1 in the
text). We de ne the relationship CustAcct to denote the association between
customers and their accounts. This is a binary relationship set (see Figure 2.2 in
the text).
Going back to our formal de nition, the relationship set CustAcct is a subset of all
the possible customer and account pairings.
This is a binary relationship. Occasionally there are relationships involving more
than two entity sets.
The role of an entity is the function it plays in a relationship. For example, the
relationship works-for could be ordered pairs of employee entities. The rst
16
employee takes the role of manager, and the second one will take the role of
worker.
A relationship may also have descriptive attributes. For example, date (last date of
account access) could be an attribute of the CustAcct relationship set.
2.3 Attributes
It is possible to de ne a set of entities and the relationships among them in a number of
different ways. The main diffrerence is in how we deal with attributes.
• Consider the entity set employee with attributes employee-name and phone-
number.
• We could argue that the phone be treated as an entity itself, with attributes phone-
number and location.
• Then we have two entity sets, and the relationship set EmpPhn de ning the
association between employees and their phones.
• This new de nition allows employees to have several (or zero) phones.
• New de nition may more accurately reflect the real world.
• We cannot extend this argument easily to making employee-name an entity.
The question of what constitutes an entity and what constitutes an attribute depends
mainly on the structure of the real world situation being modeled, and the semantics
associated with the attribute in question.
• The entity set transaction has attributes transaction-number, date and amount.
• Different transactions on di erent accounts could share the same number.
• These are not suficient to form a primary key (uniquely identify a transaction).
• Thus transaction is a weak entity set.
For a weak entity set to be meaningful, it must be part of a one-to-many
relationship set. This relationship set should have no descriptive attributes. (Why?)
18
The idea of strong and weak entity sets is related to the existence dependencies
seen earlier.
• Member of a strong entity set is a dominant entity.
• Member of a weak entity set is a subordinate entity.
A weak entity set does not have a primary key, but we need a means of
distinguishing among the entities. The discriminator of a weak entity set is a set of
attributes that allows this distinction to be made.
The primary key of a weak entity set is formed by taking the primary key of the
strong entity set on which its existence depends (see Mapping Constraints) plus its
discriminator.To illustrate:
• transaction is a weak entity. It is existence-dependent on account.
• The primary key of account is account-number.
• transaction-number distinguishes transaction entities within the same account (and
is thus the discriminator).
• So the primary key for transaction would be (account-number, transaction-
number).
Just Remember: The primary key of a weak entity is found by taking the primary
key of the strong entity on which it is existence-dependent, plus the discriminator
of the weak entity set.
2.6 Primary Keys for Relationship Sets
The attributes of a relationship set are the attributes that comprise the primary keys of the
entity sets involved in the relationship set.
For example:
19
Figure 2.7:An E-Rdiagram
20
sets. Figures 2.9 to 2.10 show some examples.
Go back and review mapping cardinalities. They express the number of entities to which
an entity can be
associated via a relationship.
The arrow positioning is simple once you get it straight in your mind, so do some
examples. Think of the arrow
head as pointing to the entity that \one" refers to.
21
Numbers instead of arrowheads indicating cardinality.
- Symbols, 1, n and m used.
- E.g. 1 to 1, 1 to n, n to m.
- Easier to understand than arrowheads.
A range of numbers indicating optionality of relationship. (See Elmasri &
Navathe, p 58.)
- E.g (0,1) indicates minimum zero (optional), maximum 1.
- Can also use (0,n), (1,1) or (1,n).
- Typically used on near end of link - confusing at rst, but gives more information.
- E.g. entity 1 (0,1) || (1,n) entity 2 indicates that entity 1 is related to between 0
and 1 occurrences of entity 2 (optional).
- Entity 2 is related to at least 1 and possibly many occurrences of entity 1
(mandatory).
22
& Navathe, chapter 21.)
- Composite attributes.
- Derived attributes.
- Subclasses and superclasses.
- Generalization and specialization.
23
Figure 2.13:E-R diagram with aternary relationship
This E-R diagram says that a customer may have several accounts, each located in a speci
c bank branch, and that an account may belong to several di erent customers.
For each entity set and relationship set, there is a unique table which is assigned the
name of the corresponding set. Each table has a number of columns with unique names.
(E.g. Figs. 2.14 - 2.18 in the text).
24
2.9.1 Representation of Strong Entity Sets
We use a table with one column for each attribute of the set. Each row in the table
corresponds to one entity of the entity set. For the entity set account, see the table of gure
2.14.We can add, delete and modify rows (to reflect changes in the real world).
D1 × D2 or ×2 i+1 Di
for the account table, where D1 and D2 denote the set of all account numbers and all
account balances, respectively. In general, for a table of n columns, we may denote the
cartesian product of D1,D2, . . .,Dn by
×n i+1 Di
For example, the weak entity set transaction has three attributes: transaction-number, date
and amount. The primary key of account (on which transaction depends) is account-
number. This gives us the table of gure 2.16.
25
m
U primary-key(Ei)U{a1,a2,...ak}
i=1
An example:
The relationship set CustAcct involves the entity sets customer and account.
Their respective primary keys are S.I.N. and account-number.
CustAcct also has a descriptive attribute, date.
This gives us the table of gure 2.17.
Non-binary Relationship Sets
The ternary relationship of Figure 2.13 gives us the table of figure 2.18. As required, we
take the primary keys of each entity set. There are no descriptive attributes in this
example.
These relationship sets are many-to-one, and have no descriptive attributes. The primary
key of the weak entity set is the primary key of the strong entity set it is existence-
dependent on, plus its discriminator. The table for the relationship set would have the
same attributes, and is thus redundant.
2.10 Generalization
Consider extending the entity set account by classifying accounts as being either savings-
account or chequing-account.
26
Each of these is described by the attributes of account plus additional attributes. (savings
has interest-rate and chequing has overdraft-amount.)
We can express the similarities between the entity sets by generalization. This is the
process of forming containment relationships between a higher-level entity set and one
or more lower-level entity sets. In E-R diagrams, generalization is shown by a triangle, as
shown in Figure 2.19.
Consider a DB with information about employees who work on a particular project and
use a number of machines doing that work. We get the E-R diagram shown in Figure
2.20.Relationship sets work and uses could be combined into a single set. However, they
shouldn't be, as this would obscure the logical structure of this scheme.
The solution is to use aggregation.
27
An abstraction through which relationships are treated as higher-level entities.
For our example, we treat the relationship set work and the entity sets employee
and project as a higher-level entity set called work.
Figure 2.20 shows the E-R diagram with aggregation.
Figure 2.20:E-R diagram with redundant relationships
28
Transforming an E-R diagram with aggregation into tabular form is easy.We createa table
for each entity and relationship set as before.
The table for relationship set uses contains a column for each attribute in the primary key
of machinery and work.
29
In Figure ??, relationship between a customer and account can be made only if
there is a corresponding branch.
In Figure ??, an account can be related to either a customer or a branch alone.
The design of figure ?? is more appropriate, as in the banking world we expect to
have an account relate to both a customer and a branch.
The design of figure is more appropriate, as in the banking world we expect to have an
account relate to both a customer and a branch.
30
Strong entity sets and their dependent weak entity sets may be regarded as a single
\object" in the database,as weak entities are existence-dependent on a strong
entity.
It is possible to treat an aggregated entity set as a single unit without concern for
its inner structure details.
Generalization contributes to modularity by allowing common attributes of similar
entity sets to be repre-sented in one place in an E-R diagram. Excessive use of the
features can introduce unnecessary complexity into the design.
*********
31