



4. Database Management Technology
Data:
Data are values of qualitative or quantitative variables, belonging to a set of items. Data
in computing (or data processing) are represented in a structure, often tabular (represented by
rows and columns), a tree (a set of nodes with parent-children relationship) or a graph structure
(a set of interconnected nodes). Data are typically the results of measurements and can
be visualized using graphs or images. Data as an abstract concept can be viewed as the lowest
level of abstraction from which information and then knowledge are derived. Raw data, i.e.,
unprocessed data, refers to a collection of numbers and characters. "Raw" is a relative term: data
processing commonly occurs in stages, and the "processed data" from one stage may be
considered the "raw data" of the next. Field data refers to raw data collected in an uncontrolled in
situ environment. Experimental data refers to data generated within the context of a scientific
investigation by observation and recording.

Information:
Information, in its most restricted technical sense, is a sequence of symbols that can be
interpreted as a message. Information can be recorded as signs, or transmitted as signals.
Information is any kind of event that affects the state of a dynamic system that can interpret the
information.
Conceptually, information is the message being conveyed. Therefore, in a general sense,
information is "knowledge communicated or received concerning a particular fact or
circumstance"; put another way, information is an answer to a question. Information cannot be
predicted and resolves uncertainty. The uncertainty of an event is determined by its probability of
occurrence: the less probable the event, the greater the uncertainty. The more uncertain an event, the more
information is required to resolve the uncertainty of that event. The amount of information is
measured in bits.
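
The link between probability and information in bits can be made concrete with Shannon's self-information, I(x) = -log2 p(x). The following is a minimal sketch; the function name self_information_bits is chosen here for illustration and does not come from the text.

```python
import math

def self_information_bits(probability: float) -> float:
    """Return the information, in bits, gained by observing an event
    that occurs with the given probability (0 < probability <= 1)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)

# A fair coin flip (p = 0.5) carries 1 bit; a rarer event carries more.
print(self_information_bits(0.5))    # 1.0
print(self_information_bits(0.125))  # 3.0
```

An event that is certain (probability 1) carries no information, which matches the statement that information resolves uncertainty.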








Data vs Information

Meaning
Data: Data is raw, unorganized facts that need to be processed. Data can be something simple and seemingly random and useless until it is organized.
Information: When data is processed, organized, structured or presented in a given context so as to make it useful, it is called information.

Example
Data: Each student's test score is one piece of data.
Information: The class's average score or the school's average score is the information that can be concluded from the given data.

Definition
Data: From the Latin 'datum', meaning "that which is given". Data was the plural form of the singular datum. (M150 adopts the general use of data as singular. Not everyone agrees.)
Information: Information is interpreted data.

Data Hierarchy
Data Hierarchy refers to the systematic organization of data, often in a hierarchical form. Data
organization involves fields, records, files and so on.
A data field holds a single fact or attribute of an entity. Consider a date field, e.g. "September 19,
2004". This can be treated as a single date field (e.g. birthdate), or 3 fields, namely, month, day
of month and year.
A record is a collection of related fields. An Employee record may contain a name field(s),
address fields, birthdate field and so on.


A file is a collection of related records. If there are 100 employees, then each employee would
have a record (e.g. called Employee Personal Details record) and the collection of 100 such
records would constitute a file (in this case, called Employee Personal Details file).
Files are integrated into a database. This is done using a Database Management System. If there
are other facets of employee data that we wish to capture, then other files such as Employee
Training History file and Employee Work History file could be created as well.
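
The field, record, file and database levels described above can be sketched in a few lines of Python. This is only an illustration: the class name EmployeeRecord, the addresses and the second employee are made-up values, and a real DBMS stores and manages files very differently from an in-memory dictionary.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class EmployeeRecord:
    """A record: a collection of related fields about one employee."""
    emp_name: str     # field
    address: str      # field
    birthdate: date   # one date field (could be split into month/day/year)

# A file: a collection of related records (illustrative values).
employee_personal_details_file = [
    EmployeeRecord("Jeffrey Tan", "12 Example Road", date(1980, 9, 19)),
    EmployeeRecord("Mary Lim", "34 Sample Street", date(1975, 1, 5)),
]

# A database: several related files managed together.
database = {
    "employee_personal_details": employee_personal_details_file,
    "employee_training_history": [],  # further files would be filled in here
    "employee_work_history": [],
}

print(len(database["employee_personal_details"]))  # 2
```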
An illustration of the above description is shown in this diagram below.

The following terms are given for clarity, with reference to the example in the diagram above:
Data field label = Employee Name or EMP_NAME
Data field value = Jeffrey Tan
The above description is a view of data as understood by a user, e.g., a person working in the Human
Resources Department.


The above structure can be seen in the hierarchical model, which is one way to organize data in a
database.
[3]

In terms of data storage, data fields are made of bytes and these in turn are made up of bits.
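
As a small illustration of fields being made of bytes and bytes of bits, the sketch below encodes the field value from the example above. It assumes UTF-8 encoding, which the text does not specify.

```python
field_value = "Jeffrey Tan"

# Encode the field value into bytes (UTF-8 is an assumption made here).
encoded = field_value.encode("utf-8")
num_bytes = len(encoded)
num_bits = num_bytes * 8  # each byte is made up of 8 bits

# Show the bit pattern of the first two bytes ("J" and "e").
bit_string = " ".join(f"{byte:08b}" for byte in encoded[:2])

print(num_bytes, num_bits)  # 11 88
print(bit_string)           # 01001010 01100101
```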

Database
A database is an organized collection of data. The data are typically organized to model relevant
aspects of reality in a way that supports processes requiring this information. For example,
modeling the availability of rooms in hotels in a way that supports finding a hotel with
vacancies.
Database management systems (DBMSs) are specially designed applications that interact with
the user, other applications, and the database itself to capture and analyze data. A general-
purpose database management system (DBMS) is a software system designed to allow the
definition, creation, querying, update, and administration of databases.
Well-known DBMSs include MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Microsoft
Access, and Oracle. A database is not generally portable across different DBMSs, but different
DBMSs can interoperate by using standards such as SQL and ODBC or JDBC to allow a single
application to work with more than one database.
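
The hotel example can be tried out with SQLite through Python's built-in sqlite3 module. This is a minimal sketch: the table name room, its columns and the sample rows are invented for illustration.

```python
import sqlite3

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Definition and creation: model room availability as a table.
conn.execute("CREATE TABLE room (hotel TEXT, room_no INTEGER, is_vacant INTEGER)")

# Update: capture some (made-up) data.
conn.executemany(
    "INSERT INTO room VALUES (?, ?, ?)",
    [("Seaview", 101, 0), ("Seaview", 102, 1), ("Central", 201, 0)],
)

# Querying: find hotels that have at least one vacant room.
hotels_with_vacancies = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT hotel FROM room WHERE is_vacant = 1 ORDER BY hotel"
    )
]
print(hotels_with_vacancies)  # ['Seaview']
conn.close()
```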

File Organization & Access Method
File Access Method
The way in which information/data can be retrieved. There are two methods of file
access:
1. Direct Access
2. Sequential Access
Direct Access
With this access method, the information/data stored on a device can be accessed randomly
and immediately, irrespective of the order in which it was stored. Retrieving data with this access method
is quicker than with sequential access. This is also known as the random access method.
Examples: hard disk, flash memory.




Sequential Access
With this access method, the information/data stored on a device is accessed in the exact order
in which it was stored. Sequential access methods are seen in older storage devices such
as magnetic tape.
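
The difference between the two access methods can be imitated on an ordinary file. The sketch below assumes fixed-length records of 16 bytes (an assumption made for the example) and uses hypothetical helper names: read_sequential reads every record up to the wanted one, while read_direct jumps to it with seek().

```python
import os
import tempfile

RECORD_SIZE = 16  # every record occupies exactly 16 bytes (assumed layout)

def write_records(path, names):
    """Write each name as one fixed-length, space-padded record."""
    with open(path, "wb") as f:
        for name in names:
            f.write(name.encode("ascii").ljust(RECORD_SIZE))

def read_sequential(path, index):
    """Read records one after another until the wanted one is reached."""
    with open(path, "rb") as f:
        for _ in range(index + 1):
            record = f.read(RECORD_SIZE)
    return record.decode("ascii").rstrip()

def read_direct(path, index):
    """Jump straight to the record's byte offset, skipping earlier records."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE).decode("ascii").rstrip()

fd, path = tempfile.mkstemp()
os.close(fd)
write_records(path, ["Ann", "Bob", "Cho", "Dev"])
sequential_result = read_sequential(path, 2)
direct_result = read_direct(path, 2)
os.remove(path)

print(sequential_result, direct_result)  # Cho Cho
```

Both calls return the same record; the direct version does so without touching the records stored before it.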
File Organization Method
The way in which data/information is stored so that file access can be as easy
and quick as possible. There are three main ways of organizing a file:
1. Sequential
2. Index-Sequential
3. Random
Sequential file organization
All records are stored in some sort of order (ascending, descending, alphabetical). The
order is based on a field in the record. For example, take a file holding records made up of
employee ID, date of birth and address. The employee ID is used as the key, and the records are stored
in order of that key (ascending or descending). Can be used with both direct and sequential
access.

Index-Sequential organization
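The records are stored in some order, but there is a second file, called the index file, that
indicates exactly where certain key points in the data file are located. Because a lookup jumps to the
position given by the index, this organization needs a direct-access device; it cannot be used with the
sequential access method.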
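
One way to picture index-sequential organization is a sorted data file split into blocks, plus an index holding the first key of each block. The sketch below keeps both in memory for simplicity; the block size, the employee IDs and the names are invented for illustration.

```python
import bisect

# Data file: records sorted by employee ID, grouped into blocks of three.
blocks = [
    [(101, "Ann"), (104, "Bob"), (107, "Cho")],
    [(110, "Dev"), (115, "Eve"), (121, "Fay")],
    [(130, "Gus"), (142, "Hal"), (150, "Ida")],
]

# Index file: the first key stored in each block.
index_keys = [block[0][0] for block in blocks]  # [101, 110, 130]

def lookup(emp_id):
    """Use the index to pick one block, then scan only that block."""
    # Find the last block whose first key is <= emp_id.
    block_no = bisect.bisect_right(index_keys, emp_id) - 1
    if block_no < 0:
        return None  # smaller than every key in the file
    for key, name in blocks[block_no]:
        if key == emp_id:
            return name
    return None

print(lookup(115))  # Eve
print(lookup(999))  # None
```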

Random file organization
The records are stored in no particular order, but each record has its own specific position on the disk
(address). With this method, no time is wasted searching for a record. Instead, the system jumps
to the exact position and accesses the data/information. Can only be used with the direct
access method.
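
The text does not say how a record's address is chosen. A common technique is hashing the record key, and the sketch below assumes that approach: the address is the employee ID modulo the number of slots, with linear probing when two keys collide. All names and values are illustrative, and the store function assumes the file never becomes completely full.

```python
NUM_SLOTS = 7
slots = [None] * NUM_SLOTS  # stands in for fixed positions on the disk

def address_of(emp_id):
    """Compute the home address of a record from its key."""
    return emp_id % NUM_SLOTS

def store(emp_id, name):
    position = address_of(emp_id)
    # Linear probing: step forward until a free slot or the same key is found.
    while slots[position] is not None and slots[position][0] != emp_id:
        position = (position + 1) % NUM_SLOTS
    slots[position] = (emp_id, name)

def fetch(emp_id):
    position = address_of(emp_id)
    for _ in range(NUM_SLOTS):
        entry = slots[position]
        if entry is None:
            return None  # an empty slot means the key was never stored
        if entry[0] == emp_id:
            return entry[1]
        position = (position + 1) % NUM_SLOTS
    return None

store(101, "Ann")
store(104, "Bob")
store(108, "Cho")  # collides with 101 and moves to the next slot

print(fetch(108))  # Cho
print(fetch(115))  # None
```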






Comparison of Traditional File-Based Approach and Database Approach

To begin, you should understand the rationale for replacing the traditional file-based
system with the database system.
File-based System
File-based systems were an early attempt to computerize the manual filing system. A file-based
system is a collection of application programs that perform services for the end users. Each
program defines and manages its own data.
However, five types of problem occur when using the file-based approach:

Separation and isolation of data
When data is isolated in separate files, it is more difficult for us to access data that should be
available. The application programmer is required to synchronize the processing of two or more
files to ensure the correct data is extracted.

Duplication of data
When the decentralized file-based approach is employed, uncontrolled duplication of data
occurs. Uncontrolled duplication of data is undesirable because:

i. Duplication is wasteful
ii. Duplication can lead to loss of data integrity

Data dependence
In a file-based system, the physical structure and storage of the data files and records are
defined in the application program code. This characteristic is known as program-data
dependence. Making changes to an existing structure is rather difficult and requires
modification of the programs. Such maintenance activities are time-consuming and subject to error.
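
Program-data dependence can be seen in a small sketch where the record layout is written into the program itself. The layout used here (a 4-byte integer ID followed by a 10-byte name) and the function names are assumptions made for the example.

```python
import struct

# The record layout is hard-coded in the program:
# a 4-byte little-endian integer ID followed by a 10-byte name.
RECORD_FORMAT = "<i10s"

def pack_employee(emp_id, name):
    return struct.pack(RECORD_FORMAT, emp_id, name.encode("ascii"))

def unpack_employee(raw):
    emp_id, name = struct.unpack(RECORD_FORMAT, raw)
    return emp_id, name.rstrip(b"\x00").decode("ascii")

raw_record = pack_employee(7, "Tan")
print(unpack_employee(raw_record))  # (7, 'Tan')

# If the file layout changes (say the name grows to 20 bytes),
# a program that still embeds the old format can no longer read it.
new_layout_record = struct.pack("<i20s", 7, b"Tan")
try:
    unpack_employee(new_layout_record)
    old_program_still_works = True
except struct.error:
    old_program_still_works = False

print(old_program_still_works)  # False
```

Every program that reads the file would need the same change, which is why such maintenance is time-consuming and error-prone.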

Incompatible file formats
The structure of a file depends on the application programming language. A file
structure provided in one programming language, such as the direct file or indexed-sequential file
available in COBOL, may be different from the structure generated by
another programming language such as C. This direct incompatibility makes the files difficult to
process jointly.

Fixed queries / proliferation of application programs
File-based systems are very dependent upon the application programmer. Any required queries
or reports have to be written by the application programmer. Normally, only fixed-format queries or
reports can be supported, and no facility for ad-hoc queries is offered.
File-based systems also put tremendous pressure on data processing staff, with users
complaining about programs that are inadequate or inefficient in meeting their demands.
Documentation may be limited and maintenance of the system is difficult. Provision for
security, integrity and recovery capability is very limited.

Database Approach
In order to overcome the limitations of the file-based approach, the concept of the database and the
Database Management System (DBMS) emerged in the 1960s.

Advantages
Applying the database approach in an application system offers a number of advantages,
including:

Control of data redundancy
The database approach attempts to eliminate redundancy by integrating the files. Although
the database approach does not eliminate redundancy entirely, it controls the amount of
redundancy inherent in the database.

Data consistency
By eliminating or controlling redundancy, the database approach reduces the risk of
inconsistencies occurring. It ensures all copies of the data are kept consistent.




More information from the same amount of data
With the integration of the operational data in the database approach, it may be possible to derive
additional information from the same data.

Sharing of data
The database belongs to the entire organization and can be shared by all authorized users.

Improved data integrity
Database integrity refers to the validity and consistency of stored data. Integrity is usually
expressed in terms of constraints, which are consistency rules that the database is not permitted
to violate.
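
As an illustration of a constraint acting as a consistency rule, the sketch below uses SQLite through Python's sqlite3 module. The employee table and the rule that a salary may not be negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Constraints are declared with the table: a primary key, a NOT NULL rule,
# and a CHECK rule saying that a salary can never be negative.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " emp_name TEXT NOT NULL,"
    " salary REAL CHECK (salary >= 0))"
)
conn.execute("INSERT INTO employee VALUES (1, 'Jeffrey Tan', 3000)")

# The DBMS refuses a row that would violate the CHECK constraint.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Invalid Row', -50)")
    violation_rejected = False
except sqlite3.IntegrityError:
    violation_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(violation_rejected, row_count)  # True 1
conn.close()
```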

Improved security
The database approach protects the data from unauthorized users. Protection may take the
form of user names and passwords that identify the type of user and their access rights for operations
including retrieval, insertion, updating and deletion.

Enforcement of standards
The integration of the database enforces the necessary standards including data formats, naming
conventions, documentation standards, update procedures and access rules.

Economy of scale
Cost savings can be obtained by combining all of the organization's operational data into one database,
with applications working on one source of data.

Balance of conflicting requirements
By having a structural design in the database, conflicts between users or departments can be
resolved. Decisions will be based on the best use of resources for the organization as a whole
rather than for an individual entity.




Improved data accessibility and responsiveness
Because the data is integrated in the database approach, data access can cross departmental
boundaries. This feature provides more functionality and better services to the users.

Increased productivity
The database approach provides all the low-level file-handling routines. The provision of these
functions allows the programmer to concentrate more on the specific functionality required by
the users. The fourth-generation environment provided by the database can simplify the database
application development.

Improved maintenance
The database approach provides data independence. As a change of data structure in the database
need not affect the application programs, it simplifies database application maintenance.

Increased concurrency
The database can manage concurrent data access effectively. It ensures that users do not interfere with
one another in ways that would result in loss of information or loss of integrity.

Improved backup and recovery services
Modern database management systems provide facilities to minimize the amount of processing
that is lost following a failure by using the transaction approach.
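
The transaction approach can be sketched with SQLite: a group of updates either takes effect as a whole or is rolled back. The account table and the simulated failure are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

try:
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception escapes the block.
    with conn:
        conn.execute("UPDATE account SET balance = balance - 40 WHERE name = 'A'")
        # Simulated failure before the matching credit to account B.
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# The half-finished transfer was undone, so no money has disappeared.
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100, 'B': 0}
conn.close()
```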

Disadvantages
Although the database approach offers a large number of advantages, it is not without
challenges. Its disadvantages include the following:

Complexity
A database management system is an extremely complex piece of software. All parties must be
familiar with its functionality to take full advantage of it. Therefore, training for the
administrators, designers and users is required.



Size
The database management system consumes a substantial amount of main memory as well as a
large amount of disk space in order to run efficiently.

Cost of DBMS
A multi-user database management system may be very expensive. Even after the installation,
there is a high recurrent annual maintenance cost on the software.

Cost of conversion
When moving from a file-based system to a database system, the company has to bear
additional expenses for hardware acquisition and training.

Performance
As the database approach is to cater for many applications rather than exclusively for a particular
one, some applications may not run as fast as before.

Higher impact of a failure
The database approach increases the vulnerability of the system due to centralization. As all
users and applications rely on the availability of the database, the failure of any component can bring
operations to a halt and seriously affect services to the customer.

Entity Relationship
In software engineering, an entity-relationship model (ER model) is a data model for
describing a database in an abstract way. In the case of a relational database,
which stores data in tables, some of the data in these tables point to data in other tables. For
instance, your entry in the database could point to several entries, one for each of the phone numbers
that are yours. The ER model would say that you are an entity, and each phone number is an
entity, and the relationship between you and the phone numbers is 'has a phone number'.
Diagrams created to design these entities and relationships are called entity-relationship
diagrams or ER diagrams.
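
The phone number example can be written down as two relational tables, with a foreign key standing for the 'has a phone number' relationship. This is a sketch using SQLite; the table names, column names and phone numbers are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Entity: person.
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Entity: phone number. The person_id column points back to the owner
# and so represents the 'has a phone number' relationship.
conn.execute(
    "CREATE TABLE phone_number ("
    " number TEXT PRIMARY KEY,"
    " person_id INTEGER NOT NULL REFERENCES person(person_id))"
)

conn.execute("INSERT INTO person VALUES (1, 'You')")
conn.executemany(
    "INSERT INTO phone_number VALUES (?, ?)",
    [("555-0100", 1), ("555-0101", 1)],
)

# Follow the relationship: all phone numbers belonging to person 1.
numbers = [
    row[0]
    for row in conn.execute(
        "SELECT number FROM phone_number WHERE person_id = 1 ORDER BY number"
    )
]
print(numbers)  # ['555-0100', '555-0101']
conn.close()
```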


Using the three schema approach to software engineering, there are three levels of ER models
that may be developed.
Conceptual data model
This is the highest level ER model in that it contains the least granular detail but
establishes the overall scope of what is to be included within the model set. The
conceptual ER model normally defines master reference data entities that are commonly
used by the organization. Developing an enterprise-wide conceptual ER model is useful
to support documenting the data architecture for an organization.
A conceptual ER model may be used as the foundation for one or more logical data
models (see below). The purpose of the conceptual ER model is then to establish
structural metadata commonality for the master data entities between the set of logical
ER models. The conceptual data model may be used to form commonality relationships
between ER models as a basis for data model integration.
Logical data model
A logical ER model does not require a conceptual ER model, especially if the scope of
the logical ER model includes only the development of a distinct information system. The
logical ER model contains more detail than the conceptual ER model. In addition to
master data entities, operational and transactional data entities are now defined. The
details of each data entity are developed and the entity relationships between these data
entities are established. The logical ER model is however developed independent of
technology into which it will be implemented.
Physical model
One or more physical ER models may be developed from each logical ER model. The
physical ER model is normally developed to be instantiated as a database. Therefore,
each physical ER model must contain enough detail to produce a database and each
physical ER model is technology dependent since each database management system is
somewhat different.
The physical model is normally forward engineered to instantiate the structural metadata
into a database management system as relational database objects such as database
tables, database indexes such as unique key indexes, and database constraints such as
a foreign key constraint or a commonality constraint. The ER model is also normally
used to design modifications to the relational database objects and to maintain the
structural metadata of the database.
The first stage of information system design uses these models during the requirements
analysis to describe information needs or the type of information that is to be stored in
a database. The data modeling technique can be used to describe any ontology (i.e. an overview
and classifications of used terms and their relationships) for a certain area of interest. In the case
of the design of an information system that is based on a database, the conceptual data model is,
at a later stage (usually called logical design), mapped to a logical data model, such as
the relational model; this in turn is mapped to a physical model during physical design. Note
that sometimes, both of these phases are referred to as "physical design". The technique is also used in
database management systems.

Entity-relationship modeling
The building blocks: entities, relationships, and attributes
An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities
of a domain. When we speak of an entity, we normally speak of some aspect of the real world
which can be distinguished from other aspects of the real world.
[4]

An entity may be a physical object such as a house or a car, an event such as a house sale or a car
service, or a concept such as a customer transaction or order. Although the term entity is the one
most commonly used, following Chen we should really distinguish between an entity and an
entity-type. An entity-type is a category. An entity, strictly speaking, is an instance of a given
entity-type. There are usually many instances of an entity-type. Because the term entity-type is
somewhat cumbersome, most people tend to use the term entity as a synonym for this term.
Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical
theorem.
A relationship captures how entities are related to one another. Relationships can be thought of
as verbs, linking two or more nouns. Examples: an owns relationship between a company and a
computer, a supervises relationship between an employee and a department,
a performs relationship between an artist and a song, a proved relationship between a
mathematician and a theorem.


The model's linguistic aspect described above is utilized in the declarative database query
language ERROL, which mimics natural language constructs. ERROL's semantics and
implementation are based on reshaped relational algebra (RRA), a relational algebra which is
adapted to the entity-relationship model and captures its linguistic aspect.
Entities and relationships can both have attributes. Examples: an employee entity might have
a Social Security Number (SSN) attribute; the proved relationship may have a date attribute.
Every entity (unless it is a weak entity) must have a minimal set of uniquely identifying
attributes, which is called the entity's primary key.
Entity-relationship diagrams don't show single entities or single instances of relations. Rather,
they show entity sets and relationship sets. Example: a particular song is an entity. The collection
of all songs in a database is an entity set. The eaten relationship between a child and her lunch is
a single relationship. The set of all such child-lunch relationships in a database is a relationship
set. In other words, a relationship set corresponds to a relation in mathematics, while a
relationship corresponds to a member of the relation.
Certain cardinality constraints on relationship sets may be indicated as well.
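
The correspondence between a relationship set and a mathematical relation can be shown with plain Python sets. The sketch reuses the performs relationship between artists and songs; the artist and song names are placeholders, and the "at most one performer per song" rule is only an example of a cardinality constraint.

```python
from collections import Counter

songs = {"Song A", "Song B"}        # entity set
artists = {"Artist X", "Artist Y"}  # entity set

# Relationship set: a set of (artist, song) pairs, i.e. a mathematical relation.
performs = {
    ("Artist X", "Song A"),
    ("Artist X", "Song B"),
    ("Artist Y", "Song B"),
}

# Each member of the set is one relationship.
single_relationship = ("Artist X", "Song A")
print(single_relationship in performs)  # True

# The relation is a subset of the Cartesian product of the entity sets.
cartesian_product = {(a, s) for a in artists for s in songs}
print(performs <= cartesian_product)  # True

# A cardinality constraint such as "each song has at most one performer"
# can be checked by counting the pairs per song.
performers_per_song = Counter(song for _, song in performs)
at_most_one_performer = all(n <= 1 for n in performers_per_song.values())
print(at_most_one_performer)  # False, because Song B has two performers
```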
