Definition
A traditional file-based system is one in which data files are handled manually or by computer: records are updated, inserted, and deleted, and new files are added. File-based systems were developed as better alternatives to paper-based filing systems. By having files stored on computers, the data could be accessed more efficiently. It was common practice for larger companies to have each department look after its own data.
Cont
A group of related fields, such as the student's name, the course taken, the date, and the grade, comprises a record. A group of records of the same type is called a file. A group of related files makes up a database.
Record
A record describes an entity. An entity is a person, place, thing, or event on which we maintain information. An order is a typical entity in a sales order file, which maintains information on a firm's sales orders. Each characteristic or quality describing a particular entity is called an attribute. For example, order number, order date, order amount, item number, and item quantity would each be an attribute of the entity order. Every record in a file should contain at least one field that uniquely identifies instances of that record so that the record can be retrieved, updated, or sorted. This identifier field is called a key field.
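The record, attribute, and key field ideas above can be sketched in a few lines of Python. This is only an illustration: the class name `OrderRecord`, the dictionary `order_file`, and all sample values are assumptions made for the example, not part of the original material.

```python
from dataclasses import dataclass


@dataclass
class OrderRecord:
    """One record describing the entity 'order'; each field is an attribute."""
    order_number: int   # key field: uniquely identifies the record
    order_date: str
    order_amount: float
    item_number: str
    item_quantity: int


# A "file" is a group of records of the same type. Here it is indexed by key field.
order_file = {}


def insert(record):
    """Add a record, refusing duplicates of the key field."""
    if record.order_number in order_file:
        raise KeyError(f"order {record.order_number} already exists")
    order_file[record.order_number] = record


# Hypothetical sample data.
insert(OrderRecord(1002, "2024-01-16", 80.0, "B-03", 2))
insert(OrderRecord(1001, "2024-01-15", 250.0, "A-17", 5))

found = order_file[1002]          # retrieve by key field
sorted_keys = sorted(order_file)  # sort by key field
```

Because every record carries a unique `order_number`, retrieval, update, and sorting all reduce to operations on that one field.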
Data Hierarchy
Data hierarchy refers to the systematic organization of data, often in a hierarchical form. Data organization involves fields, records, files and so on.
Database: Student database (course file, financial file)
File: Course file (names of all students, courses, dates, grades)
Record: Name, course, date, grade
Field: Name or age of a person
Byte: 10101010
Bit: 0
The resulting confusion would make it difficult for companies to create customer relationship management, supply chain management, or enterprise systems that integrate data from different sources.
Such programming changes may cost millions of dollars to implement in programs that require the revised data.
Poor security
Because there is little control or management of data, access to and dissemination of information may be out of control. Management may have no way of knowing who is accessing or even making changes to the organization's data.
Examples of Databases
The following are examples of database applications:
Computerized library systems
Automated teller machines
Flight reservation systems
Computerized parts inventory systems
DBMS Features
A DBMS allows organizations to conveniently develop databases for various applications, with the work done by database administrators (DBAs) and other specialists. It allows different user application programs to concurrently access the same database. It typically supports query languages, which are high-level, dedicated database languages that considerably simplify writing database application programs. Database languages also simplify organizing the database as well as retrieving and presenting information from it.
Cont
A DBMS provides facilities for controlling data access, enforcing data integrity, managing concurrency control, recovering the database after failures and restoring it from backup files, as well as maintaining database security. It acts as an interface between the application programs and the data. It is a collection of interrelated files and a set of programs through which the users can access and modify these files.
For instance: When the application program calls for a data item, such as gross pay, the DBMS finds this item in the database and presents it to the application program.
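The gross pay request above can be sketched with Python's built-in `sqlite3` module standing in for the DBMS. The table name `payroll`, its columns, and the sample employee are assumptions made for illustration only.

```python
import sqlite3

# An in-memory SQLite database plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payroll (employee_id INTEGER PRIMARY KEY, name TEXT, gross_pay REAL)"
)
# Hypothetical sample row.
conn.execute("INSERT INTO payroll VALUES (?, ?, ?)", (1, "A. Rao", 4200.0))

# The application asks for one data item by name; it never touches the
# storage format. The DBMS locates the item and hands it back.
row = conn.execute(
    "SELECT gross_pay FROM payroll WHERE employee_id = ?", (1,)
).fetchone()
gross_pay = row[0]
```

The application code names the item it wants (`gross_pay`) and leaves the question of where and how it is stored to the DBMS.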
Cont
DBMSs are categorized according to their data structures or types. The DBMS accepts requests for data from an application program and instructs the operating system to transfer the appropriate data. When a DBMS is used, information systems can be changed more easily as the organization's information requirements change. New categories of data can be added to the database without disruption to the existing system.
Cont
Database servers are dedicated computers that hold the actual databases and run only the DBMS and related software. Database servers are usually multiprocessor computers, with generous memory and RAID disk arrays used for stable storage. (RAID, a redundant array of independent disks, is a storage technology that combines multiple disk drive components into a logical unit.) DBMSs are found at the heart of most database applications. Modern DBMSs typically rely on a standard operating system to provide these functions.
DBMS Components
A database management system has the following components:
DBMS engine: It accepts logical requests from various other DBMS subsystems, converts them into physical equivalents, and actually accesses the database and data dictionary as they exist on a storage device.
Data definition language: It is the formal language programmers use to specify the content and structure of the database. It defines each data element as it appears in the database before that data element is translated into the forms required by application programs.
Contd....
The most prominent data manipulation language is SQL (Structured Query Language). SQL is an interactive query language used to access data from databases. Sophisticated languages for managing database systems are called fourth-generation languages, or 4GLs for short. The information from a database can be presented in a variety of formats. Most DBMSs include a report writer program that enables you to output data in the form of a report. Many DBMSs also include a graphics component that enables you to output information in the form of graphs and charts.
Data Dictionary
The data dictionary stores definitions of data elements and data characteristics, such as usage, physical representation, ownership (who in the organization is responsible for maintaining the data), authorization, and security. Through these components the DBMS manipulates the data and provides an environment suitable for retrieving and storing database information.
Example
A single HR database serves multiple applications and helps the organization draw together all information for various applications:
Details of employees (name, address, etc.)
Payroll (net pay, hours worked, gross pay)
Benefits (LIC, pension plan)
Cont
Data administration subsystem: It helps users manage the overall database environment by providing facilities for backup and recovery, security management, query optimization, concurrency control, and change management.
Views of data
Three different views of data:
External view (user view): The user's view of a database program represents data in a format that is meaningful to a user and to the software programs that process those data. There can be an endless number of different external views. This feature allows users to see database information in a more business-related way rather than from a technical, processing viewpoint.
Physical view (internal view): The physical view refers to the way the data are physically stored and processed.
Logical view (conceptual view): This refers to how the database administrator views the database.
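The idea of several external views over one stored table can be sketched with SQL views in Python's built-in `sqlite3` module. The table `employee`, the view names `directory` and `payroll`, and the sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One stored table: how SQLite lays it out on disk is the physical view,
# and this table definition corresponds to the logical view.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, address TEXT, gross_pay REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "12 Hill Rd", 4200.0), (2, "Ravi", "8 Lake St", 3900.0)],
)

# Two external views over the same data, each showing only what one
# group of users needs.
conn.execute("CREATE VIEW directory AS SELECT name, address FROM employee")
conn.execute("CREATE VIEW payroll AS SELECT id, gross_pay FROM employee")

directory_rows = conn.execute("SELECT * FROM directory ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll ORDER BY id").fetchall()
```

Neither view stores its own copy of the data, so any number of further external views could be added without changing the underlying table.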
Database Models
A database model or database schema is the structure or format of a database, described in a formal language supported by the database. In other words, it is the application of a data model when used in conjunction with a DBMS. It is the theoretical foundation of a database and fundamentally determines in which manner data can be stored, organized, and manipulated in a database system. It thereby defines the infrastructure offered by a particular database system.
Database Model
There are three types of database models common to the industry:
Hierarchical database model
Network database model
Relational database model
Contd.....
Children: segments (pieces of records)
Top level: root segment
Below: child segments
Access starts from the top and moves downwards
Cont
Employee description (Root)
Compensation
Cont
The structure allows representing information using parent/child relationships: each parent can have many children, but each child has only one parent (also known as a 1-to-many relationship). All attributes of a specific record are listed under an entity type. In a database an entity type is the equivalent of a table. Each individual record is represented as a row, and each attribute as a column.
Cont.
A user accesses data within this model by starting at the root table and working down through the tree to the target data. This access method requires the user to be very familiar with the structure of the database. The user can retrieve data very quickly because there are explicit links between the table structures. A problem occurs when a user needs to store a record in a child table that is currently unrelated to any record in a parent table. The model cannot support complex relationships, and redundant data is often a problem.
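The parent/child rule and the top-down access path described above can be sketched as a small tree in Python. The class `Segment`, the helper `find`, and the segment names other than the employee root and "Compensation" are hypothetical and serve only as illustration.

```python
class Segment:
    """A node in the hierarchy: one parent at most, any number of children."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = data
        self.parent = None
        self.children = []

    def add_child(self, child):
        # Each child may have only one parent (1-to-many relationship).
        if child.parent is not None:
            raise ValueError("a child segment can have only one parent")
        child.parent = self
        self.children.append(child)
        return child


def find(root, path):
    """Start at the root and work down through the tree along `path`."""
    node = root
    for name in path:
        node = next((c for c in node.children if c.name == name), None)
        if node is None:
            return None  # the caller did not know the structure
    return node


root = Segment("Employee")
compensation = root.add_child(Segment("Compensation"))
compensation.add_child(Segment("Salary history", data=[50000, 52000]))
root.add_child(Segment("Benefits"))

target = find(root, ["Compensation", "Salary history"])
missing = find(root, ["Salary history"])  # wrong path: not directly under the root
```

The lookup succeeds only when the caller supplies the exact path from the root, which mirrors the requirement that users know the database structure well.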
Contd.....
One student may be enrolled in many courses and, conversely, a course may have many students.
Cont.
The structure of a network database is represented in terms of nodes and set structures. A node represents a collection of records, and a set structure establishes and represents a relationship in a network database. It is a transparent construction that relates a pair of nodes together by using one node as an owner and the other node as a member. One or more sets (connections) can be defined between a specific pair of nodes, and a single node can also be involved in other sets with other nodes in the database.
Cont
A user can access data from within the network database, starting from any node and working backward or forward through related sets. It supports faster and more complex data access than the hierarchical database provides. A user has to be very familiar with the structure of the database in order to work through the set structures. It is not easy to change the database structure without affecting the application programs that interact with it.
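The owner/member sets and the forward and backward navigation described above can be sketched as follows. This is a deliberately simplified model: the class `NetworkDB`, its method names, and the student and course names are assumptions, and links are kept as plain Python sets rather than in whatever physical form a real network DBMS uses.

```python
from collections import defaultdict


class NetworkDB:
    """Nodes connected by sets; each connection has an owner and a member."""

    def __init__(self):
        self.members = defaultdict(set)  # owner node -> member nodes
        self.owners = defaultdict(set)   # member node -> owner nodes

    def connect(self, owner, member):
        self.members[owner].add(member)
        self.owners[member].add(owner)

    def forward(self, owner):
        """Navigate from an owner to its members."""
        return sorted(self.members.get(owner, ()))

    def backward(self, member):
        """Navigate from a member back to its owners."""
        return sorted(self.owners.get(member, ()))


db = NetworkDB()
db.connect("Course: Databases", "Student: Asha")
db.connect("Course: Databases", "Student: Ravi")
db.connect("Course: Statistics", "Student: Asha")

students_in_databases = db.forward("Course: Databases")
courses_for_asha = db.backward("Student: Asha")
```

Unlike the tree model, a member here can belong to more than one owner, which is what lets one student appear under several courses.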
Contd...
The relational model combines relational tables to provide the user with more information than is available in individual tables. It is flexible and can answer ad hoc queries. In a relational database, three basic operations are used to develop useful sets of data: select, project, and join.
Contd...
The select operation creates a subset consisting of all records in the file that meet stated criteria. The join operation combines relational tables to provide the user with more information than is available in individual tables. The project operation creates a subset consisting of columns in a table, permitting the user to create new tables (also called views) that contain only the information required.
The select, project, and join operations enable data from two different tables to be combined and only selected attributes to be displayed.
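The three operations can be sketched in plain Python over tables held as lists of dictionaries. The function names mirror the operations, while the tables `parts` and `suppliers` and their contents are hypothetical sample data.

```python
# Two relations sharing the common element "supplier_id".
suppliers = [
    {"supplier_id": 1, "supplier_name": "Acme"},
    {"supplier_id": 2, "supplier_name": "Bolt Co"},
]
parts = [
    {"part_id": 137, "part_name": "Door latch", "supplier_id": 1},
    {"part_id": 150, "part_name": "Door seal", "supplier_id": 2},
    {"part_id": 152, "part_name": "Compressor", "supplier_id": 1},
]


def select(table, predicate):
    """Keep only the rows that meet the stated criteria."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in table]


def join(left, right, key):
    """Combine rows from two tables that agree on the shared key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


wanted = select(parts, lambda row: row["part_id"] in (137, 150))
joined = join(wanted, suppliers, "supplier_id")
result = project(joined, ["part_id", "part_name", "supplier_name"])
```

Here `result` holds two rows, each pairing a selected part with the name of its supplier, so data from two tables is combined and only the chosen attributes are shown.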
Contd...
Leading mainframe relational database management systems include IBM's DB2, Microsoft's SQL Server, and Oracle from the Oracle Corporation. DB2, Oracle, and Microsoft SQL Server are used as DBMS for midrange computers.
Each table is a relation, each row is a tuple representing a record, and each column is an attribute representing a field. These relations can easily be combined and extracted to access data and produce reports, provided that any two share a common element.
Database model          Hierarchical   Network       Relational
Processing efficiency   High           Medium-High   Lower but improving
Flexibility             Low            Low-Medium    High
NDBM: Low. Not so simple. Complex procedural.
HDBM: Low. Have to know the tree structure of the database and have to be procedural in line with the tree structure.
RDBM: High. Very simple. No possible dependency between relations, so it can be nonprocedural.
Advantages of DBMS
An organization's information system complexity is reduced by central management of data, access, utilization and security.
Data redundancy and inconsistency can be reduced by eliminating all of the isolated files in which the same data elements are repeated.
Data confusion can be eliminated by providing central control of data creation and definition.
Contd...
Program-data dependence can be reduced by separating the logical view of the data from its physical arrangement.
Program development and maintenance costs can be reduced substantially.
Access and availability of information can be increased.
Data Warehouse
In computing, a data warehouse (DW) is a storage facility used for reporting and analysis. The data stored in the warehouse is uploaded from the operational systems. A data warehouse maintains its functions in three layers:
1. Staging is used to store raw data for use by developers.
2. The integration layer is used to integrate data and to have a level of abstraction from users.
3. The access layer is for getting data out for users.
Data Warehouses
Decision makers need concise, reliable information about current operations, trends, and changes. Data often is fragmented in separate operational systems, such as sales or payroll, so that different managers make decisions from incomplete knowledge bases. Users and information systems specialists may have to spend inordinate amounts of time locating and gathering data. Data warehousing addresses this problem by integrating key operational data from various sources around the company and presenting it in a form that is consistent, reliable, and easily available.
Contd...
A data warehouse is a facility that stores current and historical data of potential interest to managers throughout the company. The data originate in many core operational systems and external sources, including Web site transactions, each with different data models. They may include legacy systems, relational or object-oriented DBMS applications, and systems based on HTML or XML documents.
Cont
The data from these diverse applications are copied into the data warehouse database as often as needed: hourly, daily, weekly, or monthly. The data are standardized into a common data model and consolidated so that they can be used across the enterprise for management analysis and decision making. The data are available for anyone to access as needed but cannot be altered.
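The copy, standardize, and consolidate steps described above can be sketched in Python. Everything specific here is an assumption made for illustration: the two source systems, their field names and formats, and the common model of `customer_id`, `amount`, and `date`.

```python
from datetime import date
from types import MappingProxyType

# Two hypothetical operational systems, each with its own data model.
sales_system = [{"cust": "C001", "amt_usd": "120.50", "dt": "2024-03-01"}]
web_system = [{"customer_id": "c002", "total_cents": 9900, "date": "01/03/2024"}]


def from_sales(row):
    """Map a sales-system row onto the common data model."""
    return {
        "customer_id": row["cust"].upper(),
        "amount": float(row["amt_usd"]),
        "date": date.fromisoformat(row["dt"]),
    }


def from_web(row):
    """Map a web-system row (cents, day/month/year) onto the common model."""
    day, month, year = row["date"].split("/")
    return {
        "customer_id": row["customer_id"].upper(),
        "amount": row["total_cents"] / 100,
        "date": date(int(year), int(month), int(day)),
    }


# Consolidate into one read-only collection: rows can be read by anyone
# but not altered through the warehouse.
standardized = [from_sales(r) for r in sales_system] + [from_web(r) for r in web_system]
warehouse = tuple(MappingProxyType(row) for row in standardized)

total_amount = sum(row["amount"] for row in warehouse)
```

Wrapping each row in `MappingProxyType` is one simple way to model "available to read but not to alter"; a real warehouse would enforce this through access permissions instead.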
Cont..
The data warehouse must be carefully designed by both business and technical specialists to make sure it can provide the right information for critical business decisions. The firm may need to change its business processes to benefit from the information in the warehouse.
[Figure: data warehouse architecture. Labels: DATA SOURCES (Operational DBs, External Sources); Reconciled data; Serve; DATA MARTS; TOOLS (Analysis, Query/Reporting, Data Mining)]
Data Marts
A data mart is the access layer of the data warehouse environment that is used to get data out to the users. The data mart is a subset of the data warehouse which is usually oriented to a specific business line or team. A data mart is a repository of data gathered from operational data and other sources that is designed to serve a particular community of knowledge workers. In scope, the data may derive from an enterprise-wide database or data warehouse or be more specialized.
Datamarts
Companies can build enterprise-wide data warehouses where a central data warehouse serves the entire organization, or they can create smaller, decentralized warehouses called data marts. A data mart is a subset of a data warehouse in which a summarized or highly focused portion of the organization's data is placed in a separate database for a specific population of users.
Example
A company might develop marketing and sales data marts to deal with customer information. A data mart typically focuses on a single subject area or line of business, so it usually can be constructed more rapidly and at lower cost than an enterprise-wide data warehouse. However, complexity, costs, and management problems will rise if an organization creates too many data marts.
DATA MINING
A data warehouse system provides a range of ad hoc and standardized query tools, analytical tools, and graphical reporting facilities, including tools for data mining. Data mining means inferring new information from already collected data. This was traditionally the job of data analysts, but computers have changed that: it is far more efficient to comb through data using a machine than to eyeball statistical data.
Contd...
Wikipedia definition: Data mining is the entire process of applying computer-based methodology, including new techniques for knowledge discovery, to data.
Knowledge discovery: Concrete information gleaned from known data. Data you may not have known, but which is supported by recorded facts.
Contd.....
Knowledge prediction: Uses known data to forecast future trends, events, etc. (e.g., stock market predictions).
Data mining uses a variety of techniques to find hidden patterns and relationships in large pools of data and infer rules from them that can be used to predict future behavior and guide decision making.
Contd.....
Data mining provides information for targeted marketing, in which personalized or individualized messages can be created based on individual preferences. There are many data-mining applications in both business and scientific work. Data-mining applications can perform high-level analyses of patterns or trends, but they can also drill into more detail where needed.
Cont
Data mining is both a powerful and profitable tool, but it poses challenges to the protection of individual privacy. Data-mining technology can combine information from many diverse sources to create a detailed data image about each of us: our income, our driving habits, our hobbies, our families, and our political interests.
Datamining Functions
Extract, transform, and load transaction data onto the data warehouse system.
Store and manage the data in a multidimensional database system.
Provide data access to business analysts and information technology professionals.
Analyze the data by application software.
Present the data in a useful format, such as a graph or table.
Cont
Clustering is the task of discovering groups and structures in the data that are in some way or another "similar".
Classification is the task of generalizing known structure to apply to new data. For example, an email program might attempt to classify an email as legitimate or spam.
Regression attempts to find a function that models the data with the least error.
Summarization provides a more compact representation of the data set, including visualization and report generation.
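Of the tasks listed above, regression is the easiest to show in a few lines. The sketch below fits a straight line by ordinary least squares, one common choice of "least error"; the function name `fit_line` and the data points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical known data: these points lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)

# Use the fitted function to forecast a value that was not in the data.
predicted_at_5 = slope * 5 + intercept
```

The fitted function is then used the way knowledge prediction is described earlier: known data in, a forecast for an unseen case out.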