
A Paper Presentation on

Presented By
TENALI ENGINEERING COLLEGE
GUNTUR

By
Avinash Reddy. M
III CSE
e-mail id:
avinashreddy.muvva@gmail.com

V. Venkata Rami Reddy
e-mail id:
venkataram.varikuti@gmail.com

Abstract:
Data is a collection of facts from which conclusions may be drawn. Organisations today suffer from a malaise of data overflow. Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems.

A data warehouse is a database designed to support decision making in an organization. Data from the production databases is copied to the data warehouse so that queries can be performed without disturbing the performance or the stability of the production systems. Mining the data is only one step; other steps include identifying the problem to be solved, collecting and preparing the right data, interpreting and deploying models, and monitoring the results. The real key to success, however, is to have a thorough understanding of your data and of your business.

This paper provides an introduction to the basic technologies of data mining. Examples of profitable applications illustrate its relevance to today's business environment, together with a basic description of how data warehouse architectures can evolve to deliver the value of data mining to end users.

Introduction
This paper presents an overview of how data warehouses serve as a data source for data mining. Data warehousing is one of the most important strategic initiatives in the information systems field. Since the early 1990s, data warehouses have been at the forefront of information technology applications as a way for organizations to effectively use digital information for business planning and decision making. Hence, an understanding of data warehouse system architecture is or will be important in our roles and responsibilities in information management.

Most fundamentally, a data warehouse is created to provide a dedicated source of data to support decision-making applications. Rather than having data scattered across a variety of systems, a data warehouse integrates the data into a single repository. It is for this reason that a data warehouse provides "a single version of the truth."

All users and applications access the same data. Because users access better data, their ability to analyze data and make decisions improves. Data warehousing has emerged as an increasingly popular and powerful concept of applying information technology to turn this huge island of data into meaningful information for better business decisions.
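
The idea of integrating data scattered across several systems into a single repository can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the two source systems, their field names, the sample rows, and the agreed gender encoding are hypothetical and do not come from any particular product.

```python
# Hypothetical extracts from two operational systems that describe the
# same customers with different field names and encodings.
billing_rows = [
    {"cust": "C001", "sex": "M", "amount": 120.0},
    {"cust": "C002", "sex": "F", "amount": 75.5},
    {"cust": "C001", "sex": "M", "amount": 30.0},
]
support_rows = [
    {"customer_id": "c001", "gender": "male", "tickets": 2},
    {"customer_id": "c003", "gender": "female", "tickets": 1},
]

# One agreed encoding for the warehouse (an assumption for this sketch).
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def blank_record(customer_id):
    """Create an empty warehouse record for one customer."""
    return {"customer_id": customer_id, "gender": None,
            "total_billed": 0.0, "tickets": 0}


def integrate(billing, support):
    """Merge both sources into a single record per customer."""
    warehouse = {}
    for row in billing:
        key = row["cust"].upper()  # consistent customer key
        record = warehouse.setdefault(key, blank_record(key))
        record["gender"] = GENDER_CODES[row["sex"].lower()]
        record["total_billed"] += row["amount"]
    for row in support:
        key = row["customer_id"].upper()
        record = warehouse.setdefault(key, blank_record(key))
        record["gender"] = GENDER_CODES[row["gender"].lower()]
        record["tickets"] += row["tickets"]
    return warehouse


warehouse = integrate(billing_rows, support_rows)
for customer_id in sorted(warehouse):
    print(warehouse[customer_id])
```

Every consumer of the `warehouse` dictionary sees the same customer key and the same gender code, whichever system the row came from. This is the sense in which a warehouse offers "a single version of the truth."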

DATA WAREHOUSING:

Most simply, a data warehouse is a collection of data created to support decision making. Users and applications access the warehouse for the data that they need. A warehouse provides a data infrastructure. It eliminates one reason for the failure of many decision support applications: the lack of quality data.

A data warehouse has the following four characteristics:

• Subject-oriented: all relevant data about a subject is gathered and stored as a single set in a useful format.
• Integrated: data collected from multiple systems is integrated around subjects. It is stored in a globally accepted fashion with consistent naming conventions, measurements, encoding structures, and physical attributes, even when the underlying operational systems store the data differently.
• Non-volatile: users cannot change or update the data; the data warehouse is read-only. This ensures that all users are working with the same data. The warehouse is updated, but through IT-controlled load processes rather than by users.
• Time-variant: a warehouse maintains historical data (i.e., it includes time as a variable). Unlike transactional systems, where only recent data, such as for the last day, week, or month, are maintained, a warehouse may store years of data. Historical data is needed to detect deviations, trends, and long-term relationships.

Whereas a data warehouse is a repository of data, data warehousing is the entire process. As shown in Figure 1, data warehousing encompasses a broad range of activities, all the way from extracting data from source systems to the use of the data for decision-making purposes. Specifically, it includes data extraction, transformation, and loading, and access to the data by end users and applications. A data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

Why Data Warehouse?
A data warehouse is a data repository that supports an organization's decision making. Business intelligence (BI) relies on data warehousing, making cost-effective storage and management of warehouse data critical to any BIDW. Without an effective data warehouse, organizations cannot extract the data required for information analysis in time to facilitate expedient decision making. The ability to obtain information in real time has become increasingly critical in recent years because decision-making cycle times have been drastically reduced. Competitive pressures require businesses to make intelligent decisions based on their incoming business data, and to do so quickly.

DATA MINING:
Data mining is the process of extracting information from a company's various databases and reorganizing it for purposes other than those the databases were originally intended for.
Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help

companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Data mining tools can answer business questions that traditionally were too time-consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.

Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought online.

Scope of data mining
Data mining techniques can yield the benefits of automation on existing software and hardware platforms, and can be implemented on new systems as existing platforms are upgraded and new products are developed. Faster processing means that users can automatically experiment with more models to understand complex data. High speed makes it practical for users to analyze huge quantities of data. Larger databases, in turn, yield improved predictions.

Techniques in data mining:
Data mining uses a number of techniques to discover patterns and uncover trends in data warehouse data. The most commonly used techniques are:
• Artificial neural networks
• Decision trees
• Genetic algorithms
• Nearest neighbor method

The basic reasons organizations implement Data Warehouses are:
To perform server/disk-bound tasks associated with querying and reporting on servers/disks not used by transaction processing systems.

BENEFITS OF DATA WAREHOUSING
The purpose of a data warehouse is to get enterprise-wide data into a format that is most useful to end users, regardless of their locations. Data warehousing is used for:
• Increasing the speed and flexibility of analysis.
• Providing a foundation for enterprise-wide integration and access.
• Improving or re-inventing business processes.
• Gaining a clear understanding of customer behavior.

Benefits of data mining
The primary benefit of data mining is the ability to turn feelings into facts. It also protects you from your gut feelings, which we all know are often wrong. The fundamental benefit of data mining is therefore twofold.
Let's look at a number of tangible benefits the data mining process can bring to companies.
1. Fraud detection
All too often businesses are so caught up in their daily operations that they don't have time to dedicate to uncovering out-of-the-ordinary events. These events include fraud, employee theft, and illegal redirection of company goods. Fraud detection is seen primarily as out-of-the-blue data mining.
2. Return on investments
A significant segment of the companies looking at, or already adopting, data warehouse technology spend millions of dollars on new business initiatives. The research and development costs are astronomical. Everyone struggles to show a return on these investments given the finite amount of time, money, and people available. Mining data to answer this question is a form of targeted data mining.
3. Scalability of electronic solution
The major players in the data mining arena provide solutions that are robust and scalable. A robust data mining solution is one that performs well and can display results in an acceptable time. The ability to work with a wide range of input datasets is part of what is called scalability.
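
To make the nearest neighbor method and the fraud detection benefit more concrete, here is a minimal Python sketch that labels a new transaction by a majority vote among its closest historical transactions. The transaction features (amount and hour of day), the labels, and the sample values are all hypothetical. The features are deliberately left unscaled to keep the sketch short, which a real application would not do.

```python
import math
from collections import Counter

# Hypothetical labelled history: (amount in dollars, hour of day) -> label.
history = [
    ((25.0, 12), "normal"),
    ((40.0, 18), "normal"),
    ((32.0, 9), "normal"),
    ((950.0, 3), "fraud"),
    ((1200.0, 2), "fraud"),
    ((880.0, 4), "fraud"),
]


def knn_predict(history, point, k=3):
    """Label a point by majority vote among its k nearest neighbors."""
    # Sort past transactions by Euclidean distance to the new point.
    nearest = sorted(history, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict(history, (1000.0, 3)))  # large night-time purchase
print(knn_predict(history, (30.0, 11)))   # small daytime purchase
```

With this sample history the first call is labelled "fraud" and the second "normal", because each point's three closest neighbors all carry that label.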

THE DATA MINING PROCESS

Data mining differs from traditional data analysis in that it discovers patterns that were previously overlooked, as opposed to queries or statistical methods, which require the analyst to make an assumption.

Data mining builds models, which are abstractions of reality as shown in the data. Building and validating the models is a process. As illustrated in Nautilus Systems' diagram of the Data Mining Process below, the data mining process involves a significant amount of time spent in data preparation, as well as model building and validation. Information learned during discovery frequently sends the analyst back to data preparation, or even to clarification of the problem statement.
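
The build-and-validate loop described above can be sketched in a few lines of Python. This is an illustrative assumption, not the process from the Nautilus Systems diagram: the data set is synthetic, the "model" is a single-threshold rule (the simplest possible decision tree), and the names `fit_stump` and `accuracy` are made up for this example.

```python
# Synthetic data: (monthly spend, responded to offer). The label rule
# "spend > 60" is an assumption used only to generate the example.
rows = [(spend, spend > 60) for spend in range(5, 105, 5)]

# Data preparation: hold out every second row for validation.
train = rows[0::2]
holdout = rows[1::2]


def accuracy(data, threshold):
    """Share of rows where 'spend > threshold' matches the label."""
    hits = sum((spend > threshold) == label for spend, label in data)
    return hits / len(data)


def fit_stump(data):
    """Model building: pick the training value that splits best."""
    candidates = sorted(spend for spend, _ in data)
    return max(candidates, key=lambda t: accuracy(data, t))


threshold = fit_stump(train)
print("learned threshold:", threshold)
print("training accuracy:", accuracy(train, threshold))
print("holdout accuracy:", accuracy(holdout, threshold))
```

On this data the rule fits the training rows perfectly but misclassifies one holdout row (a spend of exactly 60). That is the kind of finding that sends the analyst back to data preparation or to the problem statement.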

Applications of data mining
A wide range of companies have deployed successful applications of data mining. Successful application areas include:

• A pharmaceutical company can analyze its recent sales force activity and results to improve targeting of high-value physicians and determine which marketing activities will have the greatest impact in the next few months. The data needs to include competitor market activity as well as information about the local health care systems. The results can be distributed to the sales force via a wide-area network that enables the representatives to review the recommendations from the perspective of the key attributes in the decision process.
• A credit card company can leverage its vast warehouse of customer transaction data to identify customers most likely to be interested in a new credit product. Using a small test mailing, the attributes of customers with an affinity for the product can be identified.
• A large consumer packaged goods company can apply data mining to improve its sales process to retailers.

THE WORKING OF DATA WAREHOUSE
The working of a data warehouse is explained in the following diagram.

CONCLUSION
Data warehousing provides the means to change raw data into information for making effective business decisions; the emphasis is on information, not data. The data warehouse is the hub for decision support data. A good data warehouse will provide the RIGHT data to the RIGHT people at the RIGHT time: RIGHT NOW! While the data warehouse organizes data for business analysis, the Internet has emerged as the standard for information sharing. Data warehousing and data mining play an important role in storing data and sorting out the particular data a user needs. Through mining, it has become very easy for users to get the information they want. Quantifiable business benefits have been proven through the integration of data mining with current information systems, and new products are on the horizon that will bring this integration to an even wider audience of users.

BIBLIOGRAPHY & REFERENCES
1. Eckerson, W.W. (1988) "Post-Chasm Warehousing," Journal of Data Warehousing,
2. Recent Developments in Data Warehousing by H.J. Watson.
3. Data Mining Concepts and Techniques by Jiawei Han, Micheline Kamber

Websites
1. www.datawarehousingonline.com
2. www.pcc.ac.uk.com
3. www.dsstechniques.com
