Data mining is the use of computer software to extract useful knowledge from the large
volumes of data available. It is typically used in the retail industry, the banking sector,
mathematics, and fraud detection.
According to Data Mining for Scientific and Engineering Applications (Robert L. Grossman,
Chandrika Kamath, Vipin Kumar and Raju 2001), computer simulation has improved to the point
where a computer can produce terabytes of data in a few hours. It may take a human a few
weeks or months to extract the useful information from that data.
According to Data Mining in Computer Security (Daniel Barbara and Sushil Jajodia 2002), the
volume of data dealing with both the network and the host is very large. Many users can
log in to the same host and network. This makes data mining techniques a strong candidate
for analysing such data.
Data mining is used in business intelligence.
According to Data Mining for Business Intelligence (Galit Shmueli, Nitin R. Patel, Peter C.
Bruce 2007), large amounts of data are available in the market for use in a particular
business. The task is to choose, from that data, the particular information that is useful
for the business.
QUESTION 2:
The whole data mining process is based on the plan, do, check, act cycle.
Data Processing: The data collected from different sources is processed by the computer
according to defined procedures. This is the data that the organisation has to process.
Model Construction and Evaluation: After the collected data has been processed, it is time
to evaluate the results. What are the goals of the organisation, and to what extent does
the project meet the requirements of the organisation? The organisation should have a clear
vision of this.
Taking Action:
Last but not least, this is the major step in any project. If the project did not meet the
requirements of the organisation, this step deals with what went wrong and what steps are
to be taken to meet the goals.
QUESTION 3:
(a)
In supervised learning, the input attributes of the forecast are the humidity, the barometric
pressure, and the temperature.
The output is the forecast: windy, possible hail, or storms in the night.
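The split between input attributes and output can be made concrete with a minimal Python sketch. The training rows, the label names, and the function `predict_forecast` are hypothetical illustrations, not real measurements, and the 1-nearest-neighbour rule is only one simple choice of supervised method.

```python
import math

# Hypothetical labelled examples, invented for illustration only:
# (humidity %, barometric pressure hPa, temperature C) -> forecast label.
TRAINING = [
    ((40.0, 1020.0, 22.0), "windy"),
    ((85.0, 995.0, 12.0), "hail"),
    ((90.0, 985.0, 25.0), "storm"),
]


def predict_forecast(humidity, pressure, temperature):
    """Return the label of the closest training example (1-nearest neighbour)."""
    query = (humidity, pressure, temperature)
    # Euclidean distance on the raw attribute values; a real model would
    # normally rescale the attributes first.
    nearest = min(TRAINING, key=lambda row: math.dist(row[0], query))
    return nearest[1]


if __name__ == "__main__":
    print(predict_forecast(42.0, 1018.0, 21.0))
```

The three numbers passed to the function are the input attributes, and the returned label is the output that the learner was trained to predict.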
(b)
In supervised learning, if a medical practitioner wants to predict the survival rate of
breast cancer patients, the input attributes are the blood pressure of the patient, the
heart rate of the patient, the percentage of haemoglobin in the blood, and the strength of
the bone.
The outcome is whether the survival chance of the breast cancer patient is low or high.
(c)
In supervised learning, if a company wants to identify fraud cases to minimise the risks in
its loan system, the possible inputs are the age of the person, the visa status of the
person, and the monthly or annual income.
The possible outcomes are that the credit card is rejected due to poor credit history, or
that the credit card is approved.
(d)
In unsupervised learning, if a supermarket manager wants to improve the success rate of
direct mail targeting, the possible input attributes are the name of the person, the
address of the person, and the contact number of the person.
The outcome is that the supermarket is responding well to its customers.
(e)
It is a data query.
QUESTION 4:
(A)
(B)
(C)
Create a new file in Notepad.
Identify the attributes and define the relation between them.
Change the file extension to .arff.
Open WEKA, select Open file, and select the .arff file.
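The steps above can also be sketched in Python instead of Notepad. The example below builds the text of an ARFF file, which consists of an `@relation` line, one `@attribute` line per attribute, and an `@data` section. The relation name, attribute names, data rows, file name `forecast.arff`, and the helper `build_arff` are all assumptions made for illustration.

```python
def build_arff(relation, numeric_attributes, class_name, class_values, rows):
    """Build ARFF text with numeric attributes and one nominal class attribute."""
    lines = [f"@relation {relation}", ""]
    for name in numeric_attributes:
        lines.append(f"@attribute {name} numeric")
    # Nominal attributes list their allowed values inside braces.
    lines.append(f"@attribute {class_name} {{{','.join(class_values)}}}")
    lines.append("")
    lines.append("@data")
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical rows, invented for illustration only.
ARFF_TEXT = build_arff(
    relation="forecast",
    numeric_attributes=["humidity", "pressure", "temperature"],
    class_name="forecast",
    class_values=["windy", "hail", "storm"],
    rows=[
        (40.0, 1020.0, 22.0, "windy"),
        (85.0, 995.0, 12.0, "hail"),
        (90.0, 985.0, 25.0, "storm"),
    ],
)

if __name__ == "__main__":
    # Write the file that would then be opened in WEKA with Open file.
    with open("forecast.arff", "w", encoding="utf-8") as handle:
        handle.write(ARFF_TEXT)
```

Saving the text directly with the .arff extension removes the separate renaming step.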
References:
2. Robert L. Grossman, Chandrika Kamath, Vipin Kumar and Raju, 2001, Data Mining for
Scientific and Engineering Applications, Kluwer Academic Publishers, The Netherlands,
29 March, p. 128
http://books.google.com.au/books?id=K9bRLRpGM2cC&dq=applications+of+data+mining&printsec=frontcover&source=in&hl=en&ei=f4C1S7GFF9CHkQWSj4yXDQ&sa=X&oi=book_result&ct=result&resnum=11&ved=0CDoQ6AEwCg#v=onepage&q=&f=false
3. Daniel Barbara and Sushil Jajodia, 2002, Data Mining in Computer Security, Kluwer
Academic Publishers, North and Central America, 29 March, p. 25
http://books.google.com.au/books?id=QXNj15Lp1OsC&dq=applications+of+data+mining&printsec=frontcover&source=in&hl=en&ei=f4C1S7GFF9CHkQWSj4yXDQ&sa=X&oi=book_result&ct=result&resnum=12&ved=0CDwQ6AEwCw#v=onepage&q=&f=false
4. Galit Shmueli, Nitin R. Patel, Peter C. Bruce, 2007, Data Mining for Business
Intelligence, John Wiley and Sons, Canada, 29 March, p. 13
http://books.google.com.au/books?id=cM3hN0mvzLsC&dq=applications+of+data+mining&printsec=frontcover&source=in&hl=en&ei=hYO1S-HKApiekQWsp8WPDQ&sa=X&oi=book_result&ct=result&resnum=13&ved=0CEIQ6AEwDA#v=onepage&q=&f=false