
PUNE INSTITUTE OF COMPUTER TECHNOLOGY

DHANKAWADI, PUNE – 43.


DEPARTMENT: COMPUTER ENGINEERING ACADEMIC YEAR: 2008-09

Project Group Members: Tapomay Dey,


Tapomay.dey@gmail.com,
9960156254,
Final year, B.E. computers.

Project Title: DTminer, a web application.

Synopsis:

Problem Statement:

Decision tree induction is used in expert systems for knowledge discovery.


The main task of such a system is to apply inductive methods: given the
attribute values of an unknown object, it determines the appropriate classification
according to the rules encoded in a decision tree.
The input is supervised (labelled) historical data. The system must learn from
this data in order to generate a decision tree.
The decision tree classifies an instance by traversing from the root node to a
leaf node. We start at the root node, test the attribute specified by that node,
and move down the branch that matches the instance's value for that attribute.
This process is then repeated at the sub-tree level until a leaf node (the class
prediction) is reached.
The purpose of this project is to develop a web application that can be
deployed on GlassFish and that implements a modified variant of the ID3 decision
tree learning algorithm (subject to constraints).
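For orientation, a minimal sketch of textbook ID3 is given below. It shows the standard algorithm (choose the attribute with the highest information gain, split, recurse), not the modified variant proposed for this project, whose details the synopsis does not specify. The function names, the dictionary-based tree encoding and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the data on one attribute."""
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [y for row, y in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def id3(rows, labels, attributes):
    """Build a tree: a dict for a test node, a plain label for a leaf."""
    if len(set(labels)) == 1:      # all examples share one class: make a leaf
        return labels[0]
    if not attributes:             # no attribute left to test: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in set(row[best] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = id3([rows[i] for i in keep], [labels[i] for i in keep], remaining)
    return {"attribute": best, "branches": branches}


# Toy training set, made up for illustration only.
rows = [
    {"outlook": "sunny", "wind": "weak"},
    {"outlook": "sunny", "wind": "strong"},
    {"outlook": "overcast", "wind": "weak"},
    {"outlook": "rain", "wind": "weak"},
    {"outlook": "rain", "wind": "strong"},
]
labels = ["no", "no", "yes", "yes", "no"]

tree = id3(rows, labels, ["outlook", "wind"])
print(tree)
```

On this toy data the root tests `outlook`, because it yields a larger information gain than `wind`; only the `rain` branch needs a second test on `wind`.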

Application / Context:

• Financial institutions
• Genomics supercomputer
• Disease recognition
• Telecommunications
• Supermarkets
• Etc.

Concept:

Data mining is the analysis of (often large) observational data sets to find
unsuspected relationships and to summarize the data in novel ways that are both
understandable and useful to the data owner.
'Decision tree learning is a method for approximating discrete-valued target
functions, in which the learned function is represented by a decision tree. Decision
tree learning is one of the most widely used and practical methods for inductive
inference' (Tom M. Mitchell, 1997, p. 52).
A decision tree is a tree in which each branch node represents a choice
between a number of alternatives (attributes), and each leaf node represents a decision
(class).
Each supervised example is represented by an (x, y) pair, where x holds the
attribute values of an entity and y is its class.
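As a concrete illustration of the (x, y) representation, the sketch below stores each example as a pair of attribute values and a class label. The `Example` type, the attribute names and the loan-screening records are hypothetical and chosen only for the example.

```python
from typing import Dict, List, NamedTuple


class Example(NamedTuple):
    """One supervised record: attribute values x and class label y."""
    x: Dict[str, str]
    y: str


# Made-up loan-screening records, purely for illustration.
training_data: List[Example] = [
    Example(x={"income": "high", "credit_history": "good"}, y="approve"),
    Example(x={"income": "low", "credit_history": "bad"}, y="reject"),
    Example(x={"income": "low", "credit_history": "good"}, y="approve"),
]

# Separate the pairs into inputs and class labels, as a learner would expect.
inputs = [example.x for example in training_data]
classes = [example.y for example in training_data]
print(classes)  # prints ['approve', 'reject', 'approve']
```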

Area of Project: Data Mining

References:

1. Tom M. Mitchell (1997). Machine Learning. Singapore: McGraw-Hill.


2. Paul E. Utgoff and Carla E. Brodley (1990). 'An Incremental Method for
Finding Multivariate Splits for Decision Trees'. Machine Learning: Proceedings
of the Seventh International Conference (p. 58). Palo Alto, CA: Morgan
Kaufmann.
3. MIT OpenCourseWare, 15.062 Data Mining, Spring 2003.
4. Wei Peng, Juhua Chen and Haiping Zhou. 'An Implementation of ID3: Decision
Tree Learning Algorithm'. Machine Learning, University of New South
Wales, School of Computer Science & Engineering, Sydney, NSW 2032,
Australia.
