
CSCE  Homework 

Jason Dew
September , 

Abstract
In this homework I explore how different parameters of the k-nearest-neighbor algorithm affect its accuracy, including the value of k, the weighting tactic, and the distance measure used to calculate similarity. The breast cancer Wisconsin (diagnostic) data¹ from the UCI Machine Learning Repository² is used.

1 Weka installation
This was very straightforward on my platform of choice, Mac OS X. I also put the weka.jar file in a standard location so that it can be used programmatically via JRuby.
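
A minimal sketch of what that programmatic use looks like from JRuby is below; the jar path is only my assumption about what a "standard location" might be, not anything Weka requires.

# Load Weka's classes into a JRuby session and confirm they are usable.
require 'java'
require '/usr/local/share/weka/weka.jar'   # assumed install location

java_import 'weka.classifiers.lazy.IBk'    # the k-NN classifier used later

knn = IBk.new
knn.setKNN(3)
puts knn.getKNN                            # => 3, so the jar loaded correctly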

2 Acquisition and preliminary analysis


2.1 Data acquisition
The data set was very easy to find, and I was impressed with the organization and depth of the UCI repository.

2.2 Attribute analysis

All of the attributes in the given data set, except for the ID, are ordinal and range from 1 to 10. The stacked boxplots in Figure 1 show this as well as the relationship between the attributes. The means and standard deviations are also given in Figure 2.

As the ID has no predictive value, it was removed from consideration.
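
Dropping the ID can be done with Weka's Remove filter. The sketch below assumes the ID is the first attribute in the loaded file, which matches the UCI distribution but is still an assumption about my loading code.

# Remove the non-predictive ID attribute (a sketch; `data` is a loaded Instances object).
java_import 'weka.filters.Filter'
java_import 'weka.filters.unsupervised.attribute.Remove'

remove = Remove.new
remove.setAttributeIndices('1')        # Weka attribute indices are 1-based
remove.setInputFormat(data)
data = Filter.useFilter(data, remove)  # data now contains the nine predictive attributes plus the class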

¹ hp://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
² hp://archive.ics.uci.edu/ml/index.html


[Figure: stacked boxplots of each attribute (clump_thickness, cell_size_uniformity, cell_shape_uniformity, marginal_adhesion, epithelial_cell_size, bare_nuclei, bland_chromatin, normal_nucleoli, mitosis) against a shared axis labeled Scale.]

Figure 1: Boxplots of the attributes in the data set.


[Table: mean and standard deviation for each of the nine attributes.]

Figure 2: Means and standard deviations for the attributes.

3 Analysis of k-NN classifiers


3.1 Method
In order to train and test given only a single data set, cross-validation was used, and this seems to give good results. However, it is worth noting that the accuracy numbers are lower than when the training set itself is used as the test set.
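
A rough sketch of how this evaluation can be driven from JRuby through Weka's Java API is below. The file name, the use of 10 folds, and the random seed are illustrative assumptions rather than values taken from the runs reported here.

# Cross-validate a k-NN classifier on the breast cancer data (a sketch).
java_import 'weka.core.converters.ConverterUtils'
java_import 'weka.classifiers.lazy.IBk'
java_import 'weka.classifiers.Evaluation'
java_import 'java.util.Random'

data = ConverterUtils::DataSource.read('breast-cancer-wisconsin.arff')  # assumed file name
data.setClassIndex(data.numAttributes - 1)   # assumes the class (benign/malignant) is the last attribute

knn = IBk.new
knn.setKNN(5)                                # one of the k values explored below

evaluation = Evaluation.new(data)
evaluation.crossValidateModel(knn, data, 10, Random.new(1))
puts evaluation.pctCorrect                   # accuracy as a percentage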

3.2 Results
In order to learn more about how the k-NN classifier works, I varied several options in addition to k, including weighting the similarities and varying the distance measure. The accuracy of a classifier is defined as
\[
\text{accuracy} = \frac{\text{\# correct}}{\text{\# of instances}}
\]
Figure  shows how the accuracy varies in k using the Euclidean distance measure
and no weighting. ere does not seem to be a clear paern here. Figure  shows
how the distance metric used affects the accuracy achieved. e differences are
between these are slight and the Euclidean distance does the best overall. Figure 
shows prey clearly that weighting the results either by using the inverse distance
or the similarity is a good idea. In this case, using the inverse distance does a beer
job.
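
Both of these options are set directly on Weka's IBk classifier. The sketch below shows one plausible way to do so from JRuby; the particular choices (Manhattan distance, inverse-distance weighting, k = 7) are examples only, not the configuration behind any specific figure.

# Swap the distance function and turn on inverse-distance weighting (a sketch).
java_import 'weka.classifiers.lazy.IBk'
java_import 'weka.core.ManhattanDistance'
java_import 'weka.core.SelectedTag'

knn = IBk.new
knn.setKNN(7)

# Distance metric: replace the default EuclideanDistance on the neighbour search.
knn.getNearestNeighbourSearchAlgorithm.setDistanceFunction(ManhattanDistance.new)

# Weighting: WEIGHT_NONE (default), WEIGHT_INVERSE, or WEIGHT_SIMILARITY (1 - distance).
knn.setDistanceWeighting(SelectedTag.new(IBk::WEIGHT_INVERSE, IBk::TAGS_WEIGHTING))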


[Figure: k-NN accuracy for k ranging from 5 to 10; accuracy plotted against k.]

Figure 3: Graph of the effect of k on the k-NN algorithm.


[Figure: comparison of distance metrics (Euclidean, Manhattan, Chebyshev); accuracy plotted against k from 5 to 10.]

Figure 4: Graph of the effect of the distance metric used on the k-NN algorithm.


[Figure: comparison of weighting options (none, inverse distance, similarity); accuracy plotted against k from 5 to 10.]

Figure 5: Graph of the effect of the use of weighting on the k-NN algorithm.
