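MATLAB-Based Graphical User Interface (GUI) for Data Mining as a Tool
for Environment Management
M. Awawdeh, A. Fedi
World Academy of Science, Engineering and Technology
International Journal of Computer, Information Science and Engineering Vol:8 No:1, 2014
International Science Index 85, 2014, waset.org/publications/9997494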

Abstract: The application of data mining to environmental
monitoring has become crucial for a number of tasks related to
emergency management. In recent years, many tools have been
developed for decision support systems (DSS) for emergency
management. In this article, a graphical user interface (GUI) for an
environmental monitoring system is presented. This interface supports
(i) data collection and observation and (ii) data extraction
for data mining. This tool may be the basis for future development
along the lines of the open source software paradigm.

Keywords: Data Mining, Environmental data, Mathematical
Models, Matlab Graphical User Interface.
I. INTRODUCTION
The management of environmental emergencies is one of
the most interesting fields that scientists are working to
develop, since rapid environmental changes call for
continuous surveillance and on-line decision making. The
complexity of environmental problems makes it necessary to
develop and apply new tools capable of
processing not only numerical aspects, but also the experience
of experts and wide public participation, all of which are
needed in a decision system.
As part of a decision support system for environmental
emergency management, data mining plays a main role in
data extraction, analysis, and prediction. For an Environment
Monitoring System (ENMS), data come from measuring
stations (e.g. meteorological ones), and the measurements flow
from several sensors to support decision makers.
In this paper, we present a Matlab Graphical User Interface
(GUI) as a tool for environmental applications. The aspect of
environmental emergency management we consider here
is data collection and prediction, in its role of supporting
decision making. By data we mean sensor
measurements observed in real time. In recent years, a
huge number of potentially useful methods and software tools
have been proposed, including methods for environmental
surveillance. Our tool's contribution is to connect the monitored
data, after processing and extraction, with the powerful Matlab
tools for data mining, using Prediction, Classification, and
Neural Network tools.

M. Awawdeh is with the Department of Mechanical, Energetic,
Management, and Transport Engineering (DIME), University of Genova, P.le
Kennedy Pad. D, 16129 Genova, Italy (e-mail: awawdeh@dime.unige.it).
A. Fedi is with Acrotec Srl, Via A. Magliotto 2, 17100 Savona, Italy
(e-mail: adriano@acrotec.it).
We present the data mining algorithms and methods
applied in each of our project phases, following the sequence
described in [2].
The tool is a contribution to a project named
Integrated Network for Emergency (NIE). The interface is
connected to the other parts of the project to complete a
comprehensive system for environmental management. Our role
in the project is to support decision making with scientific
prediction tools.
II. BASICS
A. Project Overview
The interface is part of a project for environmental
surveillance named Integrated Network for Emergency
'N.I.E.'. This project has been carried out in cooperation by
the University of Genova, the 'FadeOut' Company, which works
on software programming, and the 'ACROTEC' Company,
which is the main executor of the project. The main goals of
the project are to monitor environmental changes, to take
measurements with multiple sensors at different measuring
stations in real time, and to display data on a website, in
addition to interactive modeling, broadcasting and an early
warning system, and the analysis of environmental measurements
based on mathematical models using the definition of
scientific prediction.
We will not discuss the whole project in this paper. As
mentioned in the introduction, our role is to design the
mathematical models for supporting decision-making in an
environmental emergency state; the proposed interface has been
designed as part of this work.
B. Environmental Monitoring and Data Mining
Many environmental systems involve processes which are
not yet well known, and for which no formal models are
established at present. Because the consequences of an
environmental system changing behavior or operating under
abnormal conditions may be severe, there is a great need for
Knowledge Discovery (KD) in the area.
Great quantities of data are available, but because the effort
required to analyze the large masses of data generated by
environmental systems is large, much of the data is not
examined deeply and the associated information remains
unexploited. The special features of environmental processes
demand a new paradigm to improve analysis and consequently
management. Approaches beyond straightforward application
of conventional classical techniques are needed to meet the
challenge of environmental system investigation. Data mining
techniques provide efficient tools to extract useful information
from large databases, and are equipped to identify and capture
the key parameters controlling these complex systems [1].
The interactive Graphical User Interface proposed in this
work helps in extracting, analyzing, and exporting data, using
the power of Matlab programming for data mining
techniques.
C. Working Prerequisites and Required Connections
As shown in Section II.A, our interface is part
of a comprehensive system, so some initial tasks are needed to run
it: the tool is not open source yet, and the data
reach our interface by being imported from the server of the
executive company (ACROTEC). The software and
connections we use for this interface are: (1) a licensed Matlab
version (R2011b) from MathWorks, and a
package of Java libraries; (2) Cisco Systems VPN Version
5.0.07.0240, a connection required to access the data on the
company's core server.
III. OUR INTERFACE AND ITS APPLICATIONS
A. Database and Data Form
The database includes measurements from sensors observed in real
time. These data flow to the core server in the
surveillance management room of the executive company. The
form of the data on the server is not
compatible with the form we need to import into our interface, for two
main reasons: (1) The data are not in numerical form, so they
cannot be handled by Matlab functions. (2) The programming
language used to construct the data in the core
server is X-DROPS, a customized
programming language developed by ACROTEC Company
for this project. This code is able to connect with MATLAB,
but the code itself is not readable by MATLAB. The
interface therefore has to make the connection between MATLAB and
X-DROPS to create readable data for our interface before
proceeding with its phases. The data flow to our interface as shown
in Fig. 1.
The sensors are distributed over many locations and are
connected with the stations over transmission lines. The
blocks 'ACROTEC Databases' and 'Data reprocessing' are
our data source, because our system is not connected directly
to the sensors. The connection with the ACROTEC server
provides our interface with all required data over on-line
connections, and it supports the feedback process.


Fig. 1 Data flow block diagram from the source (Sensors) to our
system (Interface)
Data preparation and integration occur in two stages:
first in the core server and then in our interface processes,
according to the target of data extraction. The first
filtration of data occurs in the core server and is not shown
in our system, but further phases of data filtering and error
formulation have been compiled together with data selection and
extraction, to give our interface full access to process
and extract data from the database. All data processing
phases in our interface are dependent processes that can be changed
and edited according to user targets.
B. Data Initializing and Collecting
The initialization of data occurs over the connection
between Matlab and X-DROPS; the VPN connection builds
the bridge to the database in the core server. This phase
creates the connection with X-DROPS, and after this
initialization the interface is ready to start collecting data and
processing them.
After this step, we implement the filtration as an in-filtering
process, which filters the extracted data according to the
output coming from the 'Data reprocessing' phase (see Fig.
1). Depending on the data collection target, as we show in
the following section, the data flow inside our interface as
shown in Fig. 2.
C. Interface Structure and Its Feature
The interface includes three phases of data processing: (1)
Extracting data from the core database and building the bridge for
data exchange between X-DROPS and Matlab. (2) Collecting
data from sensors according to specific search criteria
determined by the user. (3) Building databases inside Matlab and
exporting these data to other connected interfaces, as shown in
Fig. 1. The data collecting features depend on user
commands, and there are two modes for collecting data: (1)
Collect measurements by the type of sensor
chosen by the user. (2) Collect measurements by
geographical point. The modes of these
phases are shown in the interface as multi-input choices for the
user.
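The two collecting modes can be illustrated with a minimal sketch. It is written in Python rather than Matlab, and it is not the authors' code: the `Observation` record, the function names, the 'Hygrometer' sensor type, the station name 'StationB', and all sample values are hypothetical and serve only to show the selection logic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    station: str       # station name, e.g. "Pegli2"
    sensor_type: str   # e.g. "Thermometer"
    minute: int        # time of the reading, in minutes
    value: float       # measured value

def collect_by_sensor(observations, sensor_type):
    """Mode 1: every reading of one sensor type, across all stations."""
    return [o for o in observations if o.sensor_type == sensor_type]

def collect_by_station(observations, station):
    """Mode 2: every reading taken at one geographical point (station)."""
    return [o for o in observations if o.station == station]

# Made-up records, for illustration only.
SAMPLE = [
    Observation("Pegli2", "Thermometer", 0, 11.5),
    Observation("Pegli2", "Hygrometer", 0, 71.0),
    Observation("StationB", "Thermometer", 0, 9.8),
    Observation("Pegli2", "Thermometer", 60, 12.1),
]

thermo = collect_by_sensor(SAMPLE, "Thermometer")
pegli = collect_by_station(SAMPLE, "Pegli2")
print(len(thermo), len(pegli))
```

Both modes are plain filters over the same records, which is one simple way to expose them as alternative input choices in a GUI.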


Fig. 2 Data flow and its destination inside our interface structure

Fig. 3 Get Observations interface screenshot in the starting mode

The external database from which our interface extracts data
is a collection of sensor measurements over time. It is
composed of a matrix of observations, constructed with
dimension [stations × time]. We collect the measurements as
pairs (x, y), where x is the time in minutes and y is the
measurement. Fig. 3 shows the interface in the starting mode.
From now on, we will call the interface the 'Get Observations'
interface.
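As an illustration of the [stations × time] layout, the following Python sketch builds such a matrix from individual readings and extracts the (x, y) pairs of one station. The helper names, the station 'StationB', and the values are assumptions made for this example; the actual database format on the ACROTEC server is not described here.

```python
def build_matrix(records, stations, minutes):
    """Return a stations x time matrix (list of rows); missing readings stay None."""
    row = {s: i for i, s in enumerate(stations)}
    col = {m: j for j, m in enumerate(minutes)}
    matrix = [[None] * len(minutes) for _ in stations]
    for station, minute, value in records:
        matrix[row[station]][col[minute]] = value
    return matrix

def series_for(matrix, stations, minutes, station):
    """Extract the (x, y) pairs of one station, skipping missing readings."""
    values = matrix[stations.index(station)]
    return [(m, v) for m, v in zip(minutes, values) if v is not None]

stations = ["Pegli2", "StationB"]   # "StationB" is a made-up name
minutes = [0, 60, 120]              # x: time in minutes
records = [                         # (station, minute, value), made-up values
    ("Pegli2", 0, 11.5),
    ("Pegli2", 60, 12.1),
    ("StationB", 0, 9.8),
    ("Pegli2", 120, 12.9),
]

obs = build_matrix(records, stations, minutes)
xy = series_for(obs, stations, minutes, "Pegli2")
print(xy)
```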
In the starting mode, after auto-initialization (the
initialization occurs automatically when the user opens the
interface), the system is ready to process the input and extract
data according to the user's choices. Here, all input data are required
to start processing, regardless of the desired output. Required
inputs and available outputs are shown in Table I.

TABLE I
INTERFACE INPUT AND OUTPUT DATA
Required input           All available output
Sensor type              Collecting all available data
Time-Duration            Plotting 2D (x, y)
Desired date (period)    Exporting to Data Mining
Station's name           Creating Datasets
                         Constructing Databases

In Table II, we show the interface articles and their processing
rules, corresponding to the three modes of data collection.

TABLE II
INTERFACE ARTICLES AND PROCESSES
Interface Article      Processing rule
Process                Observations of a particular type of sensor in all stations in Genova
All Data               Observations of a particular type of sensor in all stations in Italy
Station Observation    Observations of all sensors in one station in Genova

From Fig. 3, one can see that the user has lists of input choices.
The choices correspond to these questions for the
user: What is the type of sensor? What is the observation
period? For which date does the user want the measurements?
And at which station?
We will not study the interface from the point of view of
system analysis, so we do not show each article of the
interface and its rules. The goal of this paper is to present an
easily developed tool for environmental emergency management
based on Matlab programming and data mining methods. The
interface is divided into three phases: (1) The input phase,
where the user chooses from the lists. (2) The visualized
output, which takes two forms: a graph and a table
of collected measurements. (3) The data mining phase, which contains
the plot prediction tools, data clustering, data training tools,
and open source software.
The data processing generates two types of extracted data.
The first is auto-exported data, shown in the visualized
output of the interface. This type feeds two data mining tools: the
'Least Squares' and 'ANOCOVA' prediction tools. The other type
of generated data is the databases; here the user
constructs and exports data for the data mining phase with the
'Construct Database' tool, as shown in Fig. 3. The user can
generate six different databases with six different input
choices. Fig. 4 shows an example of a generated database.


Fig. 4 An example of databases saved in the Matlab workspace

Here, we show an example of data observation with
specific inputs.
Example 1: (1) Sensor type: Thermometer. (2) Observing
every 60 minutes. (3) On 11-December-2012. (4) Desired
station is the 'Pegli2' station (see Fig. 5).

Fig. 5 Get Observations interface screenshot, Example 1 in the running mode

D. Data Mining Methods and Applications Inside
The two high-level primary goals of data mining in practice
tend to be prediction and description. Prediction involves
using some variables or fields in the database to predict
unknown or future values of other variables of interest, and
description focuses on finding human-interpretable patterns
describing the data. The goals of prediction and description can be
achieved using a variety of particular data mining methods [2]. Here, we
show some of these methods and their roles in our
system.
Before proceeding with our interface phases, we should
mention one of the important steps in data mining
applications: 'Outlier Detection'. This step is a
primary one in many data mining applications. Johnson defines
an outlier as an 'observation in a data set which appears to be
inconsistent with the remainder of that set of data' [3]. In our
interface, we use 'Tukey's method' (boxplot) and a
customized weighted least squares method with robust
regression to detect outliers. Tukey's method of constructing a
boxplot is a graphical tool to display information about
continuous univariate data, such as the median, lower quartile
(Q1), upper quartile (Q3), lower extreme, and upper extreme
of a data set [4]. In this mode, to minimize the influence of
outliers, we fit our data by robust least squares regression
using the 'Bisquare Weight' method supported by Matlab; this
method minimizes a weighted sum of squares. We have
programmed all these mathematical procedures to simplify
them for users, since not all users in the data analysis field are
able to deal with the pure mathematical models. One piece of
future work we hope to pursue is to develop this filter (i.e. the
outlier detection filter) into an industrial filter.
We do not apply this process to all data, but only upon user
request in the data analysis phases (all data, including extreme
values, are first displayed so that the user is notified, and are then
exported for outlier detection before the analysis). Matlab code
has been programmed to process the data using Tukey's
method. The code consists of data processing for outlier
detection using the definitions of quartile, boxplot, and outlier
removal, with the ability to exchange the value of an outlier with
some proper value that reduces the effect of extreme values
on the regression analysis.
Fig. 6 shows a snapshot of data (measurements for six
days) including outliers, which have been detected by Tukey's
method.

Fig. 6 (a) Outlier detection removal display snapshot. Outliers have
been removed. (b) Outlier detection graphical display snapshot. The
red crosses are the outliers
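The quartile-based detection described above can be sketched as follows. This is a Python illustration, not the authors' Matlab code. It assumes the common boxplot rule that flags values beyond 1.5 times the interquartile range outside Q1 and Q3, and it assumes the median of the remaining values as the 'proper value' used for replacement; neither choice is confirmed by the text, and the temperature readings are made up.

```python
from statistics import median, quantiles

def tukey_fences(values, k=1.5):
    """Lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """Indices of the values that fall outside the fences."""
    low, high = tukey_fences(values, k)
    return [i for i, v in enumerate(values) if v < low or v > high]

def replace_outliers(values, k=1.5):
    """Swap each outlier for the median of the inliers (an assumed choice)."""
    flagged = set(find_outliers(values, k))
    inliers = [v for i, v in enumerate(values) if i not in flagged]
    fill = median(inliers)
    return [fill if i in flagged else v for i, v in enumerate(values)]

# Made-up hourly temperatures with one obviously wrong reading.
temps = [11.0, 11.4, 11.9, 12.3, 12.8, 35.0, 12.5, 12.0, 11.6, 11.2]
print(find_outliers(temps))
print(replace_outliers(temps))
```

Replacing rather than deleting the flagged reading keeps the series on its regular time grid, which matters when the cleaned data are passed on to regression.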
Classification and Clustering: Classification is learning a
function that maps (classifies) a data item into one of several
predefined classes [5]. Clustering is a common descriptive
task where one seeks to identify a finite set of categories or
clusters to describe the data [6]. In our system, we use Fuzzy
C-Means Clustering (FCM), a data clustering
technique in which a dataset is grouped into n clusters, with
every data point in the dataset belonging to every cluster to a
certain degree [7]. The following example (Example 2) shows
a 2D classification using the Fuzzy C-Means technique for
measurements taken from the 'Thermometer' sensor at a
particular station ('Pegli2'), where the values are
observed every 30 minutes for 24 hours on a given day.
Example 2: A plot of 2D classification for measurements
from the 'Thermometer' sensor. We use 3 clusters for this
dataset; the number of desired clusters is chosen by the user,
taking into account the dataset size (see Fig. 7).
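To make the idea of graded membership concrete, here is a small pure-Python sketch of the standard Fuzzy C-Means iteration. It is not the Matlab routine used in the interface: the deterministic choice of starting centers, the fuzziness exponent m = 2, and the (hour, temperature) sample points are assumptions made for this illustration.

```python
def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-9):
    """Return (centers, memberships); memberships[i][c] is the degree
    to which point i belongs to cluster c (each row sums to 1)."""
    n, dim = len(points), len(points[0])
    # Deterministic start (an assumption of this sketch): evenly spaced data points.
    centers = [list(points[round(c * (n - 1) / max(n_clusters - 1, 1))])
               for c in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: closer centers receive a larger share.
        memberships = []
        for p in points:
            dist = [max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                    for ctr in centers]
            memberships.append([1.0 / sum((dist[c] / dist[j]) ** power
                                          for j in range(n_clusters))
                                for c in range(n_clusters)])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for c in range(n_clusters):
            w = [row[c] ** m for row in memberships]
            total = sum(w)
            new_centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / total
                                for d in range(dim)])
        shift = max(abs(a - b) for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships

# Made-up (hour, temperature) readings forming three loose groups.
readings = [(1, 8.0), (2, 7.6), (3, 7.9),
            (12, 15.2), (13, 15.8), (14, 15.5),
            (21, 10.9), (22, 10.4), (23, 10.1)]
centers, u = fuzzy_c_means(readings, 3)
labels = [max(range(3), key=row.__getitem__) for row in u]
print(labels)
```

Unlike a hard clustering, every reading keeps a membership value for all three clusters; the label list only reports the cluster with the largest membership.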


Fig. 7 An example of 2D classification with 3 clusters, for measurements
of the 'Thermometer' sensor; observations over 24 hours, every 30
minutes

Regression is learning a function that maps a data item to a
real-valued prediction variable [2]. In the data mining panel
shown in Fig. 5, we implemented two panels: 'Plot
Prediction tools' and 'Classification and Neural
Network'. In the 'Plot Prediction tools' panel, we first have the Least
Squares Prediction Plot, which is a nonlinear regression method
for prediction. We use the Matlab function 'polytool'
(interactive polynomial fitting), which fits in a least squares sense. The
data for this prediction plot are already available from
the processing phase, under the 'Plot & Export
Data' process, where the values of (x, y) are exported as
mentioned before. The user can use the interface to explore the
effects of changing the fit parameters and to export fit results
to the workspace. From Example 2, one can see the plot of
the measurements; these variables are exported to the least
squares plot, and the prediction curve is shown in Fig. 8 for the
quartic model.
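The least squares polynomial fit behind such a plot can be sketched in a few lines of Python. This is an illustration of the technique only, not a reimplementation of the Matlab tool: it solves the normal equations by Gaussian elimination, and it assumes the time axis has been rescaled to [-1, 1] so that the quartic system stays well conditioned. The sample data are synthetic.

```python
def polyfit_ls(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def polyval(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

# Synthetic data on a rescaled time axis, generated from a known quartic.
xs = [-1 + 0.25 * i for i in range(9)]
ys = [1 + 2 * x - x ** 2 + 0.5 * x ** 4 for x in xs]
quartic = polyfit_ls(xs, ys, 4)
print([round(c, 6) for c in quartic])
```

Changing the `degree` argument gives the linear, quadratic, or cubic models that the interactive tool lets the user switch between.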


Fig. 8 A prediction plot of the quartic model for the measurements of
Example 2, in the Least Squares interface. One can change the
degree to see different models (linear, quadratic, cubic, etc.) and
export variables to the workspace

The second of the plot prediction tools is the 'one-way
analysis of covariance' (ANOCOVA) models. We pass
the measurements (x, y) as auto-exported variables from the
graph. The ANOCOVA models process our measurements and
plot the prediction curves for five different
models (same mean, separate means, same line, parallel lines,
and separate lines). For the 'grouping' mode,
we grouped the measurements by observation period:
with 24 hours of observations and one measurement every
30 minutes, the periods are
(00:00-06:00, 06:00-12:00, 12:00-18:00, 18:00-00:00).
This powerful tool provides three types of
output: (1) An interactive graph of the data and prediction
curves. (2) An ANOVA table. (3) A table of parameter
estimates. Example 3 (Fig. 9) illustrates the ANOCOVA prediction
plot.
Example 3: ANOCOVA prediction plot for the variables
from Example 2, with these features: Model: Separate Lines,
Group: All Groups.
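The 'separate lines' model amounts to giving each group its own intercept and slope, so its fitted lines can be obtained by an independent straight-line fit per group. The Python sketch below shows that idea for the four six-hour periods; it does not reproduce the ANOVA table or the parameter table of the Matlab tool, and the function names and synthetic readings are assumptions made for this example.

```python
def period_of(minute):
    """Group index 0..3 for the periods starting at 00:00, 06:00, 12:00, 18:00.
    The final reading at minute 1440 is kept in the last group."""
    return min(minute // 360, 3)

def fit_line(points):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return my - slope * mx, slope

def separate_lines(series):
    """Fit one line per period: {group: (intercept, slope)}."""
    groups = {}
    for x, y in series:
        groups.setdefault(period_of(x), []).append((x, y))
    return {g: fit_line(pts) for g, pts in sorted(groups.items())}

# Synthetic day: 49 half-hourly readings following a different line per period.
intercepts = [10.0, 1.0, 15.4, 26.2]
slopes = [-0.005, 0.02, 0.0, -0.01]
series = [(x, intercepts[period_of(x)] + slopes[period_of(x)] * x)
          for x in range(0, 1441, 30)]
fits = separate_lines(series)
for group, (intercept, slope) in fits.items():
    print(group, round(intercept, 3), round(slope, 4))
```

With 49 readings the last period holds one reading more than the others, which matches the remark that the groups contain different numbers of values.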
For these two types of prediction plots (Least Squares and
ANOCOVA), we use a design process of data mining
algorithms (see Figs. 9-11). The idea of this design was
developed by Mikut et al. [8].

Fig. 9 An example of ANOCOVA prediction for the same
measurements as in Example 2. In this figure one can see the three
outputs mentioned above for that example. We apply ANOCOVA
using the 'separate lines' model and for all groups (0, 6, 12, 18).
There are 49 sensor measurements: the readings taken every 30 minutes
over 1 day (24 hours). The user can choose any model to see
different fits and, in addition, choose any of the four groups, where
each shows the measurements from 00.00 o'clock with a
different number of variables for each group

R. Mikut et al. [9] implemented the standardized data
mining process proposed in [8] in the toolbox Gait-CAD.
This toolbox is open source software based on Matlab, and
it is operated through a graphical user interface. We edited some of
their design process blocks to meet our processing needs.
The 'Error formulation' has been integrated with the
source code of our interface, where the detection process of
the measurement collections runs over predefined variables
and structures of the sensor measurements, corresponding
to a prior GIS definition; we constructed the source code to
handle on-line data detection.
Nonlinear Regression and Classification methods: these
methods consist of a family of techniques for prediction that
fit linear and nonlinear combinations of basis functions
(sigmoids, splines, polynomials) to combinations of the input
variables [2]. The neural network method is one of the most
common methods of data mining; it is used for classification,
clustering, feature mining, prediction, and pattern recognition.
One type of neural network is the 'feed-forward network', which
regards the perceptron back-propagation model and the
function network as representatives, and is mainly used in
areas such as prediction and pattern recognition.
We have chosen this type of neural network to train our
data (measurements in the database). As mentioned in
Section III, through the data collection process we construct
databases according to our monitoring target. These
databases flow in two ways: (i) as available variables in the
workspace and (ii) as saved files in the directory (.dat
extension files). These data are constructed according to the input
variables that have been processed.

Fig. 10 Design process of data mining algorithms for Least Squares.
The design of this algorithm fits our output needs and supports
the on-line models. Moreover, the processing operations in the 'Get
Observation' interface have been constructed to fit future
development

Fig. 11 Design process of data mining algorithms for ANOCOVA
predictions

These data will be trained with the 'Levenberg-Marquardt
backpropagation algorithm'. The Levenberg-Marquardt
algorithm, which was independently developed by Kenneth
Levenberg and Donald Marquardt, provides a numerical
solution to the problem of minimizing a nonlinear function. It
is fast and has stable convergence [10]. It gives a good
trade-off between the speed of the Newton algorithm and the
stability of the steepest descent method.
Fig. 12 Design process of data mining algorithms for Feedforward
neural network training

The update rule of the Levenberg-Marquardt algorithm can be
written as:

$$w_{k+1} = w_k - \left(J_k^T J_k + \mu I\right)^{-1} J_k e_k$$

where w is the weight, J the Jacobian, μ the combination coefficient,
which is always positive, and I the identity matrix.
As a combination of the steepest descent algorithm and the
Gauss-Newton algorithm, the Levenberg-Marquardt
algorithm switches between the two algorithms during the
training process (see [10]).
The design process of the data mining algorithms for our
neural network has been published to fit the future
opportunity of developing the interface as open source
software.
Therefore, we again use the design of [8] to show the design
of the data mining method (see Fig. 12).
Example 4: this example shows the regression for a small
dataset (see Figs. 13 and 14), which has been constructed from
measurements of the thermometer sensor during three days,
observed every 60 minutes. The input file has been
imported as 'DBASE1', which is a [3x25] matrix representing
static data: 25 samples of 3 elements, with these percentage
selections: 70% of samples for training, 15% for validation,
and 15% for testing.

Fig. 13 Regression, dataset of Example 4

Fig. 14 Neural network training performance, dataset of Example 4

Fig. 15 Interface outputs and programming basis. 'Get Observation'
denotes our interface, which includes the other interfaces in its
panels
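The Levenberg-Marquardt update rule given above can be illustrated on a very small problem. The Python sketch below trains a single tanh unit with one weight and one bias, which is only a stand-in for the feed-forward network trained in Matlab. Several details are assumptions of this sketch: the starting weights, the initial value of μ, and the usual rule of dividing μ by 10 after a successful step and multiplying it by 10 otherwise. The Jacobian is stored with one row per sample, so the last factor of the update is computed as the product of the transposed Jacobian with the error vector e.

```python
import math

def lm_fit(xs, ys, w, mu=0.01, max_iter=200):
    """Fit y = tanh(w[0] * x + w[1]) with Levenberg-Marquardt steps."""
    def errors(weights):
        return [math.tanh(weights[0] * x + weights[1]) - y
                for x, y in zip(xs, ys)]

    def sse(e):
        return sum(v * v for v in e)

    e = errors(w)
    for _ in range(max_iter):
        if sse(e) < 1e-24:
            break
        # Jacobian rows: derivatives of each error w.r.t. w[0] and w[1].
        jac = []
        for x in xs:
            t = math.tanh(w[0] * x + w[1])
            d = 1.0 - t * t
            jac.append((d * x, d))
        # H = J^T J + mu * I (2 x 2) and g = J^T e.
        h00 = sum(a * a for a, _ in jac) + mu
        h01 = sum(a * b for a, b in jac)
        h11 = sum(b * b for _, b in jac) + mu
        g0 = sum(a * v for (a, _), v in zip(jac, e))
        g1 = sum(b * v for (_, b), v in zip(jac, e))
        det = h00 * h11 - h01 * h01
        step = ((h11 * g0 - h01 * g1) / det, (h00 * g1 - h01 * g0) / det)
        trial = [w[0] - step[0], w[1] - step[1]]
        e_trial = errors(trial)
        if sse(e_trial) < sse(e):
            # Success: accept and move toward Gauss-Newton behavior.
            w, e, mu = trial, e_trial, mu / 10.0
        else:
            # Failure: keep the weights and move toward steepest descent.
            mu *= 10.0
    return w

# Synthetic data generated by a unit with weight 1.5 and bias -0.5.
xs = [-2 + 0.5 * i for i in range(9)]
ys = [math.tanh(1.5 * x - 0.5) for x in xs]
w_fit = lm_fit(xs, ys, [0.5, 0.0])
print([round(v, 4) for v in w_fit])
```

A small μ makes the step close to Gauss-Newton, while a large μ shrinks it toward a short steepest descent step, which is the switching behavior described in the text.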
IV. DATA MINING ALGORITHMS AND REVIEW
There are three primary components in any data mining
algorithm: (i) model representation, (ii) model evaluation, and
(iii) search [2], [12]-[16].
Model representation: our interface is operated through a
graphical user interface based on Matlab. This representation
can be divided into two parts: (i) Hidden programming for
data collection, which is based on X-DROPS code; at the end of
this stage the system has been initialized and the discoverable
patterns are described in the Matlab-based graphical user
interface. (ii) Matlab programming, in which the interface has
been built and executed in Matlab code using GUI tools.
The interactive graphical user interface describes the discoverable
patterns as graphs, tables, analyses, numerical results, etc. As
mentioned before, the representation of our interface and the
interfaces inside it can be summarized as shown in Fig. 15.
Another component of data mining algorithms is the model
evaluation criteria. As mentioned in [2], these are
quantitative statements of how well a particular pattern (a
model and its parameters) meets the goals of the knowledge
discovery in databases (KDD) process. KDD is the 'nontrivial
process of identifying valid, novel, potentially useful, and
ultimately understandable patterns in data' [2]. Our interface
meets this definition through its application. The understandable,
constrained plots of measurements, which have been extracted
according to specific user criteria and exported
to the data models (predictive models, data training pattern,
clustering, and data fitting), give the observer a clear idea
of the desired data and their analysis. All of the models have
been tested with randomly chosen user inputs; a test set has
been used to examine the accuracy of the predictive models with
other multi-input levels.
The search method is the third component of a data mining
algorithm. It consists of two components: (1) parameter
search and (2) model search. At this stage the data mining task is
reduced to purely an optimization task: find the parameters
and models from the selected family that optimize the evaluation
criteria [2], [11] (see Fig. 16).
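The split between parameter search and model search can be sketched as follows. In this Python illustration, parameter search is the least squares fit of each candidate model on training data, and model search picks the candidate with the smallest sum of squared errors on held-out data. The two-member family ('mean' and 'line'), the held-out split, and the synthetic data are assumptions of this sketch, not the configuration used in the interface.

```python
def fit_mean(train):
    """Constant model: predict the training average everywhere."""
    mean = sum(y for _, y in train) / len(train)
    return lambda x: mean

def fit_line(train):
    """Straight-line model fitted by ordinary least squares."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    return lambda x: my + slope * (x - mx)

def sum_squared_error(model, data):
    """Evaluation criterion: sum of squared prediction errors."""
    return sum((model(x) - y) ** 2 for x, y in data)

def model_search(train, validation, family):
    """Fit every candidate (parameter search), then keep the one with
    the lowest validation error (model search)."""
    scores = {name: sum_squared_error(fit(train), validation)
              for name, fit in family.items()}
    return min(scores, key=scores.get), scores

# Synthetic readings: a rising trend with a small alternating offset.
points = [(x, 10.0 + 0.5 * x + (0.1 if x % 2 else -0.1)) for x in range(12)]
train, validation = points[::2], points[1::2]
best, scores = model_search(train, validation,
                            {"mean": fit_mean, "line": fit_line})
print(best, {k: round(v, 3) for k, v in scores.items()})
```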


Fig. 16 Interface data flow and process from the search model up to the
extracted data. More details about the search model have been
discussed in the previous sections of this paper
V. CONCLUSION
The aim of this tool is to provide an interface for
applying data mining methods in environmental
applications. This interface plays an intermediate, cooperative
role between two fields: environmental monitoring and data
mining. We use this interface to collect data from sensors
(extracted from the database), to apply data mining methods
to these sets of data, and finally to extract these data in an
understandable and flexible model to be used in the decision
support system. The system has been designed to fit real
on-line observation, and the algorithms have been constructed
flexibly to meet future needs. In an upgrade phase, this interface
will be developed into open source
software.
ACKNOWLEDGMENT
The authors would like to extend grateful thanks to Professor
Angelo Alessandri and Professor Patrizia Bagnerini from the
Department of Mathematical Engineering and Simulation at the
University of Genova, and to Cosimo Versaci from ACROTEC
Company, for their guidance and their appreciated efforts. In
particular, the authors wish to acknowledge all members of
the project (N.I.E.) for providing the data on which this tool was
based.
REFERENCES
[1] Jessica Spate, Karina Gibert, Miquel Sànchez-Marrè, Eibe Frank,
Joaquim Comas, Ioannis Athanasiadis, Rebecca Letcher, "Data Mining
as a Tool for Environmental Scientists", AI Magazine, Vol. 17, 1996.
[2] Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth, "From
Data Mining to Knowledge Discovery in Databases", Article.
[3] Richard Johnson and Dean Wichern, "Applied Multivariate Statistical
Analysis", 1992.
[4] John W. Tukey, "Exploratory Data Analysis", 1977.
[5] Sholom M. Weiss and Casimir A. Kulikowski, "Classification and
prediction methods from statistics, neural nets, machine learning, and
expert systems", 1991.
[6] Anil K. Jain and Richard C. Dubes, "Algorithms for data clustering",
1988.
[7] MathWorks, http://www.mathworks.it/index.html
[8] R. Mikut, M. Reischl, O. Burmeister, T. Loose, "Data Mining in
Medical Time Series", Biomedizinische Technik, vol. 51, pp. 288-293,
2006.
[9] R. Mikut, O. Burmeister, S. Braun, M. Reischl, "The open source Matlab
toolbox Gait-CAD and its application to bioelectric signal processing",
Paper, Institute for Applied Computer Science, Forschungszentrum
Karlsruhe GmbH, Germany.
[10] Hao Yu and B. M. Wilamowski, "Levenberg-Marquardt Training",
Industrial Electronics Handbook, vol. 5, Intelligent Systems, 2nd
Edition, chapter 12, pp. 12-1 to 12-15, CRC Press, 2011.
[11] NIE project documents, ACROTEC Company. http://www.acrotec.it
[12] Muhammad Aqil, Ichiro Kita, Akira Yano, and Nishiyama Soichi,
"Decision Support System for Flood Crisis Management using Artificial
Neural Network", International Journal of Electrical and Computer
Engineering 1:5, 2006.
[13] David Hand, Principles of Data Mining. Massachusetts Institute of
Technology, 2001.
[14] Feng Jiansheng, KDD and its applications, BaoGang Techniques, 1993.
[15] Xianjun Ni, Research of Data Mining Based on Neural Networks. World
Academy of Science, Engineering and Technology, 39, 2008.
[16] Ben-Gal, Data Mining and Knowledge Discovery Handbook, chapter
one. Kluwer Academic Publishers, 2005.




