
IEEE TRANSACTIONS ON RELIABILITY, VOL. 52, NO. 1, MARCH 2003

Computer Intrusion Detection Through EWMA for Autocorrelated and Uncorrelated Data
Nong Ye, Senior Member, IEEE, Sean Vilbert, and Qiang Chen
Abstract: Reliability and quality of service from information systems have been threatened by cyber intrusions. To protect information systems from intrusions and thus assure reliability and quality of service, it is highly desirable to develop techniques that detect intrusions. Many intrusions manifest in anomalous changes in the intensity of events occurring in information systems. In this study, we apply, test, and compare two EWMA techniques to detect anomalous changes in event intensity for intrusion detection: EWMA for autocorrelated data and EWMA for uncorrelated data. Different parameter settings and their effects on the performance of these EWMA techniques are also investigated to provide guidelines for practical use of these techniques.

Index Terms: Anomaly detection, computer audit data, exponentially weighted moving average (EWMA), information assurance, intrusion detection.

NOTATION
$i$    time or identification of an observation
$k(i)$    event intensity by counting the number of events per second
$n$    parameter in an $n$-period moving average system
$t(i)$    system time of event $i$
$x(i)$    observation of the smoothed event intensity for event $i$
$z(i)$    EWMA statistic for observation $x(i)$
$\lambda$    parameter to determine the EWMA statistic
$\mu_x$    mean of $x(i)$
$\sigma_x$    standard deviation of $x(i)$
$\mu_z$    mean of $z(i)$
$\sigma_z$    standard deviation of $z(i)$
$\mathrm{LCL}_z$    LCL of $z(i)$
$\mathrm{UCL}_z$    UCL of $z(i)$
$e(i)$    1-step-ahead prediction error
$\sigma_e$    standard deviation of $e(i)$
$\hat{\sigma}_e(i)$    estimated standard deviation of $e(i)$
$\mathrm{LCL}_e$    LCL of $e(i)$
$\mathrm{UCL}_e$    UCL of $e(i)$
$L$    parameter to determine LCL and UCL
$\mathrm{LCL}_x(i)$    LCL of $x(i)$
$\mathrm{UCL}_x(i)$    UCL of $x(i)$
$\alpha$    parameter to determine the smoothed variance of $e(i)$
$\gamma$    smoothing constant to determine the $x(i)$

I. INTRODUCTION

INTRUSIONS into information systems have presented important threats to the reliability and QoS of systems, causing faults and failures in systems, and interrupting services to users [1]-[4]. Intrusions can take many forms: denying services by flooding system resources (e.g., communication channels, servers, memory, and CPU), rapidly propagating a virus or worm, gaining privileges of root users to perform malicious actions, etc. To protect information systems from intrusions and thus assure reliability and QoS of systems, it is highly desirable to develop techniques that detect intrusions into information systems in real time while intrusive activities are occurring.

This paper focuses on intrusion-detection techniques for assuring reliability and QoS of information systems. There are extensive studies on faults and failures in information systems in the reliability community, covering software reliability [5], classification of faults [5], [6], fault detection

ACRONYMS¹
AFRL-Rome    U.S. Air Force Research Lab - Rome, NY
ASU-ISAL    Arizona State University - Information and Systems Assurance Lab.
BSM    basic security module
CPU    central processing unit
CUSUM    cumulative sum
DARPA    U.S. Defense Advanced Research Projects Agency
EWMA    exponentially weighted moving average
EWMV    exponentially weighted moving variance
ID    identification
MIT-LL    Massachusetts Inst. of Technology - Lincoln Lab.
LCL    lower control limit
NIDES    next-generation intrusion detection expert system
OS    operating system
QoS    quality of service
SPARC    a specific model of Sun workstations
SPC    statistical process control
UCL    upper control limit
UNIX    UNIX OS for computers, e.g., Sun workstations
WinNT    Windows-NT OS for personal computers
Manuscript received July 3, 2000; revised May 15, 2001. This work was supported in part by the U.S. Defense Advanced Research Project Agency (DARPA)/Air Force Research Laboratory - Rome (AFRL-Rome) under Grant F30602-99-1-0506. Responsible Editor: J. H. Lambert. N. Ye is with the Information and Systems Assurance Laboratory, Arizona State University, Tempe, AZ 85287 USA (e-mail: NongYe@asu.edu). S. Vilbert and Q. Chen are with the Department of Industrial Engineering, Arizona State University, Tempe, AZ 85287 USA. Digital Object Identifier 10.1109/TR.2002.805796
¹The singular and plural of an acronym are always spelled the same.

0018-9529/03$17.00 2003 IEEE


[7], [8], and fault-tolerant system design [9]-[12]. However, intrusions into information systems, which account for many real-world reliability incidents and QoS problems in information systems, have not been widely studied using theories and techniques of quality and reliability engineering. Existing intrusion detection techniques [13]-[35] are mostly based on artificial intelligence techniques such as expert systems, formal logic, artificial neural networks, data mining, pattern matching, etc.

A detailed review of existing intrusion detection techniques is in [13]. In general, existing intrusion detection techniques fall into 2 categories: signature recognition and anomaly detection [13], [14]. Signature-recognition techniques [15]-[20], also referred to as misuse detection in [13], [14], match current activities in an information system with signatures of known intrusions, and signal intrusions when there is a match. Strings, production rules, colored Petri nets, state transition diagrams, decision trees, and cluster structures have been used to represent and store intrusion signatures. Anomaly-detection techniques build a profile of activities in an information system in a usual operation condition (a norm profile), and detect differences from the norm profile as anomalies or possible intrusions [21]-[35]. Strings, formal logic, artificial neural networks, and frequency histograms have been used to represent the norm profile.

ASU-ISAL has been investigating intrusions and their impacts on quality/reliability engineering of information systems. Several intrusion detection techniques have been developed, based on SPC theories and techniques [19]-[28]. This paper presents the work on applying one kind of SPC technique, EWMA, to intrusion detection for monitoring and detecting intrusions that manifest through anomalous changes in the intensity of events in an information system.

Many intrusions into information systems manifest through a statistically significant increase or decrease in the intensity of events occurring in information systems. For example, in typical denial-of-service attacks, an overwhelming number of service requests can be sent to a server (e.g., a web server) of an information system over a short period of time to deplete the computational resources of the server and thus deny the server's ability to respond to users' service requests. Such denial-of-service attacks increase the intensity of events on the server.
In many virus or worm attacks through e-mail servers, the number of e-mails received over a short period of time also increases abruptly during the attacks. Intruders who have gained super-user privileges can disable many resources in the information system, resulting in an abruptly decreased intensity of events. Therefore, the early detection of statistically significant changes in the event intensity can help stop many intrusions early, protecting information systems and assuring their reliability and QoS.

The event intensity is the number of events per time unit. It can be considered as a continuous-value variable measuring activities in an information system. Existing work on monitoring

a single continuous-value variable of activities (e.g., the intensity of events) for intrusion detection is mainly based on an anomaly detection technique developed for NIDES [33], [34]. This technique divides values of a continuous-value variable into bins, computes the frequency of activities falling into each bin, and uses the frequency histogram of the bins from long-term normal activities as the norm profile to detect large deviations of current activities from the norm profile, based on some intuitive statistics (called the Q statistic and the S statistic). The technique is not robust to non-normality of data. This paper applies EWMA techniques, which are robust to non-normality of data, to intrusion detection for monitoring and detecting statistically significant changes in the event intensity.

Section II reviews a variety of EWMA techniques, and describes the 2 EWMA techniques that are investigated in this study (EWMA for autocorrelated data and EWMA for uncorrelated data). Section III gives detailed information about the data of events used in this study, and the application of the EWMA techniques to this set of data for intrusion detection. Section IV presents the testing results, and discusses the effects of various parameter settings for the EWMA techniques on the performance of these techniques.

II. EWMA TECHNIQUES

SPC techniques have typically been used for monitoring and controlling the quality of manufacturing processes. SPC techniques can be univariate or multivariate, and detect changes in process mean (mean shifts), process variance (variance changes), relationship among multiple variables (counter-relationship), etc. [36], [37]. This paper focuses on detecting statistically significant changes of event intensity for intrusion detection. The event intensity is a single variable measuring one characteristic of events in an information system. Hence, this paper considers only univariate SPC techniques to detect mean shifts in the event intensity as anomalies or possible intrusions.
Shewhart control charts, CUSUM control charts, and EWMA control charts are univariate SPC techniques that are typically used to detect mean shifts [36]-[42]. EWMV control charts [37], [43] are designed to detect variance changes, but can also be sensitive to mean shifts. EWMA control charts are robust to non-normality of data [37], [39]. Since the normality of the intensity of events occurring in an information system cannot be guaranteed, only EWMA control charts are considered in this study. EWMA was first suggested in [39]. More recent descriptions of EWMA are in [37], [38]. A description of multivariate EWMA is in [44].

If data contain a sequence of uncorrelated process observations, $x(i)$, then the EWMA control chart plots $z(i)$, computed as [37]:

$$z(i) = \lambda\, x(i) + (1 - \lambda)\, z(i-1) \tag{1}$$

where $\lambda$ ($0 < \lambda \le 1$) is the parameter that determines the EWMA statistic. The $\mu_z$ and $\sigma_z$ of $z(i)$ are [37]:

$$\mu_z = \mu_x \tag{2}$$

$$\sigma_z = \sigma_x \sqrt{\frac{\lambda}{2 - \lambda}} \tag{3}$$
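The EWMA recursion and the mean and standard deviation of the EWMA statistic can be illustrated with a short sketch. The function names `ewma` and `ewma_mean_std` and the toy numbers are illustrative assumptions, and the standard deviation is taken in its usual large-sample form, which is assumed here to be what the chart uses.

```python
import math

def ewma(xs, lam, z0):
    """Return z(1..n), where z(i) = lam * x(i) + (1 - lam) * z(i-1) and z(0) = z0."""
    zs, z = [], z0
    for x in xs:
        z = lam * x + (1.0 - lam) * z
        zs.append(z)
    return zs

def ewma_mean_std(mu_x, sigma_x, lam):
    """Mean and (large-sample) standard deviation of the EWMA statistic."""
    return mu_x, sigma_x * math.sqrt(lam / (2.0 - lam))

# Toy data (assumed): two in-control observations, then a jump.
zs = ewma([10.0, 10.0, 20.0], lam=0.2, z0=10.0)
mu_z, sigma_z = ewma_mean_std(mu_x=10.0, sigma_x=3.0, lam=0.2)
```

With a weight of 0.2, the jump from 10 to 20 moves the statistic only to 12, and the statistic's standard deviation is one third of that of the observations, which is why an EWMA chart reacts smoothly rather than to single outliers.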


$\mu_x$ and $\sigma_x$ can be estimated from training data before testing. The LCL and UCL for the EWMA control chart are [37]:

$$\mathrm{LCL}_z = \mu_z - L\,\sigma_z, \qquad \mathrm{UCL}_z = \mu_z + L\,\sigma_z \tag{4}$$

For the 5% significance level, $L = 1.96$. If $z(i)$ falls outside the LCL and UCL, an anomaly is detected and an alarm signal is generated.

If data contain a sequence of autocorrelated process observations, $x(i)$, then the EWMA statistic in (1) can be used to provide a 1-step-ahead prediction model of autocorrelated data when the process mean does not drift too quickly [40]. The 1-step-ahead prediction for $x(i)$ is $z(i-1)$. The 1-step-ahead prediction error is [40]:

$$e(i) = x(i) - z(i-1) \tag{5}$$

The $\lambda$ in (1) can be set by minimizing the sum of squared 1-step-ahead prediction errors on the training data [40], or in a different way. The $e(i)$ are independently distributed with mean 0 and standard deviation $\sigma_e$. The EWMA control chart plots $e(i)$. The LCL and UCL are [40]:

$$\mathrm{LCL}_e = -L\,\sigma_e, \qquad \mathrm{UCL}_e = L\,\sigma_e \tag{6}$$

The EWMA control chart on $e(i)$ with (6) is equivalent to the EWMA control chart on $x(i)$ with [40]:

$$\mathrm{LCL}_x(i) = z(i-1) - L\,\sigma_e, \qquad \mathrm{UCL}_x(i) = z(i-1) + L\,\sigma_e \tag{7}$$

$L = 1.96$ for the 5% significance level; $\sigma_e$ can be estimated by calculating a smoothed variance [40]:

$$\hat{\sigma}_e^2(i) = \alpha\, e^2(i) + (1 - \alpha)\, \hat{\sigma}_e^2(i-1) \tag{8}$$

Reference [40] suggests that smaller values of $\alpha$ are preferred. Other variations of EWMA for autocorrelated data are in [40].

Events in an information system are usually autocorrelated, because users usually carry out a series of related commands in order to complete a given task. Hence, data of event intensity in an information system can be autocorrelated. In this study, both EWMA for autocorrelated data and EWMA for uncorrelated data are applied to data of event intensity in an information system. The 2 EWMA techniques are compared with regard to their performance for intrusion detection.

III. APPLICATION

This section describes the intrusion-detection problem, including the data source, observation of event intensity, training data, testing data, and application of the EWMA techniques to the intrusion-detection problem.

A. Data Source

An information system typically consists of host machines (e.g., machines running a UNIX operating system and machines running a Windows NT OS) and communication links connecting those host machines to form a computer network. Two sources of data are widely used to capture events in an information system for intrusion detection: network traffic data and audit trail data (audit data).

Network traffic data contain data packets traveling over communication links between host machines, and capture events over communication networks. Audit data capture events occurring on a host machine. This study uses audit data from a UNIX-based host machine (a Sun SPARC 10 workstation with a Solaris OS), and focuses on intrusions into a host machine that leave trails in computer audit data. The Solaris OS from Sun Microsystems Inc. has a security facility, BSM, which monitors audit events on a host machine. There are over 250 types of BSM auditable events, depending on the version of the Solaris OS. Because there are about 284 types of BSM audit events on the host machine, 284 event types are considered in this study. A BSM audit record for each event contains a variety of information, including the time of the event, event type, user ID, group ID, process ID, session ID, and system object accessed. This study is concerned with the intensity of events. Hence, only the times of audit events are extracted from the audit data and used for intrusion detection. The time unit is seconds.

B. Training and Testing Data

This study obtains a collection of audit data for usual events from MIT-LL. Those events are generated by simulating activities observed in a real computer network system in a usual operation condition. The audit data contain a sequence of 3019 audit events lasting 580 seconds. The intensity of the 3019 audit events ranges from 0 to 135 events/second. The first part of the audit data, consisting of 1613 audit events lasting 381 seconds, is used as the training data. The second part of the audit data, consisting of 1406 audit events lasting 199 seconds, is used for testing. Hence, the number of usual events in the training data set is similar to the number of usual events in the testing data set.
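Because only event times are kept, the event intensity reduces to a per-second count of timestamps. A minimal sketch follows, assuming the timestamps are plain numbers in seconds; the function name `event_intensity` and the sample values are hypothetical and do not reflect the BSM record format.

```python
import math
from collections import Counter

def event_intensity(times):
    """Count events in each whole second from the first to the last event.

    times: event timestamps in seconds (need not be integers).
    Returns a list whose j-th entry is the number of events in the j-th
    second; seconds that contain no events are reported as 0.
    """
    if not times:
        return []
    seconds = [math.floor(t) for t in times]
    counts = Counter(seconds)
    return [counts.get(s, 0) for s in range(min(seconds), max(seconds) + 1)]

# Hypothetical timestamps: three events in one second, none in the next, one after.
k = event_intensity([100.1, 100.7, 100.9, 102.3])
```

Reporting empty seconds as 0 matters here, since the observed intensity in the usual data ranges down to 0 events/second.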
The data set of the 3019 audit events is divided into 2 parts (for training and for testing) based on the number of audit events, not the time, because other intrusion detection techniques [19]-[28] obtain their observations from the audit data based on the type of individual events, rather than on the time of these events. These different intrusion detection techniques are tested on the same training data and the same testing data for comparison.

Added to the 1406 usual events in the testing data set are 10 107 intrusive events that are generated by simulating a denial-of-service attack. These 10 107 intrusive events, lasting 10 seconds, are generated by simulating events at a much higher level of event intensity than that of the usual events. The simulation of the 10 107 audit events is based on a normal distribution of the event intensity with a mean of 1000 events/second and a standard deviation of 50 events/second. These 10 107 intrusive events are inserted in the middle of the 1406 usual events. Hence, the testing data contain a total of 11 513 audit events with 3 segments of data in the sequence: 703 usual events (the first half of the 1406 usual events),


10 107 intrusive events, and 703 usual events (the second half of the 1406 usual events). The time gap from the last event in one segment of audit data to the first event in the next segment is 1 second.

C. Observation of the Event Intensity

Event intensity can be measured by counting the number of events per second, $k(i)$. For example, the training data consist of 1613 audit events for 381 seconds, and produce a sequence of 381 observations of the event intensity (number of events per second) as shown in Fig. 1. The testing data consist of 11 513 audit events for 209 seconds, and produce a sequence of 209 observations of the event intensity as shown in Fig. 2. Figs. 1 and 2 also show the autocorrelation of the observations of the event intensity from the audit data.

Fig. 1. Observations of the event intensity, $k(i)$, from the training data.

Fig. 2. Observations of the event intensity, $k(i)$, from the testing data.

This study uses the smoothed event intensity, $x(i)$, obtained by smoothing the observations of the event intensity in the recent past as follows:

$$x(i) = \gamma\, k(i) + (1 - \gamma)\, x(i-1) \tag{9}$$

where $\gamma$ is the smoothing constant. A detailed description of the exponential smoothing method is in [45]. The smoothing reduces the effect of outliers or wild values in the observations of the event intensity [45]. The smoothing constant $\gamma$ determines the decay rate or the aging weight of the past observations in computing the smoothed event intensity at the present time. The event intensity at the present time, $k(i)$, receives a weight of $\gamma$; the observation $k(i-1)$ receives a weight of $\gamma(1-\gamma)$, and the observation at time $i-j$ receives a weight of $\gamma(1-\gamma)^j$. Hence, if the interest is in a long-term trend of past observations, then use a small value of $\gamma$. If the interest is in the short-term trend of the observations in the recent past, then use a large value of $\gamma$, e.g., 0.2. In general, if the exponential smoothing system is to be equivalent to an $n$-period moving average system, then [40], [45]:

$$\gamma = \frac{2}{n + 1} \tag{10}$$

This study uses the exponential smoothing method to obtain the short-term trend of the observations in the recent past. Hence, 2 values of $\gamma$, 0.2 and 0.3, are used and compared.

Events in an information system occur at intervals shorter than 1 second, as shown in Figs. 1 and 2. Hence, the measure of the event intensity in terms of the number of events per second is based on the sampling of events every second. However, sampling the data every second can leave a time gap for intrusive events, such as those from a denial-of-service attack, to damage the information system before the next data sample is taken, because a denial-of-service program can generate hundreds or even thousands of events automatically within a second. To prevent this, (9) is transformed into an equivalent form that is used to update the smoothed event intensity $x(i)$ at each event $i$:

$$x(i) = \gamma + (1 - \gamma)^{\,t(i) - t(i-1)}\, x(i-1) \tag{11}$$

where $t(i)$ is the system time of event $i$ in whole seconds. Table I shows the computation of the smoothed event intensity according to (11) in comparison with the computation according to (9), for a given sequence of events. Table I shows that (9) updates the smoothed event intensity every second, whereas (11) updates the smoothed event intensity at every event but produces the same smoothed event intensity at the end of every second.

D. Application of EWMA Techniques to Intrusion Detection

Both EWMA for autocorrelated data and EWMA for uncorrelated data are tested to compare their performance for intrusion detection. Application of the EWMA technique for autocorrelated data to intrusion detection takes 2 steps:

1) Training. Because the training data set consists of 1613 usual events for 381 seconds, first obtain the sequence of 381 observations of the event intensity by computing the number of events for each second; the average event intensity of these 381 observations is used as the initial value of the smoothed event intensity, $x(0)$, in (11), and as the initial value of the EWMA statistic, $z(0)$, in (1). Then for each audit event $i$ in the training data set, use (11) to obtain $x(i)$, the observation of the smoothed event intensity, and use (1) to compute the EWMA statistic $z(i)$ for $x(i)$. Finally, the average of the 1613 values of $x(i)$ is computed and used as $z(0)$ for the testing data. The smoothed event intensity for the last event in the training data set, $x(1613)$, is used as the initial value, $x(0)$, for the testing data. For the 1613 events in the training data set, the sum of the squared 1-step-ahead prediction errors divided by 1613 is used as the initial value of the smoothed variance, $\hat{\sigma}_e^2(0)$, for the testing data.

2) Testing. For each audit event $i$ in the testing data set, first use (11) to obtain $x(i)$, then (1) to obtain $z(i)$, (8) to compute the estimated $\hat{\sigma}_e(i)$, and (7) to compute $\mathrm{UCL}_x(i)$ and $\mathrm{LCL}_x(i)$. Then determine whether $\mathrm{LCL}_x(i) \le x(i) \le \mathrm{UCL}_x(i)$. If not,


an alarm signal is produced on this event; otherwise, no signal is produced.

TABLE I. COMPUTATION OF THE SMOOTHED EVENT INTENSITY FOR $\gamma = 0.2$

TABLE II. VARIOUS VALUES FOR PARAMETERS USED IN THE EWMA TECHNIQUE FOR AUTOCORRELATED DATA

Table II shows various values for the parameters in the above formulas that are investigated for EWMA for autocorrelated data. The reason for using $\gamma = 0.2$ in (11) is given in Section III-C. Reference [40] suggests that the value of $\lambda$ for the EWMA statistic in (1) be chosen to minimize the sum of squared 1-step-ahead prediction errors on the training data. This method yields $\lambda = 0.85$ in Table I. For this $\lambda$ value, several values of $\alpha$ (0.05, 0.001, 0.0001) for the smoothed variance in (8) are investigated for comparison. In addition to $\lambda = 0.85$, the $\lambda$ values (0.05, 0.001, 0.0001, 0.000 01) are also investigated for comparison. For each of these additional $\lambda$ values, $\alpha$ is set to the same level as the $\lambda$ value (0.05, 0.001, 0.0001, or 0.000 01). In addition to $\gamma = 0.2$, $\gamma = 0.3$ is also investigated for comparison, along with values of $\lambda$ and $\alpha$ similar to those for $\gamma = 0.2$. $L = 1.96$ is used for all the combinations of parameter values. In addition, $L = 3$ is investigated for three of the parameter combinations, which yield poor performance as shown in Section IV; the $L$ value is increased from 1.96 to 3 to see if this change of $L$ makes any performance difference.

The application of the EWMA technique for uncorrelated data to intrusion detection takes the following 2 steps.

1) Training: Because the training data set consists of 1613 usual events for 381 seconds, first obtain the sequence of 381 observations of the event intensity by computing the number of events for each second. The average event intensity of these 381 observations is used as $x(0)$ in (11), and as $z(0)$ in (1). Then for each audit event $i$ in the training data set, use (11) to obtain the observation $x(i)$, and use (1) to compute the EWMA statistic $z(i)$ for $x(i)$. Finally, the average of the 1613 values of $x(i)$ is computed and used as $z(0)$ for the testing data. This average is also used as the estimated $\mu_x$ in (2) during testing. The $x(i)$ for the last event in the training data set, $x(1613)$, is used as the initial value, $x(0)$, for the testing data. The standard deviation of $x(i)$ for the training data is used as the estimated $\sigma_x$ in (3) during testing. Equations (2) and (3) are used to compute $\mu_z$ and $\sigma_z$. Equation (4) is then used to compute $\mathrm{UCL}_z$ and $\mathrm{LCL}_z$, which are used during testing.

2) Testing: For each audit event $i$ in the testing data set, first use (11) to obtain $x(i)$, and use (1) to obtain $z(i)$. Then evaluate whether $z(i) \in [\mathrm{LCL}_z, \mathrm{UCL}_z]$. If not, an alarm signal is produced on this event; otherwise, no signal is produced.

As discussed in Section II and shown in Figs. 1 and 2, the data of the event intensity in the information system are autocorrelated. The data of the smoothed event intensity are also autocorrelated because of the smoothing method. Hence, the EWMA technique for uncorrelated data is tested for comparison with the EWMA technique for autocorrelated data only for the $\gamma$ and $\lambda$ that yield the best performance for the EWMA technique for autocorrelated data among all the parameter combinations, as shown in Section IV. The value of $L$ in (4) is also set to 1.96 at the significance level of 0.05.

IV. RESULTS AND DISCUSSIONS

The test data consist of the sequence: 703 usual events, 10 107 intrusive events, 703 usual events. For each EWMA technique with a given parameter combination, compute the number of false alarms and the number of hits. A false alarm is a signal on a usual event. A hit is a signal on an intrusive event. Also, check how soon the intrusion is detected by noting the first intrusive event with a signal. Table III shows the number of hits, the number of false alarms, and the first signaled intrusion event for the 2 EWMA techniques with different combinations of parameter values.
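The per-event smoothing and the two testing procedures described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the per-event update assumes that the previous value decays by one factor of (1 - gamma) per elapsed whole second and that each event contributes gamma, the smoothed variance is updated before the limits are checked (following the order of the testing step), and the toy inputs are invented.

```python
import math

def smooth_per_event(seconds, gamma, x0):
    """Smoothed event intensity updated at every event (assumed update form).

    seconds: whole-second timestamp of each event, in non-decreasing order.
    x0 is taken to be the smoothed value one second before the first event.
    """
    xs, x = [], x0
    prev = seconds[0] - 1 if seconds else 0
    for s in seconds:
        x = gamma + (1.0 - gamma) ** (s - prev) * x
        xs.append(x)
        prev = s
    return xs

def detect_autocorrelated(xs, lam, alpha, L, z0, var0):
    """Signal when x(i) falls outside z(i-1) -/+ L * estimated sigma_e."""
    signals, z, var = [], z0, var0
    for x in xs:
        e = x - z                                    # 1-step-ahead prediction error
        var = alpha * e * e + (1.0 - alpha) * var    # smoothed variance of the error
        signals.append(abs(e) > L * math.sqrt(var))  # True means alarm
        z = lam * x + (1.0 - lam) * z                # EWMA statistic for next step
    return signals

def detect_uncorrelated(xs, lam, L, z0, mu_x, sigma_x):
    """Signal when z(i) falls outside the fixed limits mu_z -/+ L * sigma_z."""
    sigma_z = sigma_x * math.sqrt(lam / (2.0 - lam))
    lcl, ucl = mu_x - L * sigma_z, mu_x + L * sigma_z
    signals, z = [], z0
    for x in xs:
        z = lam * x + (1.0 - lam) * z
        signals.append(not (lcl <= z <= ucl))
    return signals

def score(signals, intrusive):
    """Return (hits, false_alarms) given a per-event flag for intrusive events."""
    hits = sum(1 for s, a in zip(signals, intrusive) if s and a)
    false_alarms = sum(1 for s, a in zip(signals, intrusive) if s and not a)
    return hits, false_alarms

# Toy inputs (assumed): two events in second 0, one in second 1, one in second 3.
xs_demo = smooth_per_event([0, 0, 1, 3], gamma=0.2, x0=0.0)
auto_demo = detect_autocorrelated([10.0] * 5 + [100.0], lam=0.2, alpha=0.1,
                                  L=1.96, z0=10.0, var0=1.0)
unc_demo = detect_uncorrelated([10.0, 10.0, 100.0], lam=0.2, L=1.96,
                               z0=10.0, mu_x=10.0, sigma_x=3.0)
```

In the toy smoothing run, the value after the second event (0.4) equals the per-second smoothing of a count of 2 starting from 0, which is the end-of-second agreement the text describes. The structural difference between the two detectors is visible in the code: the limits of the first move with the previous EWMA value, while those of the second are fixed after training.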
TABLE III. PERFORMANCE RESULTS OF THE 2 EWMA TECHNIQUES WITH DIFFERENT COMBINATIONS OF PARAMETER VALUES

Fig. 3. Performance of EWMA for autocorrelated data with $\gamma = 0.2$, $\lambda = \alpha = 10^{\ldots}$, $L = 1.96$. (a) Plot of $x(i)$, $\mathrm{LCL}_x(i)$, $\mathrm{UCL}_x(i)$. (b) Plot of signal$(i)$.

The examination of the performance of the EWMA technique for autocorrelated data leads to 5 findings:

1) Table III shows that, among all the parameter settings, the setting of Fig. 3 ($\gamma = 0.2$, $\lambda = \alpha$, and $L = 1.96$; highlighted by an underline in Table I) produces the best performance, with 5392 hits out of the 10 107 intrusive events, no false alarm out of the 1406 usual events, and an early detection at intrusion event #958. Fig. 3 shows more performance details for this parameter setting.

2) For this parameter setting, signals are not produced for all the 10 107 intrusive events. Signals are produced only for the early intrusive events. As the center line and the LCL and UCL of the control chart gradually adjust to the smoothed intensity level of the intrusive events, no signals are produced on the later intrusive events. This is acceptable because the early detection of the intrusive events should already trigger actions to stop the later intrusive events. The early detection of the intrusive events is more important than a 100% hit rate for all the intrusive events.

3) For this parameter setting, changing $\gamma$ from 0.2 to 0.3 does not make much performance difference.

4) Larger values of $\lambda$, such as 0.05 and 0.85 (the optimal value for the time-series data of the smoothed event intensity), produce smaller prediction errors, thereby resulting in a smaller estimated $\hat{\sigma}_e$ and smaller in-control ranges, as shown in Fig. 4. This produces false alarms on both the first 703 usual events and the last 703 usual events. Even increasing $L$ from 1.96 to 3 does not help much in overcoming this difficulty, as shown in Table III. When $x(i)$ represents the short-term trend of the event intensity in the recent past, $\lambda$ and $\alpha$, as the parameters that determine the LCL and UCL, should reflect the long-term trend of the event intensity. This is why the parameter setting in Finding 1 produces good performance on the smoothed event intensity.

5) If $\lambda$ and $\alpha$ are too small, such as 0.000 01, the center line and the LCL and UCL of the EWMA control chart become too sluggish to follow the long-term trend of the smoothed event intensity, thereby resulting in false alarms in the last 703 usual events when the smoothed event intensity drops after the intrusive events, as shown in Fig. 5 and Table III.

The performance of the EWMA technique for uncorrelated data is shown in Fig. 6 and Table III. Because the LCL and UCL do not change, there are false alarms on all the last 703 usual events, whose smoothed event intensity still carries over the high level of the smoothed event intensity of the 10 107 intrusive events that precede them.
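The contrast between Findings 4 and 5 can be made concrete with the standard equivalence between an exponential-smoothing constant $c$ and an $n$-period moving average, $c = 2/(n+1)$. As a rough worked illustration, and assuming this relation may also be applied to the EWMA parameter $\lambda$ (an assumption made here for illustration, not a statement from the study):

```latex
n = \frac{2}{c} - 1:\qquad
c = 0.85 \;\Rightarrow\; n \approx 1.35,\quad
c = 0.2 \;\Rightarrow\; n = 9,\quad
c = 0.05 \;\Rightarrow\; n = 39,\quad
c = 10^{-5} \;\Rightarrow\; n = 199\,999 .
```

Since the charts are updated at every event and the testing data contain 11 513 events, an equivalent window of roughly 200 000 events is far longer than the whole test sequence, which is consistent with the sluggish limits of Finding 5. Windows of a few events to a few dozen events follow the short-term level almost as closely as the smoothed intensity itself, which is consistent with the narrow in-control ranges of Finding 4.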
Fig. 4. Performance of EWMA for autocorrelated data with $\gamma = 0.2$, $\lambda = \alpha = 0.5$, $L = 1.96$. (a) Plot of $x(i)$, $\mathrm{LCL}_x(i)$, $\mathrm{UCL}_x(i)$. (b) Plot of signal$(i)$.

Fig. 5. Performance of EWMA for autocorrelated data with $\gamma = 0.2$, $\lambda = \alpha = 10^{-5}$, $L = 1.96$. (a) Plot of $x(i)$, $\mathrm{LCL}_x(i)$, $\mathrm{UCL}_x(i)$. (b) Plot of signal$(i)$.

Fig. 6. Performance of EWMA for uncorrelated data with $\gamma = 0.2$, $\lambda = 10^{\ldots}$, $L = 1.96$. (a) Plot of $z(i)$, $\mathrm{LCL}_z(i)$, $\mathrm{UCL}_z(i)$. (b) Plot of signal$(i)$.

In summary, it appears from the testing results that both EWMA for autocorrelated data and EWMA for uncorrelated data can work well for detecting intrusions that manifest themselves through statistically significant changes in the intensity of events occurring in an information system. The advantage of the EWMA technique for uncorrelated data is that it can detect not only abrupt changes in the event intensity but also small mean shifts through a gradually increased or decreased event intensity [37]. However, if the EWMA technique for uncorrelated data is used, the initial value of the smoothed event intensity needs to be reset after intrusions are detected, to prevent the carry-over effect. If EWMA for autocorrelated data is used, resetting the initial value of the smoothed event intensity is not necessary, because EWMA for autocorrelated data automatically adjusts the LCL and UCL to account for the carry-over effect.

Overall, the smoothing constant $\gamma$ for computing the smoothed event intensity should not be too small, in order to capture the short-term trend of the event intensity in the recent past. The $\lambda$ and $\alpha$ for setting the LCL and UCL, which reflect the long-term trend of the smoothed event intensity, should be much smaller than the smoothing constant. $L = 1.96$ for the significance level of 0.05 appears to work well. EWMA for autocorrelated data might not be able to detect small mean shifts through a gradually increased or decreased event intensity.

ACKNOWLEDGMENT

The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of DARPA/AFRL-Rome or the U.S. Government.


REFERENCES
[1] W. Stallings, Network and Inter-Network Security Principles and Practice: Prentice-Hall, 1995.
[2] C. Kaufman, R. Perlman, and M. Speciner, Network Security: Private Communication in a Public World: Prentice-Hall, 1995.
[3] N. Ye, J. Giordano, and J. Feldman, "Detecting information warfare attacks: Current state of the art from a process control viewpoint," Communications ACM, vol. 44, Aug. 2001.
[4] T. Escamilla, Intrusion Detection: Network Security Beyond the Firewall: John Wiley and Sons, 1998.
[5] W. Everett, S. Keene, and A. Nikora, "Applying software reliability engineering in the 1990s," IEEE Trans. Rel., vol. 47, pp. 372-378, Sept. 1998.
[6] A. Thakur and R. K. Iyer, "Analyze-NOW: An environment for collection and analysis of failures in a network of workstations," IEEE Trans. Rel., vol. 45, no. 4, pp. 561-570, Dec. 1996.
[7] C. S. Hood and C. Ji, "Proactive network-fault detection," IEEE Trans. Rel., vol. 46, no. 3, pp. 333-341, Sept. 1997.
[8] S. Morasca, "Assessment of fault-detection processes: An approach based on reliability techniques," IEEE Trans. Rel., vol. 45, no. 4, pp. 632-637, Dec. 1996.
[9] C. M. Krishna, "Optimal configuration of redundant real-time systems in the face of correlated failure," IEEE Trans. Rel., vol. 44, no. 4, pp. 587-594, Dec. 1995.
[10] S.-T. Cheng, "Topological optimization of a reliable communication network," IEEE Trans. Rel., vol. 47, no. 3, pp. 225-233, Sept. 1998.
[11] S. N. Chau, L. Alkalai, A. T. Tai, and J. B. Burt, "Design of a fault-tolerant COTS-based bus architecture," IEEE Trans. Rel., vol. 48, no. 4, pp. 351-359, Dec. 1999.
[12] H.-M. Sun and S.-P. Shieh, "Optimal information-dispersal for increasing the reliability of a distributed service," IEEE Trans. Rel., vol. 46, no. 4, pp. 462-472, Dec. 1997.
[13] H. Debar, M. Dacier, and A. Wespi, "Toward a taxonomy of intrusion-detection systems," Computer Networks, vol. 31, pp. 805-822, 1999.
[14] R. Lippmann et al., "Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation," in Proc. DARPA Inform. Survivability Conf. Exposition: IEEE Computer Society, 2000, pp. 12-25.
[15] D. Anderson, T. Frivold, and A. Valdes, "Next-generation Intrusion Detection Expert System (NIDES): A Summary," SRI International, Tech. Rep. SRI-CSL-97-07, 1995.
[16] G. Vigna, S. Eckmann, and R. Kemmerer, "The STAT tool suite," in Proc. DARPA Inform. Survivability Conf. and Exposition: IEEE Computer Society, 2000, pp. 46-55.
[17] S. Kumar, "Classification and Detection of Computer Intrusions," Ph.D. dissertation, Department of Computer Science, Purdue University, Indiana, USA, 1995.
[18] W. Lee, S. J. Stolfo, and K. Mok. Mining in a data-flow environment: Experience in network intrusion detection. Presented at Proc. 5th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining. [Online]. Available: http://www.cs.columbia.edu/ sal/JAM/PROJECT/.
[19] N. Ye and X. Li, "A scalable clustering technique for intrusion signature recognition," in Proc. 2nd IEEE SMC Inform. Assurance Workshop, 2001.
[20] N. Ye, X. Li, and S. M. Emran, "Decision trees for signature recognition and state classification," in Proc. 1st IEEE SMC Inform. Assurance and Security Workshop, 2000.
[21] N. Ye et al., "Probabilistic techniques for intrusion detection based on computer audit data," IEEE Trans. Syst., Man, and Cybern., vol. 31, no. 4, 2001.
[22] N. Ye, S. M. Emran, X. Li, and Q. Chen, "Statistical process control techniques for an intrusion detection system," in Proc. 2nd DARPA Inform. Survivability Conf. and Exposition, 2001.
[23] S. M. Emran and N. Ye, "Robustness of Canberra metric in computer intrusion detection," in Proc. 2nd IEEE SMC Inform. Assurance Workshop, 2001.
[24] N. Ye, Q. Chen, and S. M. Emran, "Computer intrusion detection based on statistical distributions of distance metrics," in Proc. Southern Conf. Computing, 2000.
[25] N. Ye, "A Markov chain model of temporal behavior for anomaly detection," in Proc. 1st IEEE SMC Inform. Assurance and Security Workshop, 2000.
[26] N. Ye, Q. Zhong, and M. Xu, "Probabilistic networks with undirected links for anomaly detection," in Proc. 1st IEEE SMC Inform. Assurance and Security Workshop, 2000.
[27] N. Ye, Q. Chen, and S. M. Emran, "Hotelling's T² multivariate profiling for anomaly detection," in Proc. 1st IEEE SMC Inform. Assurance and Security Workshop, 2000.
[28] N. Ye and Q. Chen, "An anomaly detection technique based on a chi-square statistic for detecting intrusions into information systems," Qual. Rel. Eng. Int., vol. 17, no. 2, pp. 105-112, Mar./Apr. 2001.
[29] D. E. Denning, "An intrusion-detection model," IEEE Trans. Software Eng., vol. 13, no. 2, pp. 222-232, 1987.
[30] A. K. Ghosh, A. Schwatzbard, and M. Shatz. Learning program behavior profiles for intrusion detection. Presented at Proc. 1st USENIX Workshop on Intrusion Detection and Network Monitoring. [Online]. Available: http://www.rstcorp.com/ anup/.
[31] S. Forrest, S. A. Hofmeyr, and A. Somayaji, "Computer immunology," Communications ACM, vol. 40, no. 10, pp. 88-96, Oct. 1997.
[32] C. Ko, G. Fink, and K. Levitt, "Execution monitoring of security-critical programs in distributed systems: A specification-based approach," in Proc. 1997 IEEE Symp. Security and Privacy: IEEE Computer Society, 1997, pp. 134-144.
[33] H. S. Javitz and A. Valdes, "The SRI statistical anomaly detector," in Proc. 1991 IEEE Symp. Res. Security and Privacy: IEEE Computer Society, 1991.
[34] H. S. Javitz and A. Valdes, "The NIDES statistical component description of justification," SRI International, Tech. Rep. A010, 1994.
[35] Y. Jou et al., "Design and implementation of a scalable intrusion detection system for the protection of network infrastructure," in Proc. DARPA Inform. Survivability Conf. Exposition: IEEE Computer Society, 2000, pp. 69-83.
[36] T. P. Ryan, Statistical Methods for Quality Improvement: John Wiley and Sons, 1989.
[37] D. C. Montgomery, Introduction to Statistical Quality Control: John Wiley and Sons, 1997.
[38] J. S. Hunter, "The exponentially weighted moving average," J. Qual. Technol., vol. 18, pp. 203-209, 1986.
[39] S. W. Roberts, "Control chart tests based on geometric moving averages," Technometrics, vol. 1, pp.
239251, 1959. [40] D. C. Montgomery and C. M. Mastrangelo, Some statistical process control methods for autocorrelated data, J. Qual. Technol., vol. 23, no. 3, pp. 179193, July 1991. [41] C. M. Borror, D. C. Montgomery, and C. G. Runger, Robustness of the EWMA control charts to nonnormality, J. Qual. Technol., vol. 31, no. 3, pp. 309316, 1999. [42] S. H. Steiner, EWMA control charts with time-varying control limits and fast initial response, J. Qual. Technol., vol. 31, no. 1, pp. 7586, 1999. [43] J. F. MacGregor and T. J. Harris, The exponentially weighted moving variance, J. Qual. Technol., vol. 25, no. 1, pp. 106118, 1993. [44] S. S. Prabhu and G. C. Runger, Designing a multivariate EWMA control chart, J. Qual. Technol., vol. 29, no. 1, pp. 815, Jan. 1997. [45] D. C. Montgomery, L. A. Johnson, and J. S. Gardiner, Forecasting and Time Series Analysis: McGraw-Hill, 1990.

Nong Ye received the B.S. in Computer Science from Peking University, Beijing, the M.S. in Computer Science from the Chinese Academy of Sciences, Beijing, and the Ph.D. in Industrial Engineering from Purdue University, West Lafayette, IN. Since 2002, Dr. Ye has been a Professor with the Department of Industrial Engineering, Arizona State University, Tempe. Her research interests are in assuring quality/reliability and preventing faults and errors in information systems, human-machine systems, and manufacturing systems. Dr. Ye is a Senior Member of the IEEE and of the Institute of Industrial Engineers.

Sean Vilbert received the M.S. in 1999 in Industrial Engineering from Arizona State University.

Qiang Chen received the B.S. in 1993 and the M.S. in 1999 in Manufacturing Engineering from Beijing University of Aeronautics and Astronautics (BUAA). Since 1999, he has studied in the Department of Industrial Engineering, Arizona State University, where he received the Ph.D. in Industrial Engineering in 2001. From 1993 to 1996, he worked as an Information Management Engineer at Beijing Aircraft Maintenance and Engineering Company. His research interests include intrusion detection, data noise cancellation, and knowledge discovery.
