Online Temporal Signal Comparison

Ind. Eng. Chem. Res.
2007, 46, 4531-4548
4531
Online Temporal Signal Comparison Using Singular Points Augmented Time

Warping
Rajagopalan Srinivasan*,, and Mingsheng Qian
Laboratory for Intelligent Applications in Chemical Engineering, Department of Chemical and
Biomolecular Engineering, National UniVersity of Singapore, 10 Kent Ridge Crescent, Singapore 119260,
and Process Sciences and Modeling, Institute of Chemical and Engineering Sciences, 1 Pesek Road,
Jurong Island, Singapore 627833
Advances in instrumentation and data storage technologies have allowed the process industries to collect
extensive operating data which can be used to extract information about the underlying process and provide
online decision support. One of the fundamental problems in data-based decision support is comparison of
time-series data. Many signal comparison methods require signals that are of the same length and synchronized.
Synchronization of varying length signals is usually achieved using dynamic time warping (DTW). Major
limitations of DTW include computational cost and the tendency to link operationally different points.
Previously, we proposed singular points augmented time warping to overcome these shortcomings during
offline signal comparison (Srinivasan and Qian Ind. Eng. Chem. Res. 2005, 44, 4697). Landmarks in process
data such as extreme values and sharp changes, called singular points, are used to segment a signal into
regions with homogeneous properties, called episodes. Singular points of two signals are linked by dynamic
programming; time warping is used to synchronize the episodes. A locally optimal equivalent of DTW called
extrapolative time warping (XTW) with better computational performance was also proposed. In this paper,
we present the extension of this approach to online signal comparison. The online signal comparison problem
is a generalization of the offline problem and has two additional challenges: (1) the reference signal for
comparison is not known a priori and has to be selected from a library, and (2) the starting point of the
reference and real-time signal would not coincide in general, and the corresponding points have to be identified.
The approach proposed here addresses these by extending dynamic locus analysis (Srinivasan and Qian Chem.
Eng. Sci. 2006, 61, 6109) and singular point augmented XTW using anchoring and flanking strategies. The
application of the proposed approach is illustrated using two different case studies: online operating mode
identification in the Tennessee Eastman process and online fault identification in a lab-scale distillation column.
The results show that the proposed approach is robust, efficient, and suitable for online signal comparison.
1. Introduction
As a result of significant advances in data collection and
storage, vast amounts of historical operating data are becoming
commonly available in the chemical process industry. This
data is a rich source of information about the process that can
be used to improve the plant operation. Given the parallel
developments in pattern classification3 and statistical, information, and systems theories,4 data-based approaches have been
gaining in popularity. Potential areas of application of datadriven methods include regulatory and sequence control,
visualization, operation improvement, state identification, and
fault diagnosis. Despite these developments in extracting
information and knowledge from data, many important and
challenging problems persist in data analysis and knowledge
extraction. In this paper, we address one such problem: online
temporal signal comparison.
Temporal signals with time-varying properties commonly
arise in chemical plants during transition states such as startup,
grade change, shutdown, maintenance, and other abnormal
operations. The precept of signal comparison-based approaches
is that similar states result in similar temporal signals. So, if a
historical database of representative signals is available, the root
cause of a process change occurring in real-time can be
* Corresponding author. E-mail: chergs@nus.edu.sg. Tel.: +65
65168041. Fax: +65 67791936.
National University of Singapore.
Institute of Chemical and Engineering Sciences.
determined. The challenge, however, is that, as a result of the

nature of industrial operations, signals from different instances
of the same state are not exact replicates; there would be
significant differences in magnitude and duration of the variable
profiles as a result of run-to-run variations arising from differing
initial conditions, impurities, seasonal affects, and operator
actions. A direct comparison of a real-time signal with a
reference signal in the historical database would therefore be
incorrect, and signal comparison methods try to account for these
normal operational variations.
1.1. Previous Work in Online Signal Comparison. Three
classes of signal comparison methods can be differentiated. The
first class of methods is based on multivariate statistical
aggregates of the signal such as principal component analysis
(PCA). PCA-based approaches have been popular in process
monitoring. For comparing two multivariate signals, Krzanowski5 proposed a PCA similarity factor that compares the
reduced PCA subspaces of the original signals. Raich and Cinar6
used the PCA similarity factor for diagnosing process disturbances. Singhal and Seborg7 modified the PCA similarity factor
by weighing the principal components with the square root of
their corresponding eigenvalue, . The PCA similarity factor is
only applicable for statistically stationary signals whose properties do not change with time. To extend it to non-stationary
signals, Srinivasan et al.8 proposed a dynamic PCA-based
similarity factor SDPCA that accounts for the temporal evolution
of the signal. The main advantage of the PCA-based methods
is their inherent ability to deal with multivariate signals and
10.1021/ie060111s CCC: $37.00 2007 American Chemical Society

Published on Web 05/10/2007
4532
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Figure 1. Typical signal and its singular points.
their low computational requirements. Their main shortcomings

are that (1) they do not explicitly consider the synchronization
between the model and the real-time signal because time is
accounted for implicitly in these models; (2) they are nonintuitive, especially for plant operators, because the comparison
is based on a derived quantity with no physical significance;
and (3) they consider the data as monolithic and arising from a
single process state with specified statistical properties. This
last requirement makes them unsuitable for online applications
in multi-mode processes that can operate in multiple states. Also,
for operator decision support, it is important to not only calculate
the extent of similarity but also identify the point of divergence,
that is, the point in time from when the two signals start to
deviate from one another. Because the PCA-based methods
consider the whole of the data as a single block they cannot
directly detect the point of divergence.
The second class of signal comparison methods uses a
qualitative abstraction of the actual signal.9,10 Rengaswamy and
Venkatasubramanian11 used a set of qualitative primitives, such
as increase, decrease, and so forth, to represent the evolution
of a signal. Libraries of qualitative trends corresponding to
various process states are calculated offline. The patterns of the
qualitative trends in the real-time signal are compared with
those in the library and used for fault diagnosis. The interested
reader is referred to the reviews by Venkatasubramanian et al.12
and Maurya et al.13 for further details. A simple trend
comparison does not suffice for non-stationary processes because
the mapping between the trend and the process state is one to
many; that is, the same trend may correspond to normal
operation in one state but a fault in another. Sundarraman and
Srinivasan14 proposed an enhanced trend analysis approach to
overcome this by considering the duration and magnitude along
with the qualitative shape of the trend. The triplet of shape,
duration, and magnitude of the trend enables state-specific
comparisons because the enhanced trend-to-state mapping is
one-to-one. The main shortcomings of the trend-based approaches are their univariate nature and larger time lags in state
identification arising from the qualitative abstraction of the signal
and the consequent loss of information.
The third class of methods uses a direct comparison between
an offline annotated library of signals and the real-time evolution
of the process to overcome the latter shortcoming. However, it

is normal for signals from different runs of the same state to be
slightly different and not match each other perfectly. Therefore,
methods to compare signals by adjusting their time scales, called
time warping, have been developed. One such is dynamic time
warping (DTW), which originated as a robust method for
calculating the difference between unsynchronized speech
signals.15
Let T and R denote two time-sampled signals of lengths t
and r to be synchronized and let j and i denote the time index
of their trajectories, respectively. Let the superscript * denote
the optimal value of the variable. DTW finds a sequence F* of
P points on an r t grid such that a total distance measure
between the two trajectories is minimized.
F* ) {c(1), c(2), ..., c(p), ..., c(P)}

c(p) ) [i(p) j(p)]
D(r, t) )
(1)
(2)
d[i(p), j(p)] w(p)

N(w)p)1
D*(r, t) ) min[D(r, t)]

F
(3)
(4)
where (i(p), j(p)) is the warping assignment at step p, d[i(p)

j(p)] is the local distance between the point j(p) of T and point
i(p) of R, D(r, t) is the normalized total distance between the
two signals, and D*(r, t) is the minimum normalized distance
between them. Constraints are often used to define and restrict
the search space and find an alignment that optimizes some
criterion. They are motivated by physical considerations, to
avoid excessive compression or expansion, speed up the
calculation, or enforce other problem-specific limits on the
alignment. A common global constraint is to set the end point
of T and R to coincide, that is,
c(1) ) (1, 1)
(5)
c(p) ) (r, t)
(6)
Local constraints determine local features around this point. For

example, the Itakura local constraint defines (i - 1, j), (i - 1,
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4533
Figure 2. Algorithm for real-time signal tracking.
j - 1), and (i - 1, j - 2) as the set of predecessors to (i, j) and

results in a local slope in [1/2 2]. Let DA(i, j) be the minimum
accumulated total distance between the two signals from (1, 1)
to (i, j). The optimization problem in eq 4 reduces to
DA(i, j) )
DA(i - 1, j) + d(i, j) or [ if condition (A*)]
min DA(i - 1, j - 1) + d(i, j)
DA(i - 1, j - 2) + d(i, j)
(7)
where DA(1, 1) ) d(1, 1) and condition (A*) indicates that the

predecessor of point (i - 1, j) is the point (i - 2, j). More
details of DTW can be found in Sankoff and Kruskal.15 Kassidas
et al.16 used DTW for synchronizing batch trajectories.
A major shortcoming of DTW is its computational complexity
(O(tr)) which precludes its use with long signals common in
the process industry. Also, the entire assignment history has to
be stored in memory with the concomitant memory requirements. Further, in many cases, the minimum distance criterion
used in DTW does not guarantee comparison and synchronization of operationally equivalent points of the signal. To
overcome the above limitations, Srinivasan and Qian1 augmented
time warping by using landmarks in the signal, called singular
points. We summarize this approach and related developments
in section 2. Previous applications of DTW have focused on
offline signal comparison, where the two signals to be compared
are entirely available beforehand. For online state identification
or fault diagnosis, a signal that is evolving in real-time has to
be compared. Also, the reference signal is not known a priori.
This results in two challenges: (1) locating the optimum
reference signal from a library of signals and the point from
which comparison between the two signals should start and (2)
performing robust signal comparisons with a computational
requirement suitable for online use. We address these two
challenges in this paper. In section 3, we propose a flanking
strategy for efficiently identifying the reference signal as well
an anchoring strategy for updating the signal comparison when

a new sample becomes available. In section 4, the application
of the method to state identification in the Tennessee Eastman
simulation and fault diagnosis in a lab-scale distillation column
is reported. Summary and conclusions from this work are
presented in section 5.
2. Singular Points Augmented Time Warping
Information content is not homogenously distributed throughout a signal; rather, the majority of the features of the signal
are concentrated in a small number of points. Such points, which
are landmarks in the signal evolution, are called singular points.1
The singular points of a sample signal are shown in Figure 1.
Mathematically, each singular point is a triplet TSP ) {, ,
} where is the time of occurrence of the singular point,
is the magnitude of the variable at the singular point, and is
its type. Three types of singular points can be differentiated
corresponding to sharp changes, extrema, and trend changes;
thus, singular points are broadly points of local extrema in the
variable and its first and second derivatives. The different types
of singular points are not mutually exclusive, and the same point
can correspond to more than one type. This is illustrated in
Figure 1: P1 is both an extreme point and a sharp change point;
similarly, P2 and P4 are trend change points as well as sharp
change points. In such cases, the list of all matching types is
noted. Methods for singular points identification are reported
in Srinivasan and Qian1 and Qian.17
The segment of a signal between adjoining singular points is
called a singular episode. An episode thus consists of regions
of nearly constant slope, small oscillations, and so forth. Clearly,
a signal can be deconstructed into adjoining singular episodes;
equivalently, a signal can be described through an ordered set
of singular points. The representation of a signal by its singular
points enables its efficient synchronization and comparison with
another signal. This is achieved by linking the singular points
or episodes of the two signals using dynamic programming. One
4534
Figure 3. Algorithm for optimal reference signal R* identification.
Figure 4. Flanking segments used for reference signal identification.
important criterion for linking two sets of singular points is that

their sequences should match. A sequence Violation occurs
SP
SP
SP
between a pair of singular points (TSP
m , Rm ) and (Tk , Rl ) in
SP
signals T and R if m < k (that is, Tm temporally precedes TSP
k )
SP
but m > l (that is, RSP
m does not temporally precede Rl ). As
described in detail in Srinivasan and Qian,1 the singular points
of two signals are said to correspond and can be linked if they
are of the same type and if the linkage does not result in any
sequence violations. For a given linkage of singular points, the
distance between the two signals is calculated as the sum of
episode-wise distances. Time warping is used to calculate the
distance between the corresponding episodes so as to account
for magnitude and duration differences between the episodes.
Among the various linkages possible between two singular point
sequences, the linkage that results in minimum signal distance
is considered optimal.
Srinivasan and Qian1 also proposed a new time warping
strategy called extrapolative time warping (XTW), which is a
greedy search modification of classical DTW with Itakura local
constraint. The XTW method obviates dynamic programming
for each local point by optimizing each point locally. In contrast
to DTW, search in XTW proceeds in the forward direction
starting from the first point of the signal to the last. Given the
warping assignment (i, j), the optimal warping for the subsequent
step, that is, the location of j* that corresponds to (i + 1), is
based only on the previous decision and the current distance.
Figure 5. Temporal development and translation of flanking segments during optimal reference signal identification.
Three possible successors, (i + 1, j), (i + 1, j + 1), and (i + 1,

j + 2), are considered. In lieu of eq 7, the optimal search path
in XTW is therefore defined by
DA(i + 1, j*) ) min

DA(i, j) + d(i + 1, j) or [ if condition (B*)] j* ) j
DA(i, j) + d(i + 1, j + 1)
j* ) j + 1
DA(i, j) + d(i + 1, j + 2)
j* ) j + 2
(8)
with initial condition DA(1, 1) ) d(1, 1). Condition (B*)

indicates that the predecessor of point j* ) j. Thus, for each
step, the decision for the corresponding point for i is based only
on three comparisons: to increase j by 0, 1, or 2. Following
the Itakura local constraint, if the previous decision was to
increase j by 2, then the successor would not have this option
(so as to maintain the local slope in the [1/2 2] range), and j can
increase only by 0 or 1. Similarly, in the preceding step, if j
did not increase, the successor would not have the option to
remain at the same j, and j* ) j + 1 or j* ) j + 2. Because
any decision is based only on the previous decision and the
current difference, dynamic programming is obviated. The
search space of the XTW is the same as DTW with the Itakura
local constraint. However, unlike DTW, in XTW once a match
has been assigned, future assignments will not affect it. The
assignment history tree, which is the origin for the large
computational storage requirements of DTW, is not necessary
in XTW; instead only the list of assignments needs to be
maintained. The greedy extrapolative search for any point
decreases the computational time and provides a major advantage when XTW is used in online signal tracking as explained
in section 3. However, because global information is not used
at each step, an optimal solution is not guaranteed by XTW.
This issue comes to the foreground when long signals have to
be compared. In our approach, this is addressed by combining
XTW with singular point linkage.
The singular points based time warping approaches enforce
the optimal linkage of the major landmarks of the two signals
Table 1. TE Process Measurements and Their Base Values

variable name
A feed (stream 1)
D feed (stream 2)
E feed (stream 3)
A and C feed (stream 4)
recycle flow (stream 8)
reactor feed rate (stream 6)
reactor pressure
reactor level
reactor temperature
purge rate (stream 9)
product separator
temperature
product separator level
product separator pressure
product separator
underflow (stream 10)
stripper level
stripper pressure
stripper underflow
(stream 11)
stripper temperature
stripper steam flow
compressor work
reactor cooling water
outlet temperature
condenser cooling water
outlet temperature
variable
number
base case
value
units
XMEAS (1)
XMEAS (2)
XMEAS (3)
XMEAS (4)
XMEAS (5)
XMEAS (6)
XMEAS (7)
XMEAS (8)
XMEAS (9)
XMEAS (10)
XMEAS (11)
0.25052
3664.0
4509.3
9.3477
26.902
42.339
2705.0
75.0
120.40
0.33712
80.109
kscmh
kg h-1
kg h-1
kscmh
kscmh
kscmh
kPa gauge
%
C
kscmh
C
XMEAS (12)
XMEAS (13)
XMEAS (14)
50.000
2633.7
25.160
%
kPa gauge
m3 h-1
XMEAS (15)
XMEAS (16)
XMEAS (17)
50.000
3102.2
22.949
%
kPa gauge
m3 h-1
XMEAS (18)
XMEAS (19)
XMEAS (20)
XMEAS (21)
65.731
230.31
341.43
94.599
XMEAS (22)
77.297
C
kg h-1
kW
C
C
using dynamic programming. The global optimality of the

episode-level comparison is therefore not a critical requirement
because the optimal assignment of each time point within an
episode has no physical significance and is rarely necessary in
practical applications. The two-step comparison also leads to
significant improvements in speed, memory requirement, and
efficiency of signal comparison. Another important advantage
is that because the singular points have physical meaning such
as the beginning or ending of a process event, they can be
directly used for state identification, monitoring, and supervision.
Singular point augmented time warping is suited for signals
whose endpoints are known to correspond as these are used in
the initial warping assignment (1, 1). This is not the case in
4536
Figure 6. Test signal T and reference signals (R1 and R2) for the illustrative example.
real-time signal comparison where the online signal is available

starting from an unknown state. We use dynamic locus analysis
(DLA) to identify the endpoints in the library signal that
correspond to those of the real-time signal.
2.1. Dynamic Locus Analysis for Finding the Best Matching Segment. DLA is an efficient method to compare a short
signal with a long reference signal and identify the best matching
segment from the reference.2 Consider the short signal X ) {x1,
x2, x3, ..., xm} which is the last m samples from an online sensor.
Here m is the size of the evaluation window. Let Y ) {y1, y2,
y3, ..., yn} be a long reference signal. For every segment of Y,
say Z ) {yl, yl+1, ..., yj}, a segment Z* ) {yl*, yl*+1, ..., yj*}, is
called the locus of X if
D*(X, Z*) ) min{D*(X, Z)}

l,j
(9)
Also, yl* is called the corresponding point of x1, and yj* is the
corresponding point of xm. The brute force approach of
considering each possible l in Y and performing an independent
comparison will result in an unacceptable computational load.
DLA overcomes this by extending Smith and Watermans18
dynamic programming approach for comparing protein sequences. In DLA, the locus of X is identified by using a
dissimilarity matrix, DS. Let i and j be the time indices of X
and Y, respectively. The (i, j) element of DS measures the
minimal difference between the sub-segment {x1, x2, x3, ..., xi}
in X and the sub-segment {yl, yl+1, yl+2, ..., yj} in Y. In the general
case, l is unknown and is determined using dynamic programming.
i
DS(i, j) ) min{
F
(xd, yj(d))}
d)1
(10)
where yj(d) is the time warped point that matches with xd and
(xd, yj(d)) ) |yj(d) - xd| is the difference between xd and yj(d).
Because the optimal search should allow for compression and
elongation in Y relative to X, time warping is used to synchronize
X and Y. Following DTW with Itakura local constraint, eq 10
reduces to
DS(i, j) ) min{DS(i - 1, j - 1) +
(xi, yj), Fi,j, Gi,j}i [2 m] j [2 n] (11)
Fi,j ) DS(i - 1, j - 2) + (xi, yj)
Gi,j ) DS(i - 1, j) + (xi, yj) or if G*
where G* indicates that the predecessor of point (i - 1, j) is
the point (i - 2, j). Note that DS(i, j) is not the total minimum
distance between {x1, x2, x3, ..., xi} and{y1, y2, ..., yj}, rather it
is the total minimum distance between X and its locus in Y.
Segments of Y that are similar to X would lead to small values
of DS. The optimal match between X and the locus in Y is given
by DS(m, j*) where j* ) argminj{DS(m, j)} j [1 n]. More
details of DLA can be found in Srinivasan and Qian.2 In this
paper, we extend DLA and singular point augmented time
warping for online signal comparison.
3. Online Signal Comparison Using Singular Points
Augmented Time Warping
The online signal comparison problem can be stated as
follows: Given a set of reference signals K and a real-time signal
T emanating from the process operating at an unknown state,
(1) identify the reference signal that best matches the current
state of the process and (2) identify the progress of the process
Figure 7. Comparison of real-time signal T at T ) 8 with R1 (shown in part b) reveals a minimum at R1 ) 199 (shown in part c). Similar comparison with
R2 depicted in part d shows a minimum at R2 ) 1083 as shown in part e.
with respect to the reference signal. The former is called the

optimal reference signal identification problem while the latter
is referred to as real-time signal (or state) tracking. The first
step involves comparison of the real-time signal with many
reference signals and is computationally more intensive than
the second step; hence, although the first step could be repeated
at every instant, it is not tenable for online application, where
the requirement is that the calculation time at every sample must
be less than the sampling interval . New signal comparison
strategies that extend XTWSP and DLA are needed for these
purposes.
3.1. Flanking Strategy for DLA. The DLA provides an
efficient way to identify the locus: a sub-segment of a long
signal that best matches another (short) signal. Although DLA
can be directly used for optimal reference signal identification,
because its computational cost is proportional to the length of
the two signals (real-time and reference), its efficiency would
continually decrease as the real-time signal evolves and becomes
longer. The flanking strategy bounds the length of the signal
used for locus identification by decomposing the real-time
signal. Flanking segments, signal segments of fixed length from
the beginning and end of the signal, are used for this purpose.
Consider a signal X ) {x1, x2, ..., xm}, where m g 2. In the
flanking strategy, X is decomposed into three segments: the
anterior flanking segment, the core segment, and the posterior
flanking segment. The anterior flanking segment of X is defined
as its first points, that is, XA ) {x1, x2, ..., x} where is the
flank length. Similarly, the posterior flanking segment of X is
defined as the last points, that is, XP ) {xm-+1, xm-+2, ...,
xm}. The inner m - 2 points of X comprise the core segment
as illustrated in Figure 4 for a sample signal.
The flanking segments of the signal can be used to identify
the locus of X efficiently based on the recognition that any locii
of X should also have segments that have high similarity with
the flanking segments XA and XP. The flanking strategy exploits
this property. Given a long reference signal Y, all the segments
in Y, say YA, that closely match XA can be identified. Similarly,
all adequate matches for XP, say YP, can also be identified. Note
that the lengths of YA and YP may not be equal to the flank
length due to run-to-run variations between the real-time and
the reference signals. Each pair of YA and YP where YA precedes
YP in Y can be used to construct a unique composite segment Z
sandwiched by YA and YP, Z Y. Each composite segment is
a possible locus of X. By eq 9, the locus of X in Y is the
composite segment Z* which has the least difference with T.
The flanking strategy is thus a generalization of the DLA for
identifying the locus of a longer signal. It recognizes that the
computational complexity of any signal comparison method
depends directly on the length of the two signals to be compared.
The flanking strategy is computational efficient because the two
flanking segments are short; therefore, the cost of the first phase
of comparisons with all the reference signals is small and
4538
Figure 8. Snapshot at T ) 226 of signal comparison between T and R.
Figure 9. Snapshot at T ) 682 of signal comparison between T and R.
bounded by . In general, any method can be used to identify

the locii YA and YP; we use DLA as described in section 3.4.
Also, any method can be used to compare the composite
segments Z with X; we use XTWSP for this purpose.
3.2. Anchoring Strategy for Time Warping. The anchoring
strategy also seeks to improve the computational efficiency of
comparing a signal that is evolving in time. Consider X ) {x1,

x2, x3, ..., xm}, the last m samples from an online sensor, and Y
) {y1, y2, y3, ..., yn}, a long reference signal. Let a segment of
Y, say {y1, ..., yj} be known to match{x1, x2, x3, ..., xm}. The
base point at time m is defined as the point in the Y that
corresponds to xm. In this case, the base point at time m is yj.
Figure 10. Comparison of real-time signal T at T ) 685 with R1 (shown in part b) reveals a minima at R1 ) 185 (shown in part c). Similar comparison
with R2 depicted in part d shows a minimum at R2 ) 1087 as shown in part e.
The portion of X used for comparison is called the eValuation

window and notated by the subscript w. The matching segment
in Y is also notated by the subscript w. Thus, if the entire X is
used for comparison, Xw ) {x1, ..., xm} and Yw* ) {yl, ..., yj}.
When a new sample xm+1 becomes available from the online
sensor, Xw is extended by appending xm+1, and the base point
has to be recalculated. Any time warping method can be used
for this purpose. As mentioned above, the computational expense
of signal comparison increases with signal length; therefore, as
new samples are obtained, it takes ever-increasing time to
calculate the base point because the evaluation window increases
monotonically.
The anchoring strategy provides a systematic way of trimming
the evaluation window without sacrificing accuracy. It relies
on the insight that the entire X does not need to be compared
with Y at every instant. If at time m, {x1, ..., xm} has matched
{y1, ..., yj}, then any future divergence of X from Y can be
detected using only recent observations of X in the evaluation
window.
Anchor points are 2-tuples from X and Y that are known to
correspond. In the above, (x1, y1) can be considered an anchor
point. In general, any pair of singular points in X and Y, say
XSP and YSP, that have been linked by dynamic programming
can be used as anchor points. Also, if tXSP and tYSP are the times
of occurrence of XSP and YSP, respectively, then two nonoverlapping segments can be differentiated in X comprising the
observations {x1, ..., xtXSP-1} before the anchor point and the
ones after, {xtXSP, ..., xm}. Similarly, Y is split by the anchor
point into two segments: {y1, ..., ytYSP-1} and {ytXSP, ..., yn}.
Denoting the difference between X and Y by (X, Y),
(X, Y) ) ({x1, ..., xm}, {y1, ..., yj})
(12)
Using the linear additivity property,
(X, Y) ) ({x1, ..., xtXSP-1}, {y1, ..., ytYSP-1}) +

({xtXSP, ..., xm}, {ytXSP, ..., yn}) (13)
Because XSP and YSP are linked singular points, ({x1, ...,
xtXSP-1}, {y1, ..., ytYSP-1}) 0 and (X, Y) ({xtXSP, ..., xm},
{ytXSP, ..., yn}). So, for updating the base point, the evaluation
window can be shortened to begin from the anchor point, Xw )
{xtXSP, ..., xm} and Yw* ) {ytXSP, ..., yj}. The proposed online signal
comparison approach uses the anchoring and flanking strategies
for real-time signal tracking and optimal reference signal
identification sub-problems as described next.
3.3. Real-Time Signal Tracking Using Anchoring Strategy.
This step uses the optimal reference signal R* calculated a priori
(see section 3.4) and seeks to confirm that the process continues
to operate in the same state (i.e., same reference signal). It thus
calls only for resynchronization of the real-time signal T based
on the additional sample with R*. In the following, for ease of
4540
Figure 11. Schematic of the Tennessee Eastman19 process with control system.
Table 2. Disturbance Profiles for TE Process (a) XD1 (b) XD2 (c) XD3 (d) XD4 (e) XD5.
target
time
(min)
target
XD1-A
XD1-B
XD1-C
1.20base value
1.15base value
1.10base value
180
240
300
1.40base value
1.35base value
1.30base value
XD2-A
XD2-B
XD2-C
1.03base value
1.025base value
1.02base value
180
240
300
XD3-A
XD3-B
XD3-C
1.05base value
1.045base value
1.04base value
XD3-A
XD3-B
XD3-C
XD5-A
XD5-B
XD5-C
time
(min)
target
time
(min)
target
time
(min)
(a) XD1
190
254
318
1.60base value
1.55base value
1.50base value
200
268
336
1.0base value
1.0base value
1.0base value
780
900
1020
1.05base value
1.045base value
1.04base value
(b) XD2
190
254
318
1.07base value
1.065base value
1.06base value
200
268
336
1.0base value
1.0base value
1.0base value
1020
1080
1200
180
240
300
1.10base value
1.09base value
1.08base value
(c) XD3
190
254
318
1.15base value
1.135base value
1.12base value
200
268
336
1.0base value
1.0base value
1.0base value
780
900
1020
1.05base value
1.045base value
1.04base value
180
240
300
1.10base value
1.09base value
1.08base value
(d) XD4
190
254
318
1.15base value
1.135base value
1.12base value
200
268
336
1.0base value
1.0base value
1.0base value
780
900
1020
0.95base value
0.955base value
0.96base value
180
240
300
0.90base value
0.91base value
0.92base value
(e) XD5
190
254
318
0.85base value
0.865base value
0.88base value
200
268
336
1.0base value
1.0base value
1.0base value
780
900
1020
explanation, the time variable for the real-time signal is denoted

as T and that of reference signal R as R. The two signals can
be compared starting from the beginning (i.e., T ) 1);
alternatively a smaller evaluation window can be used on the
basis of the anchoring strategy described above. We pursue the
latter option for computational efficiency purposes.
Consider the real-time signal T ) {t1, ..., ti, ..., tm-1}. The
evaluation window is defined as per the anchoring strategy based
on singular points. Let the last singular point in T be at time T

SP
) SP
T and let the value of T at that time be ttT . The
corresponding singular point in R* can be obtained using
XTWSP. These singular points in T and R* for the anchor point
of the evaluation window. So, the evaluation window Tw at T
) m - 1 is Tw ) {ttTSP, ttTSP+1, ..., tm-1}. Let the corresponding
segment of R* that matches Tw be R/w. When a new sample tm
becomes available at T ) m, Tw is updated, and the task in
Figure 12. Three runs of XD1 in the TE case study with different magnitudes and duration.
real-time signal tracking is to update the base point and R/w

through resynchronization as well as confirm that R/w continues
to match Tw. In our approach, this is achieved in two steps, as
shown in Figure 2:
Step A: Efficient Calculation of the Difference between
the Real-Time Signal and the Reference Signal. Any signal
comparison method can be used for calculating the difference
between Tw and R/w. We use XTW for this purpose because of
its advantageous time and space requirements. The normalized
time-warped distance between Tw and R/w is calculated as
follows:
m
(Tw, R/w) )
|rj(i) - ti|
m-
SP
T
i)TSP
of XTW comes at a price: a large (Tw, R/w) does not

guarantee that Tw is dissimilar from R/w because the local
greedy search in XTW can lead to an overestimate of the
difference when signals become long.1 Such artifacts can be
eliminated by a more accurate calculation using XTWSP, when
necessary.
Step B: Accurate Calculation of the Difference between
the Real-Time Signal and the Reference Signal. XTWSP links
the singular points in the real-time and reference signals and
thus calculates an accurate difference between Tw and R/w. So
if eq 15 is not satisfied, a more accurate difference is calculated
using XTWSP, and a condition analogous to eq 15 is evaluated
to confirm divergence of T from R*.
(14)
((Tw, R/w) < max) and ((tm, rR*) < 2 max) (16)
The numerator is the XTW difference between Tw and R/w

while the denominator is the length of the evaluation window.
The difference between the latest real-time sample tm and its
warping assignment in R* as per XTW, notated as
(tm, rR*),
is also calculated. If
Here, (Tw, R/w) is the difference between Tw and R/w while (tm, rR*) is the distance between tm and the base point as
calculated using XTWSP. Note that (Tw, R/w) e (Tw, R/w)
because XTWSP relies on singular point linkage. If eq 16 is
satisfied, the process is considered to continue in the same state
(i.e., no change to the reference signal), and the base point is
updated. One byproduct of performing XTWSP is that new
singular points could have been identified in Tw and linked with
R*. The last singular point in T and its corresponding linked
singular point in R* are subsequently used as the new anchor
point which results in the shortening of the evaluation window.
So, future Step A and Step B calculations become more efficient
and accurate. If condition 16 is not satisfied, the process is
considered to have moved to a new state, and another optimal
+1
( (Tw, R/w) < max) and (

(tm, rR*) < 2 max) (15)
is satisfied, the process is considered to continue in the same
stage of operation, and only the base point R* is updated. The
first condition ensures that the broad overall trend of the
reference and real-time signals is the same while the second
condition allows for larger local variations in the signal due to
noise or run-to-run differences. The computational efficiency
4542
Figure 13. Schematic of the lab-scale distillation column.
reference signal has to be identified, starting at T ) m as the

point of divergence POD
T .
3.4. Optimal Reference Signal Identification. The task in
this stage is to identify the reference signal R* that best matches
the state of the process at time tm. This would be required at T
) 1 when the reference signal is not known and when the realtime signal has diverged from the previous reference (i.e., eq
16 is not satisfied). Consider a diVergent segment of T ranging
from the point of divergence to the latest sample:
TD )
{tTPOD, ..., tm}
if m g POD
+
T
{tm-+1, ..., tm} if e m < POD

+
T
(17)
Note that POD

) 1 at T ) 1. The process state is identified by
T
comparing the divergent segment with all the reference signals
in the library (K). Consider a reference signal R ) {r1, r2, r3,
..., rn} from K with time index j. Let (TD, R) be the normalized
difference between TD and R. The optimal reference signal R*
is defined as
R* ) arg min ((TD, R))

RK
(18)
A tradeoff exists between the speed and accuracy of reference

signal identification. If the divergent segment used for identify-
ing R* is small, several good matches may exist in K. We

quantify the accuracy of the located optimal reference signal in
terms of the inseparability ratio R, defined as the ratio of the
normalized difference of the best matching reference signal to
that of the second-best one:2
R)
(TD, R*)
min
RK,R*R*
((TD, R))
(19)
The inseparability ratio thus reflects the uniqueness of R*.

Following Srinivasan and Qian,2 the criterion for the optimal
reference signal at T ) m with base point R* is
((TD, R*) < max) and (R < Rmax)
(20)
where the first condition ensures low difference between TD

and R* while the latter ensures its uniqueness. If eq 20 is not
satisfied at T ) m, an unknown state is flagged, and the
calculation of R* and R* is repeated when the next real-time
sample becomes available at T ) m + 1.
Next, we describe how the difference (TD, R) is calculated.
Because the suitable start and endpoints of TD in R are not
known a priori, the locus of TD in R has to be first calculated.
The DLA can be used for this purpose but as described in section
Figure 14. Process signals from Run-03 of the lab-scale distillation column.
3.1 it is efficient as long as the length of the divergent segment

is small. The flanking strategy proposed earlier becomes
necessary if the divergent segment becomes long. We therefore
consider two different phases as shown in Figure 3.
En Bloc Comparison Phase. When the divergent segment
< 2, and can be used in its entirety,
is small, that is, m - POD
T
DLA is directly used to identify the locus of the divergent
segment. The normalized difference is calculated as
m
(TD, R) )
|rj(i) - ti|
i)m-+1
(21)
where (i, j) is the warping assignment between TD and its locus

in R. As mentioned above, if eq 20 is not satisfied and the
optimal reference signal cannot be clearly determined, comparison is repeated when the next real-time sample becomes
available. Thus the length of the divergent segment would
increase with time. Once the length of TD becomes large, DLA
becomes computationally expensive. We minimize the computational requirement for such cases using the flanking-based
comparison strategy.
Flanking Phase. When the divergent segment is large, m POD
g 2 and the flanking strategy becomes necessary. Short
T
flanking segments from the start and end of TD provide the basis
for identifying the locus of TD in R. The anterior flanking
segment of the divergent segment is XA ) {tTPOD, tTPOD+1, ...,
tTPOD+-1} and the posterior flanking segment is XP ) {tm-V+1,
tm-+2, ..., tm}; thus, TD is sandwiched by the flanking segments.
DLA is used to identify all the matching segments YA in R

such that (XA, YA) < max. [One significant point needs to be
noted here. In contrast to the original DLA where only the best
matching segment is identified, here several candidate segments
may be identified. To optimize the calculation, candidates that
are not at least one singular point away from a better candidate
(in terms of smaller ) are rejected, so as to yield a wellseparated set of candidate segments (see Qian17).] The same
criterion is applied to obtain all reference signal segments YP
that match XP. Each pair of YA and YP defines a different
composite segment in R. Because in general these composite
segments can be long, XTWSP is used to synchronize them with
TD. The difference between TD and R is then calculated as
follows:
m
(TD, R) )
|rj(i) - ti|
i)TPOD
m - POD
+1
T
(22)
Figure 5 shows how the en bloc comparison and the flanking

phase are integrated in the proposed method for reference signal
identification. When the process moves away from the previous
state as indicated by eq 16, the immediately preceding points
{tT-+1, ..., tT} are used as the divergent segment and en bloc
comparison performed to identify the reference signal. If the
optimal reference signal, as per eq 20, cannot be identified by
T ) POD
+ 2 the divergent segment is split into anterior and
T
posterior flanking segments, each of length , and the flanking
4544
Table 3. Offline Signal Difference between Different Disturbance Instances in the TE Process Using Direct Comparison (10-1)
XD1-A
XD1-B
XD1-C
XD2-A
XD2-B
XD2-C
XD3-A
XD3-B
XD3-C
XD4-A
XD4-B
XD4-C
XD5-A
XD5-B
XD5-C
XD1-A
XD1-B
XD1-C
XD2-A
XD2-B
XD2-C
XD3-A
XD3-B
XD3-C
XD4-A
XD4-B
XD4-C
XD5-A
XD5-B
XD5-C
0.0686
0
0.074
0.0629
0
0.2228
0.2235
0.2241
0
0.2052
0.2028
0.2009
0.1446
0
0.1983
0.1953
0.1885
0.166
0.11
0
0.1735
0.1735
0.1847
0.2327
0.2631
0.2465
0
0.1647
0.1564
0.1587
0.2309
0.2342
0.2424
0.1991
0
0.161
0.1476
0.1383
0.1892
0.2413
0.2157
0.1816
0.1736
0
0.3631
0.3739
0.3694
0.4558
0.4555
0.4556
0.4143
0.4482
0.4294
0
0.3589
0.3483
0.357
0.4544
0.4219
0.4351
0.4318
0.3953
0.4175
0.2139
0
0.3443
0.3387
0.3278
0.4107
0.4247
0.4001
0.4113
0.4007
0.3697
0.2909
0.2003
0
0.0977
0.1136
0.1114
0.2004
0.1914
0.1875
0.1591
0.176
0.1681
0.3386
0.3616
0.3589
0
0.1108
0.0903
0.1018
0.2015
0.1791
0.1786
0.1718
0.1415
0.1521
0.3605
0.3262
0.337
0.1005
0
0.1054
0.1023
0.0824
0.1739
0.1801
0.1624
0.1626
0.1565
0.124
0.3812
0.3425
0.3063
0.0868
0.0864
0
Table 4. Disturbance Identification during Run-4 of the TE Case

Study
R1
R3
R4
R5
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
0.0501
0.0592
0.0659
0.0693
0.0723
0.0755
0.0782
0.0779
0.0773
0.0767
0.0760
0.0752
0.0745
0.0737
0.0729
0.0511
0.0593
0.0656
0.0654
0.0604
0.0556
0.0510
0.0506
0.0508
0.0507
0.0506
0.0504
0.0500
0.0505
0.0504
0.0514
0.0589
0.0654
0.0572
0.0408
0.0240
0.0072
0.0070
0.0068
0.0065
0.0064
0.0061
0.0061
0.0060
0.0060
0.0512
0.0581
0.0647
0.0683
0.0702
0.0723
0.0742
0.0739
0.0732
0.0729
0.0723
0.0716
0.0712
0.0707
0.0703
0.9804
0.9864
0.9970
0.8746
0.6755
0.4317
0.1412
0.1383
0.1339
0.1282
0.1265
0.1210
0.1220
0.1188
0.1190
strategy is used. In this flanking phase of reference identification,

only the posterior flank is translated when a new sample is added
to the divergent segment. The anterior flanking segment remains
anchored at XA ) {tTPOD, tTPOD+1, ..., tTPOD+-1}. The length of
the divergent real-time signal increases with time; however, the
optimal reference identification is calculated in a computationally efficient fashion suited for online use. A detailed illustration
is given next to explain the above-described signal comparison
algorithm.
3.5. Illustrative Example. Consider the test signal T and the
two reference signals R1 and R2 shown in Figure 6. Online data
have been collected starting at T ) 1. Because the optimal
reference signal is unknown initially, it has to be identified first
following the description in section 3.2. POD
) 1, and signal
T
comparison starts at T ) 8 () ). The en bloc difference
calculation strategy is first used for finding the locus in reference
signals R1 and R2 since the signal length at T ) 8 is less than
2. At T ) 8, the DLA difference for R1 and R2 are as shown
in Figure 7. DS(, R1) has a clear minimum at R1 ) 199 and
(T, R1) ) 0.0151 (<max ) 0.05). The locus for the divergent
segment in R2 is at R2 ) 1083 and (T, R2) ) 0.0836 (> max).
Therefore, R ) 0.1810 (< Rmax ) 0.70). So at T ) 8, R1 is
confirmed to be the optimal reference signal with R1 ) 199 as
the base point. From the next sample, T ) 9, real-time signal
tracking as described in section 3.3 is performed to confirm
that T progresses as per reference signal R1.
A snapshot of the real-time signal tracking at T ) 226 is
shown in Figure 8. At this time, the anchor for comparison
between T and R1 is the previous corresponding singular points
SP
with SP
T ) 164 and R1 ) 359. The base point is at R1 ) 413.
Therefore, the evaluation window Tw ) {t164, ..., t226} is

compared with R/w ) {r359, ..., r413} using XTW. (Tw, R/w) )
0.0180 (< max) and
(tm, rR*) ) 0.0086 (<2 max), so
condition 14 holds and tracking can continue. Online tracking
proceeds similarly until T ) 682 when (Tw, R/w) ) 0.0217,
but
(tm, rR*) ) 0.3666 (>2 max), so eq 15 is violated.
XTWSP is then used for accurate calculation of the difference
between Tw and R/w. (Tw, R/w) ) 0.0213 and (tm, rR*) )
0.3633 (>2 max), so eq 16 is violated too. This confirms
that the real-time signal is no longer similar to R1 and the
) 682 (see
reference signal has to be re-identified with POD
T
Figure 9).
The divergent segment TD ) {t675, t676, ..., t682} is used to
find the new reference signal through DLA. Initially, m - POD
T
< 2, so en bloc difference calculation is performed. At T )
682, (TD, R2) ) 0.0641, R ) 0.9596, and eq 20 does not hold,
so the reference signal cannot be conclusively identified.
Comparison is therefore repeated when subsequent samples
becomes available. As shown in Figure 10, at T ) 685, (TD,
R2) ) 0.0388 (< max), and R ) 0.6039 (< Rmax). So the new
reference signal is confirmed to be R2 with R2 ) 1082 as the
anchor point and R* ) 1087 as the base point.
In the next section, we evaluate the proposed online signal
comparison on two case studies.
4. Case Studies
4.1. Case Study 1: Online Disturbance Identification in
the Tennessee Eastman Process. The Tennessee Eastman (TE)
process19 is a popular test bed for process control, fault
diagnosis, and signal comparison. In this section, data from this
simulated plant are used to test the accuracy of the proposed
method. The TE process produces two products (G and H) and
a byproduct (F) from reactants A, C, D, and E. The process
flowsheet is shown in Figure 11. The process has five units: a
two-phase reactor, a product condenser, a flash separator, a
recycle compressor, and a product stripper. There are 53
variables in the TE plant: 22 of these are process measurement
variables, 19 are component compositions, and 12 are processmanipulated variables. The closed-loop process simulator used
here was developed by Singhal20 on the basis of the base control
structure of McAvoy and Ye.21 During the simulation, variable
values are recorded every minute. The 22 process measurements
used in this paper for process state identification are shown in
Table 1.
In this case, we use dynamic signal matching to identify
process disturbances online. Five disturbance classes, called
XD1-XD5, that affect the A feed flowrate, reactor pressure,
reactor level, reactor temperature, and compressor work are
Table 5. Online Process Disturbance Detection in the TE Process
Run-1
Run-2
Run-3
Run-4
Run-5
Run-6
Run-7
Run-8
Run-9
Run-10
Run-11
Run-12
Run-13
Run-14
Run-15
average
disturbance
introduction
time
disturbance
identification
time
bestmatching
reference
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
15
13
13
17
13
16
16
13
13
14
13
17
13
13
13
XD1
XD1
XD1
XD2
XD2
XD2
XD3
XD3
XD3
XD4
XD4
XD4
XD5
XD5
XD5
identification
delay
(sample)
second
disturbance
introduction
time
disturbance
identification
time
bestmatching
reference
6
4
4
8
4
7
7
4
4
5
4
8
4
4
4
5.1333
1020
670
680
1030
1100
420
860
970
1100
820
380
400
500
600
1080
1025
675
684
1034
1107
427
867
974
1105
825
387
407
505
604
1084
XD4
XD4
XD4
XD4
XD5
XD5
XD1
XD1
XD1
XD2
XD2
XD2
XD3
XD3
XD3
Table 6. Standard Operating Procedure (SOP) for Lab-Scale

Distillation Column Startup
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
identification
delay
(sample)
average
time
cost
(CPU s)
average
time cost
of DTW
(CPU s)
5
5
4
4
7
7
7
4
5
5
7
7
5
4
4
5.3333
0.0501
0.0699
0.1233
0.0273
0.0734
0.1289
0.0367
0.0689
0.2509
0.0373
0.0535
0.2024
0.0784
0.0712
0.2115
0.0989
55.3446
55.2385
55.2589
55.2721
55.3044
55.3314
55.3578
55.3446
55.3374
55.2985
55.3051
55.2517
55.2919
55.3051
55.2517
55.2996
Table 7. Process Disturbances During Distillation Column

Operation
distillation column startup SOP
case
disturbance
type
set all controllers to manual

fill reboiler with liquid bottom product
open reflux valve and operate the column on full reflux
establish cooling water flow to condenser
start the reboiler heating coil power
wait for all of the temperatures to stabilize
start feed pump
activate reflux control and set reflux ratio
open bottom valve to collect product
wait for all the temperatures to stabilize
DST01
DST02
DST03
DST04
DST05
DST06
DST07
DST08
DST09
DST10
reboiler power low

reboiler power high
feed pump high
feed pump low
Tray temperature sensor T6 fault
reflux ratio high
reflux ratio low
bottom valve
cooling water
low cooling water flow and
feed pump malfunction
step
step
step
step
random variation
step
step
sticking
slow drift
step
studied here (see Table 2). Different instances (runs) of the same
disturbance class have different start times, duration, and
magnitude. For example, during XD1-A, the flowrate of A feed
from upstream is increased from the base case value of 0.25052
kscmh to 0.3902 kscmh (a 60% change) in three steps starting
at t ) 180 min as shown in Table 2a. After the process recovers
from these, the inverse change, decreasing the A feed flow, is
introduced at t ) 780 min. The process is then allowed to return
to a steady state. The effect on the A flow rate (XMEAS(1))
and the downstream pressures (XMEAS(13) and XMEAS (16))
is shown in Figure 12. Two other instances XD1-B and XD1-C
with changes of magnitude of 55% and 50% were also
introduced. As described in Srinivasan et al.8 similar changes
were introduced to bring forth the other disturbance classes.
One instance from each of the classes, XD1-B, XD2-B, XD3B, XD4-B, and XD5-B, was used to develop the reference
disturbance database for online disturbance identification.
The difficulty in identifying the disturbance online can be
estimated by a preliminary analysis. Table 3 shows the difference between the 15 disturbances calculated as the average
difference among signals. Comparison is made after signals have
been normalized to [0 1] on the basis of the range of the sensor
(see Srinivasan et al.8). As can be seen from the table, the
minimum inter-class distance is 0.0824 (between XD1-C and
XD5-C) and the maximum intra-class distance is 0.2909 (class
XD4). Therefore, difference between the classes by direct
comparison, even if the complete signal is available, is a
nontrivial exercise. In this work, we consider the even more
difficult task of differentiating between the disturbances as they
evolve.
The proposed online signal comparison method is used for
online disturbance identification as follows. Consider Run-4
where the process is in state XD2 until T ) 1030. An unknown
disturbance occurs starting at T ) 1030 which is initially
indicated when (Tw, R/w) ) 0.0028,

(tm, rR*) ) 0.3147, and
eq 14 is violated. An accurate difference is calculated using
XTWSP and (Tw, R/w) ) 0.0022 and (tm, rR*) ) 0.3065.
Because this is larger than the 2 max threshold even after
resynchronization, as per eq 15 it is evident that the real-time
signal does not confirm to reference signal R2 starting from T
) 1030 () point of divergence POD
T ). The disturbance can be
identified by calculating the new optimal reference signal R*
and the base point R* using the divergent segment TD ) {tm-V+1,
tm-+2, ..., tm}. In the first iteration, at time T ) 1030, (T, R1)
) 0.0501, (T, R3) ) 0.0511, (T, R4) ) 0.0514, and (T, R5)
) 0.0512 (see Table 4). Because the values for all the
reference signals are similar (as indicated by the inseparability
ratio R ) 0.9804 > Rmax), R* cannot be identified at this point
and further iterations are necessary. In each subsequent iteration,
as the real-time signal T evolves, the evaluation window is
updated (as shown in Figure 5) and the analysis repeated. As
the disturbance becomes more evident with time, R decreases
(see Table 4), and at T ) 1034, (T, R4) falls below max. The
optimal reference R* is then identified as R4 (i.e., disturbance
XD4). The base point at R* ) 314 in R4 is found to correspond
to T ) 1034. Real-time signal tracking is then resumed for
subsequent samples. The average time cost for this run was
0.0273 CPU seconds with the proposed method as against
55.2721 CPU seconds with DTW as summarized in Table 5
(depicted as the second disturbance in Run-4). This brings out
the computational advantages of the proposed strategy.
Similar disturbance identification studies were performed for
14 other runs. Details of these are presented in Qian17 and only
a summary is presented here. Similar high accuracies were found
in all test runs as shown in Table 5. In each run, two disturbances
were introduced. In all cases, the proposed method correctly
identified the disturbance with an average delay of 5.23 min.
4546
Table 8. Faults Diagnosis Results for Lab-scale Distillation Column Case Study
case
time fault
introduced
(sample)
detection
time
(sample)
Run-01
Run-02
Run-03
Run-04
Run-05
Run-06
Run-07
Run-08
Run-09
Run-10
1
1
359
356
425
350
345
470
1
300
6
6
370
357
426
353
346
472
6
302
detection
delay
(sample)
identification
time
(sample)
5
5
11
1
1
3
1
2
5
2
average
y
(sample)
6
6
371
360
430
355
347
473
6
309
6
6
13
5
6
4
3
4
7
5
3.6
5.9
Table 9. Effect of Noise on Fault Diagnosis Delay in Lab-Scale

Distillation Column Case Study
identification time
(sample)
fault
introduction
time
1%
Run-1
Run-2
Run-3
Run-4
Run-5
Run-6
Run-7
Run-8
Run-9
Run-10
average
1
1
359
356
425
350
345
470
1
300
6
6
371
360
430
354
347
473
6
308
identification delay
(sample)
2%
3%
4%
5%
6
6
371
360
430
355
347
473
6
308
6
6
371
360
430
355
347
473
6
309
6
6
371
360
430
355
347
473
6
309
6 5
5
5
5
5
6 5
5
5
5
5
371 12
12
12
12
12
360 4
4
4
4
4
438 5
5
5
5
13
355 4
5
5
5
5
348 2
2
2
2
3
473 3
3
3
3
3
6 5
5
5
5
5
311 8
8
9
9
11
5.3 5.4 5.5 5.5 6.6
1%
2%
3%
4%
5%
The average time cost for online signal comparison is only

0.0989 CPU seconds (on a Pentium IV, 2.4 GHz cpu) in contrast
to 55.3 s for DTW. This factor of 559 speedup in computation
over DTW clearly shows the efficiency of the proposed method
and illustrates its suitability for large-scale applications.
4.2. Case Study 2: Online Fault Diagnosis during Startup
of a Lab-Scale Distillation Column. In this section, the
proposed methodology is illustrated on a lab-scale distillation
unit. The schematic of the unit is shown in Figure 13. The
distillation column is of 2 m height and 20 cm inner-diameter
and has 10 trays. The feed enters at tray 4. The system is well
integrated with a control console and data acquisition system.
A total of 19 variables comprising all tray temperatures, reboiler
and condenser temperatures, reflux ratio, top and bottom column
temperatures, feed pump power, reboiler heat duty, and cooling
water inlet and outlet temperatures are measured at 10-s
intervals. Cold startup of the distillation column with an
ethanol-water 30% (v/v) mixture is performed following the
standard operating procedure shown in Table 6. The feed passes
through a heat exchanger before being fed to the column. The
startup normally takes 2 h, and different faults such as sensor
fault, failure to open pump, too high a reflux ratio, and so forth
can be introduced at different states of operation. The reference
database is first populated using data from 11 runs of the
process: one normal startup and the 10 faults summarized in
Table 7.
The online signal comparison algorithm was then used for
fault diagnosis and decision support during subsequent startups
of the column. Consider one run (Run-3; Figure 14) when a
fault was introduced at T ) 3590 s when the operators
introduced too large a feed pump flowrate (200 rpm) to the
column. This causes instability in the column resulting in a
drastic drop in the columns temperatures. Results from this
run show that the real-time signal is initially close to normal.
best
matching
reference
DST01
DST02
DST03
DST04
DST05
DST06
DST07
DST08
DST09
DST10
0.0162
0.0097
0.0101
0.0217
0.0109
0.0268
0.0319
0.0100
0.0103
0.0119
0.2817
0.1091
0.3456
0.6663
0.4113
0.2379
0.6940
0.2330
0.1582
0.6910
0.0156
0.3828
identification
delay (sample)
5
5
12
4
5
5
2
2
5
9
5.4
time cost
(s)
0.1716
0.1028
0.0317
0.0342
0.0256
0.0368
0.0264
0.0256
0.0556
0.0840
0.0594
The difference between the real-time signal and all other

references is much higher; R ) 0.2235 at T ) 6. Starting from
t ) 3700 s, the difference between the real-time signal and the
normal reference increases, indicating that there is a fault during
the startup operation. The difference between the real-time
signal and DST03 is less than max (0.05). Also, the R falls
below Rmax (0.70), so the fault is identified as DST03.
Similar tests were done for all other cases. In the interest of
space, only a summary of the findings is presented in Table 8.
Faults in all the test runs were correctly identified with an
average delay of 3.6 samples (and maximum detection delay
of 11 samples for Run-03). All faults could be accurately
identified within an average of 5.4 samples of their occurrence.
The maximum identification delay was about 12 samples for
Run-03. The average R at the time of identification is 0.3828
against the Rmax threshold of 0.7, which shows the clear
identification of the faults. The average computation time cost
at each sample was 0.0594 s, which is much less than the
sampling rate of 10 s. The proposed method is therefore clearly
suitable for online fault diagnosis in this case as well.
4.3. Robustness and Parameter Tuning. In this section, the
robustness of the proposed online signal comparison method is
studied. Varying amount of noise levels were added to the online
signal to investigate robustness to noise. The affect of the tuning
parameter settings on online signal comparison was also studied.
The interested reader is referred to Srinivasan and Qian1,2 for
robustness studies of singular point detection and DLA.
The robustness of the proposed method to sensor noise is
reported in this part using data from the lab-scale distillation
column case study. Additional measurement noises ranging from
1% to 5% were added to all the original signals and fault
diagnosis performed. The results are shown in Table 9. As has
to be expected, there is an increase in the fault identification
delay with increasing noise; however, the effect is minimal with
the average delay increasing from 5.3 samples to 6.6 samples.
A similar study was performed for the TE case study as well
(see Table 10), and the average identification delay increased
from 5.1333 samples to 17.7333 samples at 5% noise level.
This larger variation in the TE case arises from the inherent
complexity and larger noise levels in the base process. Overall,
the proposed method online signal comparison method is robust
to noise.
The proposed method uses two tunable parameters: Rmax and
max. While in the general case different process signals may
require different values of these parameters, we have found that
the same parameter settings can be used across variables and
case studies. The results of decreasing Rmax from 0.80 to 0.60
for the distillation column as well as the TE case studies show
that Rmax has no significant affect on fault detection delay. A
smaller Rmax would require a clearer separation between the
Table 10. Effect of Noise on Disturbance Identification in TE Case Study
identification time (sample)
Run-1
Run-2
Run-3
Run-4
Run-5
Run-6
Run-7
Run-8
Run-9
Run-10
Run-11
Run-12
Run-13
Run-14
Run-15
average
identification delay (sample)
disturbance
introduction time
1%
2%
3%
4%
5%
1%
2%
3%
4%
5%
181
241
301
181
241
301
181
241
301
181
241
301
181
241
301
187
245
305
189
245
308
188
245
305
186
245
309
185
245
305
192
246
306
196
244
316
199
246
317
191
253
314
188
245
313
194
245
313
198
254
317
201
249
318
205
253
314
196
247
315
194
245
313
201
256
323
195
256
326
220
253
315
193
247
314
199
246
313
205
265
316
203
255
329
220
253
314
198
249
316
6
4
4
8
4
7
7
4
4
5
4
8
4
4
4
5.1333
11
5
5
15
3
15
18
5
16
10
12
13
7
4
12
10.0667
13
4
12
17
13
16
20
8
17
24
12
13
15
6
14
13.6000
13
4
12
20
15
22
14
15
25
39
12
14
12
6
13
15.7333
18
5
12
24
24
15
22
14
28
39
12
13
17
8
15
17.7333
Table 11. Effect of rmax on Identification Delay in the TE Case Study

Run-1
Run-2
Run-3
Run-4
Run-5
Run-6
Run-7
Run-8
Run-9
Run-10
Run-11
Run-12
Run-13
Run-14
Run-15
average
disturbance
introduction time
Rmax )
0.80
Rmax )
0.75
Rmax )
0.70
Rmax )
0.65
Rmax )
0.60
Rmax )
0.80
Rmax )
0.75
Rmax )
0.70
Rmax )
0.65
Rmax )
0.60
181
241
301
181
241
301
181
241
301
181
241
301
181
241
301
187
245
305
189
245
304
186
245
305
185
245
305
184
245
305
187
245
305
189
245
305
187
245
305
185
245
308
184
245
305
187
245
305
189
245
308
188
245
305
186
245
309
185
245
305
187
245
311
190
245
313
189
249
315
186
247
313
185
245
313
187
245
313
191
267
313
189
249
316
186
247
327
185
282
355
6
4
4
8
4
3
5
4
4
4
4
4
3
4
4
4.3333
6
4
4
8
4
4
6
4
4
4
4
7
3
4
4
4.6667
6
4
4
8
4
7
7
4
4
5
4
8
4
4
4
5.1333
6
4
10
9
4
12
8
8
14
5
6
12
4
4
12
7.8667
6
4
12
10
26
12
8
8
15
5
6
26
4
41
54
15.8000
Table 12. Effect of rmax on Identification Delay in Lab-scale Distillation Column Case Study
Run-1
Run-2
Run-3
Run-4
Run-5
Run-6
Run-7
Run-8
Run-9
Run-10
average
fault introduction time
Rmax )
0.80
Rmax )
0.75
Rmax )
0.70
Rmax )
0.65
Rmax )
0.60
Rmax )
0.80
Rmax )
0.75
Rmax )
0.70
Rmax )
0.65
Rmax )
0.60
1
1
359
356
425
350
345
470
1
300
6
6
371
359
430
355
347
473
6
308
6
6
371
359
430
355
347
473
6
308
6
6
371
360
430
355
347
473
6
309
6
6
371
360
432
355
347
473
6
310
6
6
371
361
432
355
348
473
6
311
5
5
12
3
5
5
2
3
5
8
5.3
5
5
12
3
5
5
2
3
5
8
5.3
5
5
12
4
5
5
2
3
5
9
5.5
5
5
12
4
7
5
2
3
5
10
5.8
5
5
12
5
7
5
3
3
5
11
6.1
reference signal and the optimal reference signal before a fault

is confirmed. This would lead to a delay in fault identification.
The average identification delay for the TE case changed from
4.333 to 15.80 samples when Rmax was reduced from 0.80 to
0.60 (see Table 11) while for the distillation column case study
the delay increased from 5.3 to 6.1 samples (see Table 12).
The extent of robustness of Rmax is further revealed by the
fact that for the distillation column case study, even setting
Rmax ) 0.30 results in an average identification delay of only
9.1 samples. A similar result was obtained for max as well
(see Qian17 for details). Overall these results clearly establish
the robustness of the proposed method to the two tuning
parameters.
5. Summary
Online signal comparison is important for process monitoring,
fault diagnosis, and process state identification. In this paper,
we have proposed a signal comparison-based strategy for online
disturbance or fault identification. Given a suitably annotated
historical database of process states, normal and abnormal, the
proposed method finds the best matching state at any given time
by comparing the real-time sensor measurements with the signals
in the database. In contrast to signal comparison strategies
reported in literature, which are designed for offline signal
comparison, the proposed method does not require any a priori
knowledge about the online signal; specifically, the beginning
and end of the real-time signal do not need to coincide with
4548
those of the library signals. The endpoints of the two signals

are synchronized automatically using the DLA methodology.
DLA is inherently computationally efficient when the real-time
signal is small; the flanking strategy proposed here reduces the
search complexity tremendously when a long segment of the
real-time signal has to be compared. The real-time signal
tracking strategy based on XTW and XTWSP further reduces
the computational load required when the process essentially
follows a previously determined reference signal. These endow
the main advantage of the proposed method, which is that it is
significantly faster in comparison with other time warping
methods. This has been illustrated clearly using two different
case studies: disturbance identification in the Tennessee Eastman challenge plant and fault online diagnosis during startup
of a lab-scale distillation column. As shown in section 4, the
method is also robust to noise as well as parameter settings.
The time warping-based methods proposed here are inherently
multivariate, although they have been used in a uni-variate form
in the case studies. The choice between multivariate2 versus
univariate1 signal comparison is based on the features of the
application. Multivariate time warping relies on the premise that
there is no desynchronization between variables so that the same
warping can be applied to all the variables. This premise is not
valid in processes undergoing transitions where the inherent
variability of the manual operations would lead to singular points
and inflections in the different variables occurring at different
times. In such cases, additional process-specific logic can be
used to synchronize cross variable differences.
Notation
i, j Time index
DA(i, j) ) minimum accumulated DTW distance from point
(1, 1) to point (i, j)
DS ) DLA dissimilarity matrix between signal X and signal Y
K ) collection of reference signals
R ) reference signal R ) {r1, r2, ..., rj, ..., rn}
R* ) optimal reference signal
R/w ) locii of Tw in R*
T ) real-time signal T ) {t1, t2, ..., ti, ...tm}
TD ) divergent segment of T
Tw ) evaluation window in T
X ) signal X ) {x1, x2, ..., xi, ..., xm}
XA ) anterior flanking segment
XP ) posterior flanking segment
Y ) signal Y ) {y1, y2, ..., yj, ...yn}
YA ) matching segment of XA
YP ) matching segment of XP
Z ) segment of signal Y, {yl, yl+1, ..., yj}
(xi, yj) ) difference between xi and yj
(tm, rR*) ) difference between tm and its warping assignment

in R*
R ) inseparability ratio of reference signals 0 e R e 1
Rmax ) minimum inseparability threshold
V ) flank length
(Tw, R/w) ) normalized difference between Tw and R/w
(Tw, R/w) ) approximate XTW difference between Tw and R/w
max ) threshold for signal similarity
R ) time index in R
T ) time index in T
) time index of point of divergence in T
POD
T
SP
T ) time index of last singular point in T
Literature Cited
(1) Srinivasan, R.; Qian, M. S. Offline temporal signal comparison using
singular points augmented time warping. Ind. Eng. Chem. Res. 2005, 44
(13), 4697-4716.
(2) Srinivasan, R.; Qian, M. S. Online fault diagnosis and state
identification during process transition using dynamic locus analysis. Chem.
Eng. Sci. 2006, 61, 6109-6132.
(3) Webb, A R. Statistical pattern recognition; Wiley: West Sussex,
U.K., 2002.
(4) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault detection and
diagnosis in industrial systems; Springer: London, New York, 2001.
(5) Krzanowski, W. J. Between-group comparison of principal components. J. Am. Stat. Assoc. 1979, 74 (367), 703-707.
(6) Raich, A.; Cinar, A. Diagnosis of process disturbances by statistical
distance and angle measures. Comput. Chem. Eng. 1997, 21 (6), 661-673.
(7) Singhal, A.; Seborg, D. E. Pattern matching in historical batch data
using PCA. IEEE Control Syst. Mag. 2002, 22 (5), 53-63.
(8) Srinivasan, R.; Wang, C.; Ho, W. K.; Lim, K. W. Dynamic PCA
based methodology for clustering process states in agile chemical plants.
Ind. Eng. Chem. Res. 2004, 43 (9), 2123-2139.
(9) Cheung, J. T-Y.; Stephanopoulos, G. Representation of process trends
- Part I. A formal representation framework. Comput. Chem. Eng. 1990,
14 (4/5), 495-510.
(10) Bakshi, B. R.; Stephanopoulos, G. Representation of process trends
- IV. Induction of real-time patterns from operating data for diagnosis
and supervisory control. Comput. Chem. Eng. 1994, 18 (4), 303-332.
(11) Rengaswamy, R.; Venkatasubramanian, V. A syntactic pattern
recognition approach for process monitoring and fault diagnosis. Eng. Appl.
Artif. Intell. 1995, 8 (1), 35-51.
(12) Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S. N.; Yin, K.
A review of process fault detection and diagnosis Part III: Process history
based methods. Comput. Chem. Eng. 2003, 27, 327-346.
(13) Maurya M. R.; Rengaswamy, R.; Venkatasubramanian, V. Fault
diagnosis using dynamic trend analysis: A review and recent developments.
Eng. Appl. Artif. Intell. 2007, 20, 133-146.
(14) Sundarraman, A.; Srinivasan, R. Monitoring transitions in chemical
plants using enhanced trend analysis. Comput. Chem. Eng. 2003, 27 (10),
1455-1472.
(15) Sankoff, D.; Kruskal, J. B. Time Warps, String Edits, and
Macromolecules: The Theory and Practice of Sequence Comparison;
Addison-Wesley: Reading, MA, 1983.
(16) Kassidas, A.; MacGregor, J. F.; Taylor, P. A. Synchronization of
batch trajectories using dynamic time warping. AIChE J. 1998, 44 (4), 864875.
(17) Qian, M. S. Ph.D. thesis, National University of Singapore,
Singapore, 2004.
(18) Waterman, M. S.; Eggert, M. A New Algorithm for Best Subsequence Alignments with Application to tRNA-rRNA Comparisons. J. Mol.
Biol. 1987, 197, 723-728.
(19) Downs, J. J.; Vogel, E. F. A plant-wide industrial process control
problem. Comput. Chem. Eng. 1993, 17 (3), 245-255.
(20) Singhal, A. Tennessee Eastman simulation model. http://www.
chemengr.ucsb.edu/ceweb/computing/TE/tesimulation.htm
(accessed
2003).
(21) McAvoy, T. J.; Ye, N. Base control for the Tennessee Eastman
problem. Comput. Chem. Eng. 1994, 18 (5), 383-413.
ReceiVed for reView January 25, 2006

ReVised manuscript receiVed March 26, 2007
Accepted April 2, 2007
IE060111S

Online Temporal Signal Comparison

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Online Temporal Signal Comparison

Uploaded by

Copyright:

Available Formats

Ind. Eng. Chem. Res.

2007, 46, 4531-4548

Online Temporal Signal Comparison Using Singular Points Augmented Time

National University of Singapore.

Institute of Chemical and Engineering Sciences.

determined. The challenge, however, is that, as a result of the

10.1021/ie060111s CCC: $37.00 2007 American Chemical Society

Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007

Figure 1. Typical signal and its singular points.

their low computational requirements. Their main shortcomings

of the process to overcome the latter shortcoming. However, it

F* ) {c(1), c(2), ..., c(p), ..., c(P)}

d[i(p), j(p)] w(p)

D*(r, t) ) min[D(r, t)]

where (i(p), j(p)) is the warping assignment at step p, d[i(p)

Local constraints determine local features around this point. For

Figure 2. Algorithm for real-time signal tracking.

j - 1), and (i - 1, j - 2) as the set of predecessors to (i, j) and

where DA(1, 1) ) d(1, 1) and condition (A*) indicates that the

an anchoring strategy for updating the signal comparison when

Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007

Figure 3. Algorithm for optimal reference signal R* identification.

Figure 4. Flanking segments used for reference signal identification.

important criterion for linking two sets of singular points is that

Three possible successors, (i + 1, j), (i + 1, j + 1), and (i + 1,

DA(i + 1, j*) ) min

with initial condition DA(1, 1) ) d(1, 1). Condition (B*)

Table 1. TE Process Measurements and Their Base Values

using dynamic programming. The global optimality of the

Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007

real-time signal comparison where the online signal is available

D*(X, Z*) ) min{D*(X, Z)}

with respect to the reference signal. The former is called the

Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007

Figure 8. Snapshot at T ) 226 of signal comparison between T and R.

Figure 9. Snapshot at T ) 682 of signal comparison between T and R.

bounded by . In general, any method can be used to identify

comparing a signal that is evolving in time. Consider X ) {x1,

The portion of X used for comparison is called the eValuation

(X, Y) ) ({x1, ..., xm}, {y1, ..., yj})

Using the linear additivity property,

(X, Y) ) ({x1, ..., xtXSP-1}, {y1, ..., ytYSP-1}) +

Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007

explanation, the time variable for the real-time signal is denoted

on singular points. Let the last singular point in T be at time T

real-time signal tracking is to update the base point and R/w

of XTW comes at a price: a large (Tw, R/w) does not

The numerator is the XTW difference between Tw and R/w

( (Tw, R/w) < max) and (

Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007

Figure 13. Schematic of the lab-scale distillation column.

reference signal has to be identified, starting at T ) m as the

{tTPOD, ..., tm}

{tm-+1, ..., tm} if e m < POD

Note that POD

R* ) arg min ((TD, R))

A tradeoff exists between the speed and accuracy of reference

ing R* is small, several good matches may exist in K. We

The inseparability ratio thus reflects the uniqueness of R*.

((TD, R*) < max) and (R < Rmax)

where the first condition ensures low difference between TD

3.1 it is efficient as long as the length of the divergent segment

where (i, j) is the warping assignment between TD and its locus

DLA is used to identify all the matching segments YA in R

D(X, Z) ) min{D*(X, Z)}