Introduction
• In a batch process under control, variables describe typical trajectories
• Data emerging from a batch process are arranged in a three-dimensional matrix X
Background
Monitoring future batches
Compute the Hotelling statistic
• Take a complete future batch x′f and compute:
$Q_f = \sum_{i=1}^{PT} (x_{i,f} - \hat{x}_{i,f})^2$
$\hat{x}_f = U_q c_{q,f}$
where $U_q = [u_1 | u_2 | \dots | u_q]$ is the eigenvector matrix of the q PCs retained and $c_{q,f}$ is the (q × 1) vector of scores
• Qf represents the perpendicular distance between the future batch, represented by the PT original r.vs., and the plane defined by the q PCs retained in the model
• Control limits for Qf are: [equation not recoverable from the source]
• A large Qf indicates a change in the correlation structure as present in the r.vs. when the reference model was built
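The Qf computation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `q_statistic`, the random reference data, and the choice q = 2 are assumptions made for the example, and the future batch vector is assumed to be unfolded and already centered with the reference means.

```python
import numpy as np

def q_statistic(x_f, U_q):
    """Return Q_f, the squared residual of x_f after projection on the q PCs."""
    c_qf = U_q.T @ x_f        # (q,) vector of scores
    x_hat = U_q @ c_qf        # reconstruction from the retained PCs
    residual = x_f - x_hat
    return float(residual @ residual)

# Hypothetical reference set: 40 batches, 3 variables x 21 instants, unfolded.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(40, 63))
X_ref -= X_ref.mean(axis=0)                     # mean-center each column
_, _, Vt = np.linalg.svd(X_ref, full_matrices=False)
U_q = Vt[:2].T                                  # (PT x q) eigenvector matrix, q = 2

x_f = rng.normal(size=63)                       # stand-in for a centered future batch
Q_f = q_statistic(x_f, U_q)
print(Q_f)
```

A vector that lies in the plane spanned by the retained PCs gives Qf equal to zero (up to rounding), which matches the reading of Qf as a perpendicular distance.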
On-line control scheme
• Batches may go out-of-control in early time periods, yielding bad products
• To calculate scores and chart statistics, vector x′f must be at hand (and complete)
• On-line monitoring demands completing vector x′f:
– There are three strategies for that

Strategies
• Suppose l (out of T) time instants have already passed in the batch
• Strategy 1 assumes that in the remaining T – l instants variables will follow their mean trajectories:
– fill remaining entries of x′f with zeros (with mean-centered data, zero stands for the mean)
– Q-chart will be extremely sensitive, which is good
– T2-chart will not be sensitive to extreme variability since scores cql,f will be “neutralized” by mean values
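Strategy 1 amounts to padding the observed part of the batch with zeros before unfolding. The sketch below assumes mean-centered data and a time-major unfolding (all variables at t = 1, then all variables at t = 2, and so on). The function name `complete_with_zeros` and the unfolding order are assumptions made for the example.

```python
import numpy as np

def complete_with_zeros(x_observed, T):
    """Pad an (l x P) block of mean-centered observations to T instants.

    Returns the unfolded vector x'_f of length T * P, with zeros standing
    for the mean trajectory in the T - l instants not yet observed.
    """
    x_observed = np.asarray(x_observed, dtype=float)
    l, P = x_observed.shape
    full = np.zeros((T, P))
    full[:l] = x_observed
    return full.reshape(-1)    # time-major unfolding (an assumption)

rng = np.random.default_rng(1)
x_obs = rng.normal(size=(5, 3))          # l = 5 instants, P = 3 variables
x_f = complete_with_zeros(x_obs, T=21)
print(x_f.shape)
```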
• What if batches are variable in duration?
– Variables' trajectories may shrink or expand according to process conditions and raw materials
– Shorter or longer batches may or may not yield good product

• Identify an indexing variable other than time with the same duration in all batches (Nomikos & MacGregor, 1994):
– Such a variable may not exist!
Approaches to the problem in literature
How may the problem be dealt with?
Proposition
• Proposed approach has three main steps:
1. Completion of batches shorter than the longest batch observed in practice
2. Reduction of data complexity using the Statis method
3. Use of non-parametric control charts to control future batches

Example
• Three variables: X1, X2, X3
• 40 good reference batches are available
• Time-to-completion of batches varies:
– shortest batches end at t = 19
– longest batches end at t = 21
Mean trajectories of variables
[Figure: mean trajectories of X, Y and Z versus time, t = 1 to 21]

Partial view of data
Batch  Var  t=1      t=2      t=3      t=4      ...  t=18     t=19     t=20     t=21
26     X    20.1421  30.165   33.9751  34.6274  ...  16.5343  15.8786
26     Y    9.7174   9.859    9.5511   10.1236  ...  11.5276  10.2035
26     Z    5.263    4.7974   4.4814   3.791    ...  1.1216   1.3156
27     X    19.9558  29.0202  32.9224  35.7367  ...  16.0053  15.1632  15.8581
27     Y    9.7708   9.6661   9.8765   10.3492  ...  12.1413  10.2817  9.8593
27     Z    4.8877   4.8504   4.344    4.0544   ...  1.5986   1.2097   0.6747
28     X    19.86    28.7166  34.3295  34.2046  ...  17.0258  15.1871
28     Y    10.2027  10.0999  9.968    10.592   ...  11.5587  10.6265
28     Z    5.0727   4.9251   4.1041   4.1746   ...  1.2045   1.2405
29     X    20.6936  29.7174  34.9566  34.8955  ...  17.6458  16.1126  15.0941  15.5051
29     Y    10.3195  10.848   10.2155  10.035   ...  12.5399  11.2841  10.241   9.5713
29     Z    4.7232   5.2174   4.466    3.9165   ...  1.1612   0.9443   0.6874   0.7952
30     X    20.5094  29.2835  33.5927  35.1364  ...  16.972   14.9124  14.7962
30     Y    9.5758   10.013   10.454   9.8377   ...  11.2068  10.5323  10.3085
30     Z    4.9597   4.9468   4.1527   3.9909   ...  1.2549   0.8197   0.8696
31     X    19.4428  29.3049  32.2347  34.573   ...  15.7927  15.3653  15.2845
31     Y    9.4775   9.818    10.3712  10.1699  ...  12.1857  11.0023  10.2435
31     Z    5.071    4.7932   4.3942   3.7319   ...  0.6462   1.6457   1.0293
Development
• Denote the minimum and maximum batch duration in the reference sample by Tmin and Tmax respectively
• Calculate the sample standard deviations associated with the i-th variable:
– at each time instant
– using information from all batches
• Forecast standard deviations from time Tmin+1 to Tmax:
$\hat{s}_{min+1}, \dots, \hat{s}_{max}$
• Pair sample means and forecasted standard deviations from time Tmin+1 to Tmax:
$(\bar{x}_{min+1}, \hat{s}_{min+1}), \dots, (\bar{x}_{max}, \hat{s}_{max})$
Use pairs of parameters to simulate normally distributed variables
• There will be a series of simulated values for each time instant, from Tmin+1 to Tmax
• Complete batches by sampling from the simulated values at the proper time instant

Batch  Var  t=1      t=2      t=3      t=4      t=5      ...  t=17     t=18     t=19     t=20      t=21
27     X    19.9558  29.0202  32.9224  35.7367  35.086   ...  17.1604  16.0053  15.1632  15.8581   15.38223
27     Y    9.7708   9.6661   9.8765   10.3492  10.6152  ...  12.2171  12.1413  10.2817  9.8593    8.416457
27     Z    4.8877   4.8504   4.344    4.0544   3.1093   ...  1.141    1.5986   1.2097   0.6747    0.922517
28     X    19.86    28.7166  34.3295  34.2046  34.8312  ...  18.173   17.0258  15.1871  14.58002  14.77057
28     Y    10.2027  10.0999  9.968    10.592   9.9868   ...  11.9542  11.5587  10.6265  10.68964  8.894751
28     Z    5.0727   4.9251   4.1041   4.1746   2.9575   ...  1.1215   1.2045   1.2405   1.08139   0.500459
29     X    20.6936  29.7174  34.9566  34.8955  34.5794  ...  17.9739  17.6458  16.1126  15.0941   15.5051
29     Y    10.3195  10.848   10.2155  10.035   9.958    ...  13.5571  12.5399  11.2841  10.241    9.5713
29     Z    4.7232   5.2174   4.466    3.9165   3.5242   ...  0.9818   1.1612   0.9443   0.6874    0.7952
30     X    20.5094  29.2835  33.5927  35.1364  34.7326  ...  16.3747  16.972   14.9124  14.7962   15.44145
30     Y    9.5758   10.013   10.454   9.8377   10.0088  ...  13.0497  11.2068  10.5323  10.3085   11.00494
30     Z    4.9597   4.9468   4.1527   3.9909   2.8819   ...  1.0449   1.2549   0.8197   0.8696    0.869784
31     X    19.4428  29.3049  32.2347  34.573   33.8549  ...  17.4324  15.7927  15.3653  15.2845   16.31118
31     Y    9.4775   9.818    10.3712  10.1699  10.2847  ...  12.5208  12.1857  11.0023  10.2435   9.504082
31     Z    5.071    4.7932   4.3942   3.7319   3.2404   ...  1.2909   0.6462   1.6457   1.0293    0.850513

Step 1 - Completing batch data: pros and cons
↑ Variables’ variances are preserved in the completed batches:
– No risk of flattening variation in the last time instants, which would yield control charts that are too sensitive
↓ Some batches may present a step (disruption) between real and simulated data:
– Autocorrelation structure may be perturbed
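Step 1 can be sketched as follows. The slides do not say how the standard deviations are forecast, so the default here simply carries forward the sample standard deviation at Tmin. That rule, the function name `complete_batches`, and the synthetic data are all assumptions made for illustration, and values are drawn directly from the normal distribution rather than from a pre-simulated series.

```python
import numpy as np

def complete_batches(batches, s_hat=None, rng=None):
    """Complete batches shorter than T_max with simulated normal values.

    batches: list of (T_b x P) arrays, one per batch.
    s_hat:   optional ((T_max - T_min) x P) forecast standard deviations.
    Returns an array of shape (B, T_max, P).
    """
    rng = np.random.default_rng() if rng is None else rng
    lengths = [b.shape[0] for b in batches]
    T_min, T_max = min(lengths), max(lengths)
    P = batches[0].shape[1]

    data = np.full((len(batches), T_max, P), np.nan)
    for k, b in enumerate(batches):
        data[k, :b.shape[0]] = b

    # Sample means at each instant, using every batch that reached it.
    means = np.nanmean(data, axis=0)                      # (T_max, P)
    if s_hat is None:
        # ASSUMPTION: forecast = last standard deviation seen by all batches.
        s_last = np.std(data[:, T_min - 1, :], axis=0, ddof=1)
        s_hat = np.tile(s_last, (T_max - T_min, 1))

    completed = data.copy()
    for k, T_b in enumerate(lengths):
        for t in range(T_b, T_max):                       # missing instants only
            completed[k, t] = rng.normal(means[t], s_hat[t - T_min])
    return completed

rng = np.random.default_rng(2)
lengths = [19, 20, 21, 19, 21, 20]                        # durations as in the example
batches = [rng.normal(loc=10.0, scale=1.0, size=(T, 3)) for T in lengths]
completed = complete_batches(batches, rng=rng)
print(completed.shape)
```

Observed values are left untouched. Only the instants after each batch's own end are simulated, which is where the step between real and simulated data mentioned above can appear.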
Step 2: Reduction of data complexity
• Completed batch data are analyzed using the Statis method
• Statis reduces data dimensionality preserving information on time instants
• Data are analyzed in Statis regarding their:
– interstructure
– intrastructure

How the interstructure analysis works
• Let Xb be a table with data from batch b (rows are time instants 1, ..., Tmax; columns are variables 1, ..., P):
$X_b = \begin{bmatrix} X_{11} & X_{21} & \cdots & X_{P1} \\ X_{12} & X_{22} & \cdots & X_{P2} \\ \vdots & \vdots & \ddots & \vdots \\ X_{1T_{max}} & X_{2T_{max}} & \cdots & X_{PT_{max}} \end{bmatrix}$
• Each Xb will be replaced by another table, comprised of scalar products between “individuals” (i.e., time instants):
$W_b = X_b X_b'$
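As a small illustration of the table of scalar products, the sketch below builds Wb from one batch table. The function name `scalar_product_table` and the random data are assumptions made for the example.

```python
import numpy as np

def scalar_product_table(X_b):
    """Return W_b = X_b X_b', the scalar products between time instants."""
    X_b = np.asarray(X_b, dtype=float)
    return X_b @ X_b.T

rng = np.random.default_rng(3)
X_b = rng.normal(size=(21, 3))     # T_max = 21 time instants, P = 3 variables
W_b = scalar_product_table(X_b)
print(W_b.shape)                   # (21, 21)
```

Wb is symmetric, and its rank cannot exceed the number of variables P.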
Some remarks
• Note that Wb is a (Tmax × Tmax) matrix:
– each line in Wb summarizes the information regarding all variables measured in that time instant
– resulting matrix is related to the variance-covariance matrix of unfolded data

Inter-structure analysis
• The Hilbert-Schmidt scalar product is used to obtain a similarity measure between pairs of batches:
$\langle W_b \mid W_{b'} \rangle_{HS} = \mathrm{Tr}(D W_b D W_{b'})$
where D is the matrix of importance weights for the individuals (time instants)
• Normalizing W matrices, the Hilbert-Schmidt products become the coefficients of vectorial correlation among batches (RV coefficients):
– If RV(b, b’) = 1, matrices Xb and Xb’ are equivalent
– If RV(b, b’) = 0, variances in matrices Xb and Xb’ are not correlated at all
• Organize RVs for each pair of batches in a matrix S:
– S has dimension (B × B)
• Multiply S by ∆, such that:
$\Delta = \begin{bmatrix} \pi_1 & & 0 \\ & \ddots & \\ 0 & & \pi_B \end{bmatrix}$
where $\pi_b$ is the importance weight assigned to batch b
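The Hilbert-Schmidt product, the RV coefficients and the matrix S∆ can be sketched as below. Uniform weights (D = I/Tmax and ∆ = I/B) are an assumption made for the example, as are the function names `hs_product` and `rv_matrix` and the random data.

```python
import numpy as np

def hs_product(W_a, W_b, D):
    """Hilbert-Schmidt scalar product Tr(D W_a D W_b)."""
    return float(np.trace(D @ W_a @ D @ W_b))

def rv_matrix(W_list, D):
    """Matrix S of RV coefficients between every pair of batches."""
    B = len(W_list)
    norms = [np.sqrt(hs_product(W, W, D)) for W in W_list]
    S = np.empty((B, B))
    for a in range(B):
        for b in range(B):
            S[a, b] = hs_product(W_list[a], W_list[b], D) / (norms[a] * norms[b])
    return S

rng = np.random.default_rng(4)
B, T, P = 6, 21, 3                 # small hypothetical reference set
W_list = []
for _ in range(B):
    X = rng.normal(size=(T, P))
    W_list.append(X @ X.T)

D = np.eye(T) / T                  # ASSUMPTION: equal weights for time instants
Delta = np.eye(B) / B              # ASSUMPTION: equal weights for batches
S = rv_matrix(W_list, D)
S_Delta = S @ Delta
print(np.round(S, 2))
```

Each diagonal entry of S equals 1, since every batch is perfectly correlated with itself, and all entries lie between 0 and 1 because the Wb matrices are positive semi-definite.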
Principal components analysis on S∆
• Interstructure representation is obtained by projecting batch information in W1,…, WB on the 1st factorial plane of S∆:
– Projection of Wb in the ith axis of the 1st factorial plane is given by the coordinates of vector ab,i:
$a_{b,i} = \sqrt{\lambda_i}\, u_i$
where $\lambda_i$ is the ith eigenvalue of S∆ and $u_i$ is the ith eigenvector of S∆

Step 3(a) – Non-parametric control charts: IS-CC
• Chart displays overall picture of reference batches on first factorial plane
• Each point in IS-CC corresponds to a batch
• IS-CC is useful in off-line control of processes
Let us check the IS-CC using the data from the example
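The projection on the first factorial plane of S∆ can be sketched with an eigendecomposition. The example assumes uniform batch weights (∆ = I/B), so that S∆ is symmetric, and uniform time weights when building S. The function name `interstructure_coordinates` and the random data are assumptions made for the example.

```python
import numpy as np

def interstructure_coordinates(S, k=2):
    """Coordinates sqrt(lambda_i) * u_i of the batches on the first k axes."""
    B = S.shape[0]
    Delta = np.eye(B) / B                          # ASSUMPTION: equal batch weights
    lam, U = np.linalg.eigh(S @ Delta)             # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:k]              # keep the k largest
    lam = np.clip(lam[order], 0.0, None)           # guard against tiny negatives
    U = U[:, order]
    U = U * np.where(U.sum(axis=0) < 0, -1.0, 1.0) # fix the arbitrary sign
    return lam, U * np.sqrt(lam)

# Hypothetical reference set: 40 batches, 21 instants, 3 variables.
rng = np.random.default_rng(5)
B, T, P = 40, 21, 3
X = rng.normal(size=(B, T, P))
W = X @ X.transpose(0, 2, 1)                       # (B, T, T) stack of W_b
V = W.reshape(B, -1)
V = V / np.linalg.norm(V, axis=1, keepdims=True)   # normalize each W_b
S = V @ V.T                                        # RV matrix under uniform D

lam, coords = interstructure_coordinates(S)
print(coords[:3])                                  # IS-CC points for three batches
```

Each row of `coords` is one point of the IS-CC. Because all RV coefficients are positive here, every batch has the same sign on the first axis, so the second axis carries most of the visual separation between batches.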
IS-CC in the example
[Figure: IS chart, reference batches plotted on Factor 1 versus Factor 2]
Batch 8 seems like an outlier, although it belongs to the reference set

IS-CC: definition of a (1 − α) control region
1. Determine robust centroid in the factorial plane
2. Define inner region s.t. 50% of points in the graph [remainder of step 2 and step 3 not recoverable from the source]
4. Establish control region by defining a multiple of the distance l between centroid and the boundary of the 50%-hull that corresponds to the desired probability of false alarm α
IS-CC with control region: example
[Figure: IS-CC with control region]

Off-line control: testing a bad batch in the IS-CC
[Figure: X1, X2 and X3 versus time, t = 1 to 21]
Compromise (CO) matrix
$W_b = X_b X_b'$
Principal components analysis on $W_{CO} D$
• CO-CC displays overall picture of reference batches on first factorial plane of the intrastructure:
– each point in the chart represents the average behavior, over all batches, at a given time instant
• COt–CC displays behavior of batches at time instant t:
– each point in the chart corresponds to a batch, as observed at time t
• CO-CC presents the average trajectory of batches as time progresses:
– Projection of WCO in the ith axis of the 1st factorial plane is given by the coordinates of vector aCO,i:
$a_{CO,i} = \sqrt{\delta_i}\, v_i$
where $\delta_i$ is the ith eigenvalue of WCOD and $v_i$ is the ith eigenvector of WCOD
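The intrastructure step can be sketched as below. The slides do not give the formula for the compromise matrix, so the example assumes the usual Statis construction: WCO is a weighted sum of the Wb, with weights taken from the first eigenvector of S and rescaled to sum to one. That construction, the uniform D, the function names and the random data are all assumptions. Batch coordinates use the projection Wb D vi divided by the square root of δi.

```python
import numpy as np

def compromise(W, S):
    """ASSUMED Statis compromise: weighted sum of the W_b, with weights
    from the first eigenvector of S rescaled to sum to one."""
    _, U = np.linalg.eigh(S)
    u1 = np.abs(U[:, -1])                       # eigenvector of largest eigenvalue
    alpha = u1 / u1.sum()
    return np.tensordot(alpha, W, axes=1), alpha

def intrastructure_axes(W_co, d, k=2):
    """First k eigenvalues and eigenvectors of W_co D, with D = diag(d)."""
    root = np.sqrt(d)
    M = root[:, None] * W_co * root[None, :]    # symmetric D^(1/2) W_co D^(1/2)
    delta, Z = np.linalg.eigh(M)
    order = np.argsort(delta)[::-1][:k]
    return delta[order], Z[:, order] / root[:, None]

# Hypothetical reference set: 40 batches, 21 instants, 3 variables.
rng = np.random.default_rng(6)
B, T, P = 40, 21, 3
X = rng.normal(size=(B, T, P))
W = X @ X.transpose(0, 2, 1)                    # (B, T, T) stack of W_b
V = W.reshape(B, -1)
V = V / np.linalg.norm(V, axis=1, keepdims=True)
S = V @ V.T                                     # RV matrix under uniform D
d = np.full(T, 1.0 / T)                         # ASSUMPTION: equal time weights

W_co, alpha = compromise(W, S)
delta, Vec = intrastructure_axes(W_co, d)
a_co = Vec * np.sqrt(delta)                     # (T, 2): CO-CC points, one per instant
a_batches = (W * d) @ Vec / np.sqrt(delta)      # (B, T, 2): COt-CC points
print(a_co.shape, a_batches.shape)
```

With these definitions the compromise trajectory is the weighted average of the batch trajectories, which is consistent with reading each CO-CC point as the average behavior over all batches at a given time instant.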
CO-CC in the example
We have 21 time instants in the reference set
[Figure: CO-CC, time instants 1 to 21 plotted on Factor 1 versus Factor 2]

COt – CC
$a_{b,i} = \frac{1}{\sqrt{\delta_i}} W_b D v_i$
• Each chart displays behavior of the 40 batches in a given time instant
• Point t = 1 in CO-CC corresponds to center of data cloud in chart CO1
• COt charts enable on-line process control
On-line monitoring of a new batch: exemplifying
How to do on-line control of new batches

Complete WNB with reference batches
• Complete WNB using information in WCO, as follows:
[Diagram: for t = 1, 2, 3, ..., WNB is drawn as a (Tmax × Tmax) grid indexed by time instants t = 1, ..., Tmax, with regions labeled $X_{NB,t} X_{NB,t}'$ and $W_{CO}$]

Simulating on-line control of a new batch
• New batch data comprised of the first 5 time instants (t = 1,...,5) out of a total of 21
• New batch trajectory w.r.t. average trajectory
[Figure: X1, X2 and X3 versus time, t = 1 to 21]
[Figures: CO1-CC (t = 1), CO2-CC (t = 2), CO3-CC (t = 3) and CO4-CC (t = 4); each plots the reference batches and the new batch (NB) on PC1 versus PC2]
CO5-CC
[Figure: CO5-CC (t = 5), batches plotted on PC1 versus PC2]

Conclusion
Note that:
• First two time instants followed the average trajectory:
– no signal was expected
• Last three time instants present significant departures from average trajectories, particularly t = 5
Closing Remarks
Future Work
• Test different strategies for completing missing data
from future batches
• Play with D matrix such that time periods between
Tmin and Tmax have smaller importance weights in the
CO matrix
• Test other non-parametric charts, such as those based
on data depth
• Simulate different types of bad batches, to test chart
efficiency