
Fluid Phase Equilibria, 12 (1983) 217-234
Elsevier Science Publishers B.V., Amsterdam. Printed in The Netherlands

REGRESSION OF VAPOR-LIQUID EQUILIBRIUM DATA BASED ON APPLICATION OF THE MAXIMUM-LIKELIHOOD PRINCIPLE

R.G. RUBIO, J.A.R. RENUNCIO and M. DIAZ PEÑA

Departamento de Quimica Fisica, Universidad Complutense, Madrid - 3 (Spain)

(Received August 2, 1982; accepted in final form March 21, 1983)

ABSTRACT

Rubio, R.G., Renuncio, J.A.R. and Diaz Peña, M., 1983. Regression of vapor-liquid equilibrium data based on application of the maximum-likelihood principle. Fluid Phase Equilibria, 12: 217-234.

A method first proposed by Anderson et al. (1978) has been applied to evaluate the excess Gibbs energy from vapor-liquid equilibrium (VLE) data. This method, which is based on the maximum-likelihood principle, is shown to be more accurate and to provide more information than classical methods based on the least-squares principle. The influences of experimental errors, the number of data points, and the vapor pressures of the pure components on the fitting are studied using reliable experimental data obtained for several systems. The selection of the set of variables to be considered ((p, T, x, y), (p, T, x) or (p, T, y)) and the criteria for choosing the equation which best fits the excess Gibbs energy are also discussed.

INTRODUCTION

Most thermodynamic data are fitted to a certain equation in order to obtain a better description of the system studied. Whether a predictive model or an empirical equation is used, the goal of any data regression is to determine a unique set of parameters which adequately describes the original data. Several regression methods have been used in the last few years. A method based on the maximum-likelihood principle has been shown to present advantages over the classical least-squares method in the reduction of vapor-liquid equilibrium (VLE) data (Sutton and McGregor, 1977; Anderson et al., 1978; Fabries and Renon, 1975; Kemeny and Manczinger, 1978; Peneloux et al., 1975, 1976; Kemeny et al., 1982). As has been pointed out by Anderson et al. (1978), the main feature of this new method is that parameter determination is done so as to make the experimental observations the most likely when taken as a whole, instead of considering that the independent variables are not affected by error.

This goal is attained mathematically by finding the maximum value of the likelihood function $\mathcal{L}(x^m, x, \theta)$, where $x^m$ is the vector of experimental variables, $x$ the vector of the true values of the variables, and $\theta$ the vector of the parameters. Several assumptions are usually introduced in order to maximize this function (Renon, 1978): the error in the experimental variables is assumed to be small, independent for the different variables, and randomly distributed with a null average value; the variances of the experimental variables are assumed to be known, and the errors in different experiments are considered to be independent and with the same distribution for a given variable. Finally, it is also assumed that the use of the model does not introduce any significant systematic error, and that this error is in any case smaller than the accidental errors. A rigorous discussion of this method has been given by Bard (1974).

The maximization of the likelihood function $\mathcal{L}(x^m, x, \theta)$ has been carried out by various methods (Anderson and Prausnitz, 1978; Fabries and Renon, 1975; Peneloux et al., 1975; Schubert, 1974) and using different properties (excess Gibbs energy, activity coefficients, etc.) as constraint functions; Kemeny et al. (1982) have developed a general method from which the other methods may be derived. Application of this method to the UNIQUAC model has been given recently by Prausnitz et al. (1980). In this paper we discuss the application of Anderson's method to the regression of VLE data using the equation

$$\frac{G^E}{RT\,x(1-x)} = \frac{\sum_{i=0}^{n} A_i (2x-1)^i}{1 + \sum_{j=1}^{m} B_j (2x-1)^j} \qquad (1)$$

x being the mole fraction of component 1, $A_i$ and $B_j$ adjustable parameters, and T the temperature. Equation (1) is a Padé approximant (n/m) (Baker, 1975), first proposed by Marsh (1977). As has been pointed out, it is possible to derive from eqn. (1) several equations which are used frequently in VLE data regression (Marsh, 1977; Mansoori, 1980). The selection of the most adequate approximant, and the uncertainties in the parameters $A_i$ and $B_j$ and in any other thermodynamic magnitudes calculated from them, are discussed. If an inadequate approximant is chosen, values of $G^E$ may be subject to larger variations than those indicated by the errors calculated from the experimental uncertainties. It is easy to find studies which use many adjustable parameters to describe a set of VLE data. Prausnitz (1969) has pointed out that it is not possible to justify more than three or four parameters with the usual number of experimental data points; in this paper we study whether parameters are statistically significant for a data set.
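As an illustration of eqn. (1), the following minimal Python sketch evaluates $G^E/RT$ for an (n/m) Padé approximant. The function name `pade_excess_gibbs` and the coefficient values in the usage lines are hypothetical and chosen only for demonstration; they are not taken from the paper.

```python
def pade_excess_gibbs(x, a, b=()):
    """Return G^E/RT for a binary mixture from the (n/m) Pade form of eqn. (1).

    x : mole fraction of component 1
    a : numerator coefficients A_0 .. A_n
    b : denominator coefficients B_1 .. B_m (empty for an (n/0) approximant)
    """
    c = 2.0 * x - 1.0
    numerator = sum(a_i * c**i for i, a_i in enumerate(a))
    denominator = 1.0 + sum(b_j * c**j for j, b_j in enumerate(b, start=1))
    return x * (1.0 - x) * numerator / denominator


# Illustrative coefficients only (hypothetical, not from the paper).
print(pade_excess_gibbs(0.5, [0.4]))              # equimolar mixture: 0.25 * A_0
print(pade_excess_gibbs(0.3, [0.4, 0.1], [0.2]))  # a (1/1) approximant
```

With an empty denominator list the expression reduces to a polynomial in $(2x - 1)$, which is the Redlich-Kister form mentioned later in the text.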

SCOPE OF THE REGRESSION METHOD

The maximization of the likelihood function is equivalent to finding the minimum value of the function S, which can be expressed as

$$S = \sum_{i=1}^{N} (x_i^m - x_i)^T \rho_i^{-1} (x_i^m - x_i) \qquad (2)$$

where N is the number of experimental data points, $x_i^m$ is a vector which contains the ith measured value of the M variables, $x_i$ is the corresponding vector of true values, and $\rho_i$ is the variance-covariance matrix of the variables. The diagonal elements of the matrix $\rho_i$ are the estimated variances of the experimental variables, and the nondiagonal elements are their covariances. Anderson et al. (1978) considered four independent variables (pressure p, temperature T, liquid mole fraction x, and vapor mole fraction y). The variance matrix then becomes a diagonal matrix, which is equivalent to ignoring the error-propagation rule. If two variables are correlated (for instance, if x and y are obtained from a calibration curve), this hypothesis may be inappropriate (Peneloux et al., 1975). The adjustable parameters depend on the relative weights $w_i$ of the variables, and not on their estimated absolute variances $\sigma_i^2$; nevertheless, if a complete statistical analysis is to be attempted, it is necessary to know the variance per unit of weight, $\sigma^2$, defined by

$$w_i = \sigma_i^2/\sigma^2 \qquad (3)$$
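When the variance-covariance matrix of the variables is taken as diagonal, as in the treatment of Anderson et al. described above, the objective function of eqn. (2) reduces to a sum of squared residuals, each divided by the estimated variance of its variable. The sketch below shows this reduced form only. The function name and the two data points are hypothetical, and the "true" values are simply supplied here, whereas in the actual method they are estimated together with the parameters.

```python
def weighted_sum_of_squares(measured, true_values, sigmas):
    """S for a diagonal covariance matrix: sum over points and variables of
    ((measured - true) / sigma)**2."""
    s = 0.0
    for x_m, x_t in zip(measured, true_values):
        for value_m, value_t, sigma in zip(x_m, x_t, sigmas):
            s += ((value_m - value_t) / sigma) ** 2
    return s


# Two hypothetical (p / Pa, T / K, x) points; the numbers are illustrative only.
measured = [(20510.0, 303.15, 0.2501), (18020.0, 303.16, 0.5000)]
true_values = [(20507.0, 303.15, 0.2500), (18020.0, 303.15, 0.5001)]
sigmas = (3.0, 0.01, 1e-4)  # assumed sigma_0(p), sigma_0(T), sigma_0(x)

print(weighted_sum_of_squares(measured, true_values, sigmas))
```

Each residual in this example equals one estimated standard deviation or zero, so the four nonzero terms contribute about one unit each to S.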

$\sigma^2$ can be estimated using the relation $S/\sigma^2$, since this variable has a $\chi^2$-type distribution with N - L degrees of freedom (Bevington, 1969, pp. 18-20), L being the number of adjustable parameters. Application of the unconditional consistency concept related to the sample variance $z^2$ (Peneloux et al., 1975, 1976), i.e.,

$$z^2 = S/(N - L) \qquad (4)$$

allows an assessment of whether the systematic errors are really smaller than the accidental errors in the model, and of whether or not the estimated variances of the variables are compatible with the experimental data. If $z_1^2$ and $z_2^2$ are the sample variances of two different Padé approximants, the ratio

$$F = z_1^2/z_2^2, \qquad z_1^2 > z_2^2 \qquad (5)$$

follows a Fisher distribution (Brownlee, 1965) and allows an examination of the degree of correlation between the two approximants within a certain level of confidence. The magnitude $z^2$ also makes possible the calculation of the variance-covariance matrix $\Sigma$ of the parameters. The diagonal elements of this matrix provide the variances of the parameters and consequently the value of the error $\varepsilon_j$ associated with each parameter $\theta_j$ ($A_i$ and $B_j$ in eqn. (1)). The ratio

$$t = \theta_j/\varepsilon_j \qquad (6)$$

follows a Student's distribution (Anderson et al., 1978) and indicates the statistical significance of the parameter $\theta_j$ within a given confidence level. The nondiagonal elements of $\Sigma$ are related to the correlation coefficients between two given parameters. When one of these terms is close to unity, the two corresponding parameters may be substituted by a linear combination of the form
$$\chi_j = \sum_{i=1}^{L} T_i \theta_i \qquad (7)$$

where $T_i$ is the ith component of the eigenvector of $\Sigma$ associated with the eigenvalue $\lambda_j$. The resulting linear combinations $\chi_j$ are statistically independent, and their variances are given by $\lambda_j$ (Fabries and Renon, 1975). It is important to notice that $\Sigma$ is a nonnegative definite matrix (Bryson and Ho, 1981); therefore, its eigenvalues must be positive or null. This has to be taken into account in selecting an adequate model.

The calculation of the true values allows the distribution of residuals to be established over the entire concentration range and, therefore, an assessment of the thermodynamic consistency of the model, according to the Van Ness criterion (Van Ness et al., 1973). The variance-covariance matrix $\Sigma$ also makes it possible to estimate the error of any magnitude H calculated from the parameters $\theta$ (Anderson et al., 1978; Goral, 1977). The estimated error $\sigma(H)$ is given by

$$\sigma^2(H) = h_\theta^T\, \Sigma\, h_\theta \qquad (8)$$

where $h_\theta$ is a vector whose components are the partial derivatives of the function $H(\theta)$ with respect to each parameter $\theta_i$. For the function given by eqn. (1), $H = G^E/RTx(1 - x)$, the vector $h_\theta$ is

$$h_\theta = \frac{1}{B}\left[1,\ C,\ \ldots,\ C^n,\ -(A/B)C,\ \ldots,\ -(A/B)C^m\right] \qquad (9)$$

where

$$A = \sum_{i=0}^{n} A_i C^i, \qquad B = 1 + \sum_{j=1}^{m} B_j C^j, \qquad C = 2x - 1 \qquad (10)$$
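The error propagation of eqns. (8)-(10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the covariance matrix in the usage example is an invented diagonal matrix (the paper does not list the full matrix), and the parameter order is $A_0, \ldots, A_n, B_1, \ldots, B_m$.

```python
import math


def pade_parts(x, a, b):
    """Return C, A and B of eqn. (10) at mole fraction x."""
    c = 2.0 * x - 1.0
    num = sum(a_i * c**i for i, a_i in enumerate(a))
    den = 1.0 + sum(b_j * c**j for j, b_j in enumerate(b, start=1))
    return c, num, den


def h_vector(x, a, b):
    """Partial derivatives of H = A/B with respect to A_0..A_n, B_1..B_m (eqn. (9))."""
    c, num, den = pade_parts(x, a, b)
    d_a = [c**i / den for i in range(len(a))]
    d_b = [-(num / den) * c**j / den for j in range(1, len(b) + 1)]
    return d_a + d_b


def sigma_h(x, a, b, cov):
    """Estimated error of H: square root of the quadratic form h^T Sigma h (eqn. (8))."""
    h = h_vector(x, a, b)
    n = len(h)
    variance = sum(h[i] * cov[i][j] * h[j] for i in range(n) for j in range(n))
    return math.sqrt(variance)


# Hypothetical (1/1) approximant with an assumed diagonal covariance matrix.
a, b = [0.4, 0.1], [0.2]
cov = [[1e-6, 0.0, 0.0], [0.0, 4e-6, 0.0], [0.0, 0.0, 9e-6]]
print(sigma_h(0.5, a, b, cov))  # at x = 0.5 only the A_0 term contributes
```

At the equimolar composition C vanishes, so every derivative except the one with respect to $A_0$ is zero and $\sigma(H)$ equals the error of $A_0$. Away from that composition the off-diagonal elements of $\Sigma$, which are set to zero in this example, would also contribute.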

The values of the parameters define the center of a hyperellipsoid in L-dimensional space, whose axes can be calculated from the parameter variances for a certain level of confidence (Bryson and Ho, 1981). Any point of this hyperellipsoid represents a set of parameters able to reproduce the data within their experimental uncertainties for a given confidence level. The size and shape of this hyperellipsoid depend on the data, their uncertainties, etc. The size may be used as a criterion to discriminate between two fittings with the same number of parameters.

The criteria for selecting the best Padé approximant may be summarized as follows.

(1) The deviations of the variables (residuals) must present a random distribution with a null average value. The degree of randomness of the distribution of residuals may be estimated using the test of Abbe (Linnik, 1961), as described by Kemeny et al. (1982). The values for the variances of the variables should be similar to those estimated from the experimental uncertainties or repeated tests. The systematic errors introduced by the model must be smaller than random errors (unconditional consistency criterion of Peneloux).

(2) The eigenvalues of the variance-covariance matrix $\Sigma$ should be positive or null.

(3) When two approximants have a similar statistical significance, the one using fewer parameters is preferred. The statistical significance of the parameters should be as large as possible, and uncertainties in the thermodynamic magnitudes related to $\theta$ must be as small as possible.

(4) The values of the parameters of the Padé approximant selected should lead to accurate values of related thermodynamic properties (for instance, the activity coefficients at infinite dilution).
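The statistics used in these criteria, namely the sample variance of eqn. (4), the variance ratio of eqn. (5) and the parameter-to-error ratio of eqn. (6), are simple to compute once S and the parameter errors are known. The sketch below uses hypothetical function names and made-up input values. Comparing the ratios with tabulated Fisher or Student critical values at the chosen confidence level is left out.

```python
def sample_variance(s, n_points, n_params):
    """z^2 = S / (N - L), eqn. (4)."""
    return s / (n_points - n_params)


def f_ratio(z2_a, z2_b):
    """Ratio of two sample variances with the larger in the numerator, eqn. (5)."""
    return max(z2_a, z2_b) / min(z2_a, z2_b)


def t_ratios(params, errors):
    """t = theta_j / epsilon_j for every parameter, eqn. (6)."""
    return [p / e for p, e in zip(params, errors)]


# Hypothetical values, for illustration only.
z2_two_params = sample_variance(s=30.0, n_points=20, n_params=2)
z2_three_params = sample_variance(s=17.0, n_points=20, n_params=3)
print(f_ratio(z2_two_params, z2_three_params))
print(t_ratios([0.5, -0.02], [0.25, 0.04]))
```

A ratio from `t_ratios` with a small absolute value suggests that the corresponding parameter may not differ significantly from zero, which is how criterion (3) is applied in the sections that follow.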
COMPARISON WITH LEAST-SQUARES METHOD

In order to compare the results obtained using this method with those corresponding to a classical least-squares method, we have chosen a recent and reliable set of VLE data for the n-hexane (1) + n-undecane (2) system at 303.15 K (Marsh et al., 1980). The authors fitted the data to eqn. (1) using a least-squares method and found the following values for the parameters of a (3/0) Padé approximant:

$$A_0 = -0.0529, \quad A_1 = -0.0036, \quad A_2 = -0.0017, \quad A_3 = 0.0058, \quad \sigma(p) = 2.4\ \mathrm{Pa} \qquad (11)$$

The same set of data was fitted using the method based on the maximum-likelihood principle. The estimated variances used in this calculation were the experimental errors reported by the authors ($\sigma_0(x) = 1 \times 10^{-4}$, $\sigma_0(p) = 3$ Pa, $\sigma_0(T) = 0.01$ K). The parameters and variances of the variables for the (3/0) approximant are summarized in Table 1.

TABLE 1
Parameters and estimated variances for several Padé approximants for the n-hexane (1) + n-undecane (2) system

Parameters and variances   Padé approximant
                           (0/2)              (1/0)              (2/0)              (3/0)
A_0                        -0.0265 ± 0.0003   -0.0272 ± 0.0006   -0.0265 ± 0.0003   -0.0268 ± 0.0003
A_1                                            0.0022 ± 0.0007    0.0005 ± 0.0004    0.0005 ± 0.0004
A_2                                                               0.0035 ± 0.0005    0.0030 ± 0.0006
A_3                                                                                  0.0012 ± 0.0009
B_1                         0.0225 ± 0.0175
B_2                         0.1239 ± 0.0219
σ(p) (Pa)                   0.7                1.3                0.6                0.5
σ(x) (× 10^5)               5                  10                 4                  4
σ(T) (K) (× 10^4)           < 1                < 1                < 1                < 1

Comparison of the results indicates that application of the likelihood principle improves the fit of the data. It also allows a solution of the important problem of selecting the most adequate Padé approximant. The values of the nondiagonal terms of the correlation matrix, calculated from $\Sigma$ (Fabries and Renon, 1975), show that two of the $A_i$ parameters are strongly correlated and that, therefore, the statistical significance of those parameters is weak. There are several other approximants (0/2, 2/0, 0/3, 1/2, 2/1) of similar significance. Even the (1/0) approximant is able to reproduce the data with deviations which are lower than those resulting from application of the least-squares method (see Table 1). Therefore there is no reason based on the sample-variance criterion to exclude this fitting (only two adjustable parameters).

Making use of the criterion given by eqn. (5) it is possible to conclude that the (1/0) and (2/0) approximants are not equivalent within a 98% level of confidence. Consequently, the introduction of a new parameter in the (1/0) approximant to obtain the (2/0) approximant is statistically acceptable. On the other hand, the (0/2), (2/0), (0/3) and (3/0) approximants are equivalent to each other, and there is no reason to introduce a new parameter in the (2/0) approximant. Since the (3/0) approximant shows two correlated parameters, this approximant may be deleted. The (0/3) approximant is not acceptable, because one of the $\Sigma$ eigenvalues is negative. Therefore, the (0/2) and (2/0) approximants seem to be the most adequate to describe the behavior of this system. The values of the parameters and variances are given in Table 1. Calculation of the ratio defined by eqn. (6) indicates that parameter $B_1$ of the (0/2) approximant and parameter $A_1$ of the (2/0) approximant are equal to zero within a 98% confidence level.

The activity coefficients at infinite dilution for n-hexane, $\gamma_1^\infty$, have been calculated from the (2/0) and (0/2) approximants, giving the values 0.977 and 0.976, respectively. Analogous calculations from the (3/0) approximant reported by Marsh et al. (1980) lead to the value 0.9448 (the values of $\sigma(\gamma_1^\infty)$ calculated from eqn. (8) are < 0.01 for any of these approximants). An estimate of the same coefficient from experimental GLC data (Letcher and Jerman, 1976) results in a value of 0.96 for $\gamma_1^\infty$. As has been reported by Lichtenthaler et al. (1974), activity-coefficient values obtained by GLC are always smaller than those obtained by extrapolation of VLE data. Results obtained from the maximum-likelihood method agree with this conclusion, within the calculated error, while those obtained by the least-squares method disagree.

Table 2 shows the $\Sigma$ eigenvalues for the approximants in Table 1. The dimensions of the confidence hyperellipsoid are closely related to the eigenvalues given in Table 2 (Bryson and Ho, 1981). It may be observed that the (2/0) approximant has smaller values than the (0/2) approximant. Therefore the confidence hyperellipsoid corresponding to the (2/0) approximant must be of smaller size, indicating that this approximant is preferable.

TABLE 2
$\Sigma$ eigenvalues for the various Padé approximants shown in Table 1 for the n-hexane + n-undecane system

Eigenvalue   Padé approximant
             (0/2)           (1/0)           (2/0)           (3/0)
1            6.26 × 10^-8    1.58 × 10^-?    3.12 × 10^-8    2.74 × 10^-8
2            2.35 × 10^-4    6.55 × 10^-?    1.09 × 10^-?    9.02 × 10^-8
3            5.50 × 10^-4                    3.94 × 10^-?    3.04 × 10^-?
4                                                            1.01 × 10^-6

It is easy to find in the literature that when fitting thermodynamic data to a Redlich-Kister or a Padé equation, the selection procedure for finding the correct model is based on convergence of the variance of the variables (see Nunes da Ponte et al., 1978).
Although this is a correct procedure when testing the accuracy of measurements (Peneloux et al., 1976) and leads to the same conclusion when applied to the n-hexane + n-undecane system, this selection procedure may lead to the deletion of approximants with high statistical significance. On the other hand, it is possible to find data sets (Diaz Peña and Sotomayor, 1971) in which there is convergence of the variances but not of the $G^E$ values at a given composition. The selection of the most adequate approximant must include a complete statistical analysis in these cases.

TABLE 3
Parameters and uncertainties for the hexane + undecane system (different values assumed for the experimental variances)

σ_0(x) (× 10^4)        1                   1
σ_0(p) (Pa)            5                   7                   9
A_0                    -0.0264 ± 0.0007    -0.0262 ± 0.0015    -0.0260 ± 0.0024
A_1                     0.0001 ± 0.0001     0.0005 ± 0.0021     0.0008 ± 0.0033
A_2                     0.0037 ± 0.0013     0.0037 ± 0.0027     0.0034 ± 0.0043
ΔG^E (J mol^-1)         0.65                1.3                 2.1
σ(p) (Pa)               1.4                 3.0                 4.6
σ(x) (× 10^4)           0.7                 1.2                 3.5

In order to point out the influence of experimental errors, and the importance of calculating variances of thermodynamic properties using eqn. (8), we have generated several data sets by addition of random errors with different variances to the original vapor-pressure data of Marsh et al. (1980). Table 3 gives the values of the parameters, the estimated error in $G^E$, and the deviations $\sigma(p)$ and $\sigma(x)$ when $\sigma_0(x)$ and $\sigma_0(p)$ are assumed to take values greater than those given by Marsh et al. (1980). A (2/0) approximant has been used in this calculation. It may be observed that the parameter changes are limited to their uncertainty intervals. When $\sigma_0(p)$ and $\sigma_0(x)$ take the highest values, the error in $G^E$ amounts to 18% of the maximum value of $G^E$ ($G^E$(max) = 12 J mol$^{-1}$). Marsh et al. (1980) reported a value of 1 J mol$^{-1}$ for the experimental uncertainty in $G^E$. Table 3 shows that $\Delta G^E$ calculated using eqn. (8) is smaller than the experimental uncertainty only when $\sigma(p)$ and $\sigma(x)$ are smaller than the experimental errors in p and x, respectively. Similar conclusions have been obtained for other systems. Therefore it seems to be necessary to use eqn. (8) when estimating errors in thermodynamic properties.
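The infinite-dilution activity coefficients quoted earlier in this section can be checked directly from the tabulated parameters. For component 1, eqn. (1) gives $\ln\gamma_1^\infty = A/B$ evaluated at $x = 0$, that is at $2x - 1 = -1$. The sketch below applies this to the (2/0) and (0/2) parameters of Table 1 and to the least-squares parameters of eqn. (11); the function name is hypothetical.

```python
import math


def ln_gamma1_infinite_dilution(a, b=()):
    """ln(gamma_1) as x -> 0: A/B of eqn. (1) evaluated at 2x - 1 = -1."""
    num = sum(a_i * (-1.0) ** i for i, a_i in enumerate(a))
    den = 1.0 + sum(b_j * (-1.0) ** j for j, b_j in enumerate(b, start=1))
    return num / den


# Parameters from Table 1 (maximum likelihood) and eqn. (11) (least squares).
fits = {
    "(2/0), maximum likelihood": ([-0.0265, 0.0005, 0.0035], []),
    "(0/2), maximum likelihood": ([-0.0265], [0.0225, 0.1239]),
    "(3/0), least squares": ([-0.0529, -0.0036, -0.0017, 0.0058], []),
}
gammas = {
    name: math.exp(ln_gamma1_infinite_dilution(a, b))
    for name, (a, b) in fits.items()
}
for name, gamma in gammas.items():
    print(f"{name}: {gamma:.4f}")
```

Rounded to the precision used in the text, the three results agree with the quoted values 0.977, 0.976 and 0.9448.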
INFLUENCE OF THE RELATIVE WEIGHTS OF THE VARIABLES

The system benzene (1) + n-hexadecane (2) at 323.15 K (Diaz Peña et al., 1982) has been chosen to study the influence of the relative weights of the variables on the fitting obtained using maximum-likelihood analysis. The experimental errors for this set of data are $\sigma_0(T) = 0.01$ K, $\sigma_0(p) = 20$ Pa and $\sigma_0(x) = 3 \times 10^{-4}$.

When the error in the temperature is allowed to increase from 0.01 to 0.5 K, there is no significant change either in the values of the parameters or in the variance-covariance matrix $\Sigma$ for any of the approximants used. The same is true for all of the systems studied in this work. The values obtained for $\sigma(T)$ are smaller than [...] K. This is different from that reported for the Wilson, [...] and UNIQUAC equations. The reason for this may be that the error-propagation rule is not taken into account in Anderson's method and that the expression for the activity coefficients derived from eqn. (1) does not have an explicit temperature dependence. These coefficients are given by

$$\ln\gamma_1 = \left[\frac{1-C}{2B}\right]^2 \left[AB + (1+C)(A'B - AB')\right]$$

$$\ln\gamma_2 = \left[\frac{1+C}{2B}\right]^2 \left[AB + (1-C)(AB' - A'B)\right] \qquad (12)$$

where A and B are given by eqn. (10) and A' and B' are

$$A' = \sum_{i=1}^{n} i A_i C^{i-1}, \qquad B' = \sum_{j=1}^{m} j B_j C^{j-1} \qquad (13)$$

The temperature dependence appears only in the expressions for the molar volumes, the virial coefficients and the correction for nonideality of the vapor phase. Table 4 gives the results obtained using the (1/1) approximant when the error in the pressure is assumed to vary while the variances of temperature and mole fraction remain constant ($\sigma_0(T) = 0.01$ K, $\sigma_0(x) = 3 \times 10^{-4}$).
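The activity-coefficient expressions that follow from eqn. (1) can be written compactly in code. The sketch below assumes, as in the text, that x is the mole fraction of component 1 and that the primes denote derivatives with respect to $C = 2x - 1$. The function name and the coefficients in the usage lines are hypothetical. The final lines check the identity $x \ln\gamma_1 + (1 - x)\ln\gamma_2 = G^E/RT$, which any consistent pair of expressions must satisfy.

```python
def activity_coefficients(x, a, b=()):
    """Return (ln gamma_1, ln gamma_2) for the Pade form of eqn. (1)."""
    c = 2.0 * x - 1.0
    A = sum(a_i * c**i for i, a_i in enumerate(a))
    B = 1.0 + sum(b_j * c**j for j, b_j in enumerate(b, start=1))
    dA = sum(i * a_i * c ** (i - 1) for i, a_i in enumerate(a) if i > 0)
    dB = sum(j * b_j * c ** (j - 1) for j, b_j in enumerate(b, start=1))
    w = dA * B - A * dB  # A'B - AB'
    ln_g1 = ((1.0 - c) / (2.0 * B)) ** 2 * (A * B + (1.0 + c) * w)
    ln_g2 = ((1.0 + c) / (2.0 * B)) ** 2 * (A * B - (1.0 - c) * w)
    return ln_g1, ln_g2


# Hypothetical (1/1) coefficients, for illustration only.
x = 0.3
ln_g1, ln_g2 = activity_coefficients(x, [0.4, 0.1], [0.2])
c = 2.0 * x - 1.0
g_excess = x * (1.0 - x) * (0.4 + 0.1 * c) / (1.0 + 0.2 * c)
print(x * ln_g1 + (1.0 - x) * ln_g2, g_excess)  # the two numbers should coincide
```

None of these expressions contains the temperature explicitly, which is the point made in the paragraph above.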
TABLE 4
Root-square variances of p and x when experimental variance of p is assumed to change

σ_0(p) (Pa)    σ(x) (× 10^4)    σ(p) (Pa)    S/(N - L)
50             0.4              30           0.5
20             2.0              25           1.3
13             3.5              18           4.6
1              8.5              0.5          0.2

TABLE 5
Root-square variances of p and x when experimental variance of x is assumed to change

σ_0(x) (× 10^4)    σ(x) (× 10^4)    σ(p) (Pa)    S/(N - L)
3                  2.0              25           1.3
1                  0.4              30           3.0
0.1                0.0              31           3.0

Similar results have been obtained for other approximants with up to five parameters. The values of $\sigma(x)$ and $\sigma(p)$ seem to be very sensitive to the value assumed for $\sigma_0(p)$, except for the cases when $\sigma_0(p)$ is assumed to be 20 and 13 Pa, which are almost equivalent. Table 4 also gives the values of the function $z^2$ (eqn. (4)), which are indicative of the magnitude of systematic errors introduced by the model. These values depend on how large the experimental uncertainties are.

Finally, the error in the mole fraction has been allowed to vary from $3 \times 10^{-4}$ to $1 \times 10^{-5}$ while taking the values reported by the authors for $\sigma_0(p)$ and $\sigma_0(T)$. The results are given in Table 5. The second fitting of Table 5 seems to be equivalent to the first fitting of Table 4. This may be caused by the relative weights of the variables p and x being the same in both cases. Nevertheless, the values of $z^2$ (eqn. (4)) are different, thus indicating that these two fittings are not completely equivalent following the unconditional consistency criterion (Peneloux et al., 1975). It may be concluded that if inadequate values are adopted for the estimated variances, important changes will be obtained in the variance $z^2$; changes also occur in the $\Sigma$ matrix and then in the confidence hyperellipsoid and in the calculated errors of thermodynamic properties related to the parameters.
INFLUENCE OF NUMBER OF DATA POINTS

VLE data for the methylethylketone (1) + benzene (2) system at 323.15 K (Diaz Peña et al., 1978b) have been used to study the influence of the number of data points. Table 6 gives the values of the parameters for the (3/0) approximant, which was found to be the most appropriate for this set of data. There are 37 data points available, including the vapor pressures of both pure components. Fits were carried out for sets of 37, 28 and 13 data points. The selection of the data points was made uniformly along the concentration range while trying to keep constant the final variances of the variables. Two different sets of 13 data points were studied and found to be completely equivalent. Table 6 gives the values of the parameters and the variances of p, x, y and $G^E$ for the most representative fits. A (3/0) approximant seems to be the most adequate for all of them. The fits corresponding to 37 and 28 data points do not differ appreciably. When the number of data points is reduced from 28 to 13, the error in the parameters increases. The correlation matrix of the parameters also experiences changes, showing a high degree of correlation, especially between two of the $A_i$ parameters. Both changes may be explained in terms of corresponding variations in the variance-covariance matrix $\Sigma$. When the number of data points is high, the uncertainty $\Delta G^E$ in $G^E$ varies with mole fraction, because $\Delta x$ and $\Delta p$ adopt different values depending on the mole-fraction region considered. When the number of data points is smaller, these differences disappear. This effect is illustrated in Fig. 1, where values of $\Delta G^E$ have been plotted versus composition. It can be seen that these values increase when the number of data points is reduced.

TABLE 6
Parameters and root-square variances for the (3/0) approximant when number of data points is varied

Parameters and variances   Number of data points
                           37                  28                  13
A_0                        0.1798 ± 0.0004     0.1798 ± 0.0004     0.1803 ± 0.0007
A_1                        0.0387 ± 0.0016     0.0386 ± 0.0015     0.0372 ± 0.0030
A_2                        0.0181 ± 0.0020     0.0173 ± 0.0019     0.0155 ± 0.0033
A_3                        0.0428 ± 0.0041     0.0448 ± 0.0038     0.0507 ± 0.0082
σ(x) (× 10^4)              0.0                 0.0                 0.0
σ(p) (Pa)                  12                  11                  10
σ(y) (× 10^4)              27                  22                  27
σ(G^E) (J mol^-1)          0.9                 0.8                 0.9

Fig. 1. Estimated error in $G^E$ versus mole fraction x when number of data points is varied (curves for 13 points and 37 points).

INFLUENCE OF VAPOR PRESSURES OF PURE COMPONENTS

The vapor pressures of the pure components appear explicitly in the expression for the total pressure as reference values, and they have been considered as constants by several authors. Since their values are usually determined experimentally, these pressures may also be treated as two or more data points. When they are considered only as physical constants, a systematic error which may be important is introduced (Fabries and Renon, 1975; Abbott and Van Ness, 1977). This is the case when these pressures have not been measured together with the other data points. In this case, it is possible to reduce the systematic errors by calculating their values using the spline-fit technique (Klauss and Van Ness, 1967), or by calculating them as adjustable parameters (Fabries and Renon, 1975). Several problems have been pointed out when vapor pressures are adjusted along the data regression (Abbott and Van Ness, 1977).

The method studied in this paper is able to treat the vapor pressures of pure components as experimental points. Although the pure-component vapor pressures do not depend on the model parameters, when the $x_i = 0$ points are introduced after the first iteration process, the calculated true values of x ($x_i = 0$) are different from zero; therefore the calculated vapor pressures depend on the model parameters. If $\Delta x(x_i = 0) \lesssim \sigma_0(x)$, the true $p(x_i = 0)$ values can be considered as the true values for the pure components consistent with the whole data set. When the experimental values of the vapor pressures of the pure components disagree with the vapor pressures of the mixture, it can be expected that the true values for the pure components should be calculated to agree with the values for the mixture.

In order to study the capability of the method in this respect, we have taken the hypothetical set of data proposed by Abbott and Van Ness (1977). Assume a system described by

$$G^E/RT = 0.4000\,x(1 - x) \qquad (14)$$

Abbott and Van Ness added random errors with a variance of 35 Pa and a maximum error of 106 Pa. Estimated values for the variances of the variables are $\sigma_0(p) = 35$ Pa, $\sigma_0(x) = 10^{-4}$ and $\sigma_0(T) = 0.01$ K. Table 7 shows the results obtained: case I considers the vapor pressures of the pure components as constant; case II introduces them as experimental points, and they are assumed to be equal to the true values in each iteration, as discussed above; case III treats the saturation vapor pressures as adjustable parameters.

TABLE 7
Influence of pure-component vapor pressures on regression method (a)

Parameters and variances   Case I             Case II            Case III
A_0                        0.399 ± 0.0009     0.399 ± 0.0008     0.3999 ± 0.0016
p_1^0 (kPa)                80.679             80.679             80.634 ± 0.025
p_2^0 (kPa)                52.916             52.916             52.957 ± 0.026
σ(p) (Pa)                  53                 51                 43
max |Δp| (Pa)              104                99                 81
σ(x) (× 10^4)              0.1                0.1                0.1
Δp_1^0 (Pa)                                   -0.5               40
Δp_2^0 (Pa)                                   -0.8               -43
S/(N - L)                  2.3                2.1                1.7

(a) Refer to text.

Table 7 shows that cases I and III reproduce the Abbott and Van Ness results within the uncertainty interval. The best results are obtained when the pure-component vapor pressures are treated as adjustable parameters (case III). Although calculated true values are used in case II, the regression method is not able to modify the total vapor pressures sufficiently when the mole fraction adopts the values zero or unity, indicating that the vapor pressures play a different role in the regression than do the other data points. It is therefore important to use pure-component vapor pressures compatible with the whole data set, or to treat them as adjustable parameters. Figure 2 is a plot of the residuals $\Delta p$ and $\Delta x$ versus composition. The residuals in case III are distributed more randomly than those in cases I or II. The sample variance given in Table 7 reflects the same conclusion, since the systematic errors in cases I or II are more important than in case III (Peneloux et al., 1975).

Fig. 2. Residuals $\Delta x$ and $\Delta p$ versus composition: ○, $p_i^0$ not included as experimental points; ×, $p_i^0$ identified with true values of p for x = 0 and x = 1; △, $p_i^0$ treated as adjustable parameters.

When an accidental error (up to 100 Pa) is introduced in one of the vapor pressures of the pure components, we have observed that the values of the parameters do not change within their uncertainty intervals. Nevertheless, the error in the parameters, the error in $G^E$ and the variances $\sigma(p)$ and $\sigma(x)$ increase with the accidental error introduced in $p_i^0$. Even in this extreme case the $\Delta x(x_i = 0)$ value is smaller than $3 \times 10^{-4}$, so that the calculated true value of p can be identified with the vapor pressure of pure component 1. In all cases we have studied, $\Delta x(x_i = 0)$ is smaller than or similar to the $\sigma_0(x)$ value.
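A hypothetical data set of the kind described in this section can be generated in a few lines. The sketch below is not the Abbott and Van Ness procedure itself. It assumes an ideal vapor phase (modified Raoult's law), whereas the paper also corrects for vapor-phase nonideality; it reads the quoted 35 Pa as a standard deviation; it does not truncate the errors at 106 Pa; and it uses the pure-component pressures listed for cases I and II of Table 7. All function names are hypothetical.

```python
import math
import random

P1_SAT = 80679.0  # Pa, value listed in Table 7 for component 1 (cases I and II)
P2_SAT = 52916.0  # Pa, value listed in Table 7 for component 2 (cases I and II)


def total_pressure(x, a0=0.4):
    """Total pressure for G^E/RT = a0 * x * (1 - x), assuming an ideal vapor phase."""
    ln_g1 = a0 * (1.0 - x) ** 2
    ln_g2 = a0 * x**2
    return x * math.exp(ln_g1) * P1_SAT + (1.0 - x) * math.exp(ln_g2) * P2_SAT


def synthetic_data(n_points, sigma_p=35.0, seed=0):
    """Evenly spaced (x, p) points with Gaussian noise of standard deviation sigma_p."""
    rng = random.Random(seed)
    xs = [i / (n_points - 1) for i in range(n_points)]
    return [(x, total_pressure(x) + rng.gauss(0.0, sigma_p)) for x in xs]


for x, p in synthetic_data(5):
    print(f"x = {x:.2f}  p = {p:.0f} Pa")
```

The end points x = 0 and x = 1 of such a set play the role of the pure-component vapor pressures, which is where cases I, II and III of Table 7 differ in their treatment.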
INFLUENCE OF SET OF VARIABLES USED

When the analysis of VLE data is discussed, much consideration is given to the problem of selecting the appropriate function or equation to which data are fitted (Van Ness et al., 1973; Abbott and Van Ness, 1975; Aittamaa, 1980; Neau and Peneloux, 1981). Another important topic is which variables should be considered. One of the advantages of the maximum-likelihood method is that the same parameters are obtained using different functions (Kemeny et al., 1982; Neau and Peneloux, 1981).

TABLE 8
Influence of set of variables used in regression method

Parameters and variances   Set of variables
                           p-T-x               p-T-y               p-T-x-y
A_0                        0.1141 ± 0.0011     0.1146 ± 0.0017     0.1147 ± 0.0017
A_1                        0.0225 ± 0.0027     0.0201 ± 0.0040     0.0202 ± 0.0040
A_2                        0.0314 ± 0.0057     0.0314 ± 0.0082     0.0312 ± 0.0082
σ(x) (× 10^5)              3                   8                   0
σ(p) (Pa)                  9                   9                   9
σ(y) (× 10^4)              21                  20                  21

Fig. 3. Estimated error in $G^E$ versus mole fraction x when different sets of variables are used.

We now study the influence of using different sets of variables in this regression method. A set of data presenting small variances for all variables, for the system toluene (1) + methylisobutylketone (2) (Diaz Peña et al., 1978a), has been selected. Values of the experimental uncertainties reported by the authors have been used as estimated variances of the variables. The (2/0) approximant has been found to be the most appropriate to describe the (p, T, x, y), (p, T, x) and (p, T, y) sets of data considered. Table 8 shows the results obtained. The parameters obtained for these three sets agree with each other within their uncertainties (Kemeny et al., 1982; Neau and Peneloux, 1981). The main differences between these three cases are those appearing in the variance-covariance matrix $\Sigma$, and subsequently in the estimated errors in $G^E$. These differences indicate that the (p, T, y) set should not be recommended. The (p, T, x) and (p, T, x, y) sets are compared in Fig. 3, where values of $\Delta G^E$ obtained using the two sets of variables have been plotted versus composition. Important differences also appear in the values adopted by the $z^2$ function (eqn. (4)) and in the size of the hyperellipsoid defined by the uncertainties in the parameters.

These results could seem in opposition to those of Peneloux et al. (1976). The latter found that when y values are added to the data set, the model parameters have smaller variances. The difference is that they usually needed more parameters to reduce the (p, T, x, y) data set than the (p, T, x) data set (5 and 7 parameters in the example that they discussed). On the other hand, the above conclusions are true if the thermodynamic consistency of the data is perfect. We have used a simulated data set with random errors added and we have found results which agree with those of Peneloux et al. (1976). When the data set has a small inconsistency, as in the case of the data discussed here, the use of y data increases the variances of the model parameters. It may be concluded that although the same parameters are obtained, the cases discussed above are not equivalent.
LIST OF SYMBOLS

$A$              numerator of Padé approximant in eqn. (1)
$A'$             derivative of A with respect to composition
$A_i$            adjustable parameters in eqn. (1)
$B$              denominator of Padé approximant in eqn. (1)
$B'$             derivative of B with respect to composition
$B_j$            adjustable parameters in eqn. (1)
$C$              difference in mole fractions of components 1 and 2
$F$              ratio of sample variances
$G^E$            excess Gibbs energy (J mol$^{-1}$)
$H$              any thermodynamic property calculated using the parameters $\theta_i$
$h_\theta$       partial-derivative vector
$L$              number of adjustable parameters in eqn. (1)
$\mathcal{L}$    likelihood function
$m$              degree of polynomial in denominator of eqn. (1)
$M$              number of variables
$n$              degree of polynomial in numerator of eqn. (1)
$N$              number of data points
$p$              pressure (Pa)
$R$              gas constant
$S$              sum of weighted squared residuals
$t$              ratio of values for parameters and errors
$T$              temperature (K)
$T_i$            components of eigenvectors of variance-covariance matrix of parameters
$w_i$            relative weight of variable i
$x$              liquid phase mole fraction of component 1
$x_i$            vector of values of variables in ith experiment
$x_i^m$          vector of experimental values of variables in ith experiment
$y$              vapor phase mole fraction
$z^2$            sample variance

$\rho_i$         variance-covariance matrix of variables
$\gamma_k$       activity coefficient of component k
$\Delta G^E$     estimated error in excess Gibbs energy
$\varepsilon_i$  uncertainty or error in parameter $\theta_i$
$\theta_i$       adjustable parameter
$\lambda_j$      eigenvalue of variance-covariance matrix
$\Sigma$         variance-covariance matrix of parameters
$\sigma^2$       variance per unit of weight
$\sigma^2(i)$    calculated variance of variable i
$\sigma_i^2$     experimental variance of variable i
$\chi_j$         linear combination of parameters $\theta_i$

REFERENCES

Aittamma, J., 1980. Selection of objective functions in correlation of multicomponent vapor-liquid equilibria. J. Chem., 54: 1751-1757.
Abbott, M.M. and Van Ness, H.C., 1975. Vapor-liquid equilibrium. Part III. Data reduction with precise expression for G^E. Am. Inst. Chem. Eng. J., 21: 62-71.
Abbott, M.M. and Van Ness, H.C., 1977. An extension of Barker's method for reduction of VLE data. Fluid Phase Equilibria, 1: 3-11.
Anderson, T.F. and Prausnitz, J.M., 1978. Application of the UNIQUAC equation to calculation of multicomponent phase equilibria. I. Vapour-liquid equilibria. Ind. Eng. Chem., Proc. Des. Dev., 17: 552-561.
Anderson, T.F., Abrams, D.S. and Grens, E.A., 1978. Evaluation of parameters for nonlinear thermodynamic models. Am. Inst. Chem. Eng. J., 24: 20-29.
Baker, J.A., 1975. Essentials of Padé Approximants. Academic Press, New York.
Bard, Y., 1974. Nonlinear Parameter Estimation. Academic Press, New York.
Bevington, P.R., 1969. Data Reduction and Error Analysis for the Physical Sciences. McGraw-Hill, New York.
Brownlee, K.A., 1965. Statistical Theory and Methodology in Science and Engineering. 2nd edn. Wiley, New York.
Bryson, A.E. and Ho, Y.B., 1981. Applied Optimal Control: Optimization, Estimation and Control. 2nd edn. Hemisphere, New York.
Diaz Peña, M. and Sotomayor, C.P., 1971. Termodinamica de mezclas de alcoholes normales. VI. Presiones de vapor del sistema metanol + n-dodecanol. An. Quim., 67: 233-248.
Diaz Peña, M., Crespo Colin, A. and Compostizo, A., 1978a. Isothermal liquid-vapour equilibria. 1. The binary systems formed by toluene + methylethylketone, + methylpropylketone, and + methylisobutylketone. J. Chem. Thermodyn., 10: 337-341.
Diaz Peña, M., Crespo Colin, A. and Compostizo, A., 1978b. Isothermal liquid-vapour equilibria. 2. The binary systems formed by benzene + acetone, + methylethylketone, + methylpropylketone and + methylisobutylketone. J. Chem. Thermodyn., 10: 1101-1106.
Diaz Peña, M., Renuncio, J.A.R. and Rubio, R.G., 1982. Excess Gibbs energy for the benzene + n-hexadecane system at 298.15 and 323.15 K. Thermochim. Acta, 56: 199-208.
Fabries, J.F. and Renon, H., 1975. Method of evaluation and reduction of vapour-liquid equilibrium data of binary mixtures. Am. Inst. Chem. Eng. J., 21: 735-743.
G&al, M., 1977. Error analysis in Barker's method of vapour pressure isotherm data processing. Z. Phys. Chem., 258: 1040-1044.
Kemeny, S. and Manczinger, J., 1978. Treatment of binary vapour-liquid equilibrium data. Chem. Eng. Sci., 33: 71-76.
Kemeny, S., Skjold-Jorgensen, S., Manczinger, J. and Tóth, K., 1982. Reduction of thermodynamic data by means of the multiresponse maximum likelihood principle. Am. Inst. Chem. Eng. J., 28: 20-30.
Klaus, R.L. and Van Ness, H.C., 1967. An extension of the spline fit technique and applications to thermodynamic data. Am. Inst. Chem. Eng. J., 13: 1132-1136.
Letcher, T.M. and Jerman, P.J., 1976. The thermodynamics of mixing for the systems, benzene, cyclohexane, and n-hexane with n-alkanes, at infinite dilution. J. South Afr. Chem. Inst., 29: 55-62.
Lichtenthaler, R.N., Liu, D.D. and Prausnitz, J.M., 1974. Thermodynamics of poly(dimethylsiloxane) solutions: combinatorial entropy and molecular interactions. Ber. Bunsenges. Phys. Chem., 78: 470-477.
Linnik, J.W., 1961. Die Methode der kleinsten Quadrate in moderner Darstellung. VEB Deutscher Verlag der Wissenschaften, Berlin.
Mansoori, G.A., 1980. Molecular basis of activity coefficient (isobaric-isothermal ensemble approach). Fluid Phase Equilibria, 4: 61-69.
Marsh, K.N., 1977. A general method for calculating the excess Gibbs free energy from isothermal vapour-liquid equilibria. J. Chem. Thermodyn., 9: 719-724.
Marsh, K.N., Ott, J.B. and Richards, A.E., 1980. Excess enthalpies, excess volumes, and excess Gibbs free energies for n-hexane + n-undecane at 298.15 and 308.15 K. J. Chem. Thermodyn., 12: 897-902.
Neau, E. and Péneloux, A., 1981. Estimation of model parameters: comparison of methods based on maximum likelihood principle. Fluid Phase Equilibria, 6: 1-19.
Nunes da Ponte, M., Street, W.B. and Staveley, L.A.K., 1978. An experimental study of the equation of state of liquid mixtures of nitrogen and methane, and the effect of pressure on the excess thermodynamic functions of this system. J. Chem. Thermodyn., 10: 151-168.
Péneloux, A., Deyrieux, R. and Neau, E., 1975. Réduction des données sur les équilibres liquide-vapeur binaires isothermes; critères de précision et de cohérence; analyse de l'information. J. Chim. Phys., 72: 1107-1117.
Péneloux, A., Deyrieux, R., Canals, E. and Neau, E., 1976. The maximum likelihood test and the estimation of experimental inaccuracies: application to data reduction for liquid-vapour equilibrium. J. Chim. Phys., 73: 706-716.
Prausnitz, J.M., 1969. Molecular Thermodynamics of Fluid Phase Equilibria. Prentice-Hall, Englewood Cliffs, NJ.
Prausnitz, J.M., Anderson, T.F., Grens, E.A., Eckert, C.A., Hsieh, R. and O'Connell, J.P., 1980. Computer Calculations for Multicomponent Vapour-Liquid and Liquid-Liquid Equilibria. Prentice-Hall, Englewood Cliffs, NJ.
Renon, H., 1978. Qualities of models for evaluation, representation and prediction of fluid phase equilibrium data. Fluid Phase Equilibria, 2: 101-118.
Schubert, E., 1974. Vergleich verschiedener Methoden zur Ermittlung der Parameter in der Wilson- und NRTL-Gleichung. Chem. Ing. Tech., 46: 73-79.
Sutton, T.L. and McGregor, J.F., 1977. The analysis and design of binary vapour-liquid equilibrium experiments. Part I. Parameter estimation and consistency test. Can. J. Chem. Eng., 55: 602-608.
Van Ness, H.C., Byer, S.M. and Gibbs, R.E., 1973. Vapour-liquid equilibrium. Part I. An appraisal of data reduction methods. Am. Inst. Chem. Eng. J., 19: 238-244.
