
BASICS OF STRUCTURAL EQUATION MODELING

To members of my family, who made this project possible,
I dedicate this book:
my parents, George and Helen;
my wife, Barbara;
and our children, Kristie and Dan.


GEOFFREY M. MARUYAMA

SAGE Publications
International Educational and Professional Publisher
Thousand Oaks London New Delhi

Copyright © 1998 by Sage Publications, Inc.

All rights reserved. No part of this book may be reproduced or utilized
in any form or by any means, electronic or mechanical, including
photocopying, recording, or by any information storage and retrieval
system, without permission in writing from the publisher.

For information:
SAGE Publications, Inc.
2455 Teller Road
Thousand Oaks, California 91320
E-mail: order@sagepub.com

SAGE Publications Ltd
1 Oliver's Yard
55 City Road
London EC1Y 1SP

SAGE Publications India Pvt Ltd
B-42, Panchsheel Enclave
Post Box 4109
New Delhi 110 017

Printed in the United States of America

Library of Congress Cataloging-in-Publication Data

Maruyama, Geoffrey M.
Basics of structural equation modeling / by Geoffrey M. Maruyama.
p. cm.
Includes bibliographical references and index.
ISBN 0-8039-7408-6 (cloth). ISBN 0-8039-7409-4 (pbk.)
1. Multivariate analysis. 2. Social sciences -- Statistical methods.
I. Title.
QA278.M374 1997
519.5'35 -- dc21
97-4839

Acquiring Editor: C. Deborah Laughton
Editorial Assistant: Eileen Carr
Production Editor: Diana E. Axelsen
Production Assistant: Denise Santoyo
Typesetter/Designer: Marion Warren
Cover Designer: Candice Harman
Print Buyer: Anna Chin

Contents

Preface
Acknowledgments

PART 1: Background

1. What Does It Mean to Model Hypothesized Causal Processes With Nonexperimental Data?
  Methods for Structural Equation Analyses
  Overview

2. History and Logic of Structural Equation Modeling
  History
    Sewell Wright
    Path Analysis in the Social Sciences
  Unidirectional Flow Models
  Moving Beyond Path Analysis in Structural Equation Modeling Research
  Why Use Structural Equation Modeling Techniques?

PART 2: Basic Approaches to Modeling With Single Observed Measures of Theoretical Variables

3. The Basics: Path Analysis and Partitioning of Variance
  Logic of Correlations and Covariances
  Decomposing Relationships Between Variables Into Causal and Noncausal Components
    Direct Causal Effects
    Indirect Causal Effects
    Noncausal Relationships Due to Shared Antecedents
    Noncausal Unanalyzed Prior Association Relationships
  Approaches for Decomposing Effects
  Determining Degrees of Freedom of Models
  Presenting Partial Regression and Partial Correlation as Path Models
    Partial Regression
    Partial Correlation
  Peer Popularity and Academic Achievement: An Illustration

4. Effects of Collinearity on Regression and Path Analysis
  Regression and Collinearity
  Illustrating Effects of Collinearity
  Confidence Intervals for Correlations
  Ridge or Reduced Variance Regression

5. Effects of Random and Nonrandom Error on Path Models
  Measurement Error
    Background
    Specifying Relationships Between Theoretical Variables and Measures
    Random Measurement Error
    Nonrandom Error
  Method Variance and Multitrait-Multimethod Models
    Method Variance
    Additive Multitrait-Multimethod Models
    Nonadditive Multitrait-Multimethod Models
  Summary

6. Recursive and Longitudinal Models: Where Causality Goes in More Than One Direction and Where Data Are Collected Over Time
  Models With Multidirectional Paths
    Logic of Nonrecursive Models
    Estimation of Nonrecursive Models
  Model Identification
  Longitudinal Models
    Logic Underlying Longitudinal Models
    Terminology of Panel Models
    Identification
    Stability
    Temporal Lags in Panel Models
    Growth Across Time in Panel Models
    Stability of Causal Processes
    Effects of Excluded Variables
  Correlation and Regression Approaches for Analyzing Panel Data
  Summary

PART 3: Factor Analysis and Path Modeling

7. Introducing the Logic of Factor Analysis and Multiple Indicators to Path Modeling
  Factor Analysis
    Logic of Factor Analysis
    Exploratory Factor Analysis
    Confirmatory Factor Analysis
    Use of Confirmatory Factor Analysis Techniques
  Constraining Relations of Observed Measures With Factors
  Confirmatory Factor Analysis and Method Factors
    The Basic Confirmatory Factor Analysis Path Model for Multitrait-Multimethod Matrices
    Confirmatory Factor Analysis Approaches to Multitrait-Multimethod Matrices and Model Identification
    Summary of Confirmatory Factor Analysis and Multitrait-Multimethod Models
  Initial Testing of Plausibility of Models: Consistency Tests
    Number of Indicators and Consistency Tests
    Costner's Original Consistency Model

PART 4: Latent Variable Structural Equation Models

8. Putting It All Together: Latent Variable Structural Equation Modeling
  The Basic Latent Variable Structural Equation Model
    The Measurement Model
    Reference Indicators
    The Structural Model
  An Illustration of Structural Equation Models
    Model Specification
    Identification
    Equations and Matrices
  Basic Ideas Underlying Fit/Significance Testing
    Individual Parameter Significance
    Model Fitting
    The Measurement Model
    The Structural Model
    The Variance/Covariance Matrices

9. Using Latent Variable Structural Equation Modeling to Examine Plausibility of Models
  Example 1: A Longitudinal Path Model
  Example 2: A Nonrecursive Multiple-Indicator Model
  Example 3: A Longitudinal Multiple-Indicator Panel Model

10. Logic of Alternative Models and Significance Tests
  Nested Models
  Tests of Overall Model Fit
    Absolute Indexes
    Relative Indexes
    Adjusted Indexes
  Fit Indexes for Comparing Non-Nested Models
  Setting Up Nested Models
  Why Models May Not Fit
  Illustrating Fit Tests

11. Variations on the Basic Latent Variable Structural Equation Model
  Analyzing Structural Equation Models When Multiple Populations Are Available
    Overview of Methods
    Comparing Processes Across Samples
    Testing Plausibility of Constraints
    Constraints in the Measurement Model
    Constraints in the Structural Model
    When and How to Impose Equality Constraints
  Second-Order Factor Models
  All-Y Models

12. Wrapping Up
  Criticisms of Structural Equation Modeling Approaches
    "Internal" Critics
    "External" Critics
    Emerging Criticisms
  Post Hoc Model Modification
  Topics Not Covered
    Power Analysis
    Nonlinear Relationships
    Alternative Estimation Techniques
    Analysis of Noncontinuous Variables
    Adding Analysis of Means
    Multilevel Structural Equation Modeling
    Writing Up Papers Containing Structural Equation Modeling Analysis
    Selecting a Computer Program to Do Latent Variable Structural Equation Modeling

Appendix A: A Brief Introduction to Matrix Algebra and Structural Equation Modeling
    What Is a Matrix?
    Matrix Operations
    Inverting Matrices
    Determinants
    Matrices and Rules

Answers to Chapter Discussion Questions
References
Author Index
Subject Index
About the Author

Preface

This book is intended for researchers who want to use structural
equation approaches but who feel that they need more background and a
somewhat more "gentle" approach than has been provided by books
published previously. From my perspective as a longtime user of
structural equation methods, many individuals who try to use these
techniques make fundamental errors because they lack understanding of
the underlying roots and logic of the methods. They also can make
"silly" mistakes that not only frustrate them but also invalidate their
analyses because writers have assumed that readers would understand
basics of the methods (e.g., what are called reference indicators).

Because I came to these techniques fairly early (in the early 1970s),
what now is history was what was current then. I learned about methods
such as path analysis as contemporary methods, and they evolved into
the current methods over time. I hope that I effectively transmit the
strengths and limitations of these techniques as well as the ways in
which they led to current methods.

I began teaching about these techniques in the spring of 1977 by
default, for I was the only person in my department who had used the
latent variable methods and understood them, and I also was one of the
few who had access to the programs. For years, I patched together my
course and looked for a book that I liked. Finally, I decided to write
about what I taught. The product of that decision is this book.


I wrote this not as a statistician on the cutting edge of the
approaches but rather as a user with strong interest in methods. The
book reflects the way in which I came to these methods, namely,
beginning with theory and a data set from a school desegregation study
and looking for methods that could use the nonexperimental data from
that project to examine plausibility of different theoretical views. In
fact, I practically advertise the substantive problems that led me to
these methods, for they appear repeatedly in examples and
illustrations. As I will say again in the text, I do not use my data as
examples because they are great data sets or because the models fit
perfectly. They are not and do not. At the same time, they are the
kinds of data that researchers find themselves having, and the
substantive problems are ones that are accessible to readers. If they
are as accessible as I think they are, then I am likely to get notes
and comments from readers about the alternative models they generated
from the data sets!

Throughout the book, I tried to present topics and issues in a way that
will help readers conceptualize their models. In particular, I tried to
spend time discussing logic of alternative approaches. One example is
the discussion of nonrecursive versus longitudinal models. Although I
have my preference, it is a relative one rather than an absolute one,
and my own ultimate decision in any instance would be driven by a
combination of methodological and conceptual issues.

As I progressed into the writing of the book, I found out quickly how
hard it is to describe some of the more complex methods in simple
terms. I suspect that there are instances in which I did not manage to
stay "gentle" despite my best intentions. I would appreciate feedback
from readers about where the complexity is too great or where the
descriptions are unclear.

The book is divided into four parts: background, single-measure
approaches, factor analysis and multiple indicators, and latent
variable structural equation approaches. Readers with strong
quantitative skills and backgrounds should be able to sample
selectively from the first three parts of the book (Chapters 1-7) and
focus on the remaining chapters, which present latent variable
structural equation modeling. All readers, however, should be sure that
they understand the logic underlying the methods. Furthermore, they
should look at the examples and illustrations, for those make concrete
many of the issues presented in more abstract ways.


Finally, once readers get to the end and go on to try to use the
techniques, they should be able to go back to the illustrations and
compare their analyses to those I report. I have appended LISREL
control statements for most of the examples. Comments and queries can
be sent by e-mail to geofmar@vx.cis.umn.edu.

Acknowledgments

First, I have to thank my former students, who used earlier drafts of
this book in my class as I refined it. Their feedback was very helpful,
and I hope it results in a book that will be kind to future students.
Second, thanks are due to those who helped me learn structural equation
modeling (SEM) techniques: Norman Miller, my graduate school adviser
who gave me my first SEM problem and helped me think about conceptual
issues; Norman Cliff, who happened to have the Educational Testing
Service publication on LISREL and focused me away from multiple
regression approaches to latent variable SEM; Ward Keesling, an earlier
developer of SEM approaches who provided advice and served informally
on my thesis committee even though his university was different from
mine; Peter Bentler, who allowed me to sit in on a class of his (at
that same other university) as he explored SEM issues; and my
colleagues here at Minnesota: George Huba, who shared the first SEM
class with me as we stayed ahead of our students, and Bob Cudeck, who
provided support, feedback, ideas, and resources as I worked on this
book. Third are the array of colleagues and students who came to me
with their problems, for they enriched my views about what comes easily
and what is difficult to understand in SEM. Fourth, this book would not
have been done were it not for the encouragement of (or was that
prodding by?) my editor, C. Deborah Laughton, and the excellent and
helpful group of reviewers that she found. Reviewers whose good advice
I did not follow should know that I tried to incorporate their
feedback, and where the advice was consistent and clear, I did. At the
same time, I found instances in which there was not agreement among
them, which gave me license both to pick and choose and perhaps to stay
close to the views that I had acquired over time.

BACKGROUND

What Does It Mean to Model Hypothesized Causal Processes With
Nonexperimental Data?

The purpose of the statistical procedures is to assist in
establishing the plausibility of the theoretical model and to
estimate the degree to which the various explanatory variables
seem to be influencing the dependent variables.
Cooley, 1978, p. 13

The above quote captures in a nutshell the essence of techniques for
modeling hypothesized relationships among variables using
nonexperimental, quantitative (i.e., correlational) data. The
techniques described in this book are intended to allow researchers to
examine the plausibility of their notions about relationships and
impacts when data are nonexperimental. Through these techniques,
hypothesized structures, typically called models or (less accurately)
causal models, can be either rejected as implausible or tentatively
accepted as consistent with the data. The techniques to be described
are not restricted to nonexperimental methods, for these techniques can
be used to model experimental data (e.g., Bagozzi, 1991). In
experimental research, they are most valuable for studies hypothesizing
mediating variables that transmit the effects of the manipulations.


Unfailingly, structural equation methods need to start from a
conceptually derived model specifying the relationships among a set of
variables. Theory provides the centerpiece for structural equation
methodologies; they were designed for use by researchers with
substantive interests in understanding complex patterns of
interrelationships among variables. Without theory, there is little to
distinguish among the numerous alternative ways of depicting
relationships among a set of variables. For most groups of variables,
many different models can be specified, with very different
consequences. Most important, in contrast to reality where cause and
effect exist independently of our ideas about how they work, in models
cause and effect are totally dependent on the way in which the
relationships are specified, and the results at best speak to
plausibility about the way in which relationships are specified.
1

Structural equation methods provide estimates of the strength of all
the hypothesized relationships between variables in a theoretical
model. They therefore provide information about hypothesized impact,
both directly from one variable to another and via other variables
positioned between the other two. Those other variables are called
intervening or mediating variables. If one can assume that the
hypothesized model is true, then the information will accurately
represent underlying (causal) processes.
So, one might ask, "What is the catch? The methods sound both
interesting and promising. Why isn't everyone using them?" The catch
comes from the repeated use of terms such as "hypothesized causal
impact" and "assuming that the model is true." They alert readers about
the potential weaknesses of the methodology. If the model is wrong,
then the analyses may be misleading or even just plain wrong. As was
already mentioned, there are for most sets of variables many
alternative ways of specifying their relationships. Even "small" errors
in positioning variables or including paths can create havoc all over a
model and result in the solution suggesting erroneous inferences. One
potential consequence could be to design an intervention that, based on
inferences drawn from an incorrectly specified model, actually
manipulates an effect and not a cause.

1. In some ways, it is unfortunate that the methods are complex, for
there has been a tendency for the techniques to focus too much on
technical issues tied to methods and too little on substantive ones
related to cause and effect. As will be discussed much later in this
book, there have been tensions between researchers who argue for using
the methods for testing a priori models and researchers who advocate
for post hoc changing of models, called model modification, to produce
models with good fits (e.g., MacCallum, Roznowski, & Necowitz, 1992).

Consider as an example modeling the relationship between popularity
with peers and achievement in school. This example is an important one
for this book because the topic is used with a single data set to
illustrate and compare different structural equation techniques and is
used with various data sets in a number of illustrations. Furthermore,
to the extent that readers can draw on their personal experiences in
schools to conjecture or hypothesize about whether or not and how the
variables are related, it hopefully also will prove to be an example
that is easy to follow and understand. To limit the range of
developmental or age-specific hypotheses that may be generated, this
example assumes a focus on elementary grade students.

Imagine that a research team has decided to investigate the
plausibility of a conceptually driven model of the nature of the
relationship between being accepted by peers in school and doing well
in school. The research question could be "Does an individual's
popularity affect his or her achievement?" As should be true for all
models, this model starts from a conceptual one specifying the nature
of the relationships between the variables. Interest in the model comes
from a large number of correlational studies that have found peer
acceptance to correlate positively with academic achievement (see,
e.g., Maruyama, Miller, & Holtz, 1986). First, the research question
implicitly states two alternative views about the impact of popularity
on achievement (does affect, does not affect). Second, there also is
the question of the impact of achievement on popularity (does affect,
does not affect). In past research, there has been theorizing
supporting each direction of influence, and either one could account
for the relationship between the two variables that has been found by
correlational research. Thus, we should want to generate a model that
allows us to examine hypothesized relations from each variable to the
other, that is, going in two directions. Both views could be supported,
one or the other could be supported, or both could be found to be
implausible (see, e.g., Maruyama & McGarvey, 1980). The last
possibility could occur, for example, if some other variable or
variables influenced both popularity and achievement and thereby
accounted for their association.


The first view about impact could be called a "social star" model in
which popular children over time come to do better in school
(popularity affects achievement). The processes could operate in a
fashion such as the following: By virtue of some positive attributes
they possess, popular children are liked better by teachers who expect
more from them and also are liked more by and helped more by peers,
with the result that their rate of learning will increase. The
reasoning draws from teacher expectancy effects (e.g., Brophy & Good,
1974) and from the attraction and attractiveness literatures (e.g.,
Byrne & Griffitt, 1973; Maruyama & Miller, 1981).

A second view, which could be called the "academic star" model,
hypothesizes that high-achieving children will be better liked because
of their capabilities and accomplishments. This view also can be drawn
from the similarity and attraction literatures, which suggest that we
are attracted to others who reward us or have the potential to reward
us (e.g., Byrne & Griffitt, 1973).

Figure 1.1 depicts pictorially a model that hypothesizes a set of
causal processes linking acceptance by peers and achievement. The model
would consist of two structural equations, one for each variable that
has arrows going to it. It tentatively hypothesizes that popularity
affects achievement (the arrow from popularity to achievement) and that
achievement affects popularity (the arrow from achievement to
popularity). It also articulates other variables (the social class of
the student's family, the student's academic ability, and the student's
social skills) that are hypothesized as affecting either popularity or
achievement or both.

Because the model is complex, further discussion of it is left until
later in the book. The important point here is that models provide a
means by which to articulate patterns of hypothesized relationships
among variables. Plausibility of the model could be tested if we were
to collect measures of the variables described and use the techniques
and methods to be described in this book to analyze the data. Of
course, if the hypothesizing is flawed, then we likely would learn
little regardless of what the analyses suggest.

The techniques and methods described in this book do not try to do the
impossible, namely, to establish causality in the absence of an
experimental intervention. They cannot prove that any variable causes
another variable. At the same time, however, they also do not accept as
truth an assertion that causality can and should be examined only
through experimentation. Rather, they provide an alternative and
complementary methodology to experimentation for examining plausibility
of hypothesized models. The approach is particularly valuable in
situations where, for various reasons (e.g., the variables cannot be
manipulated ethically, comparison groups are not and cannot be made
equivalent, a rich correlational data set is available to provide
guidance for future research), experimentation was not done. In Figure
1.1, for example, each relationship described by a single-headed arrow
hypothesizes the existence of a cause-effect relationship linking two
variables, which means that analyses of the model provide information
about plausibility of those relationships actually existing.

[Figure 1.1. Model for the Relationship Between Popularity and
Achievement. Variables shown: Family Social Class, Social Skills,
Academic Ability, Popularity With Peers, Academic Achievement.]

With the development of powerful computers and accompanying software
that make the complex mathematics of the most effective techniques no
longer a daunting obstacle, the methods described in this book have
made great strides in the past 20 years. Within psychology, for
example, they have gone from being generally unknown and being
discounted as very limited to being widely accepted. They even have
been viewed as exciting and as having great potential to transform or
at least extend research across the social sciences, education,
business, and health. After all, as has already been discussed, we all
know about variables that cannot be manipulated ethically but about
which we have ideas about causality. For example, neither race nor
ethnicity can be manipulated, nor can gender or even social class, yet
there are numerous conceptual models articulating ways in which those
variables are related to other variables. Some of the models even
hypothesize causal impacts of demographic types of variables.
Furthermore, most researchers have ideas about causality irrespective
of whether they use experimental or nonexperimental techniques. If
theory drives a research study, then the analyses should be conducted
in a way that does the best possible job of examining plausibility of
the theory, and structural equation techniques provide a useful tool.

The availability and acceptance of structural equation techniques
should not lead readers to underestimate the difficulty of using them.
They have become so easy to use that there is now greater danger of
such techniques being misused by researchers who really do not
understand them than there is of such techniques being overlooked. The
prominent computer programs for doing structural equation modeling
(SEM) have reached the point where they can be run by creating a
diagram of the model (e.g., the AMOS and EQS programs that will be
described later in this text). Once models are specified, they can
easily be modified, even from a diagram (e.g., the LISREL program, also
described later).

At present, there is available an array of techniques for conducting
structural equation analyses. These techniques include "ordinary"
regression, multistage least squares regression, panel analysis, and
latent variable (often maximum-likelihood) analysis of structural
equations. These approaches are not without controversy. Some critics
think of their use with structural models as representing GIGO (garbage
in, garbage out); others view them as trying to accomplish the
impossible: to prove causality from correlation. By contrast,
experience using these techniques suggests to me a much different
conclusion, namely, that they serve a valuable function in the social
sciences and should be part of the repertoire of tools available to
researchers.


Perhaps part of what leads to the range of divergent views such as
those just described is the way in which one thinks about these
methods. If the view is that users of these methods are trying to
"prove" causality, then skepticism (if not rejection) of these
approaches is reasonable. (We also could discuss how and whether or not
experimentation actually establishes causality, but that is a
discussion for a different book.) If, on the other hand, the view is
that there are many alternative ways of thinking about causality among
any set of variables and that data in many circumstances ought to be
useful in distinguishing between or among alternative perspectives,
then these approaches provide important information. More specifically,
the way in which one distinguishes between various models is by finding
disconfirmation of one or more alternative models. In other words, as
is a central theme of this book, a particular data set may or may not
"fit" or be consistent with a particular model. If the data and model
are inconsistent, then that model can be rejected as not plausible and
the theory that generated it is put at risk. Once implausible models
are discarded, research can focus on remaining plausible models and
develop ways of using various methods to pit them against one another.

Methods for Structural Equation Analyses

The approaches described in this book have been given a number of
different labels including path modeling, path analysis, causal model
analysis, causal modeling, structural equation analysis, SEM, and
latent variable analysis of structural equations. Some are even
referred to by the names of computer programs, for example, LISREL
analysis. The particular term used reflects both the philosophy of the
user about the approaches and the time period in which the user learned
about the methods. Initially, these techniques usually were called path
analysis, using the name given to the approaches by Sewell Wright in
his early works on decomposing the relative importance of different
genetic paths (Wright, 1921, 1934). That terminology remained most
common throughout the early years and well into the period during the
1960s, when the approaches were imported into the social sciences by
Blalock, Duncan, and others. By the 1970s, when advances in computers
and in computer applications made more complex analyses possible and
practical, the term "causal modeling" was used. That term fairly
quickly fell into disfavor in the minds of at least a subset of social
scientists, who objected to any use of the term "causal" with
nonexperimental data. The term "causal" was replaced by the less
controversial and very descriptive term "structural equation"; the
relations between variables in these approaches are defined by a series
of equations that describe hypothesized structures of relationships,
thus structural equation analysis or structural equation modeling.

Figure 1.2 provides an illustration of a hypothesized causal
structure. Note that the model's structure hypothesizes that Variable 1
(family social class) influences Variable 2 (one's successes at school),
which in turn influences Variable 3 (one's first job). There is one
structural equation that predicts Variable 2 and a second that predicts
Variable 3.
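The two structural equations just mentioned can be written out as a rough sketch. The symbols are assumptions introduced here for illustration and are not taken from the text: $X_1$, $X_2$, and $X_3$ stand for Variables 1 through 3, $p_{21}$ and $p_{32}$ for the two path coefficients, and $e_2$ and $e_3$ for the unexplained parts of Variables 2 and 3.

```latex
\begin{align}
X_2 &= p_{21} X_1 + e_2 \\
X_3 &= p_{32} X_2 + e_3
\end{align}
```

Note that no term links $X_1$ directly to $X_3$ in the second equation; in this notation, that omission is what the model hypothesizes.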
Even the simple model of Figure 1.2 allows a test to be conducted.
In this case, the relationship between Variable 1 and Variable 3 is
mediated or transmitted by Variable 2. If the model is true, then, as
will be explained more fully in Chapter 3 when path models are
described, the correlation between Variable 1 and Variable 3 should
be the product of the correlation between Variable 1 and Variable 2
and the correlation between Variable 2 and Variable 3. (That is, the
equality $r_{13} = r_{12} \times r_{23}$ or
$r_{13} - (r_{12} \times r_{23}) = 0$ should hold. Readers
familiar with partial correlation or partial regression formulas will
recognize the latter form of the equality as the numerator of those
formulas for three-variable case partials between Variable 1 and
Variable 3.) If the correlation between Variable 1 and Variable 3 is not
similar to the product of the other two correlations, then the model
can be rejected as not accurately representing the data, at least for the
data set. If the correlation $r_{13}$ is similar to the product of the other
two correlations, then the model is viable. There are important issues
in determining what constitutes "similar." Because issues of similarity
are much easier to understand in the context of statistical tests of
model fit, the discussion will be left for later. For now, assume that
by inspection we might be able to make approximate determinations
about similarity.
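The product rule for the three-variable model can be checked numerically. The following is only a minimal sketch: the function name `mediation_residual`, the inspection threshold, and the three correlations are hypothetical values invented for illustration, and "similar" is judged by a crude cutoff rather than by the statistical tests of fit discussed later in the book.

```python
def mediation_residual(r12, r23, r13):
    """Return r13 - (r12 * r23): the part of r13 that the chain
    Variable 1 -> Variable 2 -> Variable 3 leaves unexplained."""
    return r13 - (r12 * r23)


# Hypothetical correlations, invented for illustration only.
r12, r23, r13 = 0.50, 0.40, 0.21

residual = mediation_residual(r12, r23, r13)

# Arbitrary inspection threshold (an assumption, not a statistical test).
TOLERANCE = 0.05
model_is_viable = abs(residual) <= TOLERANCE

print(f"implied r13 = {r12 * r23:.3f}, observed r13 = {r13:.3f}")
print(f"residual = {residual:.3f}, viable by inspection: {model_is_viable}")
```

A residual near zero leaves the model viable; a large residual would count against it for that data set.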
[Figure 1.2. Simple Structure Model Interrelating Three Variables: Family Social Class -> Success at School -> First Job]

Even if the model "passes" our tests of similarity or fit, there still
are limitations about what conclusions can be drawn. As will be
discussed in detail in a later chapter, there are alternative theoretical
models that predict the same pattern of relationships among
measures. For example, in Figure 1.2, the "reverse" model formed by
reversing the direction of the arrows, namely, from Variable 3 to
Variable 2 to Variable 1, is mathematically equivalent, as are a number
of other models, including ones in which variables are related but do
not cause one another. Thus, failing to reject a model as implausible
means only that it is one of a number of viable remaining models. At
the same time, however, inspection of the substantive variables in
Figure 1.2 illustrates the point that not all alternative models are
necessarily of equal viability; it makes little sense to argue that a
child's first job causes the social class of his or her birth family. In
other words, some mathematically viable models are less logically
viable than others, and researchers need to take advantage of logical
as well as mathematical and theoretical information in assessing
viability of models and competing models.
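The equivalence of such models can be illustrated with a small sketch. The helper `implied_correlations`, the variable labels, and the coefficients 0.5 and 0.4 are all hypothetical, introduced here for illustration. The helper assumes standardized variables and treats variables without causes as uncorrelated, which is enough for the single-cause models compared below (the original chain, the reversed chain, and a model in which Variable 2 is a common cause of the other two).

```python
def implied_correlations(order, paths):
    """Correlations implied by a recursive model of standardized variables.

    order: variable names listed so that causes precede their effects.
    paths: {effect: {cause: standardized path coefficient}}.
    """
    r = {}

    def corr(a, b):
        # A variable correlates 1.0 with itself; other pairs are looked up.
        return 1.0 if a == b else r[frozenset((a, b))]

    for i, effect in enumerate(order):
        for earlier in order[:i]:
            # Trace each direct cause of `effect` back to `earlier`.
            r[frozenset((earlier, effect))] = sum(
                coef * corr(cause, earlier)
                for cause, coef in paths.get(effect, {}).items()
            )
    return r


a, b = 0.5, 0.4  # hypothetical path coefficients

chain = implied_correlations(
    ["class", "school", "job"], {"school": {"class": a}, "job": {"school": b}}
)
reverse = implied_correlations(
    ["job", "school", "class"], {"school": {"job": b}, "class": {"school": a}}
)
common_cause = implied_correlations(
    ["school", "class", "job"], {"class": {"school": a}, "job": {"school": b}}
)

print(chain == reverse == common_cause)
```

All three structures imply the same three correlations, so correlations alone cannot separate them; as the text notes, logic and theory have to do that work.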
In summary, the methodologies to be described in this book are
intended to encourage and allow formalized presentation of the
hypothesized relationships underlying correlational research, to test
the plausibility of the hypothesizing for a particular data set, and to
complement other methodological approaches. Because structural
equation methods have become a part of the array of tools that
graduate students today are taught, developing a sound
understanding of the techniques and the logic underlying them is of critical
importance, and this brings me to the goal of this book.

The goal of this book is to provide readers with a good basic
understanding of how and why structural equation approaches have
come to be used. Providing that understanding requires ensuring that
readers have the opportunity to learn about the logic underlying the
use of these approaches, about how they relate to techniques such as
regression and factor analysis, about their strengths and shortcomings
as compared to alternative methodologies, and about the various
methodologies for analyzing structural equation data. This book will
not try to cover the entire field of structural equation techniques, for
that field currently is a fertile one that is expanding rapidly.
Explanation of the more complex issues will be left for other writers such as
Bollen (1989), Hayduk (1996), and Hoyle (1995).

Overview

This book is divided into four sections. The first section, Chapters 1
and 2, is an overview and history of methods for path models. The
second section, Chapters 3-6, covers basic approaches to structural
modeling with single measures of theoretical variables. The third
section, Chapter 7, introduces exploratory and confirmatory factor
analysis techniques and discusses measurement issues when multiple
measures of theoretical variables are available. The final section,
Chapters 8-12, covers latent variable SEM.

The book attempts to cover background information that often
has been overlooked by other authors but that needs to be understood
if one is to be an intelligent user and take maximal advantage of the
approaches. Because the focus is on appealing to a general audience,
technical language and equations are avoided whenever possible. As
will be seen, however, technical language, equations, and matrix
algebra are integral parts of structural equation approaches, and I
found it impossible to cover the issues without including them. When
technical language and equations are used, they are complemented
with both narrative explanations and illustrations that apply to them.
In some instances, this approach may gloss over technical issues of
importance, but such instances are few.

Chapter Discussion Questions

1. Are the methods described in this book only for quantitative
   data?
2. Can econometric data be analyzed using these techniques?
3. In the model in Figure 1.2, is Success at School both
   moderating and mediating the relationship between the other two
   variables?
4. What does it mean to say that there are models that are
   mathematically equivalent to the one in Figure 1.2? What are
   some alternative models? What makes the different models
   equivalent? Does this mean that you should choose a model
   that fits your hypothesized relationships?

EXERCISE 1.1

Logic of Path Modeling

This example uses real data collected from college students.
In these particular models, the approach used is the one
described in this chapter but also is path analysis. Each path
(standardized regression) coefficient is "simple" (i.e.,
bivariate) and, therefore, in these cases path analysis reduces to
analysis of simple correlations. No statistical procedures are
needed to get the path coefficients.

Given the three variables Test Anxiety, Test Expectations, and
Test Performance, examine the plausibility of the two models
presented as follows for each of the four groups (think of
them as four replications). No formal significance tests are
necessary.

Correlation                                 Group    Group    Group    Group

Test Anxiety with Test Expectations         -.321    -.423    -.221    -.364
Test Anxiety with Test Performance          -.288    -.288    -.153    -.278
Test Expectations with Test Performance      .207     .311     .179     .306

NOTE: Figures are r values.

Model A: Test Anxiety -> Test Expectations -> Test Performance

This model hypothesizes test expectations as mediating the
relationship between test anxiety and test outcomes. That is,
students higher on test anxiety expect to do less well and
therefore perform less well.

Is Model A plausible?

Model B: Test Expectations -> Test Anxiety -> Test Performance

This model, which cannot be true if Model A is true, hypothesizes
that students with higher test expectations will be lower on test
anxiety, which will cause them to do better. This latter model views
test anxiety as less of an individual difference variable than
personality theorists might view it.

Is Model B plausible?

Solution process. Each model is "tested" by multiplying two
correlations together and comparing their product with the third
correlation. If the model fits, then the product should equal the third
correlation. The difference between the product and the third
correlation is the part of the correlation that is unexplained by the model.

For Model A, the Test Anxiety-Test Performance correlation
is compared to the product of the other two correlations.
For Model B, the Test Expectations-Test Performance
correlation is compared to the product of the other two
correlations.

(Neither model fits.)
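The solution process can be carried out in a few lines. This is a sketch of the arithmetic only: the variable names are illustrative, and the four table columns are referred to by position because their group labels are not used here. The correlations themselves are copied from the exercise table.

```python
# Correlations from the exercise table, one entry per column, in the order printed.
anxiety_expectations = [-0.321, -0.423, -0.221, -0.364]
anxiety_performance = [-0.288, -0.288, -0.153, -0.278]
expectations_performance = [0.207, 0.311, 0.179, 0.306]

columns = list(zip(anxiety_expectations, anxiety_performance, expectations_performance))

# Model A: Test Anxiety -> Test Expectations -> Test Performance.
# The Anxiety-Performance correlation should equal the product of the other two.
residuals_a = [ap - ae * ep for ae, ap, ep in columns]

# Model B: Test Expectations -> Test Anxiety -> Test Performance.
# The Expectations-Performance correlation should equal the product of the other two.
residuals_b = [ep - ae * ap for ae, ap, ep in columns]

for position, (res_a, res_b) in enumerate(zip(residuals_a, residuals_b), start=1):
    print(f"column {position}: Model A residual {res_a:+.3f}, "
          f"Model B residual {res_b:+.3f}")
```

In every column the unexplained part of the correlation exceeds .10 in absolute value for both models, which agrees with the conclusion that neither model fits.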

Chapter 2: History and Logic

Because the structural equation modeling (SEM)
literature is spread across many fields and encompasses a number of
traditions, readers new to SEM approaches often have a difficult time
finding information that gives them a basic understanding of the
context and purposes of SEM. The goal of this chapter is to provide
a basic background and history, addressing questions such as the
following. Who thought of these approaches? In what context were
they developed? How have they been used? Who translated them or
adapted them for other areas? What should one know about them to
use them intelligently? In some instances, this explanation requires
introducing terms that will be explained more fully in later chapters;
in such instances, I attempt to provide enough context for readers to
understand the point. The first section traces the history of SEM, and
the second focuses on the broad class of research questions for which
models can be useful.

History

Sewell Wright

As mentioned in Chapter 1, the roots of SEM go back to the 1920s,
when Sewell Wright, a geneticist, attempted to solve simultaneous
equations to disentangle genetic influences across generations. Wright
faced a situation in which the causes (genes of the "parents") were
known and the outcomes (the offspring's traits) were known, and
causality went in a single direction without feedback or loops that
circle back on themselves. In the SEM literature, his situation is called
a recursive or unidirectional causal flow model (see Note 2). It is the
only kind of model that can properly be called path analysis. Wright
wanted to estimate the sizes of the effects from each parent to the
offspring. The solution could be determined by writing the system of
equations, expressing the equations in terms of the correlations among
the various variables, and solving for the unknowns (there were more
knowns than unknowns, so the system was solvable). In describing
his methodology, Wright (1921) stated,

    The present paper is an attempt to present a method of measuring the
    direct effect along each separate path in such a system and thus of finding
    the degree to which variation of a given effect is determined by each
    particular cause. The method depends upon the combination of
    knowledge of the degree of correlation among the variables in a system with
    such knowledge as may be possessed of the causal relations. In cases
    where causal relations are uncertain, the method can be used to find the
    logical consequences of any particular hypothesis in regard to them.
    (p. 557)

Note, in particular, the statement about finding logical consequences
of any hypothesis. It is another way of saying, "If the model is true,
then the relationships are . . ."

Later, Wright (1934) stated,

    The method of path coefficients is not intended to accomplish the
    impossible task of deducing causal relations from the values of
    correlation coefficients. It is intended to combine the quantitative information
    given by the correlations with such qualitative information as may be at
    hand on causal relations to give a quantitative interpretation. (p. 193)

In addition to introducing the methods of path analysis, Wright's
statements clearly defined the purposes of the methodology, namely,
to find the consequences of particular hypothesized structures. He
also clearly stated limits of the approaches in terms of issues of
causality.

2. The idea of "recursive" meaning the same as "unidirectional" always has been
problematic, for it is inconsistent with other uses of recursion. Recently, Ed Rigdon provided an
explanation for recursive on a structural equation listserve that makes sense to me: The
models are recursive because they are made up of a set of equations, and those equations
can be ordered such that one solves sequentially for each dependent variable/equation and
then uses that information to return recursively to the system of equations to solve for later
variables/equations.
Path Analysis in the Social Sciences

Perhaps surprisingly, work with the ideas of Wright in the social
sciences was negligible until the 1960s, when Blalock (1964), Duncan
(1966), and others introduced them to address social science issues.
For path analysis, solution processes were relatively simple.
Parameters were estimated by solving a system of equations using linear
algebra (solving for a number of unknowns using a system containing
an equal or a greater number of equations) or using multiple regression.

One of the prominent areas of early structural equation research
was on what are called status attainment processes, namely, what
determines the jobs and careers that we end up having. That research,
done by Duncan and others, examined antecedents of success in
attaining education and jobs. This research looked at variables such
as social class of the family, past academic achievement, and social
support as predictors of success. In such models, then, the primary
dependent variables were educational attainment (e.g., years of
education, degrees received) and job status. Because these models crossed
large periods of time, they generally were unidirectional in their flow.

Unidirectional Flow Models

For models in which hypothesized causality goes in a single direction,
the solution process was fairly straightforward and amenable to
methodologies available from the time of Wright. Initially, algebra
(simultaneous equations) and matrix algebra were used for
estimation, solving one or more equations for some number of unknowns.
Later, regression techniques were used as well. In some instances
the two approaches yielded identical findings, but in others they
could differ. In models that met the minimum condition necessary
for uniquely solving for the unknown parameters to be estimated,
namely, having the same number of equations as unknown parameters
to estimate (called just-identified models in the path analysis and SEM
literatures), regression and linear algebra approaches yielded identical
results. That is, the same unique solution can be obtained either by
solving for the equations using matrix algebra or by using regression
approaches. (Readers could look ahead to Figure 3.2 to see a
just-identified model. It has 10 paths to estimate and has enough
information, 10 correlations, to yield 10 equations.)
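A smaller just-identified case than Figure 3.2 can be worked by hand. The sketch below assumes a hypothetical three-variable model (X1 -> X2, X1 -> X3, X2 -> X3) with standardized variables, so that three correlations yield three equations in three unknown paths. The function name and the correlations are invented for illustration.

```python
def solve_just_identified(r12, r13, r23):
    """Solve for the paths of X1 -> X2, X1 -> X3, X2 -> X3 from three correlations.

    With standardized variables the equations are:
        r12 = p21
        r13 = p31 + p32 * r12
        r23 = p31 * r12 + p32
    """
    p21 = r12
    determinant = 1.0 - r12 ** 2  # of the 2 x 2 system in p31 and p32
    p31 = (r13 - r12 * r23) / determinant
    p32 = (r23 - r12 * r13) / determinant
    return p21, p31, p32


# Hypothetical correlations, invented for illustration only.
r12, r13, r23 = 0.5, 0.4, 0.6
p21, p31, p32 = solve_just_identified(r12, r13, r23)

# With as many equations as unknowns, the solution reproduces the data exactly.
reproduced_r13 = p31 + p32 * r12
reproduced_r23 = p31 * r12 + p32

print(f"p21 = {p21:.4f}, p31 = {p31:.4f}, p32 = {p32:.4f}")
print(f"reproduced r13 = {reproduced_r13:.4f}, r23 = {reproduced_r23:.4f}")
```

The expressions for `p31` and `p32` are the same as the usual two-predictor formulas for standardized regression weights, which is one way to see why the algebraic and regression solutions coincide in the just-identified case.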
In models in which there were too many parameters to estimate
for the number of observed measures, namely, having more unknowns
than equations (called under-identified models), there would not be
enough information available to uniquely estimate the parameters
regardless of the approach used. The problem caused by not having
enough information is that there are an infinite number of alternative
solutions that are equally viable and no defensible way of choosing
from among them. (Under-identification will be addressed in more
detail in Chapter 5 as part of the discussion of non-recursive models.)
Therefore, attempting to impose a single path analytic solution to
interpret makes no sense. Once again, readers could look ahead at
Figure 3.2. If we were to add any other possible path, for example,
from [...] back to X, then the model would have too many unknowns
and not be uniquely solvable.
Finally, for models in which there are fewer unknowns than
equations (called over-identified models), the equations hold enough
information to produce more estimates than parameters. As a result,
there will be more than one way of solving for at least some of the
parameters, and the different ways will not necessarily (or usually)
produce exactly the same estimates. Once again, readers can look
ahead at Figure 3.2. If we were to drop any existing paths because we
decided that theoretically they should be zero, then the model would
become over-identified, for it would have more equations than
parameters to solve. In such circumstances, regression, which produces
only a single solution for interrelating a set of predictors to a
particular criterion variable, has been shown to produce the best estimate
(see Land, 1969). Furthermore, to anticipate later discussions of
estimation using maximum likelihood approaches, in such
circumstances least squares and maximum likelihood estimates are identical
(e.g., Land, 1969). These estimates for a given parameter typically
would be close to an average of the various ways of estimating that
parameter through algebraic solutions.
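To see how an over-identified model yields more than one algebraic estimate, consider a hypothetical three-variable chain X1 -> X2 -> X3 in which the direct path from X1 to X3 has been fixed at zero. This is a sketch under assumed values: the function name and the correlations are invented for illustration and do not come from Figure 3.2.

```python
def p32_estimates(r12, r13, r23):
    """Two algebraic estimates of the path X2 -> X3 in the chain X1 -> X2 -> X3.

    With standardized variables the three equations are:
        r12 = p21
        r23 = p32
        r13 = p21 * p32
    giving three equations for only two unknown paths.
    """
    direct = r23          # from r23 = p32
    indirect = r13 / r12  # from r13 = p21 * p32, with p21 = r12
    return direct, indirect


# Hypothetical correlations, invented for illustration only.
direct, indirect = p32_estimates(r12=0.5, r13=0.4, r23=0.6)

print(f"estimate from r23:       {direct:.3f}")
print(f"estimate from r13 / r12: {indirect:.3f}")
```

The two estimates disagree unless the data satisfy the model's constraint exactly. Regressing X3 on X2 alone would return the first of them, because a bivariate standardized regression weight equals the correlation.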
Although it may seem difficult to believe from today's
perspective, even as recently as during the 1960s a major constraint on the
use of structural equation methods to solve broader classes of models
was the relatively primitive state of computers and consequent
unavailability of estimation techniques such as maximum likelihood.
The hardware was not in place to allow general access to statistical
analyses that could have addressed structural equation methods using
relatively complex mathematical approaches. For example, one of the
first papers to compare least squares, algebraic, and maximum
likelihood estimates misestimated the maximum likelihood estimates and
had to correct both the results and the interpretations in an addendum.
In summary, the approaches to SEM used during the "path
analysis era" of the 1960s to solve for unidirectional causal flow
models employed multiple regression techniques, often called
ordinary least squares analysis. For path analysis models (which by
definition have only single measures of each variable of interest),
these techniques will yield results identical to those of the current
approaches because, as already mentioned, least squares and maximum
likelihood estimates are identical (e.g., Land, 1969). For variations
on path analysis that employ measurement error or bidirectional
causality/feedback loops (called nonrecursive models), variations on
regression techniques such as indirect least squares, two-stage least
squares, or three-stage least squares could be used.
To anticipate later parts of this book and direct the thinking of
readers, it is important to note that a major advantage of the general
linear model used in programs such as LISREL (e.g., Jöreskog &
Sörbom, 1993), EQS (e.g., Bentler, 1989), and AMOS (e.g., Arbuckle,
1997) is that they can handle most types of models (recursive,
nonrecursive, with and without random and nonrandom
measurement error, and with observed and unobserved variables) and
consequently do not require readers to learn an array of different
techniques for different types of models. These programs, which give rise
to the latent variable structural equation models, share the general
linear model of regression models but differ insofar as they have
unmeasured predictor variables (this point will be explained later).
Lest the current point be lost, however, it is important to note that
path analysis models are a class of multiple regression models. The
shortcomings of regression approaches therefore are important to
consider and are covered in Chapter 4 of this book.

Moving Beyond Path Analysis in
Structural Equation Modeling Research

After a surge of interest in the least squares methods, the limitations,
especially of path analysis techniques described in detail in Chapter 3,
led the methods into disfavor. It was easy to criticize the approaches.
For example, most theoretical variables are assessed inaccurately due
to both imprecision in operationalizing them and inaccuracy in
measuring the observed measures. These difficulties occur regardless of
whether or not the conceptual model presented is viable. In other
words, using path analysis techniques was a surefire way of inviting
criticism for poor operationalization of the conceptual variables, and
this made publication of SEM research difficult. In addition, if the
hypothesized model contained feedback or causal loops, then that
model could not be solved by ordinary regression techniques, which
limited the applicability of the methods.
The next surge of enthusiasm was led by the works of Jöreskog
(1969) and others including Bock, Wiley, Browne, and Keesling.
These researchers developed a general linear modeling approach that
allowed researchers to overcome many of the limitations of the least
squares approaches by allowing for far better operationalization of
theoretical variables. The approach (e.g., Wiley, 1973) has evolved
over the past 20 years into the array of structural equation approaches
and computer programs that are widely distributed today, among
which LISREL, EQS, and AMOS are the most widely known.
Discussion of the methods underlying those programs comprises the latter
part of this book.

Why Use Structural Equation
Modeling Techniques?

A fairly straightforward way of thinking about when to consider using
structural equation approaches comes directly from multiple
regression. Choosing regression as a starting point is logical because (a) the
methodologies have evolved from regression techniques and build on
the assumptions of regression and (b) the reasoning researchers use
for selecting regression in many instances actually is better
accomplished by using an SEM approach. This chapter builds primarily on
the latter point.
There are two prominent reasons why researchers use multiple
regression. The first focuses exclusively on explaining as much
variance as possible in the dependent variable. For this type of use, the
weights of the various predictor variables are much less important
and sometimes even inconsequential or irrelevant. The goal is best
prediction and in this book is called regression for prediction. One
example of this approach might be a situation in which there is an
array of information available, and a college wants to use that
information to make "accurate" admissions decisions, namely,
admitting students likely to stay and graduate. A second example could be
an employer who wants to use available information to help
determine which potential employees will be both effective and likely to
stay with the company. A third could be the owner and coach/manager
of a sports team trying to decide which athletes they want on their
team. In each instance, the individuals involved in selecting may not
care at all about the specific variables that contribute to the
prediction. Rather, they are willing to use those variables in the aggregate
(i.e., put them all in the equation as predictors) and try to maximize
variance accounted for in predicting success or retention and
minimize imprecision.
Regression for prediction does not provide logic consistent with
SEM approaches. Rather, for this class of uses, SEM adds nothing
important or even of value. Therefore, such problems fall at best at
the periphery of the issues and problems covered in this book.
By contrast, the second set of uses of regression, in which the
particular predictors and their regression weights are of interest,
called regression for explanation here, defines why SEM techniques
are so valuable. In such circumstances, the researchers want to know
not only how well the predictors explain the criterion variable but
also which specific predictors are most important in predicting. To
illustrate, imagine a regression model in which the researchers have
five predictor variables and a criterion variable. The five predictors
likely will have been selected because they are thought to influence
the criterion variable, and regression helps disentangle the relative
influence of the various predictors (see Note 3). In such a model, what
is important is how the approach represents the hypothesized
relationships. SEM techniques make use of all the information that is
provided by regression techniques and allow the opportunity to
consider additional information that helps disentangle possible impacts
of various predictors.
Figure 2.1 presents a typical regression model in which the X's
are the predictor variables, Y is the criterion variable, the b's are
regression coefficients, the residual e is the unexplained variance, and
the curved, double-headed arrows connecting the predictor variables
represent their intercorrelations. Note that there are 10
intercorrelations that are given much less attention in regression than they deserve
(for some statistic packages, even seeing them requires asking for
optional output). If the predictor variables were orthogonal (i.e.,
independent of one another), then the situation would be a simple
one and the (standardized) regression coefficients would be the
correlations of the predictors with the criterion variable. Typically,
however, the predictors are correlated, and the fact that different
predictor variables are interrelated is much of what makes multiple
regression interesting and leads investigators to want to disentangle
the various influences. In such circumstances, the size of the
regression coefficients reflects both the size of the correlation of the
predictor with the criterion variable and the size of the
intercorrelations among the predictor variables. In fact, of the problems in using
regression approaches that are discussed later in this book, virtually
all are tied to the size of the relationships among the predictor
variables.
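The dependence of standardized regression coefficients on the predictor intercorrelations can be shown with a short sketch. It solves the standardized normal equations, R b = r_y, by Gauss-Jordan elimination. The function name and all correlation values are hypothetical, and only two predictors are used (rather than the five of Figure 2.1) to keep the arithmetic easy to follow.

```python
def standardized_weights(R, r_y):
    """Solve R b = r_y for the standardized regression weights b.

    R:   predictor intercorrelation matrix (list of lists).
    r_y: correlation of each predictor with the criterion.
    Uses Gauss-Jordan elimination with partial pivoting.
    """
    n = len(r_y)
    aug = [list(R[i]) + [r_y[i]] for i in range(n)]  # augmented matrix [R | r_y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Hypothetical correlations of two predictors with the criterion.
r_y = [0.5, 0.4]

# Orthogonal predictors: the weights equal the correlations with the criterion.
b_orthogonal = standardized_weights([[1.0, 0.0], [0.0, 1.0]], r_y)

# Correlated predictors (r = .6): the weights shrink and no longer match r_y.
b_correlated = standardized_weights([[1.0, 0.6], [0.6, 1.0]], r_y)

print("orthogonal:", b_orthogonal)
print("correlated:", [round(b, 5) for b in b_correlated])
```

With the intercorrelation set to zero the weights are simply .5 and .4; with an intercorrelation of .6 they fall to roughly .41 and .16, even though the correlations with the criterion are unchanged.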
Consider, for example, an illustration that will be used
throughout this book that is concerned about predictors of school
achievement. There are many measures that correlate with student
achievement including family social class, academic ability, measures of
individual differences (e.g., self-concept, anxiety), peer relations,
teacher evaluations, and expectations and aspirations of students. The
challenge is to sort through the measures and identify those likely to
help shape achievement and separate them from ones that merely
reflect achievement. What makes the challenge particularly difficult
is that the measures tend to be related, which makes the sorting-out
process more difficult both logically and methodologically.

[Figure 2.1. Regression Model With Five Predictor Variables]

3. As an important aside, it also may be the case that more than one of the five predictors
measures the same underlying construct. For example, if researchers believe that self-concept
is an important predictor and are concerned that self-concept is difficult to assess accurately,
then they might include two or more measures of self-concept as predictors. As will be
explained in Chapter 4, such an approach may be self-defeating and misleading because it
can underestimate the impact of self-concept.
As might be anticipated from the preceding example, problems
in regression models are more likely to emerge in complex models
because as the number of predictor variables increases, the number
of intercorrelations increases much more rapidly. Their number can
be calculated by using the equation
$\text{Number of Correlations} = [p(p - 1)] / 2$,
where p is the number of predictors. Whereas for two predictors there
are twice as many regression coefficients as intercorrelations among
predictor variables (2:1), for five variables there are only half as many
regression coefficients as intercorrelations among predictor variables
(5:10), and the ratio gets smaller as the number of predictors increases.
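The growth in the number of intercorrelations is easy to tabulate. The sketch below simply evaluates the formula above; the function name and the choice of values for p are illustrative.

```python
def n_intercorrelations(p):
    """Number of distinct correlations among p predictors: p(p - 1) / 2."""
    return p * (p - 1) // 2


for p in (2, 5, 10, 20):
    # One regression coefficient per predictor versus p(p - 1) / 2 intercorrelations.
    print(f"{p} predictors: {p} coefficients, "
          f"{n_intercorrelations(p)} intercorrelations")
```

For two predictors the ratio of coefficients to intercorrelations is 2:1 and for five it is 5:10, matching the figures in the text; by ten predictors it has dropped to 10:45.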

The point of major importance here is that because a user of
regression techniques needs to understand the size and nature of
the relationships among the predictor variables, those relationships
should be made a visible part of the user's analyses. Doing so brings
together regression and path analysis, for Figure 2.1 is a path model.
Once regression models are viewed as path analysis models, however,
the curved arrow relationships may be thought of somewhat
differently. If the researchers who developed a model had ideas about how
and why the predictor variables were causally interrelated, they could
recast their model to represent some of the predictors as causing or
being caused by other predictors. It is important that imposing an
order among one's predictor variables does not change the b's that
appear in the model of Figure 2.1. It just necessitates adding other
regression analyses to solve the equations for any predictor variables
that are hypothesized as being influenced by other predictor variables.
In summary , then , th e poin t of Figur e 2.1 is tha t regressio n model s
in whic h th e significanc e or nonsignificanc e of regressio n coefficient s
fro m specifi c predictor s to th e criterio n variabl e is of primar y impor
tanc e can benefi t by bein g specifie d as pat h models . Presentin g the m
as pat h model s shoul d mak e researcher s mor e awar e of th e kind s of
problem s tha t limi t regressio n approache s while also encouragin g
the m to formaliz e th e intuitiv e idea s the y ma y hav e abou t ho w an d
wh y thei r predicto r variable s ar e interrelated . Becaus e pat h analysi s
model s ar e solve d usin g multipl e regression , thinkin g of regressio n
model s as pat h analyti c model s shoul d be noncontroversial . Further
more , becaus e pat h model s an d multipl e regressio n provid e th e cor e
informatio n neede d to understan d th e broa d clas s of SEM, it is onl y
a coupl e of mor e logica l step s fro m regressio n to laten t variabl e SEM.
Chapte r 2 ha s briefl y trace d th e histor y of SEM. It als o ha s
suggeste d tha t SEM approache s shoul d be considere d wheneve r
researcher s ar e intereste d in predictio n tha t focuse s beyon d th e
varianc e accounte d for (R ) to th e specifi c regressio n weights . In thos e
instances , SEM approache s forc e researcher s to articulat e thei r
thought s about relationship s of all variable s wit h on e another . Onc e
thos e thought s ar e articulated , thei r plausibilit y is subjec t to empirica l
examination .
1


Chapter Discussion Questions

1. In path analysis, do you always assume that the data set has
   been standardized?
2. Should multiple regression always be used to solve for path
   analysis models? Path analysis does not use any notation to
   signify partialing, but conceptually will path coefficients be the
   same as partial regression coefficients?
3. Does path analysis work with longitudinal data?
4. How do we gain a degree of freedom by taking out or leaving
   out a path? How does that help over-identification? What
   exactly do degrees of freedom permit researchers to do?
5. How can under-identified models be solved?
6. Do researchers still use path analysis?

BASIC APPROACHES TO MODELING WITH SINGLE OBSERVED MEASURES OF THEORETICAL VARIABLES

In this chapter, the basic building blocks of structural
equation modeling (SEM) are presented. First, an intuitive basis for
developing correlation coefficients/covariances is presented. Second,
notions about breaking those coefficients into "causal" and
"noncausal" components are presented, followed by approaches for
breaking apart components for a given model. Third, degrees of freedom
for structural equation models are discussed. Finally, the formulas for
three-variable partial correlation and partial regression are derived
from path models to show the links between path models and
common partialing techniques.

The discussion at this point focuses on path analysis models, which,
when the term is used precisely (e.g., Duncan, 1975), are only those
models (a) with unidirectional causal flow and (b) in which the
measure of each conceptual variable is perfectly reliable. In assuming
perfect reliability, path analysis assumes that each conceptual variable
is assessed without error by a single measure. There can be no error
in measuring each variable (called measurement error) or imprecision
in operationalizing each variable (called specification error). That is,
each measure is viewed as an exact manifestation of an underlying
theoretical variable. Illustrations of the difficulty of eliminating both
measurement error and specification error are provided later in this
chapter.


Certainly, within the social sciences, assumptions about perfect
reliability must be viewed as generally unrealistic. What social scientist
ever has had models in which there is no measurement error and in
which all measures perfectly operationalize the conceptual variables
that are being assessed? This shortcoming helps explain why path
analysis did not become particularly popular in social science
research. Nonetheless, a path analysis framework is chosen because the
modeling processes described hold true for all types of structural
equation models and are most readily illustrated in the relatively
simple and straightforward context of path analysis. Furthermore,
these limiting assumptions of path analysis also apply to all regression
approaches.

The limitations imposed by assumptions about measurement and
specification error in path analysis are balanced by other features that
make path analytic approaches very appealing. As was argued earlier,
an important strength of path analytic models is that they force
researchers to articulate the theoretical models that underlie their
designs and their thinking. For the discussion in this chapter, however,
it is a second and equally important strength of path models that is
the focus, namely, the logic developed for attempting to take
correlations or covariances and break them apart into causal and noncausal
components (called decomposition of effects). All of the different
structural equation approaches allow for decomposition of effects.
The general approaches are presented here as they were developed
along with path analysis.

Before discussing decomposition of effects, a basic review of the logic
underlying correlations and covariances is presented. Readers
comfortable with the logic of correlations/covariances should feel free to
skip ahead to the subsequent section of this chapter.

Logic of Correlations and Covariances


Imagine that you know nothing at all about correlational techniques
and that you are trying to develop a method for assessing the
association between two variables. One logical first step would be to
think about what might happen if you multiply each individual's score
for the first variable by the score for the second variable and divide
the sum of the individual products by the number of individuals or
observations so that sample size does not affect the result. For
variables that are associated, small numbers from one variable would
be multiplied by small numbers from the other variable, whereas big
numbers would be multiplied by other big numbers. For variables that
are not associated, small numbers would be as likely to be multiplied
by large numbers as by small ones. For variables negatively associated,
small numbers on one variable would be more likely to be multiplied
by large numbers on the other variable and vice versa. The size of this
measure of association would increase from a negative association to
no relationship to a positive association.

If one were to calculate such products, then the result, which in
fact is called the cross-product of the two variables, could provide
some information of value, particularly if a number of variables
having the same scale are compared. At the same time, a major
shortcoming becomes apparent when one attempts to compare
cross-products of variables with markedly different means and standard
deviations. At that point, it becomes clear that there is no easy
intuitive basis for making sense of a cross-product from its size. A
second shortcoming of the cross-product is that a change in the mean
of a variable (adding a constant to or subtracting one from its scores,
which in effect could be what happens from one sample to another)
changes its cross-product with other variables. Certainly,
cross-products that differ in magnitude only because of mean differences
are not desirable when trying to draw inferences about
associations or relationships for a single variable in different groups.
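A minimal sketch of the second shortcoming, using made-up scores for four individuals: adding a constant to one variable changes the averaged cross-product even though the pattern of association is untouched. The function name and the data are illustrative assumptions, not part of the text.

```python
def mean_cross_product(x, y):
    """Average of the products of paired raw scores."""
    return sum(a * b for a, b in zip(x, y)) / len(x)

# Hypothetical scores for four individuals.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]

# Same scores on x, but with a constant of 10 added to everyone.
shifted_x = [a + 10.0 for a in x]

print(mean_cross_product(x, y))          # 7.0
print(mean_cross_product(shifted_x, y))  # 32.0
```

The jump from 7.0 to 32.0 reflects nothing but the shift in the mean of x.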
Problems caused by differences in means should not stymie us for
long, however, for the solution is suggested by the nature of the
problem. A second logical step is to remove the means from the
calculations and then take cross-products once again. Removing the
means results in the measures of association no longer providing any
information about whether or not means differ or change. That
consequence is of little concern insofar as comparisons of means
typically are done through t tests and analyses of variance rather than
through measures of association.[4] These new, "means-removed"
cross-products would reflect only the variances around a mean of 0.
Removing means works, for if we take the adjusted scores and multiply
each individual's scores on the first variable by the scores on the second
variable, sum those products, and then divide by the number of
individuals, we have the covariance. Because both variables have both
positive and negative scores (both are centered around 0), the products
can have both negative and positive signs. When the variables are
unrelated, the sum of the products will approximate 0. For positively
related variables, the overall product will be positive due to a
preponderance of negative values on one variable being multiplied times
negatives on the other and of positive values on one variable being
multiplied times positives on the other. For negatively related
variables, the overall product will be negative due to a preponderance of
positive values on one variable being multiplied times negatives on the
other and vice versa.

4. Comparisons of means also can be analyzed through SEM techniques, but such analyses are
complicated, requiring inputting an augmented moment matrix rather than a covariance or
correlation matrix (e.g., Byrne, Shavelson, & Muthén, 1989; Sörbom, 1974, 1982), and will not
be covered in any detail in this book.
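The mean-removal step can be sketched as follows. The scores are invented for illustration, and the divisor is the number of individuals, as in the description above (many statistics packages divide by N - 1 instead).

```python
def covariance(x, y):
    """Mean of the products of deviation scores (divisor N, as in the text)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

# Hypothetical scores for four individuals.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
shifted_x = [a + 10.0 for a in x]   # same scores with a constant added

print(covariance(x, y))          # 0.75
print(covariance(shifted_x, y))  # 0.75
```

Because the means are removed before multiplying, adding a constant to a variable leaves the covariance unchanged.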
At this point, we have arrived at the basic building blocks for
structural equation approaches. Covariances contain information
about both the strength of the association between two measures and
their variability for any given sample. At the same time, however,
covariances are not ideal for comparing strength of associations
between different pairs of variables, for after inspection of a number
of covariances it becomes apparent that differences in variances make
it difficult to make comparisons about strength of association across
pairs of variables. To illustrate, look at a hypothetical covariance
matrix that appears in Table 3.1. The goal of the table is to show
measures with markedly different standard deviations. For purposes
of illustration, imagine that Variable 1 is college grade point average
(GPA), Variable 2 is intelligence test performance (IQ), Variable 3 is
weight in kilograms (Weight), and Variable 4 is height in meters
(Height). Hopefully, the respective standard deviations of 1.0, 15.0,
10.0, and 0.3, although "guesstimates," are reasonable. So, which pair
of variables has the strongest relationship? In fact, if my calculations
are correct, the variables all have the exact same magnitude
standardized relationship, yet direct comparison of them is difficult.

TABLE 3.1 Illustrative Covariance Matrix

                          Grade Point   IQ Test     Weight        Height
                          Average       Score       (kilograms)   (meters)
                          (1)           (2)         (3)           (4)
1. Grade Point Average    1.0
2. IQ Test Score          7.5           225.0
3. Weight (kilograms)     5.0           75.0        100.0
4. Height (meters)        0.15          2.25        1.50          0.09
Standard deviation        1.0           15.0        10.0          0.3

NOTE: Grade Point Average assumes a range from 0 (F) to 4 (A). IQ Test Score has a normed mean of
100 and a standard deviation of 15.

To make relationships between different pairs of variables more
readily comparable, a third step is possible and, for many purposes,
is valuable. That step is to remove the differences in variances to
facilitate direct comparison of different relationships by putting them
all on a common scale. The differences in variances can be eliminated
by giving all variables a common variance by rescaling them.
Practically, that is done by dividing each variance by some value that
results in all the variances ending up with the same scale. To complete
the rescaling, each covariance between two variables is divided by the
product of the square roots of the values used for rescaling the variances
of those two variables. Imposing a single common variance or scale
makes relationships between different variables easy to compare; the
ones with larger values have stronger relationships.

So long as a "common variance" is being arbitrarily selected for
rescaling, one that maximizes simplicity and ease of interpretation
should be selected. The best choice is to select the value 1 for the
rescaled variance. Each variance is divided by itself, and each
covariance is divided by the product of the square roots of the two
variances (the standard deviations). The result is that all relationships
range between -1 (a perfect negative relationship) and +1 (a perfect
positive relationship), and the square of the relationship represents the
proportion of variance that is shared between two variables. Said
differently, choosing a rescaled variance of 1 expresses the relationships
as correlations and optimizes comparison of relationships with one
another. To repeat, to change variances/covariances to correlations,
divide each variance by itself and each covariance by the product of the
standard deviations (i.e., the square roots of the variances) of the two
variables that covary. To go the other way, from correlations to
covariances, multiply the correlations times the product of the two
appropriate standard deviations.


EXERCISE
If the rescaling is imposed on the illustration in Table 3.1,
then what is the common correlation between the pairs of
variables? Do the calculations as described in the preceding
paragraphs. (The correct answer appears at the end of the
next paragraph.)

The final point of this section is that even though correlations are
very appealing and were the building blocks for path analysis, they
are not the ideal measure of association for many situations. As noted
earlier in this chapter, covariances contain information both about
the relationship between two variables and about the variability of
each variable in the sample of interest, and this makes
variance/covariance matrices optimal for SEM. Despite the fact that
correlations and covariances are just simple linear transformations of one
another, they contain very different information. Correlations have
given up the information from a sample about its variability. For
comparisons of relationships across samples, across groups (e.g.,
comparing relationships between variables in a male sample with
those in a female sample), or across time in a single longitudinal
sample, correlations do not allow researchers to assess the extent to
which the relationships reflect commonalities/differences in strength
of association versus differences in variances. Because correlations do
not allow for changes/differences in variances, only if variances are
equal across samples/groups/times is it appropriate to compare
correlations.[5] (The answer to the question in the exercise preceding
this paragraph is r = .50.)
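The rescaling rule (divide each covariance by the product of the two standard deviations, and multiply to go back) can be applied to the values of Table 3.1 as a check on the exercise. The function names below are illustrative assumptions; only the matrix entries come from the table.

```python
import math

names = ["GPA", "IQ", "Weight", "Height"]

# Full symmetric version of the covariance matrix in Table 3.1.
cov = [
    [1.00,   7.50,   5.00, 0.15],
    [7.50, 225.00,  75.00, 2.25],
    [5.00,  75.00, 100.00, 1.50],
    [0.15,   2.25,   1.50, 0.09],
]

def cov_to_corr(cov):
    """Divide each entry by the product of the two standard deviations."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]

def corr_to_cov(corr, sd):
    """Multiply each correlation by the product of the two standard deviations."""
    return [[corr[i][j] * sd[i] * sd[j] for j in range(len(corr))]
            for i in range(len(corr))]

corr = cov_to_corr(cov)
for name, row in zip(names, corr):
    print(name, [round(r, 2) for r in row])

# Going the other way recovers the original covariances.
sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
back = corr_to_cov(corr, sd)
```

Every off-diagonal correlation rounds to 0.5 and every diagonal entry to 1.0, even though the covariances range from 0.15 to 75.0.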
Although the difference between correlations and covariances
often is a difficult one for readers to understand, one way in which
to view it is to think about it in its extreme, namely, in situations where
restriction of range in a variable attenuates its correlation with other
variables. Imagine a sample in which there is almost no variability on
a measure of interest. For example, think about trying to assess
relations of achievement scores with other variables for a sample
selected because of the low achievement of its members. In such
circumstances, correlations of achievement with other variables are
greatly attenuated. In contrast to the correlation coefficient, the
covariance, although still affected, should nevertheless capture the
correspondence of that variable with other variables in raw score
terms, that is, preserving the effects of the relationship of a raw score
unit of change in one variable with some unit of change on the second.

A different instance can be illustrated by considering a case in
which a researcher has given a sample a test that turned out to be so
difficult or easy that virtually all the scores clustered together; there
is little discrimination between many individuals who actually have
differing levels of skill. In this instance, the covariance still is
preferred, but it is limited by inappropriateness of the measure for
assessing the underlying continuum of abilities of the particular
sample of respondents.

5. Finally, although premature for the present discussion, it is important to note that for
some of the estimation techniques described later in this book, the methods are worked out
for covariances but not correlations and may produce problems if correlation matrices are
analyzed (e.g., Cudeck, 1989).

Decomposing Relationships Between Variables Into Causal and Noncausal Components

As suggested several times previously in this book, the principles
underlying decomposition of effects are both the major strength of
path analytic approaches and essential for understanding the class of
approaches called structural equation modeling or SEM. The appeal
of path analysis is its capacity to talk about indirect effects or
noncausal relationships. For example, a strong association between
two variables indicates nothing about the nature of the relationship.
For a given model, however, one can talk about the amount of the
association that is specified as causal and the amount that is specified
as noncausal. The caveat, of course, is that unless the relationships
between variables all are correctly specified and the measures
perfectly assess the underlying theoretical variables, the division of
variance into causal and noncausal will be imprecise and perhaps even
wrong. In other words, for this section, causal should be read as if it
is surrounded by quotation marks ("causal"), for causal means that if
the model is true and if the theoretical variables are perfectly
operationalized, then the relationships are as specified in the model. At
the same time, however, it is important to remember that the
consequences of models are assessed in part by the way in which they
specify or hypothesize impacts of variables on other variables.

For any model, the relationships between variables can be
decomposed into causal effects and noncausal relationships by using the
logic introduced by path analysis. Furthermore, within causal and
noncausal, the effects can be broken down even more. For causal
effects, there are effects that go directly from one variable to a second
variable (direct effects) and effects between two variables that are
mediated by one or more intervening variables (indirect effects). For
noncausal relationships, there are relationships between two variables
that occur (a) because both are caused by a third variable (these are
referred to as noncausal reflecting common causes or noncausal due
to shared antecedents) and (b) because in models with more than one
independent variable there can be relationships among them in which
cause and effect are not theoretically articulated (often called
unanalyzed prior associations). If all independent variables in a model are
unrelated to one another, then there is no variability of this type. It
is more common, however, to find models in which the independent
variables are background types of variables where cause and effect
among those variables cannot readily be specified. Relationships
going through those associations are called unanalyzed prior
associations because the model does not attempt to assign cause and effect
(nature of relationship unanalyzed) and/or because the associations
are due to processes that occur earlier than (prior to) the relationships
on which the model focuses. Figure 3.1 illustrates the different types
of associations.

Figure 3.1. Decomposition of Effects
[Diagram: Total Association or Correlation divides into Causal Effects (Direct Effects,
Indirect Effects) and Noncausal Relations (Due to Shared Antecedents, Unanalyzed Prior
Association).]

Figure 3.2 provides a path analysis model that can be used to
illustrate decomposition of effects. The model employs fairly
standard path analytic notation. (A summary of general path notation
appears in Appendix 3.1 at the end of this chapter.) Each variable is
contained within a box. Straight lines with arrowheads at only one
end indicate hypothesized causal influences between variables, from
cause to effect. Each direct influence has a corresponding path
coefficient, a "p" with two subscripts, the first for the variable affected
(the effect) and the second for the determining or causal variable (the
predictor). Curved, double-headed arrows indicate relationships
whose causal dynamics are not of interest. Such arrows are used when
relationships are not well understood, cannot be specified readily,
occur prior to the processes in the model, or are unimportant to the
model. Straight-line arrows from "variables" with no boxes around
them indicate residual influences, namely, all other influences not
specified by the model. (For path analysis, which typically works with
standardized coefficients, they are the traditional residual, 1 - R^2,
from regression analyses. More generally, they are total variance
minus explained variance.) Note that only variables with arrowheads
pointing to them have residuals. In the model, such variables are
"dependent" (often called endogenous variables). X_3, X_4, and X_5 in
Figure 3.2 are endogenous variables. Variables with no causal arrows
pointing toward them (remember, the curved, double-headed arrows
are not causal) are called independent or exogenous variables. X_1 and
X_2 in Figure 3.2 are exogenous variables. If X_1 and X_2 were not related,
then the double-headed curved arrow would be eliminated from the
model; removing that path would not affect their status as exogenous
variables. Keeping with traditional path modeling approaches,
causality flows from left to right in the diagram. Because X_1 and X_2 are
exogenous and have no causes, they do not need to have a residual
specified; all of their variance is unexplained, and their residual is the
same as their variance.


Direct Causal Effects

The model contains a number of relationships specified as direct
causal effects (e.g., the path from X_1 to X_3). These paths go from one
box to another and have arrowheads at only one end.

EXERCISE
Before going on, readers unsure about their level of
understanding should count the number of direct causal effects.
How many are there? (The correct answer appears at the end
of the next paragraph.)

In path models, the direct effects are estimated via least squares
(sometimes called ordinary least squares or OLS) regression
approaches. Each of the endogenous variables needs to be thought of
as having its own regression equation describing the structure of the
relationships between variables, that is, structural equations (each
regression equation is a structural equation). To solve for the direct
paths, regress each endogenous or dependent variable on the variables
with direct paths to it. For example, regress X_5 on X_1, X_2, X_3, and X_4
to solve for the direct effects to X_5. For models like the present one,
which has exactly the same number of paths as pieces of information
to use to estimate those paths and therefore is "just-identified,"
regression is one of a number of ways of getting to the same solution.
In Chapter 2, identification was introduced. If there are more paths
than pieces of information (i.e., underidentification), then no unique
solution is possible. For models with the same amount of information
as paths, all ways of solving the model (e.g., algebra, regression)
provide the same solution. For overidentified path analysis models
with more information than paths to estimate, regression provides
optimal estimates. (The answer to the question in the exercise
preceding this paragraph is 9, or all the arrows with p's attached to
them.)
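The "just-identified" claim can be checked by simple counting. The sketch below assumes that the pieces of information are the distinct correlations among the five variables and that the curved arrow between the two exogenous variables counts as one of the paths to be estimated, which matches the way the text refers to it as a path; the variable names are illustrative.

```python
def pieces_of_information(n_variables: int) -> int:
    """Distinct correlations among the observed variables: k(k - 1) / 2."""
    return n_variables * (n_variables - 1) // 2

direct_paths = 9              # straight arrows with p's attached in Figure 3.2
unanalyzed_associations = 1   # curved arrow between the two exogenous variables
to_estimate = direct_paths + unanalyzed_associations

information = pieces_of_information(5)
degrees_of_freedom = information - to_estimate
print(information, to_estimate, degrees_of_freedom)  # 10 10 0
```

Zero degrees of freedom is what "just-identified" means here; dropping a path from the model would leave one piece of information unused and give one degree of freedom.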

The regression equations for Figure 3.2, which provide the direct
effects, are as follows:

X_3 = p_{31}X_1 + p_{32}X_2 + e_3                                  (3.1)
X_4 = p_{41}X_1 + p_{42}X_2 + p_{43}X_3 + e_4                      (3.2)
X_5 = p_{51}X_1 + p_{52}X_2 + p_{53}X_3 + p_{54}X_4 + e_5          (3.3)

Note that there are nine paths: the nine direct effects.
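As a rough sketch of "one regression per endogenous variable," the code below simulates data from a model with the same structure as the three equations and then regresses each endogenous variable on the variables with direct paths to it. Everything numeric is an assumption made for illustration: the generating weights, the sample size, the unit residual variances, and the helper names. The estimates are raw regression weights; standardizing the variables first would give standardized path coefficients.

```python
import random

random.seed(1)
N = 10000

def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def regress(y, predictors):
    """Least squares weights of y on the predictors (normal equations)."""
    y = center(y)
    xs = [center(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(xi, xj)) for xj in xs] for xi in xs]
    xty = [sum(u * v for u, v in zip(xi, y)) for xi in xs]
    return solve(xtx, xty)

# Hypothetical generating weights, named after the paths in the equations.
p31, p32 = 0.4, 0.3
p41, p42, p43 = 0.2, 0.1, 0.5
p51, p52, p53, p54 = 0.1, 0.2, 0.3, 0.4

x1 = [random.gauss(0, 1) for _ in range(N)]
x2 = [0.3 * a + random.gauss(0, 1) for a in x1]   # exogenous, related to x1
x3 = [p31 * a + p32 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]
x4 = [p41 * a + p42 * b + p43 * c + random.gauss(0, 1)
      for a, b, c in zip(x1, x2, x3)]
x5 = [p51 * a + p52 * b + p53 * c + p54 * d + random.gauss(0, 1)
      for a, b, c, d in zip(x1, x2, x3, x4)]

# One regression per endogenous variable, as the text describes.
est3 = regress(x3, [x1, x2])
est4 = regress(x4, [x1, x2, x3])
est5 = regress(x5, [x1, x2, x3, x4])
print([round(v, 2) for v in est3 + est4 + est5])
```

With a sample this large, the nine estimates should land close to the nine generating weights, one estimate per direct path.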



Indirect Causal Effects

Indirect causal effects occur in path models when one variable causes
a second, which in turn causes a third. In such circumstances, the first
variable exerts influence on the third via the second regardless of
whether it has a path directly to the third variable. One might think
of dominoes, where the first domino knocks down the second, which
in turn knocks down the third. The first domino indirectly affects the
third regardless of whether it directly comes in contact with the third
domino.

Indirect effects do not add paths to path diagrams and do not
"take up" degrees of freedom as do path coefficients. In path models,
indirect effects are indicated by two or more direct effect arrows in
combination linking two variables (see, e.g., Baron & Kenny [1986],
for a discussion of mediation). Thus, X_1 and X_2 have indirect effects
on both X_4 and X_5. Their indirect effects on X_4 occur through X_3,
whereas their indirect effects on X_5 occur through X_3, through X_4, and
through X_3 and X_4 combined.

Once one understands the logic of indirect effects, the remaining
issue is the mechanics. That is, how does one calculate the magnitude
of those effects? The answer is that for each indirect pathway linking
two variables, the magnitude of that pathway is estimated by
multiplying together the path coefficients along that pathway. Said
differently, the paths combine multiplicatively to determine the magnitude
of the indirect effects, or each indirect effect is the product of the path
coefficients that provide the pathway between the two variables that
are causally related. To determine the total indirect effect between
two variables, the individual indirect effects from the various
pathways are summed. That is, the total indirect effect between two
variables is the sum of the products taken in determining each
individual indirect effect.

To illustrate both individual and total indirect effects between two
variables, return to Figure 3.2. In that model, X_1 causes X_5 indirectly
through X_3, and the magnitude is the product of the paths from X_1 to
X_3 and from X_3 to X_5, namely, p_{53}p_{31}. Similarly, X_1 causes X_5
through X_4, and the magnitude of that relationship is p_{54}p_{41}; through
both X_3 and X_4, the magnitude is (p_{54}p_{43}p_{31}). The total indirect
effect of X_1 on X_5, according to the model, is the sum of the different
indirect effects, namely, (p_{53}p_{31}) + (p_{54}p_{41}) + (p_{54}p_{43}p_{31}).
In large models, indirect effects can include many intervening variables,
and total indirect effects can consist of the sum of a large number of
terms. Note finally that even without a direct effect, path p_{51}, X_1
still "causes" X_5 in the model. Thus, in contrast to multiple regression
in which only the direct paths are examined, in path models the absence
of a direct causal path does not mean that a variable is an unimportant
predictor of a particular dependent variable.
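The multiply-along-a-pathway, sum-across-pathways rule can be sketched by walking the arrows of a model with the structure of Figure 3.2. The coefficient values below are invented purely for illustration, and the function names are assumptions; only the pattern of paths follows the text.

```python
# Hypothetical path coefficients keyed as (effect, cause), matching the
# subscript order of p_{effect,cause}. The values are made up.
paths = {
    (3, 1): 0.4, (3, 2): 0.3,
    (4, 1): 0.2, (4, 2): 0.1, (4, 3): 0.5,
    (5, 1): 0.1, (5, 2): 0.2, (5, 3): 0.3, (5, 4): 0.4,
}

def pathways(cause, effect, product=1.0, arrows=0):
    """Yield (number of arrows, product of coefficients) for each directed
    pathway leading from cause to effect."""
    for (to, frm), coef in paths.items():
        if frm != cause:
            continue
        if to == effect:
            yield arrows + 1, product * coef
        else:
            yield from pathways(to, effect, product * coef, arrows + 1)

def decompose(cause, effect):
    """Return (direct effect, total indirect effect) of cause on effect."""
    found = list(pathways(cause, effect))
    direct = sum(value for n, value in found if n == 1)
    indirect = sum(value for n, value in found if n > 1)
    return direct, indirect

direct, indirect = decompose(1, 5)
print(round(direct, 4), round(indirect, 4))  # 0.1 0.28
```

With these values the total indirect effect of X_1 on X_5 is 0.3(0.4) + 0.4(0.2) + 0.4(0.5)(0.4) = 0.28, which is the sum of the three products named in the text.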

Noncausal Relationships Due to Shared Antecedents

The focus now shifts to the relationships that can be called spurious.
These relationships are the parts of the correlations/covariances that
are, according to the model, not causal. Although two different types
are described in this chapter, it is worth noting that most computer
programs focus on the causal rather than the noncausal effects, with
the result that noncausal effects are most commonly looked at as the
difference between the total association (correlation or covariance)
and the total causal effects. That is, they are total association minus
total causal effects.

The first type of noncausal relationships to be discussed are
relationships that reflect influences due to one or more causes that
two endogenous variables share in common. Look again at Figure 3.2,
focusing on Variables X_3 and X_4. Note that both X_3 and X_4 are caused
by X_1 and also by X_2; they have two sources of noncausal relationships
due to shared antecedent relationships. Part of the correlation
between X_3 and X_4 occurs because they both are caused by X_1; another
part occurs because they both are caused by X_2.
Althoug h th e relationshi p betwee n X an d X look s as thoug h it
migh t be simila r to tha t betwee n X an d X , th e presenc e of an
additiona l intervenin g variabl e (X ) make s it muc h mor e complex . In
th e sam e figure , th e X wit h X relationshi p reflect s commo n cause s
, X , an d X , so ther e wil l be component s parallelin g th e relationshi p
betwee n X an d X , namely , fro m X to each of th e variable s X,, X , X
to X , plu s commo n cause s from X throug h X (becaus e X, cause s
bot h X an d X , on e par t of th e relationshi p goe s fro m X bac k to X
3

SINGL E MEASURE S O F

42

VARIABLE S

the n to X], an d on to X,, an d a secon d par t goe s fro m X4 bac k to X


the n to X], an d on to X ) an d X throug h X (similarly , on e par t of th e
relationshi p goe s fro m X bac k to X , the n to X , an d on to X4 , an d a
secon d par t goe s fro m X4 bac k to X , the n to X , an d on to X5) . Reader s
shoul d not e that , in Figur e 3.2, eve n thoug h th e X to Xj to X
relationshi p is th e sam e as th e X to X, to X relationship , th e sam e
canno t be said for th e X to X, to X to X relationshi p versu s th e X,
to X, to X to Xj relationship . The y ar e differen t becaus e the y involv e
differen t pat h coefficients . (For reader s uncertai n whethe r the y un
derstan d thes e last points , tracin g th e path s for themselve s fro m th e
figur e is highl y recommended. )
1(

In th e illustration s of share d anteceden t or commo n caus e rela


tionships , th e causa l impac t accordin g to th e mode l is attribute d to
th e commo n caus e an d therefor e canno t also be viewe d as repre
sentin g a causa l relationshi p in whic h eithe r of th e endogenou s
variable s is causall y preponderant . In othe r words , th e relationship s
betwee n X an d X tha t reflec t thei r commo n cause s Xj an d X ar e
attribute d causall y to X an d X in th e relationship s of thos e variable s
wit h X an d X4. In thi s instance , however , th e relationshi p tha t is bein g
examine d is th e on e betwee n X an d X4 , an d th e par t of thei r
relationshi p attributabl e to X, an d X doe s no t reflec t an y causa l
relationshi p of on e wit h th e other . Not e tha t thi s doe s no t impl y tha t
the y ar e causall y unrelated , for th e mode l specifie s tha t X cause s X,,
an d tha t pat h is a causa l (direct ) one . Rather , it signifie s tha t thei r
relationship , accordin g to th e model , can be divide d int o causa l an d
noncausa l component s an d tha t share d anteceden t relationship s ar e
on e typ e of noncausa l relationship .
3

Noncausal Unanalyzed Prior Association Relationships

This second type of noncausal relationship is used to describe
relationships that pass through the double-headed curved arrows in
models. Thus, if a model has uncorrelated exogenous variables or only a
single exogenous variable, then there will be no variance component of
this type. Because, in Figure 3.2, $X_1$ and $X_2$ are related and the
causal nature of their relationship is unanalyzed (e.g., in a model of
status attainment, we might decide that we cannot specify cause and
effect between the social class of the family and the ability of the
child and thus make the relationship noncausal), variability that flows
through their association cannot be viewed as causal because the model
does not specify either variable as causally preponderant. Note that if
two exogenous variables are highly correlated, this variance component
gets relatively large, and this is not particularly desirable for
decomposing effects. (There are more severe problems created by strong
associations between predictor variables, called multicollinearity;
they will be discussed in Chapter 4.)

The curved arrow association between exogenous variables also results
in unanalyzed prior association relationships between exogenous and
endogenous variables and between pairs of endogenous variables. These
again can be illustrated through Figure 3.2. First, all of the
relationship between $X_1$ and $X_2$ is classified as noncausal,
unanalyzed prior association. For all other relationships involving
$X_1$ or $X_2$, a part of the association is of the same type. For
example, $X_1$ is related to $X_3$ through its association with $X_2$
and $X_2$'s effect on $X_3$. Similarly, part of the $X_4$-$X_5$
relationship comes because $X_1$ causes $X_4$, $X_2$ causes $X_5$, and
$X_1$ and $X_2$ are related, and another part comes because $X_1$
causes $X_5$, $X_2$ causes $X_4$, and $X_1$ and $X_2$ are related. And
we could go on and specify noncausal due to unanalyzed prior
association parts of relationships between any other pair of variables
in this model.

At this point, it is hoped that readers will feel as though they have
some basic understanding of the various components of variance and
which parts of various relationships fall under which categories. At
the same time, because the previous section worked by example rather
than by principles or methods that would allow the model to pull apart
each of the relationships, no means have been provided for fully
dividing a model into direct causal, indirect causal, and noncausal
relationships. (Common questions might well include "How many causal
components should I be finding?" and "How will I know if I have them
all?") This shortcoming is addressed in the next section of this
chapter.

One final point before turning to those approaches is that only for
just-identified models and overidentified models that perfectly fit the
observed data will the decomposition of effects perfectly divide the
covariances/correlations into their components. For most instances in
which there are degrees of freedom in the model, there will be
discrepancies between the data and the values predicted by combining
the causal and noncausal relationships predicted by the model.

In structural models, it is the size of the discrepancies (i.e., the
mismatch between the relationships that actually are found and those
predicted by the model) that allows models to be tested for adequacy.
As noted in Chapter 1 (e.g., in Exercise 1.1, the "mismatch" there for
each model is the difference between one correlation and the product of
two others and is used to test the plausibility of the model), lack of
fit allows models to be disconfirmed and rejected. Because the
relationships predicted in any model include all the causal and
noncausal variance components, the test of fit for a model is not one
of how well the predictors explain the dependent or endogenous
variables but rather of how well the entire model fits the data. Fit
tests, called fit indexes, are a crucial part of structural equation
approaches, and an entire chapter later in this book is devoted to
them. The difference between model fit and prediction of dependent
variables is an important distinction to make. A well-fitting model
could do a poor job of predicting (accounting for variability in) the
dependent variables in cases where the relationships between predictors
and dependent variables are small. By contrast, a poorly fitting model
could explain almost all the variability of each of the dependent
variables. There is substantial disagreement among SEM researchers
about how much one should focus on variability accounted for versus
overall model fit.

Approaches for Decomposing Effects

There are a number of different ways of fully decomposing effects. Some
approaches yield only numerical values for total direct and indirect
effects. Others allow calculation of each contributor to each effect
and then require summing the various components to determine total
effects. Some focus almost exclusively on causal effects. All provide
direct effects because those are the path coefficients. Before
presenting the approaches that I personally find most helpful and
accessible, it is worth noting that many of the widely used structural
equation programs provide indirect effects as either standard or
optional output. They all should do so, for calculation of indirect
effects is computationally simple for programs that work with matrices
of parameter estimates. Because some programs provide indirect effects,
it should be only a matter of time before they all include indirect
effects.

I have chosen to begin with what are called the rules for tracing
paths. The strength of this approach is that it provides logic that
meshes well with what has been presented so far. The shortcoming is
that, for complex models, it is easy to omit variance components and
therefore to misestimate total effects. Although the rules have been
presented in many alternative forms, I present them in a form that I
find most intuitive.

First, select the pair of variables whose relationship in a model is to
be decomposed. For each tracing, begin at one variable and go through
paths and variables to the other.

1. If one causes the other, then always start with the one that is the
   effect. If they are not directly causally related, then the starting
   point is arbitrary. But once a start variable is selected, always
   start there.
2. Start against an arrow (go from effect to cause). Remember, the goal
   at this point is to go from the start variable to the other variable.
3. Each particular tracing of paths between the two variables can go
   through only one noncausal (curved, double-headed) path (relevant
   only when there are three or more exogenous variables and two or more
   curved, double-headed arrows).
4. For each particular tracing of paths, any intermediate variable can
   be included only once.
5. The tracing can go back against paths (from effect to cause) for as
   far as possible, but, regardless of how far back, once the tracing
   goes forward causally (i.e., with an arrow from cause to effect), it
   cannot turn back against an arrow.

Figure 3.2 can be used to illustrate the tracing rules. Take, for
example, the relationship between $X_3$ and $X_4$. Always begin with
$X_4$, for that is the effect (Rule 1). The paths are (a) $X_4$ to its
cause $X_3$, or $p_{43}$; (b) $X_4$ to its cause $X_1$ and then (with
an arrow) to $X_3$, or $(p_{41} \times p_{31})$; (c) $X_4$ to its cause
$X_2$ and to $X_3$, or $(p_{42} \times p_{32})$; (d) $X_4$ to its cause
$X_1$ through a noncausal path to $X_2$ and to $X_3$, or
$(p_{41} \times r_{12} \times p_{32})$; and (e) $X_4$ to its cause $X_2$
through a noncausal path to $X_1$ and to $X_3$, or
$(p_{42} \times r_{12} \times p_{31})$. Note that, first, as mentioned
earlier, the effects are each the products of the various paths;
second, each tracing goes through other variables only once (Rule 4);
and, third, $X_5$ never is included in the decomposition because it is
causally "downstream" and irrelevant. Attempting to include it violates
Rule 5, for it would require going against a path (from effect to
cause) after going with a path (from cause to effect). Note also that
there are no indirect effects ($X_3$ does not cause $X_4$ through any
intervening variable). Finally, as illustrated in Table 3.2, Term (a)
is the direct effect and also total causal effect, Terms (b) and (c)
are noncausal due to shared antecedent effects, and Terms (d) and (e)
are unanalyzed prior association noncausal effects. The total noncausal
effect is the sum of Terms (b), (c), (d), and (e). As mentioned
earlier, the risk in the approach comes from missing one or more paths
in complex models.
A second approach to decomposing effects, often called Duncan's rule
(Duncan, 1966, 1975), employs the formula

$$r_{ij} = \sum_q p_{iq} \, r_{qj}, \tag{3.4}$$

where i and j are variables in the model (i > j) and q is an index over
all variables with direct paths to i.

Looking at the relationship between $X_3$ (j) and $X_4$ (i) in Figure
3.2, plugging numbers into the formula gives the equation
$r_{43} = (p_{41} \times r_{31}) + (p_{42} \times r_{32}) + (p_{43} \times r_{33})$.
To make the equation look like the components from the tracing rules,
we also need to solve for $r_{31}$ and $r_{32}$ ($r_{33} = 1.0$ and
therefore disappears):

$$r_{31} = (p_{31} \times r_{11}) + (p_{32} \times r_{12}) = p_{31} + (p_{32} \times r_{12})$$

$$r_{32} = (p_{31} \times r_{12}) + (p_{32} \times r_{22}) = (p_{31} \times r_{12}) + p_{32}$$

Because $r_{12}$ is a noncausal relationship between two exogenous
variables, it cannot be decomposed. Through substitution,

$$
\begin{aligned}
r_{43} &= p_{41}\,(p_{31} + p_{32} \times r_{12}) + p_{42}\,(p_{31} \times r_{12} + p_{32}) + p_{43} \\
       &= (p_{41} \times p_{31}) + (p_{41} \times r_{12} \times p_{32}) + (p_{42} \times r_{12} \times p_{31}) + (p_{42} \times p_{32}) + p_{43}.
\end{aligned}
\tag{3.5}
$$

Stated as the five variance components in the preceding and in Table
3.2, the terms are

$$r_{43} = (b) + (d) + (e) + (c) + (a).$$

As can be seen, the two approaches yield an identical result.
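The agreement between the tracing rules and Duncan's rule can be checked numerically. The sketch below uses hypothetical values for the paths and for $r_{12}$ in Figure 3.2 (assumed for illustration only) and computes $r_{43}$ both ways.

```python
# Hypothetical values for the Figure 3.2 quantities that enter r43
# (illustrative assumptions, not estimates from the book).
r12 = 0.30
p31, p32 = 0.40, 0.20
p41, p42, p43 = 0.25, 0.15, 0.35

# Tracing rules: the five components (a) through (e).
a = p43                # direct effect
b = p41 * p31          # shared antecedent X1
c = p42 * p32          # shared antecedent X2
d = p41 * r12 * p32    # unanalyzed prior association via X1 and X2
e = p42 * r12 * p31    # unanalyzed prior association via X2 and X1
r43_tracing = a + b + c + d + e

# Duncan's rule: r43 = p41*r31 + p42*r32 + p43*r33, with r33 = 1.
r31 = p31 + p32 * r12
r32 = p31 * r12 + p32
r43_duncan = p41 * r31 + p42 * r32 + p43 * 1.0

print(round(r43_tracing, 3), round(r43_duncan, 3))
```

Here the causal part of the association is term `a`, and the noncausal part is `b + c + d + e`.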


TABLE 3.2 Decomposition of Effects for the Relation Between $X_3$ and
$X_4$ for the Model in Figure 3.2

Causal Effect
  Direct
    (a)  $p_{43}$

Noncausal Relationship
  Shared Antecedent
    (b)  $(p_{41} \times p_{31})$
    (c)  $(p_{42} \times p_{32})$
  Unanalyzed Prior Association
    (d)  $(p_{41} \times r_{12} \times p_{32})$
    (e)  $(p_{42} \times r_{12} \times p_{31})$

At this point, it may seem that there is not an easy way in which to
decompose effects. If, however, one knows matrix algebra and can do
matrix multiplication, there is a method for disentangling causal
effects (direct and indirect) that requires only setting up a matrix of
path coefficients and then multiplying that matrix by itself. This
approach is illustrated in Table 3.3. (Appendix 3.1 provides an
introduction to matrix algebra.)
As can be seen from Table 3.3, for Matrix A, the path coefficients are
set up so that the predictor variables define the columns and the
dependent variables define the rows. The form is like the equations
except that the variables are removed. Exogenous variables have only
0's in their rows. Endogenous variables have the coefficients of the
paths to them, aligned according to predictor variables. Matrix
(A × A) contains all first-order indirect effects, namely, effects with
one intervening variable. Matrix (A × A × A) contains all second-order
indirect effects, namely, those with two intervening variables. Because
there are only three endogenous variables in the model of Figure 3.2,
no indirect effect can have more than two intervening variables, and
Matrix (A × A × A × A) is null, as would be found if (A × A × A) were
multiplied by A. If the numerical values of the paths for a particular
data set are put into Matrix A rather than the symbols for the paths,
the resulting numbers will reflect the sums of the products of the
various coefficients. Total indirect effects for this model can be
calculated by adding together the (A × A) and (A × A × A) matrices; if
their sum is added to A, which is the direct effects, then the result
is the total causal effects for all relationships in the model. The
difference between the total effects (A + [A × A] + [A × A × A]) and
the observed data matrix is the total of the noncausal effects.

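TABLE 3.3 Illustration of Matrix Approach to Decomposing Effects

Matrix of path coefficients (A)

$$
\begin{array}{c|ccccc}
\text{Variable} & X_1 & X_2 & X_3 & X_4 & X_5 \\ \hline
X_1 & 0 & 0 & 0 & 0 & 0 \\
X_2 & 0 & 0 & 0 & 0 & 0 \\
X_3 & p_{31} & p_{32} & 0 & 0 & 0 \\
X_4 & p_{41} & p_{42} & p_{43} & 0 & 0 \\
X_5 & p_{51} & p_{52} & p_{53} & p_{54} & 0
\end{array}
$$

Matrix of path coefficients multiplied by itself (A × A)

$$
\begin{array}{c|ccccc}
\text{Variable} & X_1 & X_2 & X_3 & X_4 & X_5 \\ \hline
X_1 & 0 & 0 & 0 & 0 & 0 \\
X_2 & 0 & 0 & 0 & 0 & 0 \\
X_3 & 0 & 0 & 0 & 0 & 0 \\
X_4 & (p_{43} \times p_{31}) & (p_{43} \times p_{32}) & 0 & 0 & 0 \\
X_5 & [(p_{53} \times p_{31}) + (p_{54} \times p_{41})] & [(p_{53} \times p_{32}) + (p_{54} \times p_{42})] & (p_{54} \times p_{43}) & 0 & 0
\end{array}
$$

Matrix A multiplied by Matrix (A × A) = (A × A × A)

$$
\begin{array}{c|ccccc}
\text{Variable} & X_1 & X_2 & X_3 & X_4 & X_5 \\ \hline
X_1 & 0 & 0 & 0 & 0 & 0 \\
X_2 & 0 & 0 & 0 & 0 & 0 \\
X_3 & 0 & 0 & 0 & 0 & 0 \\
X_4 & 0 & 0 & 0 & 0 & 0 \\
X_5 & [p_{54}(p_{43} \times p_{31})] & [p_{54}(p_{43} \times p_{32})] & 0 & 0 & 0
\end{array}
$$

NOTE: The next order matrix product (A × A × A × A) is a null matrix
for Figure 3.2. The total effects are the sum of the three matrices A,
(A × A), and (A × A × A); the direct effects are A, and the indirect
effects are the sum of the matrices (A × A) and (A × A × A).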
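The matrix approach can be sketched with NumPy. The numerical path values below are hypothetical stand-ins for the symbols in Matrix A (they are not taken from the book), and the variable names are chosen here only for illustration.

```python
import numpy as np

# Hypothetical path coefficients for Figure 3.2 (illustrative values only).
# Rows are dependent variables, columns are predictors, both ordered X1..X5.
A = np.array([
    [0.00, 0.00, 0.00, 0.00, 0.0],  # X1 (exogenous: all zeros)
    [0.00, 0.00, 0.00, 0.00, 0.0],  # X2 (exogenous: all zeros)
    [0.30, 0.20, 0.00, 0.00, 0.0],  # X3: p31, p32
    [0.20, 0.10, 0.40, 0.00, 0.0],  # X4: p41, p42, p43
    [0.15, 0.05, 0.25, 0.35, 0.0],  # X5: p51, p52, p53, p54
])

A2 = A @ A     # indirect effects with one intervening variable
A3 = A2 @ A    # indirect effects with two intervening variables
A4 = A3 @ A    # null for this model

indirect = A2 + A3       # total indirect effects
total = A + indirect     # total causal effects (direct + indirect)

# Total indirect effect of X1 on X5 (row X5, column X1).
print(round(indirect[4, 0], 3))
```

The entry `indirect[4, 0]` equals $(p_{53} \times p_{31}) + (p_{54} \times p_{41}) + (p_{54} \times p_{43} \times p_{31})$ for the assumed values, which matches the sum obtained by tracing the three indirect routes by hand.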

Determining Degrees of Freedom of Models

As noted in an earlier chapter, to estimate a solution to a model, all
parameters to be estimated need to have unique estimates. For path
analysis models (which deal with standardized relationships), the
information used to estimate paths is the correlations of the variables
with one another. Said differently, the correlations are the "knowns"
in path analysis, whereas the path coefficients to be estimated are the
"unknowns." Therefore, any path analysis model has as its maximum
number of degrees of freedom its number of correlations. Of course, if
a model had that maximum number of degrees of freedom, then it also
would contain no paths between its variables. Thus, the number of
degrees of freedom typically will be substantially less than the
maximum possible, for there will be a number of paths of interest in
most models.

The number of correlations of a given model can be calculated by using
the formula

$$\text{Number of Correlations} = v(v - 1) / 2, \tag{3.6}$$

where v is the number of variables in the model. For example, Figure
3.2, with its five variables, has 5 × 4 / 2 = 10 correlations. No model
with five variables could have more than 10 degrees of freedom. The
actual degrees of freedom of any model is determined by subtracting the
number of coefficients to be estimated from the maximum number of
degrees of freedom. For the current example (Figure 3.2), if one adds
up the paths to be estimated between variables (nine direct paths plus
the curved arrow between $X_1$ and $X_2$), then the total also is 10,
which means that the model has no degrees of freedom (10 - 10 = 0) or
is just-identified. As noted earlier in this chapter, for
just-identified models, solving using regression provides the same
solution as does solving using possible alternatives (e.g., solving a
system of simultaneous equations with 10 equations and 10 unknowns). If
one or more paths from Figure 3.2 were dropped, then the model would be
overidentified and have degrees of freedom; in such instances,
different solutions may diverge, and regression approaches provide the
"best" way of estimating the paths (e.g., Land, 1969). Finally, because
all path analysis models can have no covariation among residuals and
have a unidirectional causal flow, they always are identified.
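The degrees-of-freedom arithmetic can be written as a short sketch. The function names below are hypothetical helpers introduced only for this example.

```python
def n_correlations(v: int) -> int:
    """Number of distinct correlations among v variables (Equation 3.6)."""
    return v * (v - 1) // 2


def degrees_of_freedom(v: int, n_estimated: int) -> int:
    """Correlations (knowns) minus coefficients to be estimated (unknowns)."""
    return n_correlations(v) - n_estimated


# Figure 3.2: five variables, nine direct paths plus one curved arrow.
df_full = degrees_of_freedom(5, 10)

# The same model with one path dropped becomes overidentified.
df_one_dropped = degrees_of_freedom(5, 9)

print(n_correlations(5), df_full, df_one_dropped)
```

A result of 0 corresponds to a just-identified model; any positive value indicates an overidentified model.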

Presenting Partial Regression and Partial Correlation as Path Models

Partial Regression

Figure 3.3 provides diagrams representing partial regression and
partial correlation. Inspection of the partial regression model should
help remind readers that path analysis employs partial regression
approaches for its solutions; the partial regression model is just a
basic path model. The path $p_{31}$, which also is $B_{31.2}$, is used
to illustrate how the partial regression formula can be derived from
the path model. Using the tracing rules described earlier in this
chapter, the relationship between $X_1$ and $X_3$ is

$$r_{31} = p_{31} + p_{32} \times r_{21}. \tag{3.7}$$

Because this equation provides us with only a single equation in two
unknowns, a second equation is needed to solve for the path $p_{31}$.
The relationship between $X_2$ and $X_3$ is used:

$$r_{32} = p_{31} \times r_{21} + p_{32}, \tag{3.8}$$

yielding two equations in two unknowns (the p's), which allows finding
a solution. Now, if the second equation is expressed in terms of
$p_{32}$, then it is

$$p_{32} = r_{32} - p_{31} \times r_{21}. \tag{3.9}$$

Then, substituting for $p_{32}$ in Equation 3.7,

$$r_{31} = p_{31} + (r_{32} - p_{31} \times r_{21}) \times r_{21}.$$

Then, rearranging terms,

$$r_{31} = p_{31} + r_{32} \times r_{21} - p_{31} \times r_{21}^2,$$

and combining,

$$r_{31} - r_{32} \times r_{21} = p_{31}\,(1 - r_{21}^2).$$

Finally, expressing the equation in terms of $p_{31}$,

$$p_{31} = (r_{31} - r_{32} \times r_{21}) / (1 - r_{21}^2), \tag{3.11}$$

which is the traditional formula for partial regression ($B_{31.2}$).
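As a numerical check on the partial regression formula, the closed-form expression for $p_{31}$ can be compared with a direct solution of the two simultaneous equations. The correlations below are hypothetical values assumed only for this sketch.

```python
import numpy as np

# Hypothetical correlations among X1, X2, and X3 (illustrative only).
r21, r31, r32 = 0.30, 0.50, 0.40

# Closed-form partial regression coefficient (Equation 3.11).
p31_formula = (r31 - r32 * r21) / (1 - r21**2)

# Direct solution of the two equations in two unknowns:
#   r31 = p31 + p32 * r21
#   r32 = p31 * r21 + p32
R_predictors = np.array([[1.0, r21],
                         [r21, 1.0]])
p31_solved, p32_solved = np.linalg.solve(R_predictors, np.array([r31, r32]))

print(round(p31_formula, 4), round(p31_solved, 4))
```

Both routes give the same value of $p_{31}$, and substituting `p31_solved` and `p32_solved` back into Equation 3.7 reproduces $r_{31}$.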

Figure 3.3. Path Diagrams for Partial Regression and Partial Correlation

Partial Correlation

The lower part of Figure 3.3 contains a diagram for partial
correlation. In this instance, the relationship to solve for in terms
of observed correlations is the one between the error terms $e_2$ and
$e_3$. To make the model fit basic path rules and the residual path a
correlation (we are, after all, trying to solve for a partial
correlation), the residuals are made standardized variables, and so
paths need to be added between the errors and $X_2$ and $X_3$. Those
paths, specified in the diagram as the coefficients c and d, are used
to represent the relationship between the residuals and the endogenous
variables. Those paths are not really unknowns to estimate, for they
represent the unexplained variance (which, for any variable, is
$1 - R^2$). Because the paths squared need to equal the unexplained
variance, they are the square root of that variance;
$c^2 = (1 - r_{12}^2)$, so $c = \sqrt{1 - r_{12}^2}$, and
$d^2 = (1 - r_{13}^2)$, so $d = \sqrt{1 - r_{13}^2}$.

Using the rules for tracing paths (readers should note from this
illustration that the tracing rules work for path models that are not
path analysis models), the relationship between $X_2$ and $X_3$ is

$$r_{32} = p_{21} \times p_{31} + c \times r_{23.1} \times d. \tag{3.12}$$

Initially, this may seem like one equation with four unknowns. But we
can substitute in the correlations where they are equal to paths.
First, the two paths $p_{21}$ and $p_{31}$ are simple regression
coefficients, which, in the standardized case, are correlations,
namely, $r_{12} = p_{21}$ and $r_{13} = p_{31}$. Furthermore, as
explained in the preceding section, $c = \sqrt{1 - r_{12}^2}$ and
$d = \sqrt{1 - r_{13}^2}$. Thus, the equation becomes

$$r_{32} = r_{12} \times r_{13} + \sqrt{1 - r_{12}^2} \times r_{23.1} \times \sqrt{1 - r_{13}^2}. \tag{3.13}$$

Solving for the partial correlation ($r_{23.1}$), the equation becomes

$$r_{23.1} \times \sqrt{1 - r_{12}^2} \times \sqrt{1 - r_{13}^2} = r_{32} - r_{12} \times r_{13},$$

which, specified differently, is

$$r_{23.1} = (r_{32} - r_{12} \times r_{13}) / \left[\sqrt{1 - r_{12}^2} \times \sqrt{1 - r_{13}^2}\right], \tag{3.14}$$

finally coming in Equation 3.14 to the traditional formula for partial
correlation. The logic of the formula is fairly straightforward; it
takes out the effects of a control variable from the relationship
between the two variables whose partial is of interest
($r_{32} - r_{12} \times r_{13}$) and then adjusts the residual
variables back to unit variance by dividing the resulting covariance by
the standard deviations of the residuals ($\sqrt{1 - r_{12}^2}$ and
$\sqrt{1 - r_{13}^2}$). One additional point related to
generalizability of the formula is that higher order partials can be
viewed as partials of partials; they can be extracted using the derived
formula repeatedly to eliminate effects of various variables.

Finally, although it is not apparent from the examples because the same
variables were not used for the two different types of partials, in
fact the numerators of partial correlation and partial regression are
identical. As will be illustrated in an exercise at the end of Chapter
4, however, partial correlation and partial regression coefficients
usually are not the same, even when the same variables are partialed.
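A brief numerical sketch may help here. Using hypothetical correlations (assumed for illustration), it computes the partial correlation from the traditional formula, checks it against the standard identity that obtains partial correlations from the inverse of the correlation matrix (a technique brought in here only as an independent check, not the derivation used in the text), and compares it with the partial regression coefficient that shares its numerator.

```python
import numpy as np

# Hypothetical correlations among X1, X2, and X3 (illustrative only).
r12, r13, r32 = 0.30, 0.50, 0.40

# Shared numerator: association of X2 and X3 with X1 removed.
numerator = r32 - r12 * r13

# Partial correlation of X2 and X3 controlling for X1 (traditional formula).
r23_1 = numerator / (np.sqrt(1 - r12**2) * np.sqrt(1 - r13**2))

# Independent check via the inverse of the full correlation matrix.
R = np.array([[1.0, r12, r13],
              [r12, 1.0, r32],
              [r13, r32, 1.0]])
P = np.linalg.inv(R)
r23_1_check = -P[1, 2] / np.sqrt(P[1, 1] * P[2, 2])

# Partial regression of X3 on X2 controlling for X1: same numerator,
# but divided by the residual variance of X2 only.
b32_1 = numerator / (1 - r12**2)

print(round(r23_1, 4), round(r23_1_check, 4), round(b32_1, 4))
```

With these assumed values the two coefficients share the numerator .25 but differ in size because their denominators differ.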

Peer Popularity and Academic Achievement: An Illustration

Throughout this book, I will try to employ a single data set using a
variety of techniques ranging from path analysis, to panel analysis, to
confirmatory factor analysis, to latent variable SEM. The data set I
use addresses the issues presented in Figure 1.1, exploring the
relationships between peer acceptance and achievement. The matrix used
for the analyses appears later in Table 9.3. In practice, the sample
size for the different analyses would likely vary from analysis to
analysis because selecting different variables in different models
would result in different sample sizes due to missing data. In these
examples, however, a common matrix will be used for all analyses and a
common sample size of 100. Because I am working from matrices that have
precision well beyond the two or three digits that appear in the text,
replication may not produce identical solutions to what I report.

The core question is the relation between acceptance by peers and
academic achievement. That question will be looked at with single
measures of each conceptual variable, both cross-sectionally (path
analysis) and longitudinally (panel analysis). Then it will be
addressed again using multiple measures of each conceptual variable
(latent variable SEM). First, the relationships (correlations) among
the latent variables will be examined through confirmatory factor
analysis. Then, causal relationships among variables will be modeled.
For all illustrations, the data will be analyzed using SEM programs,
with other approaches used as well to show their equivalence.

For this chapter, the illustration focuses on path analysis.

Illustration 1: Cross-Sectional Path Analysis

This model looks at the variables from Figure 9.2 but looks like Figure
3.2 with one exception, namely, that path $p_{54}$ is set to 0 (i.e.,
omitted). The model is specified for path analysis; namely, it is
recursive and has only a single measure of each theoretical variable.
Prior information was used to select the "best" indicator of each
theoretical variable for the path analysis, namely, the Duncan SEI as
the measure of Family Social Class (Duncan), the Peabody PVT as
Academic Ability (Peabody), a semantic differential scale score of
teacher's evaluation of each child (TchrEval), classroom seating
choices by peer nominations for peer popularity (PeerPop), and
performance on a standardized verbal achievement test as the measure of
school achievement (VerbAch). Consistent with Figure 3.2, Duncan and
Peabody are specified to be exogenous and are correlated, and each has
direct paths to all three other variables. TchrEval has direct paths to
PeerPop and VerbAch. Finally, PeerPop and VerbAch are not viewed as
causally related, giving the model a degree of freedom, making it
overidentified. The matrix is as follows:

Matrix to Be Analyzed

            Duncan   Peabody   TchrEval   PeerPop   VerbAch
Duncan       1.00
Peabody       .01     1.00
TchrEval     -.12      .24      1.00
PeerPop       .04      .16       .17      1.00
VerbAch       .09      .31       .30       .08      1.00

The problem can be solved by multiple regression, regressing each
dependent variable on the variables with arrows to it. Readers
interested in building their path analysis skills should try solving
using regression. To make the illustration relevant to later SEM
analyses, this illustration is set up to solve the problem using LISREL
8. (For any earlier version, drop the second to last line, "path
diagram," and the problem can be solved. The output, however, will look
somewhat different.) The control statements for LISREL appear in
Appendix 3.2.

The output from the analyses, the regression coefficients with standard
errors and t values, is as follows:

Regression Coefficients

                         Independent Variables
Dependent variables      Duncan     Peabody    TchrEval
TchrEval                  -.12        .24
                          (.10)      (.10)
                         -1.28       2.53
PeerPop                    .06        .13        .15
                          (.10)      (.10)      (.10)
                          0.56       1.27       1.45
VerbAch                    .12        .25        .25
                          (.09)      (.09)      (.10)
                          1.27       2.66       2.60

NOTE: Standard errors are in parentheses. t values are in rows below
standard errors.
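The regression solution can be approximated directly from the two-digit correlation matrix printed above. This is only a sketch: the helper function name is hypothetical, and because the printed correlations are rounded, the results agree with the reported coefficients only to within rounding error.

```python
import numpy as np

names = ["Duncan", "Peabody", "TchrEval", "PeerPop", "VerbAch"]

# Correlation matrix as printed (rounded to two digits).
R = np.array([
    [1.00, 0.01, -0.12, 0.04, 0.09],
    [0.01, 1.00, 0.24, 0.16, 0.31],
    [-0.12, 0.24, 1.00, 0.17, 0.30],
    [0.04, 0.16, 0.17, 1.00, 0.08],
    [0.09, 0.31, 0.30, 0.08, 1.00],
])


def path_coefficients(dependent, predictors):
    """Standardized regression weights of `dependent` on `predictors`."""
    idx = [names.index(p) for p in predictors]
    dep = names.index(dependent)
    # Solve R_xx * b = r_xy using only the relevant correlations.
    return np.linalg.solve(R[np.ix_(idx, idx)], R[idx, dep])


tchr_eval = path_coefficients("TchrEval", ["Duncan", "Peabody"])
peer_pop = path_coefficients("PeerPop", ["Duncan", "Peabody", "TchrEval"])
verb_ach = path_coefficients("VerbAch", ["Duncan", "Peabody", "TchrEval"])

for label, b in [("TchrEval", tchr_eval), ("PeerPop", peer_pop),
                 ("VerbAch", verb_ach)]:
    print(label, np.round(b, 2))
```

Each dependent variable is regressed only on the variables with arrows to it, so PeerPop is not a predictor of VerbAch in this model.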

As noted earlier, the model has 1 degree of freedom (there is no path
between PeerPop and VerbAch). The fit statistic from LISREL is as
follows:

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 1 DEGREE OF FREEDOM = 0.0070  P=1.00
THE FIT IS PERFECT.
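For readers curious about the reported probability, the tail area of a chi-square with 1 degree of freedom can be computed with the standard library alone, because a 1-df chi-square is the square of a standard normal variable. This is a sketch for checking the printed value, not a reproduction of how LISREL computes it.

```python
import math


def chi_square_p_1df(x: float) -> float:
    """Upper-tail probability of a chi-square with 1 degree of freedom.

    For 1 df, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi_square_p_1df(0.007)
print(round(p_value, 2))
```

The computed probability is very high but falls somewhat short of 1.00.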

The slight optimism in LISREL about overall fit should be noted but
ignored. Note that if this method is appropriate and the model depicts
reality accurately, then the following interpretations can be made:

1. Social class is unimportant for this model.
2. Academic ability is related to both teacher ratings and student
   achievement.
3. None of the variables predicts acceptance by peers in the
   predesegregation classroom (choices from students from similar
   ethnic backgrounds).
4. Teacher ratings also are related to student achievement.
5. Given the modest sizes of the paths, there is much unexplained
   variance in each of the variables.
6. Even though no relation between peer acceptance and achievement is
   hypothesized, the good overall fit shows that no relationship exists
   between the two variables (assuming the model is appropriate).

Chapter Discussion Questions

1. Does the input matrix for path analysis come from regressions? If
   not, then where does it come from?
2. What is the difference in logic between partial correlation and
   partial regression? Is there a reason why one would use partial
   correlation over partial regression?
3. Are there ever reasons to use matrices of partial correlations for
   path analysis, or is the correlation matrix always used?
4. Are the signs and values of nonstandardized regression coefficients
   really meaningful?
5. Is stepwise regression not cheating? Does it not just let the data
   self-select without theoretical basis?
6. Will other SEM techniques be separating relationships between
   variables into the same categories (direct, indirect, common causes,
   and unanalyzed)?
7. Can the matrix form of decomposition be used for models that are not
   just identified (i.e., the degrees of freedom are more than the
   number of paths)?
8. Are analyses of variance ever used in path analysis, or will
   regression always be used?

EXERCISE 3.1

Another Path Analysis Illustration

Look at the model that appears in Figure 3.4. That diagram was
constructed using the program AMOS, which is very easy to use to
produce high-quality diagrams.

A. Use information contained in the following regression equations
   to solve for the path coefficients.
B. Use the regression equations to decompose effects into direct,
   indirect, and noncausal (including spurious) regression equations.

Figure 3.4. Path Analysis Illustration
[Path diagram with variables Social Class (X1), Family Size, Ability
(X3), Self-esteem (X4), and School Ach (X5); one path is labeled p31.]

Regression Equations

DV    IV     WT
X3    X1     .32
      X2    -.23
X4    X1     .11
      X2    -.11
X4    X1     .06
      X2    -.07
      X3     .14
X5    X1     .38
      X2    -.15
X5    X1     .19
      X2    -.02
      X3     .59
X5    X1     .19
      X2    -.02
      X3     .58
      X4     .08

NOTE: DV = dependent variable; IV = independent variable; WT = regression weight.

Correlations

        X1      X2      X3      X4      X5
X1    1.00
X2    -.33    1.00
X3     .39    -.33    1.00
X4     .14    -.14     .19    1.00
X5     .43    -.28     .67     .22    1.00
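
As a way of checking work on the exercise, the following is a minimal sketch (not part of the original exercise) of how standardized regression weights can be recovered from a correlation matrix. It assumes the correlations are as tabled above; the helper name `std_weights` and the use of NumPy are illustrative choices only.

```python
import numpy as np

# Correlations among X1..X5, copied from the exercise's correlation table.
R = np.array([
    [1.00, -.33,  .39,  .14,  .43],
    [-.33, 1.00, -.33, -.14, -.28],
    [ .39, -.33, 1.00,  .19,  .67],
    [ .14, -.14,  .19, 1.00,  .22],
    [ .43, -.28,  .67,  .22, 1.00],
])

def std_weights(R, dv, ivs):
    """Standardized weights for regressing variable `dv` on `ivs` (0-based indices)."""
    R_xx = R[np.ix_(ivs, ivs)]   # correlations among the predictors
    r_xy = R[ivs, dv]            # correlations of the predictors with the DV
    return np.linalg.solve(R_xx, r_xy)

w_x3 = std_weights(R, dv=2, ivs=[0, 1])         # X3 on X1 and X2
w_x5 = std_weights(R, dv=4, ivs=[0, 1, 2, 3])   # X5 on X1 through X4
print(np.round(w_x3, 2))
print(np.round(w_x5, 2))
```

Changing `dv` and `ivs` gives the weights for any other equation, which can then be compared with the regression weights listed for the exercise.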

APPENDIX 3.1

Path Modeling Notations

BOXES are used to describe observed measures. Observed measures
sometimes are called indicators.

CIRCLES are used to describe theoretical variables. Other terms that
are used are latent variables, unmeasured variables, and constructs.

This ARROW, whether between two boxes or two circles, represents a
causal relationship from a causal variable to an effect.

This ARROW, which also can connect two boxes or two circles,
represents a noncausal relationship between two variables.

This ARROW, which does not originate from a box or circle, represents
a residual to a measure or variable.

This ARROW represents a covariance between two residuals.

APPENDIX 3.2

LISREL 8 Setup for Figure 3.4

Readers should refer to a LISREL program manual to understand each of
the symbols. Bracketed statements are not part of the program, but
they provide description.

Mexican American data for peer acceptance, class illustration
DA NI=5 NO=100 MA=KM
KM SY FO FI=a:MAcsecmt.rx
(8F10.7)
[This assumes that the matrix that appears above is on the A drive
and is called MAcsecmt.rx and that each element covers a 10-column
field. The mysterious name is my idiosyncratic attempt at abbreviation
of Mexican American cross-sectional matrix.]
MO NY=5 NE=5 LY=id BE=fu,fi PS=sy,fi TE=di,fi
FR BE 3 1 BE 3 2 BE 4 1 BE 4 2 BE 4 3 BE 5 1 BE 5 2 BE 5 3 C
PS 2 1 PS 3 3 PS 4 4 PS 5 5
st 1.0 PS 1 1 PS 2 2
path diagram
OU PT SE TV AD=OFF

Effects of Collinearity

As stated earlier in Chapter 2, interrelationships among predictor
variables in regression models are both the things that make multiple
regression and structural equation modeling (SEM) in general so
interesting and the source of a number of problems. In the simplest
case, if one has an array of predictor variables that are unrelated to
one another, then the coefficients from multiple regression are reduced
to simple bivariate regression coefficients and interpretation of those
coefficients is straightforward. By contrast, if predictors are
interrelated, then issues of partitioning of variance become important
and interesting, and the mathematics becomes more than inspection of a
correlation or covariance matrix. As is discussed in more detail later
in this chapter, the partial regression coefficients have to spread the
common variance among predictor variables across the set of predictors.
Finally, if the correlations among predictors become too large, then
the solution from regression analyses potentially becomes unstable and
individual coefficients can change dramatically and go from strongly
significant to nonsignificant across even nearly identical samples.

This chapter focuses on problems that can occur when the predictor
variables in multiple regression are strongly related. Those problems

usually are called problems of multicollinearity. Regression and other
structural equation approaches cannot be used appropriately and
effectively unless collinearity effects are well understood. It is
important that structural equation approaches can help deal with some
cases where the correlations among predictors are large. For example,
having to label conceptual variables and operationalize them in path
diagrams should prevent researchers from including two variables that
measure the same conceptual variable as predictors. In path models,
they might be combined or one would be dropped; in latent variable
models described later in this book, the two variables would together
define a single conceptual variable. Although the latent variable
approach is preferable, in either case their high relationship and
redundant relationships with other variables would be removed from the
regression equation. Although latent variable approaches help in most
instances by removing measurement and specification error from
variables, they ironically may make high collinearity appear in cases
where it previously has not been a problem. Problems seem most likely
to emerge for variables that change a lot when they are included in
latent variable models, for example, those assessed by measures with
low reliability, that are difficult to assess or have been poorly
operationalized (the result is that the variable actually measured is
not what is intended to be measured), or that have been imprecisely
conceptualized and are not conceptually distinct from other variables
in the model.
Issues of collinearity or multicollinearity and of biased estimation
(often called ridge regression or reduced variance regression [e.g.,
Darlington, 1978]) to address collinearity are discussed in this
chapter. Ridge estimation is discussed briefly because it is an option
in some of the structural equation programs (e.g., LISREL). Matrix
algebra concepts are used. (Appendix 3.1 provides an introduction to
matrix algebra.) They greatly facilitate explanation of collinearity
issues and will be useful at various points throughout the book to
explain concepts and approaches. Readers who have taken regression
courses that cover collinearity issues and matrix algebra should have
been exposed to the issues addressed here and may choose to skip this
chapter. For additional information on regression, see, for example,
Darlington (1990).

Regression and Collinearity

As suggested in the preceding section, in virtually all instances where
regression approaches are used, the variables collected will be
intercorrelated with one another. Uncorrelated predictor variables can
be found primarily in experimental research when experimenters, by
ensuring that the cell sizes for the various conditions are equal,
produce orthogonal or uncorrelated effects. In such circumstances, if
regression approaches are used to analyze the data (which is done in
general linear model approaches to statistics), then the analyses are
straightforward and simple to explain. Each effect is independent of
all other effects; the independence extends as well to interactions
between predictor variables. (Multiplying together two standardized
variables that are independent of one another yields a third variable
that is uncorrelated with the other two.) Therefore, total variance
accounted for in any dependent variable is the sum of the independent
effects, and the multiple regression coefficients are the simple
regression coefficients, which, in the standardized case, are the
correlations.

By contrast, if in experimental research it turns out that cell sizes
are unequal, then one has to make a new decision in selecting the
analyses used because the independent variables no longer are
independent of one another. Even though the total variance accounted
for in the dependent variable does not change, different ways of
ordering the extraction of effects lead to different interpretations of
the sizes of individual effects, the same problem encountered by
researchers conducting nonexperimental research and using regression
approaches.
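
The contrast between equal and unequal cell sizes can be seen in a small sketch. It assumes a hypothetical 2 x 2 design with effect-coded (-1/+1) factors; the cell sizes and the helper name `effect_codes` are made up for illustration and do not come from any study discussed here.

```python
import numpy as np

def effect_codes(cell_counts):
    """Effect-coded columns (A, B, A x B) for a 2 x 2 design with the given cell sizes."""
    rows = []
    for (a, b), n in cell_counts.items():
        rows.extend([(a, b, a * b)] * n)   # one row per participant in the cell
    return np.array(rows, dtype=float)

# Hypothetical designs: equal cell sizes versus unequal cell sizes.
balanced = effect_codes({(-1, -1): 10, (-1, 1): 10, (1, -1): 10, (1, 1): 10})
unbalanced = effect_codes({(-1, -1): 16, (-1, 1): 4, (1, -1): 8, (1, 1): 12})

corr_balanced = np.corrcoef(balanced, rowvar=False)
corr_unbalanced = np.corrcoef(unbalanced, rowvar=False)
print(np.round(corr_balanced, 3))     # off-diagonal entries are zero
print(np.round(corr_unbalanced, 3))   # factors A and B are now correlated
```

With equal cell sizes the two main effects and their interaction are mutually uncorrelated; once the cell sizes differ, the predictors overlap and the order in which effects are extracted begins to matter.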
Correlated independent variables are the typical case for
nonexperimental research and for multiple regression techniques. Thus,
the challenge for regression approaches is to partition common variance
among the various predictor variables. Although regression approaches
partition variance in logical ways, the techniques cannot perform magic
such as uniquely assigning variance to particular predictor variables,
let alone identifying "true causes" (see, e.g., Goldberger, 1964). What
the approaches can do is spread common variance across correlated
predictor variables. Problems emerge primarily when the correlations
get substantial. (For a listing of suggestions about when the
correlations are "too big" [i.e., when collinearity may be a problem],
see Table 4.1. Unfortunately, there is no simple rule to define when
one should worry about collinearity.) In the extreme case where two
variables are identical, there is no mathematical solution to a
multiple regression problem because variance cannot be partitioned. In
more moderate cases, as is illustrated in this chapter, a mathematical
solution is possible, but it can be unstable, sometimes defying
interpretation (collinearity has given rise to the term bouncing betas
to describe coefficients that change signs or "bounce" through the zero
point) and yielding solutions that cannot be trusted. (For a discussion
of stability of regression coefficients, see, e.g., Green, 1977.)
Why collinearity causes problems in regression can be illustrated from
the general matrix form of the regression equation, namely,
$Y = XA + E$. Readers unfamiliar with matrix notation may want to look
back at Appendix 3.1. For the illustration, standardization of
variables is assumed, so the metric will be one of correlations rather
than covariances.

To illustrate solving for regression models, Figure 3.2 is used once
again. For this illustration, the equation for $X_4$ is used, and we
are trying to solve for the regression coefficients for $X_4$, namely,
$p_{41}$, $p_{42}$, and $p_{43}$. The equation is

$$X_4 = X_1 p_{41} + X_2 p_{42} + X_3 p_{43} + e. \tag{4.1}$$

Equation 4.1 does not provide enough information to solve for the
unknown regression weights, for the equation has three coefficients to
estimate. Additional information can be brought to bear by multiplying
the equation by $X_1$, then by $X_2$, and then by $X_3$, producing three

6. For researchers analyzing their experimental data using multivariate
analysis of variance (MANOVA) approaches, the same type of problem can
occur if the various dependent variables in the MANOVA are highly
intercorrelated; their collinearity can lead to an overall significance
level that is misleading. For example, a colleague and I found a
nonsignificant MANOVA effect in a study where each of the nine
dependent variables' univariate ANOVA effects was significant (Maruyama
& Miller, 1980). Because the nine measures all contained the same
information (i.e., were unidimensional), the canonical correlation
solution produced by the MANOVA program was nonsignificant. We solved
our problem by taking a single linear composite for our dependent
variable. Its effect was highly significant.

TABLE 4.1 Ways of Detecting Multicollinearity

1. When the variance (standard errors) in beta weights is large.
2. When signs on beta weights are inappropriate.
3. When regression weights change radically due to the inclusion or
   exclusion of single variables.
4. When the determinant of the correlation matrix of the predictor
   variables approaches zero.
5. When a factor analysis of the predictor variables yields a very
   large "condition number," where the condition number is defined as
   the square root of the ratio of the largest eigenvalue to the
   smallest eigenvalue. (An eigenvalue is the amount of variance
   explained by each factor, expressed in a correlational metric so
   that an eigenvalue of 1 means that a factor accounts for as much
   variability as one variable.) There is not perfect agreement on
   rules of thumb for condition number; both 30 and 100 have been
   suggested.
6. When one or more eigenvalues approach zero.
7. When the "variance inflation factors" (VIFs), defined as the
   diagonal elements of the inverse of the correlation matrix, get
   large. Those elements are 1 / (1 - R²), where R² is the amount of
   variance in each predictor variable that can be explained by the
   other predictor variables. A suggested rule here for VIFs is that
   none should be greater than 6 or 7.
8. When simple correlations are greater than .80 or .90.
9. When simple correlations between two predictor variables are greater
   than the R² of all the predictor variables with the dependent
   variable.

NOTE: These suggestions come from a variety of sources, so some are more liberal than others.

equations in three unknowns in terms of the correlations. The resulting
equations are

$$(X_1 X_4) = (X_1 X_1)p_{41} + (X_1 X_2)p_{42} + (X_1 X_3)p_{43} + (X_1 e) \tag{4.2}$$

$$(X_2 X_4) = (X_2 X_1)p_{41} + (X_2 X_2)p_{42} + (X_2 X_3)p_{43} + (X_2 e) \tag{4.3}$$

$$(X_3 X_4) = (X_3 X_1)p_{41} + (X_3 X_2)p_{42} + (X_3 X_3)p_{43} + (X_3 e). \tag{4.4}$$

Taking expected values, the terms in parentheses can be expressed as
correlations. Because the correlation of the errors with variables is
zero, the final term drops out in each equation, yielding

$$r_{14} = (r_{11})p_{41} + (r_{12})p_{42} + (r_{13})p_{43} \tag{4.5}$$

$$r_{24} = (r_{21})p_{41} + (r_{22})p_{42} + (r_{23})p_{43} \tag{4.6}$$

$$r_{34} = (r_{31})p_{41} + (r_{32})p_{42} + (r_{33})p_{43}. \tag{4.7}$$

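In matrix form, the equations are

$$\begin{bmatrix} r_{14} \\ r_{24} \\ r_{34} \end{bmatrix} =
\begin{bmatrix}
(r_{11})p_{41} + (r_{12})p_{42} + (r_{13})p_{43} \\
(r_{21})p_{41} + (r_{22})p_{42} + (r_{23})p_{43} \\
(r_{31})p_{41} + (r_{32})p_{42} + (r_{33})p_{43}
\end{bmatrix},$$

which is the same as

$$\begin{bmatrix} r_{14} \\ r_{24} \\ r_{34} \end{bmatrix} =
\begin{bmatrix}
r_{11} & r_{12} & r_{13} \\
r_{21} & r_{22} & r_{23} \\
r_{31} & r_{32} & r_{33}
\end{bmatrix}
\begin{bmatrix} p_{41} \\ p_{42} \\ p_{43} \end{bmatrix}.$$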

When the elements of the matrices are expressed in terms of the X and Y
variables and the regression weights (A), they are equivalent to

$$X'Y = X'XA. \tag{4.8}$$

In other words, in matrix terms, what the previous operations did was
premultiply the equation presented at the start of this section,
$Y = XA + E$, by the transpose of X. As just noted, there would be an
X'E matrix/vector, but it drops out because by definition it contains
only zeros. The expected value of X'Y is the correlations of the X's
with the dependent variable (in this case $r_{14}$, $r_{24}$, and
$r_{34}$), whereas the expected value of X'X is the intercorrelations
among the X's ($r_{11}$ to $r_{33}$) and A is the regression weights
($p_{41}$, $p_{42}$, and $p_{43}$).

To solve for A, X'X needs to be eliminated from the right side of the
equation. That is accomplished by doing the matrix equivalent of
dividing both sides by X'X, namely, multiplying X'X by its inverse. The
notation for the inverse of X'X is $(X'X)^{-1}$. Because each side has
to be multiplied by the same quantity, the resulting equation is

$$(X'X)^{-1}(X'Y) = (X'X)^{-1}(X'X)A = A. \tag{4.9}$$

The quantity $(X'X)^{-1}(X'X)$ is an identity matrix and drops out. It
is the matrix equivalent of the scalar number 1; when an identity
matrix is multiplied by any other matrix, the result is that other
matrix.

In effect, then, the regression coefficients are estimated by
multiplying the correlation or covariance matrix containing the
relations of the independent variables with the dependent variables
(X'Y) by the inverse of the correlation/covariance matrix containing
the relations among the independent variables ($[X'X]^{-1}$). The first
important point is that if one or more of the X's are perfect linear
combinations of other X's, then X'X is singular, which means that it
can have no inverse, so there can be no solution for the regression
weights. The other extreme is where the independent variables are
uncorrelated; then the matrix X'X is an identity matrix with 1's on the
diagonal (the diagonals would be variances if we were working with
covariances) and all other elements are 0, and it is the same as its
inverse. In such a case, X'Y = A, which means that the regression
coefficients are the correlations.
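
The three situations just described (uncorrelated predictors, correlated predictors, and perfectly collinear predictors) can be tried out numerically. The sketch below is only an illustration: the correlation values are hypothetical, and the helper name `regression_weights` is not from any program discussed in this book.

```python
import numpy as np

def regression_weights(R_xx, r_xy):
    """Solve (X'X) A = X'Y for A when both sides are expressed as correlations."""
    return np.linalg.solve(R_xx, r_xy)

r_xy = np.array([0.5, 0.5, 0.5])   # hypothetical correlations with the dependent variable

# Uncorrelated predictors: X'X is an identity matrix, so A equals the correlations.
uncorrelated = np.eye(3)
weights_uncorrelated = regression_weights(uncorrelated, r_xy)

# Correlated predictors: the common variance is spread across the predictors.
correlated = np.array([[1.0, 0.7, 0.7],
                       [0.7, 1.0, 0.7],
                       [0.7, 0.7, 1.0]])
weights_correlated = regression_weights(correlated, r_xy)

# Two identical predictors: X'X is singular (rank 2 rather than 3), so it has
# no inverse and no unique set of regression weights exists.
singular = np.array([[1.0, 1.0, 0.2],
                     [1.0, 1.0, 0.2],
                     [0.2, 0.2, 1.0]])
rank_singular = np.linalg.matrix_rank(singular)

print(weights_uncorrelated)
print(np.round(weights_correlated, 3))
print(rank_singular)
```

In the correlated case each weight is well below the simple correlation of .5, even though every predictor relates to the dependent variable exactly as strongly as before.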
Of most importance, however, is not the limiting conditions of
independence or perfect collinearity but rather those between the
extremes. Regression coefficients are a function of the correlations of
the X's not only with the dependent variables but also with each other,
and those relationships with each other are the causes of collinearity
problems.

Fortunately, there are fairly straightforward ways of examining extent
of collinearity. The easiest require inspecting the inverse of the
correlation matrix of predictor variables. The diagonal elements
provide information about collinearity of each predictor variable with
the rest of the predictors; for a correlation matrix, they are
(1 / [1 - R²]). Thus, when the squared multiple correlation of a
predictor with the others gets large, the diagonal element of the
inverse also gets large. In Table 4.1, Point 7 for detecting
multicollinearity, the diagonal element of the inverse is called the
variance inflation factor, and a rule of thumb for large diagonal
elements is given. Inverses can be obtained from most factor analysis
programs, which invert the correlation matrix as a starting point for
iterative principal factors solutions. (The appropriate correlation
matrix to examine includes only the predictors.)
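
Several of the diagnostics listed in Table 4.1 can be computed directly from the predictor correlation matrix. The following is a rough sketch under stated assumptions: the matrix is hypothetical (three predictors intercorrelating .9), and the function name and dictionary keys are illustrative, not taken from any particular program.

```python
import numpy as np

def collinearity_diagnostics(R_xx):
    """A few collinearity diagnostics for a predictor correlation matrix."""
    eigenvalues = np.linalg.eigvalsh(R_xx)     # returned in ascending order
    vifs = np.diag(np.linalg.inv(R_xx))        # diagonal of the inverse: 1 / (1 - R^2)
    return {
        "determinant": np.linalg.det(R_xx),
        "smallest_eigenvalue": eigenvalues[0],
        "condition_number": np.sqrt(eigenvalues[-1] / eigenvalues[0]),
        "vifs": vifs,
        "r_squared_with_others": 1.0 - 1.0 / vifs,
    }

# Hypothetical predictor matrix: three predictors that intercorrelate .9.
R_xx = np.full((3, 3), 0.9)
np.fill_diagonal(R_xx, 1.0)

diagnostics = collinearity_diagnostics(R_xx)
for name, value in diagnostics.items():
    print(name, np.round(value, 3))
```

For this matrix the variance inflation factors fall in the neighborhood of the "6 or 7" rule of thumb even though the condition number is far below 30, which shows how the different suggestions in Table 4.1 can disagree.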
Although there are many ways of illustrating the impact of the
correlations among predictor variables, I try by example. The following
example draws heavily from Robert A. Gordon's illustrations in his 1968
American Journal of Sociology article, "Issues in Multiple Regression."

Illustrating Effects of Collinearity

This section is built around a hypothetical correlation matrix of 10
variables. Of these variables, 4 measure one construct, 3 measure a
second, 2 measure a third, and a single variable measures the fourth
construct. Interpretation of the results would be the same if the first
4 variables measured one set of highly related constructs, 3 measured a
second set of related constructs, and so forth.

Following the logic of Gordon (1968), all the within-construct
correlations are .7, the cross-construct correlations are .2, and all
the correlations with the dependent measure are .5. They appear in
Table 4.2.
The questions of interest center around interpretation of results from
multiple regression. Inspection of the correlation matrix should
suggest a number of conclusions, namely, that the constructs seem well
defined (based on the within-construct correlations), that each is
related moderately to the dependent variable (the .5 correlations), and
that the predictor constructs are not very highly interrelated. The
primary issue here is what happens if all 10 variables are entered into
the regression equation rather than using composite variables or latent
variable approaches that employ multiple measures of each construct.

Imagine, for example, that researchers are collecting survey data from
a large sample and that they are searching for "new" predictor
variables that account for variance that has not been accounted for
previously by other predictors. They decide to operationalize the
constructs underlying the new predictors in several ways; after all, if
the variable is elusive (and it must be given that others have not been
able to either identify or define it in ways that have allowed it to
add to prediction), then they want to measure it effectively.
Furthermore, they may want to show that the various measures converge
to define a single construct. Finally, because they are concerned about
construct validity, they want to show that that construct is related to
other constructs in predicted ways. Thus, they include (smaller numbers
of) "more traditional" variables that have previously been reported to
predict the dependent variable. If they do a regression analysis and
enter all the predictor variables to see which variables "come through"
and predict the dependent variable, then a situation such as the one
illustrated will have been created because there will be multiple
measures of the constructs of "greatest interest" and fewer measures of
traditional or well-established predictors.

A second circumstance might occur if researchers differentially sampled
from different sets of domains, choosing four variables of one type,
three of a second, and so forth. Differentially sampling

TABLE 4.2 Artificial Correlation Matrix

       A1    A2    A3    A4    B1    B2    B3    C1    C2    D1     Y
A1    1.0
A2     .7   1.0
A3     .7    .7   1.0
A4     .7    .7    .7   1.0
B1     .2    .2    .2    .2   1.0
B2     .2    .2    .2    .2    .7   1.0
B3     .2    .2    .2    .2    .7    .7   1.0
C1     .2    .2    .2    .2    .2    .2    .2   1.0
C2     .2    .2    .2    .2    .2    .2    .2    .7   1.0
D1     .2    .2    .2    .2    .2    .2    .2    .2    .2   1.0
Y      .5    .5    .5    .5    .5    .5    .5    .5    .5    .5   1.0

might result if one were to "throw in" a variable or two on a whim or
as a last-minute addition. One might imagine, for example, including a
variable such as "birth order" because it seems intuitively
interesting. If there is only a single indicator of birth order and
many indicators of other variables, then the situation could readily
occur.

Table 4.3 shows what happens when the various indicators are entered
into multiple regression equations. In the illustration, the variables
assessed by three or four different measures all are nonsignificant,
whereas the ones with fewer measures contribute significantly to
prediction. The lower parts of the table show what happens when each
subset is excluded from the group of predictor variables; in this
illustration, however, the changes are not major. Ironically, what
comes through consistently in the illustration is that what makes a
measure a significant predictor is not having other measures that
assess the same underlying variable that it does. Gordon (1968) called
the problem caused by differences in number of indicators
repetitiveness.
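
The weights in the first panel of Table 4.3 follow from the correlations in Table 4.2. The sketch below shows one way to reproduce them; it assumes only the tabled correlations (.7 within constructs, .2 across constructs, .5 with Y), and the helper name `gordon_matrix` is an illustrative label rather than anything used by Gordon (1968). Standard errors are not computed because they depend on the sample size.

```python
import numpy as np

def gordon_matrix(sizes=(4, 3, 2, 1), within=0.7, between=0.2):
    """Predictor correlation matrix with equal within- and cross-construct correlations."""
    k = sum(sizes)
    R = np.full((k, k), between)
    start = 0
    for n in sizes:
        R[start:start + n, start:start + n] = within   # block for one construct
        start += n
    np.fill_diagonal(R, 1.0)
    return R

R_xx = gordon_matrix()              # A1-A4, B1-B3, C1-C2, D1
r_xy = np.full(10, 0.5)             # every predictor correlates .5 with Y

betas = np.linalg.solve(R_xx, r_xy)   # standardized regression weights
r_squared = float(betas @ r_xy)       # squared multiple correlation

print(np.round(betas, 3))
print(round(r_squared, 3))
```

Although all 10 predictors have identical correlations with Y, the weight each receives depends only on how many other measures share its construct, which is the repetitiveness problem described above.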
A second issue discussed by Gordon (1968) is what happens when the
correlations of the predictor variables with the criterion variable are
not uniform but instead vary. Again, as an illustration following the
spirit of his article, in Table 4.4 one of the correlations with the
criterion variable is changed slightly and the solutions are
reestimated. The full 10-variable array of predictors is included in this

TABLE 4.3 Regression Analyses Based on Table 4.2

                    A1      A2      A3      A4      B1      B2      B3      C1      C2      D1

Regression analyses using all 10 predictor variables
Y                 .097    .097    .097    .097    .124    .124    .124    .172    .172    .279
Standard error    .090    .090    .090    .090    .086    .086    .086    .079    .079    .059
t value          1.081   1.081   1.081   1.081   1.441   1.441   1.441   2.178   2.178   4.747
Residual variance = .308
Squared multiple correlation = .692

Regression analyses omitting the single indicator variable (D1)
Y                 .109    .109    .109    .109    .140    .140    .140    .193    .193
Standard error    .100    .100    .100    .100    .095    .095    .095    .087    .087
t value          1.098   1.098   1.098   1.098   1.465   1.465   1.465   2.216   2.216
Residual variance = .379
Squared multiple correlation = .621

Regression analyses omitting the two indicator variables (C1, C2)
Y                 .113    .113    .113    .113    .144    .144    .144                    .324
Standard error    .102    .102    .102    .102    .098    .098    .098                    .066
t value          1.104   1.104   1.104   1.104   1.473   1.473   1.473                   4.898
Residual variance = .397
Squared multiple correlation = .603

Regression analyses omitting the three indicator variables (B1, B2, B3)
Y                 .114    .114    .114    .114                            .202    .202    .328
Standard error    .103    .103    .103    .103                            .090    .090    .067
t value          1.107   1.107   1.107   1.107                           2.235   2.235   4.915
Residual variance = .406
Squared multiple correlation = .594

Regression analyses omitting the four indicator variables (A1, A2, A3, A4)
Y                                                 .147    .147    .147    .203    .203    .331
Standard error                                    .099    .099    .099    .091    .091    .067
t value                                          1.479   1.479   1.479   2.238   2.238   4.925
Residual variance = .411
Squared multiple correlation = .589

regression illustration. In each of the four variations illustrated,
one correlation is increased; different illustrations vary correlations
in different constructs. The magnitude of the increase is only from .50
to .55, well within the confidence interval for a correlation for most
sample sizes. (A discussion of confidence intervals for correlations
appears later in this chapter.) Note that when a correlation is
increased in either the three- or four-indicator construct, that
indicator becomes significant along with indicators from the two- and
one-indicator constructs.
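
The sensitivity of the weights to a single small change can be checked in the same way. This sketch, again assuming only the correlations in Table 4.2, raises the correlation of the first four-indicator measure (A1) with Y from .50 to .55 and re-solves for the weights; the names used are illustrative.

```python
import numpy as np

def gordon_matrix(sizes=(4, 3, 2, 1), within=0.7, between=0.2):
    """Predictor correlation matrix with equal within- and cross-construct correlations."""
    k = sum(sizes)
    R = np.full((k, k), between)
    start = 0
    for n in sizes:
        R[start:start + n, start:start + n] = within
        start += n
    np.fill_diagonal(R, 1.0)
    return R

R_xx = gordon_matrix()

r_base = np.full(10, 0.5)        # all correlations with Y equal .50
r_bumped = r_base.copy()
r_bumped[0] = 0.55               # A1's correlation with Y raised to .55

betas_base = np.linalg.solve(R_xx, r_base)
betas_bumped = np.linalg.solve(R_xx, r_bumped)

print(np.round(betas_base, 3))
print(np.round(betas_bumped, 3))
```

The weight for A1 more than doubles while the weights for the other three A measures shrink, even though only one correlation moved by .05.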
Most important for this book, inspection of the correlations shows that
modest changes in the magnitude of correlations (of a magnitude that
would occur due to sampling fluctuations) can markedly change the
interpretation of regression coefficients. (For more illustrations of
the effects of issues such as the ones covered in this chapter, readers
can refer to Gordon [1968]. Gordon also illustrates what happens as the
correlations within constructs increase, a situation he labels as
redundancy, for the predictors then contain more redundant
information.)

These illustrations are important for this book for two reasons. The
first and obvious one is that they point out weaknesses inherent in
multiple regression. "Bad" decisions in the selection of predictors for
inclusion in regression equations can produce misleading (or at least
difficult to replicate) results, as can sampling fluctuations in the
size of correlations. Second, and of greater importance, is that
multiple indicators and latent variable approaches minimize the
problems described in the preceding by eliminating differential
repetitiveness (each conceptual variable appears only once in a
regression equation) and by adjusting for differential reliability of
measures.

Confidence Intervals for Correlations

Relatively few researchers seem to have much experience in estimating
confidence intervals for correlations. The lack of experience in such
estimation may result because significance of correlations typically is
determined by computer programs that correlate variables and because
there are tables in many statistics books that provide significance
information on correlations. It also may occur because estimating
confidence intervals is fairly complex. Finally, the resulting
confidence intervals are nonsymmetric, which makes them more difficult
to explain or understand. Regardless of the cause, the shortcoming is
ironic given that confidence intervals provide the best information on
expected fluctuations in correlations across samples. A recent

TABLE 4.4 Variation on the Regression Analyses From Table 4.3, Increasing a Single Relationship With the Criterion Variable From .50 to .55

                    A1      A2      A3      A4      B1      B2      B3      C1      C2      D1

Increasing a correlation in the first set of predictors
Y                 .227    .060    .060    .060    .123    .123    .123    .170    .170    .277
Standard error    .087    .087    .087    .087    .084    .084    .084    .077    .077    .057
t value          2.592   0.686   0.686   0.686   1.468   1.468   1.468   2.218   2.218   4.834
Residual variance = .292
Squared multiple correlation = .708

Increasing a correlation in the second set of predictors
Y                 .096    .096    .096    .096    .243    .076    .076    .170    .170    .276
Standard error    .087    .087    .087    .087    .083    .083    .083    .077    .077    .057
t value          1.102   1.102   1.102   1.102   2.911   0.915   0.915   2.220   2.220   4.841
Residual variance = .290
Squared multiple correlation = .710

Increasing a correlation in the third set of predictors
Y                 .096    .096    .096    .096    .122    .122    .122    .272    .105    .275
Standard error    .087    .087    .087    .087    .083    .083    .083    .076    .076    .057
t value          1.105   1.105   1.105   1.105   1.473   1.473   1.473   3.575   1.382   4.852
Residual variance = .286
Squared multiple correlation = .714

Increasing the correlation in the fourth set of predictors
Y                 .095    .095    .095    .095    .121    .121    .121    .167    .167    .335
Standard error    .085    .085    .085    .085    .082    .082    .082    .075    .075    .056
t value          1.110   1.110   1.110   1.110   1.481   1.481   1.481   2.237   2.237   5.998
Residual variance = .278
Squared multiple correlation = .722

article by Olkin and Finn (1995) provides expressions for confidence
intervals for simple, partial, and multiple correlations.

Calculating confidence intervals requires converting correlations to
Fisher's z (which, for those who care to know, is the hyperbolic
arctangent of the correlation), calculating confidence intervals for
the z, and converting the z's for the upper and lower limits of the
confidence interval back to r's. One formula for converting a
correlation to Fisher's z is

$$z = \tfrac{1}{2}\,[\log_e(1 + r) - \log_e(1 - r)]. \tag{4.10}$$

Many statistics books contain tables with r to Fisher z conversions. It
turns out that for small correlations (less than .25), the z
approximates the correlation; however, as the correlation increases
beyond .25, the two diverge, with the z increasing more rapidly. The
standard error for correlations is calculated from the sample size
using the formula

$$\text{Standard Error} = \frac{1}{\sqrt{N - 3}}, \tag{4.11}$$

wher e is th e sampl e size . The n multiplyin g th e standar d erro r time s


th e scor e valu e for th e probabilit y leve l give s th e confidenc e interval .
To illustrate, imagine that we want to determine the confidence
interval around a correlation of .50 for sample sizes of 100 and 500.
The Fisher z for r = .50 is .549, and the respective standard errors
are (1 / sqrt[97]) = .1015 and (1 / sqrt[497]) = .0449. Choosing
a probability level of .05 (two-tailed), the appropriate z score is
1.96, and the confidence intervals for Fisher's z become .549 ±
(1.96)(.1015), or .549 ± .199, and .549 ± (1.96)(.0449), or .549 ±
.088. For the sample size of 100, the confidence interval z scores range
from .350 to .748, equivalent r's being .336 and .634; for a sample
of 500, the z score interval is from .461 to .637, equivalent r's being
.430 and .560. The example illustrates the importance of having large
samples; with a sample of 100, correlations should be expected to
fluctuate markedly across samples. The lack of symmetry in the
confidence interval expressed in correlations should be apparent from
the illustration. Most important for the collinearity example
illustrated in Table 4.4, a fluctuation in a correlation from .50 in one
sample to .55 falls within the bounds of sampling error, even for a
sample of size 500. Thus, the differences among the first three
solutions presented in Table 4.4 may result from modest fluctuations in
correlations across samples, yet the interpretations about important
predictor variables would change markedly. In other words, drawing
meaning from significant predictors in regression analyses is risky
business.
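
The confidence interval calculation above can be sketched in a few lines of Python. This is only an illustration of Equations 4.10 and 4.11, not part of the original text; the function name `fisher_ci` and its parameters are assumptions made for the example, and the back-transformed limits it prints may differ from the rounded values in the text in the third decimal place.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Confidence interval for a correlation r from a sample of size n.

    Converts r to Fisher's z (Equation 4.10), builds the interval on the
    z scale with the standard error from Equation 4.11, and converts the
    limits back to correlations.
    """
    z = 0.5 * (math.log(1 + r) - math.log(1 - r))  # equals math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    z_lo = z - z_crit * se
    z_hi = z + z_crit * se
    # tanh is the inverse of the Fisher transformation
    return math.tanh(z_lo), math.tanh(z_hi)


for n in (100, 500):
    lo, hi = fisher_ci(0.50, n)
    print(f"N = {n}: r from {lo:.3f} to {hi:.3f}")
```

Because the interval is symmetric on the z scale but not on the r scale, the limits are not equally distant from .50, which is the asymmetry the text points out.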
So what is one to do if faced with collinearity problems? Are there
ways of stabilizing solutions across samples? There are techniques,
such as jackknifing, in which values for a set of predictors are
reestimated repeatedly, each time dropping one predictor until all
have been omitted. Substantial changes in the regression weights by
omission of single variables point to collinearity. In addition, there
are ways of addressing collinearity that still use basic regression
approaches, including eliminating some of the variables from the
regression equation, combining variables that represent single
constructs (i.e., using composite variables), and increasing sample size to
increase one's confidence in the sample estimates.

Furthermore, as a general principle, in large samples one should
randomly split the sample and cross-validate the findings (i.e.,
estimate a solution on half of the sample and see whether it can be
replicated in the other half) (see, e.g., Cudeck & Browne, 1983).
Although cross-validation is valuable for many reasons (e.g., to allow
some post hoc model changes), consistency across the two samples
argues against sampling fluctuations producing misleading results.
Finally, one could abandon traditional regression and its reliance on
unbiased estimates and instead use a set of methods known as ridge
regression, reduced variance regression, or ridge estimation. These are
discussed because some structural equation programs (e.g., LISREL)
include ridge estimation.

Ridge or Reduced Variance Regression

Ridge estimation techniques (e.g., Darlington, 1978) provide a means
of stabilizing the solution for a collinear prediction model. In those
approaches, a more stable solution is attained by adding a small
constant to the elements of the diagonal of the correlation matrix.
The first challenge for ridge approaches is to introduce as small a
constant as possible to keep the matrix yielding the ridge solution as
close to the original matrix as possible. Thus, a typical ridge program
slowly increases the constant that is added to the diagonal (often
beginning with a constant as small as .001) and successively
reestimates the regression coefficients. At some point in the process of
increasing the constant, all the regression estimates become stable
(i.e., change very modestly across successive solutions) and begin to
slowly move toward zero. At that point, the ridge solution has been
obtained. The different approaches for estimating the ridge constant
are not presented here, for the goal is not to teach the methodology
but rather to simply introduce the logic and generally describe the
methodology. Price (1977), among others, provides suggestions about
selecting ridge constants.
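
The logic of adding a constant to the diagonal and reestimating can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular ridge program: the correlation values are hypothetical (they are not taken from the text), and the helper names `solve` and `ridge_betas` are invented for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_betas(r_xx, r_xy, k):
    """Standardized weights after adding the constant k to the diagonal of r_xx."""
    ridged = [[v + (k if i == j else 0.0) for j, v in enumerate(row)]
              for i, row in enumerate(r_xx)]
    return solve(ridged, r_xy)


# Hypothetical, highly collinear predictors (all correlations positive).
r_xx = [[1.00, 0.95, 0.95],
        [0.95, 1.00, 0.95],
        [0.95, 0.95, 1.00]]
r_xy = [0.60, 0.50, 0.55]  # correlations of the predictors with the criterion

# k = 0 is the ordinary least squares solution; larger k gives ridge solutions.
for k in (0.0, 0.001, 0.01, 0.1, 0.5):
    print(k, [round(b, 3) for b in ridge_betas(r_xx, r_xy, k)])
```

With these made-up values, the unridged solution contains a weight above 1 and a negative weight even though every correlation is positive; as the constant grows, the weights settle into a plausible range and then shrink slowly.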
Estimates from a ridge solution should cross-validate well across
samples, for they should be stable despite instability in the actual data
due to collinearity affecting the $(X'X)^{-1}$ matrix. On the other hand,
the costs of attaining stabilization of the solution are that (a) the
estimates are biased, (b) they will not have standard errors (and so
significance of individual predictors cannot be established), and (c)
the variance accounted for will be decreased somewhat.

Lack of significance testing and bias may be small prices to pay
for circumstances in which perfect collinearity occurs, particularly if
such collinearity seems unavoidable (some econometric models are
particularly likely to face such problems), for no solution would be
possible without an approach like ridge estimation. In those
circumstances, the choices are to select a biased solution or to get no
information from one's data set. Although there are arguments for
both positions, my preference is the pragmatic one, namely, to use
ridge techniques to get estimates and to use the information both
about what could have caused the collinearity problems and about
what the estimates suggest in planning the follow-up study.

For circumstances in which collinearity is high but a solution can
be estimated without ridge techniques, the introduction of bias that
results during ridge estimation has led to some disagreement among
social scientists about whether or not ridge techniques should be used.
One question is whether we want to draw inferences from biased
coefficients. The answer to that question has to be weighed against
trying to interpret the values from ordinary least squares estimation
when those coefficients can defy logic. For example, standardized
regression coefficients can greatly exceed 1 and have signs opposite
their zero-order correlation.
A nice illustration of ridge estimation is reported by Price (1977),
who describes results from analyses of a highly collinear
five-predictor data set assessing employee satisfaction. In his example, the five
collinear predictors, which basically seem to define a single factor,
have correlations with the dependent variable ranging from .158 to
.827. Despite the fact that all correlations in the matrix are positive,
the reported standardized regression coefficients range from -3.69 to
2.11 and include a 1.85 and a 1.25. Only the 2.11 is significant, which
should say a lot about the size of the standard errors. (In regression
with standardized variables, large standard errors, which mean large
variances for the estimates, are a good indicator of collinearity
problems.) Finally, the -3.69 is a classic example of a bouncing beta,
where a strong correlation yields a regression coefficient with a
puzzling (backward) sign.

In Price's (1977) data, the collinearity is apparent from inspection
of the correlation matrix (it includes correlations between predictors
of .91, .87, and .82) as well as from inspection of the diagonal
elements from the inverse of the matrix of predictors (variance
inflation factors include 493 and 129). In other instances, effects of
collinearity may be more subtle. Nevertheless, the illustration is a nice
one because it shows the impact on the regression coefficients, has
bouncing betas, and has a solution that changes quickly as a ridge
constant is introduced and increased.
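
The link between the inverse of the predictor correlation matrix and collinearity can be written compactly. The notation below ($\mathbf{R}_{xx}$ for the matrix of correlations among predictors and $R_j^2$ for the squared multiple correlation of predictor $j$ with the remaining predictors) is introduced here only for illustration:

```latex
\mathrm{VIF}_j \;=\; \left[\mathbf{R}_{xx}^{-1}\right]_{jj} \;=\; \frac{1}{1 - R_j^2}
```

Read this way, a variance inflation factor of 493 corresponds to $R_j^2 = 1 - 1/493 \approx .998$, and one of 129 to $R_j^2 \approx .992$; that is, the predictor in question is almost perfectly predictable from the other predictors.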
In summary, this chapter has presented and illustrated how
collinearity can produce problems for basic regression approaches.
Collinearity problems (a) point out issues that are fundamental to
understanding the entire range of structural equation approaches and
(b) highlight an advantage of latent variable approaches to SEM
techniques, namely, that many problems related to collinearity within
constructs are eliminated when a set of collinear predictors is replaced
by a single composite predictor. The problems cannot be solved when
one or more measures are exact linear composites of other measures
(in which case no approach will yield a viable solution) and when
measures are so close to being composites that they make the solution
process unstable. Latent variable approaches also do not help when
the very high correlations among predictors are of predictors from
different conceptual variables. In those instances, it makes sense first
to address issues of convergent/discriminant validity to assure oneself
that the conceptual variables are in fact different. Matrix form was
used to illustrate the underlying nature of collinearity problems,
followed by an example derived from Gordon (1968) to show how
small fluctuations in the size of correlations could affect regression
coefficients dramatically. The high likelihood of those types of
fluctuations was illustrated through a discussion of confidence intervals
for correlations. Finally, one approach for dealing with collinearity,
ridge regression, was discussed briefly.

EXERCISE 4.1

Partial Correlation and Regression

Given the variables

X1 = Social Class
X2 = Family Size
X3 = Ability
X4 = Self-Esteem
X5 = School Achievement

as well as their correlations

        X1      X2      X3      X4      X5
X1     1.00
X2     -.33    1.00
X3      .39    -.33    1.00
X4      .14    -.14     .19    1.00
X5      .43    -.28     .67     .22    1.00

and the following findings from regression equations predicting
X3, X4, and X5 as dependent variables (the coefficients are
standardized regression coefficients):

DV     IV      WT

X3     X1      .32
       X2     -.23

X4     X1      .11
       X2     -.11

X5     X1      .38
       X2     -.15

X4     X1      .06
       X2     -.07
       X3      .14

X5     X1      .19
       X2     -.02
       X3      .59

X5     X1      .19
       X2     -.02
       X3      .58
       X4      .08

NOTE: DV = dependent variable; IV = independent variable; WT = regression weight.
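
The standardized weights in the table can be recovered from the correlation matrix alone by solving the normal equations for the chosen predictors. The sketch below is illustrative only: `solve` and `std_weights` are hypothetical helper names, and the code uses 0-based indices, so X1 is index 0 and X5 is index 4.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Correlations among X1..X5 from the exercise.
R = [[1.00, -0.33, 0.39, 0.14, 0.43],
     [-0.33, 1.00, -0.33, -0.14, -0.28],
     [0.39, -0.33, 1.00, 0.19, 0.67],
     [0.14, -0.14, 0.19, 1.00, 0.22],
     [0.43, -0.28, 0.67, 0.22, 1.00]]


def std_weights(r, dv, ivs):
    """Standardized regression weights of variable dv on the variables in ivs."""
    r_xx = [[r[i][j] for j in ivs] for i in ivs]  # predictor intercorrelations
    r_xy = [r[i][dv] for i in ivs]                # predictor-criterion correlations
    return solve(r_xx, r_xy)


# Self-Esteem (X4) regressed on Social Class, Family Size, and Ability.
print([round(b, 2) for b in std_weights(R, 3, [0, 1, 2])])
```

Changing the `dv` and `ivs` arguments gives the other equations in the table.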

[Figure 4.1. Partial Regression Illustration. Path diagram with Social
Class (X1), Family Size (X2), Ability (X3), and School Achievement (X5).]

A. Partial regression. Using the preceding variables and analyses,
what is the regression weight or path coefficient from Ability
to School Achievement for the diagram in Figure 4.1? (The
goal here is simply to determine the appropriate equation and
find the regression coefficient.)

B. Partial correlation. Again using the preceding correlation matrix,
what is the partial correlation between Social Class and School
Achievement for the model in Figure 4.2? (By contrast, this
requires work.)
Solution suggestion. Solving requires estimating second-order
partialing, which can be done by partialing variables one
at a time and using the formula presented earlier in this book.
Successively partial X2 and X3 from all remaining relations,
leaving only a residual relation between X1 and X5. As you do
this, think about how difficult it would be to estimate fifth-,
sixth-, or even higher order partials by this approach (and
thank whoever invented computers!).

[Figure 4.2. Partial Correlation Illustration. Diagram with Family
Size (X2), Social Class (X1), Ability (X3), and School Achievement (X5).]

C. What is the value of the partial regression coefficient from Part
A relating Social Class to School Achievement (p51)?

How does the partial correlation between X1 and X5 controlling
for X2 and X3 compare with the partial regression coefficient
from X1 to X5 controlling for X2 and X3? Are they the
same or different?

What is the logic of each approach?
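
Partialing one variable at a time, as the solution suggestion describes, can be sketched with the standard first-order partial correlation formula applied twice. The function names and the correlation values below are hypothetical and are deliberately not the values from the exercise.

```python
import math


def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation between a and b, controlling for c."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def second_order_partial(r_ab, r_ac, r_ad, r_bc, r_bd, r_cd):
    """Partial correlation between a and b, controlling for c and then d."""
    # Step 1: remove c from every remaining relation.
    r_ab_c = partial_r(r_ab, r_ac, r_bc)
    r_ad_c = partial_r(r_ad, r_ac, r_cd)
    r_bd_c = partial_r(r_bd, r_bc, r_cd)
    # Step 2: remove d from the residual relations.
    return partial_r(r_ab_c, r_ad_c, r_bd_c)


# Hypothetical correlations among four variables a, b, c, and d.
result = second_order_partial(r_ab=0.50, r_ac=0.30, r_ad=0.40,
                              r_bc=0.20, r_bd=0.60, r_cd=0.10)
print(round(result, 3))
```

The order in which the two control variables are removed should not matter, which is a useful check on hand calculations.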

Random and Nonrandom Error

As has been mentioned throughout this book, the
term path analysis refers only to a restricted subset of path models.
In this chapter, extensions from the subset of models that can be called
path analysis are considered. First, models that contain measurement
error are introduced, followed by models in which both multiple
traits and methods are included in the data. In moving beyond path
analysis models, one faces the possibility that the models developed
cannot be uniquely solved, that is, are not identified. Discussion of
identification issues will be covered in the next chapter along with
another variation on path analysis, models with bidirectional or even
multidirectional causality.

Measurement Error

Background

Inability to allow for measurement error has been the primary
downfall of path analysis models. There are few places in the social sciences
where a case can be made that the variables of interest can be
measured without appreciable error. Particularly in areas such as
assessment of attitudes, it simply is implausible to assume that the
conceptual variables are measured anywhere near perfectly. Inability
to make that assumption in effect rules out use of path analysis. As is
illustrated later in this chapter, when measurement error is present,
path coefficients become biased and the solution cannot be trusted to
accurately reflect the processes involved.
Before beginning to discuss specifics, a general background for
thinking about measures is provided. This perspective draws from
reliability theory (e.g., Mehrens & Lehmann, 1984), which partitions
the variance of measures into true and error variance. Consistent with
factor analysis logic (e.g., Gorsuch, 1983), however, it further
partitions true score variance into variance related to the dimension of
interest and variance that is reliable but taps something other than
the dimension of interest. For this discussion, reliability should be
thought of in terms of internal consistency reliability.

The three variance components of a measure are as follows. First,
true score variance related to the theoretical construct(s) of interest
to researchers is the part that researchers want to isolate and keep.
This component is part of the reliable variance of the measure. In
most instances, however, it is less than the total reliability of the
measure because not all of the reliable variance is related to the
construct(s) of interest. This first variance component can be called
true score common variance. A second variance component is the
difference between the reliability of the measure and its relation to
the construct(s) of interest. This variance component is not error and
will consistently appear each time the measure is used. This
component can be called true score unique variance. A third variance
component is traditional error variance, the unreliable variance that
is part of a measure. Of course, if a measure is assessed without error,
then this component is zero. This component stays the same
regardless of the theoretical variable that is being assessed.
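
The three components can be summarized in a single expression. The symbols below are introduced only for this sketch and assume the three components are mutually uncorrelated: $\sigma^2_C$ is true score common variance, $\sigma^2_U$ is true score unique variance, and $\sigma^2_E$ is error variance.

```latex
\sigma^2_X = \sigma^2_C + \sigma^2_U + \sigma^2_E,
\qquad
\text{reliability} = \frac{\sigma^2_C + \sigma^2_U}{\sigma^2_X}
\;\ge\;
\frac{\sigma^2_C}{\sigma^2_X}
```

The right-hand ratio is the proportion of variance tied to the construct of interest; it equals the reliability only when the true score unique variance is zero.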
For example, if we were to try to assess self-concept, we would
choose one or more measures that purportedly assess the theoretical
variable of self-concept. But those measures would likely tap more
than just self-concept. In addition to measurement error, they could
contain method variance, have measure-specific variance, or even
assess a second theoretical variable. As a consequence, their estimate
of internal consistency reliability would be greater than their
relationship with the pure conceptual variable of self-concept. It is assessing
only parts of measures that provides the greatest challenge for
researchers using structural modeling. They need to be able to extract
from their measures the part of the variability that assesses the
theoretical variable of interest. One might think of work in fields such
as chemistry, where impurities frequently need to be extracted to
work with solutions whose properties are perfectly understood. The
challenge for social scientists is parallel: to remove parts of measures
with unwanted properties so that the actual effects of theoretical
variables can be clearly observed.

As will be argued later in this book when factor analysis logic is
introduced, the ideal situation for operationalizing a theoretical
variable is one in which there are available multiple measures of it.
Only with multiple measures can various variance components be
teased apart using principles of convergent and discriminant validity
and can the variance related to the construct of interest be isolated.

As a second example, imagine that I have a measure of family size.
If I use that measure to assess a theoretical variable also called family
size, then the measure is likely to be made up almost entirely of true
score common variance and error variance. The true score unique
variance should effectively be zero. If instead I use family size to assess
a different theoretical variable, one called family social class, then the
proportions of true score common variance and true score unique
variance change markedly. For any measure, the relative sizes of the
first two components depend on what the researcher is trying to
assess.
Specifying Relationships Between
Theoretical Variables and Measures

Although there are different ways of thinking about the relationships
between the measure and its underlying construct, in the structural
equation field those relationships typically are viewed as reflecting
influence of the construct on the measure. Thus, the arrows from path
modeling will go from the underlying construct to the measure unless
the case can be made that the measure causes the theoretical variable
(for a discussion of causal indicators, see Bollen & Lennox, 1991;
MacCallum & Browne, 1993; Tanaka, Panter, Winbourne, & Huba,
1990). A common diagram representing the relationship between a
measure and its underlying construct appears in Figure 5.1.

[Figure 5.1. Expressing Measures in Terms of Constructs. Diagram
labels recoverable from the original: Measure, Residual.]

Consistent with the logic presented in the preceding discussion
and with factor analysis, the unmeasured construct "causes" the
measure because the measure assesses variability from that construct.
The residual includes measurement error as well as true score unique
variance. In other words, the measure is viewed as being made up of
variability from the construct of interest plus other variability.
Because the figure is a path model and follows that logic, by definition
the residual is made up of all causes of a measure that are not included
in the model. That is, the residual is all of the variance other than the
construct of interest. If, however, other variables were to be included
in the model as causes of the measure, then part of the unique variance
in Figure 5.1 would become common variance in the modified model
and would be represented by arrows from the additional causes to the
measure. The residual then would be smaller.
To return to the point made earlier about a measure of family size,
the construct of social class can be used to illustrate the three types
of variance components. Social class supposedly assesses one's
socioeconomic status. It reflects a combination of prestige, of access to
resources including knowledge/expertise, of economic advantage,
and of values consistent with prestige and attainment.

Social class typically is operationalized (imperfectly) as some
combination of measures of income, occupational status, and
educational attainment. Even though measures of income, status, and
attainment each could be used to make the point about variance
partitioning, it can be made along with a second point by focusing on
family size, which has been used to assess social class in instances
where measures of one or more of the three domains are missing.

Family size has been used as a measure of social class because in
our culture more advantaged people tend to have smaller families.
One could, of course, argue that family size is a questionable measure
of social class, which undoubtedly is true. Yet, a researcher with a data
set having only a flawed measure of a potentially important construct
such as social class needs to balance the competing arguments in
deciding whether to exclude a potentially important construct or to
assess it imperfectly. Assume that in this case the decision is that social
class is too important to omit from the model, and so it is included
even though the only measure that taps social class is family size.

Family size can be measured with a high degree of reliability, in fact
almost perfectly so. Unreliability may be limited to coding errors by
investigators or families that are in flux due to marriages, separations,
divorces, or other instabilities. In other words, the measurement error
component is very small. Yet, family size in no way is close to a perfect
measure of social class, for it contains variance due to many other
variables. Other variables influencing family size include cultural
values (e.g., some groups in our culture value large families more than
do others), religious practices (e.g., Roman Catholics have different
beliefs about use of birth control than do many other religious
groups), and fertility differences and understanding of effective birth
control practices. Those other sources of variability not related to
social class do not diminish the reliability of the "family size"
measure; they are components of true score variance of family size and are
reliable. Yet, they are not part of the underlying dimension of social
class. They are portions of variance not shared with other measures
of social class (i.e., not common variance); instead, they are true score
unique variance. True score unique variance diminishes the reliability
of family size with respect to the underlying construct of social class.
Furthermore, if these variance components were hypothesized to be
related to the same criterion variables that social class should predict,
then they will lead to problems of interpretation. In Figure 5.1, these
latter sources of variability ideally would be part of the residual,
which would mean that they are not related to the dependent
variables with which social class is related.
To summarize, for variables in the social sciences, there almost
always are discrepancies between the conceptual variables and the
measures that operationalize them. These discrepancies make it
critical to accurately partition reliable variance into true score common
variance and true score unique variance. For path models, it would
be ideal to partition variance in a way that leaves only the true score
common variance as reliable and lumps together the true score unique
variance with measurement error.

Unfortunately, such partitioning cannot be done until multiple
measures of constructs are introduced and the logic of factor analysis
is used, yielding latent variable structural equation modeling (SEM)
techniques described in Chapter 8. The closest approximations of
those approaches in single-indicator, manifest variable path modeling
use composite measures in path models. The composite measures
ideally combine common true score variance in an additive fashion,
whereas random errors and unique true score variance components
combine in true random fashion, not increasing (e.g., Mehrens &
Lehmann, 1984). Unfortunately, all too often the composite measures
contain common method or other shared variance, with the result
that errors combine additively as well.

For the measurement error models described in this chapter, true
score common variance and true score unique variance typically will
be combined as reliable variance. In such models, only measurement
error appears as error.
Random Measurement Error

Random measurement error is error that actually meets the desired
properties of error variance, namely, that it is unrelated to predictor
variables, criterion variables, and errors of other measures. Because
it is unrelated to any variables, it exists independently of other
measures. That is, it does not contribute to the relationships of the
measures having random errors with any other measures. Because it
is not related to any other measures, its presence reduces the
relationship of the measure it affects with other measures. In path models, it
results in misestimation of the strength of various relationships from
the model. In the bivariate case, the strength of the relationship
always is underestimated. In the multivariate case, unfortunately, such
a simple and straightforward conclusion is not possible. If there is
only one variable with less than perfect reliability, then its
relationships with other variables appear weaker than they should. At the
same time, reducing the relationship of one predictor variable with a
dependent variable below what it should be may allow relationships
of other predictor variables to become stronger than they would have
been if error had not been involved. In other words, some coefficients
get bigger than they should, whereas others get smaller.

In sum, in multivariate instances, random error produces neither
an ideal case nor a predictable one. It may increase as well as decrease
relationships. Furthermore, unreliability has different effects on the
dependent variable in a regression equation than it does on an
independent variable.

For dependent variables, error gets absorbed into the residual.
Error reduces the variance accounted for and the standardized
regression coefficients. On the other hand, slopes, the unstandardized
coefficients, remain unaffected, and this is a persuasive reason for
working with covariances (nonstandardized data) if there is concern
about error in one's dependent variables (e.g., Kenny, 1979). On
balance, then, measurement error in the dependent variable does not
create terrible problems, for it gets absorbed into the error term, with
predictable results.

Using a typical regression model, it is easy to illustrate what
happens when random error exists in the dependent variable. In such
an instance, the equation desired is $Y_t = XA + e$, where $Y_t$ is the true
score Y. Yet, the observed Y is not the same as $Y_t$, for it is made up of
reliable variance plus error. That is, $Y = Y_t + v$, where $v$ is the error
in Y. Because $Y_t = Y - v$, the actual equation becomes $Y_t = Y - v =
XA + e$, and so, expressed in terms of the observed variables, $Y =
XA + e + v$. Because the error on Y is random, $v$ is unrelated to X,
and so the nonstandardized regression weight is not affected, but the
error term for Y becomes $e + v$ rather than $e$. Because the error term
has become larger, the $R^2$ is reduced and the standardized regression
coefficients are reduced.
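
For a single predictor, the argument can be written out in one line. This is a sketch that treats X, A, e, and v as scalars and assumes, as in the text, that both e and v are uncorrelated with X and with each other; the notation $b_{YX}$ for the unstandardized slope is introduced here for illustration.

```latex
b_{YX} = \frac{\operatorname{Cov}(X,\; XA + e + v)}{\operatorname{Var}(X)} = A,
\qquad
R^2 = \frac{A^2 \operatorname{Var}(X)}{A^2 \operatorname{Var}(X) + \operatorname{Var}(e) + \operatorname{Var}(v)}
```

The slope is unchanged because neither error term covaries with X, whereas the extra term $\operatorname{Var}(v)$ in the denominator lowers $R^2$ and, with it, the standardized coefficient.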
For independent variables, the effects of error are more
problematic. Because the regression coefficient is estimated for all of the
variance in the independent variable, the new error term biases the
regression weight and therefore cannot readily be dismissed. To
illustrate, begin with the desired equation, $Y = X_t A + e$. When X
is measured with error, however, $X = X_t + u$ and, therefore, $X_t = X -
u$, where $u$ is the error in X. Thus, $Y = (X - u)A + e$, or $Y = XA +
(e - uA)$. Note that the value of the regression weight A is affected by
the residual $u$, clearly an unwanted effect that biases the regression
coefficient. Furthermore, the error term is not independent of the
regression weight (A). In the bivariate case, because $u$ is unrelated to
Y, whereas X presumably is related to Y, the relationship between X
and Y is underestimated. In the multivariate case, underestimating
relationships of imperfectly assessed variables with the dependent
variable affects other relationships in unpredictable ways. On
balance, then, if one were to have error in a single measure, then it would
be preferable to have it in the dependent variable, where its effects
are easier to address.
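
A small simulation can make the contrast between the two cases concrete. The sketch below is illustrative only: the sample size, slope, and error variances are arbitrary assumptions, and `slope_and_r2` is a hypothetical helper for a one-predictor regression.

```python
import random


def slope_and_r2(x, y):
    """Unstandardized slope and squared correlation for a one-predictor regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


random.seed(1)
n, A = 50000, 0.6  # arbitrary sample size and true slope

x_true = [random.gauss(0, 1) for _ in range(n)]
y_true = [A * xt + random.gauss(0, 0.8) for xt in x_true]

# Random error added to the dependent variable only.
y_obs = [yt + random.gauss(0, 1) for yt in y_true]
# Random error added to the independent variable only; with equal true and
# error variances, the reliability of x_obs is .50.
x_obs = [xt + random.gauss(0, 1) for xt in x_true]

slope_true, r2_true = slope_and_r2(x_true, y_true)
slope_dv, r2_dv = slope_and_r2(x_true, y_obs)
slope_iv, r2_iv = slope_and_r2(x_obs, y_true)

print("no error:    slope %.3f  R2 %.3f" % (slope_true, r2_true))
print("error in Y:  slope %.3f  R2 %.3f" % (slope_dv, r2_dv))
print("error in X:  slope %.3f  R2 %.3f" % (slope_iv, r2_iv))
```

Under these assumptions, error in Y leaves the slope near its true value while lowering the variance accounted for, whereas error in X pulls the slope itself toward zero, roughly in proportion to the reliability of X.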
Because random error creates such a problem for multiple regression,
it might seem as though someone would have determined ways of getting
rid of the error. For example, could unreliability not be removed by
correcting correlations for attenuation, that is, by adjusting them to
what they would have been if unreliability had not been involved? In
principle, yes, but in practice it is not so easy. The biggest problem
is determining exactly what the reliability of the measure should be
for the sample at hand. Should preestablished reliability for a
well-established measure be used, or should sample reliability be
used? Should reliability be defined in terms of the construct that is
assessed rather than in terms of error in assessing the measure
(remember the illustration of family size)? That is, is the measure
the critical predictor whose reliability needs to be used for
correction, or should correction for unreliability adjust with respect
to the underlying construct?
Unfortunately, there are no right answers to how to "perfectly"
correct for unreliability. What is clear, however, is that if
reliability is judged to be higher than it really is, then the
correction will not fully remove the unreliability and the
relationships of the "corrected" variable with others still will be
underestimated. In such a circumstance, the impacts of the measure are
likely to be understated. In the opposite circumstance, if reliability
is judged to be lower than it really is, then the correction will be
too great and the relationships of the "corrected" measure with other
variables will be stronger than they should be. In such a
circumstance, the impacts of the measure are likely to appear stronger
than they actually are, or collinearity problems may be exaggerated or
may appear when in fact collinearity should not be problematic.
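
The two circumstances can be illustrated with the classical
(Spearman) correction for attenuation, in which an observed
correlation is divided by the square root of the product of the two
reliabilities. The sketch below is a hedged illustration, not a
recommended procedure: the "true" correlation (.60) and the actual
reliability of X (.64) are hypothetical values, Y is assumed to be
measured without error, and `correct_for_attenuation` is a name chosen
here for convenience.

```python
import math

def correct_for_attenuation(r_xy, rel_x, rel_y=1.0):
    """Classical correction: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)

TRUE_R = 0.60        # hypothetical error-free correlation
TRUE_REL_X = 0.64    # hypothetical actual reliability of X
observed_r = TRUE_R * math.sqrt(TRUE_REL_X)   # attenuated correlation

for assumed in (0.81, 0.64, 0.49):
    corrected = correct_for_attenuation(observed_r, assumed)
    print(f"assumed reliability {assumed:.2f} -> corrected r = {corrected:.3f}")
```

Assuming a reliability that is too high (.81) leaves the corrected
value below .60, the correct reliability (.64) recovers .60, and
assuming a reliability that is too low (.49) pushes the corrected
value above .60.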
Neither overestimation nor underestimation is desirable, for one
important underlying purpose of structural equation methodologies is
to identify causal processes that can be used to target interventions.
It would be undesirable to have variables dropped from an equation or
viewed as unimportant because of poor reliability or to select as the
target for an intervention a variable that actually has much weaker
effects than it appeared to have because one has overcorrected for its
unreliability. Thus, the bottom line with respect to correcting a
matrix for unreliability is as follows: Even though superficially
attractive, correcting for unreliability is risky and can be
potentially problematic.
At the same time, however, the logic of reliability correction is the
heart of multiple-indicator, latent variable SEM. Those techniques
provide a means to overcome the types of problems that have just been
described. They produce a generally effective way in which to address
problems of random error by estimating reliability in terms of the
specified model.
Nonrandom Error

Nonrandom error is error variance that is related in some systematic
way to a variable or other error term. In other words, nonrandom error
can result both from error variance that is shared across measures and
from extra sources of reliable variance shared across measures. The
most common types of nonrandom error are those that result from two
measures having more than one underlying dimension (construct) in
common; the extra dimensions may be substantive, but they also could
be purely method variance. For example, if all measures in a path
model were drawn from a single paper-and-pencil survey, then there
potentially could be common method variance that would exaggerate the
relationships of the different variables. If so, the method could be
considered a type of nonrandom error.
A diagram illustrating nonrandom error appears in Figure 5.2. Note
that the relationship between X and Y occurs not only due to the
relationship between the constructs that they measure but also due to
the relationship between their residuals. As suggested in the
preceding discussion, that relationship could represent another
variable that causes both X and Y, or common method variance. If the
nonrandom error (i.e., the arrow between e_x and e_y) were ignored,
then the relationship between X_t and Y_t would be estimated
inaccurately.
Figure 5.2. Two-Construct Model

Models containing nonrandom errors cannot be solved by using
traditional regression techniques. In some instances, the nonrandom
errors may be estimated by calculating partial correlations (e.g., the
partial correlation model in Chapter 3), in others they may be
solvable through matrix algebra, and in still others they may be
estimated through multistage least squares techniques. Rather than
attempting to show how they can be solved in various ways, I will
leave them for the discussion of latent variable structural equation
models. Using those techniques, they can be dealt with as residual
covariances, as methods factors, or even as unmeasured variables.
Regardless of how they are specified, in latent variable models they
require no special methods but are estimated as part of the general
model in which they are included.

Method Variance and Multitrait-Multimethod Models

The remainder of this chapter focuses on issues related to method
variance. It discusses method variance and describes the
multitrait-multimethod (MTMM) approach. In the language of MTMM
approaches, what we thus far have called constructs or theoretical
variables are termed traits. Any additional systematic variability
that reflects the ways in which data were collected is collectively
referred to as methods.
Method Variance

The notion of method variance should not be a new one for readers; it
has been mentioned throughout this chapter. Method variance is a
prominent and common type of nonrandom error. It can occur in a number
of different ways when the method used intrudes to introduce
additional common variance. A frequent example of common method
variance occurs when two measures are administered as part of a single
instrument, particularly if their items are interspersed.
Having measures that may share common method variance should not be
seen as necessarily bad, for method variance may be a needed byproduct
of a researcher's efforts to tap substantive dimensions of interest.
For example, if one or more of the measures to be collected are likely
to be reactive (i.e., respondents will know what they are responding
to and may choose to answer in ways that differ from what would be
their true responses), then those measures potentially could be made
less obtrusive if mixed in with items tapping different content areas.
This logic has been used to measure racial prejudice against African
Americans in the United States (e.g., Crandall, 1994; McConahay,
1986). In contrast to administering sequentially a series of items
about attitudes toward African Americans, these prejudice measures
embed the items within a much larger series of items assessing
differing attitudes. The expectation is that by "hiding" items,
respondents will be less aware that their racial attitudes are being
assessed and, consequently, will respond more truthfully rather than
in more socially desirable or other reactive ways. A potential cost of
these approaches is that the mixing of items potentially leads to
shared method variance across the different attitudes, inflating their
observed relationships.
A second source of common method variance can be broader in scale, for
example, resulting from collection of a number of different measures
via a single method such as a paper-and-pencil survey, particularly if
all items share a common response format. All responses collected by a
single approach may be related due to the way in which the measures
are collected. Yet a third potential source of common method variance
is interviewer bias, reflecting ways in which interviewers shape and
interpret information provided to them by respondents.
In most instances, method variance is relatively easy to include in
path models so long as researchers are aware of its effects. If method
variance is shared by two measures or indicators, then their residuals
can be connected by a path that signifies their second source of
common variance (as was illustrated in Figure 5.2). In Figure 5.2,
imagine that X and Y are assessed using a common method and that they
are viewed as sharing method variance over and above their other
relationship. Their relationship can be estimated based on the tracing
rules for path analysis. Their method variance relationship is modeled
through the residuals; the relationship through the constructs (X_t
and Y_t) is modeled through their respective relationships with those
constructs and the relationship between the constructs X_t and Y_t. By
the tracing rule, both the trait and method relationships are the
products of three paths.
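
The two three-path products can be written out in a few lines. The
following is a minimal sketch under stated assumptions: all variables
are standardized (so each residual path equals the square root of 1
minus the squared loading), and the loadings (.8 and .7), the
construct correlation (.5), and the residual correlation (.3) are
hypothetical values. The function name `implied_correlation` is
likewise only illustrative.

```python
import math

def implied_correlation(load_x, load_y, trait_corr, resid_corr):
    """Tracing rule for the two-construct model with correlated residuals.

    Trait route:  X <- X_t <-> Y_t -> Y   (load_x * trait_corr * load_y)
    Method route: X <- e_x <-> e_y -> Y   (res_x * resid_corr * res_y)
    """
    res_x = math.sqrt(1 - load_x ** 2)   # residual path for standardized X
    res_y = math.sqrt(1 - load_y ** 2)   # residual path for standardized Y
    trait_part = load_x * trait_corr * load_y
    method_part = res_x * resid_corr * res_y
    return trait_part, method_part, trait_part + method_part

trait_part, method_part, r_xy = implied_correlation(0.8, 0.7, 0.5, 0.3)

# Ignoring the residual arrow attributes all of r_xy to the trait route,
# which overstates the correlation between the constructs.
naive_trait_corr = r_xy / (0.8 * 0.7)
print(round(trait_part, 3), round(method_part, 3), round(naive_trait_corr, 3))
```

With these values the trait route contributes .28 to the observed
correlation, and treating the whole correlation as trait-based yields
a construct correlation well above the .5 that was built in.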
The model in Figure 5.2 cannot be solved, for there are identification
problems in trying to separate trait variance from method variance.
Problems would occur even if X and Y measured a single trait, for it
would be impossible to disentangle trait variance from method
variance. As will be discussed in more detail later in this chapter as
well as in the chapter on factor analysis (Chapter 7), only with more
measures and using a factor model could the model be solved. An
illustration of a model that is identified appears in Figure 5.3,
where both X and Y have two measures. In Figure 5.3, only X1 and Y1
share a common method. As will be explained later in this book,
building models with latent variables and multiple indicators allows
many instances of nonrandom error to be modeled successfully.
If common method variance is shared across more than two measures,
then an alternative is to model method as an additional latent
(unmeasured) variable rather than to view method variance as nonrandom
error. If it were specified as nonrandom error, then the residuals
between all pairs of variables would be connected by arrows. But from
the perspective of factor analysis, which tries to identify sources of
common variance, specifying shared method variance as a latent
variable makes better sense than does specifying multiple residual
covariances, for method is a source of common variance.
Figure 5.3. An Illustration of Nonrandom Method Variance

The logic of modeling method as a theoretical variable (or, in the
language of factor analysis, as a common factor) is somewhat different
from just allowing the residuals to covary among each pair of
variables. Modeling method as a single latent variable requires
greater consistency in relationships. For example, imagine a case in
which there are three indicators sharing a common method. Modeling
pairs of residuals could yield 0, 1, 2, or 3 significant residual
covariances. Consider the case in which two are significant and the
third is not. In such an instance, it is difficult to envision the
method variance as defining a single method factor. A single method
factor could have an appreciable effect on none or one of the measures
(in which case there would be no common method factor) or on two or
three measures (in which case there would be a common method factor).
If method affected only two measures, then two of the indicators
should share appreciable method variance, resulting in one significant
residual covariance. If method affected all three measures, then all
three residual covariances should be significant. By contrast, two
significant residual covariances would mean that two of the pairs of
indicators share a second appreciable source of common variance,
whereas the third pair has only a single source of common variance.
Such findings are not consistent with a single factor model but rather
would suggest two additional sources of common variance.
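
The counting argument can be checked by enumeration. The sketch below
assumes a single method factor whose loading on each of three
indicators is either zero or some nonzero value (0.5 is an arbitrary
placeholder); the residual covariance the factor implies for a pair of
indicators is then the product of their two method loadings. The
function names are hypothetical.

```python
from itertools import combinations, product

def implied_residual_covariances(method_loadings):
    """Residual covariance for each pair implied by one method factor."""
    return {(i, j): method_loadings[i] * method_loadings[j]
            for i, j in combinations(range(len(method_loadings)), 2)}

def count_nonzero(method_loadings, tol=1e-12):
    """Number of indicator pairs with a nonzero implied residual covariance."""
    covs = implied_residual_covariances(method_loadings)
    return sum(abs(c) > tol for c in covs.values())

# Try every pattern of absent (0.0) versus present (0.5) method loadings.
possible_counts = sorted({count_nonzero(pattern)
                          for pattern in product([0.0, 0.5], repeat=3)})
print(possible_counts)
```

The enumeration yields 0, 1, or 3 nonzero residual covariances but
never exactly 2, which is the pattern described above as inconsistent
with a single method factor.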

In other words, if one believes that method variance is an appreciable
source of extraneous variance across three or more indicators sharing
a common method, then the best way in which to model that variance is
by specifying a common method factor rather than by allowing all pairs
of residuals to covary. Allowing pairs could yield patterns of
findings not consistent with the presence of a single method factor.
At this point, a caveat is in order. I have seen many instances in
which my students have tried to extract both method and trait variance
from a set of indicators that supposedly measure a single factor via a
single method. This is not possible; with only a single method, it is
impossible to separate method variance from trait variance. The two
sources of common variance are confounded. Adding additional
indicators that assess the same trait by the same method does not
help; the problem exists whether there are 3 indicators or 30
indicators. Only when an additional indicator measuring either the
trait or the method but not both is available is extraction of both
trait and method factors possible. In the language of construct
validity, that additional indicator provides information about
discriminant validity that is needed to tease apart method and trait
effects.
To the extent that method variance crosscuts the theoretical variables
of interest, effects of method can be separated from effects of other
sources of common variance. One way in which to design a study so that
method effects are separable from other sources of common variance,
which here are called trait factors even though not all of them may be
traits in a traditional sense, is to cross methods and traits. Such an
approach, analogous to an experimental study in which the factors are
crossed, can sample all traits with all methods. Such an approach has
been called an MTMM matrix approach (Campbell & Fiske, 1959). Logic
for this approach, which was worked out well in advance of
availability of appropriate structural equation methods, is presented
next.
Additive Multitrait-Multimethod Models

Campbell and Fiske (1959) presented a model for interpreting traits
across methods. Although, as is discussed later, there are non-obvious
problems in using path modeling approaches to solve for MTMM data, the
logic of method variance is of central importance to structural
equation methods. The general goal of the MTMM approach is to be able
to address issues of validity without those issues being confused by
the presence of common variance caused by common methods.
Campbell and Fiske (1959) argued that without measuring multiple
methods as well as multiple traits, the relative contributions of
trait and method variance cannot be determined. They presented MTMM
data, defined different types of elements of the correlation matrix
depending on their trait-method combination, and developed rules of
thumb for determining validity of trait variables.
First, Campbell and Fiske (1959) put their efforts into a framework of
convergent and discriminant validity. They reminded readers that
"validation is typically convergent, a confirmation by independent
measurement procedures" (p. 81, emphasis in original). In other words,
a variable assessed by one method should be strongly related to that
same variable measured by a different method; if the relationships
across methods are small, then the variable fails the test of
convergent validity. The flip side of convergent validity is
discriminant validity, which means that to be a valid measure, that
measure needs to be less substantially related to measures of
different variables. If correlations of measures across variables are
too high, then one may wonder whether the measures are assessing what
they purport to be measuring. Thus, different traits assessed by a
common method should not in general be very highly correlated.
Exceptions may occur if the different traits are expected to be
substantially correlated or if the traits being assessed are elusive
ones, readily overpowered by method variance.
The idea of "elusive traits" is not one mentioned by Campbell and
Fiske (1959) but is one that has intrigued me. There are variables
that inherently are difficult to assess because responses that tap
those variables also tend to trigger other variables and method
variance. They may be overlooked or ignored because of the difficulty
in measuring them. Sometimes the variables can be "aggregate" types of
variables, such as family support or social climate, which seemingly
have to include many aspects/components. Others could include
personality variables that people talk about but have difficulty
operationalizing, such as empathy and ambition. Ambition seems
particularly prone to demand characteristics and social desirability,
and questions about ambition potentially seem to tap ability and
achievement as well as ambition.

Another good illustration is provided by a variable mentioned earlier
in this chapter: prejudice. Despite the many ways and times
researchers have tried to measure prejudice, there exists no widely
accepted way in which to measure it. Insofar as variables such as
prejudice are likely to prove to be important if they ever can be
effectively assessed, it seems natural to attempt to use approaches
that might disentangle those variables from methods and other sources
of extraneous variance. Only then will researchers be able to identify
them and their relationships with other, more reliably assessed
variables.
Campbell and Fiske (1959) took as their departure point for MTMM
matrices the need for multiple traits and multiple methods. They
suggested moving the point to its logical end, namely, measuring each
trait by all methods used. The result is a fully crossed trait x
method correlation matrix. Readers should try to think broadly about
what is a trait and what is a method, for there are opportunities to
use traits and methods creatively. For example, McGarvey, Miller, and
Maruyama (1977) used an MTMM model to compare different ways of
scoring field dependence using the Witkin rod and frame apparatus.
Table 5.1 presents the prototype MTMM matrix, a 3 trait x 3 method
matrix. Campbell and Fiske (1959) divided the matrix up into four
different types of correlations based on same/different method and
trait combinations. First, the underlined elements, the main diagonal,
are the monotrait-monomethod correlations. Campbell and Fiske put the
reliabilities on that diagonal to define the maximum possible
relationship that exists between each measure and any other measure.
Second, the three sets of three correlations that form triangles next
to the reliability diagonal (e.g., r21, r31, and r32) are called the
heterotrait-monomethod correlations. These share common method
variance but assess different traits. Third, bold print is used to
identify the three sets of three correlations along the subdiagonals
within the heteromethod blocks. These are monotrait-heteromethod
correlations, which Campbell and Fiske called the validity diagonals.
They are called validity diagonals because ideally they tap common
trait variance independent of method variance. Finally, fourth are the
correlations within blocks on either side of the validity diagonals.
These are the heterotrait-heteromethod correlations, those that share
neither common trait variance nor common method variance.

TABLE 5.1  Illustrative 3 x 3 Multitrait-Multimethod Matrix

                     Method 1             Method 2             Method 3
                A1     B1     C1     A2     B2     C2     A3     B3     C3
Method 1
  Trait A1     (r11)
  Trait B1      r21   (r22)
  Trait C1      r31    r32   (r33)
Method 2
  Trait A2     [r41]   r42    r43   (r44)
  Trait B2      r51   [r52]   r53    r54   (r55)
  Trait C2      r61    r62   [r63]   r64    r65   (r66)
Method 3
  Trait A3     [r71]   r72    r73   [r74]   r75    r76   (r77)
  Trait B3      r81   [r82]   r83    r84   [r85]   r86    r87   (r88)
  Trait C3      r91    r92   [r93]   r94    r95   [r96]   r97    r98   (r99)

NOTE: Entries in parentheses are the main (reliability) diagonal,
underlined in the original. Entries in brackets are the validity
diagonals, printed in bold in the original.
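
The four types of correlations follow mechanically from each measure's
position in the matrix. The sketch below assumes the 1-based numbering
used for the subscripts (measures 1 to 3 use Method 1, 4 to 6 use
Method 2, and 7 to 9 use Method 3, with traits in the order A, B, C
within each method); the function name `classify` is hypothetical.

```python
from collections import Counter

N_TRAITS = 3   # traits A, B, C within each method block

def classify(i, j, n_traits=N_TRAITS):
    """Label r_ij using the 1-based measure numbering of the 3 x 3 layout."""
    same_trait = (i - 1) % n_traits == (j - 1) % n_traits
    same_method = (i - 1) // n_traits == (j - 1) // n_traits
    if same_trait and same_method:
        return "monotrait-monomethod"      # reliability diagonal
    if same_method:
        return "heterotrait-monomethod"
    if same_trait:
        return "monotrait-heteromethod"    # validity diagonal
    return "heterotrait-heteromethod"

# Tally the lower triangle (including the diagonal) of the 9 x 9 matrix.
counts = Counter(classify(i, j) for i in range(1, 10) for j in range(1, i + 1))
for label, count in counts.items():
    print(label, count)
```

For a 3 x 3 design the tally is 9 reliabilities, 9
heterotrait-monomethod correlations, 9 validity-diagonal correlations,
and 18 heterotrait-heteromethod correlations, which together account
for all 45 entries.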

Campbell and Fiske (1959) suggested four conditions that would need to
be met to establish validity.

1. Entries in the validity diagonals "should be significantly
different from zero and sufficiently large to encourage further
examination of validity" (p. 82). That is, traits measured by
differing methods still should be highly correlated.

This first condition is the test of convergent validity. The remaining
three conditions test discriminant validity.

2. The value of each element in the validity diagonals should be
higher than the values lying in its column and row in the
heterotrait-heteromethod triangles. This almost always should be
found, for it requires only that the correlation of a single variable
assessed by different methods be greater than different variables
assessed by those same different methods. To illustrate, the
correlation r82 should be larger than any of r81, r83, r72, or r92.
3. For each measure, common trait variance should be greater than
common method variance. In the words of Campbell and Fiske, "A
variable [should] correlate higher with an independent effort to
measure the same trait than with measures designed to get at different
traits which happen to employ the same method" (p. 83). Practically,
elements of the validity diagonal need to be greater than their
corresponding elements in the heterotrait-monomethod triangles. The
comparison is between a measure's correlations with other measures of
the same trait by different methods and the measure's correlations
with measures of different traits by the same method. To illustrate,
to test for Trait A by Method 1, r41 and r71 should be larger than r21
and r31; to test for Trait B by Method 2, r52 and r85 should be
greater than r54 and r65.
4. The relative size (or at least rank) of the elements within each
heterotrait block should be maintained across blocks. In Campbell and
Fiske's words, "The same pattern of trait interrelationships [should]
be shown in all of the heterotrait triangles of both the monomethod
and heteromethod blocks" (p. 83). Again, to illustrate, if r21 > r31 >
r32, then we also should find that r54 > r64 > r65 and that r42 > r43
> r53, and so forth, for all heterotrait blocks.
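
The four conditions lend themselves to a mechanical check. The sketch
below is only an illustration: the matrices are synthetic, generated
from a simple additive model with hypothetical trait loadings, method
loadings, and trait correlations; the .30 cutoff used for Condition 1
is an assumed stand-in for "sufficiently large"; and all function
names are invented for the example. Measures are indexed from 0 in the
same order as Table 5.1.

```python
from itertools import combinations

N_TRAITS, N_METHODS = 3, 3
N = N_TRAITS * N_METHODS

def trait(i):
    return i % N_TRAITS       # 0-based measure index -> trait index

def method(i):
    return i // N_TRAITS      # 0-based measure index -> method index

def synthetic_mtmm(trait_loading, method_loading, trait_corrs=(0.5, 0.4, 0.3)):
    """Correlations implied by a simple additive trait-plus-method model.

    trait_corrs holds hypothetical A-B, A-C, and B-C trait correlations.
    """
    phi = {(0, 1): trait_corrs[0], (0, 2): trait_corrs[1], (1, 2): trait_corrs[2]}
    r = [[1.0] * N for _ in range(N)]
    for i, j in combinations(range(N), 2):
        ti, tj = sorted((trait(i), trait(j)))
        value = trait_loading ** 2 * (1.0 if ti == tj else phi[(ti, tj)])
        if method(i) == method(j):
            value += method_loading ** 2      # shared method variance
        r[i][j] = r[j][i] = value
    return r

def check_conditions(r, min_validity=0.3):
    """Pass/fail flags for Conditions 1-3 (min_validity is an assumed cutoff)."""
    ok = {"1_convergent": True, "2_vs_hetero_hetero": True, "3_vs_hetero_mono": True}
    for i, j in combinations(range(N), 2):
        if trait(i) != trait(j):
            continue                          # keep validity-diagonal entries only
        validity = r[i][j]
        if validity < min_validity:
            ok["1_convergent"] = False
        for a, b in ((i, j), (j, i)):
            for k in range(N):
                if trait(k) == trait(a):
                    continue
                if method(k) == method(b) and r[a][k] >= validity:
                    ok["2_vs_hetero_hetero"] = False
                if method(k) == method(a) and r[a][k] >= validity:
                    ok["3_vs_hetero_mono"] = False
    return ok

def same_rank_order(r):
    """Condition 4: A-B, A-C, B-C keep the same ordering in every triangle."""
    pairs = list(combinations(range(N_TRAITS), 2))
    orders = set()
    for m1 in range(N_METHODS):
        for m2 in range(N_METHODS):
            tri = [r[m1 * N_TRAITS + p][m2 * N_TRAITS + q] for p, q in pairs]
            orders.add(tuple(sorted(range(len(pairs)), key=lambda t: -tri[t])))
    return len(orders) == 1

strong_trait = synthetic_mtmm(trait_loading=0.8, method_loading=0.4)
strong_method = synthetic_mtmm(trait_loading=0.5, method_loading=0.9)
print(check_conditions(strong_trait), same_rank_order(strong_trait))
print(check_conditions(strong_method), same_rank_order(strong_method))
```

When trait loadings dominate, all four conditions hold. When method
loadings dominate, Condition 2 still holds but Condition 3 fails, even
though the data were generated from a model in which trait and method
are perfectly separable.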

These rules were important ones when analysis of MTMM matrices needed
to be done by inspection. With the development of more sophisticated
methodologies capable of teasing apart various variance components in
matrices (e.g., Kenny & Kashy, 1992), they became less important even
though their logic is basically sound.
From my perspective, the first two conditions are straightforward and
fairly obvious. The third is really unneeded, for there is no real
reason why trait variance has to be stronger than method variance so
long as they can be separated; it is this difference in perspective
from Campbell and Fiske (1959) that has generated my interest in
elusive traits. Finally, barring extra sources of common variance
(which of course could be modeled as residual covariation if they were
anticipated), Condition 4 also seems reasonable and quite possible to
attain.
Presenting MTMM matrices in path model form requires a basic
understanding of the logic of factor analysis. As a result, MTMM
matrices will be discussed more fully in the later chapter on factor
analysis (Chapter 7). At this point, only one more point is covered in
this chapter: that the discussion about MTMM matrices has assumed that
traits and methods combine additively. Alternatively, they have been
hypothesized as combining multiplicatively (Campbell & O'Connell,
1967), which yields very different approaches.
Nonadditive Multitrait-Multimethod Models

Despite the intuitive appeal of an additive MTMM model, several
researchers have argued that, in many instances, traits and methods
combine in a multiplicative fashion. The first to suggest this pattern
were Campbell and O'Connell (1967). They suggested as alternatives (a)
an inverse relationship between traits and methods in which the
stronger the trait relationship between two variables, the less the
impact of common method on their relationship, and (b) a
multiplicative relationship in which the stronger the relationship
between traits, the more it is augmented by common method variance.
Their analyses of several MTMM data sets were in general more
consistent with a multiplicative relationship between traits and
methods than with either an additive or an inverse relationship. On
the basis of their analyses, they questioned whether an additive
effects model from factor analysis is appropriate for MTMM matrices.
The position of Campbell and O'Connell (1967) has been refined by
Browne (1984) and others. Cudeck (1988) provided an illustration
contrasting additive and multiplicative models as well as presenting
approaches for assessing whether or not effects combine in
multiplicative fashion. Because those approaches fall outside the set
of structural equation methods described in this text, they are not
discussed here. SEM researchers should, however, consider these other
approaches as important alternative methodologies for MTMM data.
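
The contrast between the two ways of combining can be shown with a toy
calculation. The sketch below is a deliberately simplified
illustration and does not reproduce the models of Browne (1984) or
Cudeck (1988): the additive version adds a constant method component
to same-method correlations, whereas the multiplicative version shrinks
different-method correlations by a constant factor. All numerical
values and function names are hypothetical.

```python
TRAIT_CORRS = (0.2, 0.4, 0.6)   # hypothetical trait correlations

def additive(phi, trait_loading=0.8, method_loading=0.4):
    """Return (heteromethod r, monomethod r) when method adds a constant."""
    hetero = trait_loading ** 2 * phi
    return hetero, hetero + method_loading ** 2

def multiplicative(phi, method_corr=0.7):
    """Return (heteromethod r, monomethod r) when method scales the trait r."""
    return method_corr * phi, phi

def method_increments(model):
    """Monomethod minus heteromethod correlation for each trait correlation."""
    return [mono - hetero for hetero, mono in map(model, TRAIT_CORRS)]

additive_gain = method_increments(additive)
multiplicative_gain = method_increments(multiplicative)
print([round(g, 3) for g in additive_gain])
print([round(g, 3) for g in multiplicative_gain])
```

Under the additive version, sharing a method raises every correlation
by the same amount; under the multiplicative version, the boost from a
shared method grows with the strength of the trait relationship, which
is the pattern described in alternative (b) above.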

Summary

This chapter has presented how measurement error produces types of
path models that go beyond path analysis. First, the consequences of
random and nonrandom error were discussed. Random error in dependent
variables reduces the R^2 but does not bias unstandardized path
coefficients. Random error in independent variables reduces
relationships in the bivariate case but has more complex and
unpredictable effects in the multivariate case. Nonrandom error leads
to fundamental problems in estimation when "normal" regression
approaches are used. Alternative approaches are needed. Second, method
variance, specifically as it co-occurs along with trait variance, was
discussed, and the logic of MTMM analysis was presented. For SEM
approaches, traits and methods are assumed to combine additively.

Chapter Discussion Questions

1. What again is the rationale for focusing on covariance rather than
correlation matrices? Are there trade-offs?
2. What are some of the basic tests that you would need to do to check
out the violations of assumptions such as "too much nonrandom error"?
3. Standardized B's are understood to be b/SE. In the measurement
error example, "standardized" seemed to be used in a different sense.
Is that so?
4. Would it be advantageous for methods to correlate?
5. MTMM was first used in the 1950s. Is it not used any longer?

EXERCISE 5.1

Elusive Constructs

Individually and, if possible, then in groups, brainstorm about
constructs that have been difficult to assess but that might be
interpretable once method variance is taken out and other trait
dimensions are separately extracted.

Recursive and Longitudinal Models

Where Causality Goes in More Than One Direction and Where Data Are
Collected Over Time

Up to this point, there has been little discussion of ways of
analyzing structural model data where the arrows in models do not go
in a single direction and where there is repeated assessment of
particular measures across time. This chapter focuses on the analysis
of such data because they contribute a unique piece to the
understanding of structural equation approaches. In both cases, data
cannot be analyzed satisfactorily using the path analysis/ordinary
least squares techniques described thus far. Models with feedback
loops cannot be analyzed by ordinary regression analysis because the
assumption of independence of errors is violated. Data collected
repeatedly from a single sample over time introduce a new set of
problems (e.g., growth over time, identification), concerns, and
opportunities. Finally, as is elaborated in detail in this chapter,
the two approaches are linked because, to the extent that the
multidirectional processes occur across time, modeling processes
across time can allow multidirectional causal influence within a
unidirectional flow model.

Models With Multidirectional Paths

In the structural equation literature, models in which the causal
arrows flow in more than one direction are called nonrecursive models.
In contrast to path analysis models, nonrecursive models may not be
uniquely solvable, even in instances in which the degrees of freedom
suggest overidentification. The first part of this chapter discusses
nonrecursive models and then covers tests that can be used to assess
whether or not nonrecursive models can be uniquely solved.
Up to this point, all the models that have been introduced have had
causality flowing only in a single direction. In other words, there
always is a "downstream" flow to the models. By contrast, in
nonrecursive models causation does not follow such a straightforward
path. The models may include feedback loops (A → B → C → A) through
which causality turns back on itself, reciprocal causal relationships
(see Figure 6.1) in which two or more variables cause each other
simultaneously, or even both. Because the notion of simultaneous
causation is both difficult to envision and somewhat controversial, an
alternative way in which to think about simultaneous causation models
is as illustrated in the lagged model of Figure 6.1, which represents
situations in which two or more variables continuously cause each
other over some time period.
Logic of Nonrecursive Models

Although nonrecursive models have been used quite frequently in the
social sciences, researchers should be sure that in fact their
nonrecursive models really are nonrecursive. In many instances, it
seems that researchers develop models based on the limitations of
their data rather than on the underlying theory, for example, testing
a nonrecursive model because the data that they have available are
cross-sectional rather than longitudinal.
One critical principle to consider during model development is
the principle of finite causal lag. This principle states that any cause
produces an effect on a second variable after some (finite amount of)
time has passed; thus, there is a lag from cause to effect. The lag can
be very short, as, for example, an eyeblink response to a puff of air
(cause: air puff; result: eye blink), but nonetheless there is a lag. As a
consequence, the variable that is caused becomes different across the
lag time interval, so if it also causes the variable that has caused it and
the two causes (i.e., measures) are assessed at the same time, then it
actually is affecting a later version of that variable rather than the
causal version. In other words, the variable that is being caused is
different from the variable that is the cause, for time has to pass for
a cause to produce an effect. Note that it is not possible to justify a
model of reciprocal causality by arguing that the variables in the
bidirectional relationship do not change, for then there can be no
causal effect.

Figure 6.1. Reciprocal Causation and Lagged Causation Models (panels: Reciprocal; Lagged, Time 1 to Time 2)

In many instances, an alternative that may be more accurate than
reciprocal causation is a lagged, cross-causal model often called a
cross-lag panel model (the topic of the next section of this chapter).
The two alternatives are presented in Figure 6.1.
If the logic presented in the preceding paragraphs is at all
persuasive, then readers may be wondering whether there really are
nonrecursive models or whether models always should attempt to cross
time. Here one gets into disciplinary as well as individual differences
in perspectives. Some researchers will take the preceding arguments as
defining fact and argue that there never are reciprocal causation
models because bidirectional causation really is the lagged model of
Figure 6.1. Others take equally strong positions in support of
reciprocal causation models, for example, arguing that a variable
measured at a single point in time is the aggregation of an array of
influences from across time and, consequently, that it can be caused by
a variable that it causes. For processes of continuous bidirectional
causation and fairly high stability of the variables involved, little, if
anything, is likely to be lost by modeling the process as nonrecursive.
In fact, if the causal processes are continuously ongoing, then it might
produce exactly the same outcome as would a unidirectional lagged
model but without having to collect longitudinal data. Another
argument is that the reciprocal causation model can be used to "test"
competing models of causation, with the expectation that the model
will separate cause from effect and typically leave a recursive model
once the primary cause is identified.
For an illustration of nonrecursive models, I return again to the
example of the relationship between acceptance by peers and school
achievement. Both peer relationships and student achievement develop
over time. From my perspective, the best way in which to model such
relationships probably is longitudinally, for example, through
measures collected at the beginning and again at the end of a school
year. Alternatively, however, one could argue that at any point in time
each of them is an aggregation of a series of influences that have
occurred across time and that modeling them as reciprocally related
would pick up the ongoing processes of change and influence (e.g.,
Maruyama & McGarvey, 1980).
Regardless of one's views about nonrecursive models, they are an
important part of structural equation modeling (SEM) because (a)
causal processes cannot be restricted to one-directional causation and
(b) thinking about alternative ways of modeling bidirectional
causation is integral to accurate model development. Without the logic
of feedback and reciprocal relationships, path models become much
weaker methodological approaches. Finally, if only cross-sectional
data are available, then the only way in which to represent
bidirectional relationships is to use reciprocal causation models.

Estimation of Nonrecursive Models


Estimation of path coefficients in nonrecursive models differs from
path analysis in two important ways. First, basic (i.e., ordinary least
squares) regression approaches do not work. Second, model
identification becomes a critical issue. For example, in the top part of
Figure 6.1, there is a single relationship between two variables, but
there are two paths to estimate, making that model underidentified.
Furthermore, even if the number of relationships (correlations or
covariances) were to be made greater than the number of paths by
adding other variables, the assumption of independence of residuals is
violated. Specifically, if A causes B and if B causes A as in Figure 6.1,
then A's residual is not independent of B's residual.
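
A short derivation may help show why ordinary least squares breaks down for a reciprocal pair. It is only a sketch: the path labels p and q, the residual names, and the assumption that the two residuals are uncorrelated with each other (and that pq is not equal to 1) are introduced here for illustration and are not taken from Figure 6.1.

```latex
\begin{align*}
A &= pB + e_A, \qquad B = qA + e_B \\
B &= q(pB + e_A) + e_B
  \;\Longrightarrow\;
  B = \frac{q\,e_A + e_B}{1 - pq} \\
\operatorname{Cov}(B, e_A) &= \frac{q\,\operatorname{Var}(e_A)}{1 - pq} \neq 0
  \quad \text{whenever } q \neq 0
\end{align*}
```

Under these assumptions B, which is a predictor in the equation for A, is correlated with that equation's residual, so a simple regression of A on B would not be expected to recover p.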
Once identification is established (an approach for assessing
identification is addressed in the latter part of this section), to solve
for path models using regression approaches one needs to use
multistage least squares techniques. Such approaches are not treated in
detail here for two reasons (but interested readers can see, e.g., Kenny
[1979]). First, nonrecursive models can be handled routinely within
the general framework for latent variable SEM. Second, there is little
carryover from regression approaches for estimating parameters in
nonrecursive models to latent variable structural equation approaches
to those models. In other words, once one understands how to do
latent variable SEM, which will be addressed later in this book, there
is no need to learn a multistage regression approach for solving
nonrecursive models. For readers interested in understanding how to
solve for such models using regression approaches, a brief description
follows. (Some readers may be interested in knowing that SEM
computer programs [e.g., LISREL] may generate the initial estimates
of parameters to be estimated by using a variation of multistage least
squares techniques.)
In regression approaches, the reciprocally related variables first
are each separately regressed on the full array of predictor variables.
Predicted scores for them are calculated. Those predicted scores are
then included in place of the original endogenous variable in the
regression equations for predicting the other endogenous variable(s).
Thus, the regression analyses have been done in two stages: first,
regressing each endogenous variable in a reciprocal relationship on
all exogenous variables and, second, solving for the structural paths
by including the predicted score for the endogenous variable in a
regression analysis with all predictors that have direct paths to the
endogenous variable. Note that if all of the exogenous variables are
included in the equation, then a solution would not be possible
because the predicted score variables are perfect linear combinations
of the full set of exogenous variables. Such a model also would be
underidentified. Furthermore, if the variables excluded from the
equation (called instrumental variables) are unrelated to the predicted
score variable, then the same collinearity problems appear.
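
The two stages just described can be sketched in Python. This is a minimal illustration rather than a recommended estimation routine: it assumes NumPy, simulated standardized data, and hypothetical path values (0.4 for the path from Y2 to Y1 and 0.3 for the path from Y1 to Y2), with X3 excluded from the Y1 equation and X1 excluded from the Y2 equation so that each equation has an instrument. The function names are invented for the example.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients for y regressed on the columns of X (no intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate(n=100_000, a3=0.4, a7=0.3, seed=0):
    """Simulate a reciprocal model with hypothetical path values.

    Y1 depends on X1, X2, and Y2 (X3 excluded); Y2 depends on X2, X3, and Y1
    (X1 excluded). Residuals are independent of each other.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))                  # exogenous X1, X2, X3
    e = rng.normal(size=(n, 2))                  # residuals for Y1, Y2
    G = np.array([[0.5, 0.5, 0.0],               # paths from X to Y1
                  [0.0, 0.5, 0.5]])              # paths from X to Y2
    B = np.array([[0.0, a3],                     # Y2 -> Y1
                  [a7, 0.0]])                    # Y1 -> Y2
    # Structural form Y = X G' + Y B' + e solved for Y (the reduced form).
    Y = (X @ G.T + e) @ np.linalg.inv(np.eye(2) - B.T)
    return X, Y

def two_stage(X, Y):
    """Two-stage least squares for the two structural equations."""
    # Stage 1: regress each endogenous variable on all exogenous variables
    # and compute predicted scores.
    Y_hat = X @ ols(X, Y)
    # Stage 2: replace the other endogenous variable with its predicted score
    # and include only the predictors that have direct paths.
    Z1 = np.column_stack([X[:, 0], X[:, 1], Y_hat[:, 1]])   # Y1 on X1, X2, Y2-hat
    Z2 = np.column_stack([X[:, 1], X[:, 2], Y_hat[:, 0]])   # Y2 on X2, X3, Y1-hat
    return ols(Z1, Y[:, 0]), ols(Z2, Y[:, 1])

X, Y = simulate()
eq1, eq2 = two_stage(X, Y)
# Naive single-stage regression of Y1 on X1, X2, and observed Y2, for comparison.
ols_eq1 = ols(np.column_stack([X[:, 0], X[:, 1], Y[:, 1]]), Y[:, 0])

print("2SLS estimate of Y2 -> Y1:", round(float(eq1[2]), 3))
print("2SLS estimate of Y1 -> Y2:", round(float(eq2[2]), 3))
print("OLS  estimate of Y2 -> Y1:", round(float(ols_eq1[2]), 3))
```

With simulated data of this kind, the two-stage estimates should land near the hypothetical values used to generate the data, whereas the single-stage estimate should be noticeably off, which is the sense in which basic regression "does not work" for reciprocal paths.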
At this point, readers may be wondering what happens to any
covariation between the reciprocally related variables that is not
shared with the exogenous variables. It certainly is not desirable to
assume that their relationship is tied totally to exogenous variables.
The way in which that issue is resolved in multistage estimation is that
the residuals between reciprocally related variables typically are
specified as covarying. Residual covariation picks up relationships that
exist over and above relationships with exogenous variables.
The multistage least squares approaches need to be put into a
broader context. Whether or not one follows the logic about two-stage
least squares approaches is relatively unimportant, for nonrecursive
models can be solved using the general linear model that is
used in latent variable SEM. The latent variable SEM approach
handles nonrecursive models in the same way as it does recursive
ones. Furthermore, because the solution is a full information one,
specifying covariation between residuals of reciprocally related
variables is not necessary. Residuals should be specified as covarying
only if there is a substantive reason for believing that there is an
additional source of common variance between the two variables beyond
their reciprocal causal relationship.
Finally, as one thinks about nonrecursive models from the
framework of path modeling, it is important to remember that
decomposition of effects works differently in nonrecursive models. For
example, the matrix approach for solving for indirect effects will not
work because the matrix used (see Chapter 3) never goes to zero. An
alternative that works for nonreciprocal relations is to use a modified
tracing rule approach (see Kenny, 1979) in which the result from the
tracing rule is divided by the quantity (1 - ab), where a and b are the
paths between the two feedback variables. Because the current
versions of structural equation programs compute indirect effects for
models, it seems sufficient for readers to understand the logic
underlying decomposition of effects. Therefore, the mechanics of
decomposition for nonrecursive models are not explained further (but
interested readers can see Kenny [1979]).
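
A small numeric sketch may make the role of the quantity (1 - ab) more concrete. It assumes NumPy and three hypothetical path values (a for the path from Y2 to Y1, b for the path from Y1 to Y2, and c for a path from an exogenous X to Y1); none of these values comes from the text. It shows that successive powers of the path matrix shrink without ever reaching zero, and that summing the whole series gives the same result as dividing by (1 - ab).

```python
import numpy as np

a, b, c = 0.4, 0.3, 0.5   # hypothetical paths: Y2 -> Y1, Y1 -> Y2, X -> Y1

# B[i, j] holds the direct effect of endogenous variable j on endogenous variable i.
B = np.array([[0.0, a],
              [b, 0.0]])

# Each power of B traces paths of one more step around the feedback loop.
# In a recursive model some power would be exactly zero; here none is.
powers = [np.linalg.matrix_power(B, k) for k in range(1, 9)]
never_zero = all(np.any(p != 0) for p in powers)

# The infinite series I + B + B^2 + ... converges to (I - B)^{-1} when |ab| < 1.
total = np.linalg.inv(np.eye(2) - B)
partial = np.eye(2) + sum(powers)

# Total effect of X on Y1: the direct path c plus every additional trip
# around the loop, which equals c / (1 - ab).
total_x_on_y1 = c * total[0, 0]

print("powers never reach zero:", never_zero)
print("total effect of X on Y1:", round(float(total_x_on_y1), 4))
print("c / (1 - ab):           ", round(c / (1 - a * b), 4))
```

The matrix inverse is used here only as a convenient way to sum the series; it is offered as one way to see the logic, not as a description of how any particular SEM program computes indirect effects.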

Model Identification

Unlike recursive path models without measurement error that always
will be identified, there is no guarantee that a unique solution can be
obtained for nonrecursive models. Some nonrecursive models can be
underidentified and therefore not solvable. To ensure identification,
certain conditions need to be met. Those conditions can be met when
some of the predictor variables do not have direct paths to certain
endogenous variables. The term frequently used to describe such
variables is instrumental variable or instrument. A predictor variable
serves as an instrument for an endogenous variable and helps to
identify its equation, provided that variable has a direct path to other
endogenous variables but not to the variable of interest. For a model
to be identified, each equation needs to have as many instruments
(variables without direct paths) as there are variables in reciprocal
relationships. Furthermore, as is explained in the next section of this
chapter when the rank condition for identification is described, the
instruments have to be distributed in particular ways for each
dependent variable to have a solvable equation.
Consider the illustration in Figure 6.2. How could we tell
whether or not it is identified? Consider first estimating the model
including the dashed line path. There are five variables (therefore,
5 × 4 / 2 = 10 degrees of freedom) and exactly 10 paths, suggesting
that the model might be identified. Furthermore, $X_3$ is an instrument
for the $Y_1$ equation. But notice also that all three of the exogenous
variables have arrows directly to $Y_2$, which means that that
endogenous variable has no instruments, and therefore its equation is
not identified. Once the dashed line path is dropped, however, $X_1$
becomes an instrument for the equation of $Y_2$ and the model becomes
identified.
Figure 6.2. Nonrecursive Path Model to Illustrate Model Identification

There are two conditions that must be met to ensure
identification. Before presenting these two conditions, however, it
should be noted that, particularly for complex models, ensuring
identification may be very difficult. In principle, however, the
computer programs that analyze structural equation models should
provide tests for model identification. If the proposed model is
underidentified, then the program should not be able to generate a
complete solution. Specifically, calculation of confidence intervals
requires inverting the matrix of estimates. A matrix called the
information matrix (see, e.g., Jöreskog & Sörbom, 1988), which is based
on the matrix of estimates, should be singular and noninvertible for an
underidentified model, with the result that confidence intervals cannot
be produced for the estimated parameters. Although this should provide
a surefire test of model identification, there is waffling about
identification because exceptions seem to have been found. Therefore,
readers concerned about complex models are referred to the works of
Bollen (e.g., 1989) and his colleagues as well as Rigdon (1995).
The treatment of identification issues for manifest variable
models that I find most understandable is the one presented by
Namboodiri, Carter, and Blalock (1975, pp. 502-505). I will try to
model my description after theirs. The first condition, which they
called the order condition, is a necessary but not a sufficient condition
for identification. It requires that for any system of $k$ endogenous
variables (which therefore means that there will be $k$ equations, one
for each endogenous variable), a particular equation will be identified
only if at least $k - 1$ variables are left out of that equation (i.e.,
their regression weights are set to 0). For the Y variables in Figure 6.2,
with two endogenous variables the two equations are

$$Y_1 = a_1 X_1 + a_2 X_2 + 0 \cdot X_3 + a_3 Y_2 + e_1 \tag{6.1}$$

$$Y_2 = a_4 X_1 + a_5 X_2 + a_6 X_3 + a_7 Y_1 + e_2 \tag{6.2}$$

The residuals can be ignored because they go to the left side of the
equation, whereas the dependent variables join the other variables on
the right side of the equation (the signs on the coefficients also are
inconsequential and can be ignored), yielding

$$-e_1 = a_1 X_1 + a_2 X_2 + 0 \cdot X_3 - 1 \cdot Y_1 + a_3 Y_2$$

$$-e_2 = a_4 X_1 + a_5 X_2 + a_6 X_3 + a_7 Y_1 - 1 \cdot Y_2$$

In terms of the order condition, each equation needs to have (2 - 1)
variables omitted from the equation. The first equation is fine because
$X_3$ is omitted, whereas the second equation fails to meet the order
condition. Once the $a_4$ coefficient is set to zero, that equation also
meets the order condition for identification.
The second condition, more restrictive than the first and both a
necessary and a sufficient condition for identification, is called the
rank condition. Given a system of $k$ dependent variables, for the rank
condition to be satisfied for a particular equation, it must be possible
to form at least one nonzero determinant of rank $k - 1$ from the
coefficients of the variables omitted from that equation. Using the last
set of the preceding equations, with the residuals isolated from all
other variables, follow these three steps.

1. Form a matrix from the coefficients (signs again can be ignored). For
the example, with the coefficient of $X_1$ in the $Y_2$ equation ($a_4$)
set to 0, it would be as follows:

$$
\begin{array}{c|ccccc}
    & X_1 & X_2 & X_3 & Y_1 & Y_2 \\ \hline
Y_1 & a_1 & a_2 & 0   & 1   & a_3 \\
Y_2 & 0   & a_5 & a_6 & a_7 & 1
\end{array}
$$

2. To test for identification of a particular equation, delete from the
matrix (a) the row of that equation and (b) all columns that do not have
a zero in the row of the equation of interest.
3. Find a nonzero determinant of rank $k - 1$ from the remaining values.

Concretely, for $Y_1$ the entire first row (the $Y_1$ row) is deleted, as
are the first ($X_1$), second ($X_2$), fourth ($Y_1$), and fifth ($Y_2$)
columns, leaving [$a_6$] (the coefficient of $X_3$ in the $Y_2$
equation), which happens to be a 1 × 1 matrix with a nonzero determinant
unless $a_6$ happens to be exactly 0. For $Y_2$, the entire second row is
deleted, as are the second through fifth columns, leaving [$a_1$] (the
coefficient of $X_1$ in the $Y_1$ equation), another 1 × 1 matrix with a
nonzero determinant unless $a_1$ is exactly 0. Because both $a_6$ and
$a_1$ are being estimated, they are expected to be nonzero. If so, the
modified Figure 6.2, with the dashed path from $X_1$ to $Y_2$ omitted,
is an identified model. As suggested earlier, $X_1$ serves as an
instrument for $Y_2$ and $X_3$ as an instrument for $Y_1$.
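
The three steps can also be sketched as a short Python check. This is an illustrative sketch that assumes NumPy; the function name is invented, and the numeric entries are arbitrary nonzero stand-ins for the estimated coefficients (only the pattern of zeros matters). Columns are ordered X1, X2, X3, Y1, Y2, and rows are the Y1 and Y2 equations.

```python
import numpy as np

def check_identification(coef, n_endog):
    """Return (order_ok, rank_ok) for each equation (row) of a coefficient matrix.

    coef: array with one row per equation; a zero marks an omitted variable.
    n_endog: number of endogenous variables, k.
    """
    k = n_endog
    results = []
    for i in range(coef.shape[0]):
        omitted = np.flatnonzero(coef[i] == 0)       # columns with a zero in row i
        order_ok = omitted.size >= k - 1             # order condition
        # Steps 2 and 3: drop row i, keep only the omitted columns, check rank.
        others = np.delete(coef, i, axis=0)[:, omitted]
        rank_ok = order_ok and np.linalg.matrix_rank(others) >= k - 1
        results.append((order_ok, bool(rank_ok)))
    return results

# Model including the dashed path (X1 -> Y2): the Y2 equation omits nothing.
with_dashed = np.array([[0.5, 0.4, 0.0, 1.0, 0.3],
                        [0.2, 0.6, 0.7, 0.3, 1.0]])

# Modified model with the dashed path dropped.
modified = np.array([[0.5, 0.4, 0.0, 1.0, 0.3],
                     [0.0, 0.6, 0.7, 0.3, 1.0]])

# Both equations omit the same variable (X3), so they "share" one instrument.
shared = np.array([[0.5, 0.4, 0.0, 1.0, 0.3],
                   [0.2, 0.6, 0.0, 0.3, 1.0]])

for name, m in [("with dashed path", with_dashed),
                ("modified", modified),
                ("shared instrument", shared)]:
    print(name, check_identification(m, n_endog=2))
```

A check of this kind works on the pattern of fixed zeros only; it does not address the empirical concern, raised in the text, that estimated coefficients or instruments may be too weak to be useful in practice.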
A few final points about identification are in order. First, if the X
variables are highly intercorrelated, then it may make little sense to
argue that one X can readily be dropped from each equation given
that they share much common variance and are not easily
distinguishable one from another. Ideally, instruments are basically
independent of other exogenous variables. An important point is that
although instruments are essential for attaining model identification,
in some instances it may be very difficult to find variables that meet
the requirements of good instruments. Second, what if the two
endogenous variables "shared" the same instrument, for example, if in
Figure 6.2 we were to put $a_4$ (the path from $X_1$ to $Y_2$) back into
the model and remove $a_6$ (the path from $X_3$ to $Y_2$)? The answer is
that the rank condition no longer could be satisfied because the 1 × 1
matrices would be 0. The important point here is that each endogenous
variable in a reciprocal relationship needs its own separate
instruments.

Longitudinal Models

The remainder of this chapter focuses on stability and change of
variables and relationships between variables across time. The focus
is not on changes in scores of individuals. That type of change is
modeled differently (e.g., Willett & Sayer, 1994). Readers who are
trying to look at both stability of relationships and changes in mean
levels should see, for example, McArdle and Aber (1990).
The data discussed in this chapter most typically are called
longitudinal data. They also have been described as panel data or
even cross-lag panel data. Although the terms sometimes are used
almost interchangeably, one distinction that can be made between
them is that the former refers to any set of data in which measures
are collected at different points in time even if no measure is collected
more than once, whereas the latter two typically are reserved for
instances in which some of the same measures are collected at two or
more different points in time. It is the instances when measures are
collected more than once that warrant special description, so the term
"panel data" is used in this chapter to describe those instances.
The first part of this section focuses on the logic underlying
longitudinal approaches. It builds on the discussion from the
preceding section on nonrecursive models. It includes an introduction
to the terminology for analysis of panel data, a discussion of
identification issues for panel models, and a review of the unique
nature of longitudinal data. Then, in the second part of this section,
manifest variable panel analysis approaches are discussed.
Logic Underlying Longitudinal Models

Of most importance to users or prospective users of structural
equation methodologies is the logic underlying structural equation
analysis of panel data. This logic provides a perspective that refines
and extends the logic of path analysis by explicitly introducing notions
of stability and change. Without those notions, resulting models for
panel data are unlikely to accurately explain causal processes.
Longitudinal models are important for users of structural
equation methodologies because (a) they add into structural modeling
notions of stability and change, (b) they provide the best way of
modeling reciprocal causation to researchers who are persuaded by
the concept of finite causal lag, (c) they provide an additional
perspective for thinking about model identification, and (d) the
language used is explicit in separating particular types of relationships
and types of residual covariation from other types. This last point,
which is covered next, is particularly useful to researchers when they
attempt to explain their models to others.
In contrast to the importance of the logic, most of the
methodologies that were developed to analyze longitudinal data have
major shortcomings that make them less than appealing (e.g., Rogosa,
1980). The methods include both analysis of panel correlations
(e.g., Calsyn & Kenny, 1977) and path regression approaches (e.g.,
Shingles, 1976). They are variants of two-variable, two-wave models.
All have problems and shortcomings due to assumptions of nonrandom
error and of causes not specified in the models. Once again,
an appropriate and flexible way in which to model such data is to use
latent variable SEM, for it can allow researchers to design models that
make realistic assumptions.
Terminology of Panel Models

Consider Figure 6.3, which is used to illustrate the terminology of
longitudinal analyses. The model in Figure 6.3 is a two-variable,
two-wave, longitudinal panel (path) model. Variables X and Y both
are measured at two points in time. For the moment, ignore the
directions of the paths and the fact that Figure 6.3 is a variant of a
regression model. Focus instead on the different types of zero-order
relationships, or correlations, between variables. In the language of
cross-lag panel analysis, the relationships between the two exogenous
variables ($X_1$ and $Y_1$) and between the two endogenous variables
($X_2$ and $Y_2$) both are called synchronous correlations; they
represent relationships between two different variables at a single
point in time. In purely cross-sectional models, all correlations are
synchronous. The $X_1$-$X_2$ and $Y_1$-$Y_2$ relationships are called
autocorrelations, or stabilities, reflecting the amount of change in a
single variable across time. The $X_1$-$Y_2$ and $Y_1$-$X_2$
relationships are the lagged or cross-lagged (because they cross between
variables) correlations. Finally, the paths between the residuals (e's),
which typically are not included as part of a panel analysis model, are
residual covariances, or autocorrelated residuals, sometimes
generically called correlated errors. This last type of path reflects the
fact that when a measure is administered at different times, there is
the likelihood of substantial variance being shared across the
different administrations of that measure due not to the underlying
construct that is assessed but rather to particulars of the measure that
is administered (i.e., measure-specific variance).
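
The vocabulary can be tied to a correlation matrix with a short Python sketch. It assumes NumPy and simulated two-wave data generated from hypothetical coefficients; the variable names follow the text, but the numbers and the helper function are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical two-wave data: X1 and Y1 at Time 1, X2 and Y2 at Time 2.
x1 = rng.normal(size=n)
y1 = 0.5 * x1 + rng.normal(size=n)
x2 = 0.7 * x1 + 0.1 * y1 + rng.normal(size=n)
y2 = 0.6 * y1 + 0.2 * x1 + rng.normal(size=n)

names = ["X1", "Y1", "X2", "Y2"]
R = np.corrcoef(np.vstack([x1, y1, x2, y2]))   # 4 x 4 correlation matrix

def kind(a, b):
    """Classify a pair of names such as 'X1' and 'Y2' in panel terminology."""
    same_variable = a[0] == b[0]
    same_time = a[1] == b[1]
    if same_time:
        return "synchronous"
    return "autocorrelation (stability)" if same_variable else "cross-lagged"

# Four variables give 4 * 3 / 2 = 6 distinct correlations.
labels = {}
for i in range(4):
    for j in range(i + 1, 4):
        pair = (names[i], names[j])
        labels[pair] = (kind(*pair), round(float(R[i, j]), 2))

for pair, (label, r) in labels.items():
    print(pair, label, r)
```

The six labeled correlations are all the information a two-variable, two-wave design provides, which is worth keeping in mind when counting the paths a model asks those correlations to support.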

EXERCISE
Given that you should now be familiar with identification, is
Figure 6.3 identified?

Figure 6.3. Two-Variable, Two-Wave Panel Model

Identification

Issues of identification tie back into the exercise just presented. The
answer to that question is no, for there are only six correlations but
seven paths to solve. The important point here is that some panel
models, despite being recursive, still may not be identified. The
reason why identification becomes an issue is that repeated
assessment of the same measure produces two sources of common
variance, one due to the underlying construct and the second due to
measure-specific variance. This latter source of common variance
usually would be part of the unique variance of the measure; however,
because the same measure is collected twice, that variance becomes
part of the common variance of the measure. The model would be
identified if a researcher were willing to allow the two sources of
common variance within the measure to be lumped together, but
combining the two yields an inaccurate assessment of the stability of
the underlying variable. To build models that are closer to the
processes that occur, it is necessary for researchers to consider
whether or not autocorrelation exists across time between the
residuals when administering measures repeatedly. If it is likely to
exist, then multiple-indicator models are needed. They would be
identified even if measure-specific variance were included.
Stability

Without detracting from the importance of terminology or
identification issues, the most important concept added by panel
models is stability of a variable. To illustrate, imagine that some
variable called Z is perfectly stable. By definition, then, Z measured at
Time 1 ($Z_1$) will have no causes other than itself, for it is perfectly
determined by itself from an earlier point in time, here called Time 0
($Z_0$). Yet, if $Z_1$ is modeled in a cross-sectional model as an
endogenous variable, as it is with respect to $Z_0$ (remember that
cross-sectional models would have data collected at only one point in
time, so the earlier time point version of that variable, $Z_0$, cannot
be a predictor because it would not have been assessed), then other
variables either correlated with $Z_0$, causing $Z_0$, or caused by
$Z_0$ all could appear to be causes of $Z_1$. All that needs to happen
for the possibility of incorrect inferences to occur is that some other
variables that are related to $Z_0$ need to be placed in a model
causally prior to $Z_1$, that is, with arrows pointing directly to it.
Even though placing them causally prior to $Z_1$ would result in a
model that is misspecified, that misspecification could easily go
undetected. Such misspecification could occur, for example, when there
are arguments for bidirectional causation and the other variables are
collected temporally prior to $Z_1$.
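
A brief simulation can illustrate how an omitted, perfectly stable variable lets another variable pass for a cause. It is a sketch under stated assumptions: it uses NumPy, a hypothetical predictor W that is caused by $Z_0$ with an invented coefficient of 0.6, and simple regression slopes as a stand-in for path coefficients.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

z0 = rng.normal(size=n)                   # Z at Time 0
z1 = z0.copy()                            # perfectly stable: Z1 is fully determined by Z0
w = 0.6 * z0 + 0.8 * rng.normal(size=n)   # hypothetical variable caused by Z0

def slopes(y, *predictors):
    """Regression slopes of y on the given predictors (intercept included, not returned)."""
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

b_cross = slopes(z1, w)        # cross-sectional model: Z0 was never assessed
b_panel = slopes(z1, z0, w)    # panel model: Z0 included as a predictor

print("W -> Z1 with Z0 omitted: ", round(float(b_cross[0]), 3))
print("Z0 -> Z1 with Z0 present:", round(float(b_panel[0]), 3))
print("W -> Z1 with Z0 present: ", round(float(b_panel[1]), 3))
```

In data built this way W has no effect on $Z_1$ at all, yet it should show a sizable slope when $Z_0$ is left out and essentially none once $Z_0$ is in the equation.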
As an example of issues of stability, consider a hypothetical
situation in which antecedents of school achievement are sought. The
importance of identifying variables that could improve achievement
is sufficiently motivating to lead researchers to look widely for
predictors of achievement. To illustrate, would it not be wonderful
to find ways of markedly improving the achievement of children who
are struggling in school? In thinking about this situation, remember
that an intervention that improves the achievement of all children but
preserves their relative achievement levels in comparison to one
another would neither affect stability nor appear to be a cause in a
structural model for the intervention sample. The means would
change, but the covariances would be unaffected. Only in a
multisample study in which treatment is a dummy variable would the
effect be apparent. Said differently, the processes being examined look
at relationships, not mean (level) shifts. For SEM models to identify
changes, the relative achievements of students would have to change,
for example, as an intervention raised the achievement of children
who are lowest in achievement. If everyone is affected by an
intervention, then dramatic changes in means could be invisible in
structural models.
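
The point about means and covariances can be checked numerically. The sketch below assumes NumPy and invented achievement scores; the 8-point gain and the median split used to define "lowest in achievement" are hypothetical choices made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

pre = rng.normal(50, 10, size=n)               # hypothetical earlier achievement
post = 0.9 * pre + rng.normal(0, 4, size=n)    # later achievement, highly stable

# Intervention A: every child gains the same 8 points (relative standing preserved).
uniform_gain = post + 8.0

# Intervention B: only children below the median of earlier achievement gain 8 points.
targeted_gain = post + np.where(pre < np.median(pre), 8.0, 0.0)

r_base = np.corrcoef(pre, post)[0, 1]
r_uniform = np.corrcoef(pre, uniform_gain)[0, 1]
r_targeted = np.corrcoef(pre, targeted_gain)[0, 1]

cov_base = np.cov(pre, post)[0, 1]
cov_uniform = np.cov(pre, uniform_gain)[0, 1]

print("mean shift under uniform gain:", round(float(uniform_gain.mean() - post.mean()), 2))
print("stability, no intervention:   ", round(float(r_base), 3))
print("stability, uniform gain:      ", round(float(r_uniform), 3))
print("stability, targeted gain:     ", round(float(r_targeted), 3))
```

Adding a constant to every score changes the mean but leaves the covariance and correlation with earlier achievement untouched, whereas a gain concentrated among the lowest achievers alters relative standing and therefore the stability estimate.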
To search broadly for possible predictors while building a model
that seems realistic, researchers should sample an array of possible
predictors. Such predictors might include personality measures, peer
relations, teacher and parent ratings, and demographics (see, e.g.,
Maruyama [1977] for such a study). The researchers should frame
their work by recognizing that, at an aggregate level, achievement is
likely to be highly stable, for children who do relatively well in one
year by and large do relatively well in subsequent years as well. Yet,
if the data examined to explore antecedents of achievement were
cross-sectional and failed to include past achievement as a variable,
then the omission of past achievement might result in the emergence
of a number of "promising" predictors. Those promising predictors
are most likely to be variables strongly related to past achievement.
In effect, those promising predictors may just be correlated with or
caused by past achievement (which, in the preceding discussion, is the
$Z_0$ variable).
For researchers attempting to be sensitive to issues of stability
within the limitations of a cross-sectional design, there always is the
option of trying to collect longitudinal data at a single point in time
through retrospective reporting. For example, in a study of academic
performance, students may be asked about their cumulative grade
point averages prior to the present year; in a study of attitudes,
participants may be asked about how they thought they used to think
about some issue; and in a social status study, participants may be
asked about their past earnings or their families' social status. Such
data may allow a more realistic model to be tested for viability. They
are not, however, the same as multiwave longitudinal sampling, which
provides current data at each time period. Particularly for research in
areas like attitude assessment but also for areas like reporting of past
achievement, retrospective reporting gets distorted by current
perspectives. Other things being equal, it likely leads to greater
consistency across time than would be found if data were collected at
two or more different points in time.
The goal of this discussion is not to dissuade one from ever
attempting to collect retrospective data. Such data can be very
valuable if collected thoughtfully. Rather, it is to warn researchers
about limits of relying on retrospective reports to replace longitudinal
sampling.
A second potential type of problem or shortcoming can occur
when collecting retrospective or file data to supplement the current
time data. To the extent that the retrospective or file information
provides imperfect data about the underlying variable, the resulting
measures are unreliable. If, in the preceding achievement illustration,
the measure of past achievement is less than perfectly reliable, then
Z at the earlier time will not perfectly determine Z at the later time.
As a result, other predictors could appear to be important when in
fact they are not.
This problem is not unique to retrospective data collection. It also
can occur in panel data and is the same unreliability problem as was
discussed in Chapter 5.
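
The way an unreliable control variable lets other predictors look important can be worked through with standardized regression algebra. The sketch below is illustrative only: the correlations (.6 and .7), the reliability of .7, and the helper names `partial_beta` and `beta_for_x` are assumed for the example and do not come from the text. It supposes that current achievement Y depends only on true past achievement Z and that a second predictor X is merely correlated with Z.

```python
import math

def partial_beta(r_xy, r_xz, r_zy):
    """Standardized weight for X when Y is regressed on X and Z."""
    return (r_xy - r_xz * r_zy) / (1.0 - r_xz ** 2)

# Assumed (hypothetical) population correlations.
r_xz_true = 0.6                  # predictor X with true past achievement Z
r_zy_true = 0.7                  # true Z with current achievement Y
r_xy = r_xz_true * r_zy_true     # X relates to Y only through Z

def beta_for_x(reliability):
    # Random error attenuates each correlation involving the observed Z
    # by the square root of the reliability of that measure.
    shrink = math.sqrt(reliability)
    return partial_beta(r_xy, r_xz_true * shrink, r_zy_true * shrink)

perfect = beta_for_x(1.0)    # Z measured without error
fallible = beta_for_x(0.7)   # Z measured with reliability .7
print(round(perfect, 3), round(fallible, 3))
```

Under these assumed values, X receives no weight when Z is measured perfectly but a clearly nonzero weight once Z is measured with error, even though X has no causal role in the setup.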
Finally, the "flip side" or converse of stability is change or
variability. Low stabilities suggest that a variable is changing rapidly
or at least appreciably within the time interval studied. Although such
change might be viewed as an opportunity for research in that it could
allow many variables to exert causal influence, it may have other
meanings. One possible explanation to consider as causing low
stability across time is poor reliability. If the measures have low
reliability, then the variable can be problematic for any structural modeling.
A very different alternative explanation is that, due to some
process such as developmental changes in subjects, the variable as
assessed during the early time may not be the same variable as is
collected at the later time point. An example of the latter possibility
may be provided by assessment of mathematics skills among young
children. If one were to sample certain skills at two points in time,
then at the earlier time point a cluster of skills may be poorly
differentiated or even undeveloped, and might be unidimensional
aspects of general ability. By the later time, however, the cluster of
skills may be much more developed and differentiated, with the
consequence that the skills tap more than a single dimension. In such
circumstances, the stability of the construct should be relatively poor.
In conclusion, the idea of stability is an important one for
structural equation models. If stability of a variable across time is not
assessed accurately, then misinterpretations of causal impact can
readily occur. Variables that do not change can appear to be affected
by other variables. Variables that are measured unreliably will, when
modeled as causes, likely appear to have less of an impact than they
actually do and, when modeled as effects, likely appear to be
influenced by more variables than actually influence them.
The need to effectively model stability of constructs provides an
important reason to use panel designs. Because accurate modeling of
stability and reliability issues is almost impossible without using
multiple indicators, latent variable approaches are preferable. Those
approaches can effectively partition measures into variance
components. Finally, and a point that I return to later in this chapter,
longitudinal data provide a reason why models should work with
covariance rather than with correlation matrices, namely, to allow for
changes in variability across time.
Temporal Lags in Panel Models
The concept of finite causal lag introduced in the nonrecursive model
section has important implications for the understanding of panel
models. First, from one perspective, it is finite lag that leads to the
development of longitudinal models to examine reciprocal causation.
Second, and of much greater importance for the current discussion,
if influence occurs over a finite interval, then it is critically important
to have an a priori understanding about how long the causal lag
actually is. In the ideal world, two reciprocally related variables cause
each other at the same rate, so the only issue is to estimate the length
of a single lag interval. That is, the time that it takes for one to cause
the second is the same as the time it takes for the second to cause the
first. In such circumstances, all that is needed is to assess the two
variables across a time interval the same as (or slightly greater than)
the time lag. Figure 6.4 provides an illustration of causal processes in
which two variables influence each other across a time interval of 1
unit: 1 causes 2, 2 causes 3, and so forth. In such an instance, we
would want to ensure that the time lag selected is at least as long as
1 unit.
Accepting "slightly greater" is recommended based on a
consideration of the consequences of overestimating the causal interval
versus underestimating the causal interval. If the interval is
overestimated, then data will be collected too far apart. The cost here is that
the causal relationships will have "decayed" somewhat from their
maximums. By contrast, if the interval is too short, then the processes
will not yet have occurred and no effect should be apparent.

Figure 6.4. Multiwave Panel Model

Consider as an illustration of selecting appropriate causal lags the
diagram in Figure 6.4. In Figure 6.4, overestimating could occur if
we were to collect data at Time 1 and halfway between Times 2 and
3 (Time 2.5). We would be able to assess any influences that occurred
with a causal lag of one time interval, but those influences would be
reduced by any changes in the measures from Time 2 to Time 2.5.
The cost of selecting too long an interval depends on how fast
the variable changes; it may range from trivially underestimating to
missing virtually all of the effect. The latter possibility should be
relatively unlikely unless the processes occurred only once and did
not continue to repeat themselves across time. (The issue of stability
of causal processes is discussed in more detail later in this chapter.)
That is, in Figure 6.4, collecting measures at Times 1 and 4 would
underestimate the effect dramatically only if the stability of X and Y
were low across intervals and the processes occurred only from Time
1 to Time 2.
On the other hand, what if the interval selected were too short?
Looking again at Figure 6.4, imagine that we were to assess the
variables at Time 1 and halfway between Times 1 and 2 (Time 1.5).
The consequence would be that not enough time would have passed
for the causal processes to occur, so no causal impact would be
apparent or detectable. In other words, the cross-lagged paths would
effectively be zero, a definitely unappealing result for instances in
which causal impact occurs.
Further complicating longitudinal models is that, in many
instances, it is not possible to assume that the causal lags between
variables are the same. If they are not, then two different time lags
need to be estimated. If data were to be collected at only two points
in time, then selection of different length time lags tied to one of the
relationships could markedly change the inferences drawn either by
underestimating the effects of one predictor by selecting too long a
lag for that relationship or by missing the effects of the other predictor
by selecting too short a lag for that relationship. The potential for
incorrect inferences is particularly great when the goal is to talk about
preponderant influence of one variable on another. The one with the
stronger impact could change depending on the interval selected.
Even though differing lags complicate longitudinal research, there
are ways of building accurate models despite them. For example, if
the lags differed substantially, then three-wave data could be collected
tied to the different intervals to tap both causal processes.
In summary, collection of longitudinal data introduces new types
of problems and complexities that require careful thought before
collecting data, for in cross-sectional sampling researchers never have
to worry about what the temporal lag should be. At the same time,
however, as important as it is to make researchers articulate the length
of predicted causal lags, that articulation should not deter
individuals from longitudinal research; rather, it should only make them
explicitly state something that shaped their thinking and model
development. If it is new thinking, then their work was incompletely
developed and the model was not well thought out. In other words,
a normal part of thinking about relationships between any two
variables is to ask the following question: If the relationship is causal,
then how long does it take for one variable to affect the other? If
both variables affect each other, then the question has to be asked
twice.
Growth Across Time in Panel Models
Yet another important issue when developing panel models is the
issue of growth or change. This issue was mentioned earlier in this
chapter when the illustration of increasing complexity of mathematics
achievement in children was presented. When the same measures are
collected repeatedly, there are dangers in using standardization
because it removes differences in variability across time/occasions. In
other words, the methodology that allows for growth is SEM using
covariances, which focus on raw score relationships. By contrast,
correlations focus on standard deviation unit relationships, and
changes in the size of the standard deviations can result in apparent
strengthening or weakening of standardized relationships.
Said differently, for longitudinal models, there is only one
occasion in which use of standardized relationships makes logical sense,
namely, the occasion in which there is no change in variability of any
of the variables across time. In such a situation, analysis of
correlations provides results identical to analysis of covariances, which is
why analysis of correlations is acceptable for that situation. Given the
small likelihood of all the variances remaining unchanged across time
(and the temptation to decide that small, nonsignificant differences
between variances are "small enough" to be considered as equivalent
or unchanged), it seems advisable to ignore this "special case"
altogether and to always model longitudinal data using covariance
matrices. Furthermore, Cudeck (1989) pointed out that SEM approaches
are worked out for covariances, not correlations.
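
The effect of changing variability on standardized relationships can be seen with a small calculation. This is a minimal sketch under assumed values: the raw slope of .5, the residual variance of .75, the two predictor variances, and the function name `standardized_slope` are all hypothetical. The raw-score relationship is held constant across two occasions while the variance of the predictor grows.

```python
import math

def standardized_slope(b, var_x, resid_var):
    """Standardized slope implied by raw slope b in Y = b*X + e."""
    var_y = b ** 2 * var_x + resid_var
    return b * math.sqrt(var_x) / math.sqrt(var_y)

raw_slope = 0.5     # assumed identical at both occasions
resid_var = 0.75    # assumed residual variance of Y at both occasions

# Only the variance of X differs between the two occasions.
beta_time1 = standardized_slope(raw_slope, var_x=1.0, resid_var=resid_var)
beta_time2 = standardized_slope(raw_slope, var_x=4.0, resid_var=resid_var)
print(round(beta_time1, 3), round(beta_time2, 3))
```

With these assumed numbers the standardized coefficient rises from about .50 to about .76 although the raw-score slope never changes, which is the kind of apparent strengthening that analysis of covariances avoids.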
Stability of Causal Processes

As mentioned earlier in this chapter, stability of any causal processes
is an additional issue of importance for panel models. This type of
stability differs from stability of measures across time that was
discussed earlier in this chapter, for it refers to causal dynamics across
time rather than single variables across time. Stability of causal
processes means that the way in which some variable, X, affects a
second variable, Y, across one time interval is the same as its impact
on Y across a second time interval of the same length. Unless causal
processes are basically stable, longitudinal models can be misleading
and at best will tap processes specific to the particular intervals
sampled.
As an illustration, look back to Figure 6.4 and assume that the
figure accurately represents causal processes between X and Y. In that
figure, so long as the different coefficients from X to Y are of similar
magnitude as the different coefficients from Y to X, the processes stay
the same regardless of the starting point selected or of the particular
interval crossed. Panel data from Time 1 to Time 2 would yield the
same findings as would data from Time 4 to Time 5.
By contrast, if the pattern of true causal relationships (arrows)
were to differ across time periods, then the processes would be
unstable. The relationships identified by the analyses would differ
depending on the starting point selected. Because different processes
are occurring at differing time points, general statements about causal
dynamics are impossible. Such instability is particularly undesirable for
circumstances in which one wants to speak generally about causal dynamics.
On the other hand, modeling unstable processes may be very
attractive if the dynamics are predicted to vary across different intervals,
as might be predicted for developmental data or where an
intervention was implemented between two intervals. In such circumstances,
however, the time points selected for data collection are critical and
warrant clear justification.
Effects of Excluded Variables
One final point important for analysis of panel models is a point that
is true for all structural models but one that is made particularly
salient by repeated sampling of variables across time. That point is
that structural models assume a closed system, namely, that all
variables that are important are included in the model. A second way in
which to say the same thing is that no omitted variable should by its
inclusion change any of the paths in the model. Clearly, this
assumption is widely violated, for it assumes that researchers are able to start
at the end of a process of identifying important variables. That is, if
models had to be correctly specified before the first study was
conducted, then there would be little research. By contrast, for any
set of possible relationships, we come to understand causal processes
over time and through the accumulation of research. That research
often includes elements of trial and error.⁷
The important point for longitudinal models is that, due to
repeated sampling across time, the number of variables in a model
increases rapidly. Resulting models potentially can be very complex
and difficult both to estimate and to interpret. In such cases, there is
the temptation to exclude variables to keep the model manageable.
If, however, critical variables are omitted in the model simplification
process, then some of the paths that are estimated may well be
"wrong." The wrong paths are any that would be different if the
omitted variable had been included. Researchers need to carefully
trade off between models that are large and complex (which may be
difficult to estimate even if they are well thought out and articulated)
and simpler models (which may misrepresent causal processes).

7. The iterative process of model refinement makes salient the tension between use of SEM
techniques for model confirmation versus for model development. There are clear
disagreements about how much models can be changed within a single data set to improve the match
between the data and the model. Discussion of these issues will be left for a later chapter,
when techniques that guide model refinement are presented.

Correlation and Regression Approaches
for Analyzing Panel Data

Now that a number of basic issues underlying use of panel models
have been presented, manifest variable/observed measure approaches
to model estimation for panel data are discussed. These include both
correlational and regression techniques. All begin from two-variable,
two-wave models.
One of the unique features of panel analysis is that the methods
seem to have been developed independently by two groups of
researchers from different fields (see, e.g., Pelz & Andrews, 1964;
Rozelle & Campbell, 1969). Both sets of researchers attempted to
find ways of using cross-time and cross-variable correlations to assess
the relative causal influences of two variables on each other.
Although the logic underlying the approaches was similar (i.e., to
find a way in which to compare the magnitude of the cross-lag
correlations), the methods chosen were not. The approach of Rozelle
and Campbell (1969), in general, compared magnitudes of the
cross-lag correlations after adjusting for differences. It was made more
sophisticated through a range of adjustments for potential
confounding factors such as differential reliability. Pelz and Andrews (1964),
by contrast, employed partial correlations to examine plausibility of
causal impact.
There is little value in discussing either of the two approaches any
further or in going into detail about how their methods actually can
be used. As suggested earlier, data analysis methods in this field have
been flawed and therefore limited in what they are capable of
accomplishing. They have not been able to take advantage of the
sophistication of the thinking underlying panel models.
For readers who nevertheless think that they might be interested
in using panel analysis or need to understand the different approaches
so that they can effectively convince colleagues that using cross-lag
panel methods would be a waste of their time, Shingles (1976)
provided a comprehensive review and critique of the potential uses
of various approaches. A second source is Rogosa (1980), who also
provides a critique of cross-lag panel methods.
If cross-lag panel methods per se have any role among various
social science methods, then that role may be to help suggest possible
inferences about causal preponderance in situations where only
correlations are available and there are not enough variables available to
build a multiple-indicator structural model. In such circumstances
(e.g., when archival data sets are available but have limited measures),
cross-lag panel methods might be valuable in providing guidance
about "more likely" causal impacts.
Although the shortcomings in panel methods in large part have
reflected the lack of effective techniques for analyzing data, they also
have a second, and perhaps even more critical, shortcoming. That
shortcoming would have been avoided had panel models been viewed
as a class of path models (Rozelle and Campbell [1969] were more
guilty of this than were Pelz and Andrews [1964]). As can be seen
from Figure 6.3, considering cross-lag panel models as path models
would make issues of identification and model specification
immediately apparent.
With respect to identification, two-variable, two-wave panel models
really are underidentified path models. They can most readily be
made identified by dropping the residual covariances, in effect
allowing any measure-specific variance to be merged with stability of the
construct. If such an assumption is made, then the model is solvable
as an overidentified, albeit misspecified, path model using ordinary
least squares regression.
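
As a rough sketch of the regression solution just described, each Time 2 variable can be regressed on both Time 1 variables, using the standard two-predictor formula for standardized weights. The correlations below are hypothetical, and the names `two_predictor_betas`, `X1`, `Y1`, `X2`, and `Y2` are labels assumed for the example. The correlation between the two Time 2 variables is not used, which corresponds to dropping the residual covariance.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized OLS weights for an outcome regressed on predictors 1 and 2.

    r_y1, r_y2: correlations of the outcome with each predictor.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom

# Hypothetical correlations for a two-variable, two-wave panel.
r_x1_y1 = 0.30   # X and Y within Time 1
r_x1_x2 = 0.60   # X across time
r_y1_y2 = 0.50   # Y across time
r_x1_y2 = 0.35   # cross-lag: X at Time 1 with Y at Time 2
r_y1_x2 = 0.20   # cross-lag: Y at Time 1 with X at Time 2

# X2 regressed on X1 (stability) and Y1 (cross-lag).
x2_on_x1, x2_on_y1 = two_predictor_betas(r_x1_x2, r_y1_x2, r_x1_y1)
# Y2 regressed on X1 (cross-lag) and Y1 (stability).
y2_on_x1, y2_on_y1 = two_predictor_betas(r_x1_y2, r_y1_y2, r_x1_y1)

print(round(x2_on_y1, 3), round(y2_on_x1, 3))
```

In this made-up case the path from X at Time 1 to Y at Time 2 (about .22) is much larger than the reverse cross-lag path (about .02), but that comparison is only as trustworthy as the closed system assumption discussed next.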
Even if one can estimate values for the relationships between the
variables, there are basic questions about the adequacy of model
specification. That issue primarily is whether or not the closed system
assumption of panel models is tenable. In most instances, the
two-wave, two-variable models that have been articulated in cross-lag
panel approaches suffer terminally from the closed system
assumption; there are few, if any, situations in which it is safe to assume that
two variables cause each other without any other variables being
important.

Summary

This chapter began with a discussion of nonrecursive models, namely,
models with reciprocal causal relations or with feedback loops.
Because nonrecursive models can lead to problems of model
identification, necessary and sufficient conditions for model identification
were described. Longitudinal models provided the second topic of
this chapter. Those models share commonalities with nonrecursive
models but bring a somewhat different logical perspective to data
analyses.
For users of structural equation techniques, there is much to learn
from early work on nonrecursive models and panel analysis. What
needs to be learned is not the manifest variable methods of two-stage
least squares estimation or cross-lag panel analysis, for those are
inferior and flawed. Instead, what is most important is an
understanding of how principles of finite causal lag and stability of variables
across time can be used to develop more realistic SEM models. Said
differently, the understandings developed from this chapter should
be straightforward. They are not cumbersome methods specific to
panel designs but rather are principles that can guide one's thinking
as models are constructed. Furthermore, they illustrate in another
way the value of employing multiple measures to operationalize the
theoretical variables of interest in a structural equation model.

EXERCISE 6.1

Testing Model Identification

Which of the diagrams in Figure 6.5 is (are) identified? What
needs to be done to identify the one(s) that is (are) not
identified? (Analysis of identification of the models in Figure
6.5 appears in Table 6.1.)
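
The order condition can be checked mechanically by counting, for each equation, the variables in the system whose paths to that equation are fixed at zero. The following is a sketch only: the three-equation pattern is hypothetical (it is not either model in Figure 6.5), and the function name `order_condition` is an assumption. With N endogenous variables, each equation must exclude at least N - 1 of the other variables.

```python
def order_condition(pattern):
    """Check the order condition for each equation.

    `pattern` maps each endogenous variable to a dict covering every other
    variable in the system: 1 means the path is estimated, 0 means the path
    is fixed at zero (the variable is excluded from that equation).
    """
    n_endogenous = len(pattern)
    return {
        equation: sum(1 for flag in row.values() if flag == 0) >= n_endogenous - 1
        for equation, row in pattern.items()
    }

# Hypothetical system with three endogenous (Y) and three exogenous (X) variables.
pattern = {
    "Y1": {"X1": 1, "X2": 0, "X3": 0, "Y2": 1, "Y3": 0},
    "Y2": {"X1": 0, "X2": 1, "X3": 1, "Y1": 1, "Y3": 1},
    "Y3": {"X1": 0, "X2": 0, "X3": 1, "Y1": 0, "Y2": 1},
}

result = order_condition(pattern)
print(result)
```

Here the Y2 equation excludes only one variable and so fails the check. Passing the order condition is necessary but not sufficient; the rank condition still has to be examined for equations that pass.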

Figure 6.5. Examples of Nonrecursive Path Models


TABLE 6.1 Identification Tests for Models in Figure 6.5

(a) Figure in Top Panel (N = 3; thus, N - 1 = 2)

Condition    X1   X2   X3   X4   X5   Y1   Y2   Y3   Met?

Order condition (requires two or more zeros, each row)


Yi

Tn

Y2

Tu

2 2

2 3

2 4

3 4

r 2 5

3 5

2 1

3 1

1 2

1
0

, j

Yes

2 3

No

Yes

Rank condition (requires a 2 x 2 non-zero determinant matrix)

Yl
Yi

23 24

25

|0

Y sl

34

iTn l

No

|0 |
Y3
Fo r Yi, i(yi4

= 0

iT n

Ti i

22

| 0| , if
|o

Yes

1 2

11

13

2 5

= 0

|
34

0|

|0

Yes (both )

35

(b) Figure in Bottom Panel (N = 3; thus, N - 1 = 2)

Condition    W1   W2   W3   W4   Z1   Z2   Z3   Met?

Order condition (requires two or more zeros, each row)


Zi

Yu

Yl2

13

Zi

22

23

, ,

Zi

Zl

|0

1 2

34
0

33
32
Rank condition (requires a 2 x 2 non-zero determinant matrix)

IY34
Z2

lYll
10

Z3

3 2

Yes

P23

Yes

Yes
Yes

|
2 3

i|

Yes

0|
734

Yes

|o

ILLUSTRATION 2: PEER POPULARITY AND ACADEMIC
ACHIEVEMENT - PANEL ANALYSIS OF OBSERVED VARIABLES

Because the data set described initially in Chapter 3 is longitudinal, the model
presented there can be extended across time. This Illustration takes the same
measures and relationships as in that earlier example but adds second and
third time periods with the peer acceptance and achievement variables.
Because schools were desegregated between the first and second time periods
and the issue of interest was how acceptance by mainstream peers shaped
achievement in desegregated classes, the acceptance measure used was
choices by white peers.
Methodologically, longitudinal sampling provides additional information but
requires allowing variability to change across time in paneled variables. In
other words, as explained in the text, a covariance matrix needs to be analyzed.
In this case, I chose to scale the matrix to a correlation-like metric (e.g.,
Meredith, 1964), standardizing each measure the first time it appeared and
expressing later times in terms of the variance at the first point in time (i.e., a
ratio of the variance at each later point to the variance at the first point).
Because the model is a manifest variable (single-indicator) model, the results
from regression analysis are identical to those from SEM programs and from
maximum likelihood estimation. I illustrate how to estimate the model using the
SEM program LISREL.
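
The rescaling described above can be sketched in a few lines. The raw standard deviations below are hypothetical; they were chosen only so that the resulting ratios match the standard deviations of .901 and 1.114 that appear in the LISREL setup in Appendix 6.1. The dictionary names and the helper `rescaled_cov` are likewise assumptions for the example.

```python
# Hypothetical raw standard deviations for one test given at three waves.
raw_sd = {"VerbAch1": 10.00, "VerbAch2": 9.01, "VerbAch3": 11.14}

# Standardize at the first appearance; express later waves relative to it.
baseline = raw_sd["VerbAch1"]
scaled_sd = {name: sd / baseline for name, sd in raw_sd.items()}
scaled_var = {name: sd ** 2 for name, sd in scaled_sd.items()}

def rescaled_cov(r, a, b):
    """Covariance on the rescaled metric implied by a correlation r."""
    return r * scaled_sd[a] * scaled_sd[b]

print({name: round(v, 2) for name, v in scaled_var.items()})
```

The first wave keeps a variance of 1.0, whereas the later waves carry variances of about .81 and 1.24, so changes in variability across time are preserved rather than standardized away.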
The first five variables are the same as in the previous example. The additional
variables in this model are seating popularity with white peers at Time 2
(SeatPop2) and Time 3 (SeatPop3) and achievement test performance at Time
2 (VerbAch2) and Time 3 (VerbAch3). The input matrix (a rescaled covariance
matrix) is as follows:
Covariance Matrix to Be Analyzed (called MAMATRIX.LG)

              Fam                 Tchr    Seat    Verb    Seat    Verb    Seat    Verb
              SocClass  Peabody   Eval    Pop1    Ach1    Pop2    Ach2    Pop3    Ach3
FamSocClass     1.00
Peabody          .01      1.00
TchrEval        -.12       .24    1.00
SeatPop1         .04       .16     .17    1.00
VerbAch1         .09       .31     .30     .08    1.00
SeatPop2         .04       .01     .11     .12     .10    1.00
VerbAch2         .08       .28     .42     .12     .33    -.05     .81
SeatPop3        -.03      -.03     .01     .00     .00     .19    -.05    1.05
VerbAch3         .02       .22     .68     .23     .43     .19     .52    -.01    1.24

Note that the Times 2 and 3 standardized test scores have variances other
than 1.0, as does peer acceptance at Time 3. Those are variables that appear
at more than one point in time. (Because the peer choices variable changed
from Time 1 [before desegregation] to Time 2, the measures were not scaled
against one another.)
Once again, the solution could be obtained from regression analysis, but the
nonstandardized paths would be the ones that should be interpreted. Because
this solution was produced from a scaled matrix, it should be replicable only
from a program that will analyze a covariance matrix that the user supplies.
Otherwise, the coefficients would be either standardized (not appropriate) or
nonstandardized from a true covariance matrix. Appendix 6.1 provides the
LISREL command statements that could be used to analyze the matrix.
The estimates from the LISREL solution appear in the following. For each
dependent measure and path coefficient, there is the estimated path, its
standard error, and its t value. For example, the path from Peabody to TchrEval
is .24, with a standard error of .10 and a t value (the coefficient divided by the
standard error) of 2.53, which is significant.
Dependent    Independent    Estimated    Standard
Variable     Variable       Path         Error       t Value

TchrEval     FamSocClass      -.12        (.10)       -1.28
             Peabody           .24        (.10)        2.53
SeatPop1     FamSocClass       .06        (.10)        0.56
             Peabody           .13        (.10)        1.27
             TchrEval          .15        (.10)        1.45
VerbAch1     FamSocClass       .12        (.09)        1.27
             Peabody           .25        (.09)        2.66
             TchrEval          .25        (.10)        2.60
SeatPop2     SeatPop1          .12        (.10)        1.16
             VerbAch1          .09        (.10)        0.91
VerbAch2     SeatPop1          .10        (.08)        1.14
             VerbAch1          .32        (.08)        3.84
SeatPop3     SeatPop2          .19        (.10)        1.88
             VerbAch2         -.05        (.11)       -0.45
VerbAch3     SeatPop2          .22        (.09)        2.36
             VerbAch2          .66        (.10)        6.36

NOTE: Standard errors are in parentheses.

Overall, the fit of the model is much worse than was found for the
cross-sectional model: chi-square with 21 degrees of freedom = 54.81 (p = .000075).
The same conclusions about relationships between variables that were stated
for the cross-sectional model hold for the longitudinal one, for "downstream"
variables do not alter the relationships that precede them. In terms of the
longitudinal elements of the model, achievement was not very stable through
the desegregation experience but was much more stable (although still
changing substantially) within desegregated classrooms. Peer relations were not
significantly stable through desegregation and were only marginally stable
within the desegregated classrooms. The only significant cross-lag path was
from peer acceptance Time 2 to achievement Time 3. In other words, if this
model accurately depicts what happened, then desegregation markedly
disrupted the peer relations and achievement of Mexican American students, and
achievement did not seem to influence peer relations, but peer relations in the
desegregated classrooms did seem to be related to later achievement. In
other words, there is some support for a peer acceptance to achievement
relationship.

APPENDIX 6.1

LISREL Commands for Panel Illustration

One setup that works for LISREL 8 is as follows:


Mexican American data, for choices of whites, class illustration
DA NI=9 NO=100 MA=CM
KM SY FO FI=a:MAMATRIX.LG
(8F10.7)
SD FO
(11F7.5)
1.0 1.0 1.0 1.0 1.0 1.0 .901 1.025 1.114
MO NY=9 NE=9 LY=id BE=FU,FI PS=sy,fi TE=di,fi
FR BE 3 1 BE 3 2 BE 4 1 BE 4 2 BE 4 3 BE 5 1 BE 5 2 BE 5 3 C
BE 6 5 BE 6 4 BE 7 5 BE 7 4 BE 8 7 BE 8 6 BE 9 7 BE 9 6 c
PS 2 1 PS 3 3 PS 4 4 PS 5 5 PS 6 6 PS 7 7 PS 8 8 PS 9 9
ST 1.0 PS 1 1 PS 2 2
path diagram
OU PT SE TV AD=OFF

[Note that the matrix is on the A drive.]

For any earlier version of LISREL, remove the "path diagram" line.


Fit indexes from the output
According to the LISREL program, the measures of model fit are
as follows:
GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 21 DEGREES OF FREEDOM = 54.81 (P = 0.000075)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.13
90 PERCENT CONFIDENCE INTERVAL FOR RMSEA = (0.087 ; 0.17)
P-VALUE FOR TEST OF CLOSE FIT (RMSEA < 0.05) = 0.0019
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 36 DEGREES OF FREEDOM = 143.55
ROOT MEAN SQUARE RESIDUAL (RMR) = 0.11
STANDARDIZED RMR = 0.11
GOODNESS OF FIT INDEX (GFI) = 0.91
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.82
PARSIMONY GOODNESS OF FIT INDEX (PGFI) = 0.43
NORMED FIT INDEX (NFI) = 0.62
NON-NORMED FIT INDEX (NNFI) = 0.46
PARSIMONY NORMED FIT INDEX (PNFI) = 0.36
COMPARATIVE FIT INDEX (CFI) = 0.69
INCREMENTAL FIT INDEX (IFI) = 0.72
RELATIVE FIT INDEX (RFI) = 0.35

Because I have not yet talked about fit statistics, readers should
wait and look back at the various statistics when they finish
Chapter 10.
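
For readers who want to see where several of the listed values come from, the sketch below recomputes them from the two chi-square values in the output and the sample size of 100 given in the setup. The formulas are the commonly used definitions of these indexes, supplied here as an assumption because this chapter does not present them; the variable names are illustrative.

```python
import math

chi2, df = 54.81, 21               # fitted model, from the LISREL output
chi2_null, df_null = 143.55, 36    # independence model, from the output
n = 100                            # sample size (NO=100 in the setup)

# Commonly used definitions (assumed here, not taken from this chapter).
rmsea = math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))
nfi = (chi2_null - chi2) / chi2_null
nnfi = (chi2_null / df_null - chi2 / df) / (chi2_null / df_null - 1.0)
cfi = 1.0 - max(chi2 - df, 0.0) / max(chi2_null - df_null, chi2 - df, 0.0)
ifi = (chi2_null - chi2) / (chi2_null - df)
rfi = 1.0 - (chi2 / df) / (chi2_null / df_null)
pnfi = (df / df_null) * nfi

for name, value in [("RMSEA", rmsea), ("NFI", nfi), ("NNFI", nnfi),
                    ("CFI", cfi), ("IFI", ifi), ("RFI", rfi), ("PNFI", pnfi)]:
    print(name, round(value, 2))
```

Rounded to two decimals, these agree with the corresponding lines of the output, which also serves as a check that the degrees of freedom and chi-square values were transcribed correctly.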

Chapter Discussion Questions

1. What does "lag" refer to? Is it the same as "variable" or "time
lag"?
2. Stability was used in a sense that proportional increments in x
and y were the same. Can y then be "stable" in some sense
without x, or is stability tied to collinearity?

FACTOR ANALYSIS AND PATH MODELING

7. Introducing the Logic of Factor Analysis and Multiple Indicators
to Path Modeling

This chapter introduces and explains two related perspectives that
provide methods for ideas presented earlier in the text. The first,
factor analysis, articulates principles underlying the use of
unmeasured variables in path models. Throughout this text, there has
been discussion of constructs, or unmeasured theoretical variables,
operationalized by some measure or set of measures. There are marked
advantages in having available multiple measures of constructs. In
fact, in most instances the only defensible way in which to create
viable models is to use multiple measures of each construct assessed.
Up to this point, however, there has been no attempt to explain the
mechanics of how to go about actually using unmeasured variables in
models. (To refresh their memories, readers may want to review the
beginning of Chapter 5; in that chapter, ideas about unmeasured
variables initially are introduced.) By introducing perspectives
developed in the factor analysis literature, notions about unmeasured
variables can be developed more fully. As part of the description of
factor analysis, this chapter introduces confirmatory factor analysis
or CFA, namely, techniques in which the items defining each factor and
the relationships among factors are specified a priori rather than
letting the factor analytic methods define factors. CFA is one type of
latent variable structural equation model. As part of the discussion
of confirmatory factor models, different levels of constraint that can
be imposed on the relations of measures to factors are discussed, and
the multitrait-multimethod (MTMM) model presented earlier is
reintroduced. The reintroduction of MTMM issues is important insofar
as dealing effectively with issues of method variance is an important
part of structural equation models and methods.
The second perspective, developed by Costner and his colleagues
(e.g., Costner, 1969; Costner & Schoenberg, 1973) and articulated
nicely by Kenny (1979), applies perspectives of factor analysis to
path models. That perspective uses simple algebraic calculations to
demonstrate how having multiple measures of a construct can help
separate common variance from unique true score variance as well as
from error variance and covariance. The approach explained allows
preliminary data screening "by hand" before attempting to run
structural equation computer programs. The preliminary screening can
be particularly helpful for complex models or when there has been
little empirical work to guide model development.
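To give a flavor of the kind of "by hand" algebra referred to above, the sketch below works through one standard case: a single factor with three indicators and uncorrelated residuals, where each correlation is the product of two standardized loadings (r12 = p1 p2, and so on). This is an illustration under those stated assumptions rather than the specific procedure of Costner or Kenny, and the correlations and the function name are hypothetical.

```python
import math


def loadings_from_three_correlations(r12, r13, r23):
    """Solve for standardized loadings of a one-factor, three-indicator
    model with uncorrelated residuals, using r_ij = p_i * p_j."""
    p1 = math.sqrt(r12 * r13 / r23)
    p2 = math.sqrt(r12 * r23 / r13)
    p3 = math.sqrt(r13 * r23 / r12)
    return p1, p2, p3


# Hypothetical correlations among three measures of one construct.
r12, r13, r23 = 0.56, 0.48, 0.42
p1, p2, p3 = loadings_from_three_correlations(r12, r13, r23)

# Common variance is the squared loading; the remainder is the
# measure's unique plus error variance (in the standardized case).
uniquenesses = [1.0 - p ** 2 for p in (p1, p2, p3)]

print("loadings:", p1, p2, p3)
print("uniquenesses:", uniquenesses)
```

With only three indicators the solution reproduces the correlations exactly, so it screens the data (for instance, flagging a negative product under the square root) rather than testing the model.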

Factor Analysis

Logic of Factor Analysis

Factor analysis is designed to link observed measures to a smaller
number of underlying conceptual variables (for a fuller description,
see, e.g., Gorsuch, 1983; Mulaik, 1972). Factor analysis represents
the observed measures in terms of (unobserved) common factors plus
unique variance; the relationships between unobserved factors and
observed measures are defined in terms of weights (e.g., regression
weights) linking factors to measures. In other words, factor analysis
provides a vehicle for moving from a "single measure for each
construct" path model to a multiple measures of each construct or
multiple-indicator path model. One still can examine the same
underlying theoretical variables that were of interest in path
analysis. Those variables now are thought of as factors, and several
measures are collected of each theoretical variable to take advantage
of improved measurement properties that come from multiple measures.

Factor Analysis and Multiple Indicators

As with most other extensions of path analysis, when logic of factor
analysis is integrated with path modeling, the resulting models cannot
be solved by ordinary least squares regression techniques.

To illustrate, consider once again a model linking peer relationships
and academic achievement, focusing on the academic achievement
construct. In earlier path model illustrations, a measure of verbal
achievement based on a standardized test was used. Alternatively, we
also might have chosen as a measure of achievement teacher ratings of
their students' performance, student grades, standardized test
performance on a domain other than verbal skills, or performance of
students in meeting some types of standards. Because regression
analyses limit us to a single measure of each construct, we had to
pick a measure of achievement that we thought was close to the
theoretical variable of interest and hope that what it assesses is
what we want. If a range of measures were available, our best choice
for path analysis usually would be to create a composite measure of
the different choices. In principle (see the Chapter 5 discussion of
measurement error), that measure should display the best reliability
even though it still contains some error from the summed measures.[8]

By contrast, rather than attempting to select a "best" measure, in a
factor analytic approach a number of different measures of achievement
(as many as are available) could be selected to assess achievement.
The construct of achievement (one factor) is defined by what those
measures have in common. The achievement factor is what is
interrelated with other theoretical variables, each also a factor
defined by a set of measures. In a path model with single-headed
arrows linking factors, the correlations (covariances) among factors
are turned into path coefficients in the same way that regression
analysis turns correlations or covariances into path coefficients.
One of the classic areas of focus for factor analysis has been the
assessment of abilities, for example, defining primary mental
abilities and addressing whether or not there is a general ability
that underlies other abilities (e.g., Thurstone, 1938). In this
illustration, a primary ability such as verbal comprehension is
defined through factor analysis by attempting to extract a single
source of common variability from a number of measures that ostensibly
tap that ability. Furthermore, when measures of different primary
abilities (e.g., verbal comprehension, numerical ability) are included
in a single factor analysis, measures of verbal comprehension should
define one factor, whereas measures of numerical ability should define
a second. Because the factors are thought to be related, the different
abilities are expected to correlate with one another. If a researcher
had measures of a number of primary abilities, then their correlations
could be used in a second, "higher level" factoring (described in
Chapter 11) to see whether underlying them is a single construct that
could be called general ability.

8. In the illustration, that was not done so readers could see how
specific measures performed when included in regression as compared to
being an indicator in a multiple indicator model.
For readers unfamiliar with factor analysis but who have followed all
that has been covered thus far in this book, factor analysis can be
seen as being very much like regression in that it shares the general
linear model (e.g., Gorsuch, 1983). It can be viewed as a variant of
regression, the most prominent difference being that in factor
analysis not all of the variables in the regression model are
measured. It also is generally the case that in factor analysis the
matrix being analyzed is a correlation matrix; thus, the analogous
regression solution would focus on the standardized (beta)
coefficients.

The basic regression equation in matrix form is Y = BX + e, whereas
the basic factor analysis equation in similar form is Y = Pf + U. In
the latter equation, only the Ys actually are measured. Those Ys are
defined in terms of a vector, f, representing the unmeasured factors;
a weight matrix, P, that is the matrix of coefficients relating
factors to the observed measures Y; and a vector of residuals, U. The
elements of P are essentially partial regression coefficients but
usually are described by terminology of factor analysis as being
elements of the factor pattern matrix. The elements of U, the
residuals after the common factors are extracted, are called
uniquenesses in factor analysis. The factor analysis equation is
parallel in form to the regression equation.
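The factor equation Y = Pf + U can be made concrete with a small simulation. The sketch below is only an illustration: the pattern matrix values, the sample size, and the choice of normally distributed, standardized factors and residuals are all assumptions made for the example. It generates scores from one unmeasured factor and checks that the correlations among the measured Ys come out close to the products of their loadings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases = 50_000

# Hypothetical factor pattern matrix P: four measures, one factor.
P = np.array([[0.8], [0.7], [0.6], [0.5]])

# Unmeasured factor scores f (1 x n) with unit variance.
f = rng.standard_normal((1, n_cases))

# Residuals U, scaled so that each measure has unit variance.
unique_sd = np.sqrt(1.0 - P[:, 0] ** 2)
U = unique_sd[:, None] * rng.standard_normal((4, n_cases))

# The factor model: only Y would be observed in practice.
Y = P @ f + U

# Correlations among the Ys versus those implied by the loadings.
R = np.corrcoef(Y)
implied = P @ P.T
np.fill_diagonal(implied, 1.0)
max_gap = np.abs(R - implied).max()
print("largest discrepancy:", max_gap)
```

In real applications the direction is reversed: R is observed, and P and the uniquenesses are the unknowns to be estimated.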
Although the parallel to regression can be reassuring for some
readers, it also can create confusion because it is difficult to think
about using unmeasured variables to predict other variables. For
example, how can one know what the predictors are when there are no
scores on them? Can they be anything? Can they not change from measure
to measure? In fact, the most complex and controversial part of
exploratory factor analysis (EFA) is determining what the factors are.
(How many factors are there? What should the factors be called? What
do those factors actually represent?) There is, for example, the risk
of inaccurate labeling of factors, which Cliff (1983) called the
nominalistic fallacy; naming factors does not make them what they are
labeled.
At the same time, the idea of unmeasured variables as causes should
not seem altogether unfamiliar to readers. In the reliability model
presented in Chapter 5, the true scores, which parallel the factors in
factor analysis, are unmeasured and are causes rather than effects. As
mentioned in the discussion of reliability, measures are viewed as
caused by the common dimensions that they tap as well as by their
unique variance and error. For example, subjects' scores on a measure
of ability can be seen as caused by three components: (a) the
underlying ability dimension that the measure is supposed to assess,
(b) any unique dimensions that the measure consistently may tap, and
(c) error.

An additional reason why unmeasured variables should not pose too
great a problem is that, in principle, researchers should have a
pretty good idea about what the factors are when they collect their
measures. That is, measures should be selected to tap particular
underlying dimensions, and issues about the number of factors and what
they are should have been well thought out in advance. When
researchers have organized their measures around an a priori set of
underlying dimensions, factor analysis is used much more for
confirmation or model testing than for exploration. In practice,
however, determining the nature of the unmeasured variables is not
always straightforward given that factor analysis techniques also have
been used as exploratory techniques. Using factor analysis to define
dimensionality of measures that have been assembled atheoretically or
been combined somewhat haphazardly can lead to problems in
interpreting unmeasured factors.
A second issue relevant to path modeling is whether or not predictors,
because they are unmeasured, can be made the same as some of the
observed measures. After all, would that not give pretty good (i.e.,
perfect) prediction of those observed measures as dependent variables
and help one to know what the predictor variables are? The answer is
yes, one could make the unmeasured variables the same as some of the
observed measures, and certain factor analysis methods have done that.
That makes the issue of prediction for certain measures a trivial one,
for we would be predicting variables using themselves as predictors.
That is the situation that occurs in regression approaches to path
analysis, where each measure is supposed to correspond directly to an
underlying theoretical dimension. That is, in path analysis each
underlying construct is treated as if it were the same as the observed
measure/variable, for it is necessary to assume that variables are
measured perfectly and without error. For example, in a path analysis
interrelating ability and self-concept, a single measure of ability,
whatever that measure happens to be, defines the ability construct, a
single measure of self-concept defines the self-concept construct, and
so on for all other measures.
As suggested in previous chapters, the primary shortcoming of path
analysis is that each theoretical variable is operationalized by only
a single measure. The result is that measurement error and
specification error cannot be disentangled from variance tapping the
theoretical variable of interest. By contrast, when multiple measures
are available, different variance sources can be disentangled and
reliabilities of measures can be estimated (e.g., Miller, 1995). There
are problems separating variance components only when a factor is
defined as identical to an observed measure. Then, some of the
information that allows separating constructs from measures and
partitioning of common, unique, and error variances is wasted.
Although the focus of this discussion is on how factor analysis
techniques can be applied to improve structural models, it is
important to remember that such uses have not been typical in that
literature. Instead, factor analysis has been used most widely to
represent a larger number of observed variables in terms of a smaller
set of sources of common variance. In many instances, researchers
started with a conceptual model they hoped to fit; in other instances,
the research was much less driven by theoretical concerns. Regardless
of whether the approach was exploratory or confirmatory, however, it
was assumed that the resulting sources of common variance would have
meaning that could be discerned from the pattern of relationships of
the observed variables with the unobserved variables.

Exploratory Factor Analysis

Because EFA approaches have little in common with the methods
discussed in this text, they are not covered in much detail here. A
nice introduction to EFA can be found in Ford, MacCallum, and Tait
(1986), and readers interested in more details linking factor and
structural equation models should see Loehlin (1992). Here, the focus
is on EFA approaches, their prominent features, and how they compare
to the types of models we have discussed throughout the text.

Perhaps the first defining feature of EFA is that most research using
EFA has extracted factors that are orthogonal, that is, uncorrelated
with or independent of one another.[9] The idea of uncorrelated
predictor variables was discussed earlier as part of the discussions
of collinearity. For the present discussion, note that if a structural
equation path model were to extract uncorrelated factors, then it
would be pretty boring given that there would be no paths between any
of the theoretical variables. In other words, structural equation
approaches stand in marked contrast to EFA insofar as the variables of
interest (factors) in structural equation models usually will be
hypothesized as correlating with one another.
Second, there are a number of different assumptions made that shape
the type of EFA technique used. If one assumes that there is no unique
variance, as is done in principal components analysis, then the
"error" part of the factor model disappears. Once again, given the
importance of dealing effectively with imprecision of measurement and
the likelihood of imprecision actually occurring, components analysis
has little to offer users of structural equation approaches. By
contrast, consistent with path modeling approaches, a principal
factors approach extracts common and unique variance components.

Third, and a point of particular importance to structural equation
users, in EFA the model that is tested is underidentified, which means
that there is no unique solution but rather an infinite number of
possible solutions, each of which fits the data equally well. Part of
the challenge of this type of factor analysis is to pick, from the
array of possible "equally good fit" solutions, one that is
interpretable.
Fourth, in most types of EFA, all measures are related to every
factor. It is, of course, hoped that most of the relationships are
trivial so that each measure is substantially linked only to one or,
at most, a few of the factors. The approach that tries to attain such
a solution has been called attaining simple structure. Because there
is an infinite number of solutions that are mathematically equivalent,
factor analysis has methods, called factor rotation, for moving from
one solution to another in an attempt to attain a simple structure.
That is, rotation moves from an initial solution to another that fits
equally well but has somewhat different properties in an attempt to
find a solution in which each measure is trivially related to most of
the underlying factors but substantially related to one factor. By
contrast, a confirmatory use of factor analysis hypothesizes
particular relationships between measures and factors and then
typically will set all the other relationships between measures and
factors to zero. Rotation is not possible when there is a unique
solution, for any other solution would not have the same fit.

9. Although not really relevant to the present discussion, it is
important to note that, when taking composites of items to form
factors, the factors that emerge from an orthogonal factor analysis
may be intercorrelated.
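The claim that rotated exploratory solutions fit equally well can be checked numerically. The sketch below is a minimal illustration with orthogonal factors; the loading matrix and the 30-degree rotation angle are hypothetical values chosen for the example. It rotates a two-factor loading matrix and confirms that the common-variance part of the implied correlation matrix is unchanged.

```python
import numpy as np

# Hypothetical unrotated loadings: six measures, two orthogonal factors.
A = np.array([
    [0.7, 0.3],
    [0.6, 0.4],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.2, 0.6],
    [0.4, 0.8],
])

# An orthogonal rotation matrix T (T @ T.T is the identity).
theta = np.deg2rad(30.0)
T = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)],
])

A_rot = A @ T  # rotated loadings: different numbers, same fit

# With orthogonal factors the implied common part is A A'.
common_before = A @ A.T
common_after = A_rot @ A_rot.T

print(np.round(A_rot, 3))
print("same implied matrix:", np.allclose(common_before, common_after))
```

Because every angle gives the same implied matrix, the data alone cannot choose among the rotations; fixing loadings to zero a priori, as a confirmatory model does, is what removes this freedom.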
Given the difficulties inherent in selecting the "best" solution and in
naming factors, it seems clear that confusion about factor analysis
can occur when factor analytic approaches are used for data
exploration. In their worst form, such approaches might be
characterized as the "I'm not sure what is here and there are too many
measures to make sense of, so let's do a factor analysis and reduce
the measures to a more restricted set of variables and see what
emerges" approach. Given that such approaches are characterized so
negatively, readers may be wondering why factor analysis has been so
widely used. In part, the answer is that a number of reasons,
including methodological/analytic limitations, have restricted the use
of factor analysis approaches for model testing (and they still impose
some constraints on the number of measures and variables that can be
considered at one time), with the result that factor analysis was for
a time the most accepted way of matching observed measures to
underlying dimensions. It was used even when a strong a priori
theoretical model had been used to generate the data. It is
interesting that recent work (Gerbing & Hamilton, 1996) suggests that,
when an a priori structure is hypothesized, these types of factor
analysis techniques provide a useful first step to complement more
sophisticated types of CFA described later in this chapter.

Overall, then, variants of EFA have many features that structural
equation approach users want to avoid in a methodology: orthogonal
factors, an underidentified solution that is not uniquely solvable,
and relationships between factors and measures that are incompletely
specified. At the same time, however, Gerbing and Hamilton (1996)
recently found that EFA techniques can be valuable when used in
anticipation of use of the hypothesis testing confirmatory techniques
that are described next. In conclusion, then, factor analysis
techniques contributed much to the logical foundations of structural
equation modeling (SEM). Nevertheless, it was not until CFA techniques
were developed that much of the value of path modeling was realized.
Confirmatory Factor Analysis

With the relatively recent development of powerful computers and
software, there has been a shift to alternative factor analysis
approaches that attempt to test the viability of a priori structures.
These latter types of factor analysis are called confirmatory factor
analysis. CFA approaches examine whether or not existing data are
consistent with a highly constrained a priori structure that meets
conditions of model identification. This fitting process sometimes is
referred to somewhat inaccurately as "confirming" a model or
hypothesized structure. In fact, as mentioned earlier, a model never
can be confirmed. It can be disconfirmed (it does not fit the observed
data), or it can fail to be disconfirmed (it fits). The most important
points for the current discussion, however, are that CFA approaches
begin with a theoretical model that has to be identified (and
therefore be uniquely solvable) and must attempt to see whether or not
data are consistent with that theoretical model.

If CFA approaches sound a lot like path models, it is for good reason.
General CFA models are a form of path models that hypothesize
relationships between unmeasured constructs and observed measures. The
difference between CFA models and latent variable path models is that
in path models the latent variables (unmeasured constructs) are
hypothesized to be causally interrelated, whereas in CFA models they
are intercorrelated. Said differently, in CFA models all the latent
variables are viewed as exogenous. As is true of exogenous variables
in any model, CFA models do not attempt to disentangle the causes of
hypothesized interrelationships among them. The strength of
relationships among them, however, usually is of interest.
Even in CFA models where a priori underlying dimensions are
operationalized through observed measures, there will be uncertainty
about whether or not the measures are capable of assessing (or have
assessed) the dimension(s) of interest (e.g., Cliff, 1983). In the
factor analysis domain and, consequently, in analyses using latent
variable structural equation approaches, one should be wary of factor
labels and should provide as much construct validity information as is
possible. For example, if I choose to put a label of "self-concept" on
a factor/unmeasured variable, my assigning that label does not make
the variable self-concept, and it certainly does not make the variable
the same as other variables that also have been called self-concept.
Furthermore, if I do not know the relationship of my measures of
self-concept with other available measures of self-concept, then I am
missing some valuable information about construct validity. (Note
that, of course, construct validity information can be obtained from
measures of other constructs as well via convergent and
divergent/discriminant validity information.)
In summary, factor analysis provides a number of features that enrich
structural equation approaches. First, it is a methodology that
explicitly includes latent/unobserved variables plus observed measures
and interrelates the two. Second, it draws attention to issues of
operationalization of underlying variables and inherent shortcomings
of path analysis models. Third, it illustrates how regression models
can be extended to unmeasured variables. Fourth, CFA techniques
provide a path modeling methodology for linking observed measures to
underlying theoretical variables.

Use of Confirmatory Factor Analysis Techniques

CFA approaches were widely considered but little used until the 1970s.
Precursor programs to the current LISREL program (ACOVS [Analysis of
COVariance Structures], LISREL 1, and SIFASP [Simultaneous Factor
Analysis across Several Populations]) all were developed at and
distributed by Educational Testing Service in the early 1970s. At that
point, CFA became a viable, if infrequently used, approach, for those
programs provided a method for fitting data to hypothesized models.
The first versions of the programs, however, were limited in the size
of problems (e.g., number of measures, number of factors) they could
address and were cumbersome and complicated to use, with the result
that they were not widely used. By contrast, more recent versions of
SEM programs are much more flexible, are easier to use, and handle
much larger problems, making them much more accessible and practical
to use.

CFA is straightforward to set up once the interrelationships are
specified and the representative path model is constructed. Specifying
the interrelationships should be easy, for the specifications come
directly from the theory underlying the model, which guides
operationalization of the conceptual variables. Diagramming also
should be easy, for in this step the model just has to be set up as a
path model with the factors as independent variables and the observed
measures as dependent variables. Each dependent variable needs a
residual path (its uniqueness) as well as paths from other variables
(factors). Current versions of SEM programs AMOS and EQS can produce
model estimates once users use the programs' drawing tools to draw the
path diagrams and link the observed measures to the diagrams. Even for
programs without drawing tools, the process of setting up a model for
analysis is not too difficult.
As an illustration of a CFA model, look at Figure 7.1. As can be seen,
the figure has three latent variables, or factors, each with three
indicators, or measures. The three latent variables are viewed as
intercorrelated. The paths from the factors to the measures are
partial regression coefficients; in this model, because each measure
is caused by only a single predictor, the paths reduce to simple
regression (correlations in the standardized case). The matrix of
coefficients, as noted earlier, is the factor pattern matrix. The e's
are the residuals (uniquenesses) for the endogenous variables.
So, how does one get from the diagram to factor analysis matrices and
to solving for the parameters? Begin with the basic equation listed
earlier,

\[ Y = Pf + U. \tag{7.1} \]

The equations can be set up for each dependent variable in terms of
the three independent variables. In equation by equation form, looking
like regression equations for each dependent variable, the elements of
the matrices are as follows:

\begin{align*}
Y_1 &= p_1 f_1 + 0 f_2 + 0 f_3 + e_1 \\
Y_2 &= p_2 f_1 + 0 f_2 + 0 f_3 + e_2 \\
Y_3 &= p_3 f_1 + 0 f_2 + 0 f_3 + e_3 \\
Y_4 &= 0 f_1 + p_4 f_2 + 0 f_3 + e_4 \\
Y_5 &= 0 f_1 + p_5 f_2 + 0 f_3 + e_5 \\
Y_6 &= 0 f_1 + p_6 f_2 + 0 f_3 + e_6 \\
Y_7 &= 0 f_1 + 0 f_2 + p_7 f_3 + e_7 \\
Y_8 &= 0 f_1 + 0 f_2 + p_8 f_3 + e_8 \\
Y_9 &= 0 f_1 + 0 f_2 + p_9 f_3 + e_9
\end{align*}

Figure 7.1. Interrelations Among Three Latent Variables and Nine
Indicators

Put back in matrix form, Y = Pf + U, the Ys become a 9 × 1 vector, as
follows:

\[ Y = \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ Y_5 \\ Y_6 \\ Y_7 \\ Y_8 \\ Y_9 \end{bmatrix} \]

The factors become a 3 × 1 vector:

\[ f = \begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} \]

The errors are a 9 × 1 vector:

\[ U = \begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \\ e_5 \\ e_6 \\ e_7 \\ e_8 \\ e_9 \end{bmatrix} \]

The estimated coefficients are a 9 × 3 factor pattern matrix:

\[ P = \begin{bmatrix}
p_1 & 0 & 0 \\
p_2 & 0 & 0 \\
p_3 & 0 & 0 \\
0 & p_4 & 0 \\
0 & p_5 & 0 \\
0 & p_6 & 0 \\
0 & 0 & p_7 \\
0 & 0 & p_8 \\
0 & 0 & p_9
\end{bmatrix} \]

Note that most of the coefficients in the factor pattern matrix are
fixed to 0 and that, as in the diagram, each measure is directly related
to only a single factor. Because the measures are clustered, with
indicators of each factor together, the factors can be readily discerned
from the matrix.
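The clustered structure of the pattern matrix is easy to write down directly. The sketch below builds a 9 × 3 matrix with one free loading per row, as in the three-factor, nine-indicator model; the loading values and the factor scores for a single case are hypothetical numbers used only for illustration.

```python
import numpy as np

# Hypothetical values for the nine free loadings p1..p9.
loadings = np.array([0.8, 0.7, 0.6, 0.9, 0.5, 0.7, 0.6, 0.8, 0.7])

# Measures 1-3 load on factor 1, 4-6 on factor 2, 7-9 on factor 3.
factor_of = np.repeat([0, 1, 2], 3)

# Every other coefficient stays fixed at zero.
P = np.zeros((9, 3))
P[np.arange(9), factor_of] = loadings

# Hypothetical factor scores for one case; P @ f is the part of each
# measure attributable to the factors (the residual e is left out).
f = np.array([1.0, -0.5, 2.0])
common_part = P @ f

print(P)
print(common_part)
```

Each row of the product involves only one factor, which is the matrix counterpart of each measure receiving a path from a single factor in the diagram.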
Unfortunately, in this form, there is not enough information (nine
equations but more than nine unknowns) to uniquely solve all the
pattern matrix (P) coefficients and factor correlations. Furthermore,
the factor correlations do not appear anywhere in the equations, so
it would seem to be difficult to solve for them. The equation can be
turned into a solvable form by multiplying each side by its transpose
(because the two sides are equal, their transposes also are equal); that
changes the left side to the variance/covariance or intercorrelation
matrix of the observed measures, which then gives enough information
to solve for the model. Thus,

\[ YY' = (Pf + U)(Pf + U)'. \tag{7.2} \]
Expanded, the equation becomes

\[ YY' = (Pf)(Pf)' + (Pf)U' + U(Pf)' + UU'. \tag{7.3} \]

Because the errors are by definition independent of the factors, the
two middle terms on the right side of the equation, (Pf)U' and U(Pf)',
both drop out, for they are zero, leaving

\[ YY' = (Pf)(Pf)' + UU'. \tag{7.4} \]

Using rules of matrix algebra, the equation becomes

\[ YY' = Pff'P' + UU', \tag{7.5} \]

where UU' represents the variance covariance matrix of the residuals
and ff' represents the variance covariance matrix of the factors. That
matrix is pre- and postmultiplied by the factor pattern matrix (P and
P'). Thus, we have reached a "traditional" form for factor analysis in
which the variance covariance matrix of the observed measures is
expressed in terms of a factor pattern matrix, a factor variance
covariance matrix, and a residual variance covariance matrix.

Using sigma (Σ) to represent the variance covariance matrix of
observed measures, phi (Φ) to represent the factor variance covariance
matrix, and psi (Ψ) to represent the residual matrix, the equation is

\[ \Sigma_{YY} = P \Phi P' + \Psi. \tag{7.6} \]

To repeat, sigma is the variance/covariance (or correlation) matrix
of the Y vector that appears in the preceding equations. Psi is the
variance/covariance matrix of the residuals, that is, the U vector. Phi
is the covariance matrix of the f vector. Because it is symmetric, only
the lower triangular part is presented to illustrate it:

$$\Phi = \begin{bmatrix} \phi_{11} & & \\ \phi_{21} & \phi_{22} & \\ \phi_{31} & \phi_{32} & \phi_{33} \end{bmatrix}$$

Because the factors are unmeasured, values for the variances can be
specified in a number of ways. The variances do, however, have to be
fixed in some way, for not specifying them leaves an indeterminacy


problem between the factor loading and the factor variance; it is
analogous to a two-indicator factor model described later in this
chapter. The simplest way is to set the variances to unity, which
would make phi a correlation matrix and the off-diagonal elements
correlations. Another way, using what are called reference indicators,
will be described later.
If we assume that the diagonal elements of phi all are fixed to 1.0,
then all the elements of the diagram have been specified sufficiently
to allow estimation of the model provided that it is identified. For
that, we can revisit the identification issues from path models, here
recast in terms of the factor model. With multiple measures of each
factor and no residual covariances, identification is straightforward.
The covariance matrix of the observed measures, the Ys, has available
[v(v + 1)] / 2 degrees of freedom, where v is the number of measures;
this formula is the total number of nonredundant elements in the
matrix, including the variances and the covariances. In the present
example, the available degrees of freedom are 9 × 10 / 2 = 45. A total
of 3 degrees of freedom are lost to estimate the phis, 9 for the elements
of P, and 9 for the elements of psi, leaving 24 degrees of freedom in
the model. Thus, this model is overidentified and can be estimated.
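The counting argument above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name and its arguments are hypothetical, and it assumes that the factor variances are fixed to 1.0 so that only the factor correlations, loadings, and residuals are estimated.

```python
def cfa_degrees_of_freedom(n_measures, n_loadings, n_factor_covariances, n_residuals):
    """Count available elements and remaining degrees of freedom.

    Hypothetical helper: available elements are the nonredundant
    variances and covariances, v(v + 1) / 2; each estimated
    parameter uses up one of them.
    """
    available = n_measures * (n_measures + 1) // 2
    estimated = n_loadings + n_factor_covariances + n_residuals
    return available, available - estimated


# Nine measures, nine loadings, three factor correlations, nine residuals.
available, df = cfa_degrees_of_freedom(9, 9, 3, 9)
print(available, df)
```

For the nine-measure, three-factor example, the function reproduces the 45 available and 24 remaining degrees of freedom given in the text.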
Finally, there are implications of the fact that at least some of the
factors estimated in CFA are likely to be hypothesized as correlating
with one another. First, as can be illustrated by the model in Figure
7.1, the intercorrelations between factors can account for relationships
between measures that cross factors in the model. Even though
most of the loadings in the pattern matrix corresponding to Figure
7.1 are zero, relationships that cross factors would not be zero unless
the factors are uncorrelated. Because in Figure 7.1 the factors are
correlated, all the measures will correlate with one another. The
magnitude of the cross-construct correlations depends on how strongly
the factors are interrelated. Standard rules for tracing paths can be used
to estimate the correlations. For example, the model predicts the
relation between $Y_1$ and $Y_7$ to be $(p_{11} r_{31} p_{73})$. Note that for each
cross-factor relationship, there is only one path connecting each pair
of measures, and it goes from the first measure via its loading to the
first factor (e.g., $p_{11}$), from that factor to the second factor via the
correlation between them (e.g., $r_{31}$), and on to the second measure via
its loading on that second factor (e.g., $p_{73}$). (Remember that the tracing
rules do not allow paths that go through two curved arrows.)
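To make the tracing rule concrete, the following sketch builds the model-implied correlation matrix from Equation 7.6 and compares one cross-factor element with the traced product of loading, factor correlation, and loading. All numerical values are invented for illustration, the helper functions are hypothetical, and the layout (nine measures, three per factor, standardized metric) is assumed to match the model described above.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


# Assumed standardized loadings; measure i loads only on factor i // 3.
loadings = [0.8, 0.7, 0.6, 0.9, 0.7, 0.5, 0.8, 0.6, 0.7]
P = [[loadings[i] if f == i // 3 else 0.0 for f in range(3)] for i in range(9)]

# Assumed factor correlation matrix (diagonal fixed to 1.0).
Phi = [[1.0, 0.5, 0.3],
       [0.5, 1.0, 0.4],
       [0.3, 0.4, 1.0]]

# Residual variances chosen so that each measure has total variance 1.0.
Psi = [[1.0 - loadings[i] ** 2 if i == j else 0.0 for j in range(9)]
       for i in range(9)]

# Sigma = P Phi P' + Psi
common = matmul(matmul(P, Phi), transpose(P))
Sigma = [[common[i][j] + Psi[i][j] for j in range(9)] for i in range(9)]

# Tracing rule for the first and seventh measures (indexes 0 and 6):
# loading, factor correlation, loading.
traced = P[0][0] * Phi[2][0] * P[6][2]
print(round(Sigma[0][6], 3), round(traced, 3))
print(round(Sigma[0][1], 3))  # a within-factor correlation, for comparison
```

Under these assumed values, the cross-factor element of the implied matrix equals the traced product, and it is smaller than the within-factor correlation printed on the last line.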
Second, because the factor correlations typically will be substantially
less than unity insofar as factors will be distinct rather than

highly similar, the relationships of measures across factors will in
general be less than the relationships of measures within factors. This
issue will be revisited when MTMM matrices are presented.
Finally, for readers familiar with EFA, the CFA model presents the
factor structure versus factor pattern matrix issue in a way that, to
me, has seemed particularly clear. For this discussion, I will assume
that a correlation matrix is being analyzed. The factor pattern matrix
contains standardized (partial regression) coefficients to reproduce
the measures from the factors. When factors are orthogonal, elements
of the factor pattern matrix become correlations (essentially simple
standardized regression coefficients) between factors and measures,
which makes them relatively easy to interpret.
Whenever factors are allowed to correlate, the coefficients in the
pattern matrix take into account the relationships among the factors,
making their interpretation more difficult, for in EFA every factor
"causes" each measure. To aid interpretation, researchers suggest also
interpreting the factor structure matrix, which is the product $P(ff')$
(or $P\Phi$), namely, the factor pattern matrix ($P$) multiplied by the
factor correlation (covariance) matrix. Note that when factors are
uncorrelated, $ff'$ is an identity matrix ($I$), and $P(ff') = PI = P$. That
is, the structure and pattern matrices are identical, and the structure
versus pattern distinction is meaningless.
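The relation between the pattern and structure matrices can be shown with a small numerical sketch. The two-factor pattern matrix and the factor correlation of .5 below are assumed values chosen only for illustration, and the helper function is hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Assumed pattern matrix: two measures per factor, no cross-loadings.
P = [[0.8, 0.0],
     [0.7, 0.0],
     [0.0, 0.6],
     [0.0, 0.9]]

identity = [[1.0, 0.0],
            [0.0, 1.0]]       # uncorrelated factors
Phi = [[1.0, 0.5],
       [0.5, 1.0]]            # correlated factors (assumed r = .5)

structure_orthogonal = matmul(P, identity)   # equals P
structure_oblique = matmul(P, Phi)           # P multiplied by Phi

for row in structure_oblique:
    print([round(value, 2) for value in row])
```

With uncorrelated factors the structure matrix simply reproduces the pattern matrix. With correlated factors, each measure acquires a nonzero structure coefficient on the factor it does not load on, equal to its loading multiplied by the factor correlation.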
When factors are correlated, the pattern matrix coefficients essentially
become partial regression coefficients, and the structure
matrix contains information that combines the strength of the
correlations among factors with the strength of associations between
measures and factors. The added complexity in interpreting information
from oblique solutions is one reason why orthogonal factoring
is used so often in EFA. A second is that unlike CFA, in which there
is a unique estimate for each relationship among the factors, the EFA
solution can be rotated to change the magnitude of the correlations
between factors (as noted earlier, that capability results from
underidentification) as well as the factor loadings. Selecting a particular
magnitude of relationship between factors to interpret is difficult
and can seem arbitrary.
With respect to structural models, the structure versus pattern
issue is largely irrelevant. Both the weights and the factor relationships
are of interest. Therefore, there is little interest in the factor
structure matrix except as part of a process for reconstructing the
relationships among observed measures (i.e., model fitting).


In summary, CFA develops from theory that specifies exactly the
nature of the relationships between measures and factors, and it can
be done only if the model is identified, yielding a unique solution. In
other words, CFA is a form of latent variable SEM. In CFA, the
constructs are not causally interrelated but are allowed to covary/correlate.
The theory dictates a model that can be presented as a path
model. That model is tested for plausibility by the data collected, and
it uses the equation

$$\Sigma_{yy} = P \Phi P' + \Psi, \tag{7.6}$$

which will provide the fundamental elements of latent variable SEM
approaches that "causally" interrelate latent variables.

Constraining Relations of Observed Measures With Factors

Before turning to algebraic ways of assessing plausibility of factors
and the size of relations between measures and factors, a second topic
from test theory and factor analysis is relevant. That topic deals with
the expected nature of the relations of different measures of a factor
with that factor. In some instances, for example, researchers may
believe that different observed measures will relate to a factor in
exactly the same way. If so, they can examine plausibility of stronger
assumptions. That is, the basic assumption is that indicators will be
substantially related to the factors they purportedly measure. A
stronger assumption, for example, would be that not only are they
related, but the strengths of their relations to those factors are equal
(e.g., Jöreskog, 1971).
First, consider the highly restricted condition in which, for a
factor or latent variable, the relations of each of the different measures
with the factor are expected to be exactly the same and the
magnitude of the residuals is expected to be exactly the same. In this
case, the researcher needs to be able to assume that the true score
component of each measure is the same and that the remaining parts
of each measure are the same. If these assumptions can be made, then
the measures are said to be parallel tests of the variable. For Figure
7.2, the measures of the factor would be parallel if a = b = c = d and
if $e_1 = e_2 = e_3 = e_4$. Note that instead of estimating four different
factor loadings, there now is only one to estimate. That change yields

3 new degrees of freedom in the model that is estimated. For a
variance/covariance matrix, the constraints on the residuals give 3
more degrees of freedom. Thus, the parallel test model has more
degrees of freedom than does a basic (unconstrained) model, for only
one loading and one residual are estimated.
Second, if only the relations of measures with the variable are the
same (a = b = c = d), then the measures are called tau equivalent.
For tau equivalent models, the true score components of the measures
again are assumed to be the same, but error components are allowed
to differ. Note that if measures are standardized, then it makes no
sense to constrain the loadings without also constraining the residuals,
for each totals to the same value (a variance of 1.0). Tau equivalent
models also have more degrees of freedom than does the basic model.
Finally, if no constraints are imposed, then the tests are called
congeneric. This is the basic and most common model. In many
instances, not enough is known about the indicators to impose
assumptions about equal loadings on them. In many others, researchers
know that the assumption of equality of relationships does
not make sense for their data.
The three models can be compared for a single set of data. That
could be done by moving from least restrictive (congeneric) to most
restrictive (parallel), assessing whether or not adding restrictions of
equality on the loadings and residuals is realistic. If the fit of the model
to the data becomes appreciably worse as the model is made more
restrictive, then the constraints are not plausible for the data. As will
be explained later, "worse" can be defined by a number of fit indexes
that can be calculated in structural equation models.
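The degrees of freedom gained under each set of constraints can be tallied with a short sketch. It assumes a single factor with four indicators, a variance/covariance matrix as input, and a factor variance fixed to 1.0; the function name and arguments are hypothetical.

```python
def single_factor_df(n_indicators, equal_loadings=False, equal_residuals=False):
    """Degrees of freedom for a one-factor model (factor variance fixed).

    Hypothetical helper: equality constraints reduce a set of
    parameters to a single estimated value.
    """
    available = n_indicators * (n_indicators + 1) // 2
    n_loadings = 1 if equal_loadings else n_indicators
    n_residuals = 1 if equal_residuals else n_indicators
    return available - n_loadings - n_residuals


congeneric = single_factor_df(4)
tau_equivalent = single_factor_df(4, equal_loadings=True)
parallel = single_factor_df(4, equal_loadings=True, equal_residuals=True)
print(congeneric, tau_equivalent, parallel)
```

Moving from the congeneric to the tau equivalent model adds 3 degrees of freedom, and moving on to the parallel model adds 3 more, as described in the text.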
10. For standardized data, because the total variance for each variable is fixed to 1.0 and
the residuals are defined by the common loadings (they are $\sqrt{1 - R^2}$), additional degrees
of freedom are not gained by constraining the residuals.

Figure 7.2. Factor Model Illustration for Consistency Tests

Confirmatory Factor Analysis and Method Factors

The Basic Confirmatory Factor Analysis Path Model for Multitrait-Multimethod Matrices

If we assume that traits and methods combine additively, then we can
integrate MTMM matrices with path models and diagram the model
as a confirmatory factor model, illustrated in Figure 7.3. Although an
additive model is a reasonable one to hypothesize, MTMM matrices
have not proven to be as straightforward as they at first seem to be.
First, there are arguments for traits and methods combining in
multiplicative fashion (e.g., Campbell & O'Connell, 1967). Second,
there are nonobvious issues of identification that need to be addressed
when using a full trait method model (e.g., Kenny & Kashy, 1992).
Nevertheless, for now we assume that they combine in additive
fashion and that the model is identified, for the principle of separating
trait variance from method variance illustrated here is a general one
and works successfully in situations other than the full trait method
model. Illustrations of CFAs of MTMM matrices have been provided
by Cole (1987), Dunn, Everitt, and Pickles (1993), and Marsh and
Byrne (1993). Dunn et al. (1993) looked at basic variations of
MTMM models, adding or excluding relations among methods,
among traits, and between traits and methods, illustrating what
happens under different assumptions.
Figure 7.3 contains three trait factors (above the measures) and
three method factors (below the measures). It is set up to be consistent
with Table 5.1. So, for example, the first, fourth, and seventh measures
are the ones measuring the first trait. Each measure assesses
(loads on) a trait and a method. The diagram assumes that traits are
independent of methods, thus including no paths from trait factors
to method factors. Methods are allowed to correlate, as, of course,
are traits, for the model would be uninteresting if the traits assessed
were independent of one another.

Figure 7.3. Multitrait-Multimethod Matrix Modeling Using Confirmatory Factor Analysis
By using the tracing rule for path models, the different types of
relationships can be seen as reflecting different combinations of paths.
Remember, as the relationships are interpreted, that in the standardized
metric in which we are working all the paths should be less
than 1. To illustrate through examples how the path model decomposes
relationships within the matrix:

Monotrait-Heteromethod: $r_{41} = ab + jDm$.

Note that assuming Campbell and Fiske's (1959) Condition 1 as
described in Chapter 5, Paths a and b, trait loadings, should be
relatively large, for the measures should be substantially related to
the traits that they are supposed to assess. The second term on the
right side of the equation depends on both the strength of the method
factors and their relationship. Because it combines three elements less
than 1, it is likely to be smaller than the ab term. If the methods are
independent of one another, then the right term would disappear,
leaving only the trait variance.

Heterotrait-Monomethod: $r_{21} = aAd + jk$.

As was true in the preceding illustration, the sizes of Paths j and k
depend on the strength of the method variance. The first term on the
right side of the equation contains two elements, trait loadings, that
ought to be substantial assuming Campbell and Fiske's Condition 1.
Thus, the size of the first term depends heavily on the strength of the
relationship between the traits, which usually should be neither too
strong nor zero but could well be substantial.
According to Condition 3 of Campbell and Fiske (1959), $r_{41}$
should be greater than $r_{21}$. From the CFA perspective, however, such
a condition is unnecessary. It would not necessarily be bad in terms
of validity for Traits 1 and 2 to be moderately interrelated and for the
method variance of Method 1 to be substantial (which could result
in $r_{21}$ being large) or for the methods to be independent (which would
reduce the size of $r_{41}$).

Heterotrait-Heteromethod: $r_{51} = aAe + jDn$.

In this case, both terms on the right side of the equation combine
three elements less than 1, so these terms should tend to be less than
either of the two other types just described.
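The three decompositions just described can be evaluated numerically. The path values below are invented solely for illustration (all are standardized and less than 1), and the letter names follow the equations in the text: a, b, d, and e are trait loadings; j, k, m, and n are method loadings; A is the correlation between Traits 1 and 2; and D is the correlation between Methods 1 and 2.

```python
# Hypothetical standardized path values, assumed for illustration only.
a, b = 0.7, 0.7   # trait loadings of Measures 1 and 4 on Trait 1
d, e = 0.6, 0.6   # trait loadings of Measures 2 and 5 on Trait 2
j, k = 0.4, 0.4   # method loadings of Measures 1 and 2 on Method 1
m, n = 0.4, 0.4   # method loadings of Measures 4 and 5 on Method 2
A = 0.5           # correlation between Traits 1 and 2
D = 0.3           # correlation between Methods 1 and 2

r41 = a * b + j * D * m        # monotrait-heteromethod
r21 = a * A * d + j * k        # heterotrait-monomethod
r51 = a * A * e + j * D * n    # heterotrait-heteromethod

print(round(r41, 3), round(r21, 3), round(r51, 3))
```

With these particular values the monotrait-heteromethod correlation is largest and the heterotrait-heteromethod correlation is smallest. Raising the method loadings j and k, or lowering D toward zero, changes that ordering in the ways the text describes.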
Monotrait-Monomethod: $r_{11} = a$.

In the model of Campbell and Fiske, these terms are reliabilities (they
are working from a correlation matrix). In the path model approach,
the solution process extracts trait and method variance and leaves
a residual, which in Figure 7.3 is the e terms. Reliability can be
determined from the residuals in path modeling using 1 - e.

Confirmatory Factor Analysis Approaches to Multitrait-Multimethod Matrices and Model Identification

Kenny and Kashy (1992) provided a detailed discussion of identification
of MTMM matrices. Initially, for the 3 (traits) × 3 (methods)
MTMM model, Kenny and Kashy noted that a model attempting to
correlate trait factors with method factors will not be identified.
Furthermore, if researchers attempt to estimate a solution with traits
independent from methods but in which all the loadings on a single
factor are forced to be equal, then underidentification will result.
Finally, even for a model like the one presented in Figure 7.3 in which,
due to independence of trait and method factors, empirical identification
problems seem less likely, Kenny and Kashy suggested that
most data sets have had problems in finding viable solutions. In the
structural equation literature, a viable solution is one in which all
coefficients are acceptable. Unacceptable values include negative
variances, for either residuals or factors, and covariances that exceed
the product of the standard deviations of the variables that covary
(i.e., equivalent to a correlation with an absolute value greater than
1.0). For MTMM matrices, assuming that trait measures correlate
positively with one another, loadings within trait factors with differing
signs also indicate that there likely were problems in estimation.
As the clearest illustration of the likelihood of problems, Wothke
(1987) examined 23 different MTMM data sets, attempting to fit
them to CFA models with traits independent of methods. Although
the problems varied from data set to data set, he reported failure to
obtain an acceptable solution in all 23 cases. Kenny and Kashy (1992)
suggested that identification in such MTMM models is most likely to
occur if the loadings of measures within factors diverge. Otherwise,
the solution approximates the equal loading one that they showed
not to be identified, with the consequence that problems of empirical
identification (the data set producing identification problems in a
model that could be identified) emerge. But even in the final instance,


with divergent loadings, Kenny and Kashy suggested that there still
might be other problems in reaching a solution.
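The definition of an unacceptable solution given above (negative variances, or covariances larger in absolute value than the product of the two standard deviations) lends itself to a simple screening routine. The sketch below is a minimal illustration; the function, its argument format, and the example estimates are all hypothetical and do not correspond to the output of any particular program.

```python
import math


def inadmissible_values(variances, covariances):
    """List estimates that make a solution unacceptable.

    variances: dict mapping a name to its estimated variance.
    covariances: dict mapping a (name, name) pair to its covariance.
    """
    problems = []
    for name, variance in variances.items():
        if variance < 0:
            problems.append("negative variance for " + name)
    for (x, y), covariance in covariances.items():
        vx, vy = variances[x], variances[y]
        # A covariance beyond the product of the standard deviations
        # implies a correlation greater than 1.0 in absolute value.
        if vx >= 0 and vy >= 0 and abs(covariance) > math.sqrt(vx * vy):
            problems.append("covariance of " + x + " and " + y + " out of range")
    return problems


# Invented estimates containing both kinds of problems.
problems = inadmissible_values(
    {"trait1": 1.0, "trait2": 1.0, "e3": -0.05},
    {("trait1", "trait2"): 1.08},
)
print(problems)
```

A check of this kind flags only the two numerical symptoms named in the text. Loadings of differing sign within a trait factor would need to be inspected separately.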
Given problems that emerged with 3 × 3 MTMM matrices, one
might decide to abandon MTMM models altogether. Such a solution
seems misguided, for the problem described appears only in the fully
crossed model. So, for example, if one method were to exert little
common influence on measures (i.e., be weak) and could be dropped,
then the model could be identified. Furthermore, given the intuitive
appeal of the MTMM CFA model, abandoning MTMM models seems
severe and simplistic, for we know that there are sources of common
method variance that will bias our solutions if they are ignored.
Unfortunately, however, there are no ideal structural equation
alternatives (see also Marsh & Grayson, 1995). First, researchers may as
well examine plausibility of their data's fit to an additive model, but
they might well expect to encounter problems in estimation. If such
problems occur, then they can try one of several alternatives.
Kenny and Kashy (1992) suggested, as a first alternative, specifying
methods as residual covariances in the structural model rather
than specifying them as methods factors. Such an approach will
produce a solution but has two weaknesses. First, as described earlier
in this chapter, a residual covariance approach can fit data that display
structures other than common method factors. Second, the approach
requires that methods be independent of one another; if methods
correlate, then the solution will be biased, likely overestimating
convergent validity and underestimating discriminant validity.
A second alternative suggested by Kenny and Kashy (1992) is to
drop one of the factors, choosing from among the methods factors
because it makes little sense to drop a trait factor. In fact, I have
encountered such an instance (Maruyama, 1982), for one of the
methods employed was a free response method that in fact produced
no method variance. Barring a source of an easy decision such as that
one, however, Kenny and Kashy described an approach that drops a
factor without actually dropping a factor. That approach, similar to
effect coding in analysis of variance, assigns weights of +1 and -1 to
various methods factors so that the methods factors actually end up
contrasting various methods. For such an approach, a covariance
matrix rather than a correlation matrix should be analyzed. Kenny
and Kashy suggested that, because of the restrictive assumptions made
in contrasting methods, this approach tends inaccurately to lower
discriminant validity and increase convergent validity as well as to


lower estimates of method variance. Finally, if no variant of an
additive model fits, then nonadditive effect models could be examined
for plausibility.

Summary of Confirmatory Factor Analysis and Multitrait-Multimethod Models

In summary, this section introduced formal ways to think about and
handle effects of common method variance within structural equation
models. Even though problems may appear if the data include a fully
crossed set of methods and traits, it is important to consider specifying
methods effects in models as a means of teasing apart trait true score
variance from other sources of variance that obscure the nature of
trait relationships. That is, additive effects models such as the MTMM
model described in this chapter can readily handle prominent method
variance provided that methods are not fully crossed with traits.

Initial Testing of Plausibility of Models: Consistency Tests

One of the primary advantages of introducing multiple measures of
latent variables is that information from them can be used to examine
whether or not those measures define an underlying variable in a
consistent way. This section demonstrates one way in which multiple
indicators can be used to "test" for consistency. The perspective
presented was developed primarily by Costner and his colleagues
(e.g., Costner, 1969; Costner & Schoenberg, 1973). There are formal
tests that can be used to test consistency using canonical correlation
or structural equation models. For this discussion, however, knowing
how to use those tests is less important than gaining a good understanding
of what multiple indicators provide in the way of information
and how those indicators can be used to examine viability of
constructs. The approaches described illustrate, in a simpler way,
the processes that are used in latent variable structural equation
models. As noted earlier in this chapter, readers who want information
beyond what is presented should consider Kenny (1979). This
section is presented assuming a correlation metric, and this is the way
in which the approach was developed.


Number of Indicators and Consistency Tests

Figure 7.2 can be used to illustrate how consistency tests can be
performed. Because only X's appear in the figure, any correlations
presented will be expressed using only the numbers, for example, $r_{12}$
rather than $r_{X_1X_2}$. Consider the model first imagining that only $X_1$ is
available to measure X. In that case, X needs to be defined exactly by
$X_1$, so path a is fixed to unity and $e_1$ is fixed to zero. Assuming
measures and constructs are the same is what is done by path analysis.
In doing path analysis, therefore, researchers have to hope that $X_1$ is
at least a close approximation of X.
Second, imagine that only $X_1$ and $X_2$ are available. In that instance,
there is one correlation between the two measures ($r_{12}$) and two paths
to estimate (a and b). What results when the tracing rules from path
analysis are applied to the model is one equation in two unknowns,
$r_{12} = ab$, which is an underidentified model. It can be estimated by
assuming that the paths are equal (a = b); by selecting two values that,
when multiplied together, yield the correlation; or by fixing one of the
two paths to unity (1.0). In the first case, each of the two paths is
the square root of the correlation and, using the terminology introduced
earlier, the indicators are assumed to be parallel. In the last case, the
path that is not fixed becomes the correlation. The middle case works
but is very difficult to justify, for selection of the two values is arbitrary,
as is their assignment to the two measures. In summary, having two
indicators provides some flexibility and is markedly better than having
only a single indicator, but it still is less than ideal.
11. There are alternatives such as adjusting for unreliability, but, as noted earlier in this
book, such corrections are risky, for they may be inaccurate.

Continuing the progression of adding new indicators, imagine
that the first three indicators of X are available. In this case, there are
three correlations between indicators, yielding three equations and
three unknowns. The model then is just identified. From the tracing
rules, the equations are

$$r_{12} = ab \text{ (same as the two-indicator model)},$$
$$r_{13} = ac, \text{ and}$$
$$r_{23} = bc.$$

Thus, the model can estimate a, b, and c; those estimates can be seen
most easily in terms of their squares:

$$a^2 = (r_{12} r_{13}) / r_{23} = (ab \cdot ac) / bc = a^2,$$
$$b^2 = (r_{12} r_{23}) / r_{13} = (ab \cdot bc) / ac = b^2, \text{ and}$$
$$c^2 = (r_{13} r_{23}) / r_{12} = (ac \cdot bc) / ab = c^2.$$

The estimates are not independent of one another, for they all involve
the same three correlations. Furthermore, as is true of all just-identified
models, there is only a single way in which to estimate each path,
and no test of fit is possible. Thus, having available three indicators
is valuable, for it yields estimates of each of the three paths. On the
other hand, within a single factor model there is no way in which to
judge fit of those estimates, for the model is just identified and will
fit perfectly.
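The just-identified solution for three indicators can be written as a short function. This is only a sketch of the algebra above: the function name is hypothetical, the example correlations are invented (they were generated from loadings of .8, .7, and .6), and all three correlations are assumed to be positive so that the square roots are defined.

```python
import math


def three_indicator_loadings(r12, r13, r23):
    """Solve the three tracing equations for paths a, b, and c.

    Assumes all three correlations are positive.
    """
    a = math.sqrt(r12 * r13 / r23)
    b = math.sqrt(r12 * r23 / r13)
    c = math.sqrt(r13 * r23 / r12)
    return a, b, c


# Invented correlations: r12 = .56, r13 = .48, r23 = .42.
a, b, c = three_indicator_loadings(0.56, 0.48, 0.42)
print(round(a, 3), round(b, 3), round(c, 3))
```

Because the model is just identified, multiplying the recovered paths back together reproduces the three correlations exactly, which is why no test of fit is available at this point.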
Adding a fourth indicator allows tests of the consistency of
estimates, for there now are more degrees of freedom than paths (six
correlations and four paths). Any number of indicators greater than
four, of course, allows similar tests and more of them. The equations
are as follows (the first three are the same as from the three-indicator
model):

$$r_{12} = ab,$$
$$r_{13} = ac,$$
$$r_{23} = bc,$$
$$r_{14} = ad,$$
$$r_{24} = bd, \text{ and}$$
$$r_{34} = cd.$$

Estimating the paths as before yields squares of the paths:

$$a^2 = (r_{12} r_{13}) / r_{23} = (r_{12} r_{14}) / r_{24} = (r_{13} r_{14}) / r_{34},$$
$$b^2 = (r_{12} r_{23}) / r_{13} = (r_{12} r_{24}) / r_{14} = (r_{23} r_{24}) / r_{34},$$
$$c^2 = (r_{13} r_{23}) / r_{12} = (r_{13} r_{34}) / r_{14} = (r_{23} r_{34}) / r_{24}, \text{ and}$$
$$d^2 = (r_{14} r_{24}) / r_{12} = (r_{14} r_{34}) / r_{13} = (r_{24} r_{34}) / r_{23}.$$

There are three ways of estimating each of the paths. (Do not forget
to take square roots to get the paths.) If the model fits the data, then
the various estimates of each coefficient should be consistent with one
another (i.e., approximately the same). If, however, the different ways
of estimating a coefficient yield markedly different estimates, then
there are problems in the model.


Although consistency could be assessed by calculating each estimate
of a, b, c, and d in all possible ways, that approach is not optimal
because the different estimates are not independent of one another;
there are only 2 degrees of freedom in the model. A more efficient
way in which to examine consistency is to use the three different pairs
of correlations that should be equal. Starting from any of the equations,
deleting the redundant term, and moving all terms from the
denominator will result in two of the three pairs of correlations. The
three pairs are

$$r_{12} r_{34} = r_{13} r_{24} = r_{14} r_{23}.$$

For example, consider

$$(r_{12} r_{13}) / r_{23} = (r_{12} r_{14}) / r_{24}.$$

The $r_{12}$ can be deleted from both sides of the equation (by dividing
both sides by $r_{12}$), leaving

$$r_{13} / r_{23} = r_{14} / r_{24}.$$

Multiplying by $(r_{23} \times r_{24})$ yields

$$r_{13} r_{24} = r_{14} r_{23}.$$

Note that all four measures appear in the subscripts on each side of
the equation.
The equality $r_{12} r_{34} = r_{13} r_{24} = r_{14} r_{23}$ yields three of what Kenny
(1979) called "vanishing tetrads," for the differences between the
pairs of correlations should be 0 if the model is in fact true and a
single factor fits the data. Because the four indicators all define
a single factor, this consistency test can be thought of as consistency
within a construct. The vanishing tetrads are

$$r_{12} r_{34} - r_{13} r_{24} = 0,$$
$$r_{12} r_{34} - r_{14} r_{23} = 0, \text{ and}$$
$$r_{13} r_{24} - r_{14} r_{23} = 0.$$

The tetrads are not independent, for the model has only 6 - 4 = 2
degrees of freedom. Nonetheless, they provide valuable information


about plausibility of a single-factor model. If the tetrads approximate
zero, then the single-factor model seems plausible.
Before reading further, readers should attempt Exercise 7.2, in
which path estimates and vanishing tetrads are calculated.
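The within-construct consistency check can be sketched as follows. The correlations are invented for illustration (the consistent set is generated from assumed loadings of .8, .7, .6, and .5), the function names are hypothetical, and this example is not a substitute for the exercise.

```python
def tetrads(r):
    """Three tetrad differences for four indicators of one factor.

    r maps index pairs such as (1, 2) to correlations.
    """
    return (r[1, 2] * r[3, 4] - r[1, 3] * r[2, 4],
            r[1, 2] * r[3, 4] - r[1, 4] * r[2, 3],
            r[1, 3] * r[2, 4] - r[1, 4] * r[2, 3])


def squared_path_estimates(r):
    """Three ways of estimating the square of path a."""
    return (r[1, 2] * r[1, 3] / r[2, 3],
            r[1, 2] * r[1, 4] / r[2, 4],
            r[1, 3] * r[1, 4] / r[3, 4])


# Correlations implied exactly by a single factor (assumed loadings).
loadings = {1: 0.8, 2: 0.7, 3: 0.6, 4: 0.5}
consistent = {(i, j): loadings[i] * loadings[j]
              for i in loadings for j in loadings if i < j}

# Same correlations except that r34 is inflated, as might happen if
# the third and fourth indicators shared something beyond the factor.
inconsistent = dict(consistent)
inconsistent[3, 4] = 0.50

print([round(t, 3) for t in tetrads(consistent)])
print([round(t, 3) for t in tetrads(inconsistent)])
print([round(e, 3) for e in squared_path_estimates(inconsistent)])
```

For the consistent set, all three tetrads vanish and the three estimates of the squared path agree. For the altered set, two of the tetrads depart from zero and the estimate that uses the inflated correlation diverges from the other two.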
I

Costner's Original Consistency

Model

The "classic" model developed by Costner and his colleagues (Costner, 1969; Costner & Schoenberg, 1973) appears in Figure 7.4. Note that if X and Y are the same variable (e = 1), then Figures 7.2 and 7.4 would be identical and there would be nothing new to discuss. Assuming that they are not identical, this model tests consistency between constructs.

As was done for the previous model, the logic of path analysis can be used to trace the paths and represent the relationships between the observed measures. The model has six correlations and five paths to estimate, thus leaving only 1 degree of freedom. The equations are

\[ r_{X_1X_2} = ab, \]
\[ r_{X_1Y_1} = aec, \]
\[ r_{X_1Y_2} = aed, \]
\[ r_{X_2Y_1} = bec, \]
\[ r_{X_2Y_2} = bed, \text{ and} \]
\[ r_{Y_1Y_2} = cd. \]

Although it may not be immediately obvious, these equations can be combined to yield the following equality:

\[ r_{X_1Y_1}\, r_{X_2Y_2} = r_{X_1Y_2}\, r_{X_2Y_1}, \quad \text{or} \quad aec \times bed = aed \times bec = abcde^2. \]

Because both sides should be the same, their difference should be zero. If this difference approximates zero, then the model fits (i.e., there seems to be no nonrandom measurement error in the model).
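
The between-construct check can be sketched in the same way. The Python fragment below is an illustration only, not the author's procedure: the five path values are hypothetical, and the variable names are assumptions. It generates the six correlations from the tracing equations, confirms that the cross-construct tetrad vanishes, and shows a ratio that follows algebraically from the six equations, namely that (aec)(bed)/[(ab)(cd)] equals e squared.

```python
# Hypothetical path values for the model in Figure 7.4 (assumed for illustration).
a, b = 0.80, 0.70  # paths from the first construct to X1 and X2
c, d = 0.90, 0.60  # paths from the second construct to Y1 and Y2
e = 0.50           # relationship between the two constructs

# The six correlations implied by the tracing rule.
r_x1x2 = a * b
r_x1y1 = a * e * c
r_x1y2 = a * e * d
r_x2y1 = b * e * c
r_x2y2 = b * e * d
r_y1y2 = c * d

# Consistency between constructs: this difference should be zero.
between_tetrad = r_x1y1 * r_x2y2 - r_x1y2 * r_x2y1

# Dividing abcde^2 by (ab)(cd) leaves e^2.
e_squared = (r_x1y1 * r_x2y2) / (r_x1x2 * r_y1y2)

print(between_tetrad, e_squared)
```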
Figure 7.4. Costner Model for Consistency Tests

Finally, Kenny (1979) introduced yet a third model variation that can be used for consistency tests. This variant begins with a three-indicator, just-identified model and adds a fourth indicator that comes from a different conceptual variable, as illustrated in Figure 7.5. Figure 7.5 also is a model with 6 available degrees of freedom and five paths and, thus, seemingly would have 1 degree of freedom. In fact, there are 2 degrees of freedom coupled with underidentification, for the d and e paths cannot be uniquely solved; only the de product can be determined.
The $X_1$, $X_2$, and $X_3$ relationships are exactly the same as in the three-indicator, single-factor model, and their relationships with $Y_1$ are

\[ r_{X_1Y_1} = aed, \]
\[ r_{X_2Y_1} = bed, \text{ and} \]
\[ r_{X_3Y_1} = ced. \]

The equality for this model is

\[ r_{X_1X_2}\, r_{X_3Y_1} = r_{X_1X_3}\, r_{X_2Y_1} = r_{X_2X_3}\, r_{X_1Y_1}, \quad \text{or} \quad ab \times ced = ac \times bed = bc \times aed = abcde, \]


which could yield two vanishing tetrads. This model allows testing consistency of indicators on constructs with only three available indicators. Kenny (1979) called this consistency of the epistemic correlation, which is the relationship of an indicator with the underlying construct.

Figure 7.5. Kenny Epistemic Consistency Model

In summary, the consistency tests illustrate the information that is gained by the availability of multiple indicators. They also allow investigators to examine plausibility of their models at a model development stage. They can be used during development of constructs to establish plausibility of single "factoredness" or to identify indicators that could be problematic in structural equation models. If, for example, "extra" indicators are available, then consistency information could be used to decide whether or not to drop indicators before building models or to add to a model additional factors representing influences such as method variance.

Most important for many investigators, the "consistency" approaches can remove much of the mysticism that comes from structural equation models generally and from large models particularly by giving investigators a better feel for their data. They can be used to examine sources of problems when models are not fitting well. They also serve a valuable prospective function, for investigators can use these methods with pilot data to get a sense of the factor structure. (In such instances, inspection of outliers is particularly important, for single outliers can markedly change correlations in small samples.) Finally, full-information solutions like those used with latent variable structural equation models estimate paths using information from all the different ways of estimating them, in effect trying to reconcile the different ways of estimation. Thus, to the degree that different estimates are not consistent, fit suffers and estimates can become less stable.

Readers should now try Exercise 7.3, which takes the information from the consistency tests and uses it to demonstrate how overall model fit is calculated. This illustration is very important, for it demonstrates how SEM programs calculate the goodness-of-fit statistics and indexes that they do.

At this point, all the background information to prepare readers to become SEM researchers has been provided. Readers should be familiar with the logic underlying basic path models, how those models can be decomposed into direct and indirect causal effects plus noncausal effects, options (nonrecursive models or longitudinal panel models) to consider when two or more variables seem to cause each other, the importance of having available multiple measures of constructs of interest, and how to model residual covariances stemming from additional sources of common variance. Finally, before turning to full latent variable structural equation models, it is important to again issue the reminder that SEM approaches begin with and are driven by theory. They are intended to be confirmatory (i.e., to test existing models of reality), not to be tinkered with to generate models of reality.

EXERCISE 7.1

Setting Up Matrices for Confirmatory Factor Analysis

Matrices for the MTMM CFA model are set up in exactly the same way as was done in an earlier section of this chapter. Set up the matrices for Figure 7.3. Call the factor pattern matrix lambda, and set it up. (Hint: There are trait-plus-method number of factors.) Call the factor correlation matrix phi, and set it up. Call the residual matrix theta, and set it up.

EXERCISE 7.2

Consistency Tests

Using the approach outlined in this chapter, estimate the loadings of the first and last measures of each construct in all possible ways (four variables should yield three ways, and five variables should yield six ways). If readers want to look at a relevant diagram, they should look at Figure 7.2. For the first example, just add an additional indicator $X_5$ with a path e; for the second illustration, imagine that the X's are Y's.

Get the pooled estimate of those loadings by summing numerators and denominators from the various estimates separately. Finally, calculate the "vanishing tetrads" generated from Measures X1-X4 for each construct, and, by inspection, assess plausibility of a single-factor model.

Construct 1: Academic Achievement Values

X1 = studying consistently to become well educated
X2 = working hard to achieve academic honors
X3 = striving to get top grade point average
X4 = studying hard to get good grades
X5 = hours spent on homework

        X1     X2     X3     X4     X5
X1    1.00
X2     .47   1.00
X3     .41    .55   1.00
X4     .46    .56    .59   1.00
X5     .06    .10    .08    .10   1.00

Construct 2: Family Social Class

Y1 = home richness index
Y2 = family finances
Y3 = father's education
Y4 = mother's education

        Y1     Y2     Y3     Y4
Y1    1.00
Y2     .42   1.00
Y3     .32    .31   1.00
Y4     .27    .35    .55   1.00

EXERCISE 7.3

Calculating Residual Matrices Used in Fit Tests

Use the following estimates of the paths from the measures to the underlying factor/construct for the first part of Exercise 7.2 (i.e., the Academic Achievement Values construct):

X1 path = .585
X2 path = .762
X3 path = .723
X4 path = .782
X5 path = .118.

Use the above paths to estimate what each of the correlations between each pair of measures is according to the model. That is done by using the tracing rules, for example, $r_{12}$ = .585 × .762. In Figure 7.2, $r_{12} = a \times b$. Similarly, $r_{13} = a \times c$, and so forth. Put each correlation into matrix form paralleling the matrix in the first part of Exercise 7.2. When all 10 correlations have been computed, the result is a predicted variance/covariance matrix (call it $\Sigma$) for the model. Compare the matrix predicted by the model and the one observed (call it S). The difference between the predicted and observed covariance matrices ($\Sigma - S$) is the residual. That residual is what is tested for significance in structural equation programs. Because the test is a test of the residual, significance is not wanted, for that means that the residual is different from 0, which means that the model does not fit; it leaves unexplained appreciable variability. Thus, it is a significance test that seems "backward."

Does your inspection of the differences lead you to the same conclusion that the vanishing tetrads did?

In maximum likelihood programs such as LISREL, the fitting function is of the form

\[ F = \ln|\Sigma| - \ln|S| + \mathrm{tr}(S\Sigma^{-1}) - n, \]

where $\Sigma$ is the predicted variance/covariance matrix, S is the observed variance/covariance matrix, and n (or p + q if exogenous and endogenous variables are separated) is the size of the input matrix. In English, the equation says that the function is the log of the determinant of Matrix $\Sigma$ minus the log of the determinant of Matrix S plus the trace of the Matrix S times Matrix $\Sigma^{-1}$ minus n (where n is the size of the observed matrix). Regardless of whether or not readers follow all the matrix operations, the logic of minimization is that as S and $\Sigma$ converge, their determinants also converge, and the difference between the first two terms goes to 0. Also as they converge, $\Sigma^{-1}$ approaches $S^{-1}$, which makes $S\Sigma^{-1}$ approach an identity matrix. Because the trace is the sum of the diagonal elements, it approaches n, and their difference also goes to 0.
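
The behavior of the fitting function can be seen in a small numerical sketch. The Python code below uses NumPy and is an illustration under stated assumptions rather than LISREL's actual code: the function name, the three hypothetical loadings, and the made-up "observed" matrix are not from the text, and the predicted matrix is simply held fixed instead of being estimated. It shows that F is 0 when the observed and predicted matrices are identical and positive when they differ.

```python
import numpy as np


def ml_fit_function(S, Sigma):
    """F = ln|Sigma| - ln|S| + tr(S Sigma^-1) - n for n x n matrices."""
    S = np.asarray(S, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    n = S.shape[0]
    # slogdet returns (sign, log|det|); both matrices are assumed positive definite.
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - n


# Predicted matrix from hypothetical single-factor loadings (tracing rule).
loadings = np.array([0.60, 0.70, 0.80])
Sigma = np.outer(loadings, loadings)
np.fill_diagonal(Sigma, 1.0)

# A made-up "observed" correlation matrix that departs slightly from Sigma.
S_observed = np.array([[1.00, 0.45, 0.44],
                       [0.45, 1.00, 0.58],
                       [0.44, 0.58, 1.00]])

f_perfect = ml_fit_function(Sigma, Sigma)      # observed equals predicted
f_misfit = ml_fit_function(S_observed, Sigma)  # observed differs from predicted
print(f_perfect, f_misfit)
```

The first value is zero up to rounding, and the second is a small positive number that grows as the two matrices move farther apart.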
SOLUTIONS TO EXERCISES

Exercise 7.1

Lambda (rows are the measures; columns are the factors T1, T2, T3, M1, M2, M3)

Phi
      T1    T2    T3    M1    M2    M3
T1   1.0
T2         1.0
T3               1.0
M1                     1.0
M2                           1.0
M3                                 1.0

Theta (diagonal)
[e1 e2 e3 e4 e5 e6 e7 e8]

Exercise 7.2

A solution for Construct 1 appears in Exercise 7.3. Of most importance is that the model fits well in Construct 1 despite having one item that correlates poorly with all other indicators. Its low loading likely suggests it should be dropped from the model, yet it does not lead to a poor fit, an important point to remember when thinking about overall model fit. Construct 2 yields discrepant estimates for paths as well as nonvanishing tetrads. It is not single-factored.

Exercise 7.3

Consistent with the findings from Exercise 7.2, all the residuals are very small, supporting the single-factor interpretation.

Illustration 3: Peer Popularity and Academic Achievement
Confirmatory Factor Analysis

This illustration continues analysis of a single data set with different methods. As was done with the prior two illustrations with this data set, the model setup for the SEM program LISREL appears as an appendix. LISREL output is added to the appendix. The matrices presented correspond directly to Equation 7.6 presented earlier:

\[ \Sigma_{yy} = \Lambda\Phi\Lambda' + \Theta, \]

where $\Sigma_{yy}$ is the variance/covariance matrix of observed measures, $\Lambda$ (and P) is the factor pattern matrix, $\Phi$ is the factor correlation matrix, and $\Theta$ is the residual variance/covariance matrix.

On the basis of other analyses, the theoretical variable of Family Social Class was dropped from the model because it was not related to any other variable, either in the observed variable models or in the latent variable models. The remaining eight latent variables (which cross three time periods) appear in Figure 7.6. Readers should note that they are the same variables as appear in Figure 9.3 but that those appear in a path model rather than in a confirmatory factor model. Their measures are as follows:

1. Academic Ability, measured by the Peabody PVT (16) and the Raven Progressive Matrices (17);
2-4. Acceptance by Peers, measured by choices for seating, schoolwork, and playground (three waves: 13, 14, 15; 4, 5, 6; 7, 8, 9);
5-7. Academic Achievement, measured by performance on standardized verbal achievement tests and verbal grades (also three waves: 27, 28; 18, 19; 20, 21); and
8. Teacher Ratings, measured by the semantic differential scale score (30) and a general expectation rating (32).

Although the greatest interest in these data stems from the structural model relationships between peer acceptance and achievement, they also are amenable to CFA. As will be explained in detail in Chapter 10, the fit of the CFA model would be identical to the fit of a just-identified structural model causally linking the variables.
The solution that results from the program and model is as follows:

Figure 7.6. Confirmatory Factor Analysis for Four-Part Illustration

LISREL Estimates (maximum likelihood):

Relations of measures to constructs (lambda Y)

Measure   Latent variable   Estimate   (SE)     t
VAR 16    Ability           1.00
VAR 17    Ability            .60       (.25)    2.41
VAR 13    PeerAcc1          1.00
VAR 14    PeerAcc1           .69       (.14)    4.85
VAR 15    PeerAcc1           .84       (.16)    5.39
VAR 27    Achieve1          1.00
VAR 28    Achieve1           .98       (.24)    4.13
VAR 4     PeerAcc2          1.00
VAR 5     PeerAcc2          1.00       (.11)    9.05
VAR 6     PeerAcc2          1.06       (.11)    9.72
VAR 18    Achieve2          1.00
VAR 19    Achieve2          1.11       (.15)    7.22
VAR 7     PeerAcc3          1.00
VAR 8     PeerAcc3           .95       (.13)    7.12
VAR 9     PeerAcc3           .88       (.12)    7.25
VAR 20    Achieve3          1.00
VAR 21    Achieve3           .61       (.14)    4.21
VAR 30    TEvaluat          1.00
VAR 32    TEvaluat           .61       (.15)    4.08

LISREL Estimates (continued)

Latent Variable
           Ability  PeerAcc1 Achieve1 PeerAcc2 Achieve2 PeerAcc3 Achieve3 TEvaluat
Ability      .46
            (.22)
            2.05
PeerAcc1     .15      .71
            (.09)    (.17)
            1.64     4.06
Achieve1     .24      .16      .30
            (.09)    (.08)    (.13)
            2.71     1.97     2.32
PeerAcc2     .04      .14      .08      .69
            (.08)    (.09)    (.07)    (.14)
            0.42     1.62     1.09     4.79
Achieve2     .30      .14      .37      .09      .50
            (.09)    (.08)    (.09)    (.07)    (.12)
            3.39     1.83     4.01     1.33     4.17
PeerAcc3    -.04      .05      .06      .14     -.03      .72
            (.09)    (.09)    (.08)    (.09)    (.07)    (.16)
           -0.47     0.56     0.81     1.68    -0.36     4.54
Achieve3     .25      .27      .55      .20      .57      .00      .85
            (.11)    (.11)    (.14)    (.10)    (.12)    (.10)    (.23)
            2.27     2.52     3.97     2.04     4.71     0.02     3.67
TEvaluat     .25      .27      .38      .10      .42      .02      .63      .64
            (.10)    (.10)    (.10)    (.09)    (.10)    (.09)    (.13)    (.19)
            2.56     2.76     3.83     1.19     4.37     0.24     4.96     3.48

NOTE: Standard errors are in parentheses; t values are in rows under standard errors.

The covariance metric makes it more difficult to interpret the size of the relationships. One could easily ask for the scaled solution, which provides estimates in which the latent variables are scaled to unit variance.

Although the chi-square is pretty good (chi-square with 115 degrees of freedom = 117.14, p = .43), other information from the solution suggests that the model could be improved. Given the existing measurement structure, however, there is no solution that would provide a better fit. Because there are no degrees of freedom in the relationships among the theoretical variables, improving the model would require some type of reconceptualization of the measurement model. That could be done by adding residual covariances to the existing measurement model or by changing the basic measurement model.

APPENDIX 7.1

LISREL Setup and Output From Illustration

Model: $\Sigma_{yy} = \Lambda\Phi\Lambda' + \Theta$,
where
$\Sigma_{yy}$ is the variance/covariance matrix of observed measures
$\Lambda$ (and P) is the factor pattern matrix, designated by the letters LY in LISREL
$\Phi$ is the factor correlation matrix, designated by PS in LISREL
$\Theta$ is the residual variance/covariance matrix, designated by TE in LISREL

The matrix for these analyses is on the opposite page.

The LISREL control cards (although the measures are selected from a larger matrix called MAfullmt.rx containing additional measures, which accounts for the 33 measures and the need for an SE line, plus the selection that follows it) are as follows:

Mexican American data, runs for choices of whites, CFA, with reference indicators
DA NI=33 NO=100 MA=CM
KM FU FO FI=a:MAfullmt.r
(8F10.7)
SD FO
(11F7.5)
1.0 1.0 1.0 1.0 1.0 1.0 1.025 1.049 .981 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 .901 .907 1.114 1.200 .911
.936 .766 .875 .926 1.0 1.0 1.0 1.0 .705 1.0 1.0
SE
16 17 13 14 15 27 28 4 5 6 18 19 7 8 9 20 21 30 32 /
MO NY=19 NE=8 LY=FU,FI BE=FU,FI PS=SY,FR TE=SY,FI
FR LY 2 1 LY 4 2 LY 5 2 LY 7 3 LY 9 4 LY 10 4 C
LY 19 8 LY 12 5 LY 14 6 LY 15 6 LY 17 7
FR TE 1 1 TE 2 2 TE 3 3 TE 4 4 TE 5 5 TE 6 6 TE 17 17 TE 18 18 C
TE 7 7 TE 8 8 TE 9 9 TE 10 10 TE 11 11 TE 12 12 TE 13 13 TE 14 14 TE 15 15 C
TE 16 16 TE 17 17 TE 11 6 TE 12 7 TE 13 8 TE 14 9 TE 16 6 C
TE 17 7 TE 15 10 TE 16 11 TE 17 12 TE 18 18 TE 19 19
ST 1.0 LY 1 1 LY 3 2 LY 6 3 LY 8 4 LY 11 5 LY 13 6 LY 16 7 C
LY 9 4 LY 8 4 LY 18 8
ST .7 LY 4 2 LY 7 3 LY 5 2
ST .3 LY 2 1
ST 1.0 PS 1 1 PS 2 2 PS 3 3 PS 4 4 PS 5 5 PS 6 6 PS 7 7 PS 8 8
ST .7 TE 1 1 TE 2 2 TE 3 3 TE 4 4 TE 5 5 TE 6 6 TE 7 7 TE 8 8 TE 9 9 C
TE 10 10 TE 11 11 TE 12 12 TE 13 13 TE 14 14 TE 15 15 TE 16 16 TE 17 17
ST .6 TE 18 18 TE 19 19
Path Diagram
OU PT SE TV AD=OFF LY=SMACFAlm BE=SMACFAlm PS=SMACFAlm TE=SMACFAlm
The fit indexes for the CFA analysis (to review after reading Chapter 10) were as follows:

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 115 DEGREES OF FREEDOM = 117.14 (P = 0.43)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 2.14
90 PERCENT CONFIDENCE INTERVAL FOR NCP = (0.0 ; 31.84)

MINIMUM FIT FUNCTION VALUE = 1.18
POPULATION DISCREPANCY FUNCTION VALUE (F0) = 0.022
90 PERCENT CONFIDENCE INTERVAL FOR F0 = (0.0 ; 0.32)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.014
90 PERCENT CONFIDENCE INTERVAL FOR RMSEA = (0.0 ; 0.053)
P-VALUE FOR TEST OF CLOSE FIT (RMSEA < 0.05) = 0.93

EXPECTED CROSS-VALIDATION INDEX (ECVI) = 2.70
90 PERCENT CONFIDENCE INTERVAL FOR ECVI = (2.68 ; 3.00)
ECVI FOR SATURATED MODEL = 3.84
ECVI FOR INDEPENDENCE MODEL = 8.26

CHI-SQUARE FOR INDEPENDENCE MODEL WITH 171 DEGREES OF FREEDOM = 779.49
INDEPENDENCE AKAIKE INFORMATION CRITERIA (AIC) = 817.49
MODEL AIC = 267.14
SATURATED AIC = 380.00
INDEPENDENCE CAIC = 885.99
MODEL CAIC = 537.53
SATURATED CAIC = 1064.98

ROOT MEAN SQUARE RESIDUAL (RMR) = 0.056
STANDARDIZED RMR = 0.056
GOODNESS OF FIT INDEX (GFI) = 0.90
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.83
PARSIMONY GOODNESS OF FIT INDEX (PGFI) = 0.54

NORMED FIT INDEX (NFI) = 0.85
NON-NORMED FIT INDEX (NNFI) = 0.99
PARSIMONY NORMED FIT INDEX (PNFI) = 0.57
COMPARATIVE FIT INDEX (CFI) = 1.00
INCREMENTAL FIT INDEX (IFI) = 1.00
RELATIVE FIT INDEX (RFI) = 0.78

CRITICAL N (CN) = 130.47
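
Several of the indexes in this output can be recovered from the two chi-square values, their degrees of freedom, and the sample size. The Python sketch below is offered as an aid to reading the output, not as a description of LISREL's internal code. It assumes the commonly used textbook formulas (with N - 1 as the multiplier), and the variable names are illustrative; under those assumptions the results round to the values printed above.

```python
import math

# Values taken from the printed output and control cards.
chi2, df = 117.14, 115              # fitted CFA model
chi2_indep, df_indep = 779.49, 171  # independence model
n_obs = 100                         # NO=100 on the DA line
n_measures = 19                     # NY=19 on the MO line

# Noncentrality and discrepancy values.
ncp = max(chi2 - df, 0.0)
f_min = chi2 / (n_obs - 1)   # minimum fit function value
f0 = ncp / (n_obs - 1)       # population discrepancy function value
rmsea = math.sqrt(f0 / df)

# Free parameters = distinct variances/covariances minus degrees of freedom.
n_moments = n_measures * (n_measures + 1) // 2
n_free = n_moments - df
model_aic = chi2 + 2 * n_free
ecvi = model_aic / (n_obs - 1)

# Incremental indexes relative to the independence model.
nfi = 1 - chi2 / chi2_indep
cfi = 1 - ncp / max(chi2_indep - df_indep, ncp)

for name, value in [("NCP", ncp), ("Fmin", f_min), ("F0", f0),
                    ("RMSEA", rmsea), ("AIC", model_aic),
                    ("ECVI", ecvi), ("NFI", nfi), ("CFI", cfi)]:
    print(f"{name:6s} {value:8.3f}")
```

Note that the count of free parameters (75) agrees with the control cards: 11 free loadings, 28 free residual variances and covariances, and 36 factor variances and covariances.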

PART III

LATENT VARIABLE STRUCTURAL EQUATION MODELS

Moving to latent variable structural equation modeling (SEM) is now but a small step from methods and ideas that have been covered thus far. That step integrates the logic of factor analysis from Chapter 7 with the logic of path modeling. In latent variable modeling, the variables that appear in the path models actually are factors extracted through confirmatory factor analysis (CFA). The factors/variables are defined by a set of observed measures. Each measure is specified a priori as being related to one or more of the factors. The relationships between factors and measures are specified by equations exactly like the factor analysis model, Y = Pf + e. The factors then are interrelated using an equation that parallels the traditional regression equation, Y = AX + BY + residual (Y here is not the same Y as in the factor analysis equation). What prevents the solution from being a simple regression model is that the X's and Y's in the regression equation are not measured directly but rather are latent variables tapped only through the observed measures that are intended to operationalize them.

There are a couple of additional complications associated with the transition to latent variable SEM. First, there are two sets of factors extracted, one for endogenous variables (in typical factor analysis terminology, Y = Pf + e) and the other for exogenous variables (X = P'f' + e', with primes intended only to distinguish the coefficients in the X model from those in the Y model). Those factors (f' and f) then become the respective X and Y variables that are interrelated using the regression model (f = Af' + Bf + residual). Second, the different computer programs for analyzing latent variable SEMs have used various sets of symbols and formats to present the equations. This book uses the notation of the LISREL program (e.g., Joreskog & Sorbom, 1988, 1993), the first and most widely used of the SEM programs. LISREL presents the matrices using Greek letters to signify vectors and matrices. The issues are presented in a way that users of earlier versions, as well as the most recent versions, of the LISREL computer program should be able to understand and apply them.

This chapter presents the basics of latent variable SEM. Estimation in SEM is done using full information approaches (i.e., estimation of each parameter uses all available information from the covariance matrix in determining the estimate), which means that the factor and regression components of the models are estimated simultaneously. Nevertheless, as is commonly done by SEM programs, the presentation is divided up into two components. The component relating observed measures to latent variables is presented first, followed by the component interrelating latent variables. The importance of reference indicators, or measures used to provide a scale or metric for unmeasured variables, also is presented. Then the full model is illustrated through an example. The illustration covers issues of model specification and identification and sets up the matrices that are needed for latent variable SEM. Finally, basic issues of model fitting are discussed.

The Basic Latent Variable Structural Equation Model

The Measurement Model

The measurement model is the model discussed in Chapter 7 relating measures to theoretical variables or factors. It contains information about how theoretical variables are operationalized in each study. Although in path analysis information about operationalization can be hidden by labels (e.g., by calling a measure of school grades "achievement" and using that label in any figures and discussion), in latent variable models such information is more readily apparent. Each indicator needs to be described, and its relationship to the conceptual variable(s) it is supposed to assess needs to be specified. With respect to written research reports, the description of constructs/latent variables and the measures that operationalize them should appear in the introduction and methods sections. Consistent with notions that researchers need to specify the nature of relationships of measures with variables, inaccuracy or imprecision in defining latent variables usually is called specification error. A second type of specification error comes from inaccurately defining the relationships among latent variables. Thus, when researchers mention misspecified models, they are suggesting that there is inaccuracy in specifying relations of measures either to variables or among variables.

In the LISREL measurement model, two CFA models are built, one for exogenous variables and the other for endogenous variables. Actually, separating variables is not necessary; one can treat exogenous variables as if they were endogenous and thereby include the full factor model in a single set of equations. The approach is mathematically equivalent to the two sets of factors approach that is the basic one for computer programs such as LISREL. Because introducing the two approaches together can be confusing to readers, however, presentation of what will be called "an all Y model" is delayed until later and is covered only briefly because most other SEM programs are equation based rather than matrix based, making the distinction unnecessary. That is, because other SEM programs such as AMOS (Arbuckle, 1994, 1997), EQS (e.g., Bentler, 1989), and the SIMPLIS language of LISREL (Joreskog & Sorbom, 1993) are set up by defining individual equations rather than specifying elements of matrices, this distinction between measurement models is irrelevant. Despite their appearance, however, the programs actually use matrices equivalent to those presented in LISREL to solve for estimated parameters.
In all SEM programs, including LISREL and EQS, the measurement model is a series of regression equations linking measures to factors (the traditional factor analysis approach). Relationships can be specified either in a series of equations, one for each observed measure because in factor analysis observed measures are the dependent variables, or in matrix form consistent with the basic factor analysis formula. Whereas AMOS, EQS, and the SIMPLIS version of LISREL, for example, have researchers define their models equation by equation, the basic LISREL program has researchers specify row and column coordinates of parameters to be estimated within matrices.

Using LISREL terminology, in matrix form the factor analysis equations of the form Y = Pf + e are

\[ Y = \Lambda_Y \eta + \varepsilon \quad \text{for the endogenous variables and} \tag{8.1} \]
\[ X = \Lambda_X \xi + \delta \quad \text{for the exogenous variables.} \tag{8.2} \]

To explain the Greek letters while stating Equations 8.1 and 8.2 in narrative form, the equations are as follows: Equation 8.1, Y equals lambda Y times eta plus epsilon; Equation 8.2, X equals lambda X times xi plus delta. The two lambda matrices are the factor pattern matrices (the P's), eta is the vector of endogenous variables (factors), xi is the vector of exogenous variables (factors), and epsilon and delta are the residuals (e's) for the observed measures.

The single model used for SEM can handle path models with and without measurement error as well as models with nonrandom measurement error. If a model contains measurement error, then the residuals (epsilon and delta) are made up of both error and unique true score variances. If there is nonrandom error, then the variance/covariance matrices of those residuals can allow residuals within matrices to covary with one another.$^{12}$ They will not be just a vector of residual variances representing the diagonal elements of the matrices but rather will have off-diagonal elements that are nonzero.

12. One reason for combining the exogenous and endogenous variables into a single-factor model is so that residuals can covary across matrices. This reason is obviated in LISREL 8, which allows residuals to covary between the two matrices of residuals, and in equation form programs such as AMOS and EQS, in which the residual covariances can be named in a straightforward fashion.

To be able to work with the residual variance/covariance matrices, the equations for X and Y need to be expressed in terms of variance/covariance matrices of observed measures. They can be expressed that way by postmultiplying each side of the factor analysis equation by its transpose and taking expected values. The algebra for this operation is exactly the same as has been illustrated in Chapter 7 and, therefore, will not be repeated here. Thus, for $Y = \Lambda_Y \eta + \varepsilon$, the resulting equation is

\[ \Sigma_{yy} = \Lambda_Y\, E(\eta\eta')\, \Lambda_Y' + \Theta_\varepsilon; \tag{8.3} \]

for $X = \Lambda_X \xi + \delta$, the resulting equation is

\[ \Sigma_{xx} = \Lambda_X\, E(\xi\xi')\, \Lambda_X' + \Theta_\delta. \tag{8.4} \]

Finally, to define the new terms: the expected value of $\varepsilon\varepsilon'$ is a variance/covariance matrix called $\Theta_\varepsilon$, and the expected value of $\delta\delta'$ is $\Theta_\delta$. In addition, the expected value of $\xi\xi'$ is defined as a factor variance/covariance matrix $\Phi$; thus, the latter equation can be expressed as $\Sigma_{xx} = \Lambda_X \Phi \Lambda_X' + \Theta_\delta$. Finally, as is illustrated later in this chapter when the structural model is presented and explained, the expected value of $\eta\eta'$ cannot be expressed so simply, for it is a function of a number of other matrices.
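
A small numerical example may make the exogenous-side expression concrete. The Python sketch below (using NumPy) is an illustration only; the loadings, factor variance, and residual variances are hypothetical values chosen so that the observed measures have unit variance. It forms the model-implied matrix as lambda X times phi times lambda X transposed, plus theta delta.

```python
import numpy as np

# Hypothetical single-factor measurement model with three indicators.
lambda_x = np.array([[0.60],
                     [0.70],
                     [0.80]])                # factor pattern matrix (3 x 1)
phi = np.array([[1.0]])                      # factor variance fixed to 1.0
theta_delta = np.diag([0.64, 0.51, 0.36])    # residual variances, no covariances

# Model-implied variance/covariance matrix of the observed measures.
sigma_xx = lambda_x @ phi @ lambda_x.T + theta_delta
print(np.round(sigma_xx, 2))
```

Each off-diagonal element is the product of two loadings and the factor variance, and each diagonal element is a squared loading plus the matching residual variance. Allowing a nonrandom error component would amount to placing a nonzero value in an off-diagonal cell of `theta_delta`.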
Before presenting the structural model that interrelates theoretical variables, the issue of reference indicators is revisited and more fully explained and illustrated. Reference indicators provide a critical link between the measurement model's observed variables and the structural model's unmeasured theoretical variables. Without reference indicators, it is not possible to attain identification of latent variable models, for reference indicators provide a scale or metric for latent variables. Many users of SEM techniques seem to have trouble understanding why reference indicators are needed, how reference indicators operate, and what it means to say that selection of a reference indicator is arbitrary. These are the issues covered in the next section.

Reference Indicators

As noted in the preceding section, scaling of latent endogenous variables can cause problems, for there is no covariance matrix of latent endogenous variables (of etas) in which to specify the variances as set to particular values. Therefore, one needs to scale latent endogenous variables by fixing the relationship between an indicator and each latent variable. Table 8.1 provides an artificial illustration of how proportionality is maintained across selection of different indicators.

TABLE 8.1 Illustration of Reference Indicators

(a) Correlation Matrix

        X1     X2     X3
X1    1.00
X2     .42   1.00
X3     .48    .56   1.00

(b) Equivalent Versions of the Model

                                   Reference Indicator
Loading                        None     X1        X2        X3        Residual Variance
X1                             .60      1.00      .60/.70   .60/.80   .64
X2                             .70      .70/.60   1.00      .70/.80   .51
X3                             .80      .80/.60   .80/.70   1.00      .36
Variance of latent variable    1.00     .60^2     .70^2     .80^2

For Table 8.1, imagine that we have a single factor with three indicators. (Readers who like to visualize the model can refer to Figure 7.2, assuming that only three indicators, $X_1$, $X_2$, and $X_3$, are available.) The illustration is a CFA model, for there is no structural model with only a single latent variable. In CFA problems, latent variables can be scaled by fixing their variances to some constant. In the first column of the example, the variance of the latent variable is fixed to 1.0. By contrast, in structural models that hypothesize causal paths between latent variables, fixing the variance of endogenous variables is not an option. In those models, the variance of endogenous latent variables is a function of explained and unexplained variance and needs to be scaled by using a reference indicator. In other words, the solution in the first column is not possible for endogenous variables in structural models, for they cannot be fixed to a defined value. The solution would require selecting and scaling one of the indicators, yielding one of the solutions found in the second, third, and fourth columns.
What I have done in the illustration is to begin with values for
the relations of X1, X2, and X3 with the latent variable (Paths a, b,
and c in Figure 7.2) of .60, .70, and .80, respectively. Then, using the
tracing rule, the correlation between each pair of measures is the
product of the paths between them. That is, $r_{12}$ (ab) = .60 x .70 =
.42, $r_{13}$ (ac) = .60 x .80 = .48, and $r_{23}$ (bc) = .70 x .80 =
.56. The same values can be drawn from the path model consistency tests
described at the end of Chapter 7. Although that model typically is used
to solve for the paths (a, b, and c), it can be done "backward." As was
shown in Chapter 7, a, b, and c are related to the correlations:
$a^2 = r_{12} r_{13} / r_{23}$, $b^2 = r_{12} r_{23} / r_{13}$, and
$c^2 = r_{13} r_{23} / r_{12}$. The system of three equations in three
unknowns (correlations) is solvable. For example, because $a^2$ is just
.6 squared (i.e., .36), $.36 = r_{12} r_{13} / r_{23}$. Then,
multiplying $a^2$ times $b^2$, which is .49 (i.e.,
$.49 = r_{12} r_{23} / r_{13}$), gives $.36 \times .49 = r_{12}^2$, or
$r_{12}$ = sqrt(.1764) = .42. By a similar process, $r_{13}$ = .48 and
$r_{23}$ = .56, all answers the same as by the tracing rule. These
values for $r_{12}$, $r_{13}$, and $r_{23}$ appear as the correlation
matrix in Table 8.1, which yields factor loadings of .60, .70, and .80.
The matrix in Table 8.1 can be used in various structural equation
programs to produce the columns of estimates that appear in the
lower part of Table 8.1. The first column of numbers is what would
be estimated if the variance of the latent variable were fixed to 1.00,
the second if X1 were made the reference indicator (for the second,
third, and fourth columns, I have left the values as ratios rather than
inserting their numerical values), the third if X2 were made the
reference indicator, and the fourth if X3 were made the reference
indicator. The final column contains the residual variances, which are
unchanged across the four variations. The residuals are equal to the
total variances (each of which is 1 given that the variables are
standardized) minus the loading squared from the first column, in
which the latent variable is scaled to unit variance. For example, the
residual for X1 is $1 - .60^2$, or 1 - .36 = .64.
There are three important points to be made. The first is that
designating an indicator the reference indicator does not make the
indicator and the latent variable the same unless the reference
indicator's residual variance is fixed to 0. That is not done in this
example and should not be done when multiple indicators are available.
With multiple indicators, there is no need to fix residual variances to
zero; fixing the residual to zero makes the latent variable and observed
variable the same, which ignores important information about
reliability of the reference indicator. Second, the proportionality of
the indicators is unchanged by selection of a reference indicator. As
can be seen from Table 8.1, their relative sizes are maintained
regardless of which becomes the reference indicator. Third, the residual
variance is unchanged by selection of a reference indicator. Only the
variance of the latent variable changes. That change, of course, would
alter the nonstandardized paths to and from the latent variable;
however, if one standardizes the latent variables by converting their
variances to unity, then all variations would produce the same solution.
Most SEM programs provide scaled solutions in which latent variables are
rescaled to unit variance. Such a rescaling imposed on solutions from
any of the second, third, or fourth columns would yield as loadings
of the indicators the same values as are found in the solution of the
first column.
For readers who have access to an SEM program, I would suggest
as an exercise inputting the simple 3 x 3 correlation matrix and
estimating the solution by fixing the variance to 1.0 and by fixing
different indicators as reference indicators. Incidentally, even if the
model were overidentified, the fit indexes and statistics of the
different models would be identical, as is the case for just-identified
models such as the one illustrated in Table 8.1.
It is hoped that the illustration helps demystify selection of
reference indicators. Because selection is arbitrary, the issue of
reference indicators should be a simple one to remember, for it is the
same regardless of the type of structural model. For each endogenous
variable, specify one indicator as a reference indicator and fix its
relationship with the latent variable to some value, typically 1.
Selecting the most reliable indicator as the reference indicator
increases the variance of the latent variable and lowers the loadings of
indicators on it but has no effect on the relative loadings or on
overall model fit. With respect to structural paths, selecting different
reference indicators changes the unstandardized paths to and from the
latent variable but does not affect either significance of paths or the
size of paths if the latent variable is rescaled to unit variance.
At this point, the complete measurement model has been described.
In other words, in setting up this part of the SEM model, the
factors/latent variables have been operationalized (i.e., linked to
observed measures), so attention can be turned to the interrelationships
among the latent variables in the structural model.

The Structural Model

The structural model is the regression part of latent variable SEM.
The primary differences between latent variable structural models
and basic path analytic models are that (a) the variables in latent
variable models typically are not measured (the exception is where
there is only a single indicator of a conceptual variable) and that (b)
when calculating values for parameter estimates, no distinction needs
to be made between recursive and nonrecursive models or models
with residual covariation among latent variables. All models can be
handled by the general regression equation.

13. Using LISREL 7, I had some trouble getting the solution for the first column, in which
I fixed the variance of the latent variable to unity. Readers also might encounter problems
if they try that version of the program.
The variables in the regression equation are the etas and xis from
the measurement model. Those variables are related through the
general regression equation presented earlier in this chapter (Y =
AX + BY + E), but once again the Greek terminology may make them
seem different. The equation in LISREL for the structural model,
which perfectly parallels the regression equation and differs only by
using different symbols, is

$$\eta = B\eta + \Gamma\xi + \zeta. \tag{8.5}$$

Compare that with

$$Y = BY + AX + E.$$

In LISREL terminology, beta ($B$) is the matrix of regression weights
interrelating endogenous ($\eta$) variables, gamma ($\Gamma$) is the
matrix of regression weights relating exogenous ($\xi$) to endogenous
($\eta$) variables, and zeta ($\zeta$) is the vector of residuals for
the endogenous latent variables. If the beta matrix is or, by
reordering the endogenous variables, can be made lower triangular (i.e.,
all elements above the main diagonal are 0), then the model is recursive
and has unidirectional flow; if it cannot be made lower triangular, then
the model is nonrecursive. Unlike regression approaches, regardless of
recursivity, the model is estimated in the same way. As was true of
regression approaches, however, for nonrecursive models there are
additional concerns related to identification.
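Whether a beta matrix can be reordered into lower triangular form can be checked mechanically. The sketch below is one possible check, assuming NumPy; the function name `is_recursive` and the example matrices are hypothetical. It relies on the fact that a path diagram has no feedback loop exactly when the pattern of nonzero coefficients, raised to the power of the number of variables, is all zeros.

```python
import numpy as np


def is_recursive(beta):
    """True if the endogenous variables can be ordered so that beta is lower triangular."""
    pattern = (np.asarray(beta) != 0).astype(int)
    n = pattern.shape[0]
    # No feedback loops exist exactly when the n-th power of the
    # nonzero pattern is a zero matrix (the pattern is nilpotent).
    return not np.linalg.matrix_power(pattern, n).any()


# Hypothetical two-variable examples.
one_way = np.array([[0.0, 0.0], [0.4, 0.0]])   # eta1 -> eta2 only
feedback = np.array([[0.0, 0.3], [0.4, 0.0]])  # eta1 <-> eta2

print(is_recursive(one_way), is_recursive(feedback))
```

A null beta matrix, as in the illustration later in this chapter, is trivially recursive.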
An alternative form of the structural model equation moves all
the etas to the left side of the equation, yielding

$$(I - B)\eta = \Gamma\xi + \zeta. \tag{8.6}$$

For any readers familiar with the early versions of the LISREL
program, this is the form of the equation that was used except that
the matrix preceding eta was called $B$ rather than $I - B$. By calling
the matrix $B$ rather than $I - B$, all the coefficients in the beta
matrix had to have their signs reversed before interpreting them, for
the values in the matrix would be correct but have signs opposite to
their true signs ($-B$). By contrast, the form presented first (i.e.,
with the coefficients interrelating the endogenous variables on the
right side of the equation) yields estimates with the correct signs.
Later versions of LISREL switched because having to remember to reverse
signs was an unneeded complication for researchers not completely
comfortable with SEM approaches. (For the rest of us, reminiscing about
how beta was different will define us as "old-timers.")
The $I - B$ form of the structural model is useful for expressing the
structural model in terms of covariances. If the equation is changed
to express covariances in a fashion paralleling the measurement
model and the factor model in Chapter 7, then the equation becomes

$$\eta\eta' = (I - B)^{-1}\,\Gamma\xi\xi'\Gamma'\,[(I - B)^{-1}]' + (I - B)^{-1}\,\zeta\zeta'\,[(I - B)^{-1}]'. \tag{8.7}$$

Taking expected values, replacing $\xi\xi'$ with $\Phi$ and
$\zeta\zeta'$ with $\Psi$, the equation becomes

$$E(\eta\eta') = (I - B)^{-1}\,\Gamma\Phi\Gamma'\,[(I - B)^{-1}]' + (I - B)^{-1}\,\Psi\,[(I - B)^{-1}]'. \tag{8.8}$$

As noted earlier, the covariance matrix of the etas could not be directly
specified. It is a function of the explained variance (the first term on
the right side of the equation,
$(I - B)^{-1}\Gamma\Phi\Gamma'[(I - B)^{-1}]'$) and the unexplained
variance (the second term on the right side,
$(I - B)^{-1}\Psi[(I - B)^{-1}]'$) in the structural model.
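The covariance matrix of the etas implied by the structural model is simple to compute once the four matrices are given. The sketch below assumes NumPy; the function name `eta_cov` and all numeric values are hypothetical and chosen only to show the explained and unexplained parts being combined.

```python
import numpy as np


def eta_cov(B, Gamma, Phi, Psi):
    """Implied covariance of the etas: explained plus unexplained variance."""
    inv = np.linalg.inv(np.eye(B.shape[0]) - B)
    explained = inv @ Gamma @ Phi @ Gamma.T @ inv.T
    unexplained = inv @ Psi @ inv.T
    return explained + unexplained


# Hypothetical values for two exogenous and two endogenous latent variables.
B = np.array([[0.0, 0.0], [0.3, 0.0]])      # eta1 -> eta2
Gamma = np.array([[0.5, 0.3], [0.2, 0.4]])  # xi -> eta paths
Phi = np.array([[1.0, 0.3], [0.3, 1.0]])    # exogenous covariances
Psi = np.array([[0.6, 0.1], [0.1, 0.7]])    # residual covariances

cov = eta_cov(B, Gamma, Phi, Psi)
print(np.round(cov, 3))
```

Changing any entry of Gamma, Phi, Psi, or B changes the resulting variances, which is why those variances cannot simply be fixed to chosen values.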
One consequence of not being able to directly specify elements
of the eta covariance matrix is that it is somewhat tricky to provide
those variables with a scale or metric. Because they are unmeasured,
they have no inherent scale. Yet, if they are not assigned a metric, then
the model will be underidentified. To assign a metric, one of the
indicators of each endogenous latent variable needs to have its reliable
component tied in some fashion (usually set equal) to the variance of
the latent variable. Its reliable component can, for example, be set
equal to the variance of the latent variable by fixing the loading in
the lambda matrix to 1.0. The indicator whose loading is fixed is
called a reference indicator, for it provides a point of reference for
the latent variable. All latent endogenous variables need to have a
reference indicator selected and that measure's loading fixed for the
solution to be identified. By contrast, variances of exogenous latent
variables can be scaled by fixing diagonal elements in the phi matrix
as well as by specifying reference indicators.
In summary, latent variable SEM methods represent a logical
coupling of regression and factor analytic approaches. They provide
researchers with the capacity to overcome many of the problems and
shortcomings of path model approaches, such as measurement and
specification error, and provide a model general enough to deal with
both nonrecursive and recursive models. Once one gets past the
Greek terminology for matrices used by the LISREL program, the
basic model can be seen as a straightforward combination of regression
and factor analysis. If the techniques had been available earlier,
then latent variable SEM could have saved path analysis approaches
from much criticism about deficiencies in their methods. Unfortunately,
its development had to wait for availability of both computer
technology and programs that could use that technology. It was
Jöreskog (1969, 1973), Bock and his students (e.g., Keesling, 1972),
and Wiley (1973) who opened the door to latent variable SEM
methods.

An Illustration of Structural Equation Models

Model Specification

Imagine, for example, that we decide that we want to examine the
relationships of two exogenous variables (family social class and
student ability) with two endogenous variables (student peer status
and student achievement). Imagine further that we decide to collect
information on parents' educational attainment, parents' job status,
and family income as measures of social class; two ability or
intelligence tests, the Peabody Picture Vocabulary Test (PPVT) and the
Raven Progressive Matrices, as measures of ability; sociometric peer
ratings on school work, play, and friendships as measures of peer
status; and mathematical, verbal, and analytic reasoning dimensions
of a standardized achievement test.

14. The most recent version of LISREL, LISREL 8, will select a reference indicator for
researchers as part of its estimation process.

Our measurement model appears as Figure 8.1. In Figure 8.1,
there has been no distinction made between exogenous and endogenous
variables, for at this point there are no arrows connecting the
latent/conceptual variables. If the latent variables were to be
connected by curved, double-headed arrows, then we would have a CFA
model with four factors. As can be seen in Figure 8.2, however, the
hypothesized model is in fact a causal one, with paths from exogenous
to endogenous variables.
Once the hypothesized causal relationships are specified, the
separation of exogenous and endogenous variables becomes obvious.
One additional point of importance is that even though achievement
and peer status are likely to be interrelated, it is not immediately
obvious how to specify the nature of their interrelationship in the
model. First, there is no compelling justification for specifying either
of them as causally preponderant over the other. Second, because the
model specifies that they share common causes, they will be related
in the model without any path that goes directly between them. (The
magnitude of their relationship in the absence of a direct path
between them can be calculated by using tracing rules described
earlier in this book.) If those common causes are hypothesized to be
strong enough, then no other path may be needed between them even
if their relationship is substantial. On the other hand, if their
hypothesized relationship exceeds the covariation they would be
expected to share due to their common causes, then the additional
relationship needs to be acknowledged in the model. A way to model such
covariation without assigning causal preponderance is to connect
their residuals. In terms of overall model fit, including the residual
covariance is equivalent to putting a path either from peer status to
achievement or vice versa.
Identification

As has been true for all types of structural models, for a model to be
estimable, it needs to be identified. In latent variable structural
models, degrees of freedom can be determined readily beginning with
the formula for covariance matrices, v(v + 1) / 2 (where v is the
number of measures), to determine possible degrees of freedom. SEM
approaches assume that covariance matrices are being analyzed, so
the variances are included in the formula for degrees of freedom.
Thus, the appropriate formula is v(v + 1) / 2 rather than the
v(v - 1) / 2 that is used for determining the number of correlations in
a matrix.

[Figure 8.1. Hypothetical Four-Factor Latent Variable Model]
Once total possible degrees of freedom are determined, then, by
subtracting all coefficients to be estimated, one can determine the
degrees of freedom for any particular model. In the present
illustration, v is 11; thus, possible degrees of freedom are
11(12) / 2 = 66. From 66 we subtract 11 paths from latent variables to
observed measures, 11 residuals on observed measures, and 6 paths (4
unidirectional and 2 representing covariances) and 2 residuals in the
structural model, apparently leaving 36 degrees of freedom. Scaling
the latent variables requires fixing reference indicators for each of the
two endogenous variables, recapturing 2 degrees of freedom. Scaling
of latent variables for the two exogenous variables is done by fixing
two variances in the phi matrix; because these were not included as
free parameters in the preceding calculations, no adjustment of
degrees of freedom is needed. If they had been scaled through use of
reference indicators, then 2 degrees of freedom would be gained for
the two reference indicators, but the variances in the phi matrix
would have to be freely estimated, and the two additional free
parameters would take away the 2 degrees of freedom that were gained.
Either way results in a model having 36 + 2 = 38 degrees of freedom.

[Figure 8.2. Hypothetical Four-Factor Latent Variable Model]
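Written out as a worked calculation, with v = 11 measures, the degrees of freedom for the illustration are:

$$\frac{v(v+1)}{2} = \frac{11 \times 12}{2} = 66, \qquad 66 - (11 + 11 + 6 + 2) = 36, \qquad 36 + 2 = 38.$$

The final step adds back the 2 loadings that are fixed, rather than estimated, when a reference indicator is chosen for each of the two endogenous latent variables.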
For complicated models, identification of all parameters may not
be readily apparent. For example, not all parameters in models with
positive degrees of freedom are necessarily identified (remember the
example from the consistency tests called consistency of the epistemic
correlation). In such instances, determining identification of a model
is necessary (e.g., Bollen, 1989; Rigdon, 1995).
SEM programs supposedly provide information about identification.
If a model is not identified, then the programs should not be
able to determine a unique solution. As noted in Chapter 6, the acid
test should be whether or not the program is able to calculate standard
errors for the parameter estimates, for in order to produce standard
errors, the information matrix (a matrix based on the matrix of
estimates) needs to be inverted. If parameters are not identified, then
one or more of them are linearly dependent on other parameter
estimates, and the information matrix should be not invertible but,
rather, singular. In other words, the computer programs should help
determine model identification. Attaining a solution with standard
errors should be evidence of identification of the model. Two cautions,
however, are in order. First, if a model is not identified, then
the programs should alert you to the presence of problems but may
not properly point to the cause of the identification problems. Second,
and more important, there has been considerable discussion
among SEM researchers about whether or not the programs can be
trusted to test for model identification. There seems to be widespread
agreement that occasionally the programs produce solutions including
standard errors for models that are not identified. A conservative
approach, therefore, would be to determine model identification
before using an SEM program.
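The link between identification and linear dependence among parameters can be illustrated numerically. The sketch below is not the information-matrix inversion that SEM programs perform; it is a related local check, assumed here for illustration, that compares the number of free parameters with the rank of the matrix of derivatives of the model-implied variances and covariances. It assumes NumPy, a one-factor model with latent variance fixed to 1.0, and hypothetical function names.

```python
import numpy as np


def implied_moments(theta, n_ind):
    """Unique implied (co)variances; theta holds loadings, then residual variances."""
    lam, resid = theta[:n_ind], theta[n_ind:]
    sigma = np.outer(lam, lam) + np.diag(resid)
    return sigma[np.tril_indices(n_ind)]


def jacobian_rank(theta, n_ind, eps=1e-6):
    """Rank of the numerical derivative of the implied moments w.r.t. the parameters."""
    base = implied_moments(theta, n_ind)
    cols = []
    for i in range(len(theta)):
        step = theta.copy()
        step[i] += eps
        cols.append((implied_moments(step, n_ind) - base) / eps)
    return np.linalg.matrix_rank(np.column_stack(cols), tol=1e-4)


# Two indicators: 4 free parameters but only 3 variances/covariances.
two = np.array([0.6, 0.7, 0.64, 0.51])
# Three indicators (the Table 8.1 model): 6 parameters, 6 variances/covariances.
three = np.array([0.6, 0.7, 0.8, 0.64, 0.51, 0.36])

print(jacobian_rank(two, 2), len(two))
print(jacobian_rank(three, 3), len(three))
```

A rank below the number of free parameters signals that some parameters are linearly dependent on others at that point; a full rank is evidence of local identification only, so it does not replace working through the identification conditions by hand.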
One way in which to think about identification for researchers
who would like to ensure identification before analyzing their data is
to separate the measurement and structural models. If each one is
identified independently, then the model is identified. Although there
still seems to be some disagreement about how to establish necessary
and sufficient conditions for identification of latent variable SEMs, a
conservative view that I follow is that the measurement model never
buys identification of the structural model. From that perspective, the
structural model needs to be identified; assessing whether or not it is
identified can be done using the conditions for identification
introduced in Chapter 6. Provided there are available multiple
indicators of the latent variables, identification of the measurement
model should be no problem so long as the factors are scaled using the
options of specifying reference indicators and/or fixing variances (for
exogenous variables only).
In the present example, conditions of identification are readily
met. The structural model is a recursive model, measures are linked
only to single factors, and the residual covariance links two variables
with no causal path between them; thus, all of the parameters are
identified. The structural part of the model contains six paths: the
covariance between the exogenous variables, the four paths from
exogenous to endogenous variables, and the residual covariance
between the two endogenous variables. If attention is focused only
on the structural model, then the old rules for degrees of freedom
that were learned in path analysis still hold. To be consistent with the
covariance language of structural equation models, v(v + 1) / 2 can
be used as the formula for available degrees of freedom (with v now
the number of latent variables, 4), resulting in
4(5) / 2 = 10 possible degrees of freedom, less six paths and four
variances, leaving no degrees of freedom, a just-identified structural
model. Note that if v(v - 1) / 2 from path analysis had been used, then
the four variances would be ignored and the result, 0 (6 - 6 = 0)
degrees of freedom, would be the same.
Because the structural model has no degrees of freedom, in this
example all 38 degrees of freedom in the model test the fit of the
measurement model. Any failure to fit results from imprecise
specification of the measurement model, not from misspecifying the
relationships among the latent variables. All just-identified structural
models are equivalent and fully account for the relationships among
the latent variables. This equivalence provides a "best fit" for the
latent variables that is independent of the particular model that is
specified, in effect simultaneously examining the best fit of an array
of different models. Two points are worth noting. First, latent variable
SEM approaches do not get around the "equivalent model" problem.
The fit of the present model would be identical to the fit of a number
of other structural models that are just identified. Second, the idea of
knowing that all of the lack of fit is located in the measurement model
is an appealing one. It means that no matter how the conceptual
variables are interrelated, the only way in which to get a better fit
would be to change the measurement model. Some of the indexes for
examining model fit include just-identified structural models as one
of a series of nested tests that help examine model adequacy and fit.

Equations and Matrices

Setting up the equations and corresponding matrices for the example
follows exactly the methods used in the path analysis and CFA
chapters. It provides a clear way in which to look at the degrees of
freedom issue in detail. We begin with the measurement model,
endogenous first. Equation by equation, which is the way in which
models are set up in programs such as AMOS and EQS, the model is

$$
\begin{aligned}
\text{ScWkPop} &= \lambda_{11}\eta_1 + \epsilon_1 \\
\text{ScPlPop} &= \lambda_{21}\eta_1 + \epsilon_2 \\
\text{FrPop} &= \lambda_{31}\eta_1 + \epsilon_3 \\
\text{MathAch} &= \lambda_{42}\eta_2 + \epsilon_4 \\
\text{ReadAch} &= \lambda_{52}\eta_2 + \epsilon_5 \\
\text{AnalyAch} &= \lambda_{62}\eta_2 + \epsilon_6 .
\end{aligned}
$$

In matrix form as used by LISREL, it is $Y = \Lambda_y \eta + \epsilon$, or

$$
\begin{bmatrix} \text{ScWkPop} \\ \text{ScPlPop} \\ \text{FrPop} \\ \text{MathAch} \\ \text{ReadAch} \\ \text{AnalyAch} \end{bmatrix}
=
\begin{bmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ \lambda_{31} & 0 \\ 0 & \lambda_{42} \\ 0 & \lambda_{52} \\ 0 & \lambda_{62} \end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}
+
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \\ \epsilon_5 \\ \epsilon_6 \end{bmatrix}.
$$

For $\Theta_\epsilon$, there are no residual covariances, so the matrix
is diagonal and contains the variances of the epsilons (e.g., the
variance of $\epsilon_1$). To meet conditions for identification, one
indicator in each column of lambda has to be fixed to a nonzero value.
The one selected is arbitrary. As explained earlier in this chapter,
regardless of the one selected, the three indicators will maintain
proportionality with one another; just the reference point changes.
Thus, in this part of the model, there are 10
parameters (6 - 2 = 4 lambdas and 6 theta epsilons) to be estimated.
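For the exogenous variables, the equations are

$$
\begin{aligned}
\text{EdParents} &= \lambda_{11}\xi_1 + \delta_1 \\
\text{JobParents} &= \lambda_{21}\xi_1 + \delta_2 \\
\text{FamIncome} &= \lambda_{31}\xi_1 + \delta_3 \\
\text{PPVT} &= \lambda_{42}\xi_2 + \delta_4 \\
\text{Raven} &= \lambda_{52}\xi_2 + \delta_5 .
\end{aligned}
$$

In matrix form, $X = \Lambda_x \xi + \delta$, or

$$
\begin{bmatrix} \text{EdParents} \\ \text{JobParents} \\ \text{FamIncome} \\ \text{PPVT} \\ \text{Raven} \end{bmatrix}
=
\begin{bmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ \lambda_{31} & 0 \\ 0 & \lambda_{42} \\ 0 & \lambda_{52} \end{bmatrix}
\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}
+
\begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \\ \delta_5 \end{bmatrix}.
$$

Once again, $\Theta_\delta$ is just the residuals, in this case the
variances of the deltas. To scale the exogenous latent variables, either
a reference indicator needs to be designated or the variances need to be
fixed to a value in the phi (variance/covariance) matrix. The choice is
another arbitrary one. In this case, assume that we decide to fix the
variances in the phi matrix, so another 10 parameters (5 lambdas and 5
thetas) need to be estimated.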
For the structural model, the equations are

$$\eta_1 = \gamma_{11}\xi_1 + \gamma_{12}\xi_2 + \zeta_1 \quad \text{and} \quad \eta_2 = \gamma_{21}\xi_1 + \gamma_{22}\xi_2 + \zeta_2 .$$

In matrix form, $\eta = B\eta + \Gamma\xi + \zeta$, or

$$
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}
+
\begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix}
\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}
+
\begin{bmatrix} \zeta_1 \\ \zeta_2 \end{bmatrix}.
$$

Note that beta is null, for there are no hypothesized causal
relationships between the two endogenous latent variables. There are
four relationships to solve for in gamma. In addition, the covariance
matrices for the phis and psis need to be solved for:

$$
\Phi = \begin{bmatrix} 1.0 & \phi_{12} \\ \phi_{21} & 1.0 \end{bmatrix}.
$$

As is true for all covariance matrices, phi is symmetric and
$\phi_{12}$ is the same as $\phi_{21}$, so there is only one coefficient
to estimate. If reference indicators had been specified in lambda X,
then the variances in phi ($\phi_{11}$ and $\phi_{22}$) would have to be
freely estimated and the degrees of freedom would not change (+2 in
lambda X and -2 in phi). For psi, the residual covariance matrix of the
factors or latent variables, the matrix is, using Greek psi,

$$
\Psi = \begin{bmatrix} \psi_{11} & \\ \psi_{21} & \psi_{22} \end{bmatrix}.
$$

Once again, psi is a symmetric covariance matrix, so there are only
three coefficients to estimate.
Adding up the parameters to estimate, the total is 10 + 10 + 4
+ 1 + 3 = 28, and 66 - 28 = 38 degrees of freedom in the model, the
same total that was presented earlier.
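The matrix-by-matrix tally can be written as a short bookkeeping script. This is only an illustrative sketch; the dictionary keys are hypothetical labels for the LISREL matrices named in the text.

```python
# Number of observed measures in the illustration.
v = 11
moments = v * (v + 1) // 2  # unique variances and covariances

# Free parameters, matrix by matrix, as counted in the text.
free = {
    "lambda_y": 6 - 2,     # two loadings fixed as reference indicators
    "theta_epsilon": 6,    # residual variances of endogenous indicators
    "lambda_x": 5,         # all free because phi variances are fixed
    "theta_delta": 5,      # residual variances of exogenous indicators
    "gamma": 4,            # paths from exogenous to endogenous
    "phi": 1,              # covariance between exogenous variables
    "psi": 3,              # two residual variances and their covariance
}

n_free = sum(free.values())
df = moments - n_free
print(moments, n_free, df)
```

Keeping the counts in one place makes it easy to see how a change in scaling (for example, moving two fixed values from phi to lambda X) leaves the total unchanged.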

Basic Ideas Underlying Fit/Significance Testing

At this point, assuming that the data have been collected and the
matrices that link observed to latent variables and latent variables
with one another have been specified, all that is needed to conduct
the analyses is to set up the commands for the computer program
selected. Because readers likely will be using a variety of different
programs, no program is described in detail here. (In addition to the
references already cited, readers also might see Byrne [1989] for
LISREL and Byrne [1994] for EQS.) At the end of this chapter,
illustrations using the LISREL program are presented; hopefully,
users of other programs will be able to adapt the illustrations to set
up the programs they are using. (In the next chapter, an illustration
is set up for AMOS and EQS as well as LISREL.) The focus here is on
the solution process (i.e., how the programs fit the model to the data
and what the test statistics that are generated mean) rather than on
setting up the program. In wrapping up this chapter, general principles
of fit and significance testing are presented. A detailed discussion
of the range of different fit statistics and indexes will be left until
Chapter 10.
Individual Parameter Significance
Before addressing overall fit of the model, it is important to note that
in latent variable SEM techniques, each individual parameter that is
freely estimated will have a standard error attached to it. That
standard error allows for assessing significance of each parameter
estimated. Significance of parameters is most commonly assessed by
judging discrepancy from zero in a traditional test of critical ratios of
t's or Z's, which tests whether or not zero is contained within the
confidence interval. For the larger samples that are expected for SEM,
t's approach Z's. Therefore, as a general rule, if an estimate is greater
than twice its standard error (Z > 2.0), it is deemed significant. The
confidence interval also allows testing in different ways. For example,
one could test whether or not a correlation between two variables is
low enough that they could not be considered to be identical if tested
by a confidence interval that does not include 1.0. Standard errors
are available for all free parameters including residuals, variances,
covariances, and paths.
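The critical ratio and the confidence interval test can be sketched in a few lines. The estimates and standard errors below are hypothetical numbers made up for illustration, and the 1.96 multiplier is the usual large-sample value for a 95% interval.

```python
def critical_ratio(estimate, std_error):
    """Estimate divided by its standard error (a Z in large samples)."""
    return estimate / std_error


def confidence_interval(estimate, std_error, z=1.96):
    """Approximate large-sample confidence interval around an estimate."""
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical path: is it different from zero?
path, path_se = 0.45, 0.10
z_path = critical_ratio(path, path_se)
significant = abs(z_path) > 2.0

# Hypothetical correlation between two latent variables: could it be 1.0?
corr, corr_se = 0.80, 0.05
low, high = confidence_interval(corr, corr_se)
distinct = high < 1.0

print(round(z_path, 2), significant, (round(low, 3), round(high, 3)), distinct)
```

The same two functions apply to any free parameter for which the program reports a standard error.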
Testing significance of individual paths is very different from
testing overall fit of the model. Good fitting models can have
insignificant parameters in places where significance and meaning were
expected, whereas poorly fitting models still could find strong and
important relationships between variables. Researchers need to
balance their focus between significance of particular paths and that of
overall model fit. In some instances particular parameters may be
more important, whereas in others it may be overall model fit that is
the primary issue.
Model Fitting
As was illustrated in Exercise 7.3, all statistical tests of the model are tests of differences between the variance/covariance matrix predicted by the model and the sample variance/covariance matrix from the observed data. Those differences are referred to as "fit" or "goodness of fit," namely, how similar the hypothesized model is to the observed data. As the solution is estimated, regardless of whether the approach used is a variant of least squares or maximum likelihood, the goal of the solution process is to reduce, through an iterative process, discrepancies between observed and predicted matrices.
At this point, readers should realize that a part of the structural equation process has been left unexplained. That part is how to generate the matrix predicted by the model so that the relationship between the matrix of the observed data and the matrix for the predicted model can be compared. For path modeling, reconstructing the predicted matrix was straightforward but somewhat cumbersome. It required using any one of the methods for decomposition of effects to specify relationships between variables in terms of different paths and then substituting in the values of those paths to generate predicted matrices. The approach that most readily generalized across models (see Chapter 3) was the one that involved multiplying the matrix of path coefficients by itself and summing.
For latent variable SEM, the process is similar but more complicated due to having both measurement and structural models. The process also requires multiplying matrices but specifically requires using the set of matrices described in the measurement and structural models. As is true for even the simplest (overidentified) path model, the goal is to determine the relationships predicted among the observed measures based on the hypothesized model. Some of the parts of the predicted matrix already have been presented, although not as part of an approach for generating a predicted matrix.
Because being able to generate the predicted matrices is not critical to using SEM techniques, the algebra is not repeated here. It has been worked out and presented in many of the earlier articles on SEM (e.g., Wiley, 1973). Insofar as the components that make up the predicted matrix are simplest to understand when the structural and measurement models are separated, demonstration of how the process works first focuses on the structural model and then goes to the measurement model.
For the structural model, the goal is to generate a predicted covariance matrix for all the latent variables. Thus, we begin with a vector:

$$\begin{bmatrix} \eta \\ \xi \end{bmatrix}$$

Postmultiplying the vector by its transpose, $[\eta \mid \xi]'$, and presenting the matrix as partitioned into submatrices gives

$$\begin{bmatrix} \eta\eta' & \eta\xi' \\ \xi\eta' & \xi\xi' \end{bmatrix}$$

where

$$\eta\eta' = (I - B)^{-1}\,\Gamma\Phi\Gamma'\,(I - B)'^{-1} + (I - B)^{-1}\,\Psi\,(I - B)'^{-1}$$

$$\xi\eta' = \Phi\Gamma'\,(I - B)'^{-1}$$

$$\xi\xi' = \Phi.$$

In other words, paralleling path analysis, in SEM the predicted matrices are a function of the relationships among the exogenous variables, the relationships of exogenous with endogenous variables and of endogenous variables with themselves, and the residuals of the endogenous variables.
To reproduce the matrix of observed measures, we need the components from the partitioned matrix just presented plus the weight and residual matrices from the measurement model. Together, they yield a covariance matrix in terms of X and Y:

$$\Sigma = \begin{bmatrix} \Sigma_{yy} & \Sigma_{yx} \\ \Sigma_{xy} & \Sigma_{xx} \end{bmatrix}$$

We can substitute from the basic measurement model for the $\Sigma_{yy}$ and $\Sigma_{xx}$ matrices but have to introduce new terms for the $\Sigma_{yx}$ and $\Sigma_{xy}$ terms. The full equation, in terms of the matrices, is

$$\Sigma = \begin{bmatrix} \Lambda_y (I - B)^{-1}(\Gamma\Phi\Gamma' + \Psi)(I - B)'^{-1}\Lambda_y' + \Theta_\varepsilon & \Lambda_y (I - B)^{-1}\Gamma\Phi\Lambda_x' \\ \Lambda_x \Phi\Gamma' (I - B)'^{-1}\Lambda_y' & \Lambda_x \Phi \Lambda_x' + \Theta_\delta \end{bmatrix}$$

Note that the covariances among the latent variables are just pre- and postmultiplied by the weight matrices to determine the total common variance, and then the residuals (uniquenesses) are added on to get the total variance.
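As a rough illustration of how the predicted matrix is assembled from the LISREL matrices, the following pure-Python sketch builds it for a deliberately tiny model (one latent exogenous and one latent endogenous variable, two indicators each). All numeric values and helper names are assumptions made up for the example; they are not taken from any model in this book.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def implied_covariance(ly, lx, b, g, phi, psi, te, td):
    """Predicted covariance matrix of the observed Y's (first) and X's (last)."""
    n = len(b)
    i_minus_b = [[(1.0 if r == c else 0.0) - b[r][c] for c in range(n)]
                 for r in range(n)]
    a = inverse(i_minus_b)
    # Covariances among latent endogenous variables, and of eta with xi.
    cov_eta = matmul(matmul(a, add(matmul(matmul(g, phi), transpose(g)), psi)),
                     transpose(a))
    cov_eta_xi = matmul(matmul(a, g), phi)
    s_yy = add(matmul(matmul(ly, cov_eta), transpose(ly)), te)
    s_yx = matmul(matmul(ly, cov_eta_xi), transpose(lx))
    s_xx = add(matmul(matmul(lx, phi), transpose(lx)), td)
    top = [ry + ryx for ry, ryx in zip(s_yy, s_yx)]
    bottom = [rxy + rxx for rxy, rxx in zip(transpose(s_yx), s_xx)]
    return top + bottom

# Hypothetical parameter values (illustrative only).
sigma = implied_covariance(
    ly=[[1.0], [0.9]], lx=[[1.0], [0.8]],
    b=[[0.0]], g=[[0.5]], phi=[[1.0]], psi=[[0.75]],
    te=[[0.3, 0.0], [0.0, 0.4]], td=[[0.2, 0.0], [0.0, 0.36]],
)
```

Each block of the result follows the pattern described in the text: latent covariances are pre- and postmultiplied by the loading matrices, and the residual matrices are added only to the diagonal blocks.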
The formula used by the maximum likelihood estimation techniques, the most commonly used approach, was described in the final example of Chapter 7 where sigma and S matrices were generated. It is

$$F = \ln|\Sigma| - \ln|S| + \mathrm{tr}(S\Sigma^{-1}) - (p + q),$$

where F is the function to be minimized, $\Sigma$ is the predicted variance/covariance matrix of the X's and Y's calculated as described in the preceding (see Note 15), S is the observed variance/covariance matrix of the X's and Y's, and p and q are the number of observed exogenous (X) and endogenous (Y) variables, respectively. The operations in the equation are as follows: ln, taking the natural log; | |, taking the determinants (e.g., |S|) of the predicted and observed matrices; and tr, the trace or sum of the diagonal elements of a matrix. As explained earlier, as the predicted (sigma) and observed (S) matrices converge, the first two terms approximate each other and their difference approaches zero. Likewise, the difference between the latter two terms should approach zero. As sigma and S converge, sigma inverse will approximate S inverse and $S\Sigma^{-1}$ will approximate $SS^{-1}$, which is an identity matrix. Because an identity matrix has ones on the diagonal, the sum of the diagonal elements of an identity matrix is the size of the matrix. In this case, the matrix is of size p + q, so the difference between the latter two terms approaches zero as the predicted and observed matrices converge.

15. If any residuals between the Y's and X's are allowed to covary, as can be done in LISREL 8, AMOS, and EQS, then there also would need to be a matrix (and its transpose) added to the off-diagonal submatrices. In other words, the $\Sigma_{yx}$ term would be $\Lambda_y (I - B)^{-1}\Gamma\Phi\Lambda_x' + \Theta_{\delta\varepsilon}'$, and the $\Sigma_{xy}$ term would be $\Lambda_x \Phi\Gamma' (I - B)'^{-1}\Lambda_y' + \Theta_{\delta\varepsilon}$.
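The behavior of the maximum likelihood function can be checked numerically. The sketch below is a minimal pure-Python version written for illustration; the two small matrices are hypothetical, and the helper names are not part of any SEM program.

```python
import math

def inverse_and_det(a):
    """Gauss-Jordan elimination returning (inverse, determinant)."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det                      # a row swap flips the sign
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det

def ml_fit_function(sigma, s):
    """F = ln|Sigma| - ln|S| + tr(S Sigma^-1) - (p + q)."""
    size = len(s)                           # p + q observed variables
    sigma_inv, det_sigma = inverse_and_det(sigma)
    _, det_s = inverse_and_det(s)
    trace = sum(s[i][k] * sigma_inv[k][i]
                for i in range(size) for k in range(size))
    return math.log(det_sigma) - math.log(det_s) + trace - size

# Hypothetical observed matrix and two candidate predicted matrices.
s_observed = [[1.0, 0.5], [0.5, 1.0]]
f_perfect = ml_fit_function(s_observed, s_observed)              # Sigma equals S
f_misfit = ml_fit_function([[1.0, 0.3], [0.3, 1.0]], s_observed)
```

When the predicted matrix equals S, both pairs of terms cancel and F is zero (up to rounding); any discrepancy yields a positive F.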

As latent variable SEM techniques became available, the initial perspective about them was that, because their significance tests and overall fit statistic provided such valuable information about adequacy of the model, a solution could potentially stand on its own without replication. As researchers gained more experience with the techniques and their shortcomings over time, a different perspective emerged, namely, that the best way in which to establish validity of a model is through cross-validation by sample splitting and through replication. Thus, if data sets are large enough, then samples should be split, with one half used to examine plausibility of a model and perhaps even subtly refine it using modifications to the model that do not change the critical components and are conceptually defensible, and with the second half held to fit to the model from the first half (e.g., Cudeck & Browne, 1983). If the sample is not large enough to split, then replication is highly desirable. Even more recently, Browne and Cudeck (1993) proposed using an expected cross-validation index for small samples to estimate effects of cross-validation.
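The sample-splitting step itself is simple to sketch. The following is only an illustration of a random split into two halves; the function name, the seed, and the use of case numbers in place of real records are all assumptions for the example.

```python
import random

def split_sample(cases, seed=0):
    """Randomly split cases into a calibration half and a validation half."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(cases)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical data set of 400 case identifiers.
calibration, validation = split_sample(range(400))
```

The model would be developed (and perhaps modestly refined) on the calibration half, and the resulting specification would then be fit to the covariance matrix computed from the validation half.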
As suggested in the preceding discussion about sample splitting and cross-validation, one is unlikely ever to obtain a model that fits perfectly, regardless of its veracity. The primary challenge for researchers in evaluating plausibility of the model being examined is to determine whether or not its fit is good enough. The most direct way in which fit is evaluated is through significance testing of the discrepancies between observed and predicted relationships among measures. The test may seem backward to readers who are used to significance as being good, for the test is of significance of discrepancies that remain after the model is fit. Ideally, a researcher would minimize residuals, namely, leave nothing unexplained; if successful, then there would be no significant residual variance remaining once the model is fitted. Thus, a good fitting model would result in a nonsignificant goodness of fit statistic. In the preceding equation, F for a good fitting model would be very small, for F assesses the size of the residuals rather than the size of the model parameters.
Overall fit is assessed by a chi-square goodness of fit test of the residuals. That test statistic has degrees of freedom as explained earlier in this chapter: the total number of variances/covariances minus the free parameters to be estimated (e.g., the model from Figure 8.2 had 66 - 28 = 38 degrees of freedom). Chi-square is distributed with a mean equal to its degrees of freedom, so dividing chi-square by its degrees of freedom should provide an index of some value as well (e.g., Marsh, Balla, & McDonald, 1988).
Although a goodness of fit statistic that assesses the size of the residuals is useful in principle, in practice the chi-square statistic is of limited value. It is directly a function of sample size, for the function minimized is multiplied by the sample size to determine the chi-square statistic. The general formula is N times the function (see Note 16). For perfectly fitting models (F = 0), sample size clearly has no impact. For imperfectly fitting models, however, sample size can have unwanted effects (for a discussion, see, e.g., Bollen & Long, 1993; Joreskog, 1969). Thus, if the same model is tested in two samples and produces exactly the same function but the size of one sample is twice that of the other, then the larger sample will have a much poorer fit, for its chi-square will be slightly more than twice as great as that in the smaller sample. Because of this relation of fit to sample size, a number of alternative fit indexes have been developed that are less sensitive to sample size. They will be explained in Chapter 10. Other work not covered in this book is attempting to tease apart lack of fit due to sample size from other sources (e.g., Kaplan, 1990).
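The dependence of chi-square on sample size can be made concrete with a short sketch. It assumes the LISREL scaling of (N - 1) times the minimized function; the function value of 0.10 and the two sample sizes are hypothetical.

```python
def chi_square_from_fit(f_min, n_obs):
    """Chi-square statistic as (N - 1) times the minimized fit function."""
    return (n_obs - 1) * f_min

# Same hypothetical function value in two samples, one twice the size of the other.
f_min = 0.10
chi_small = chi_square_from_fit(f_min, 200)
chi_large = chi_square_from_fit(f_min, 400)
ratio = chi_large / chi_small       # slightly more than 2

# A perfectly fitting model is unaffected by sample size.
chi_perfect = chi_square_from_fit(0.0, 10000)
```

With identical degrees of freedom, the larger sample is therefore far more likely to lead to rejection of the same model.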
In summary, this chapter has laid out the basics of latent variable structural equation models. In addition, issues of model specification and identification were addressed through an illustration, and procedures for setting up either equations or matrices to solve for a model also were illustrated. The logic underlying use of reference indicators was presented and illustrated. Finally, basic issues related to the "how" of model testing were covered. The remaining chapters will provide additional illustrations of SEM models, apply latent variable SEM to a couple of different types of problems and discuss issues that could emerge if readers encounter specific types of situations, and look broadly at SEM approaches.

16. For LISREL, the exact formula to go from the function to the chi-square statistic is $\chi^2 = (N - 1)F$.

APPENDIX 8.1

A Guide to Basics of LISREL Terminology

The Measurement Model

$$Y = \Lambda_y \eta + \varepsilon,$$

where
$\Lambda_y$ is the factor pattern matrix relating observed endogenous variables (observed measures) to latent endogenous variables (a)
$\eta$ is a vector of latent endogenous variables
$\varepsilon$ is a vector of residuals for the observed variables

$$X = \Lambda_x \xi + \delta,$$

where
$\Lambda_x$ is the factor pattern matrix relating observed exogenous variables (observed measures) to latent exogenous variables (a)
$\xi$ is a vector of latent exogenous variables
$\delta$ is a vector of residuals for the observed variables

The Structural Model

$$\eta = B\eta + \Gamma\xi + \zeta,$$

where
$\Gamma$ is a weight matrix of partial regression coefficients relating exogenous to endogenous variables (a)
$B$ is a weight matrix of partial regression coefficients interrelating endogenous variables (a)
$\zeta$ is a vector of residuals for latent endogenous variables

The Variance/Covariance Matrices

$\Phi$ (elements $\phi$) of exogenous latent variables (a)
$\Psi$ (elements $\psi$) of residuals for latent endogenous variables (a)
$\Theta_\varepsilon$ (elements $\theta_\varepsilon$) of residuals for observed indicators of endogenous variables (a)
$\Theta_\delta$ (elements $\theta_\delta$) of residuals for observed indicators of exogenous variables (a)

a. One of the matrices that has to be specified in the LISREL program command language.

APPENDIX 8.2

LISREL Control Statements for Figure 8.2

Hypothetical four-factor latent variable model, from Chapter 8:
DA NI=11 NO=[number of observations here] MA=CM
LA
'ScWkPop' 'ScPlayPop' 'FrPop' 'MathAch' 'ReadAch' 'AnalyAch' 'EdParents' 'JobParents' 'FamilyIncome' 'PPVT' 'Raven'
CM FO FI=[location and name of covariance matrix here]
[FORTRAN format for matrix, e.g., 8F10.7]
MO NY=6 NX=5 NE=2 NK=2 LY=FU,FI LX=FU,FI BE=FU,FI GA=FU,FR PH=SY,FR PS=SY,FR TE=DI,FR TD=DI,FR
LK
'FAMILY SES' 'STUDENT ABILITY'
LE
'STU PEER STATUS' 'STU ACHIEVEMENT'
FR LY 2 1 LY 3 1 LY 5 2 LY 6 2 LX 2 1 LX 3 1 LX 5 2
ST 1.0 LY 1 1 LY 4 2 LX 1 1 LX 4 2
path diagram (if LISREL8)
OU-THE OUTCOME CARD

Examining Plausibility of Models

This chapter provides "real data" illustrations of structural equation methods. The three illustrations focus on a single substantive issue and include (a) a unidirectional flow or recursive model constructed from data collected basically at a single point in time, (b) a nonrecursive model based on data from a single time point, and (c) a three-wave, longitudinal, unidirectional flow model that looks at a hypothesized bidirectional relationship across time via a panel design.
Three data sets are used to illustrate different types of latent variable structural models. All three share a single conceptual theme, for they focus on achievement of students in desegregated schools. They also illustrate how a series of studies can address and refine substantive questions about relationships between variables. Because no nonexperimental data study can ever establish causality, replication is more important than it is in experimental work.

First is a relatively simple reanalysis of data, reported in Maruyama and Miller (1979), from a study originally presented as a path analysis (Lewis & St. John, 1974). Included in an appendix to this chapter are files including the control statements using the LISREL, AMOS, and EQS computer programs.

Then, two additional data sets are presented and discussed. These latter two data sets come from different groups within a single large-scale study of school desegregation. The first of these two data sets is used to examine plausibility of a nonrecursive cross-sectional model, whereas the second examines three-wave longitudinal data presented as a panel model. The first data set is reported in Maruyama and McGarvey (1980), whereas the second is presented in Maruyama, Miller, and Holtz (1986) and Maruyama (1993). The latter data set also is used to further illustrate the advantages of having multiple indicators. Finally, these examples were selected not because the models are wonderful or fit extremely well but rather because they illustrate important issues of structural equation models around a single conceptual theme. They also are representative of the kinds of data sets that are available.

Example 1: A Longitudinal Path Model

This simple model illustrates how latent variable structural equation modeling (SEM) approaches can produce findings that differ substantively from ordinary path models. The data and analyses come from reanalyses (Maruyama & Miller, 1979) of data initially reported by Lewis and St. John (1974). The correlation matrix (N = 154) appears in Table 9.1. For a sample of African American schoolchildren, the model looks at the relationships between acceptance by white peers and school achievement. We decided to reexamine their study because its conclusions were very different from those we were uncovering using latent variable SEM techniques for parallel models. In our longitudinal analyses, we had not been able to find paths from peer acceptance to achievement that Lewis and St. John reported in their path analyses. Further adding to our interest, findings from their data showed the peer acceptance to achievement path to be inconsistent across alternative measures of achievement that seemingly should have been comparable, if not parallel, to one another. Therefore, we decided to see what would happen if we reconceptualized their hypothesized model using multiple indicators, which would make it more closely resemble our other data sets. Our model appears in Figure 9.1.
TABLE 9.1 Correlation Matrix From Lewis and St. John (1974) (N = 154)

          GPA1-5  OTISIQ   WHPOP    GPA6    RACH     SES   SCHWH
GPA1-5     1.000
OTISIQ      .570   1.000
WHPOP       .300    .270   1.000
GPA6        .770    .580    .360   1.000
RACH        .520    .560    .160    .530   1.000
SES         .260    .170   -.020    .210    .220   1.000
SCHWH       .250    .230    .180    .320    .170    .060   1.000

In terms of an illustration, Figure 9.1 is of interest for a number of reasons. First, it is not purely a latent variable model, for there are only single indicators for three of five constructs assessed. Therefore, it provides an opportunity to examine both how models are set up with single indicators and how they are constrained when measurement error cannot be removed. Second, as a reminder of the importance of theory in driving models, the model can be conceptualized in different ways. For example, one could argue that we should have been concerned about the relationship between popularity and grades and that, by combining achievement test performance with grades, this relationship was lost in our analyses. Unfortunately, the data do not allow resolution of the different views, for if we accept that view then we are stymied by the absence of multiple indicators. Our analyses of grade data would replicate the path analyses exactly. Additional data are required to sort out the different views. Third, the model is longitudinal in that data were collected from different points in time, making it superior to a purely cross-sectional design. Yet, it is a fairly weak longitudinal model, for data really were collected at a single point in time but included archival data culled from records. A stronger design is a panel design in which data are collected at several points in time. Fourth, insofar as the model is longitudinal, we definitely should have worked with a covariance matrix, a shortcoming of both the original article and our reanalyses. Because our goal was to compare our findings with the previous ones, we chose simply to reanalyze their data. The shortcoming may be less important in this study than in some others, for the only repeated measure is grade point average. If the variability in grade point average changed markedly, however, then our inferences may be inaccurate because we forced the two grade measures to unit variance and did not allow for "growth."


With respect to model specification, the model is constrained by the absence of multiple measures of family social class, percentage of school's children that are white, and popularity with whites. Of the three, school percentage white seems most likely to be highly reliable, although it might imperfectly assess the underlying variable, which may be prior exposure to white peers (see Note 17). In contrast to school percentage white, popularity suffers from reliability problems, as does any measure of social class. Nonetheless, because only single indicators were available, the most defensible methodological decision is to fix the loading to 1.0 and the residual to 0.0, thus making the observed measures and underlying variables identical to one another. (The other alternative is to fix the residual to some nonzero value, which, for reasons described earlier in this book, can be problematic. At the very least, it is likely to be controversial.) For past and present achievement, the two constructs with multiple indicators, fixing a reference indicator allows for identification of both the remaining path and the residuals. Finally, the model includes nonrandom error between the two grade measures and the two standardized test measures. Those residual covariances could tap any subdomain-specific variance that exists separately from common variance on a general achievement domain.

The structural model included all paths that were specified by Lewis and St. John (1974). Their model was fully recursive, for it included all possible paths following a hierarchical order and initially produced a just-identified structural model. Thus, any problems in fitting can be attributed to the measurement model.
Consider again the issue of degrees of freedom. The total possible degrees of freedom for seven measures is 7(8) / 2 = 28. Degrees of freedom are lost for the four residual variances and two covariances (6), the two lambdas (2), the nine structural paths (9), the three residuals on the endogenous variables (3), and the three elements of the phi matrix (3) given that those directly correspond to the variances and covariances of the single indicators of the exogenous variables. In total, 28 - 6 - 2 - 9 - 3 - 3 = 5, which should be the degrees of freedom found for the model.
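The bookkeeping in this paragraph can be written out as a small sketch. The dictionary keys are labels invented for the example; the counts are the ones given in the text.

```python
def n_unique_moments(n_measures):
    """Number of distinct variances and covariances among observed measures."""
    return n_measures * (n_measures + 1) // 2

# Free parameters in the Example 1 model, as counted in the text.
free_parameters = {
    "residual variances and covariances": 6,
    "lambdas (factor loadings)": 2,
    "structural paths": 9,
    "residuals on endogenous variables": 3,
    "phi matrix elements": 3,
}

degrees_of_freedom = n_unique_moments(7) - sum(free_parameters.values())
```

The same function reproduces the count for the 11-measure model from Figure 8.2: n_unique_moments(11) gives 66, and 66 minus 28 free parameters leaves 38 degrees of freedom.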

17. School percentage white is illustrative of variables that evoke varying interest from researchers from different disciplines. Policy researchers may feel comfortable with such a variable, whereas researchers more interested in uncovering individual student processes are likely to want to recast that variable in psychological terms, as I have done.


The overall fit of the model was acceptable, $\chi^2(5) = 4.88$, N = 154. Significant paths were found from socioeconomic status (SES) and school percentage white to past achievement, from past achievement to popularity with whites, and from past achievement to present achievement. Achievement was almost perfectly stable ($\beta$ = .981). Standardized values for the significant paths appear with asterisks (*) in Figure 9.1. In contrast to Lewis and St. John (1974) and consistent with our other data, there was no significant path from popularity to achievement. It may be that the increase in stability of achievement in our results compared with theirs prevented other potential predictors from displaying any influence. To the extent that multiple indicators allow for more precise assessment of variables such as achievement, the findings from such models may differ greatly from what would be found if only single indicators were available. This point is illustrated more fully in a later section of this chapter.

The significant paths all are direct effects of variables on subsequent variables. The viable indirect effects in the model are those that include multiple substantial paths. In this model, they all involve multiple significant paths. As can be seen from Figure 9.1, there are no nonsignificant paths strong enough to result in substantial indirect paths. The notable indirect paths are from SES and school percentage white to present achievement (via past achievement) and to popularity with whites (also via past achievement). The magnitude of these effects, as explained in the chapter on path analysis (Chapter 7), is determined by multiplying together the paths connecting the pairs of variables. Thus, the indirect effect of SES on popularity is .27 x .38 = .10 and on present achievement is .27 x .98 = .26. Similarly, the indirect effect of school percentage white on popularity is .29 x .38 = .11 and on present achievement is .29 x .98 = .28. These indirect effects demonstrate that, according to the model, both SES and school percentage white are substantially related to 6th grade achievement despite not displaying a significant direct path. Finally, note that in the model SES and school percentage white can substantially correlate with present achievement without having direct effects on it.
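The multiplication of paths described here can be sketched as follows. The coefficients are the rounded standardized values quoted in the text; the variable labels and the function name are invented for the example.

```python
# Standardized direct paths quoted in the text (rounded to two decimals).
paths = {
    ("SES", "PAST_ACH"): 0.27,
    ("SCHOOL_PCT_WHITE", "PAST_ACH"): 0.29,
    ("PAST_ACH", "POPULARITY"): 0.38,
    ("PAST_ACH", "PRESENT_ACH"): 0.98,
}

def indirect_effect(chain):
    """Product of the direct paths along a chain of variables."""
    effect = 1.0
    for source, target in zip(chain, chain[1:]):
        effect *= paths[(source, target)]
    return effect

ses_to_popularity = indirect_effect(["SES", "PAST_ACH", "POPULARITY"])
ses_to_present = indirect_effect(["SES", "PAST_ACH", "PRESENT_ACH"])
school_to_popularity = indirect_effect(["SCHOOL_PCT_WHITE", "PAST_ACH", "POPULARITY"])
school_to_present = indirect_effect(["SCHOOL_PCT_WHITE", "PAST_ACH", "PRESENT_ACH"])
```

Rounded to two decimals, the four products match the values reported in the text (.10, .26, .11, and .28).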
In summary, if the model is specified correctly (the big if), then the following conclusions are warranted. Black students higher on achievement during their elementary years were more popular with their white peers in 6th grade. Because this path is significant, the data are consistent with the view that achievement causes peer popularity. On the other hand, there was no evidence that black students who were more popular with their white peers did better in school. Thus, the data are not consistent with the view that popularity causes achievement. Such a view cannot, however, be totally dismissed in the data set, for the path from school percentage white to past achievement could be argued as consistent with a view that peers influence achievement. The processes would, however, have to occur earlier than 6th grade.

Example 2: A Nonrecursive Multiple-Indicator Model

This data set has been reported in Maruyama and McGarvey (1980). The correlation matrix appears as Table 9.2. This illustration basically repeats what is contained in more detail in that article, so interested readers might want to look there as well. At the same time, the practice of SEM has changed a lot since 1980, so referring back to that article should be done more for the logic and general methods than for specific details.
Most important for current purposes, this example illustrates the advantages of latent variable SEM approaches for analyzing different types of models. Even though this model (see Figure 9.2) is nonrecursive, it can be handled using the same approach as was used in recursive models such as the preceding example. Readers should pay particular attention to this model and its details, for this illustration will be revisited in Chapter 10 to discuss the different types of statistical tests that are used for latent variable structural equation models and ways in which hierarchical models can be compared. The theoretical variables in the model are socioeconomic status of the family (SES), performance of the child on standardized ability tests (ABL), acceptance by significant adults such as father/mother/teacher (ASA), verbal achievement (ACH), and acceptance by peers (APR). Each theoretical variable is defined by two or more observed measures. The indicators are as follows:
SES: SEI, Duncan socioeconomic index of occupations;
     EDHH, educational attainment of head of house;
     R/P, ratio of rooms in house to people in house;
ABL: PEA, Peabody Picture Vocabulary Test;
     RAV, Raven Progressive Matrices;
ASA: FEV, father's evaluation;
     MEV, mother's evaluation;
     TEV, teacher's evaluation;
ACH: VACH, verbal achievement score;
     VGR, verbal grades;
APR: PPOP, playground popularity;
     SPOP, classroom seating popularity;
     WPOP, schoolwork popularity.

TABLE 9.2 Correlation Matrix (N = 249)

           SEATPOP  PLAYPOP SWORKPOP     VACH      VGR      SEI     EDHH     RR/P    RAVEN  PEABODY    FEVAL    MEVAL    TEVAL
SEATPOP      1.000
PLAYPOP       .593    1.000
SWORKPOP      .548     .489    1.000
VACH          .280     .233     .322    1.000
VGR           .236     .177     .399     .495    1.000
SEI           .052     .097     .102     .173     .159    1.000
EDHH          .045     .097     .166     .297     .213     .558    1.000
RR/P          .021    -.042    -.028     .188    -.040     .172     .098    1.000
RAVEN         .079     .086     .144     .288     .275     .060     .153    -.001    1.000
PEABODY       .132     .174     .167     .397     .188     .162     .210     .276     .320    1.000
FEVAL         .066     .024     .082     .006     .115     .013    -.045    -.041     .095    -.059    1.000
MEVAL         .152     .081     .174     .134     .271    -.066    -.052     .001     .165    -.067     .424    1.000
TEVAL         .251     .080     .327     .213     .266    -.018    -.006     .041     .142     .081     .181     .311    1.000

SOURCE: Maruyama and McGarvey (1980).

Conceptually, this model examines the same two views of the relationship between peer acceptance and achievement that were introduced in the first illustration. Those views are (a) that being accepted by one's peers enhances one's school achievement (the path from APR to ACH) and (b) that doing well in school achievement enhances one's acceptance by peers (the path from ACH to APR). Of course, both views or neither view may be correct. Remember that, in contrast to the findings originally reported by Lewis and St. John (1974), the prior example found that although achievement influenced peer acceptance, the opposite did not occur. Thus, this model attempts to bring further information to the question addressed by the first example, but with cross-sectional data. Note that one important strength of the model is that it can examine both causal possibilities in a single model, whereas one major weakness is that it cannot control for stability over time of the achievement and peer acceptance variables, which means that it could draw incorrect inferences if one or both of the variables were highly stable.
The sample is 249 white children who attended school in a district about to undergo school desegregation. They were a subsample of a larger group that was tracked as part of the study of desegregation, selected because they had complete data on the measures included in this illustration and because all were measured in pre-desegregation classes. In contrast to the preceding and following samples, both of which are minority children, this particular sample does not focus on acceptance by an out-group. Rather, it allows examination of processes in the mainstream culture of the schools, for the district was predominantly white during this study. In other words, if there is no relationship between peer acceptance and achievement for the main group of children, then there would be little (or at least less) reason to expect that such a relationship would be found across groups.
Aside from the presence of a much better measurement model due to the presence of multiple indicators, the "new" methodological issue illustrated by this example is how to handle nonrecursive models and their identification. Unlike recursive models, not all nonrecursive models will be identified. As a review, readers may want to return to the discussion on identification in Chapter 6. Because in this illustration each of the two variables that are reciprocally related has an instrument, the model is identified. As can be seen in Figure 9.2, ASA is an instrument for identifying the paths to the ACH variable, whereas both SES and ABL are instruments for identifying paths to APR. Remember that (a) instruments in the model need to directly cause one of the two variables that are reciprocally related but not the other and that (b) it makes little sense to have instruments that are highly intercorrelated with one another, for if they were, then it would be much more difficult to argue that they have independent effects. In this example, the model specifies that the exogenous variables that act as instruments for different endogenous variables are not intercorrelated.

Figure 9.2. Latent Variable Structural Equation Model
SOURCE: Maruyama and McGarvey (1980). Copyright 1980 by the American Psychological Association; reprinted by permission.
NOTE: SES = social class; ABL = academic ability; ASA = acceptance by adults; APR = acceptance by peers; ACH = verbal achievement. Coefficients 4 and 5 were reversed inadvertently.
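The role of the instruments can be sketched as a simple counting check. The code below is only an illustration of the order condition, which is a necessary but not sufficient condition for identification: for each reciprocally related variable, it counts the exogenous variables excluded from that variable's equation and compares the count with the number of endogenous predictors in the equation. The dictionary layout and the function name are assumptions made for this sketch; the direct causes follow the paths described for Figure 9.2 (SES and ABL to ACH, ASA to APR, plus the two reciprocal paths).

```python
# Exogenous (instrument) variables in the Figure 9.2 model.
exogenous = {"SES", "ABL", "ASA"}

# Direct causes of each reciprocally related endogenous variable.
equations = {
    "ACH": {"SES", "ABL", "APR"},  # achievement
    "APR": {"ASA", "ACH"},         # acceptance by peers
}


def order_condition(equations, exogenous):
    """Return, per equation, the excluded exogenous variables and whether
    there are at least as many of them as endogenous predictors."""
    result = {}
    for outcome, predictors in equations.items():
        endogenous_predictors = predictors - exogenous
        excluded = exogenous - predictors
        satisfied = len(excluded) >= len(endogenous_predictors)
        result[outcome] = (sorted(excluded), satisfied)
    return result


check = order_condition(equations, exogenous)
for outcome, (excluded, satisfied) in check.items():
    print(outcome, "excluded instruments:", excluded, "order condition met:", satisfied)
```

ASA is excluded from the ACH equation and SES and ABL are excluded from the APR equation, so each reciprocal path has at least one instrument available.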
In contrast to the preceding illustration, each of the conceptual variables has multiple measures available. The primary advantage of multiple indicators is that they allow the conceptual variables to be defined in terms of the commonalities among the measures, thereby in principle removing error and unique variance from the constructs. Note that each measure has a nonzero residual attached to it. In practice, the conceptual variables will be only as good as their indicators; if the set of available indicators is poor or the indicators share a single method, then the conceptual variable will be less than ideal. If indicators are poor, then the "correct" conceptual variable may not be assessed; if a single method is used, then the theoretical variable will not have method variance extracted.
The findings from the illustration are somewhat mixed. First, all of the indicators were significantly related to the constructs that they were supposed to represent, and all had significant residuals. In other words, all had significant components of common variance, but also of unique variance. Second, the overall fit of the model was less than ideal, χ²(59) = 138.55, but the larger sample size compared with that in the first illustration would, of course, produce a larger chi-square statistic even with the same minimum function value. For now, the issue of fit is deferred; it will be reintroduced in the next chapter, when issues of alternative models are introduced. Third, most of the structural paths were significant. SES of the family and ability (ABL) were related (.378), and the paths from ability (ABL) to achievement (ACH) (standardized, .618) and from acceptance by adults (ASA) to acceptance by peers (APR) (standardized, .218) were significant. Finally, the paths of greatest interest, the reciprocal paths between peer acceptance (APR) and achievement (ACH), were as follows: achievement to peer acceptance, significant (standardized, .306); peer acceptance to achievement, marginally (p < .10) significant (standardized, .204). In other words, the data are consistent with the view that achievement affects peer acceptance but are somewhat equivocal on whether or not peer acceptance can affect school achievement, leaving open the possibility that it could be found for cross-group contacts but not providing strong grounds for expecting to find such a relationship.
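The point about sample size and the chi-square statistic can be made concrete with a little arithmetic. The sketch below assumes the usual maximum likelihood relation, chi-square = (N - 1) x minimum fit function value; the function names are hypothetical. It recovers the minimum function value implied by the reported chi-square of 138.55 with N = 249 and shows what the same value would yield at N = 154, the sample size given in the Figure 9.1 setup.

```python
def min_fit_function(chi_square, n):
    """Minimum fit function value implied by a chi-square and sample size,
    assuming chi_square = (n - 1) * F_min."""
    return chi_square / (n - 1)


def chi_square_for(f_min, n):
    """Chi-square that the same minimum function value gives at sample size n."""
    return f_min * (n - 1)


f_min = min_fit_function(138.55, 249)       # value implied by the reported fit
chi_at_154 = chi_square_for(f_min, 154)     # same misfit, smaller sample

print(round(f_min, 4), round(chi_at_154, 2))
```

With the misfit held constant, the statistic shrinks in proportion to N - 1, which is why chi-square values from samples of different sizes cannot be compared directly.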

In summary, this example illustrates how nonrecursive models fall under and can be handled by the same general approach as other types of models. At the same time, however, nonrecursive models need to address issues of model identification. Furthermore, the example shows how multiple-indicator models can provide advantages over single-indicator ones. Finally, and hopefully, it reinforces the importance of conceptual models driving the methods; such models require both careful operationalization of constructs and specification of relationships between them. Substantively, the data are generally consistent with those of the first study, but the marginal path from acceptance by peers (APR) to achievement (ACH) leaves some ambiguity about the nature of the relationship between those variables.

Example 3: A Longitudinal Multiple-Indicator Panel Model

As was true for the previous illustration, the data that were used to examine plausibility of the model illustrated in Figure 9.3 were collected as part of a broader study of school desegregation. The sample is the group of Mexican American students in the schools. During the first time period, the students attended segregated schools; during the second and third time periods, their schools had been desegregated. The Mexican American sample was selected for this illustration because, consistent with the conceptual tack that has been taken, it provides a test of out-group acceptance. Although African American students in principle could have done so as well, in fact there were too few students with complete data to estimate a solution for them. Furthermore, there were not enough minority children in any classroom to generate appreciable numbers of minority (out-group) choices for white children, so the analyses are limited to Mexican American children. Finally, in the course of the preliminary analyses of the data set used for these analyses (see Maruyama, 1993), (a) the social class variable turned out to be inconsequential in the Mexican American sample and was dropped and (b) the teacher evaluation variable was found to be highly collinear with ability and was dropped. Therefore, the illustration is simplified compared to white student data presented by Maruyama et al. (1986) and Figure 9.3.
The first five variables in Figure 9.3, which is taken from Maruyama et al. (1986), are the same variables as those in Figure 9.2 except for the teacher dimension. Identical variables are family social class (SES), students' academic ability (AB), acceptance by peers (PAC66), and school achievement (ACH66). Significant adult ratings from Figure 9.2 were replaced by evaluations the student received from his or her teacher (TEV). Measurement of the five variables preceded school desegregation during the 1966 school year. Therefore, the peer nominations had to come from other Mexican American students. The longitudinal or panel aspects of the model were available because the achievement and peer acceptance variables were measured again after desegregation, both immediately (in 1967) and 2 years later (in 1969). At those two time points, peer acceptance was acceptance by white peers, and achievement data were from desegregated classes and schools.

Figure 9.3. Latent Variable Panel Structural Equation Model
SOURCE: Maruyama, Miller, and Holtz (1986). Copyright 1986 by the American Psychological Association; reprinted by permission.
NOTE: The panel model is for examining the relation between peer popularity and achievement. SES = socioeconomic status, measured by SEI (Duncan Socioeconomic Index of Occupations); EDHH = educational attainment of head of household; RR/P = ratio of rooms in home to people living in home; AB = academic ability, measured by RAV (Raven's Progressive Matrices) and PEA (Peabody Picture Vocabulary Test); TEV = teachers' evaluations of students, measured by teachers' ratings of students' motivation and TEXP (teachers' expectations of students' eventual educational attainment); PAC = acceptance by peers, measured by SPOP (seating popularity), PPOP (playground popularity), and WPOP (schoolwork popularity); ACH = school achievement, measured by VACH (verbal standardized test performance) and VGR (verbal grades).
As can be seen in Figure 9.3, the model analyzed by Maruyama et al. (1986) examined plausibility of a perspective that views both peer acceptance and student achievement as being influenced by family social class, by students' ability, and by teachers. Teachers are viewed as being able to mediate effects of ability and social class. Because the logic used was that causal effects were lagged, there is no path between acceptance by peers (PAC66) and school achievement (ACH66). Even if we had wanted to construct a reciprocal causation model, instruments were not available to identify a nonrecursive model. That left the three choices of (a) specifying the path one way or the other, (b) omitting paths and assuming that the full relationship between the two variables is due to common causes, or (c) allowing their residuals to covary. This last alternative was chosen as preferable to either trying to argue for predominance of either peer acceptance or achievement or not allowing them to be related over and above their common causes. As discussed earlier, the residual covariance is similar to a covariance between two exogenous variables except that it is between only the unexplained portion of the variance. Also of importance is that in this model the residual covariance explains exactly the same covariance as would either causal path from one to the other.

Once again, the "critical" relationships in the model were the ones between achievement and peer acceptance. They are numbered 11, 12, 15, and 16. As explained in the preceding discussion, there are no causal paths within time periods. Furthermore, at the two later time periods, we hypothesized that the residual covariance between peer acceptance and achievement that was included at Time 1 was not needed because prior measures were available, allowing for a panel model in which stability and cross-lagged effects could be assessed.

Note that as was true for the preceding illustration, each variable has multiple measures; this allows for extraction of measurement error for all measures. In addition, because the same measures were collected repeatedly across time, their residuals were allowed to covary (drawing from the earlier discussion of panel analysis) to pick up measure-specific variance. In contrast to panel analysis, common method variance can be teased apart from trait stability.
Unfortunately, the data from this example also illustrate the major shortcoming of longitudinal sampling, for, despite having a reasonable sample size at the beginning of the study, sample attrition was very great across the 4 years of the study due to both student mobility and missing data scattered throughout the measures. Because we could not assume that attrition was random, we initially attempted to extract a large subsample with complete data, a task that was not possible. As a result, we settled for a data matrix based on the maximum number of observations between each pair of variables, often called a pairwise matrix. This matrix is really appropriate only when data loss is random. Analyses were conducted to compare individuals lost through attrition to those kept in the sample on a number of background variables. These analyses, which compared means and variances, did not yield a clear interpretation. The good news with respect to missing data is that methods for dealing with them have improved (e.g., Graham & Donaldson, 1993; Graham, Hofer, & Piccinin, 1994; Little & Rubin, 1987, 1990; Muthén, Kaplan, & Hollis, 1987). If the analyses of the model of Figure 9.3 were to be done today, then the data source would likely differ from the matrix appearing in Table 9.3, and the sample size would be larger than the nominal 100 that is used for the analyses.
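A pairwise matrix can be illustrated with a small sketch. The function below computes a single correlation from whatever cases have both scores present, which is the idea behind pairwise deletion; it is not the procedure used to build Table 9.3, and the variable names and scores are hypothetical values invented for the example.

```python
from math import sqrt


def pairwise_corr(x, y):
    """Pearson correlation using only the cases observed on both variables.
    Missing scores are coded as None. Returns (r, number of cases used)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy), n


# Hypothetical scores for six students at two waves; None marks missing data.
ach_wave1 = [1.0, 2.0, 3.0, 4.0, None, 6.0]
ach_wave2 = [2.0, 4.0, None, 8.0, 10.0, 12.0]

r, n_used = pairwise_corr(ach_wave1, ach_wave2)
print(r, n_used)
```

Because each pair of variables can draw on a different subset of cases, the entries of a pairwise matrix rest on different sample sizes, which is one reason a single nominal N has to be chosen for the analysis.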
The data for this illustration appear in Table 9.3. The matrix is a scaled covariance matrix in which each measure is standardized the first time it appears, and then changes in variability are calculated (as a ratio) when a measure appears again using the first time that measure appeared as a standard (see, e.g., Meredith, 1964). In other words, the relative size of the variances is preserved for each measure, but the actual variances are changed to a metric in which all variables have similar scales. Furthermore, the metric is similar to a correlation metric, which intuitively is simpler for people. This approach is fine so long as the specific variances are not critical for cross-group comparisons.
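The scaling just described can be sketched in a few lines. This is a minimal illustration under one stated assumption: the ratio is taken on standard deviations, so a measure has a scaled standard deviation of 1.0 at its first appearance and its standard deviation relative to that first appearance at later waves. The function names and the numbers are hypothetical.

```python
def scaled_sds(raw_sds, first_occurrence):
    """Divide each raw standard deviation by the raw standard deviation of
    the same measure at its first appearance (given by index)."""
    return [sd / raw_sds[first_occurrence[i]] for i, sd in enumerate(raw_sds)]


def scaled_covariance(corr, sds):
    """Turn a correlation matrix into a covariance matrix with the given
    (scaled) standard deviations."""
    k = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(k)] for i in range(k)]


# Hypothetical example: one test given at two waves.
raw_sds = [10.0, 12.0]           # raw standard deviations at wave 1 and wave 2
first_occurrence = [0, 0]        # both columns are the same measure, first seen at index 0
corr = [[1.0, 0.8],
        [0.8, 1.0]]

sds = scaled_sds(raw_sds, first_occurrence)
cov = scaled_covariance(corr, sds)
print(sds)
print(cov)
```

The first wave keeps unit variance, while the second wave carries the growth in variability, so changes over time are preserved even though the raw metric is discarded.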
Substantively, achievement at Time 1 (Ach66) was related to ability (AB); achievement (ACH) displayed virtually perfect stability over time (standardized coefficients of .98 and 1.00), and peer acceptance displayed no consistency from the segregated to the desegregated classroom (Pac66 -> Pac67) and only modest stability within desegregated classrooms over time (Pac67 -> Pac69). There were no significant paths between peer acceptance and achievement.

18. It turns out that for some models in which variances differ dramatically, rescaling to reduce differences in variances may be needed to obtain meaningful SEM solutions.

[Table 9.3: scaled covariance matrix for the Mexican American sample. The matrix entries did not survive extraction and are not reproduced here.]

Results of the analyses, presented as the final part of the illustration begun as path analysis and continued as panel analysis and confirmatory factor analysis, appear in Illustration 4.

Overall, then, the longitudinal model seems to suggest that cross-sectional models may, by not taking into account that achievement is highly stable, wrongly infer that other variables, such as acceptance by peers, are affecting it. It also provides less clear information about whether or not achievement shapes peer acceptance. Taken together, the three studies leave some ambiguity but certainly call into question the assertion of Lewis and St. John (1974) that peer acceptance is an important determinant of later achievement.

Before leaving this illustration, it is used to compare models having single indicators to those having multiple indicators. Although these analyses began attempting to fit the full model that appears in Figure 9.3 (see Maruyama et al., 1986), as described earlier, the social class variable and the teacher evaluation variable were dropped (see Maruyama, 1993). The focus on what happens to variables with single versus multiple indicators is accomplished by varying the indicators on the achievement variables.
As argued in Maruyama (1993), there are major advantages that accrue from having multiple measures. Ways of trying to simulate having multiple measures seem to provide no effective substitute. Maruyama (1993) provided a comparison that is reported here. For a single data set, the following alternative ways of modeling achievement were examined: (a) a single-indicator model assuming perfect reliability, (b) that same single indicator with its reliability fixed to less than 1.0 (.9) and having a nonzero residual, (c) a single indicator with its reliability fixed to be the same value as was found by the solution for multiple indicators, and (d) the multiple (two-) indicator solution corresponding to Figure 9.3. These changes primarily affected the relationships of achievement with peer acceptance; those relationships varied substantially across the different options.
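Fixing the reliability of a single indicator amounts to fixing its residual variance. The sketch below shows the standard arithmetic, assuming the indicator's error variance is set to (1 - reliability) times its observed variance; it also shows the classical correction for attenuation to suggest why the assumed reliability changes the relationships among constructs. The function names and the observed correlation of .50 are hypothetical, and this is not the code used for the analyses reported in Table 9.4.

```python
def fixed_error_variance(observed_variance, reliability):
    """Residual variance to fix for a single indicator with assumed reliability."""
    return (1.0 - reliability) * observed_variance


def disattenuated_correlation(r_xy, rel_x, rel_y=1.0):
    """Classical correction for attenuation: the correlation between true
    scores implied by an observed correlation and the two reliabilities."""
    return r_xy / (rel_x * rel_y) ** 0.5


# Option (a): perfect reliability, so no residual variance is allowed.
error_perfect = fixed_error_variance(1.0, 1.0)

# Option (b): reliability fixed to .90 for a standardized indicator.
error_90 = fixed_error_variance(1.0, 0.90)

# Hypothetical observed correlation between the same indicator at two waves.
r_true = disattenuated_correlation(0.50, 0.90, 0.90)

print(error_perfect, error_90, r_true)
```

Lower assumed reliability leaves less of the indicator's variance to the construct and raises the implied relationships among constructs, so the choice of a fixed value is consequential rather than cosmetic.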
The paths from the various solutions appear as Table 9.4, which is taken from Maruyama (1993). As can be seen, the different ways of modeling achievement have very different consequences for the inferences drawn from the model. The first column is the data from the multiple-indicator approach, which should be the "best," for it makes use of multiple measures to separate reliable variance from error and unique variance, and it also allows residual covariances across time to capture measure-specific variance separately. In that model, achievement is almost perfectly stable across both time periods (the first two rows of Table 9.4) and is not shaped by acceptance by peers (the third and fourth rows of Table 9.4). In each of the other three models, however, the stability of achievement is much lower, and there are at least strong hints that acceptance by peers can influence achievement. Note that even if the reliability of the achievement measure is fixed to the value estimated from the multiple-indicator solution, the data still are quite different and the inferences would likely differ as well.

TABLE 9.4 Coefficients Interrelating Peer Acceptance and Achievement, Examining Various Assumptions About the Reliability of Indicators of Achievement

Model 1: Multiple Indicators
Model 2: Single Indicator With Perfect Reliability
Model 3: Single Indicator With Reliability of .90
Model 4: Single Indicator With Reliability Set From Model 1

Coefficient                      Model 1   Model 2   Model 3   Model 4

Achievement stability paths
  13 (Ach66-Ach67)               1.00**     .35**     .48**     .75*
  17 (Ach67-Ach69)                .98**     .51**     .65**     .95*
Peer acceptance to achievement
  11 (Pac66-Ach67)               -.04       .12       .14       .10
  15 (Pac67-Ach69)                .15       .21**     .22**     .24*
Other paths
   6 (AB-Pac66)                   .27       .27       .27       .27
   7 (AB-Ach66)                   .59**     .52**     .59**     .75**
  18 (AB-Pac67)                  -.08      -.06      -.09      -.13
  10 (Pac66-Pac67)                .17       .20       .19       .19
  12 (Ach66-Pac67)                .16       .15       .17       .19
  14 (Pac67-Pac69)                .22*      .22*      .22*      .22*
  16 (Ach67-Pac69)               -.05      -.06      -.07      -.07
Chi-square goodness of fit       87.6      50.1      46.1      42.8
Degrees of freedom               102        67        67        67

SOURCE: Maruyama (1993).
NOTE: Numbers preceding coefficients refer to Figure 9.3. Note that the family social class (SES) and teacher evaluation (TEV) variables are omitted from the model.
*p < .10; **p < .05.
In summary, this last model illustrates not only what latent variable SEM panel models look like but also the differences that can be found when single-indicator models are compared to multiple-indicator models. It should reinforce the point that latent variable SEM approaches are not particularly powerful when only single indicators are available, for they reduce to path analysis and its variants. Once multiple indicators are added, the capacity to extract nonrandom error, autocorrelation, and, of course, reliability estimates markedly increases the strength and flexibility of SEM approaches.

It is hoped that the examples have helped readers to develop a better understanding of SEM models and how they work. That understanding will be extended in the next chapter, which will look in detail at ways of developing alternative models to compare and a range of techniques for relative model testing.

EXERCISE 9.1

Setting Up Matrices for a Latent Variable Structural Equation Model

Set up the equations and matrices for the measurement and structural models of Figure 9.1. Correct matrices and equations appear in Appendix 9.1. Readers are encouraged to compare the two approaches so that they are able to go back and forth between equations and matrices, for the other examples will be provided only in matrix form. Those matrices and equations can be used to set up the LISREL, AMOS, and EQS programs that appear in Appendix 9.2.

EXERCISE 9.2

Setting Up Figure 9.2 Matrices

Set up the matrices for Figure 9.2. Then calculate the number of degrees of freedom in the model. A summary of degrees of freedom and a LISREL command file appear in Appendix 9.3. In contrast to the first illustration with its small number of measures, the multiple indicators produce a large number of degrees of freedom.

APPENDIX 9.1

Matrices and Equations for Reanalysis of Data From Lewis and St. John (1974)

Matrices

Lambda X is an identity matrix. Phi is a correlation matrix with the correlation between socioeconomic status (SES) and school percentage white (.06) as its off-diagonal element. Theta X is null. For the lambda Y matrix, asterisks (*) indicate designated reference indicators that are fixed to 1.0.
Y= A +
y

I OtisI Q
|GPAl- 5
| PopwWhi
|GPA 6
I RdAc h

I, 0 0|

I
I
t|
I
I

|
0 0|
| 0 1.0 0|
10 0 *
|00 |

|, |
| |

iPastAAc h |
| PopwWhi t |
PresAAc h I

=
|PastAAc h |
| PopwWhi t |
|PresAAc h |

|0

I ,
|


00l

+ +

IPastAAc h |
| | SES|
| PopwWhi t j +
S%WH |
I3 41
I PresAAc h j

0|
3

| |
2

(symmetric )
|0 .
JO 0 0
10 0 ,
I 4,1 0 0 0 .

= |
2

5 2

&

=
6

| . .|
2

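Equations

\begin{align*}
\text{Otis IQ} &= \lambda^{*}_{1,1}\,\text{PastAAch} + \varepsilon_{1} \\
\text{GPA1-5} &= \lambda_{2,1}\,\text{PastAAch} + \varepsilon_{2} \\
\text{PopwWh} &= 1.0\,\text{PopwWhit} + 0 \\
\text{GPA6} &= \lambda^{*}_{4,3}\,\text{PresAAch} + \varepsilon_{4} \\
\text{RdAch} &= \lambda_{5,3}\,\text{PresAAch} + \varepsilon_{5} \\
\text{PastAAch} &= \gamma_{1,1}\,\text{SES} + \gamma_{1,2}\,\text{S\%WH} + \zeta_{1} \\
\text{PopwWhit} &= \beta_{2,1}\,\text{PastAAch} + \gamma_{2,1}\,\text{SES} + \gamma_{2,2}\,\text{S\%WH} + \zeta_{2} \\
\text{PresAAch} &= \beta_{3,1}\,\text{PastAAch} + \beta_{3,2}\,\text{PopwWhit} + \gamma_{3,1}\,\text{SES} + \gamma_{3,2}\,\text{S\%WH} + \zeta_{3}
\end{align*}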

APPENDIX 9.2

Setups for Figure 9.1


(a) LISREL Setup
REANALYSES OF LEWIS & ST. JOHN DATA, SEM BOOK, LISREL SETUP
DA NI=7 NG=1 NO=154 MA=KM
LA
'GPA1-5' 'OTISIQ' 'WHPOP' 'GPA6' 'RACH' 'SES' 'SCHWH'
KM FO FI=a:mnmdat
(16F5.3)
MO NY=5 NX=2 NE=3 NK=2 LY=FU,FI FI BE=FU,FI GA=FU,FR PS=DI,FR TE=SY,FI
LK
'SES' 'SCHWH'
LE
'PASTACH' 'WHPOP' 'CURACH'
FR LY 2 1 LY 5 3 BE 2 1 BE 3 1 BE 3 2 TE 1 1 TE 2 2 TE 4 4 TE 5 5 C
TE 4 1 TE 5 2
ST 1.0 LY 1 1 LY 3 2 LY 4 3
OU ad=off PT SE TV rs MR MI FD SS TM=45
[File mnmdat, located on A drive, is a long row vector:]
1. .57 1. .3 .27 1. .77 .58 .36 1. .52 .56 .16 .53 1. .26
.17 -.02 .21 .22 1. .25 .23 .18 .32 .17 .06 1.

(b) AMOS Setup
1. Control File

Example from Maruyama and Miller (1979).
First model of relation between popularity and achievement.
Correlations, bogus standard deviations, from Lewis and St. John (1974).
$Mods=4
$Structure
GPAGR1-5 <- PastAch (1)
GPAGR1-5 <- eps1 (1)
OtisIQ <- PastAch
OtisIQ <- eps2 (1)
WhPop <- PopWH (1)
GPAGr6 <- PresAch (1)
GPAGr6 <- eps4 (1)
RAch <- PresAch
RAch <- eps5 (1)
FamSES <- SES (1)
Sch%Wh <- SchPCWh (1)
SES <-> SchPCWh
PastAch <- SES
PastAch <- SchPCWh
PastAch <- zeta1 (1)
PopWH <- SES
PopWH <- SchPCWh
PopWH <- PastAch
PopWH <- zeta2 (1)
PresAch <- SES
PresAch <- SchPCWh
PresAch <- PastAch
PresAch <- PopWH
PresAch <- zeta3 (1)
eps1 <> eps4
eps2 <> eps5
$Include = a:\mnmmatrx.amd
$technical

2. Data

! REANALYSES OF LEWIS & ST. JOHN DATA.
! Reanalysis of data from Lewis & St. John.
! Correlations.
$Inputvariables
GPAGR1-5 ! Grade point average grades 1-5
OtisIQ ! Otis group administered IQ
WhPop ! Popularity with white peers
GPAGr6 ! Grade 6 grade point average
RAch ! Reading achievement test score
FamSES ! SES of family of student
Sch%Wh ! Percent of school that is white
$Samplesize=154
$Correlations
1.000
.570 1.000
.300 .270 1.000
.770 .580 .360 1.000
.520 .560 .160 .530 1.000
.260 .170 -.020 .210 .220 1.000
.250 .230 .180 .320 .170 .060 1.000
$Standard deviations
1.0 1.0 1.0 1.0 1.0 1.0 1.0

[These last two lines are not needed; they point out that a correlation matrix is being analyzed.]

(c) EQS Setup

/TITLE
[Example from Maruyama and Miller (1979); reanalysis of Lewis and St. John (1974).]
/SPECIFICATIONS
DATA='A:\MNMDAT.EQS'; VARIABLES=7; CASES=158;
METHODS=ML;
MATRIX=CORRELATION;
/LABELS
V1=GPA1-5; V2=OTISIQ; V3=WHPOP; V4=GPA6; V5=RACH;
V6=SES; V7=SCHWH; F1=PASTACH; F2=POPWWH;
F3=PRESACH; F4=FAMSES; F5=SCHPCWH;
/EQUATIONS
V1 = 1.0F1 + E1;
V2 = *F1 + E2;
V3 = 1.0F2 + E3;
V4 = 1.0F3 + E4;
V5 = *F3 + E5;
V6 = 1.0F4 + E6;
V7 = 1.0F5 + E7;
F1 = *F4 + *F5 + D3;
F2 = *F4 + *F5 + *F1 + D4;
F3 = *F4 + *F5 + *F1 + *F2 + D5;
/VARIANCES
F1 = *;
F2 = 1.0;
F3 = *;
F4 = 1.0;
F5 = 1.0;
E1 = *;
E2 = *;
E3 = 0;
E4 = *;
E5 = *;
E6 = 0;
E7 = 0;
/COVARIANCES
F4,F5 = *;
E1,E4 = *;
E2,E5 = *;

NOTE: Because I did not have a copy of the EQS program, I could not run this program to ensure that it would work. It conforms to earlier work I did when I had access to EQS.

APPENDIX 9.3

Analysis of Degrees of Freedom and LISREL Setup for Figure 9.2

1. Analysis of degrees of freedom

Possible degrees of freedom: N(N + 1) / 2 = 13*14 / 2 = 91
Parameters to estimate: Total = 32
Relations between constructs and indicators: 13 - 2 = 11
Residuals on indicators: 13
Structural paths: 5 causal + 1 covariance = 6
Residuals on latent variables: 2
Model degrees of freedom: 91 - 32 = 59
[Note: There are two reference indicators. The variances for the exogenous variables are fixed in the phi matrix to 1.0 each, giving them unit variance.]
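The tally above can be checked with a few lines of arithmetic. The sketch below simply restates the counts listed in this appendix; the dictionary keys and the function name are labels chosen for the example.

```python
def possible_moments(n_measures):
    """Number of distinct variances and covariances among n measures."""
    return n_measures * (n_measures + 1) // 2


# Free parameters for the Figure 9.2 model, as counted in this appendix.
parameters = {
    "relations between constructs and indicators": 13 - 2,
    "residuals on indicators": 13,
    "structural paths (5 causal + 1 covariance)": 5 + 1,
    "residuals on latent variables": 2,
}

total_parameters = sum(parameters.values())
model_df = possible_moments(13) - total_parameters

print(possible_moments(13), total_parameters, model_df)
```

The same bookkeeping applies to any model: count the distinct elements of the input matrix, subtract the free parameters, and the remainder is the model degrees of freedom.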
2. LISREL commands
[Note: Because of problems estimating a solution in LISREL 8 with a "normal" X and Y setup, the model was set up as if all variables were endogenous. The "all Y" approach to LISREL will be described in Chapter 11. It produces the same solution as an X and Y setup.]
CONTROL CARDS, MARUYAMA & MCGARVEY LISREL EXAMPLE, ALL Y
DA NG=1 NI=13 NO=249 MA=KM
KM FO SY FI=a:MNMCDAT
(13F6.4)
LA
'SEATPOP' 'PLAYPOP' 'SWORKPOP' 'VACH' 'VGR' 'SEI' 'EDHH'
'RR/P' 'RAVEN' 'PEABODY' 'FEVAL' 'MEVAL' 'TEVAL'
SE
6 7 8 9 10 11 12 13 1 2 3 4 5 /
MO NY=13 NX=0 NE=5 NK=0 LY=FU,FI BE=FU,FI PS=SY,FI TE=SY,FI
LE
'SES' 'ABILITY' 'ACCSIGO' 'ACCPEER' 'SCHACH'
FR LY 9 4 LY 10 4 LY 13 5 LY 2 1 LY 3 1 LY 4 2 LY 6 3 LY 7 3 C
BE 5 4 BE 4 5 BE 5 1 BE 5 2 BE 4 3 PS 1 1 PS 2 1 PS 2 2 C
PS 3 3 PS 4 4 PS 5 5 TE 1 1 TE 2 2 TE 3 3 TE 4 4 TE 5 5 TE 6 6 C
TE 7 7 TE 8 8 TE 9 9 TE 10 10 TE 11 11 TE 12 12 TE 13 13
ST 1.0 LY 1 1 LY 5 2 LY 8 3 LY 11 4 LY 12 5
OU PT LY=SV2 BE=SV2 PS=SV2 TE=SV2 SE TV MI SS

Illustration 4: Latent Variable Structural Equation Modeling

This is the final illustration drawn from a single data set. At this point, the best possible approach to the model given the existing data is presented. That approach is latent variable structural equation modeling, which adds multiple indicators of each theoretical variable. For this illustration, the social class variable was dropped because it was not related to any of the other conceptual variables, and the teacher evaluation variable was dropped because it was too highly related to the ability and achievement variables, resulting in unwanted collinearity. Dropping variables that should be conceptually distinct or important is not an easy decision for a substantive article; for the purpose of the illustration, however, the decision is much easier, for the variables are not needed to illustrate the points being made. To again remind readers, the focus is on the paths between peer acceptance and achievement.
The model is described in Figure 9.3, and the matrix (called a:mafullmt.rx in the illustration) appears in Table 9.3.
The LISREL commands for the model are as follows:
Mexican American data, runs for choices of whites, multiple indicators
DA NI=33 NO=100 MA=CM
KM FU FO FI=a:mafullmt.rx
(8F10.7)
SD FO
(11F7.5)
1.0 1.0 1.0 1.0 1.0 1.0 1.025 1.049 .981 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 .901 .907 1.114 1.200 .911
.936 .766 .875 .926 1.0 1.0 1.0 1.0 .705 1.0 1.0
SE
16 17 13 14 15 27 28 4 5 6 18 19 7 8 9 20 21 /
MO NY=17 NE=7 LY=FU,FI BE=FU,FI PS=SY,FI TE=SY,FI
FR LY 2 1 LY 4 2 LY 3 2 LY 7 3 LY 9 4 LY 8 4 C
BE 3 1 BE 2 1 BE 4 1 BE 4 2 BE 4 3 BE 5 2 BE 5 3 BE 6 5 BE 6 4 C
BE 7 5 BE 7 4 PS 3 2 PS 1 1 PS 2 2 PS 3 3 C
PS 4 4 PS 5 5 PS 6 6 PS 7 7 TE 1 1 TE 2 2 TE 3 3 TE 4 4 TE 5 5 TE 6 6 C
TE 7 7 TE 8 8 TE 9 9 TE 10 10 TE 11 11 TE 12 12 TE 13 13 TE 14 14 TE 15 15 C
TE 16 16 TE 17 17 TE 11 6 TE 12 7 TE 13 8 TE 14 9 TE 16 6 C
TE 17 7 TE 15 10 TE 16 11 TE 17 12
EQ LY 7 3 LY 12 5 LY 17 7
EQ LY 9 4 LY 14 6
EQ LY 8 4 LY 13 6
ST 1.0 LY 1 1 LY 5 2 LY 6 3 LY 10 4 LY 11 5 LY 15 6 LY 16 7 C
LY 9 4 LY 8 4
path diagram
OU PT AD=OFF SS

The results are the same as those described as Model 1 in Table 9.4.

The complete results are as follows:

LISREL ESTIMATES (MAXIMUM LIKELIHOOD)

(a) Relations Between Constructs and Measures (lambda Y):
Each row gives the loading, its standard error in parentheses, and the t value.
Loadings of 1.00 shown without a standard error are fixed.

Measure   Construct   Loading   (SE)     t
VAR 16    Ability     1.00
VAR 17    Ability      .66      (.28)    2.31
VAR 13    PeerAcc1    1.35      (.27)    4.97
VAR 14    PeerAcc1     .84      (.17)    4.85
VAR 15    PeerAcc1    1.00
VAR 27    Achieve1    1.00
VAR 28    Achieve1    1.02      (.16)    6.26
VAR 4     PeerAcc2    1.01      (.09)   11.78
VAR 5     PeerAcc2     .98      (.08)   11.61
VAR 6     PeerAcc2    1.00
VAR 18    Achieve2    1.00
VAR 19    Achieve2    1.02      (.16)    6.26
VAR 7     PeerAcc3    1.01      (.09)   11.78
VAR 8     PeerAcc3     .98      (.08)   11.61
VAR 9     PeerAcc3    1.00
VAR 20    Achieve3    1.00
VAR 21    Achieve3    1.02      (.16)    6.26

(b) Structural Paths Interrelating Theoretical Variables:
Each row gives the path coefficient, its standard error in parentheses, and the
t value.

Outcome    Predictor   Path    (SE)     t
PeerAcc1   Ability      .28    (.19)    1.42
Achieve1   Ability      .50    (.24)    2.06
PeerAcc2   Ability     -.10    (.29)   -0.36
PeerAcc2   PeerAcc1     .22    (.16)    1.36
PeerAcc2   Achieve1     .24    (.28)    0.86
Achieve2   PeerAcc1    -.05    (.16)   -0.30
Achieve2   Achieve1    1.35    (.31)    4.35
PeerAcc3   PeerAcc2     .21    (.11)    1.89
PeerAcc3   Achieve2    -.06    (.13)   -0.46
Achieve3   PeerAcc2     .12    (.09)    1.35
Achieve3   Achieve2     .97    (.12)    7.89

(c) Residuals:
Each row gives the estimate, its standard error in parentheses, and the t value.

Term                                Estimate   (SE)     t
Ability (variance)                    .42      (.22)    1.94
PeerAcc1 (residual)                   .41      (.13)    3.11
PeerAcc1 with Achieve1 (covariance)   .05      (.06)    0.80
Achieve1 (residual)                   .19      (.08)    2.47
PeerAcc2 (residual)                   .68      (.13)    5.31
Achieve2 (residual)                   .01      (.11)    0.10
PeerAcc3 (residual)                   .61      (.12)    5.06
Achieve3 (residual)                  -.01      (.11)   -0.13

Of most prominence are the strong achievement-to-achievement paths, with
residuals that are effectively zero (.01 and -.01, the latter an anomalous
negative variance called a Heywood case). Most notably, those strong stabilities
wiped out any panel relationships, leaving the conclusion that there is no
relation between popularity and achievement.
The corresponding fit statistics, which are provided so that readers can look
back at them after reading the next chapter, were as follows:
GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 102 DEGREES OF FREEDOM = 87.55 (P = 0.85)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 0.0
90 PERCENT CONFIDENCE INTERVAL FOR NCP = (0.0 ; 9.94)
MINIMUM FIT FUNCTION VALUE = 0.88
POPULATION DISCREPANCY FUNCTION VALUE (F0) = 0.0
90 PERCENT CONFIDENCE INTERVAL FOR F0 = (0.0 ; 0.10)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.0
90 PERCENT CONFIDENCE INTERVAL FOR RMSEA = (0.0 ; 0.031)
P-VALUE FOR TEST OF CLOSE FIT (RMSEA < 0.05) = 0.99
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 1.91
90 PERCENT CONFIDENCE INTERVAL FOR ECVI = (2.06 ; 2.16)
ECVI FOR SATURATED MODEL = 3.09
ECVI FOR INDEPENDENCE MODEL = 7.08
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 136 DEGREES OF FREEDOM = 666.46
INDEPENDENCE AIC = 700.46
MODEL AIC = 189.55
SATURATED AIC = 306.00
INDEPENDENCE CAIC = 761.75
MODEL CAIC = 373.41
SATURATED CAIC = 857.59
ROOT MEAN SQUARE RESIDUAL (RMR) = 0.063
STANDARDIZED RMR = 0.061
GOODNESS OF FIT INDEX (GFI) = 0.91
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.86
PARSIMONY GOODNESS OF FIT INDEX (PGFI) = 0.61
NORMED FIT INDEX (NFI) = 0.87
NON-NORMED FIT INDEX (NNFI) = 1.04
PARSIMONY NORMED FIT INDEX (PNFI) = 0.65
COMPARATIVE FIT INDEX (CFI) = 1.00
INCREMENTAL FIT INDEX (IFI) = 1.0
RELATIVE FIT INDEX (RFI) = 0.82


In summary, the path, panel, confirmatory factor analysis, and latent variable
structural equation modeling analyses yielded differing interpretations. The
differences point out the importance of having multiple measures of theoretical
variables so that issues such as measurement error can be addressed adequately.
As a final activity, look back at the different conclusions that one might draw
from the different analytical approaches. This point is the same one as is made,
perhaps even more strongly, in the illustration that appears in Table 10.4. The
point is that the results from a single data set diverge.


This chapter provides one important and fundamental piece of structural
equation modeling (SEM) that still needs to be addressed. That piece is how to
use various tests of model plausibility to complement parameter significance
tests. As pointed out in Chapter 8 when the basic latent variable SEM model was
presented, the chi-square goodness of fit test, although valuable, is limited
because it is a direct function of sample size. In small samples even poor
models may fit fairly well, whereas in very large samples even trivial
differences between the hypothesized model and the observed data will result in
models that do not fit by traditional criteria of significance testing.
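The dependence on sample size can be made concrete with a small sketch. It assumes the relation between the chi-square and the minimized fit function, chi-square = (N - 1)F, and uses the Illustration 4 output (chi-square = 87.55, N = 100, 102 degrees of freedom) as a starting point. The function name is hypothetical, and holding F constant while N grows is an assumption made purely for illustration.

```python
def chi_square_from_fit(f_min: float, n: int) -> float:
    """Chi-square implied by a minimized fit function value F at sample size N."""
    return (n - 1) * f_min


# Fit function value implied by the Illustration 4 output (chi-square = 87.55, N = 100).
F_MIN = 87.55 / (100 - 1)
DF = 102

# Same degree of misfit (same F), increasingly large hypothetical samples.
for n in (100, 500, 1000):
    chi2 = chi_square_from_fit(F_MIN, n)
    print(f"N = {n:4d}  chi-square = {chi2:7.2f}  chi-square/df = {chi2 / DF:.2f}")
```

With the discrepancy between model and data held fixed, the chi-square grows in proportion to N - 1, so a model that fits acceptably at N = 100 would be rejected in a much larger sample.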
This chapter discusses aspects of model fitting. First, it discusses what it
means to call models nested and the advantages for model fitting that exist
when models are nested. Second, it describes and explains general measures of
overall model fit and fit indexes for comparing nested models. Third, it
presents indexes of fit that allow comparison of non-nested models. Fourth, it
describes two approaches for setting up a series of nested models so that
readers can see how they might need and use the indexes described in the second
section. Finally, the issues covered in this chapter are illustrated through an
example. The Maruyama and McGarvey (1980) data set (Figure 9.2; Table 9.2) is
used to illustrate a series of nested models plus the array of overall fit
indexes.
234

Alternative

Models

and Significance

Tests

235

Nested Models

Generally speaking, models can be said to be nested whenever one model has all
the same free parameters as does a second model but also has other free
parameters not shared by the other model. In other words, the two models are
equivalent except for a subset of free parameters in one model that are fixed
or constrained in the other.
Imagine that we want to test plausibility of a three-wave model in which peer
acceptance influences achievement as in Figure 9.3. First, we could compare two
models, one with and the other without the peer acceptance to achievement
paths. The two models would differ only with respect to those particular paths;
their measurement models would be identical, and their structural models would
differ only by two parameters that are free in one model but fixed to 0 in the
other, so they fit the definition of nested. In other words, the differences,
as well as the difference in degrees of freedom, between the two models result
from parameters being free in one model and fixed in the other. In this case,
those are the two paths from peer acceptance to achievement.
In some instances, nesting of models may not be immediately apparent, for
example, comparing a two-factor confirmatory factor model with an alternative
one-factor model. In that case, provided that no indicator is related to both
factors (e.g., Figure 7.4 provides a simple illustration), the two alternative
models are nested even though their nesting is not as readily apparent as in
the previous example. A one-factor model actually just assumes that the
correlation between the two factors is unity, which makes the factors the same
and the loadings on the two factors the same as they would be on a single
factor. The two- and one-factor models thus fit the definition of nested, for
the freely estimated factor correlation is the only difference between them.
Note that, in contrast to the first illustration, the fixed parameter is fixed
not to 0 but rather to 1.
The preceding examples describe common types of situations in which variations
of a model are pitted against one another to examine plausibility of some
hypothesized paths and/or plausibility of alternative views about
relationships. The alternative models can be compared for overall fit. Before
getting too carried away by the possibility of generating large numbers of
nested models to test different paths, however, it is prudent to remember that
each parameter estimated has its own standard error and therefore a confidence
interval. In other words, significance of each parameter estimate (the
difference of that parameter estimate from 0) can be assessed without using
nested models. Furthermore, examination of plausibility of alternative views
can be done by inspecting confidence intervals around and significance levels
of critical parameters. In the two illustrations just described, plausibility
can be assessed by examining (a) the significance (difference from a path of 0)
of the paths from peer acceptance to achievement and (b) the difference of the
correlation between the two factors from a correlation of unity. Looking at
confidence intervals is far simpler than setting up a series of nested models
to "test" specific parameters. Alternative models should be viewed as testing
changes at the model level rather than at the level of the individual
parameter. That is, models would be compared that differ with respect to a
number of different parameters.
An appealing feature of setting up nested models is that they are directly
comparable by a test statistic. If the difference between the chi-square
statistics for two nested models is calculated by simple subtraction (e.g.,
$\chi^2_{M1} - \chi^2_{M2}$, where M1 and M2 are the two nested models and M1
is the more constrained of the two), then that difference also is distributed
as chi-square with degrees of freedom equal to the difference in degrees of
freedom between the models ($df_{M1} - df_{M2}$). Thus, there is available a
simple and straightforward chi-square test of the overall differences between
models.
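The subtraction described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for SEM software: the function names are hypothetical, and the tail probability uses a closed-form series that is valid only for integer degrees of freedom. The numbers come from the Illustration 4 output in Chapter 9, treating the independence (null) model as the more constrained model and the theoretical model as the less constrained one.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a central chi-square with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail term plus a finite series in sqrt(x).
    chi = math.sqrt(x)
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * density * total


def chi_square_difference(chi2_constrained, df_constrained, chi2_free, df_free):
    """Chi-square difference test for two nested models."""
    diff = chi2_constrained - chi2_free
    df_diff = df_constrained - df_free
    return diff, df_diff, chi2_sf(diff, df_diff)


# Independence model vs. theoretical model from the Illustration 4 output.
diff, df_diff, p = chi_square_difference(666.46, 136, 87.55, 102)
print(f"chi-square difference = {diff:.2f}, df = {df_diff}, p = {p:.3g}")
```

The same function applies to any pair of nested models, such as the models with and without the peer acceptance to achievement paths, once both chi-square values and their degrees of freedom are available.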
Although using the chi-square goodness of fit statistic to compare models is
appealing, it shares the general shortcoming of all chi-square goodness of fit
tests for structural models, namely, that it is directly affected by sample
size and therefore is somewhat limited in value for assessing differences
between alternative models. By contrast, within the range of acceptable sample
sizes for latent variable SEM (i.e., large samples), standard errors for
estimated parameters do not change much. That is why the approach of choice for
examining significance of individual paths is to use standard errors to build
confidence intervals rather than constructing chi-square difference tests of
overall fit. At the same time, however, nested models can be used as
alternative models for the different fit indexes presented in the next section
of this chapter. Those indexes are less prone to complications due to changes
in sample size than is the chi-square difference test.
A second class of nested models is the type in which equality constraints are
imposed. Imposing such constraints can be done in any model, but they make the
most sense in either longitudinal or multiple group analyses (see Chapter 11).
For longitudinal panel models, the same indicators collected across time can
have their loadings constrained so that they are the same at each time point.
(This is the variation on tau equivalent tests described in Chapter 7.) The
constrained and unconstrained models are nested, for they differ only with
respect to the parameters that are constrained in one model and free in the
other. When researchers have available multiple groups (e.g., a male sample and
a female sample), they may want to examine whether or not parameters are the
same in the different samples. As will be described in Chapter 11, SEM
techniques allow constraining paths to be equal across samples. Once again, the
constrained and unconstrained models are nested.
Regardless of how similar two models are to one another, if they free
alternative parameters such that one model no longer is the second with some
free parameters added, then the models no longer are nested and therefore no
longer are directly comparable. As a result, a chi-square test of difference to
compare them is no longer possible. There certainly are many instances in which
the different theoretical models are not, cannot be, and should not be made a
series of nested models. In those instances, ways of choosing between the
alternative models need to be based on something other than the chi-square fit
statistic (Akaike, 1987; Browne & Cudeck, 1989).
As a note, just-identified models (sometimes also called fully saturated
models) will be nested with all possible models, for they have no degrees of
freedom and fit the data perfectly. Their nesting with other models, however,
is trivial, for all variations of just-identified models fit the same,
regardless of their conceptual sensibility. On the other hand, the fact that
all sets of variables eventually can be perfectly represented in a model points
out the importance of the principle of parsimony in model construction. That
principle is that tests of model fit ought to give some value to models that
approximate the just-identified model while still having many degrees of
freedom. That principle has been used in developing a set of indexes of model
fit.
SEM researchers also speak of fully saturated or just-identified structural
models. These have been discussed in previous chapters. They have no degrees of
freedom tied to the relations among the latent variables, so all degrees of
freedom in such models are tied to the relations of measures with latent
variables. Just-identified structural models have as many paths linking latent
variables as there are covariances among those latent variables. For example,
the model of Figure 9.1 has a just-identified structural model, for there are
5 latent variables and 10 paths among them. Just-identified structural models
are nested with all other models sharing their measurement structure. This type
of nesting is more interesting and important than nesting tied to the
just-identified model, for just-identified structural models are guaranteed to
have the best possible fit of any models sharing a particular measurement
structure. If their fit is poor, then no way of specifying relations among the
conceptual variables would fit any better.
Because there can be many variants of a structural model that are just
identified, just-identified structural models provide another case in which
nesting of models may not be immediately apparent. For example, a confirmatory
factor model with all latent variables free to covary with one another is
identical in overall model fit to any fully recursive path model (and also any
just-identified structural model) linking those same latent variables.
To summarize, alternative models that are nested can be directly compared with
one another in overall fit through the chi-square statistic and SEM fit
indexes. They provide a way in which to engage in direct comparison of
competing models and complement the significance information that is available
for individual parameter estimates.

Tests of Overall Model Fit


In the past decade, a wide array of tests of overall fit of SEM models has
emerged. Although the various tests, or fit indexes as they are commonly
called, are in the process of being sorted out and compared, at present there
is no agreement about a single optimal test or even a set of optimal tests. The
unfortunate result is that different articles present different indexes, and
different reviewers of articles ask for various indexes that they know about or
prefer. Even though there is an entire book focused on model testing (Bollen &
Long, 1993), providing readers with cookbook-like guidance in this complex area
is not possible (an article that comes close to providing such guidance is that
of Hoyle & Panter, 1995). Furthermore, current versions of popular SEM programs
such as LISREL, AMOS, and EQS provide a large number of fit indexes, more, in
fact, than anyone would want to report. (Look back, for example, at the fit
indexes in Appendix 9.4 under the heading of "goodness of fit statistics.")
Basically, the array of writings on overall tests of model fit is extensive and
not altogether consistent (e.g., Bentler, 1990; Bollen, 1990; Bollen & Long,
1993; Hu & Bentler, 1995; Marsh, Balla, & McDonald, 1988; Mulaik et al., 1989).
The different fit indexes differ with respect to dimensions such as
susceptibility to sample size differences, variability in the range of fit
possible for any particular data set, and valuing simplicity of model
specification needed to attain an improved fit. Although many of these indexes
have been developed to be used for a number of different estimation approaches
(e.g., maximum likelihood, generalized least squares), they will be presented
here in terms of the chi-square statistic that corresponds to maximum
likelihood estimation (see also Table 10.1).
The different indexes have been classified in a number of ways. The most widely
cited and followed classification approach is one presented by Marsh et al.
(1988). As amended by Hu and Bentler (1995), it provides the framework for the
discussion of basic indexes and indexes for nested models presented here. That
framework presents "absolute" indexes and then four different subtypes of
"relative" indexes. One also could add to that framework yet another dimension
for classifying fit, which would be labeled "adjusted" indexes. This additional
dimension typically has not been included as a separate category because it
does not provide an independent classification scheme.
Absolute indexes address the question: Is the residual or unexplained variance
remaining after model fitting appreciable? Thus, they are absolute insofar as
they impose no baseline for any particular data set. If a data set, for
example, has only small relationships across measures, then almost any model
potentially could fit fairly well.
Relative indexes address the question: How well does a particular model do in
explaining a set of observed data compared with (a range of) other possible
models? Most of these relative fit indexes establish as a baseline a "worst
fitting" model. The most common worst-fitting model is one that models only the
variances from the variance/covariance matrix. That model is also called the
"null model." Because it fits only the variances and assumes that all
covariances are 0, it fits only if there are negligible covariances between
measures. Then, theoretical models are viewed as falling along a continuum
between the null model (the worst possible fitting model) and a best fit model
that perfectly fits the observed data (typically attained only by the fully
saturated or just-identified model). Different conceptual models can be
directly compared only if they are nested.

TABLE 10.1 Fit Indexes Summary

Types (classes) of fit indexes
1. Absolute: Is the residual (unexplained) variance appreciable? (e.g.,
   chi-square, chi-square/df, RMR, GFI)
2. Relative: How well does the model do compared with (a range of) other
   possible models with the same data? (e.g., Type 1: NFI; Type 2: TLI, IFI;
   Type 3: BFI or RNI)
3. Adjusted: How does the model combine fit and parsimony? (e.g., LISREL's
   PGFI, PNFI, TLI)

Specific indexes
In the following, F stands for the function that is minimized, and
$\chi^2 = (N - 1)F$. The subscript key is as follows: t = theoretical;
n = null; s = saturated; a and b = alternative models. The symbol k = number of
measures in the model.

Root mean residual:
This statistic is simply the square root of the mean of the squared
discrepancies between all the elements of the predicted ($\Sigma$) and observed
($S$) matrices.

LISREL goodness of fit index:
$\mathrm{GFI} = 1 - [\mathrm{tr}(\Sigma^{-1}S - I)^2 / \mathrm{tr}(\Sigma^{-1}S)^2]$
The GFI measures the proportion of weighted information in $S$ that fits
weighted information in $\Sigma$, such as the coefficient of determination. The
ratio part of the formula is like a ratio of residual to total variance.

LISREL adjusted goodness of fit index:
$\mathrm{AGFI} = 1 - [k(k + 1) / 2df_t](1 - \mathrm{GFI})$
The AGFI is not recommended (see Mulaik et al., 1989).

Bentler and Bonett's (1980) normed fit index:
$\mathrm{NFI} = (F_b - F_a) / F_n$ or $(\chi^2_b - \chi^2_a) / \chi^2_n$

Tucker-Lewis index (Tucker & Lewis, 1973):
$\mathrm{TLI} = \dfrac{\chi^2_n/df_n - \chi^2_t/df_t}{\chi^2_n/df_n - 1} = \dfrac{(F_n/df_n) - (F_t/df_t)}{(F_n/df_n) - 1/(N - 1)}$
where N is the sample size. This also is the Bentler and Bonett non-normed fit
index formula.


TABLE 10.1 Continued

Bollen's (1989) incremental fit index:
$\mathrm{IFI} = (\chi^2_n - \chi^2_t) / (\chi^2_n - df_t)$
Note that $df_t$ = expected value of $\chi^2_t$.

Bentler's (1990) and McDonald and Marsh's (1990) relative noncentrality index:
$\mathrm{RNI\ or\ BFI} = [(\chi^2_n - df_n) - (\chi^2_t - df_t)] / (\chi^2_n - df_n)$

James, Mulaik, and Brett's (1982) parsimonious fit index:
$\mathrm{PGFI} = \{df_t / [k(k + 1) / 2]\}\,\mathrm{GFI}$,
where $df_t$ is degrees of freedom of the model, k = size of input matrix,
k(k + 1) / 2 = total possible degrees of freedom, and GFI is the index defined
above.

Mulaik et al.'s (1989) parsimonious normed fit index:
$\mathrm{PNFI} = (df_t / df_n)\,\mathrm{NFI}$ or $\{df_t / [k(k - 1) / 2]\}\,\mathrm{NFI}$

Mulaik et al.'s parsimonious normed fit index, Type 2:
$\mathrm{PNFI2} = (df_t / df_n)\,\mathrm{IFI}$ or $\{df_t / [k(k - 1) / 2]\}\,\mathrm{IFI}$

Akaike information criteria (Akaike, 1987):
AIC (Joreskog) = $\chi^2_t - 2df_t$
AIC (Tanaka) = $\chi^2_t + 2(\text{number of free parameters})$

Bozdogan's (1987) modified AIC:
$\mathrm{CAIC} = \chi^2_t - (1 + \ln N)\,df_t$

Browne and Cudeck's (1989) expected cross-validation index:
$\mathrm{ECVI} = [\chi^2_t / (N - 1)] + 2[\text{number of free parameters} / (N - 1)]$

Steiger's (1990) root mean square error of approximation:
$\mathrm{RMSEA} = \sqrt{F_0 / df_t}$,
where $F_0$ is the population discrepancy function value.

NOTE: RMR = root mean residual; GFI = goodness of fit index; NFI = normed fit
index; TLI = Tucker-Lewis index; IFI = incremental fit index; BFI or RNI =
relative noncentrality index; CFI = comparative fit index; PGFI = parsimonious
GFI; PNFI = parsimonious NFI; AGFI = adjusted GFI; AIC = Akaike information
criteria; CAIC = modified AIC; ECVI = expected cross-validation index; RMSEA =
root mean square error of approximation.
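As a check on the information-based indexes in Table 10.1, the sketch below recomputes the model AIC, CAIC, and ECVI printed in the Illustration 4 output of Chapter 9. It is an illustration under stated assumptions: the function name is hypothetical, and the AIC and CAIC are written in their free-parameter forms (chi-square plus a penalty per free parameter), which is an assumption chosen because those forms reproduce the printed LISREL values. The number of free parameters is taken as the number of distinct variances and covariances among the 17 measures minus the model's 102 degrees of freedom.

```python
import math


def information_indexes(chi2: float, free_params: int, n: int):
    """Return (AIC, CAIC, ECVI) in their free-parameter forms."""
    aic = chi2 + 2 * free_params
    caic = chi2 + (1 + math.log(n)) * free_params
    ecvi = chi2 / (n - 1) + 2 * free_params / (n - 1)
    return aic, caic, ecvi


K = 17                      # measures in the Illustration 4 model (NY=17)
N = 100                     # sample size (NO=100)
CHI2_T, DF_T = 87.55, 102   # theoretical model chi-square and df
FREE = K * (K + 1) // 2 - DF_T   # 153 distinct moments - 102 df = 51 parameters

aic, caic, ecvi = information_indexes(CHI2_T, FREE, N)
print(f"free parameters = {FREE}")
print(f"AIC = {aic:.2f}, CAIC = {caic:.2f}, ECVI = {ecvi:.2f}")
```

Because smaller values indicate a better trade-off between fit and the number of parameters, these indexes can be compared across models that are not nested, provided the models are fitted to the same data.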

Some of the relative indexes, called Type 1 by Marsh et al. (1988), directly
compare the fit of two different models. Other indexes, called Type 2 by Marsh
et al., compare models but also include information from the expected value of
the models under a central chi-square distribution. Hu and Bentler (1995) add
Type 3 and Type 4 indexes; Type 3 indexes compare models including information
about expected value under a noncentral chi-square distribution, and Type 4
indexes compare models while including information from other distributional
forms. At present, there has been little work done on Type 4 indexes (e.g., Hu
& Bentler, 1995).
Finally, adjusted indexes explicitly address the question: How does the model
combine fit and parsimony? The point, mentioned earlier in this book, is that
many models could fit the data if enough parameters were estimated, so value
ought to be given to models that account for variability with a relatively
small number of free parameters. What makes this category nonindependent of the
others is that some of the model indexes just described inherently adjust in
their formulas for the degrees of freedom in the various models that are being
compared, which means that they already build in a control for parsimony. In
other words, some of the model indexes described in the previous categories are
in fact already adjusted indexes.
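The way degrees of freedom enter the adjusted indexes can be sketched with the formulas listed in Table 10.1 and the values printed in the Illustration 4 output of Chapter 9. The function names are hypothetical, and the rounded GFI of .91 is used because only the rounded value is printed, so small rounding differences from the LISREL output are to be expected.

```python
def agfi(gfi: float, k: int, df_t: int) -> float:
    """Adjusted GFI: 1 - [k(k + 1) / (2 df_t)] (1 - GFI)."""
    return 1 - (k * (k + 1) / (2 * df_t)) * (1 - gfi)


def pgfi(gfi: float, k: int, df_t: int) -> float:
    """Parsimonious GFI: {df_t / [k(k + 1) / 2]} GFI."""
    return (df_t / (k * (k + 1) / 2)) * gfi


def pnfi(nfi: float, df_t: int, df_n: int) -> float:
    """Parsimonious NFI: (df_t / df_n) NFI."""
    return (df_t / df_n) * nfi


K, DF_T, DF_N = 17, 102, 136          # measures, theoretical df, null-model df
GFI = 0.91                             # rounded value from the output
NFI = (666.46 - 87.55) / 666.46        # null vs. theoretical chi-square

print(f"AGFI = {agfi(GFI, K, DF_T):.3f}")
print(f"PGFI = {pgfi(GFI, K, DF_T):.3f}")
print(f"PNFI = {pnfi(NFI, DF_T, DF_N):.3f}")
```

In each case the unadjusted index is scaled by a ratio involving the model's degrees of freedom, so a model that reaches the same fit with fewer free parameters receives the higher adjusted value.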
Absolute Indexes

These indexes include those that use the function that is minimized (the
maximum likelihood fitting function or the scaled likelihood ratio), the root
mean residual (RMR), the chi-square test and $\chi^2/df$ ratio, and the
goodness of fit index (GFI) and adjusted goodness of fit index (AGFI) from the
LISREL program. They provide information about how closely the models fitted
compare to a perfect fit. At the same time, they ignore variability between
data sets in how poorly any model could possibly fit.
Of the absolute indexes, the chi-square and $\chi^2/df$ ratio indexes already
were described earlier in this book. Both look at the absolute size of the
residuals. As a reminder, the distribution of chi-square is such that its
expected value is equal to its degrees of freedom, so an "ideal" fit would have
a $\chi^2/df$ ratio of 1.0. "Ideal" is in quotes because (a) in fact a
perfectly fitting model would have a chi-square statistic of 0 and (b) for any
given level of model fit short of perfect, the chi-square and, consequently,
the $\chi^2/df$ ratio change as sample size changes. The third absolute index,
the RMR, also does not require much explanation, for it is straightforward. It
is the square root of the mean squared difference between elements of the
predicted and observed matrices. (Differences are squared because the sign of
the difference is inconsequential; a discrepancy is a discrepancy regardless of
its sign.) The RMR makes most sense when measures are standardized, for then
they have a common metric and their residuals have parallel meaning. Of the
other indexes mentioned, only the GFI is described here; the AGFI is skipped,
for Mulaik et al. (1989) argued that it has problems that make it less
desirable.
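The RMR calculation can be sketched as follows. The two small correlation matrices are hypothetical and serve only to show the arithmetic, and the function name is likewise hypothetical. Averaging over the non-duplicated elements (the lower triangle including the diagonal) is an assumption about how the "elements" of the symmetric matrices are counted.

```python
import math


def root_mean_residual(observed, predicted):
    """Square root of the mean squared residual over the lower triangle,
    including the diagonal, of two symmetric matrices of equal size."""
    k = len(observed)
    squared = [
        (observed[i][j] - predicted[i][j]) ** 2
        for i in range(k)
        for j in range(i + 1)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed (S) and model-implied (Sigma) correlation matrices.
S = [
    [1.00, 0.50, 0.40],
    [0.50, 1.00, 0.30],
    [0.40, 0.30, 1.00],
]
SIGMA = [
    [1.00, 0.45, 0.42],
    [0.45, 1.00, 0.25],
    [0.42, 0.25, 1.00],
]

print(f"RMR = {root_mean_residual(S, SIGMA):.4f}")
```

Because the inputs here are correlations, the residuals share a common metric, which is the situation in which the RMR is easiest to interpret.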
The GFI has been presented in two different ways. More recent versions of the
LISREL manual use the terminology of Tanaka and Huba (1984), but the earlier
terminology is more consistent with other indexes and with terms used
previously in this book, so it is presented here.
$$\mathrm{GFI} = 1 - \left[\frac{\mathrm{tr}(\Sigma^{-1}S - I)^2}{\mathrm{tr}(\Sigma^{-1}S)^2}\right]$$
GFI assesses the relative amount of the variances and covariances jointly
accounted for by the model and thus typically ranges between 0 and 1. As
$\Sigma$ and $S$ converge, the numerator of the term in brackets goes to 0,
resulting in GFI going to 1.
Relative Indexes

A Type 1 index widely used but currently not recommended (Hu & Bentler, 1995,
recommended no Type 1 indexes), because it is affected by sample size and does
poorly for small samples, is the Bentler and Bonett (1980) normed fit index
(NFI). That index compares fits of two different models to the same data set.
One of the models may be a baseline/null model. The NFI can be expressed in
terms of either the fit function (F) or the $\chi^2$ statistic, for they yield
equivalent results.
$$\mathrm{NFI} = \frac{F_b - F_a}{F_n} \quad \text{or} \quad \frac{\chi^2_b - \chi^2_a}{\chi^2_n}$$
where a and b are alternative models (b being the more restricted of the two,
so that the index is nonnegative) and n is the null model.


Consistent with the logic presented here in which models are viewed as falling
along a continuum from worst possible to best possible fit, the denominator
implicitly is $F_n - F_s$, where s is the saturated model and has a fit
function of 0. Remember, even though in the NFI equation two competing models
are compared, the null model can be one of the two models compared. If it is,
then the resulting index provides information about the proportion of possible
improvement from null to best fit model that is attained by the conceptual
model. Because the NFI is bounded by 0 and 1, it is an appealing index. Bentler
and Bonett (1980) recommended accepting NFIs of .90 or greater in comparison to
the null model as indicative of a good fit for a theoretical model.
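A short sketch of the NFI calculation, with the null model as one of the two models compared, follows. The function name is hypothetical; the chi-square values are those printed for the independence model and the theoretical model in the Illustration 4 output of Chapter 9.

```python
def normed_fit_index(chi2_restricted: float, chi2_less_restricted: float,
                     chi2_null: float) -> float:
    """Bentler-Bonett NFI: improvement in chi-square relative to the null model."""
    return (chi2_restricted - chi2_less_restricted) / chi2_null


CHI2_NULL = 666.46   # independence (null) model
CHI2_T = 87.55       # theoretical model

# With the null model as the more restricted model, the NFI is the proportion
# of the null model's misfit that the theoretical model removes.
nfi = normed_fit_index(CHI2_NULL, CHI2_T, CHI2_NULL)
print(f"NFI = {nfi:.2f}")
```

The result agrees with the NFI printed in the output and falls just short of the .90 guideline, even though the chi-square test itself indicates a very good fit in this small sample.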
There are several different Type 2 indexes that are widely used.
Most important, these indexes are much more consistent across
sample size (e.g., Marsh et al., 1988) than are either absolute or Type
1 indexes. The most prominent is the classic Tucker and Lewis (1973)
formula, which was expanded by Bentler and Bonett (1980) to
compare alternative models rather than comparing one model with a
null model. The Tucker-Lewis index (TLI) is

$$ \mathrm{TLI} = \frac{(\chi^2_0/df_0) - (\chi^2_t/df_t)}{(\chi^2_0/df_0) - 1} = \frac{(F_0/df_0) - (F_t/df_t)}{(F_0/df_0) - [1/(N-1)]}, $$

where N is the sample size, F is the function that is minimized, 0 is
the null model, and t is the theoretical model. The 1 in the
denominator results from the fact that the expected value of
$\chi^2$ = df, which makes the ratio of expected $\chi^2$/df for any
model t equal to 1. Thus the TLI is a Type 2 index, based on the
expected value of $\chi^2$/df under the central chi-square
distribution. The 1 on the far right is divided by N - 1, which is the
value that the fitting function (F) is multiplied by to get the
chi-square value. In other words, the other three ratios from the
formula presented in terms of chi-square all are divided by N - 1 to
go from a chi-square to an F, so the 1 term needs to be as well so as
to maintain equivalence.
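
The equivalence of the chi-square and fit-function forms of the TLI can be verified numerically. This is a sketch only, assuming the chi-square values and degrees of freedom reported in Table 10.3 and assuming F is obtained as chi-square divided by N - 1, as the paragraph above describes; the function names are hypothetical.

```python
def tli_from_chi2(chi2_null, df_null, chi2_model, df_model):
    """TLI written in terms of chi-square/df ratios."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


def tli_from_f(f_null, df_null, f_model, df_model, n):
    """TLI written in terms of the fit function; the 1 becomes 1/(N - 1)."""
    ratio_null = f_null / df_null
    ratio_model = f_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0 / (n - 1))


N = 249
CHI2_NULL, DF_NULL = 738.94, 78
CHI2_ORIGINAL, DF_ORIGINAL = 138.55, 59
CHI2_MODIFIED, DF_MODIFIED = 104.51, 54

tli_original = tli_from_chi2(CHI2_NULL, DF_NULL, CHI2_ORIGINAL, DF_ORIGINAL)
tli_modified = tli_from_chi2(CHI2_NULL, DF_NULL, CHI2_MODIFIED, DF_MODIFIED)

# Assumption: F = chi-square / (N - 1), following the text above.
tli_original_f = tli_from_f(CHI2_NULL / (N - 1), DF_NULL,
                            CHI2_ORIGINAL / (N - 1), DF_ORIGINAL, N)

print(round(tli_original, 3), round(tli_modified, 3), round(tli_original_f, 3))
```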
The TLI is robust across sample size changes (Hu & Bentler,
1995; Marsh et al., 1988) but, unfortunately, is not bounded by 0 and
1, making it more difficult to interpret than an index such as NFI.
The Bentler and Bonett (1980) non-normed fit index (NNFI) is
identical to the TLI except that Bentler and Bonett allow the null
model in the numerator to be replaced by an alternative model, for
example, ($\chi^2_a / df_a$) rather than ($\chi^2_0 / df_0$).

A second recommended (e.g., Hoyle & Panter, 1995) Type 2
index is Bollen's (1989) incremental fit index (IFI). The IFI looks
much like the NFI, but in its denominator it subtracts degrees of
freedom of the theoretical model from $\chi^2_0$. The degrees of freedom
of the theoretical model is the expected value of $\chi^2_t$. The formula is

$$ \mathrm{IFI} = (\chi^2_0 - \chi^2_t) / (\chi^2_0 - df_t). $$
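
To see how subtracting the model's degrees of freedom changes the result relative to the NFI, the two can be computed side by side. This is a minimal sketch under the assumption that the Table 10.3 values are used as inputs; the helper names are illustrative.

```python
def nfi_vs_null(chi2_null, chi2_model):
    """NFI for a theoretical model compared with the null model."""
    return (chi2_null - chi2_model) / chi2_null


def ifi(chi2_null, chi2_model, df_model):
    """Incremental fit index: the denominator subtracts the model's df."""
    return (chi2_null - chi2_model) / (chi2_null - df_model)


CHI2_NULL = 738.94
MODELS = {
    "original": (138.55, 59),  # (chi-square, df) from Table 10.3
    "modified": (104.51, 54),
}

ifi_values = {}
nfi_values = {}
for name, (chi2_model, df_model) in MODELS.items():
    nfi_values[name] = nfi_vs_null(CHI2_NULL, chi2_model)
    ifi_values[name] = ifi(CHI2_NULL, chi2_model, df_model)
    print(name, round(nfi_values[name], 3), round(ifi_values[name], 3))
```

Because the denominator shrinks by the model's degrees of freedom, the IFI is larger than the NFI whenever the model has positive degrees of freedom and fits better than the null model.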


Type 3 indexes have not been as widely used in the SEM literature
as have Types 1 and 2. Again following recommendations of Hoyle
and Panter (1995) as well as Hu and Bentler (1995), only two are
mentioned here. First is a statistic developed both by Bentler (1990),
called the Bentler fit index (BFI), and by McDonald and Marsh
(1990), called the relative noncentrality index (RNI). Like the TLI, it
is not bounded by 0 and 1. Its formula is

$$ \mathrm{RNI\ or\ BFI} = [(\chi^2_0 - df_0) - (\chi^2_t - df_t)] / (\chi^2_0 - df_0). $$

The second index, also developed by Bentler (Bentler, 1990; Hu &
Bentler, 1995) and called the comparative fit index (CFI), adjusts the
RNI/BFI so that it falls within the range of 0 to 1. Because that index
uses a formula that differs greatly from those presented in this section,
its formula is not presented here. In most instances, the RNI (BFI)
will fall between 0 and 1, and CFI = RNI (BFI).
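
The RNI is straightforward to compute from the same quantities used above. The sketch below assumes the Table 10.3 values. The `cfi_by_clamping` helper is an assumption added for illustration: it simply restricts the RNI to the 0-to-1 range, which matches the statement that CFI equals RNI whenever the RNI already falls in that range, but it is not the formula from the sources cited. The out-of-range example uses made-up numbers.

```python
def rni(chi2_null, df_null, chi2_model, df_model):
    """Relative noncentrality index (also called the Bentler fit index)."""
    d_null = chi2_null - df_null     # chi-square minus its expected value
    d_model = chi2_model - df_model
    return (d_null - d_model) / d_null


def cfi_by_clamping(rni_value):
    """Illustrative assumption: force the RNI into the range 0 to 1."""
    return min(1.0, max(0.0, rni_value))


CHI2_NULL, DF_NULL = 738.94, 78

rni_original = rni(CHI2_NULL, DF_NULL, 138.55, 59)
rni_modified = rni(CHI2_NULL, DF_NULL, 104.51, 54)

# Hypothetical model whose chi-square is below its df: RNI exceeds 1.
rni_overfit = rni(CHI2_NULL, DF_NULL, 40.0, 54)

print(round(rni_original, 3), round(rni_modified, 3))
print(round(rni_overfit, 3), cfi_by_clamping(rni_overfit))
```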

Adjusted Indexes

Finally, there are variations of some of the indexes just described that
add to them a direct assessment of parsimony of the models being
compared. Typically, what these variations do is multiply the indexes
by a ratio of degrees of freedom in the theoretical model to degrees
of freedom either in the matrix (which for v measures is $v(v + 1)/2$),
used in absolute models, or in the null model (which is $v(v - 1)/2$),
used in relative models. For models that use a lot of degrees of
freedom in model specification, the adjusted or parsimonious fit
indexes look much worse than do the relative fit indexes. James,
Mulaik, and Brett (1982) presented a parsimonious fit index for the
LISREL GFI (PGFI):

$$ \mathrm{PGFI} = \{df_t / [v(v + 1)/2]\}\,\mathrm{GFI}. $$

Because the GFI is an absolute index, the denominator of the
parsimonious adjustment is the total number of available degrees of
freedom from the variance/covariance matrix. Mulaik et al. (1989)
presented two different parsimonious indexes for relative indexes,
one for the normed fit index (PNFI),

$$ \mathrm{PNFI} = (df_t / df_0)\,\mathrm{NFI} \quad \text{or} \quad \{df_t / [v(v - 1)/2]\}\,\mathrm{NFI}, $$

and the other for a Type 2 model (PNFI2) that, although unstated,
happens to be for Bollen's IFI. The IFI is inserted into their equation
to simplify the formula:

$$ \mathrm{PNFI2} = (df_t / df_0)\,\mathrm{IFI} \quad \text{or} \quad \{df_t / [v(v - 1)/2]\}\,\mathrm{IFI}. $$

Note that because both indexes are relative indexes, the denominator
is the degrees of freedom of the null model.
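
The parsimony adjustments amount to multiplying an index by a ratio of degrees of freedom. The following sketch assumes the GFI, NFI, and IFI values reported in Table 10.3 for the original Maruyama and McGarvey model. The number of measures, v = 13, is not stated in this section; it is inferred here from the null model's 78 degrees of freedom, since 13(12)/2 = 78.

```python
def absolute_parsimony_ratio(df_model, n_measures):
    """df of the model over the v(v + 1)/2 elements of the covariance matrix."""
    return df_model / (n_measures * (n_measures + 1) / 2)


def relative_parsimony_ratio(df_model, n_measures):
    """df of the model over the v(v - 1)/2 df of the null model."""
    return df_model / (n_measures * (n_measures - 1) / 2)


V = 13            # assumption: inferred from null df = 78
DF_ORIGINAL = 59  # original model, Table 10.3
GFI, NFI, IFI = 0.920, 0.813, 0.883  # reported (rounded) values

pgfi = absolute_parsimony_ratio(DF_ORIGINAL, V) * GFI
pnfi = relative_parsimony_ratio(DF_ORIGINAL, V) * NFI
pnfi2 = relative_parsimony_ratio(DF_ORIGINAL, V) * IFI

print(round(pgfi, 3), round(pnfi, 3), round(pnfi2, 3))
```

Because the inputs are the rounded table values, the results can differ from the published ones in the third decimal place.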

Fit Indexes for Comparing Non-Nested Models

Researchers who have alternative models that cannot be made nested
are faced with a different challenge, for it is difficult to compare
models that make different assumptions about patterns and
relationships. One could, of course, compare absolute fit indexes such
as the chi-square or the GFI. Direct comparison is complicated because
no direct statistical comparison is possible. For such models, there are
other fit indexes that are useful primarily because of their ability to
order models from best fitting to worst fitting. In general, these do
not have ideal values to attain but provide a relative ordering of
different models for a single data set. These statistics are the Akaike
information criteria (AIC) (Akaike, 1987), Bozdogan's (1987)
variation, the CAIC, and Browne and Cudeck's (1989) expected
cross-validation index (ECVI). Finally, there is Steiger's (1990) root
mean square error of approximation (RMSEA), which is a discrepancy per
degree of freedom test much like a root mean square (e.g., Browne
& Cudeck, 1993).

The AIC is complicated by the fact that the formula is given in
different forms that do not yield identical values. They do, however,
yield parallel findings. The forms presented by Jöreskog (1993) and
Tanaka (1993) are presented here:

$$ \text{AIC (Jöreskog)} = \chi^2_t - 2\,df_t $$

$$ \text{AIC (Tanaka)} = \chi^2_t + 2(\text{number of free parameters}). $$

Bozdogan's (1987) modified AIC is

$$ \mathrm{CAIC} = \chi^2_t - (1 + \ln N)\,df_t. $$

Browne and Cudeck's (1993) ECVI gives a value that would be
expected if a cross-validation sample were available:

$$ \mathrm{ECVI} = [\chi^2_t / (N - 1)] + 2[(\text{number of free parameters}) / (N - 1)]. $$

The AIC and ECVI will give the same rank order of the models being
compared.

Finally, Steiger's RMSEA is calculated by the formula

$$ \mathrm{RMSEA} = \sqrt{F_t / df_t}. $$

The suggested level for a good fitting model is an RMSEA of less than .05.
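
The formulas in this section can be applied directly to one model's output. The sketch below is illustrative only. It assumes the values reported in Table 10.3 for the original Maruyama and McGarvey model (chi-square, df, and the Function column) and assumes 13 measures, so that the number of free parameters is 91 minus the model's degrees of freedom. The RMSEA line follows the formula exactly as printed above; SEM programs commonly report a version that first subtracts the expected value of the discrepancy, so program output may differ.

```python
import math


def aic_joreskog(chi2, df):
    return chi2 - 2 * df


def aic_tanaka(chi2, n_free):
    return chi2 + 2 * n_free


def caic(chi2, df, n):
    return chi2 - (1 + math.log(n)) * df


def ecvi(chi2, n_free, n):
    return chi2 / (n - 1) + 2 * n_free / (n - 1)


def rmsea_as_printed(f_min, df):
    """Square root of the fit function per degree of freedom."""
    return math.sqrt(f_min / df)


N = 249                        # sample size, Table 10.3
CHI2, DF = 138.55, 59          # original model
F_MIN = 0.2793                 # Function column, Table 10.3
N_FREE = 13 * 14 // 2 - DF     # assumption: 13 measures, so 91 - 59 = 32

results = {
    "AIC (Joreskog)": aic_joreskog(CHI2, DF),
    "AIC (Tanaka)": aic_tanaka(CHI2, N_FREE),
    "CAIC": caic(CHI2, DF, N),
    "ECVI": ecvi(CHI2, N_FREE, N),
    "RMSEA": rmsea_as_printed(F_MIN, DF),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```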

Setting Up Nested Models

Now that a range of indexes for comparing alternative and null
models has been described, ways of setting up a series of nested
models are described so that readers can see how all these indexes are
actually used. Two logical approaches for setting up a series of models
are presented. These approaches are those of Bentler and Bonett
(1980) and James et al. (1982). The approaches are summarized in
Table 10.2.

Bentler and Bonett (1980) suggested that readers should have at
least three actual models and that two other models requiring no
model fitting also help in thinking about quality of overall fit. First,
they suggested the importance of fitting a model of full independence,
which is the null or baseline model discussed in the preceding section
of this chapter. The null model specifies the variances but fixes
all the covariances at 0 (for a critique of null models, see Sobel &
Bohrnstedt, 1985). The most recent versions of LISREL, AMOS, and
EQS automatically calculate the null model; in earlier versions, the
null model would have to be estimated as a separate model.
Second, fit the theoretical model of greatest interest. Third, fit
any alternative theoretical models. Fourth, implicitly all researchers
estimate a "saturated" model, which is just identified and fits the data
perfectly ($\chi^2 = 0$). It is the best fitting model. Finally, Bentler and
Bonett (1980) suggested a "quasi-test" or hypothetical test in which
the chi-square from the best fitting theoretical model is taken and
examined as if it had degrees of freedom 1 less than that of the null
model, namely, $[v(v - 1)/2] - 1$. If that model does not fit well, then
none of the models will fit, for they all will have larger functions and
fewer degrees of freedom. At that point, a conceptual rethinking may
be in order. Assuming that the hypothetical test is not too
discouraging, one could use the various indexes to compare the second
and third types of models against the null model and against one another.

TABLE 10.2  Approaches for Setting Up Alternative Nested Models

Bentler and Bonett (1980)
  1. Null = full independence (diagonals only)
  2. Theoretical = your model
  3. Alternative = other viable model(s)
  4. Saturated = just-identified, df = 0, perfect fit
  5. "Ideal" (hypothetical): Take the best fitting statistic from
     Model 2 or 3 above and give it df = null - 1. If it does not fit,
     then no model will, for other models will have larger functions
     and smaller degrees of freedom.

James, Mulaik, and Brett (1982)
  1. Null model (same as above)
  2. Measurement model with independent latent variables
  3. Your model (same as Model 2 above)
  4. Just-identified structural model (any lack of fit is due to the
     measurement model)
  5. Fully saturated model, df = 0 (same as Model 4 above)
  6. Similar to Model 5 of Bentler and Bonett; take degrees of freedom
     1 less than from the null model and $\chi^2$ from Model 4

  Relative normed fit index for Model 3:
    $\mathrm{RNFI} = (\chi^2_{M2} - \chi^2_{M3}) / [(\chi^2_{M2} - \chi^2_{M4}) - (df_{M3} - df_{M4})]$

  Parsimonious relative normed fit index for Model 3:
    $\mathrm{PRNFI} = [(df_{M3} - df_{M4}) / (df_{M2} - df_{M4})]\,\mathrm{RNFI}$

NOTE: RNFI = relative normed fit index; PRNFI = parsimonious RNFI.
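
The Bentler and Bonett hypothetical test described above can be sketched in code. The example assumes 13 measures (inferred from the null model's 78 degrees of freedom) and the best chi-square reported in Table 10.3 (104.51). The `chi2_sf` helper is an illustrative implementation of the upper-tail chi-square probability for integer degrees of freedom using standard finite-series expressions; it is not taken from the text, and any statistical package's chi-square function could be used instead.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(chi-square with df > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series.
    root = math.sqrt(x)
    normal_tail = math.erfc(root / math.sqrt(2))         # 2 * P(Z > sqrt(x))
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # normal pdf at sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term              # x^(r - 1/2) / (1 * 3 * ... * (2r - 1))
        term *= x / (2 * r + 1)
    return normal_tail + 2 * density * total


V = 13                # assumption: number of measures
BEST_CHI2 = 104.51    # best fitting model in Table 10.3

df_quasi = V * (V - 1) // 2 - 1   # 1 less than the null model's df
p_quasi = chi2_sf(BEST_CHI2, df_quasi)

print(df_quasi, round(p_quasi, 3))
```

How discouraging a given probability is remains a judgment call; the chapter gives no cutoff for the hypothetical test.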
James et al. (1982) suggested a similar procedure but did not
mention the hypothetical test and added a couple of additional
alternative models that are helpful. First, they began with the same
null or independence model. Second, they fit a measurement model
that specifies all latent variables as uncorrelated. One, of course,
would not want this model to fit, for a good fit would mean that the
latent variables all are independent of one another. Third is the
theoretical model or alternative models if there is more than one.
Fourth, they estimated a just-identified structural model in which
there are no degrees of freedom in the relationships among the latent
variables. In such a model, all lack of fit is due to inadequacies of the
measurement model. If one were to do a hypothetical test parallel to
Bentler and Bonett (1980), taking the fit of the just-identified
structural model with 1 less degree of freedom than the null model would
be the best test of whether or not any model could fit. Finally, they
also anchored the various models with a saturated model that fits
perfectly.
James et al. (1982) used Models 2, 3, and 4 (see Table 10.2) to
calculate an additional fit index, which they called the relative normed
fit index (RNFI). Its formula is

$$ \mathrm{RNFI} = (\chi^2_{M2} - \chi^2_{M3}) / [(\chi^2_{M2} - \chi^2_{M4}) - (df_{M3} - df_{M4})]. $$

If the difference between Models 3 and 4 ($\chi^2_{M3} - \chi^2_{M4}$) is the
same as their expected value ($df_{M3} - df_{M4}$), then RNFI is 1, for the
denominator reduces to be the same as the numerator. They also have a
parsimonious version (PRNFI):

$$ \mathrm{PRNFI} = [(df_{M3} - df_{M4}) / (df_{M2} - df_{M4})]\,\mathrm{RNFI}. $$
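
A short calculation shows how the RNFI and PRNFI behave. The sketch assumes the Table 10.3 values for the independent latent variables model (Model 2 in the James et al. scheme), the original theoretical model (Model 3), and the just-identified structural model (Model 4); the function names are illustrative.

```python
def rnfi(chi2_m2, chi2_m3, df_m3, chi2_m4, df_m4):
    """Relative normed fit index for Model 3."""
    numerator = chi2_m2 - chi2_m3
    denominator = (chi2_m2 - chi2_m4) - (df_m3 - df_m4)
    return numerator / denominator


def prnfi(rnfi_value, df_m2, df_m3, df_m4):
    """Parsimonious RNFI: RNFI weighted by a ratio of df differences."""
    return (df_m3 - df_m4) / (df_m2 - df_m4) * rnfi_value


# (chi-square, df) pairs from Table 10.3.
CHI2_M2, DF_M2 = 256.10, 65   # independent latent variables
CHI2_M3, DF_M3 = 138.55, 59   # original theoretical model
CHI2_M4, DF_M4 = 123.74, 55   # just-identified structural model

rnfi_value = rnfi(CHI2_M2, CHI2_M3, DF_M3, CHI2_M4, DF_M4)
prnfi_value = prnfi(rnfi_value, DF_M2, DF_M3, DF_M4)

# If Model 3's chi-square exceeded Model 4's by exactly the df difference,
# the RNFI would equal 1 (hypothetical chi-square used here).
rnfi_at_expectation = rnfi(CHI2_M2, CHI2_M4 + (DF_M3 - DF_M4),
                           DF_M3, CHI2_M4, DF_M4)

print(round(rnfi_value, 3), round(prnfi_value, 3), round(rnfi_at_expectation, 3))
```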

To summarize, either of the recommended approaches for setting
up a series of nested models is worthwhile. They are basically similar,
and there is no reason not to use the best elements of each. It seems
to me that the just-identified and independent structural models are
particularly important, for they provide two valuable anchors for
assessing how much different structural paths could affect the overall
fit. Using hypothetical tests is also worthwhile, for they might quickly
send researchers "back to the drawing board."

Why Models May Not Fit

Cudeck and Henly (1991) provided one framework in which to think
about why models may fail to match observed data. They described
four types of discrepancy, each of which could be used to select from
among competing models. Two of them warrant discussion in the
context of fit indexes. First is the discrepancy of the sample, which is
the minimized difference between the sample covariance matrix and
the restricted population covariance matrix for the model. This is
what is produced by the minimization function. By contrast, the
second type, overall discrepancy, is the discrepancy between the
underlying population covariance matrix and the estimated
covariance matrix for the model. To get a good estimate of this type of
discrepancy, Cudeck and Henly recommended the single-sample
cross-validation index, the ECVI (e.g., Browne & Cudeck, 1989).

Cudeck and Henly (1991) also noted that models with many
parameters to estimate will fit the best provided the sample size is
large (and, of course, a large sample size is highly desirable). In large
samples, therefore, it is important not to rely only on overall
discrepancy and Type 2 indexes; one should use other information as
well. Type 3 indexes may help handle model complexity, and
cross-validation is useful as well. Finally, Cudeck and Henly went
further and noted that the best-fitting model may not be the model
closest to the true model, which ought to remind readers that there is
much more to SEM than assembling a large-sample complex model that
"fits." Their perspective is supported by a discussion in Chapter 12 of
post hoc model modification.

Illustrating Fit Tests

At this point, the Maruyama and McGarvey (1980) data set is used
to illustrate the different indexes. Five models are compared: a null
model, a structural model with independent latent variables, a
just-identified structural model, the original model presented by
Maruyama and McGarvey, and a modified model that adds residual
covariances among five pairs of residuals of measures. Those residual
covariances are between father's and mother's ratings; between the
Peabody ability measure and the standardized reading achievement
measure; and three paths interrelating residuals among verbal grades,
schoolwork popularity, and teachers' evaluations (the original model
appeared in Figure 9.2).

Results of the various indexes appear in Table 10.3. The top
section of the table provides the "absolute" tests that are available in
the LISREL program. The second section of the table contains
information about the "relative" tests that are widely used; in all
indexes, the null model is used as a baseline. I have calculated these
indexes by hand even though many programs provide them as part of the
program output. Readers have available in Table 10.3 all the
information needed (i.e., the $\chi^2$ statistics, the degrees of freedom
of the various models, and the minimum value of the function from
maximum likelihood estimation) to calculate all the indexes. The third
section of Table 10.3 provides information comparing different
models with one another. Finally, the bottom section of the table
provides the different indexes comparing relative fit for the different
models. This last section would provide valuable information even if
the models were not nested.

TABLE 10.3  Illustration of Fit Indexes for Maruyama and McGarvey
            (1980) Data (N = 249)

Description of models

  M0  NULL        Models only diagonal elements of covariance matrix,
                  MODEL 0
  MA  INDEP.LVs   Latent variables unrelated to one another, MODEL A
  MB  JUST.ID     Just-identified structural model, MODEL B
  M1  M&MCPAPER   As in Maruyama and McGarvey article, MODEL 1
  M2  MOD.M&MC    Modified Maruyama and McGarvey model, residual
                  covariances, MODEL 2

(a) Basic Output Provided by All Versions of the LISREL Program (all
    are absolute indexes; the $\chi^2$/df ratio was calculated by hand)

  Model       $\chi^2$   df   $\chi^2$/df   Function   GFI    AGFI   RMR
  NULL        738.94     78   9.4736        1.4898     .627   .565   .206
  INDEP.LVs   256.10     65   3.9400        0.5163     .851   .792   .147
  JUST.ID     123.74     55   2.2498        0.2495     .927   .880   .067
  M&MCPAPER   138.55     59   2.3483        0.2793     .920   .876   .079
  MOD.M&MC    104.51     54   1.9354        0.2107     .940   .898   .069

(b) Additional Indexes (all except the first are relative indexes)

  Model       PGFI   NFI    PNFI   TLI    IFI    PNFI2 (PIFI)   RNI
  NULL        .537
  INDEP.LVs   .608   .653   .544   .618   .716   .597           .711
  JUST.ID     .560   .833   .587   .853   .899   .634           .896
  M&MCPAPER   .596   .813   .615   .841   .883   .668           .880
  MOD.M&MC    .558   .859   .595   .890   .926   .641           .924

  Bentler and Bonett pseudo-chi-square (best fit with most degrees of
  freedom): $\chi^2(77) = 104.51$

(c) Comparisons of Different Models

  Comparison   $\chi^2$ Difference   df   NFI    PNFI   NNFI
  M0-M1        600.39                19   .813   .615   .841
  M0-M2        634.43                24   .859   .595   .890
  M1-M2        34.04                 5    .046          .049

  RNFI (compares Model 1 to Models A and B)
    = $(\chi^2_A - \chi^2_1) / [(\chi^2_A - \chi^2_B) - (df_1 - df_B)]$ = .916
  PRNFI = .366

(d) Tests Applicable to Non-Nested Models

  Model       AIC (Jöreskog)   AIC (Tanaka)   CAIC      ECVI    RMSEA
  NULL        582.9            764.94         230.58    3.085   0.138
  INDEP.LVs   126.1            308.1          -167.53   1.243   0.089
  JUST.ID     13.74            195.74         -234.72   0.789   0.067
  M&MCPAPER   20.55            202.55         -245.98   0.817   0.069
  MOD.M&MC    -3.50            178.51         -247.43   0.719   0.062

NOTE: GFI = goodness of fit index; AGFI = adjusted GFI; RMR = root mean
residual; PGFI = parsimonious GFI; NFI = normed fit index; PNFI =
parsimonious NFI; TLI = Tucker-Lewis index; IFI = incremental fit index;
PIFI = parsimonious IFI; RNI = relative noncentrality index; NNFI =
non-normed fit index; RNFI = relative NFI; PRNFI = parsimonious RNFI;
AIC = Akaike information criteria.

Assume now that one has available this array of indexes and
statistics. How can they be interpreted? It is the case that calculating
the indexes is the easy part and is done in current versions of most
SEM programs. Unfortunately, there is little guidance other than
general rules of thumb about overall fit (e.g., NFI > .90,
RMSEA < .05), so at this point personal judgment comes into the
picture. Also, interpretation is complicated by the fact that, in
addition to overall fit, there is the issue of particular paths and their
significance. If a model is built to examine a particular relationship
of interest, then it could be that plausibility of that relationship can
be assessed even without an overall fit that measures up to the usual
requirements of a "good" fit.

For the data and models presented in Table 10.3, we might draw
the following interpretations. First, as would be expected, the various
models do much better than the null model. At the same time, the fit
of the null model is not all that bad (RMR = .206), so many of the
relationships across measures must be pretty modest. Second, and
fortunately, it is not plausible (from the independent latent variables
model) to assume that the latent variables are unrelated to one
another. Third, the array of different indexes all seem to suggest that
although the original hypothesized model does not fit the data poorly,
the model could be improved on. As will be described in the following
chapter, current versions of SEM programs provide direction about
how best to improve the fit of a model by freeing fixed parameters;
the information is provided by what are called modification indexes.
Rather than attempting to improve on the model by relying on
modification indexes or some other nontheoretical means, however,
this model's attempts at improvement are focused on measures that
might be expected to share additional sources of common variability
beyond the specified factor structure.[19] Those efforts seem to have
been somewhat successful, for the modified model fits better (e.g.,
the chi-square of the difference, which assesses improvement gained
by the modifications, is $\chi^2(5) = 34.04$). At the same time, however,
the parsimonious indexes do not improve, so there are arguments for
going back to the original model. The failure of the parsimonious
indexes to improve probably occurs because not all of the residuals
specified were significant, so they in aggregate did not improve the
fit enough to offset the loss of degrees of freedom.

  19. Readers might want to keep in mind a couple of points in
  reviewing this example. First, the model was intended primarily to
  illustrate the methods, and we felt that using a nonrecursive model
  would enhance the illustration. Thus, certain paths cannot be freed,
  for they would lead to identification problems. Second, the versions
  of LISREL available when the article was prepared did not readily
  allow residual covariances for measures; the residuals would have had
  to be modeled as separate latent variables because the theta matrices
  had to be diagonal. Unnecessarily complicating the illustration would
  not have served its purpose.

The final set of indexes provides a fairly consistent ordering from
worst to best. The only variability is between the theoretical and
just-identified models, and it reflects slightly different weighting
between fit and parsimony. In this case, the indexes yield conclusions
generally consistent with the ones in the upper parts of Table 10.3.

On balance, this illustration probably is much like many actual
studies in not being straightforward or simple to interpret. First, on
the plus side, in general the model seems plausible, for the general
factor structure seems reasonable, there were no out-of-range
estimates, the different indexes seem generally consistent with one
another, and the fit, although less than ideal, is not bad. Yet, on the
negative side, because the fit is less than perfect, there could be ways
in which to improve the model. At the same time, it is not clear how
much the central theoretical issue, the significance and magnitude of
the paths from acceptance to achievement and vice versa, might be
changed by model modification, so "model improvement" may not
really be an improvement. Furthermore, modifications might not
improve fit when judged by the parsimony criterion.
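
The ordering of models by the non-nested indexes can be made explicit with a small ranking routine. This sketch assumes the values reported in section (d) of Table 10.3, for which lower values indicate better fit; the dictionary layout and function name are illustrative.

```python
# Values as reported in Table 10.3, section (d); lower is better.
TABLE_D = {
    "NULL":      {"AIC (Tanaka)": 764.94, "CAIC": 230.58,  "ECVI": 3.085, "RMSEA": 0.138},
    "INDEP.LVs": {"AIC (Tanaka)": 308.10, "CAIC": -167.53, "ECVI": 1.243, "RMSEA": 0.089},
    "JUST.ID":   {"AIC (Tanaka)": 195.74, "CAIC": -234.72, "ECVI": 0.789, "RMSEA": 0.067},
    "M&MCPAPER": {"AIC (Tanaka)": 202.55, "CAIC": -245.98, "ECVI": 0.817, "RMSEA": 0.069},
    "MOD.M&MC":  {"AIC (Tanaka)": 178.51, "CAIC": -247.43, "ECVI": 0.719, "RMSEA": 0.062},
}


def rank_models(index_name):
    """Return model labels ordered from best (lowest value) to worst."""
    return sorted(TABLE_D, key=lambda model: TABLE_D[model][index_name])


rankings = {name: rank_models(name)
            for name in ("AIC (Tanaka)", "CAIC", "ECVI", "RMSEA")}
for name, order in rankings.items():
    print(f"{name}: {' < '.join(order)}")
```

In this data set the CAIC, which weights degrees of freedom more heavily, is the only index that places the original theoretical model ahead of the just-identified model.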
In summary, a number of different fit indexes are available. When
a sampling of different types of indexes is examined (see, e.g., Hoyle
& Panter, 1995), researchers should have available enough
information to feel fairly confident that they understand how well their
models fit. Yet, as can be seen from the example, the indexes do not
necessarily provide clear guidance about plausibility of models, for
the information may be somewhat ambiguous.

To summarize, at this point, readers have been exposed to all the
basic tools they need to be "intelligent" users of structural equation
techniques. This knowledge begins with logic of partialing and path
analysis, and it adds to the blend an understanding of problems of
regression approaches, of issues related to method variance, of the
logic of longitudinal or panel analysis, and of the logic of factor
analysis. Together, these ingredients should give readers the skills
needed to design latent variable structural equation models. Effective
users of latent variable SEM techniques couple these skills with
theoretical knowledge that generates the models to be tested using
SEM techniques.

EXERCISE 10.1  Calculating Fit Indexes

Using the chi-square, the function, and the degrees of freedom,
calculate the following indexes and check them against
what is reported: normed fit index, incremental fit index,
Tucker-Lewis index, relative noncentrality index, and the
parsimonious versions of goodness of fit index, normed fit
index, and incremental fit index.

Variations on the Basic Model

As noted at the end of Chapter 10, readers at this
point should have a technical understanding of how to use latent
variable structural equation techniques. A set of basic issues has been
covered. Those issues include the roots and fundamentals of
decomposition of covariances or correlations into causal and noncausal
effects for any particular structural equation model, value and
shortcomings of regression techniques with respect to path models,
complications of models due to random and nonrandom errors including
method variance, the logic of reciprocal causation and/or lagged
effects, the value of factor analytic logic for path modeling, and
general issues related to the use of latent variable structural equation
models.

If readers have acquired a reasonably good understanding of the issues
covered, then they should be ready to use the techniques effectively
and, perhaps more important, to read the structural equation modeling
(SEM) literature and understand it. In other words, hopefully they
are ready to work with real data and theoretical models of their own
making rather than my making. Structurally, this chapter introduces
three variations on SEM approaches.
First, in many instances researchers want to compare structural
models in different populations. For example, in the examples of
achievement processes presented in Chapter 9, it would have been
nice to be able to directly compare achievement processes of white
and minority students (e.g., white vs. African American vs. Mexican
American). The initial section of this chapter discusses options
available in SEM programs for models in which data are collected from
multiple populations. Although there have not been as many studies
testing comparability of fit across several populations as one might
expect, possible uses are many. As examples in addition to analyses
of different racial/ethnic groups, one also might want to compare
antecedents of sexual behaviors for males with those for females,
compare the health behaviors of college graduates with those of
nongraduates or the health behaviors of an at-risk group with those
of a group not at risk, or compare attitude-behavior links for young
children with those same links for older children. In all of these
instances, the different samples could, of course, be fitted separately
to a single model.
SEM programs have options that allow simultaneously estimating a
single solution across a number of samples. The solution can estimate
each sample separately or impose constraints across samples that
force parts of the model to be fitted to a single solution. By comparing
fits of different solutions, researchers are able to draw additional
inferences about overall model comparability.
Second, this chapter illustrates how SEM approaches can be used to
model second-order factor structures (see, e.g., Liang & Bollen,
1983; Marsh & Hocevar, 1985; Rindskopf & Rose, 1988). Such
models hypothesize that the latent variables share common variance
due to one or more higher- (second-) order factors that lead to
common variance. Perhaps the most widely used example of a
second-order factor is general intelligence, or G, which presumably leads
to commonalities across different ability variables. Although using
SEM programs to set up higher-order factor models is not particularly
difficult, there have not been many applications in the literature.
Third, an approach called "all-Y" models is described. Although a
functional reason for presenting this approach is so that persons with
access to matrix SEM programs can allow residuals of exogenous
variables (X's) to covary with indicators of endogenous variables
(Y's), it is of conceptual importance because it presents the
underlying matrix model used to solve models with X-Y covariances for all
programs. For equation-based SEM programs, this discussion can
enrich readers' understanding of how the programs work but is
irrelevant to using those programs. The approach requires setting up
all variables as endogenous variables (i.e., exogenous variables are
modeled along with endogenous variables). For example, in the
Maruyama and McGarvey (1980) article discussed in Chapter 10, a
reasonable argument can be made for allowing the residual of the
Peabody ability test (PEA) to covary with the residual of the
standardized verbal achievement (VACH) test. But PEA is an X variable,
whereas VACH is a Y variable, and in many computer programs
residuals of measures of X variables are not allowed to covary with
residuals of Y measures, for the two measures are located in different
matrices. For such problems, only by combining X and Y variables can
their residuals covary. (Note that in Appendix 9.3, the Maruyama and
McGarvey LISREL control statements are set up as an all-Y model.)[20]

  20. This approach soon should become unneeded for matrix approaches.
  For example, it has been made moot in LISREL 8 by the addition of an
  expanded residual matrix, but it still is needed by persons using
  earlier versions of LISREL, including LISREL 7.

Analyzing Structural Equation Models
When Multiple Populations Are Available

Overview of Methods

One of the most common opportunities, yet one not used all that
frequently thus far in the social science literature, is to compare
structural models from different groups. This procedure involves the
same logic as is used for comparing the magnitude of a relationship
between two variables in different samples. For example, we might
be interested in asking about the relationship between time spent at
a computer and interest in mathematics for boys versus girls or in
comparing that same relation for students in the United States versus
students in some other country. The comparisons for single
relationships are straightforward; they look at the covariances found in
the two different groups. The approaches described in this section
simply extend the logic of that comparison to greater numbers of
relationships and more complex structural models.

For the discussion of multiple groups, even though the goal is to
compare models for different populations, in fact what are being
compared are models fitted to samples from different populations.
Thus, the language used in this chapter talks about comparing
"samples" with the understanding that the samples discussed come from
different groups or populations.
An important point is that in multiple-sample comparisons, working
with correlations is wrong; correlations yield accurate findings
only when the variances in the different samples being compared are
identical. When variances differ, the comparison needs to be between
covariances (or nonstandardized regression coefficients) in the
different samples. Multiple-sample comparisons always should work with
covariance matrices, for only such matrices can deal adequately with
differences in variability across samples. An additional complication
can come in setting the metric or scale of latent variables when
working with multiple samples (see, e.g., Williams & Thomson,
1986).
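The point about correlations can be checked with a small numeric sketch. The data below are invented for illustration, and the helper names (`cov`, `corr`, `slope`) are hypothetical. Both samples are built so that the nonstandardized slope of y on x is the same, but x is twice as spread out in the second sample, so the correlations differ even though the raw-score relationship does not.

```python
# Illustrative sketch with made-up data: two samples share the same
# nonstandardized slope, yet their correlations differ because the
# variance of x differs across samples.

def cov(xs, ys):
    """Sample covariance (n - 1 in the denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def corr(xs, ys):
    """Pearson correlation computed from covariances."""
    return cov(xs, ys) / (cov(xs, xs) ** 0.5 * cov(ys, ys) ** 0.5)

def slope(xs, ys):
    """Nonstandardized regression slope of y on x."""
    return cov(xs, ys) / cov(xs, xs)

# Sample A: y = 0.5 * x + noise, with noise uncorrelated with x.
x_a = [1, 2, 3, 4, 5, 6]
y_a = [1.5, -1.0, 2.5, 3.0, 0.5, 4.0]

# Sample B: same slope and same noise, but x is twice as variable.
x_b = [2, 4, 6, 8, 10, 12]
y_b = [2.0, 0.0, 4.0, 5.0, 3.0, 7.0]

slope_a, slope_b = slope(x_a, y_a), slope(x_b, y_b)
corr_a, corr_b = corr(x_a, y_a), corr(x_b, y_b)

print("slopes:", slope_a, slope_b)
print("correlations:", round(corr_a, 3), round(corr_b, 3))
```

Comparing the two correlations would suggest that the relationship is stronger in Sample B, whereas the slopes, which are computed from elements of each sample's covariance matrix, show the same raw-score relationship in both.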
Researchers new to SEM frequently carry with them ways of
thinking about variables such as gender or ethnicity drawn from
analysis of variance and regression approaches, namely, thinking of
them as dummy-coded variables to put into their models as additional
variables. It certainly is possible to construct models with dummy
variables for gender or ethnicity, for differences between means
would show up as covariances between the dummy variables and
other variables in the models.
Capturing mean differences, however, is not the same as asking
about similarity of processes. The latter questions would remove
mean differences and compare magnitudes of specific relationships
across groups, as was presented earlier in the example of time spent
at a computer and interest in mathematics. The issue is not whether
or not boys spend more time at computers than do girls, which comes
from a mean comparison. If boys spent more time at computers than
did girls, then the dummy variable of gender would correlate with
the time spent at computers variable and in a model there could be a
path between them. By contrast, the cross-sample comparison asks
whether boys who are some degree above the boys' mean in time
spent at computers display an interest in mathematics that reflects
their more frequent use of computers and whether or not girls who
are comparably above the girls' mean display a comparably higher
interest in mathematics.
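The distinction between a mean comparison and a process comparison can be sketched with invented numbers. In this hypothetical illustration (all data and function names are assumptions, not taken from any study), the mean difference in computer time answers the dummy-variable question, while the within-group slopes, computed from deviations around each group's own mean, answer the cross-sample question.

```python
# Illustrative sketch with made-up data: a mean difference between groups
# is a different question from whether the within-group relationship
# (time at computer -> interest in mathematics) is the same in each group.

def mean(values):
    return sum(values) / len(values)

def within_group_slope(x, y):
    """Slope of y on x using deviations from the group's own means."""
    mean_x, mean_y = mean(x), mean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    return sum_xy / sum_xx

boys_time = [4, 5, 6, 7, 8]
boys_interest = [5, 5, 7, 8, 10]
girls_time = [2, 3, 4, 5, 6]
girls_interest = [6, 6, 7, 7, 9]

# Mean comparison: what a gender dummy variable would pick up.
mean_difference = mean(boys_time) - mean(girls_time)

# Process comparison: relationship within each group, means removed.
boys_slope = within_group_slope(boys_time, boys_interest)
girls_slope = within_group_slope(girls_time, girls_interest)

print("mean difference in time:", mean_difference)
print("within-group slopes:", boys_slope, girls_slope)
```

Here the groups differ in mean time at the computer and also in the size of the within-group relationship; either difference could be present without the other.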
Furthermore, from a methodological perspective, using dummy-coded
variables seems risky. First, the dichotomous dummy variables
may lead to problems in meeting assumptions of multivariate
normality. Second, issues of collinearity can be particularly
bothersome. Imagine, for example, a model in which every variable is
related to gender. Sorting out causal influences is particularly
difficult in such a model. Apparent effects of gender (significant
paths from gender to other variables) can vary greatly from sample to
sample, due only to fluctuations in the size of relationships that
would occur by chance. In summary, if a sufficient sample size is
available, then modeling groups as multiple populations is a superior
alternative to dummy coding.
Comparing Processes Across Samples
Comparing models across samples provides information that allows
researchers to talk about comparability of causal processes in different
populations. The focus on processes means attention directed toward
relationships, namely, covariance structure comparisons. As an aside,
latent variable SEM approaches can be used to compare means as well
as covariance structures. Therefore, it is possible to talk about
comparability of levels as well as of relationships or processes (e.g.,
Sorbom, 1974). This text, however, does not focus on the mean
comparison issues, for they require introducing a number of new
issues, for example, generating for input into the computer programs
what is called an augmented moment matrix rather than a covariance
or correlation matrix (interested readers can see, e.g., Browne &
Arminger, 1995; Byrne, Shavelson, & Muthen, 1989; Sorbom, 1974,
1982).
The most basic way in which to compare solutions across samples
is to fit the exact same model with data from different samples and
to compare the goodness of fit and the model parameter estimates.
Each data set is fitted separately to the model. This solution can be
valuable, for comparability of models can be assessed in different
ways. First, all nonsignificant paths could be dropped from
the model and "trimmed" models containing only remaining paths
compared across the samples to see whether the basic processes seem
to be the same. Second, for all paths, confidence intervals could be
calculated and compared, and whether or not the confidence intervals
for each path overlap in the different samples could be assessed.
Where confidence intervals fail to overlap, the models differ.
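The confidence interval comparison can be sketched as follows. This is only a rough screening device under stated assumptions: the path names, estimates, and standard errors are invented, the intervals use the usual large-sample approximation of the estimate plus or minus 1.96 standard errors, and the function names are hypothetical.

```python
# Illustrative sketch: compare approximate 95% confidence intervals for
# the same path estimated separately in two samples. All numbers are
# made up for illustration.

def confidence_interval(estimate, std_error, z=1.96):
    """Return (lower, upper) bounds of an approximate 95% interval."""
    half_width = z * std_error
    return (estimate - half_width, estimate + half_width)

def intervals_overlap(ci_a, ci_b):
    """True when two closed intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Hypothetical nonstandardized path estimates as (estimate, standard error).
paths = {
    "time_at_computer -> math_interest": {
        "sample_1": (0.40, 0.05),
        "sample_2": (0.60, 0.04),
    },
    "ability -> achievement": {
        "sample_1": (0.40, 0.05),
        "sample_2": (0.45, 0.05),
    },
}

overlap_by_path = {}
for name, groups in paths.items():
    ci_1 = confidence_interval(*groups["sample_1"])
    ci_2 = confidence_interval(*groups["sample_2"])
    overlap_by_path[name] = intervals_overlap(ci_1, ci_2)
    print(name, "overlap:", overlap_by_path[name])
```

In this invented example the intervals for the first path do not overlap, which would flag that path as differing across samples, whereas the intervals for the second path do overlap.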
Although the two approaches just described can be valuable, they
also are limited. First, both focus only on comparing individual
parameter/path estimates and do not provide a direct comparison of
goodness of fit for the different samples. Overall model fitting
statistics could be compared only if the sample sizes are identical, and
even then differences in fit between samples may not be due to
conceptually interesting issues. In other words, the comparison of fit
is imprecise, both in its focus and in its capability of detecting
conceptually important differences. Second, even when the
comparisons attempt to focus on parameters of greatest conceptual
interest, the individual tests are tests of significant differences
between estimates, rather than of equality of estimates, so finding
estimates not to be significantly different does not mean that they are
the same.
Many SEM programs have as an option the capacity to analyze
more than one sample simultaneously. In its simplest form, this allows
an overall fit test of two or more separately estimated samples fitted
to a single theoretical model. Because the chi-square goodness of fit
test automatically is weighted by the sample size, the overall fit will
reflect the different sample sizes. That is, the overall fit is a weighted
sum of the fit statistics of the different samples. Said differently, it is
the sum of the individual fit statistics, which means that it is not really
very different from estimating each sample separately.
The SEM computer programs set up for multiple samples, however,
also can allow researchers to force the different samples to be
fitted to a single solution. The solution can force estimates of various
parameters to be the same across samples (in the language of SEM,
constrained to be equal), so a single estimate is generated for each
one of any number of specified parameters. That estimate maximizes
fit (or minimizes discrepancies) across all the samples simultaneously.
As an illustration, look back at Figure 9.2, imagining that there are
available two different samples (white and African American student
samples). We could decide to constrain the paths between
achievement and acceptance by peers to be equal in the two
groups. Equality constraints could be imposed on any part of the
model, including the relationships of latent variables to observed
measures, the residuals for the observed measures, and the
relationships among any of the latent variables. Then the fit of the
solution with constraints could be compared with the fit of a solution
that allowed the parameters to be estimated separately for each group.
Because actually setting up the cross-sample constraints and
running a multisample analysis is tied to the specific SEM program
used, this discussion focuses on the logic of the methods.
Illustrations of multiple-sample comparisons can be found in manuals
for most SEM programs including LISREL (Joreskog & Sorbom, 1988),
AMOS (Arbuckle, 1997), and EQS (e.g., Dunn, Everitt, & Pickles,
1993).
Testing Plausibility of Constraints

As parameters are constrained, additional degrees of freedom are
obtained. The critical question is whether or not the fit of the model
to the data gets worse as the constraints and degrees of freedom are
added. The basic fit statistic cannot get better as degrees of freedom
are added and would stay the same only if the estimates were identical
in the different samples. If the chi-square value stayed the same as
more degrees of freedom were added, then other fit indexes would,
of course, improve (e.g., $\chi^2/df$). If the overall chi-square increased
substantially, then the estimates from the constrained model would
not fit as well as those from the unconstrained model. In other words,
the samples must differ in terms of the parameters being constrained.
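The comparison of constrained and unconstrained solutions is commonly carried out as a chi-square difference test. The sketch below assumes invented fit statistics for two groups and four equality constraints; the function name is hypothetical, and the small table holds standard upper-tail critical values of chi-square at the .05 level. It also shows the overall unconstrained chi-square as the sum of the separate group statistics.

```python
# Illustrative sketch: chi-square difference test for equality constraints
# across samples. All fit statistics are made up for illustration.

# Upper-tail critical values of chi-square at alpha = .05, keyed by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def chi_square_difference(chi2_free, df_free, chi2_constrained, df_constrained):
    """Return the increase in chi-square and in degrees of freedom."""
    return chi2_constrained - chi2_free, df_constrained - df_free

# Unconstrained multisample fit: the sum of the separate group statistics.
group_fits = [(40.1, 30), (45.1, 30)]  # (chi-square, df) for each sample
chi2_free = sum(chi2 for chi2, _ in group_fits)
df_free = sum(df for _, df in group_fits)

# Fit after constraining four parameters to be equal across samples.
chi2_constrained, df_constrained = 97.9, 64

delta_chi2, delta_df = chi_square_difference(
    chi2_free, df_free, chi2_constrained, df_constrained
)
constraints_rejected = delta_chi2 > CRITICAL_05[delta_df]

print("chi2/df unconstrained:", round(chi2_free / df_free, 2))
print("chi2/df constrained:", round(chi2_constrained / df_constrained, 2))
print("difference:", round(delta_chi2, 1), "on", delta_df, "df")
print("equality constraints rejected at .05:", constraints_rejected)
```

With these invented numbers the chi-square rises by more than the critical value for four degrees of freedom, so the hypothesis that the constrained parameters are equal across samples would be rejected.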

Constraints in the Measurement Model

To compare variables and their relationships across samples, a
researcher may decide that he or she needs the relationships between
measures and the underlying latent variables they assess to be identical
across the samples. That is, by forcing the relationships of measures
with variables to be equal, the same latent variables in principle
should be assessed in each sample. The equality constraints may help
ensure that, by equalizing the loading for each indicator and therefore
the relative size of the loadings of the indicators of any particular
latent variable, each latent variable will be the same for all samples.
In terms of sequencing, constraints in the measurement model may
precede other constraints, for they address the issue of comparability
of the theoretical variables, irrespective of the relationships among
those theoretical variables.
Speaking practically, the free parameters relating observed
measures to their underlying constructs (in LISREL, all elements of the
lambda matrices) would be constrained to be equal across the samples
(analogous to tau equivalent tests). In Figure 9.2, these equality
constraints would be imposed on the lambda coefficients so that, for
example, the estimate for the $\lambda$ linking SEI to SES would be the
same in the white and African American samples. Similar constraints
would be imposed on all other lambda coefficients.


The constraints also could be extended to the residuals (in LISREL,
the theta matrices) if the researcher thought that the total variance
would be the same across samples. That variance would then be
divided in the same way in each sample into reliable and error
variance (analogous to parallel tests). Once again returning to
Figure 9.2, the focus is on the deltas and epsilons.
Paralleling the preceding discussion, the estimate for $\delta_1$, for
example, in the white sample would be constrained to be equal to
$\delta_1$ in the African American sample. If the variances of the
indicators are not equal, however, then constraining the residuals as
well as the relations of indicators to latent variables will result in a
poorer fit of the data to the model.

Constraints in the Structural Model

Researchers also could decide to constrain relationships within the
structural (latent variable) part of the model. These constraints focus
on similarity of hypothesized causal processes, examining whether or
not the latent variables displayed the same relationships across
samples. Returning to Figure 9.2, constraining the paths between
acceptance by peers and school achievement would allow us to compare
those relationships for white and African American students. Before
imposing constraints in the structural model, however, researchers
need to be confident that the measurement models are ensuring that
the theoretical variables are the same in the different samples.

When and How to Impose Equality Constraints

Introducing notions about constraints across time and samples
draws attention to issues tied to exact versus conceptual replication.
Although exact versus conceptual replication issues generally are
discussed in the context of experimental work, they are critically
important here as well. Exact replication refers to situations in which
a theoretical variable is operationalized in exactly the same way in
different instances. Conceptual replication refers to situations in
which a theoretical variable is operationalized in different ways in
different instances but in which researchers employ some type of
validation process to demonstrate that the variable being tapped is the
same one.


Issues of exact and conceptual replication in SEM approaches are
most prominent in situations where the same measures are being
assessed either across time or across samples. Stated most directly, the
central issue is how to ensure that the variables are defined the same,
either across time or across samples. For example, if we are working
with self-concept, how can we be sure that what is being called
self-concept at Time 1 or in Sample 1 is the same as what is being
called self-concept at Time 2 or in Sample 2?
The primary challenge in ensuring comparability is to determine
whether comparability can be better created by trying to produce
exact replication or conceptual replication. Exact replication can be
done by forcing the loadings across time or samples to be the same
via constraints. So, for example, in Figure 11.1 (which is set up to
parallel Figure 7.1 but assesses a single variable across time), we could
force the following equalities: $p_1 = p_4 = p_7$, $p_2 = p_5 = p_8$,
$p_3 = p_6 = p_9$, thereby ensuring that the indicators remain
proportionate across time. For multiple samples, imagine that
Figure 11.1 has two different samples. In such a case, the parallel
issue would be as follows: Should $p_1$ in Sample 1 be exactly the same
as $p_1$ (which could be called $p_1'$) in Sample 2, that is,
$p_1 = p_1'$? The same issue would hold for $p_2$, $p_3$, and so on.
In complex models, constraints may need to hold both across time and
across samples.

Like many latent variable SEM issues, however, forcing equality
constraints is the right answer only in context; there are
circumstances in which forcing exact replication may result in a
different conceptual variable being assessed due to the nature of the
processes being modeled. For example, in many developmental processes,
the dimensionality of constructs may be increasing. In such
circumstances, weights representing the best relations of measures to a
theoretical variable at one point in time may be a suboptimal
weighting of what might be even more than one theoretical variable at a
later point in time.
As an illustration, imagine that we are attempting to model a
theoretical variable called "academic achievement" in a sample of
elementary school-aged children, using as our measures mathematical
and verbal achievement test performance. (This discussion parallels
an earlier discussion of achievement in panel models in which the
comparison is across time rather than across samples.) Imagine
further that our study compares processes of children in 1st grade (at
which point children have had little, if any, formal instruction in
mathematics) with processes of children in 6th grade. If we chose to
force the respective weights of verbal and math performance
measures to stay the same across the two age groups, then we would
likely be taking an indefensible position. As students get older, the
results of instruction should be that complexity and diversity in
mathematics skills are increasing. (Changes in variability of
mathematics skills would not by themselves create a problem, for they
should be handled by analyzing a covariance matrix.) Forcing the
loadings to be equal across groups has risks. First, we may never tap
much of the mathematics achievement domain. Second, if the
dimensionality is increasing, then the model may well be misspecified
for either or both of the groups and the findings may be meaningless.

Figure 11.1. Illustration of Equality Constraints for Longitudinal Model
(panels: Time 1, Time 2, Time 3)
The primary point here is that even though exact replication may
seem better in many instances, there are times when only through
allowing change can conceptual replication be attained. Letting
loadings vary across time may be preferable to forcing them to stay the
same. Most important, the decision on constraining needs to be
driven by theory, not methodological elegance.
Regardless of the decision about constraints, successfully
addressing issues of comparability across time and samples is critical
for SEM approaches. It is very difficult to speak about comparability of
causal processes if the latent variables being compared differ from one
another in different samples. Put simply, if the latent variables are
different, then the processes being compared cannot be the same.


In summary, then, SEM approaches can be used to compare
structural relationships among data collected across two or more
groups. By imposing different types of constraints, various
assumptions about the relationships can be tested and comparability of
latent variables in different samples can be increased. As has been
true of every approach described throughout this book, decisions, in
this case to analyze data as multiple-group data and to impose
restrictive constraints, must be driven by theoretical considerations.
Deciding to force relationships between measures and variables to be
equal across samples may be necessary in some situations yet wrong in
others. In one case, it may be that, without restrictions, the latent
variables would be defined so differently in the various samples that
the constructs would not be the same. Yet, in another case, it may be
that certain constructs display themselves differently in various
samples, with the result that constraining loadings to be equal would
not allow the same constructs to emerge in different samples. Only from
a theoretical basis could these decisions be made with any precision.
In conclusion, multiple-sample comparisons offer a valuable
extension of basic latent variable SEM approaches. Yet with the
extension comes greater complexity; for multiple-sample comparisons,
there are important issues that will have to be resolved a priori about
the most likely way in which to ensure comparability of processes
across samples.

Second-Order Factor Models

As suggested in the introduction to this chapter, second-order factors
are factors of factors. The example mentioned, the G factor in ability,
posits a higher-order general ability dimension that draws from more
specific ability competencies. If those specific competencies are
defined as factors, then G is a factor defined by those factors or a
second-order factor. Each of the more specific ability competencies
is a factor that can be assessed through a measurement procedure. By
contrast, the general ability dimension is assessed through the specific
factors rather than through any measures. If it were directly measured
and had its own indicators, then the model would not be a
second-order factor model.
Marsh and Hocevar (1985) provided an illustration of second-order
factor models for modeling self-concept. They contrasted a
first-order factor model for seven self-concept domains (physical
ability, appearance, relations with peers, relations with parents,
reading, mathematics, and general school self-image) with different
second-order models. For example, one model separated academic
self-concept from all others as two second-order factors.
Extracting second-order factors can be tricky, for those factors
are defined by unmeasured variables whose definitions may be
questionable or controversial. Put simply, as one gets to the level of
extracting unobservables from unobservables, there is greater
potential for error and disagreement. This potential exists even when
theorizing allows clear articulation of the processes involved that
warrant specification as second-order factors.
The first question for SEM approaches to second-order factoring
is a technical one: Is it even possible to model second-order factors?
The answer is yes, for the model can be adapted to second-order
factoring provided conditions for identification are met. In general,
identification of those models parallels identification of factor models
in which factors correlate; it is the correlations among the factors that
are fitted to a particular second-order factor space. Setting up
second-order factor models in equation-based programs such as AMOS and
EQS is relatively easy, for it just requires additional equations that
express first-order factors as a function of second-order factors. For
matrix programs such as LISREL, there are special limitations in how
to set up the matrices. Second-order factor models require setting up
a model as Y only and then using the gamma, phi, and psi matrices
to define the second-order factors.
Assuming, then, that we have a model that warrants specification
including second-order factors, how are the details accomplished in
a matrix form model? Initially, the first-order factors are set up just
like any latent variable confirmatory factor analysis (CFA) model but
using the Y measurement model. (For ordinary CFA, either the X side
or the Y side of the model can be used.) That is,

$$y = \Lambda_y \eta + \varepsilon,$$

yielding

$$\Sigma_{yy} = \Lambda_y \Psi \Lambda_y' + \Theta_\varepsilon,$$

which is a traditional factor model. The Y side is used because the
covariance matrix of the etas has to be specified in terms of other
matrices. For a first-order factor model, the eta-eta transpose matrix
($\eta\eta'$) is specified as psi, with no gamma or phi matrices. For a
second-order factor model, however,

$$\eta = \Gamma \xi + \zeta,$$

which is just a factor model paralleling the factor model for the
observed measures. The equation combining the first- and second-order
factors becomes

$$\Sigma_{yy} = \Lambda_y (\Gamma \Phi \Gamma' + \Psi) \Lambda_y' + \Theta_\varepsilon.$$

The lambdas relate observed measures to first-order factors, the
epsilons are the uniquenesses of observed measures and theta epsilon
is their variance/covariance matrix, the gammas relate first-order
factors to second-order factors, the phis are the variance/covariance
matrix of the second-order factors, and the zetas (the residuals in psi)
are the residuals from the first-order factors and psi their
variance/covariance matrix.
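The way these matrices combine can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical model with six observed measures, three first-order factors, and one second-order factor; every parameter value and helper name is invented for illustration. It computes the covariance matrix of the first-order factors as gamma times phi times gamma transpose plus psi, and then the implied covariance matrix of the observed measures.

```python
# Illustrative sketch: model-implied covariance matrix for a second-order
# factor model, Sigma = Lambda (Gamma Phi Gamma' + Psi) Lambda' + Theta.
# All parameter values are hypothetical.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def diag(values):
    return [[v if i == j else 0.0 for j in range(len(values))]
            for i, v in enumerate(values)]

# Loadings of six observed measures on three first-order factors.
lambda_y = [[1.0, 0.0, 0.0], [0.8, 0.0, 0.0],
            [0.0, 1.0, 0.0], [0.0, 0.9, 0.0],
            [0.0, 0.0, 1.0], [0.0, 0.0, 0.7]]
gamma = [[0.6], [0.5], [0.7]]       # first-order on second-order factor
phi = [[1.0]]                       # variance of the second-order factor
psi = diag([0.64, 0.75, 0.51])      # residual variances of first-order factors
theta_epsilon = diag([0.3, 0.4, 0.3, 0.35, 0.3, 0.5])  # uniquenesses

# Covariance matrix of the first-order factors.
cov_first_order = add(matmul(matmul(gamma, phi), transpose(gamma)), psi)

# Implied covariance matrix of the observed measures.
sigma_yy = add(
    matmul(matmul(lambda_y, cov_first_order), transpose(lambda_y)),
    theta_epsilon,
)

for row in sigma_yy:
    print([round(value, 3) for value in row])
```

In this sketch the first-order factors covary only through the second-order factor: the covariance between the first two factors is the product of their gammas and the second-order factor variance.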
In summary, second-order factor analysis can be set up with either
equation-based or matrix-based SEM programs. The general latent
variable SEM approach allows factors of factors to be extracted,
casting second-order factor extraction as a straightforward extension
of SEM. At this point in time, the matrix approach SEM programs do
not allow structural models to causally interrelate second-order
factors. For those specific structural models in which the relationships
among the second-order factors define a just-identified model,
however, the solution to those structural models would be identical in
fit to a solution in which all the second-order factors are allowed to
correlate with one another. Thus, by extracting the second-order
factor matrix in which all factors are allowed to covary with one
another and then analyzing that matrix via ordinary regression
techniques, structural paths could be estimated. Because the structural
part of the model is just identified, the regression estimates would
yield the same exact overall fit as the second-order factor model that
allows all factors to intercorrelate. In other words, a just-identified
second-order structural model would yield a fit identical to extracting
a second-order factor variance/covariance matrix.


All-Y Models

As mentioned earlier and used in setting up two illustrations in
Chapter 9, there are certain instances in which it makes sense when
using matrix-based SEM programs to try to "combine" exogenous
and endogenous variables. By contrast, equation-based SEM programs
such as AMOS and EQS set up the equations and covariances
element by element, thus bypassing matrix issues altogether for
program users. An important point, however, is that regardless of how
the user interface is set up, all programs work with matrices in the
parameter estimation process, so the discussion provides a look at
how all programs actually estimate relations.
The primary reason for users to select an all-Y model is to allow
residual covariances between measures of exogenous and endogenous
variables, which cannot be done in some versions of computer
programs that set up the free parameters in matrix form. For such a
combination to be worthwhile, these programs must, of course, produce
a solution identical to what would be found if an equivalent
model were calculated separately for exogenous and endogenous
variables. That is, if a single model (without residual covariances
between X and Y measures) were estimated with X and Y variables
and then again as all Y, then the estimates and overall fit would have
to be identical regardless of method chosen.
The approach to be presented meets the preceding conditions,
for it is exactly the same as the X and Y solution except for the
opportunity to add X-Y residual covariances. The matrices from the
all-Y model will be bigger in size, for they contain multiple matrices
within them. However, as will be shown, there is a direct
correspondence between the matrices of the two approaches.
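More specifically, looking at the basic model, the measurement
model is as follows:

$$Y = \Lambda_y \eta + \varepsilon \quad \text{for the endogenous variables}$$
$$X = \Lambda_x \xi + \delta \quad \text{for the exogenous variables}$$
$$\Sigma_{yy} = \Lambda_y \eta\eta' \Lambda_y' + \Theta_\varepsilon$$
$$\Sigma_{xx} = \Lambda_x \xi\xi' \Lambda_x' + \Theta_\delta.$$

The structural model is as follows:

$$\eta = B\eta + \Gamma\xi + \zeta$$
$$(I - B)\eta = \Gamma\xi + \zeta$$
$$\eta\eta' = (I - B)^{-1}\Gamma\Phi\Gamma'[(I - B)^{-1}]' + (I - B)^{-1}\Psi[(I - B)^{-1}]'$$
$$\Sigma_{yy} = \Lambda_y (I - B)^{-1}\Gamma\Phi\Gamma'[(I - B)^{-1}]'\Lambda_y' + \Lambda_y (I - B)^{-1}\Psi[(I - B)^{-1}]'\Lambda_y' + \Theta_\varepsilon.$$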

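Looking at the all-Y variant, the measurement model is as follows:

$$Y^* = \Lambda_y^* \eta^* + \varepsilon^*$$
$$\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} \Lambda_x & 0 \\ 0 & \Lambda_y \end{bmatrix} \begin{bmatrix} \xi \\ \eta \end{bmatrix} + \begin{bmatrix} \delta \\ \varepsilon \end{bmatrix}.$$

In other words, what is now being called Y ($Y^*$) is both X and Y and
is a vector of size X + Y, the new eta ($\eta^*$) includes both xi and eta
and is a vector of size $\xi + \eta$, the new lambda Y ($\Lambda_y^*$) is
now of size $(X + Y) \times (\xi + \eta)$, and the new epsilon
($\varepsilon^*$) is both delta and epsilon and is a vector of size X + Y.
The new theta matrix ($\Theta_\varepsilon^*$), which is of size
$(X + Y) \times (X + Y)$, becomes

$$\Theta_\varepsilon^* = \begin{bmatrix} \Theta_\delta & \text{symmetric} \\ \Theta_{\varepsilon\delta} & \Theta_\varepsilon \end{bmatrix}.$$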
Note that the old theta delta and theta epsilon matrices, although
located differently, remain unchanged, but now there are
opportunities to allow residuals from X and Y variables to covary in the
theta epsilon-delta matrix. The structural model is as follows:

$$\eta^* = B^* \eta^* + \zeta^*$$
$$\begin{bmatrix} \xi \\ \eta \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ \Gamma & B \end{bmatrix} \begin{bmatrix} \xi \\ \eta \end{bmatrix} + \begin{bmatrix} \xi \\ \zeta \end{bmatrix}.$$
The new model includes only the eta, beta, and psi matrices. Eta
($\eta^*$) was defined in the measurement model. As can be seen from the
partitioning of the matrices, beta ($B^*$) includes both gamma and beta,
zeta ($\zeta^*$) includes xi (yes, xi appears in two different matrices)
and zeta; and, as will be shown, psi ($\Psi^*$) includes both phi and psi.
For the "new" psi, even though the exogenous variables are viewed in the
model as endogenous, it still is the case that no variable causes any of
them (which is, of course, the definition of an exogenous variable),
with the result that their total variance is unexplained, and thus their
residual variances are their total variances. Their variance/covariance
matrix is part of the residual variance/covariance matrix psi. The new
psi ($\Psi^*$) is

$$\Psi^* = \begin{bmatrix} \Phi & 0 \\ 0 & \Psi \end{bmatrix}.$$
Finally, paralleling the equations, the covariance structure includes
fewer but larger matrices:

$$\Sigma = \Lambda_y^* (I - B^*)^{-1} \Psi^* [(I - B^*)^{-1}]' \Lambda_y^{*\prime} + \Theta_\varepsilon^*.$$
In summary, setting up models as all Y requires only a simple
reconfiguring of matrices into a new structure. It changes nothing but
allows additional options for programs that require setting the
equations up in matrix form.

EXERCISE 11.1

Set up the matrices for the Maruyama and McGarvey (1980)
problem illustrated in Chapter 10 as an all-Y model. For
LISREL users, set up the control statements. The setup can
be compared with the LISREL control statements in
Appendix 9.3.

This final chapter presents three sets of issues. First,
with no intent to squelch any enthusiasm that has been generated,
structural equation modeling (SEM) approaches are viewed from
perspectives of their critics. This chapter looks at several types of
critics of SEM techniques in casting the SEM field within the broader
disciplines of the social and behavioral sciences. There are critics
within the group of social scientists who could be called users of the
techniques as well as critics who would never consider using the
techniques because they believe them to be deficient in substantial
ways. Second, a "hot" issue in SEM research, that of model
modification, is discussed. The issue is determining the extent to which
any modification of a theoretical model is appropriate. In other words,
it is deciding on a balance between model development and model
confirmation. Third, a range of topics not addressed in this book are
described briefly. Because this book was intended primarily to provide
a general and basic introduction to structural equation techniques,
there have been a number of fairly complex or less fundamental SEM
issues that have been left out. They are the kinds of issues that a
researcher could encounter given a particular type of theoretical
model or data set but that are not integral parts of basic SEM
approaches.


Criticisms of Structural Equation Modeling Approaches

"Internal" Critics

One of the most eloquent critiques of SEM approaches is the one given by
Cliff (1983). Cliff commended the developers of SEM approaches for the
tools they have provided but went on to suggest that few
interpretational problems are solved by the approaches and suggested
further that use of SEM techniques could be disastrous if social
scientists suspended their critical judgments when considering SEM
studies and models. He went on to describe four principles of scientific
inference that researchers might be enticed to violate.

The first principle is that data never can confirm a model; they can
only fail to disconfirm it. Hopefully, this point is one that has been
made enough times throughout this book to have been deeply embedded into
the knowledge readers have acquired about SEM approaches. Cliff (1983)
went on to a second point tied to the same principle, also hopefully
well entrenched in the understanding of readers: If the data do not
disconfirm a particular model, then there are other (alternative) models
that are not disconfirmed either. There are important corollaries of
this point about alternative models. One is that replication is even
more important in SEM research than in experimental work, for it is
important to know whether failure to reject is plausible beyond the data
set for which a model initially was fitted. A second corollary is that
it is critically important to uncover alternative explanations for any
finding so that competing models can be tested through inclusion of
other variables and replication and extension of the plausible model. A
quote of his is particularly cogent: "Much of what characterizes good
research is the ability to anticipate, and neutralize with data,
potential criticisms of conclusions" (p. 118).
Cliff's (1983) second principle was post hoc is not propter hoc. In
other words, temporal sequencing of data collection of particular
variables is not a guide for inferring causality. Even if correlations
are strong, there are many different causal explanations for those
correlations. Perhaps the best illustration is the achievement domain in
some of the examples presented in this book; even though at any time
point achievement was moderately correlated with a number of different
variables that temporally preceded it, its stability was so high that in
most instances none was a viable cause when the model was specified as
longitudinal.

The third principle is another one that has received substantial
attention throughout this book. Cliff (1983) called it the nominalistic
fallacy. The point is that giving something a name does not necessarily
make it what we call it or ensure that we understand the thing we have
named. Although most obvious in terms of naming factors, the issue
basically is one of operationalization or model specification; there
always is some gap between theoretical variables and the measures that
operationalize them.

For manifest variable models, fundamental issues of validity (What does
the measure assess in addition to the theoretical variable of interest?)
and reliability (What part of the measure is error, unrelated to any
conceptual variable?) for each measure are tied to the nominalistic
fallacy. Either poor validity or low reliability causes great problems
in trying to interpret paths in a model because one or more variables
are not exactly what we think they are and are calling them.

For latent variable models in which multiple indicators are available,
one could argue that the problems are substantially lessened, for the
theoretical variable always is more than any single indicator. At the
same time, however, each conceptual variable is defined by the set of
measures or indicators selected to assess it. Researchers assume, often
with little justification, that their indicators are representative of
the domain defined by the theoretical variable and that a different set
of indicators would not change the conceptual variable very much. To the
extent that the reasoning is wrong, the results could vary greatly with
different measures. Furthermore, because residuals/error is essentially
what is left over after a common factor is drawn from the indicators,
the residuals also can change as different measures are used as
indicators.

An all too common situation is one in which all the indicators of a
conceptual variable are collected using a single method. In many of
those instances, substantial method variance exists. A likely result is
that the common method across indicators becomes "part" of the
conceptual variable because method and trait are intertwined. (With
"extra" indicators of latent variables, there at least is the
opportunity to bootstrap the indicators, dropping them one at a time and
reestimating the model to see whether the relationships of the latent
variable with other variables change.)


Cliff (1983) illustrated naming problems by discussing a construct of
"verbal ability." He suggested that the construct gets defined as what
the different tests of verbal ability have in common, which may diverge
markedly from the underlying conceptual variable.

Finally, Cliff (1983) addressed the issue of ex post facto analysis. His
view was a traditional and conservative one that seems to have been
relaxed by many SEM researchers, yet one that generally has been
advanced by this book. It was that SEM techniques are intended to be
used for model confirmation, not model development. Once again, Cliff
went further, addressing ways in which inspection of the data (e.g., the
correlation matrix of the observed measures) can lead researchers to
modify their models even before using SEM programs. In fact, such an
approach is implicitly suggested in this book, for use of consistency
tests to ensure dimensionality of indicators would allow researchers
great insight into their data. Cliff suggested that the fit statistics
no longer are meaningful, for the data would have been modified to
enhance fit before the SEM analyses are even conducted. Although the
problem is not one addressed in this book, the solution proposed by
Cliff was one suggested earlier in this book, namely, to split the
sample and to use cross-validation of the findings.

Cliff's (1983) criticisms provided a very important context, for they
reminded SEM researchers that there is no magic in their methods and
that their structural equation models need to stand up to scrutiny on a
number of dimensions. Actively working to address the four principles
articulated should prepare researchers for the review process, for
issues of causal ordering, operationalization, model disconfirmation,
and model modification all are central to preparation of a manuscript
using latent variable SEM approaches.
A second commonly cited criticism came from Breckler (1990). Breckler
focused on five problems that he argued are (or at least were)
widespread in the SEM literature. Those problems, which overlap somewhat
with Cliff (1983), are (a) problems due to violations of distributional
assumptions underlying SEM techniques, (b) problems tied to the
existence of alternative models, (c) both developing and "confirming"
models with a single set of data, (d) model modification unaccompanied
by cross-validation, and (e) poorly justified causal inferences. In
part, Breckler's criticisms reflect changing practices within the
emerging SEM field, for the field changed rapidly during the time period
he covered (1977-1987) in his review. For example, there was a time when
"causal modeling" was an accepted description of what now is called
structural equation modeling. From my perspective, that issue is not
primarily one of beliefs about causality but rather an issue of culture
and change: using terminology and style that may be accepted at one time
but that is later changed. I would expect that SEM approaches would fare
better if more recent studies were examined. Not only are the methods
better understood and more carefully used today, but continued
development of computer software and SEM methods makes it easier to test
for distributional assumptions and to control model modification
problems. At the same time, many researchers still are using SEM
approaches because they are told they need to use them, even if their
understanding is far from perfect.

Breckler's (1990) review of literature in the area of personality and
social psychology found many suboptimal uses of SEM methods. According
to Breckler, only a small proportion of articles reported examining
their data for multivariate normality, many provided only incomplete
information on model fit and on prediction of specific variables, and
virtually all failed to discuss the existence of equivalent models.
Consistent with Cliff's (1983) ex post facto criticism, Breckler found
SEM researchers to both develop and test models on a single data set,
mostly without cross-validation; many seem to have used post hoc model
modifications to improve overall model fit, again without
cross-validation. Finally, Breckler found that some researchers used
causal language inappropriately. Overall, then, Breckler's criticisms
still provide a helpful guide for researchers as they use SEM techniques
and prepare SEM manuscripts.

EXERCISE

Readers are encouraged to select articles from their disciplines and see
how they fare by Cliff's (1983) four principles and Breckler's (1990)
five problems.

"External" Critics

As many readers may have discovered in discussing SEM techniques with
colleagues and friends, there are many social science researchers who
are very skeptical about SEM approaches. Some of these skeptics are
simplistic in their views, clinging dogmatically to the phrase
"Correlation does not imply causation" or dismissing the techniques due
to limitations of manifest path models. Beyond such skeptics, however,
is a much more sophisticated group that views SEM techniques negatively
(e.g., Baumrind, 1983; Ling, 1982). This group's position can be
illustrated by Ling (1982) in his review of Kenny's (1979) book,
Correlation and Causation. Ling (1982) stated,

The author of this book holds the view . . . that causal inference from
correlational data . . . is a valid form of statistical and scientific
inference. My view has been that the methods and techniques, developed
and applied under that premise, for causal inference . . . are at best a
form of statistical fantasy. (pp. 489-490)

He went on to describe SEM approaches as "a class of pseudo-black magic
methods" (p. 491). Ling's is a very traditional view that still is held
by many statisticians.
Baumrind (1983) took a somewhat more moderate position, one driven by
traditional conceptions of cause and effect. Her view is that causality
claims from SEM techniques markedly exceed the capacity of the methods
to speak about causal mechanisms. Her criticisms were driven by (a)
failure to dismiss alternative models (SEM theorists agree; e.g.,
MacCallum, Wegener, Uchino, & Fabrigar, 1993, noted that fit alone never
can justify accepting a model, for there will be alternative models that
fit the data equally well), (b) incompletely specified models ("First
there must be a viable causal hypothesis to model" [p. 1296]), and (c)
the often weak nature of relationships uncovered by SEM techniques.

Well, so perhaps I should have warned readers earlier so that they could
prepare themselves for the shortcomings of the methods of fantasy that I
have been attempting to explain. But, of course, I did; as noted in
Chapter 1, there is a range of views about SEM techniques, and many
perspectives differ from Ling's (1982) and Baumrind's (1983) views in
prominent ways. From my perspective, different views are driven in large
part by one's views about the role of methods in supporting and
extending theory development. From a view driven purely by methods,
Ling's position is not surprising, for methods provide no way in which
to distinguish among mathematically equivalent models. Also, from a
traditional experimentalist perspective, SEM techniques will not provide
what experiments can provide.


When theory is added to the methods, however, a different purpose
appears, for there are opportunities (a) to disconfirm a hypothesized
model (and its identical alternative models) and (b) to distinguish
among competing theoretical models that are nonequivalent. Said
differently, the greatest strength of SEM techniques draws from the fact
that they need to be driven by theory, not by the statistical techniques
that provide methods for the theories.

Most fundamental is that correlational data potentially provide an
opportunity to address issues of causation, even if primarily through
model disconfirmation. Researchers with correlational data collect those
data for reasons tied to implicit conceptual models. They should be
encouraged to articulate the causal processes that they think underlie
their reasons for selecting the variables that they chose to study and
that led them to collect the measures that they did. Once those
processes are articulated, their plausibility can be examined.

In general, however, the debate will continue, within both the SEM
community and the broader research community. My expectation is that
over time, as these techniques are used and corroborated (or not)
through complementary methods (e.g., intervention research, experimental
designs), their ultimate value will emerge. Then, social scientists can
determine whether or not they were a boon, a curse, or somewhere in
between.

In summary, there are criticisms of which readers need to be aware, for
readers inevitably will encounter them in some form. None of the
criticisms provides a reason for SEM techniques to be totally discarded
as inappropriate. Rather, they provide different philosophies about ways
in which to use available data plus guidance about ways in which to use
SEM approaches effectively.
Emerging Criticisms

One area that may well emerge as a new area of controversy for SEM
researchers is the balance between overall model fit and significance of
particular path coefficients. The emphasis on overall model fit has
potentially negative consequences in that, first, it encourages
researchers to overfit their data to attain the best possible fit and,
second, it distracts attention from the most important coefficients by
embedding and judging their worth in broader fit tests. For the first
point, there is the distinct possibility that the model that fits best
in a single sample will not fit best in a second sample. This is true
regardless of how much an investigator tinkers with a model trying to
make it fit, due simply to the nature of sampling. Second, often
differences among models can have little to do with the relationships
that led researchers to look at the data using SEM techniques. That is,
lack of fit in some instances may be independent of the primary
hypotheses. In the worst possible case, researchers could reject models
that accurately specify important relationships because the overall
model does not fit. Critics of model fitting might argue that the SEM
field has not adapted enough from experimental research about
partitioning variance and accepting residual/unexplained variance.

Post Hoc Model Modification

A second issue that has not received a great deal of attention thus far
in this book is the issue of modifying models. From one perspective,
model modification must not be too bad, for the SEM computer programs
typically include first derivatives, modification indices, and Lagrange
multiplier tests as part of the output; the information given by these
indexes assists researchers in finding ways in which to modify their
models that will improve fit (for current thinking about model
modification, see, e.g., MacCallum, 1995). The LISREL program even
offers unrestricted automatic model modification as an option. Jöreskog
(1993), for example, distinguished among three situations (strictly
confirmatory, alternative models, and model generating) and suggested
that model generating is the most common.

From a perspective more like Cliff's (1983), however, model modification
is a substantial shift from the confirmatory intent of latent variable
SEM approaches. The most conservative position is that models should be
purely confirmatory and not modified except perhaps to help plan the
next study. If one begins from a purely confirmatory perspective, then
cross-validation through sample splitting and a priori specification of
alternative models are ways of allowing some model modification while
maintaining the basic confirmatory intent.
Although any discussion attempting to find a preferable course of action
can focus on researcher values, there is a second component of this
discussion, an empirical one: Can model modification help researchers to
find more accurate models? If the answer is a definitive no, then the
issue is moot. If it is yes, always, then the criticism is in principle
correct but impractical and counterproductive. If it is somewhere in
between, then preferences are likely to be driven by differing
researcher values.

The answer to the question of accuracy of model modification in
recovering "true" models seems to be that, in general, modifying models
is all too likely not to be helpful. This answer was provided first by
Costner and Schoenberg (1973) for multiple-indicator models, reinforced
by MacCallum (1986), and then reaffirmed more strongly by MacCallum,
Roznowski, and Necowitz (1992). These studies found that it is difficult
to modify misspecified models in ways that move closer to "true" models;
their data definitely argue against nontheoretical searches and even for
exercising great caution in conducting theory-guided modification.
MacCallum et al. (1992) recommended setting forth multiple models a
priori to avoid post hoc data fitting modifications. MacCallum (1986),
in addition, suggested that any modifications need "rigorous substantive
justification" (p. 118) and that, without a good starting model,
modification is likely to lead one astray.
Fro m my perspective , a conservativ e approac h to mode l modifi
catio n is best . If a researche r anticipate s needin g mode l modificatio n
du e to muc h uncertaint y abou t causa l processe s in th e literature , the n
dat a shoul d be collecte d wit h modificatio n in mind . Practically , wha t
tha t mean s is tha t a larg e enoug h sampl e shoul d be collecte d so tha t
it can be split . Hal f can be use d for mode l modificatio n an d th e othe r
hal f hel d bac k to us e for cross-validation . MacCallum , Roznowski ,
Mar , an d Reith (1994) provide d strategie s for cross-validation . Even
thi s approac h usin g cross-validation , however , ha s weaknesses . Th e
mos t likel y problem (which , base d on studie s suc h as Costne r an d
Schoenber g [1973] is a fairl y goo d possibility ) is tha t th e modifica
tion s wil l produc e improvement s in fit from th e origina l mode l bu t
not resul t in findin g th e tru e model . Instead , wha t is foun d is on e of
th e man y alternativ e model s whos e fit approximate s th e fit of th e tru e
mode l an d whic h ma y be overfitted . In suc h instances , examinin g th e
modifie d mode l on th e cross-validatio n sampl e wil l yiel d a poore r fit
tha n th e origina l sampl e bu t on e tha t is improve d fro m th e a prior i
model .
So what should readers conclude? If Jöreskog (1993) was correct, then
using SEM techniques for model generation is a common instance, yet one
open to potential criticism. If a much more conservative position is
taken, then refinement of theoretical models might take many data sets
and much time. Clearly, being conservative has fewer pitfalls but may
move progress more slowly in a field where modeling is not very
sophisticated. If readers opt for speed and decide to risk the greater
potential for incorrect inferences and inaccurate models, then they
should follow a procedure such as the one suggested by Jöreskog (1993).
That procedure includes cross-validation through sample splitting and
use of Cudeck and Browne's (1983) expected cross-validation index.
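
The logic of split-sample cross-validation can be sketched in a few
lines of Python. This is a hedged illustration, not the procedure of any
particular SEM program. It assumes that a two-sample cross-validation
index can be computed as the maximum likelihood discrepancy between the
validation half's covariance matrix and the covariance matrix implied by
a model fitted to the calibration half. Because fitting a real SEM is
beyond a short example, two stand-in "models" with closed-form estimates
(an independence model and a saturated model) are used. The data and all
names are hypothetical.

```python
import numpy as np

def ml_discrepancy(S, Sigma):
    """ML discrepancy F = ln|Sigma| - ln|S| + tr(S Sigma^-1) - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p

def split_half(data, seed=0):
    """Randomly split the rows into calibration and validation halves."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(data.shape[0])
    half = data.shape[0] // 2
    return data[idx[:half]], data[idx[half:]]

# Hypothetical data: three measures with population correlations of .5.
rng = np.random.default_rng(42)
pop_cov = np.full((3, 3), 0.5) + 0.5 * np.eye(3)
data = rng.multivariate_normal(np.zeros(3), pop_cov, size=400)

calibration, validation = split_half(data)
S_cal = np.cov(calibration, rowvar=False)
S_val = np.cov(validation, rowvar=False)

# Stand-ins for fitted models (assumptions for illustration only).
sigma_independence = np.diag(np.diag(S_cal))  # no covariances allowed
sigma_saturated = S_cal                       # every covariance free

# Smaller values mean the calibration estimates carry over better.
cvi_independence = ml_discrepancy(S_val, sigma_independence)
cvi_saturated = ml_discrepancy(S_val, sigma_saturated)
```

In practice, the implied matrices would come from the a priori and
modified models estimated on the calibration half, and the index would
be compared across those competing models.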

Topics Not Covered

What remains is an array of topics and issues that range from ways of
operationalizing different types of models to ways of dealing with
different types of data. Many of these issues defy short and simple
description and will only be mentioned (and references that address them
cited) so that readers facing these situations can find the resources
that will help them.
Power Analysis

Research on power analysis of covariance structures still is developing.
Cohen (1992) addressed general issues of power in a straightforward
fashion. MacCallum, Browne, and Sugawara (1996) focused specifically on
power and sample size for structural equation analysis.
Nonlinear Relationships

Readers may encounter circumstances in which they want to posit
relationships that are other than linear, for example, that are
hypothesized to take a quadratic form. MacCallum and Mar (1995), for
example, discussed multiplicative and quadratic models and ways in which
to distinguish between them. At a more general level, Kenny and Judd
(1984) provided a discussion of how to model nonlinear effects in latent
variable models (see also Yalcin & Amemiya, 1993). Kenny and Judd (1984)
also addressed a related issue, that of how to model interaction
effects. More recently, Jaccard and Wan (1996) provided an accessible
source for modeling interactions in latent variable models (see also
Jaccard & Wan, 1995; Ping, 1995).

Interaction effects are of particular interest in SEM approaches, for
there are many instances in which moderator effects have been
hypothesized and moderator effects are modeled as interactions, namely,
as product terms (see also Baron & Kenny, 1986). To test for moderator
effects, the interaction of two variables is included along with the two
variables that interact in the structural model. For observed variable
models, inclusion of interaction terms is fairly straightforward, but
interactions become more complex when multiple indicators are present.
For an approach dealing with interactions for multiple-indicator latent
variables, see Jaccard and Wan (1996). One potential concern to attend
to when including nonlinear components or interactions in models is
collinearity; in many instances, interaction terms are strongly related
to the two variables whose interaction is included in the model. [21]
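
The collinearity concern for observed variable models can be illustrated
with simulated data. The sketch below is an assumption-laden example: it
generates two hypothetical, independent predictors with nonzero means
and correlates one of them with their product term. It also shows mean
centering, a commonly used remedy that is not discussed in the text
above, purely for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Two hypothetical, independent observed predictors with nonzero means.
x = rng.normal(loc=5.0, scale=1.0, size=n)
z = rng.normal(loc=3.0, scale=1.0, size=n)

# Product (interaction) term built from the raw scores.
raw_product = x * z
r_raw = np.corrcoef(x, raw_product)[0, 1]

# Product term built after centering each predictor at its sample mean.
xc = x - x.mean()
zc = z - z.mean()
centered_product = xc * zc
r_centered = np.corrcoef(xc, centered_product)[0, 1]
```

With raw scores, the product term is substantially correlated with its
component; after centering, that correlation is close to zero for
roughly symmetric variables such as these.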

Alternative Estimation Techniques

This book has focused on the basic methods of SEM, assuming ordinary
least squares and maximum likelihood estimation. There are a number of
alternative ways in which to estimate coefficients from latent variable
SEMs (for an illustration, see, e.g., Stein, Smith, Guy, & Bentler,
1993). They include generalized least squares, unweighted least squares,
generally weighted least squares, diagonally weighted least squares, and
asymptotic distribution-free estimators. The first two are, in general,
similar to maximum likelihood in their requirements and properties but
yield fit statistics that perform less well than maximum likelihood
statistics (e.g., Hu & Bentler, 1995). The latter three differ in that
they provide estimation procedures that do not require multivariate
normality in the data. At the same time, work on fit statistics has
found that the distribution-free estimators, in comparison to maximum
likelihood estimates, have not produced estimates with desirable
properties, particularly in small samples (Hu & Bentler, 1995).
Therefore, at the present time, assuming that one's data do not strongly
violate an assumption of multivariate normality, one seems to lose
little by staying with maximum likelihood estimates (see also Bentler &
Dudgeon, 1996). The caveat, however, is that there currently is
substantial research being done on fit statistics that may change the
recommended course of action.

[21] In certain circumstances, moderator variables also may be modeled
as nonlinear effects (see Baron & Kenny, 1986).
Analysis of Noncontinuous Variables

The approaches described have assumed that the variables collected have
been continuous. If they are dichotomous, ordered categorical, or
polychotomous, then regular covariance matrices should not be analyzed,
but parallel methods exist. (Readers may want to think back to the
discussion of using demographic variables such as race/ethnicity or
gender in models; those will be dichotomous.) These techniques have been
developed primarily by Muthén (1984, 1993). Options now provided by the
mainstream computer programs for SEM analysis can analyze tetrachoric or
polychoric matrices, but the sample size needed tends to be greater than
that needed with continuous data. For example, in LISREL the PRELIS
program will generate different matrices for analysis. Researchers with
these types of data are strongly encouraged to read the articles of
Muthén and others.
Adding Analysis of Means

In an earlier chapter, the possibility of adding analysis of means to
SEM programs was mentioned. As also noted then, adding means to the
analysis requires a different type of matrix to be analyzed, namely, one
that has information about the means of the observed measures as well as
the covariance structure. That type of matrix is called an augmented
moment matrix and contains a vector of means along with a covariance
matrix. Recent versions of SEM programs (e.g., PRELIS with LISREL 8)
will generate such a matrix, making it much easier to model means. A
recent treatment of how to use SEM as an alternative to multiple
analysis of variance for modeling multivariate means was provided by
Cole, Maxwell, Arvey, and Salas (1993).
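
For readers who want to see what such a matrix contains, the following
Python sketch builds one under a common convention: a constant of 1 is
appended to each case, and the average cross-product matrix is taken.
The function name and the data are hypothetical, and particular programs
may arrange or scale the matrix differently (e.g., dividing by N - 1
rather than N).

```python
import numpy as np

def augmented_moment_matrix(data):
    """Average cross-products of the measures plus a constant column of 1s."""
    n = data.shape[0]
    z = np.column_stack([data, np.ones(n)])
    return z.T @ z / n

# Hypothetical data: 50 cases on three observed measures.
rng = np.random.default_rng(7)
data = rng.normal(loc=[10.0, 20.0, 30.0], scale=2.0, size=(50, 3))

moments = augmented_moment_matrix(data)
means = data.mean(axis=0)
cov_n = np.cov(data, rowvar=False, bias=True)  # divisor N, not N - 1
```

The last row and column of `moments` hold the means, the corner element
is 1, and the upper-left block equals `cov_n` plus the outer product of
the means, so the means and the covariances are both recoverable from
the one matrix.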
Multilevel Structural Equation Modeling

Recently, there has been work on approaches that parallel hierarchical
linear modeling (e.g., Bryk & Raudenbush, 1992). These approaches
attempt to partition variance at different levels. For example, in a
study of student performance, there are impacts at the level of teachers
(class level) as well as at the level of students (individual level).
This work still is emerging and may change appreciably as it is
developed (see, e.g., McArdle & Hamagami, 1996; Muthén, 1994).
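
The idea of partitioning variance by level can be made concrete with a
small sketch. The Python code below only illustrates the decomposition
of total sums of squares and cross-products into a within-class part and
a between-class part; it is not a multilevel SEM estimator, and the
function name, class sizes, and data are hypothetical.

```python
import numpy as np

def within_between(data, groups):
    """Split total sums of squares and cross-products by level."""
    grand_mean = data.mean(axis=0)
    p = data.shape[1]
    within = np.zeros((p, p))
    between = np.zeros((p, p))
    for g in np.unique(groups):
        rows = data[groups == g]
        group_mean = rows.mean(axis=0)
        dev = rows - group_mean                  # student-level deviations
        within += dev.T @ dev
        d = (group_mean - grand_mean)[:, None]   # class-level deviation
        between += rows.shape[0] * (d @ d.T)
    return within, between

# Hypothetical data: 6 classes of 10 students, two measures per student.
rng = np.random.default_rng(3)
groups = np.repeat(np.arange(6), 10)
class_effects = rng.normal(scale=1.0, size=(6, 2))[groups]
data = class_effects + rng.normal(scale=2.0, size=(60, 2))

within, between = within_between(data, groups)
total_dev = data - data.mean(axis=0)
total = total_dev.T @ total_dev
```

The two parts sum to the total, which is what allows separate structures
to be posited for the individual and class levels.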
Writing Up Papers Containing Structural Equation Modeling Analysis

In addition to looking at existing papers and articles in one's field
for guidance in selecting statistical information to report and to
present results, there are two articles that should prove helpful to
readers new to SEM. For technical issues, see Hoyle and Panter (1995).
For more general issues, see Raykov, Tomer, and Nesselroade (1991) as
well as Hoyle and Panter (1995).
Selecting a Computer Program to Do Latent Variable Structural Equation Modeling

An ever-increasing number of SEM programs are available. I admit that,
like an old dog with new tricks, I have been content to try to keep up
with the changes in the program I first learned to use, LISREL. Changes
have been fairly frequent, most recently with the introduction of an
equation-based version, SIMPLIS, in LISREL 8. Unfortunately, I have not
stayed abreast of the array of alternative programs. I recently looked
at a demo copy of EQS, the other program I had used at various times in
the past. For me, the conclusion was the same as my earlier one, namely,
that it still seems easier to use for some problems and harder for
others. Its nicest feature is the capacity to work off diagrams created
within the program in a drawing program. My most recent version of
LISREL provides a diagram and allows one to work off it once the program
runs but does not start with a drawing option.
A number of other programs are available varying in ease of use,
flexibility, options offered, and, perhaps of greatest importance to
many, price. Readers who use SPSS may want to learn AMOS (e.g.,
Arbuckle, 1994, 1997), for it will be replacing LISREL in the SPSS line
of products. It is equation based like EQS and also runs from a diagram
created by users. I have used the AMOS diagramming options to prepare
many of the figures presented in this book, and I recommend it highly.
It was easy to pick up and produce diagrams of high quality. AMOS will
read SPSS system files, providing a nice interface for SPSS users.
Readers who want to compare the command statements from LISREL, EQS, and
AMOS should revisit Appendix 9.2. In so doing, they need to remember
that if the diagramming option is chosen for EQS or AMOS, then the
control statements are not needed. MX is another SEM program. I have not
used it because I already have an SEM program. For new users, however,
one advantage is that it can be downloaded for use from a World Wide Web
site (http://opal.vcu.edu/html/mx/mxhomepage.html). Finally, one other
frequently used SEM program is EZPATH (Steiger, 1989), which is tied to
the statistical package SYSTAT. [22]

For readers who want to peruse the full range of alternatives, a review of seven different SEM programs for confirmatory factor analysis is provided by Waller (1993). Another recent review of EQS, LISREL, and AMOS can be found in Hox (1995), and a very recent and comprehensive discussion of various programs appears in the preface of Hayduk (1996). Waller also is working on a current review. Another source of information is the Internet group SEMNET, which carries a range of information, varying from basic to advanced, about SEM issues. Beware, however, that the group is very active and will deluge you with e-mail.
As readers work with SEM techniques, they will likely need to update and broaden their knowledge about SEM techniques and applications. SEM techniques currently are very popular, and there is a very active group of methodologists and statisticians who are continuing to refine and develop structural equation methods. Readers might follow the work of names that have come up repeatedly through this book. Although listing names inevitably will be incomplete due to omissions and my ignorance, examples of names to track include Joreskog, Sorbom, Muthen, Bentler, Arbuckle, Miller, Browne, Bollen, Hayduk, Byrne, Cudeck, and Mulaik. Finally, there is a fairly new journal devoted to structural equation models called Structural Equation Modeling. It includes a teacher's corner with information and annotated bibliographies of SEM literature.
Good luck, structural equation modelers!

22. As I was making final revisions, Michael Browne provided me with a copy of Kano's 1997 Behaviormetrika paper (24, 85-125) titled "Software." It provides descriptions by program authors of seven SEM programs (AMOS, COSAN, EQS, LISREL, MECOSA, RAMONA, and SEPATH) and is highly recommended for readers trying to select a program to use.

APPENDIX

Matrix algebra provides a way in which to represent multiple equations in a form that both consolidates information and allows efficient data analysis. By working with matrices, mathematical operations can be expressed in a compact fashion. Finally, with respect to structural equation modeling (SEM) and regression approaches, matrix algebra simplifies and makes more accessible the mathematical operations that are used. (Readers searching for a second source on matrix algebra could see Kerlinger & Pedhazur, 1973.)
What Is a Matrix?
A matrix is an m × n rectangle containing numbers, symbols that stand for numbers, or variable names. The order of the matrix is m rows by n columns. For example, a 2 × 3 matrix has two rows and three columns. To illustrate,

          Column 1   Column 2   Column 3
Row 1      (1,1)      (1,2)      (1,3)
Row 2      (2,1)      (2,2)      (2,3)

The pairs of numbers in parentheses are not intended to be values; rather, they represent the coordinates of each element of the matrix. For example, the (row 1, column 1) element will be located where (1,1) is in the matrix. In other words, the coordinates first give the number of the row of any element and then give the number of the column of that element. The coordinates of elements are important for a number of reasons, including (a) they often may be used as subscripts for unknown coefficients (e.g., b_12), (b) they can be used to identify the variables that are being related (e.g., r_21), and (c) they are used in some SEM programs to specify parameters to be estimated in matrices used by those programs.
A 2 × 3 matrix with values rather than coordinates would look like

Matrix B = | 5  3  7 |
           | 6  2  8 |.

With labels for rows and columns, it would look like

Column:     1  2  3
Row:   1  | 5  3  7 |
       2  | 6  2  8 |.
So, for example, the value of the (row 1, column 2) element, namely, (1, 2) in the preceding instance, is 3. A special case of a matrix is one called a null matrix, which contains only 0's.
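The coordinate convention can be sketched in a few lines of Python. This is only an illustration: the matrix is the 2 × 3 example above, stored as a list of rows, and the helper names `order` and `element` are hypothetical. Python counts from 0, so the helper subtracts 1 to keep the text's (row, column) coordinates.

```python
# The 2 x 3 example matrix, stored as a list of rows.
B = [
    [5, 3, 7],
    [6, 2, 8],
]

def order(matrix):
    """Return the order of the matrix as (rows, columns)."""
    return len(matrix), len(matrix[0])

def element(matrix, row, col):
    """Return the (row, col) element using the text's 1-based coordinates."""
    return matrix[row - 1][col - 1]

def null_matrix(rows, cols):
    """Return a null matrix: every element is 0."""
    return [[0] * cols for _ in range(rows)]

print(order(B))          # (2, 3)
print(element(B, 1, 2))  # 3
```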
Sometimes the matrix may consist entirely of algebraic representations of the elements, for example, using subscripts ij where i represents the row coordinate and j the column coordinate. The preceding 2 × 3 matrix could be presented as

Matrix B = | b_11  b_12  b_13 |
           | b_21  b_22  b_23 |.

In the example, b_21 = 6 and b_23 = 8.
In SEM analyses, each row corresponds to a dependent or endogenous variable. That is, each dependent variable has its own equation, which is a row. By contrast, columns correspond to predictor variables, which may be either exogenous or endogenous. Thus, if we have a system of structural equations containing three dependent variables, then matrix representations of those equations would require matrices with three rows. The number of columns in a matrix containing the SEM path (regression) coefficients would be the sum of the number of (a) exogenous or independent variables that were in the equations and (b) endogenous variables (sometimes limited to endogenous variables that are used to predict other endogenous variables).
If a matrix has either one row or one column, then it is called a vector. Vectors are used regularly to present variables and residuals. If, for example, our endogenous variables were peer popularity and achievement, then a vector for those variables would be

| Peer Popularity |
| Achievement     |.

The vector has two rows and one column. Vectors also can have a single row and multiple columns.
A common matrix operation is one that turns a matrix on its side by turning rows into columns and columns into rows. It is called taking the transpose of a matrix. If we were to take the transpose of Matrix B in the preceding, then the new matrix (B'), using the elements as labeled previously, would be

            | b_11  b_21 |   | 5  6 |
Matrix B' = | b_12  b_22 | = | 3  2 |
            | b_13  b_23 |   | 7  8 |.

Note that the elements with two identical subscripts do not move, whereas the others move "around the diagonal." What formerly was the (row 1, column 3) element (b_13) now is found in the third row but first column. It has kept its "old" coordinates, so it still is b_13. Of course, if we were to give the elements of the transpose new coordinates, then those would correspond to the new rows and columns, and b_13 would become b_31, reflecting its new row and column coordinates.
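As a minimal sketch of the transpose, assuming the 2 × 3 example matrix with rows (5, 3, 7) and (6, 2, 8), the rows can be regrouped into columns in Python. The function name `transpose` is illustrative only.

```python
def transpose(matrix):
    """Turn rows into columns and columns into rows."""
    return [list(column) for column in zip(*matrix)]

B = [
    [5, 3, 7],
    [6, 2, 8],
]

B_t = transpose(B)
print(B_t)  # [[5, 6], [3, 2], [7, 8]]

# The old (row 1, column 3) element is now in row 3, column 1.
print(B[0][2], B_t[2][0])  # 7 7
```

Transposing twice returns the original matrix, which is a quick check that nothing was lost.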

Square matrices. If the numbers of rows and columns are identical, then the matrix is called square. For square matrices, the set of elements running from the upper left-hand corner of the matrix to the lower right-hand corner is called the diagonal. In terms of coordinates, the diagonal is made up of elements that have two identical coordinates (e.g., r_11). If the only nonzero elements of a matrix are found on the diagonal, then the matrix is called a diagonal matrix.

Symmetric (square) matrices. Matrices like correlation and covariance matrices are called symmetric. Rows and columns are defined by the same variables in the same order, and the matrices have the same elements above the diagonal as below the diagonal except that the elements are transposed. All correlation matrices or covariance matrices have to be both square and symmetric. Here is an example of a symmetric matrix:

    | 1.00   .71   .45 |
R = |  .71  1.00   .32 |
    |  .45   .32  1.00 |.

Note that (2, 1) equals (1, 2), that (3, 1) equals (1, 3), and that (3, 2) equals (2, 3).
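One way to check symmetry, sketched here under the assumption that the example correlation matrix has 1.00 in every diagonal position, is to compare a square matrix with its own transpose. The helper name `is_symmetric` is hypothetical.

```python
def transpose(matrix):
    """Turn rows into columns and columns into rows."""
    return [list(column) for column in zip(*matrix)]

def is_symmetric(matrix):
    """A matrix is symmetric if it is square and equals its transpose."""
    is_square = all(len(row) == len(matrix) for row in matrix)
    return is_square and matrix == transpose(matrix)

R = [
    [1.00, 0.71, 0.45],
    [0.71, 1.00, 0.32],
    [0.45, 0.32, 1.00],
]

print(is_symmetric(R))                 # True
print(is_symmetric([[1, 2], [3, 4]]))  # False
```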
Identity matrices. A special form of a diagonal symmetric matrix is an identity matrix. It contains 1's on the diagonal and 0's (by its definition as diagonal) everywhere else. It is designated by I. A 3 × 3 identity matrix would be

    | 1  0  0 |
I = | 0  1  0 |
    | 0  0  1 |.

The identity matrix serves the same function as the number 1; anything multiplied by I equals itself. So, if Matrix A were to be multiplied by I, then the result would be A. In other words,

AI = IA = A.
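The property AI = IA = A can be checked numerically. The sketch below relies on the row-by-column multiplication rule described later in this appendix; the 3 × 3 matrix A is made up purely for illustration, and the helper names are hypothetical.

```python
def identity(n):
    """Return an n x n identity matrix: 1's on the diagonal, 0's elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Row-by-column matrix product of a and b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Hypothetical 3 x 3 matrix used only to illustrate the property.
A = [
    [5, 3, 7],
    [6, 2, 8],
    [1, 4, 9],
]
I = identity(3)

print(matmul(A, I) == A)  # True
print(matmul(I, A) == A)  # True
```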
Matrix Operations
There are three basic matrix operations.
1. Multiplying a Matrix Times a Single Number (or scalar)
The result is that each element of the matrix is multiplied times that number. So, if Matrix B in the preceding discussion were multiplied times the number 3, then each element would be three times as large. For example, b_11 would now be 3(b_11), or 3b_11.
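A short sketch of scalar multiplication, assuming the 2 × 3 example matrix with rows (5, 3, 7) and (6, 2, 8); the function name `scalar_multiply` is illustrative only.

```python
def scalar_multiply(k, matrix):
    """Multiply every element of the matrix by the single number k."""
    return [[k * value for value in row] for row in matrix]

B = [
    [5, 3, 7],
    [6, 2, 8],
]

print(scalar_multiply(3, B))  # [[15, 9, 21], [18, 6, 24]]
```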

2. Taking the Sum of (or difference between) Two Matrices
For two matrices to be summed, those matrices have to be the same size, that is, have identical numbers of rows and columns. If they are the same size, then each will have the same number of elements. Addition and subtraction are done by combining corresponding elements of the matrices on an element-by-element basis. So, for example, imagine that we want to add together Matrices C and D, where

C = | 4  3  6 |   and   D = | 2  1  8 |
    | 1  5  7 |             | 0  9  3 |.

C + D = | 4  3  6 | + | 2  1  8 | = | (4 + 2)  (3 + 1)  (6 + 8) | = | 6   4  14 |
        | 1  5  7 |   | 0  9  3 |   | (1 + 0)  (5 + 9)  (7 + 3) |   | 1  14  10 |.

If we were subtracting D from C, then we would be doing the equivalent of multiplying each element in D by -1 (described earlier as Operation 1), which would change the sign of each of the elements of D, and then adding the corresponding elements together. So, (1, 1) would be 4 + (-2) rather than 4 + 2 when C and D are summed. For addition and subtraction, the rules are simply that (a) the matrices have to be the same size and (b) corresponding elements in the two matrices must be combined.
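The two rules for addition and subtraction can be sketched as follows, using C and D from the example. The function names are hypothetical; subtraction is written as "multiply D by -1, then add," mirroring the description above.

```python
def add(a, b):
    """Add two matrices of the same size, element by element."""
    same_size = len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))
    if not same_size:
        raise ValueError("matrices must have identical numbers of rows and columns")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def subtract(a, b):
    """Subtract b from a by multiplying b by -1 and then adding."""
    negated = [[-1 * value for value in row] for row in b]
    return add(a, negated)

C = [[4, 3, 6], [1, 5, 7]]
D = [[2, 1, 8], [0, 9, 3]]

print(add(C, D))       # [[6, 4, 14], [1, 14, 10]]
print(subtract(C, D))  # [[2, 2, -2], [1, -4, 4]]
```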
3. Multiplying Two Matrices Together
Multiplication of matrices is addressed in two steps. First, the conditions under which multiplication can be done are described. Second, the mechanics of matrix multiplication are explained.
When multiplication is possible. To be able to multiply two matrices together, the first matrix (or that which appears on the left) needs to have a number of columns equal to the number of rows of the second (or right) matrix. The rows of the first matrix and columns of the second matrix define the size of the resulting matrix. If we were trying to multiply Matrix E (r × s) times Matrix F (t × u), with E having r rows and s columns and F having t rows and u columns, then s must equal t, and the resulting matrix has dimensions of r by u. Ordering of the matrices is very important, for Matrix E times Matrix F is not the same as F times E. To multiply F times E, r would have to equal u. If they are not equal, then even though it is possible to compute EF, it is not possible to compute FE using matrix algebra. One way in which to do notation is to put the number of rows in a matrix to the left of the matrix name and the number of columns to the right, as in _rE_s _tF_u. In such notation, s and t can be compared readily.
For example, can we multiply E (3 × 2) times F (2 × 3), or _3E_2 _2F_3? Yes, for E has two columns and F has two rows, so the requirement is met. As stated, the matrix that is the product E × F will have the same number of rows as E and columns as F and will be 3 × 3, the "outside" numbers in _3E_2 _2F_3. Illustrating the differences between EF and FE is simple; FE also could be computed, but the result would be a 2 × 2 matrix. If G were substituted for F and was 2 × 4, then EG could be calculated, for the two columns of E correspond to the two rows of G. By contrast, GE cannot be computed, for G's four columns do not align with E's three rows.
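The conformability rule can be sketched as a small check on matrix orders alone. This is an illustration, not part of any SEM program: orders are written as (rows, columns) tuples, and the function name `product_order` is hypothetical.

```python
def product_order(left, right):
    """Return the order of left x right, or None if they cannot be multiplied.

    left and right are (rows, columns) tuples. The product exists only when
    the columns of the left matrix equal the rows of the right matrix; its
    order is then given by the "outside" numbers.
    """
    r, s = left
    t, u = right
    return (r, u) if s == t else None

E = (3, 2)
F = (2, 3)
G = (2, 4)

print(product_order(E, F))  # (3, 3)
print(product_order(F, E))  # (2, 2)
print(product_order(E, G))  # (3, 4)
print(product_order(G, E))  # None
```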

Computations for matrix algebra. In matrix multiplication, the row elements from the first matrix are multiplied by their corresponding column elements from the second matrix. What that means concretely can best be explained through illustration. Matrix E (3 × 2) and Matrix F (2 × 3) will be multiplied:

    | 1  2 |
E = | 2  3 |   and   F = | 3  4  2 |
    | 4  1 |             | 2  5  1 |.

For any element (i, j) of the resulting product matrix, the elements of row i from E are combined with the elements of column j from F. So,

| 1  2 |                 |  7  14  4 |
| 2  3 | | 3  4  2 |  =  | 12  23  7 |
| 4  1 | | 2  5  1 |     | 14  21  9 |.

For example, element (1, 1) in the product matrix is determined by multiplying the elements of the first row of E by the elements of the first column of F. Element (1, 1) is [(1 × 3) + (2 × 2)] = 7, where (1 × 3) is (first row, first element of first matrix) times (first column, first element of second matrix) and (2 × 2) is (first row, second element of first matrix) times (first column, second element of second matrix).

| 1  2 |                 |  7  14  4 |
| 2  3 | | 3  4  2 |  =  | 12  23  7 |
| 4  1 | | 2  5  1 |     | 14  21  9 |.

Elements of row 1 in the resulting matrix all use the first-row elements from Matrix E but combine with the corresponding values from the columns of F.
Illustrations are done for elements (2, 2) and (3, 1) of EF. Element (2, 2) uses the second row of E and the second column of F; thus, 23 = (2 × 4) + (3 × 5), where 2 is the first element of the second row of E, 4 is the first element of the second column of F, 3 is the second element of the second row of E, and 5 is the second element of the second column of F.
Element (3, 1) uses the third row of E and the first column of F; thus, 14 = (4 × 3) + (1 × 2).
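The row-by-column rule can be sketched directly in Python. The matrices below are the 3 × 2 and 2 × 3 matrices from the illustration (called E and F here); the function name `matmul` is hypothetical.

```python
def matmul(a, b):
    """Multiply a (r x s) by b (s x u): element (i, j) combines row i of a with column j of b."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first matrix must equal rows of the second")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

E = [[1, 2], [2, 3], [4, 1]]
F = [[3, 4, 2], [2, 5, 1]]

EF = matmul(E, F)
FE = matmul(F, E)
print(EF)  # [[7, 14, 4], [12, 23, 7], [14, 21, 9]]
print(FE)  # [[19, 20], [16, 20]]
```

The two products differ in both size and content, which illustrates why the ordering of the matrices matters.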

Up to this point, nothing has been said about division of matrices, and for good reason. In fact, matrix division as such is not defined. The closest thing to division is multiplying a matrix times the inverse of some matrix, where the inverse is analogous to a reciprocal of a number. The discussion of collinearity in this book centers around issues tied to invertibility, for correlation or covariance matrices with perfect collinearity have no inverse (are not invertible), and regression approaches cannot produce a valid solution.
Inverting Matrices
By definition, an inverse of a Matrix H, written as H^-1, is the matrix that, when multiplied times H, yields an identity matrix. That is, H^-1 H = I. Because I matrices always are square, only square matrices can have inverses. At the same time, as already noted, many square matrices do not have inverses. Those that do not have inverses cause problems for SEM analyses.
Specifically, if there exists a Matrix B such that A × B = I, then A is said to be nonsingular or invertible and B is the inverse of A. Similarly, B is nonsingular and A is its inverse (in this case, AB = BA = I). Because calculating inverses is complicated and tedious, details are not provided here. Most important, standard statistical packages calculate inverses in regression and factor analysis programs, and often they offer inverses and determinants, which are described next, as optional output. As is explained in the text, the diagonal elements of inverses of correlation matrices provide information about collinearity among variables.
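Although the calculation of inverses is not covered here, the definition itself is easy to verify. The sketch below assumes a 2 × 2 matrix A and a candidate inverse A_inv that were worked out separately; both are illustrative values, and exact fractions are used so the products come out to exactly 1 and 0.

```python
from fractions import Fraction

def matmul(a, b):
    """Row-by-column matrix product of a and b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Illustrative matrix and its candidate inverse (assumed, not derived here).
A = [[3, 1], [2, 5]]
A_inv = [
    [Fraction(5, 13), Fraction(-1, 13)],
    [Fraction(-2, 13), Fraction(3, 13)],
]
I = [[1, 0], [0, 1]]

print(matmul(A, A_inv) == I)  # True
print(matmul(A_inv, A) == I)  # True
```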
Determinants
A determinant is a single numerical value associated with any square matrix. The mathematics of calculating determinants is not covered in full here, for as matrices get larger, the calculations get more complex and more difficult to illustrate. Readers interested in attaining a fuller understanding should consult a book on matrices or matrix algebra (e.g., Marcus & Minc, 1964).
For the simplest case, a 2 × 2 matrix, calculation is fairly simple. Consider Matrix A:

A = | a  b |
    | c  d |.

The determinant is (a × d) - (b × c). So, if

A = | 3  1 |
    | 2  5 |,

then its determinant is (3 × 5) - (1 × 2) = 13.
For correlation matrices, the determinant will range between 1 (if all variables are totally uncorrelated) and 0 (if there is perfect collinearity among two or more variables). If a determinant is very close to 0, then there must be substantial relationships between variables, and the data should be examined to look for problems tied to collinearity.
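The 2 × 2 rule, and the way a correlation matrix's determinant shrinks toward 0 as collinearity grows, can be sketched as follows. The function name `det2` and the correlation value of .9 are illustrative assumptions; for a 2 × 2 correlation matrix the rule reduces to 1 - r^2.

```python
def det2(matrix):
    """Determinant of a 2 x 2 matrix: (a x d) - (b x c)."""
    (a, b), (c, d) = matrix
    return a * d - b * c

A = [[3, 1], [2, 5]]
print(det2(A))  # 13

# 2 x 2 correlation matrices with increasing correlation between the variables.
uncorrelated = [[1, 0], [0, 1]]
highly_correlated = [[1, 0.9], [0.9, 1]]
perfectly_collinear = [[1, 1], [1, 1]]

print(det2(uncorrelated))         # 1
print(det2(highly_correlated))    # approximately 0.19
print(det2(perfectly_collinear))  # 0
```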

Matrices and Rules
Finally, a brief reminder about how some common rules apply to matrices:
Commutative: A + B = B + A; however, AB and BA are not equal except in special circumstances
Associative: A + (B + C) = (A + B) + C; A(BC) = (AB)C
Distributive: A(B + C) = AB + AC; (B + C)A = BA + CA
Distributing Transposes: (AB)' = B'A' (note that the order of elements is reversed).
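These rules can be spot-checked numerically. The sketch below uses three small 2 × 2 matrices chosen arbitrarily for illustration; a check on one example does not prove the rules, it only shows them at work.

```python
def matmul(a, b):
    """Row-by-column matrix product of a and b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def add(a, b):
    """Element-by-element sum of two same-size matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(m):
    """Turn rows into columns and columns into rows."""
    return [list(col) for col in zip(*m)]

# Arbitrary illustrative matrices.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[1, 1], [0, 2]]

print(add(A, B) == add(B, A))                                       # True
print(matmul(A, B) == matmul(B, A))                                 # False
print(matmul(A, add(B, C)) == add(matmul(A, B), matmul(A, C)))      # True
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```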

Answers to Chapter Discussion Questions

CHAPTER 1
1. Yes, remembering that dichotomous and ordered categorical measures can fall under the class of quantitative data.
2. Yes, provided sample size issues are addressed. That is, some time series analyses and other analyses at aggregated levels can have samples too small for these methods.
3. No, moderation implies an interaction type of effect between variables. Chapter 12 will address moderating effects.
4. Equivalent models are those that predict the exact same pattern of relationships between measures. For example, Job → Success → Family is mathematically equivalent to Figure 1.2, as is Success causing both Family and Job, for they predict the relationship between Family and Job to be the product of the other two correlations. Because the models always start with theory, one should have chosen a model from among equivalent ones that matches the hypothesized relationships. That of course does not make the model correct, but it affirms plausibility of the theory.

CHAPTER 2
1. Yes, technically speaking, path analysis always uses standardized data.
2. Yes, for identified or over-identified models, multiple regression yields optimal estimates. Yes, path coefficients are partial regression coefficients.
3. Again speaking technically, no, for when data are longitudinal, covariance matrices need to be analyzed. On the other hand, if nonstandardized coefficients are examined, then regression techniques can be used to analyze longitudinal data. (Those coefficients sometimes were called path regression coefficients in the path analysis literature.)
4. Degrees of freedom for path models are determined by the number of pieces of information that are available to use for solving for path coefficients. In the same way that subjects are bits of information for many analyses, each correlation (covariance) is one piece of information for path models. Each path to be estimated "uses up" a degree of freedom, so eliminating a path "puts back" a degree of freedom in the model. Degrees of freedom provide the opportunity for models to diverge from the data, and therefore allow the possibility of model disconfirmation.
5. Under-identified models cannot be solved. If their plausibility is of interest, they need to be reconceptualized to make them identified.
6. Surprisingly, the answer is Yes, even though there are few cases in which they are appropriate, for they are far inferior to the latent variable structural equation techniques described later in this book.
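The bookkeeping described in answer 4 can be sketched as a simple count. This is a hedged illustration under stated assumptions: the pieces of information are taken to be the distinct correlations among the measured variables, and `n_free_parameters` is assumed to count every path (and any correlation among exogenous variables) that is left free. The function names are hypothetical, and a non-negative count is a necessary condition for identification rather than a guarantee of it.

```python
def path_model_df(n_variables, n_free_parameters):
    """Degrees of freedom = distinct correlations minus free parameters."""
    n_correlations = n_variables * (n_variables - 1) // 2
    return n_correlations - n_free_parameters

def identification_status(df):
    """Label a model by the sign of its degrees of freedom (counting rule only)."""
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"

# Three variables give three correlations.
print(path_model_df(3, 3))  # 0
# Eliminating one path "puts back" a degree of freedom.
print(path_model_df(3, 2))  # 1
print(identification_status(path_model_df(3, 2)))  # over-identified
```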

CHAPTER 3
1. Because path analysis is regression analysis, it analyzes correlation or covariance matrices that are used in regression analysis.
2. Partial correlation attempts to completely eliminate the relationships of the controlled variable with the remaining variables and their relationships with one another, while partial regression attempts to spread common variance across the various predictors. Partial correlation would be picked to look at residual relationships after removing some variable or variables.
3. Although path analysis approaches do not formally talk about working with partial correlation matrices, there may be instances in which, due to sample size limitations, control variables like age or gender would need to be partialed out so the sample is sufficient for SEM techniques. Remember, however, that decisions to partial need to be guided theoretically, and therefore likely should not be done if the variable to be partialed is expected to display different causal structures at different levels.
4. The signs of the nonstandardized coefficients will be the same as the standardized coefficients, and the nonstandardized values are very descriptive insofar as they describe the relationship in raw score units. For example, we could say that every year (1 raw score unit) of education "produces" an X dollar (raw score units) increase in expected annual earnings.
5. Stepwise regression can be guided by theory, and is when used for decomposing effects. If it is not, yes, it can be misleading. Remember, however, that at the last step of the regression analysis, if all variables are entered into the equation, the order of entry used does not matter; all orders of entry yield the same final outcome provided the same variables are in the equation.
6. Yes, the logic of decomposition of effects is the same across all different types of SEM techniques.
7. The matrix form is very appealing, for it works for over-identified as well as just-identified models. The other approaches work as well, requiring that coefficients that are omitted from the model take on a value of 0.
8. ANOVAs are not used, although path analysis can be used to model experimental studies. Path modeling can be particularly effective if some variable is viewed as a mediating variable. There it can be used to test plausibility of a mediation model. It also can be valuable if there are questions about the conceptual variable that is being assessed by some independent or dependent variable. It may be possible to use SEM techniques to aggregate measures into conceptual variables.

CHAPTER 5
1. The information on variability within the sample is very important, and it is lost in converting to correlations. Of course the tradeoff is that a correlation metric (ranging from -1 to 1, the meaning of r^2, etc.) is so intuitive.
2. Here, if we are talking about assumptions of regression, viz., independence of residuals, then we are thinking only about path analysis models. Other path models can allow residuals to covary, which is why it is tricky to talk about "too much non-random error" as a violation of an assumption. Whether or not assuming residuals to be independent is reasonable is a question that begins with theory but then is "tested" by data. As will be explained later, tests of fit are all based upon residuals, the part of relationships that are not accounted for by models. Sometimes, it will be obvious by looking at a correlation matrix that a hypothesized model will not fit. For example, if a set of four measures is hypothesized as assessing a single construct, but two of the measures have a correlation twice that of all the others, it is clear that a single factor model will not work well. Other times, the pattern of relationships will be more complicated, and can be examined only through looking at the residual matrix and the measures of model fit.
3. To go from standardized to nonstandardized coefficients or vice versa, it is the standard deviations, not the standard errors, that are used. Yes, there is a fairly simple conversion from one to the other that requires only dividing by (or multiplying by) a ratio of two standard deviations. There has been a controversy in the literature about the meaning of standardized versus nonstandardized coefficients and how to explain that difference. From my perspective, it is most important to think of it conceptually. Standardized coefficients describe relations in standard deviation units, whereas nonstandardized coefficients describe relations in real raw score units.
4. Decisions about relationships between methods should be driven at the data collection stage. It seems to me that the ideal answer is "no," that it would be simpler if there was no method variance. At the same time, one needs to trade off difficulty in collecting data where methods do not exert influence on answers versus simplicity in getting needed data. If one decides that the best decision is to collect data from measures that have method variance, then that variability needs to be modeled so variability can be adequately partitioned.
5. Systematically attending to method variability is clearly as important today as it was in the past. Using multiple methods is highly desirable. Yet, as will be described in Chapter 7, MTMM matrices produce problems of estimation in certain types of models, which reduces somewhat their value.
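The conversion mentioned in answer 3 can be sketched as follows. The sketch assumes the usual regression relationship in which the standardized coefficient equals the nonstandardized coefficient times the ratio of the predictor's standard deviation to the outcome's standard deviation; the function names and all numeric values (dollars per year of education, standard deviations) are made up for illustration.

```python
def standardize(b, sd_predictor, sd_outcome):
    """Convert a nonstandardized coefficient to a standardized one."""
    return b * sd_predictor / sd_outcome

def unstandardize(beta, sd_predictor, sd_outcome):
    """Convert a standardized coefficient back to raw score units."""
    return beta * sd_outcome / sd_predictor

# Hypothetical values: each year of education "produces" 2,000 dollars.
b = 2000.0             # dollars per year of education
sd_education = 3.0     # standard deviation of education, in years
sd_earnings = 20000.0  # standard deviation of earnings, in dollars

beta = standardize(b, sd_education, sd_earnings)
print(round(beta, 3))  # 0.3
print(round(unstandardize(beta, sd_education, sd_earnings), 3))  # 2000.0
```

Read in standard deviation units, the hypothetical result says that a one standard deviation difference in education goes with a .3 standard deviation difference in earnings.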

CHAPTER
1. Lag refers to the passage of time. The exact amount of time varies with the nature of the conceptual issues.
2. I suspect that I am guilty of accepting ways of talking about stability that do not fit with some other uses of terms, and, more importantly, I have not been as clear in my usage as I should have been. To clarify (hopefully): Stability as I have used it refers only to single variables. In the absolute sense, stability means absence of change in some variable across some time period. With respect to covariances, however, stability means only that the relative position of a group of individuals on some dimension does not change. For example, if all children in a class grew at a common rate, their height scores would all increase by a constant, and their heights at the first time (before growing) would correlate perfectly with their heights at the second time (after growing), and height would be perfectly stable. Said differently, height at time 1 perfectly predicts height at time 2. Finally, if a relationship between 2 variables is called stable, then their covariance should not have changed from some point in time to some later time.
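The height example can be checked with a short calculation. The heights below (in centimeters) and the common growth of 6 are invented for illustration, and the function name `correlation` is hypothetical; the point is only that adding a constant to every score changes the absolute values but leaves the correlation across time at 1.

```python
from statistics import mean

def correlation(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical heights at time 1; every child then grows by the same amount.
heights_t1 = [120, 125, 131, 140, 118]
heights_t2 = [h + 6 for h in heights_t1]

r = correlation(heights_t1, heights_t2)
print(round(r, 6))  # 1.0
```

Stability in the absolute sense is absent here (every height changed), yet stability in the covariance sense is perfect.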

References

Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317-332.
Arbuckle, J. L. (1994). AMOS: Analysis of moment structures. Psychometrika, 59, 135-137.
Arbuckle, J. L. (1997). AMOS users' guide: Version 3.6. Chicago: SPSS.
Bagozzi, R. P. (1991). On the use of SE models in experimental designs: Two extensions. International Journal of Research in Marketing, 8, 125-140.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.
Baumrind, D. (1983). Specious causal attributions in the social sciences: The reformulated stepping-stone theory of heroin use as exemplar. Journal of Personality and Social Psychology, 45, 1289-1298.
Bentler, P. M. (1989). EQS: Structural equation manual. Los Angeles: BMDP Statistical Software.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Bentler, P. M., & Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, directions. Annual Review of Psychology, 47, 563-592.
Blalock, H. M., Jr. (1964). Causal inferences in non-experimental research. Chapel Hill: University of North Carolina Press.
Bollen, K. A. (1989). A new incremental fit index for general structural equation models. Sociological Methods & Research, 17, 303-316.
Bollen, K. A. (1990). Overall fit in covariance structure models: Two types of sample size effects. Psychological Bulletin, 107, 256-259.
Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110, 305-314.
Bollen, K., & Long, J. S. (1993). Testing structural equation models. Newbury Park, CA: Sage.
Bozdogan, H. (1987). Model selection and Akaike's information criteria (AIC): The general theory and its analytical extensions. Psychometrika, 52, 345-370.
Breckler, S. J. (1990). Applications of covariance structure modeling in psychology: Cause for concern? Psychological Bulletin, 107, 260-273.

Brophy, J. E., & Good, T. L. (1974). Teacher-student relationships: Causes and consequences. New York: Holt, Rinehart & Winston.
Browne, M. W. (1984). The decomposition of multitrait-multimethod matrices. British Journal of Mathematical and Statistical Psychology, 37, 1-21.
Browne, M. W., & Arminger, G. (1995). Specification and estimation of mean- and covariance-structure models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences (pp. 185-249). New York: Plenum.
Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 445-455.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA: Sage.
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and research methods. Newbury Park, CA: Sage.
Byrne, B. M. (1989). A primer of LISREL: Basic applications and programming for confirmatory factor analysis models. New York: Springer-Verlag.
Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows: Basic concepts, applications, and programming. Thousand Oaks, CA: Sage.
Byrne, B. M., Shavelson, R. J., & Muthen, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456-466.
Byrne, D., & Griffitt, W. (1973). Interpersonal attraction. Annual Review of Psychology, 24, 317-336.
Calsyn, R. J., & Kenny, D. A. (1977). Self-concept of ability and perceived evaluation of others: Cause or effect of academic achievement? Journal of Educational Psychology, 69, 136-145.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Campbell, D. T., & O'Connell, E. J. (1967). Methods factors in multitrait-multimethod matrices: Multiplicative rather than additive? Multivariate Behavioral Research, 2, 409-426.
Cliff, N. (1983). Some cautions concerning the application of causal modelling methods. Multivariate Behavioral Research, 18, 115-126.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Cole, D. A. (1987). Utility of confirmatory factor analysis in test validation research. Journal of Consulting and Clinical Psychology, 55, 584-594.
Cole, D. A., Maxwell, S. E., Arvey, R., & Salas, E. (1993). Multivariate group comparisons of variable systems: MANOVA and structural equation modelling. Psychological Bulletin, 114, 174-184.
Cooley, W. W. (1978, October). Explanatory observational studies. Educational Researcher, pp. 9-15.
Costner, H. L. (1969). Theory, deduction, and rules of correspondence. American Journal of Sociology, 75, 245-263.
Costner, H. L., & Schoenberg, R. (1973). Diagnosing indicator ills in multiple indicator models. In A. S. Goldberger & O. D. Duncan (Eds.), Structural equation models in the social sciences (pp. 167-199). New York: Seminar Press.
Crandall, C. S. (1994). Prejudice against fat people: Ideology and self-interest. Journal of Personality and Social Psychology, 66, 882-894.
Cudeck, R. (1988). Multiplicative models and MTMM matrices. Journal of Educational Statistics, 13, 131-147.
Cudeck, R. (1989). Analysis of correlation matrices using covariance structure models. Psychological Bulletin, 105, 317-327.
Cudeck, R., & Browne, M. W. (1983). Cross-validation of covariance structures. Multivariate Behavioral Research, 18, 147-167.
Cudeck, R., & Henly, S. J. (1991). Model selection in covariance structures analysis and the "problem" of sample size: A clarification. Psychological Bulletin, 109, 512-519.
Darlington, R. B. (1978). Reduced-variance regression. Psychological Bulletin, 85, 1238-1255.
Darlington, R. B. (1990). Regression and linear models. New York: McGraw-Hill.
Duncan, O. D. (1966). Path analysis: Sociological examples. American Journal of Sociology, 72, 1-16.
Duncan, O. D. (1975). Introduction to structural equation models. New York: Academic Press.
Dunn, G., Everitt, B., & Pickles, A. (1993). Modelling covariances and latent variables using EQS. London: Chapman & Hall.
Ford, J. K., MacCallum, R. C., & Tait, M. (1986). The application of exploratory factor analysis in applied psychology: A critical review and analysis. Personnel Psychology, 39, 291-314.
Gerbing, D. W., & Hamilton, J. G. (1996). Validity of exploratory factor analysis as a precursor to confirmatory factor analysis. Structural Equation Modeling, 3, 62-72.
Goldberger, A. S. (1964). Econometric theory. New York: John Wiley.
Gordon, R. A. (1968). Issues in multiple regression. American Journal of Sociology, 73, 592-616.
Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Graham, J. W., & Donaldson, S. I. (1993). Evaluating interventions with differential attrition: The importance of nonresponse mechanisms and use of follow-up data. Journal of Applied Psychology, 78, 119-128.
Graham, J. W., Hofer, S. M., & Piccinin, A. M. (1994). Analysis with missing data in drug prevention research. In L. M. Collins & L. Seitz (Eds.), Advances in data analysis for prevention intervention research (pp. 13-53). Washington, DC: American Psychological Association.
Green, B. F. (1977). Parameter sensitivity in multivariate methods. Multivariate Behavioral Research, 12, 263-288.
Hayduk, L. A. (1996). LISREL issues, debates, and strategies. Baltimore, MD: Johns Hopkins University Press.
Hox, J. J. (1995). AMOS, EQS, and LISREL for Windows: A comparative review. Structural Equation Modeling, 2, 79-91.
Hoyle, R. H. (1995). Structural equation modeling: Concepts, issues, and applications. Thousand Oaks, CA: Sage.
Hoyle, R. H., & Panter, A. T. (1995). Writing about structural equation models. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 158-176). Thousand Oaks, CA: Sage.
Hu, L., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: Sage.
Jaccard, J., & Wan, C. K. (1995). Measurement error in the analysis of interaction effects between continuous predictors using multiple regression: Multiple indicators and structural equation models. Psychological Bulletin, 117, 348-357.
Jaccard, J., & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression (Quantitative Applications in the Social Sciences, Vol. 114). Thousand Oaks, CA: Sage.
James, L. R., Mulaik, S. A., & Brett, J. M. (1982). Causal analysis: Assumptions, models, and data. Beverly Hills, CA: Sage.
Joreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183-202.
Joreskog, K. G. (1971). Statistical analyses of sets of congeneric tests. Psychometrika, 36, 109-133.
Joreskog, K. G. (1973). A general method for estimating a linear structural equation system. In A. S. Goldberger & O. D. Duncan (Eds.), Structural equation models in the social sciences (pp. 85-112). New York: Seminar Press.
Joreskog, K. G. (1993). Testing structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 294-316). Newbury Park, CA: Sage.
Joreskog, K. G., & Sorbom, D. (1988). LISREL 7: A guide to the program and applications. Chicago: SPSS.
Joreskog, K. G., & Sorbom, D. (1993). LISREL 8: Structural equation modeling with the SIMPLIS command language. Mooresville, IN: Scientific Software.
Kaplan, D. (1990). Evaluating and modifying covariance structure models: A review and recommendation. Multivariate Behavioral Research, 25, 137-155.
Keesling, W. (1972, June). Maximum likelihood approaches to causal flow analysis. Unpublished doctoral dissertation, School of Education, University of Chicago.
Kenny, D. A. (1979). Correlation and causation. New York: John Wiley.
Kenny, D. A., & Judd, C. M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201-210.
Kenny, D. A., & Kashy, D. A. (1992). Analysis of the multitrait-multimethod matrix by confirmatory factor analysis. Psychological Bulletin, 112, 165-172.
Kerlinger, F. N., & Pedhazur, E. J. (1973). Multiple regression in behavioral research. New York: Holt, Rinehart & Winston.
Land, K. C. (1969). Principles of path analysis. In E. F. Borgatta (Ed.), Sociological methodology, 1969 (pp. 3-37). San Francisco: Jossey-Bass.
Lewis, R., & St. John, N. (1974). Contribution of cross-race friendship to minority group achievement in desegregated classrooms. Sociometry, 37, 79-91.
Liang, J., & Bollen, K. A. (1983). The structure of the Philadelphia Geriatric Center morale scale: A reinterpretation. Journal of Gerontology, 38, 181-189.
Ling, R. F. (1982). Review of "Correlation and Causation." Journal of the American Statistical Association, 77, 489-491.
Little, R. J. A., & Rubin, D. B. (1987). Statistical analysis with missing data. New York: John Wiley.
Little, R. J. A., & Rubin, D. B. (1990). The analysis of social science data with missing values. Sociological Methods & Research, 18, 292-326.
Loehlin, J. C. (1992). Latent variable models: An introduction to factor, path, and structural analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
MacCallum, R. C. (1986). Specification searches in covariance structure modelling. Psychological Bulletin, 100, 107-120.
MacCallum, R. C. (1995). Model specification: Procedures, strategies, and related issues. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 16-36). Thousand Oaks, CA: Sage.
MacCallum, R. C., & Browne, M. W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychological Bulletin, 114, 533-541.
MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1, 130-149.
MacCallum, R. C., & Mar, C. M. (1995). Distinguishing between moderator and quadratic effect in multiple regression. Psychological Bulletin, 118, 405-421.
MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological Bulletin, 111, 490-504.
MacCallum, R. C., Roznowski, M., Mar, C. M., & Reith, J. V. (1994). Alternative strategies for cross-validation of covariance structure models. Multivariate Behavioral Research, 29, 1-32.
MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The problem of equivalent models in application of covariance structure analysis. Psychological Bulletin, 114, 185-199.
Marcus, M., & Minc, H. (1964). A survey of matrix theory and matrix inequalities. Boston: Allyn & Bacon.
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391-411.
Marsh, H. W., & Byrne, B. M. (1993). Confirmatory factor analysis of MT-MM self-concept data: Between-group and within-group invariance constraints. Multivariate Behavioral Research, 28, 313-349.
Marsh, H. W., & Grayson, D. (1995). Latent variable models of multitrait-multimethod data. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 177-198). Thousand Oaks, CA: Sage.
Marsh, H. W., & Hocevar, D. (1985). Application of confirmatory factor analysis to the study of self-concept: First- and higher order factor models and their invariance across groups. Psychological Bulletin, 97, 562-582.
Maruyama, G. (1977). A causal model analysis of variables related to primary school achievement. Dissertation Abstracts International, 38, 1470B. (Doctoral dissertation, Department of Psychology, University of Southern California)
Maruyama, G. (1982). How should attributions be measured? A reanalysis of data from Elig and Frieze. American Educational Research Journal, 19, 552-558.
Maruyama, G. (1993). Models of social psychological influences in schooling. In H. J. Walberg (Ed.), Advances in educational productivity (Vol. 3, pp. 3-19). Greenwich, CT: JAI.
Maruyama, G., & McGarvey, B. (1980). Evaluating causal models: An application of maximum likelihood analysis of structural equations. Psychological Bulletin, 87, 502-512.
Maruyama, G., & Miller, N. (1979). Re-examination of normative influence processes in desegregated classrooms. American Educational Research Journal, 16, 272-283.
Maruyama, G., & Miller, N. (1980). Physical attractiveness, race, and essay evaluation. Personality and Social Psychology Bulletin, 6, 384-390.
Maruyama, G., & Miller, N. (1981). Physical attractiveness and personality. In B. Maher (Ed.), Progress in experimental personality research (Vol. 10, pp. 203-280). New York: Academic Press.
Maruyama, G., Miller, N., & Holtz, R. (1986). The relation between popularity and achievement: A longitudinal test of the lateral transmission of values hypothesis. Journal of Personality and Social Psychology, 51, 730-741.
McArdle, J. J., & Aber, M. S. (1990). Patterns of change within latent variable structural equation models. In A. von Eye (Ed.), Statistical methods in longitudinal research (Vol. 1, pp. 151-224). New York: Academic Press.
McArdle, J. J., & Hamagami, F. (1996). Multilevel models from a multiple group structural equation perspective. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 57-88). Mahwah, NJ: Lawrence Erlbaum.
McConahay, J. B. (1986). Modern racism, ambivalence, and the modern racism scale. In J. Dovidio & S. L. Gaertner (Eds.), Prejudice, discrimination, and racism: Theory and research (pp. 91-124). New York: Academic Press.
McDonald, R. P., & Marsh, H. W. (1990). Choosing a multivariate model: Noncentrality and goodness of fit. Psychological Bulletin, 107, 247-255.
McGarvey, B., Miller, N., & Maruyama, G. (1977). Scoring field dependence: A methodological comparison of five rod-and-frame scoring systems. Applied Psychological Measurement, 1, 433-446.
Mehrens, W. A., & Lehmann, I. J. (1984). Measurement and evaluation in education and psychology (3rd ed.). New York: Holt, Rinehart & Winston.
Meredith, W. (1964). Rotation to achieve factorial invariance. Psychometrika, 29, 187-206.
Miller, M. B. (1995). Coefficient alpha: A basic introduction from the perspective of classical test theory and structural equation modeling. Structural Equation Modeling, 2, 255-273.
Mulaik, S. A. (1972). The foundations of factor analysis. New York: McGraw-Hill.
Mulaik, S. A., James, L. R., Van Alstine, J., Bennett, N., Lind, S., & Stilwell, C. D. (1989). Evaluation of goodness-of-fit indices for structural equation models. Psychological Bulletin, 105, 430-445.
Muthen, B. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115-132.
Muthen, B. (1988). LISCOMP: Analysis of linear structural equations with a comprehensive measurement model. Chicago: Scientific Software.
Muthen, B. (1993). Goodness of fit with categorical and nonnormal variables. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 205-234). Newbury Park, CA: Sage.
Muthen, B. (1994). Multi-level covariance structure analysis. Sociological Methods & Research, 22, 376-398.
Muthen, B., Kaplan, D., & Hollis, M. (1987). On structural equation modeling for data that are not missing completely at random. Psychometrika, 52, 431-462.
Namboodiri, N. K., Carter, L. F., & Blalock, H. M., Jr. (1975). Applied multivariate analysis and experimental design. New York: McGraw-Hill.
Olkin, I., & Finn, J. D. (1995). Correlations redux. Psychological Bulletin, 118, 155-164.
Pelz, D. C., & Andrews, F. M. (1964). Detecting causal priorities in panel study data. American Sociological Review, 29, 836-848.
Ping, R. A. (1995). A parsimonious estimating technique for interaction and quadratic latent variables. Journal of Marketing Research, 32, 336-347.
Price, B. (1977). Ridge regression: Application to nonexperimental data. Psychological Bulletin, 84, 759-766.
Raykov, T., Tomer, A., & Nesselroade, R. J. (1991). Reporting structural equation modeling results in psychology and aging: Some proposed guidelines. Psychology and Aging, 6, 499-503.
Rigdon, E. (1995). A necessary and sufficient identification rule for structural equation models estimated in practice. Multivariate Behavioral Research, 30, 359-383.
Rindskopf, D., & Rose, T. (1988). Some theory and applications of confirmatory second-order factor analysis. Multivariate Behavioral Research, 23, 51-67.
Rogosa, D. (1980). A critique of cross-lagged correlations. Psychological Bulletin, 88, 245-258.
Rozelle, R. M., & Campbell, D. T. (1969). More plausible rival hypotheses in the cross-lagged panel correlation technique. Psychological Bulletin, 71, 74-80.
Shingles, R. D. (1976). Causal inference in cross-lagged panel analysis. Political Methodology, 3, 95-133.
Sobel, M. E., & Bohrnstedt, G. W. (1985). Use of null models in evaluating the fit of covariance structure models. In N. B. Tuma (Ed.), Sociological methodology, 1985 (pp. 152-178). San Francisco: Jossey-Bass.
Sorbom, D. (1974). A general method for studying differences in factor means and factor structures between groups. British Journal of Mathematical and Statistical Psychology, 27, 229-239.
Sorbom, D. (1982). Structural equation models with structured means. In K. G. Joreskog & H. Wold (Eds.), Systems under direct observation (pp. 183-195). Amsterdam: North Holland.
Steiger, J. H. (1989). EZPATH: A supplementary module for SYSTAT and SYGRAPH. Evanston, IL: SYSTAT.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173-180.
Stein, J. A., Smith, G. M., Guy, S. M., & Bentler, P. M. (1993). Consequences of adolescent drug use on young adult job behavior and job satisfaction. Journal of Applied Psychology, 78, 463-474.
Tanaka, J. S. (1993). Multifaceted conceptions of fit in structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 10-39). Newbury Park, CA: Sage.
Tanaka, J. S., & Huba, G. J. (1984). Confirmatory hierarchical factor analysis of psychological distress measures. Journal of Personality and Social Psychology, 46, 621-635.
Tanaka, J. S., Panter, A. T., Winbourne, W. C., & Huba, G. J. (1990). Theory testing in personality and social psychology with structural equation models: A primer in 20 questions. In C. Hendrick & M. S. Clark (Eds.), Review of personality and social psychology (Vol. 11, pp. 217-242). Newbury Park, CA: Sage.
Thurstone, L. L. (1938). Primary mental abilities. Psychometric Monographs, No. 1.
Tucker, L. R., & Lewis, C. (1973). The reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1-10.
Waller, N. G. (1993). Seven confirmatory factor analysis programs: EQS, EZPATH, LINCS, LISCOMP, LISREL 7, SIMPLIS, and CALIS. Applied Psychological Measurement, 17, 73-100.
Wiley, D. E. (1973). The identification problem for structural equation models with unmeasured variables. In A. S. Goldberger & O. D. Duncan (Eds.), Structural equation models in the social sciences (pp. 69-83). New York: Seminar Press.
Willett, J. B., & Sayer, A. G. (1994). Using covariance analyses to detect correlates and predictors of individual change over time. Psychological Bulletin, 116, 363-381.
Williams, R., & Thomson, E. (1986). Normalization issues in latent variable modeling. Sociological Methods & Research, 15, 24-43.
Wothke, W. (1987, April). Multivariate linear models of the multitrait-multimethod matrix. Paper presented at the annual meeting of the American Educational Research Association, Washington, DC.
Wright, S. (1921). Correlation and causation. Journal of Agricultural Research, 20, 557-585.
Wright, S. (1934). The method of path coefficients. Annals of Mathematical Statistics, 5, 161-215.
Yalcin, I., & Amemiya, Y. (1993). Fitting of a general non-linear factor analysis model. American Statistical Association Proceedings (Statistical Computing Section), pp. 118-122.

Author Index

Aber, M. S., 108
Akaike, H., 237, 241, 246
Amemiya, Y., 280
Arbuckle, J. L., 19, 179, 261, 283
Andrews, F. M., 120-121
Arminger, G., 259
Arvey, R., 282

Bagozzi, R. P., 3
Balla, J. R., 200, 239, 241, 244
Baron, R. M., 40, 281
Baumrind, D., 276
Bennett, N., 239-241, 243, 245
Bentler, P. M., 19, 179, 239, 240, 241, 242-245, 247-248, 281
Blalock, H. M., Jr., 17, 106-108
Bohrnstedt, G. W., 247
Bollen, K. A., 12, 81, 106, 189, 200, 238, 239, 241, 244, 256
Bonett, D. G., 240, 243-244, 247-248
Bozdogan, H., 241, 246
Breckler, S. J., 274-275
Brett, J. M., 241, 245, 247-249
Brophy, J. E., 6
Browne, M. W., 73, 81, 97, 199, 237, 241, 246, 247, 250, 259, 280
Bryk, A. S., 282
Byrne, B. M., 31, 149, 195, 259
Byrne, D., 6

Calsyn, R. J., 109
Carter, L. F., 106-108
Campbell, D. T., 92-96, 120-121, 149, 151-152
Cliff, N., 135, 139, 272-275, 278
Cohen, J., 280
Cole, D. A., 149, 282
Cooley, W. W., 3
Costner, H. L., 132, 154, 158, 279
Crandall, C. S., 89
Cudeck, R., 34, 73, 97, 118, 199, 237, 241, 246, 247, 249-250, 280

Darlington, R. B., 61, 73
Donaldson, S. I., 217
Dudgeon, P., 281
Duncan, O. D., 17, 29, 46
Dunn, G., 149, 261

Everitt, B., 149, 261

Fabrigar, L. R., 276
Finn, J. D., 71
Fiske, D. W., 92-96, 120-121, 149, 151-152
Ford, J. K., 136

Gerbing, D. W., 138
Goldberger, A. S., 62
Good, T. L., 6
Gordon, R. A., 66-70, 75
Gorsuch, R. L., 80, 132, 134
Graham, J. W., 217
Grayson, D., 153
Green, B. F., 63
Griffitt, W., 6
Guy, S. M., 281

Hamagami, F., 283
Hamilton, J. G., 138
Hayduk, L. A., 12, 284
Henly, S. J., 249-250
Hocevar, D., 256, 265
Hofer, S. M., 217
Hollis, M., 217
Holtz, R., 5, 204, 214-220
Hox, J. J., 284
Hoyle, R. H., 12, 238, 244, 245, 254, 283
Hu, L., 239, 242-245, 281
Huba, G., 81, 243

Jaccard, J., 280, 281
James, L. R., 239-241, 243, 245, 247-249
Joreskog, K. G., 19, 20, 147, 178, 179, 187-200, 246, 261, 278-280
Judd, C. M., 280

Kano, Y., 284
Kaplan, D., 200, 217
Kashy, D. A., 96, 149, 152-154
Keesling, W., 187
Kenny, D. A., 40, 85, 96, 104-105, 109, 132, 149, 152-154, 157-160, 276, 280, 281
Kerlinger, F. N., 285

Land, K. C., 18, 49
Lehmann, I. J., 80, 84
Lennox, R., 81
Lewis, C., 240, 244
Lewis, R., 203-209, 211, 220
Liang, J., 256
Lind, S., 239-241, 243, 245
Ling, R. F., 276
Little, R. J. A., 217
Loehlin, J. C., 136
Long, J. S., 200, 238, 239

MacCallum, R. C., 81, 136, 276, 278, 279, 280
Mar, C. M., 279, 280
Marcus, M., 292
Marsh, H. W., 149, 153, 200, 239, 241, 244, 245, 256, 265
Maruyama, G., 5, 6, 63, 94, 102, 113, 153, 203-220, 221, 234, 250-254, 257, 270
Maxwell, S. E., 282
McArdle, J. J., 108, 283
McConahay, J. B., 89
McDonald, R. P., 200, 239, 241, 244, 245
McGarvey, B., 5, 94, 102, 209-214, 234, 250-254, 257, 270
Mehrens, W. A., 80, 84
Meredith, W., 217
Miller, M. B., 136
Miller, N., 5, 6, 63, 94, 203-209, 214-220
Minc, H., 292
Mulaik, S. A., 132, 239-241, 243, 245, 247-249
Muthen, B., 31, 217, 259, 282, 283

Namboodiri, N. K., 106-108
Necowitz, L. B., 4, 279
Nesselroade, R. J., 283

O'Connell, E. J., 96, 149
Olkin, I., 71

Panter, A. T., 81, 238, 244, 245, 254, 283
Pedhazur, E. J., 285
Pelz, D. C., 120-121
Piccinin, A. M., 217
Pickles, A., 149, 261
Ping, R. A., 280
Price, B., 74-75

Raudenbush, S. W., 282
Raykov, T., 283
Reith, J. V., 279
Rigdon, E., 106, 190
Rindskopf, D., 256
Rogosa, D., 109, 121
Rose, T., 256
Rozelle, R. M., 120-121
Roznowski, M., 4, 279
Rubin, D. B., 217

St. John, N., 203-209, 211, 220
Salas, E., 282
Sayer, A. G., 108
Schoenberg, R., 132, 154, 158, 279
Shavelson, R. J., 31, 259
Shingles, R. D., 109, 121
Smith, G. M., 281
Sobel, M. E., 247
Sorbom, D., 19, 31, 78, 179, 259, 261
Steiger, J. H., 241, 246, 284
Stein, J. A., 281
Stilwell, C. D., 239-241, 243, 245
Sugawara, H. M., 280

Tait, M., 136
Tanaka, J. S., 81, 243, 246
Thomson, E., 258
Thurstone, L. L., 133
Tomer, A., 283
Tucker, L. R., 240, 244

Uchino, B. N., 276

Van Alstine, J., 239-241, 243, 245

Waller, N. G., 284
Wan, C. K., 280, 281
Wegener, D. T., 276
Wiley, D. E., 20, 187, 197
Willett, J. B., 108
Williams, R., 258
Winbourne, W. C., 81
Wothke, W., 152
Wright, S., 9, 15, 16

Yalcin, I., 280

Subject Index

"All-Y" models, 256-257, 267-270
Alternative models, 234-238

Collinearity, 60-70
Computer programs for SEM, 283-284
Conceptual replication, 262-263
Consistency tests, 154-161
Constraints, 147-148, 257-265
Correlations:
  confidence intervals for, 70-71
  partial, 51-52
Criticisms of SEM, 272-278

Decomposition of effects, 30, 35-48
Degrees of freedom, 48
Determinant of a matrix, 292
Direct effect, 36
Distribution free estimation, 281

Exact replication, 262-263

Factor analysis:
  confirmatory, 131, 139-147
  exploratory, 136-139
  logic of, 132-136
Finite causal lag, 100
Fit indexes, 238-246
  absolute, 229, 242-243
  parsimonious/adjusted, 242, 245-246
  relative, 239-242, 243-245

Identification, 18, 105-108
  of latent variable models, 188-192
  order condition for, 107
  rank condition for, 107
Indirect effects, 36
Instrumental variables, 105
Interaction effects, 281
Illustrations of SEM models, 187-195, 203-220

Just-identified model, 18

Lack of fit, 249-250
Least squares estimation, ordinary, 39
Longitudinal models, 108-122

Matrix:
  addition, 289
  diagonal, 282
  identity, 288
  inverse of, 291-292
  multiplication, 289-290
  nonsingular, 292
  square, 287
  symmetric, 288
  transpose, 287
Matrix algebra, 285-293
Mean differences, analysis of, 282
Measurement error, 29, 79-89
Measurement model, 178-181
Mediation, 10
Method variance, 88-92
Model modification, 278-280
Moderator variables, 281
Modification indices, 251
Model fit, 161, 163-164, 195-201
Multicollinearity, 61
Multilevel modelling, 282
Multiple population analyses, 255, 257-265
Multitrait, multimethod models, 92-97
  and confirmatory factor analysis, 148-154

Nested models, 235-246, 247-249
Noncausal effects, 36
Noncontinuous variables, 281
Nonlinear relationships, 280
Non-nested models, 246-247
Nonrandom error, 87-89
Nonrecursive models, 100-105

Overall model fit, 238-247
Over-identified model, 18

Panel analysis, 110-112
Panel models:
  analysis of, 120-122
  stability in, 112-115
  stability of causal processes, 118-120
  temporal lag in, 115-118
Path analysis, 9, 15-20, 29
Path modeling notation, 36, 37, 58
Power analysis of SEMs, 280

Random error, 84-87
Recursive model, 16
Reference indicators, 178, 181-184
Regression:
  partial, 49-50
  ridge, 61, 73-75

Second-order factor models, 256, 265-269
Specification error, 29
Structural equations, 6, 10
Structural model, 184-187

Tests:
  congeneric, 148
  parallel, 147
  tau-equivalent, 148

Under-identified model, 18

Variable:
  endogenous, 37
  exogenous, 37
  intervening, 4
  mediating, 4
  moderating, 281
Vector, 287

About the Author

Geoffrey M. Maruyama is Vice Provost for Academic Affairs in the Office of the Provost for Professional Studies at the University of Minnesota. His responsibilities include academic planning, curricular and instructional issues, graduate education, faculty issues including promotion and tenure, and research. He received his Ph.D. in psychology from the University of Southern California in 1977. He has been a faculty member in the Department of Educational Psychology since September 1976. Before his appointment as Vice Provost, he spent 10 years as director of the Human Relations Program in the Department of Educational Psychology, 3 years as director of the Center for Applied Research and Educational Improvement, and 1 year as Acting Associate Dean in the College of Education and Human Development. His academic experience also includes 9 years of active involvement in faculty governance and 4 years as lobbyist for faculty issues at the Minnesota state legislature.

Maruyama has also written another book, Research in Educational Settings (with Stan Deno), as well as 13 book chapters and more than 50 articles. His research interests cluster around (a) methodological issues including application of structural equation modeling, action research and its implications for collaborative research, and applied research methods/program evaluation; and (b) substantive issues tied to the interface of psychology and education, including school reform, school achievement processes, and effective educational techniques for diverse schools.
