
MEASURES OF VARIATION, SKEWNESS, AND KURTOSIS
5.1 INTRODUCTION

Of great concern to the statistician is the variation in the events of nature. The variation of one measurement from another is a persisting characteristic of any sample of measurements. Measurements of intelligence, eye color, reaction time, and skin resistance, for example, exhibit variation in any sample of individuals. Anthropometric measurements such as height, weight, diameter of the skull, length of the forearm, and angular separation of the metatarsals show variation between individuals. Anatomical and physiological measurements vary; so also do the measurements made by the physicist, chemist, botanist, and agronomist. Statistics has been spoken of as the study of variation. Fisher (1970) has observed,
The conception of statistics as the study of variation is the natural outcome of viewing the subject as the study of populations; for a population of individuals in all respects identical is completely described by a description of any one individual, together with the number in the group. The populations which are the object of statistical study always display variation in one or more respects.
The experimental scientist is frequently concerned with the different circumstances, conditions, or sources which contribute to the variation in the measurements he or she obtains. The analysis of variance (Chapter 15) developed by Fisher is an important statistical procedure whereby the variation in a set of experimental data can be partitioned into components which frequently may be attributed to different causal circumstances.

How may the variation in any set of measurements be described?
Consider the following measurements for two samples:

Sample A   10  12  15  18  20
Sample B    2   8  15  22  28

We note that the two samples have the same mean, namely, 15. Simple inspection indicates, however, that the measurements in sample B are more variable than those in sample A; they differ more one from another. Among the possible measures used to describe this variation are the range, the mean deviation, and the standard deviation. The most important of these is the standard deviation.
5.2 THE RANGE

The range is the simplest measure of variation. In any sample of measurements the range is taken as the difference between the largest and smallest measurements. The range for the measurements 10, 12, 15, 18, and 20 is 20 minus 10, or 10. The range for the measurements 2, 8, 15, 22, and 28 is 28 minus 2, or 26. The measurements in the second set quite clearly exhibit greater variation than those in the first set, and this reflects itself in a much greater range.

The range has two disadvantages. First, for large samples it is an unstable descriptive measure. The sampling variance of the range for small samples is not much greater than that of the standard deviation but increases rapidly with increase in $N$. Second, the range is not independent of sample size, except under special circumstances. For distributions that taper to 0 at the extremities a better chance exists of obtaining extreme values for large than for small samples. Consequently, ranges calculated on samples composed of different numbers of cases are not directly comparable. Despite these disadvantages the range may be effectively used in the application of tests of significance with small samples.
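As a minimal sketch of the calculation just described, assuming Python and a hypothetical helper named `sample_range` (neither is part of the original text), the two ranges can be checked as follows:

```python
def sample_range(values):
    """Return the difference between the largest and smallest measurement."""
    return max(values) - min(values)


# The two sets of measurements discussed in this section.
sample_a = [10, 12, 15, 18, 20]
sample_b = [2, 8, 15, 22, 28]

range_a = sample_range(sample_a)  # 20 minus 10
range_b = sample_range(sample_b)  # 28 minus 2
print(range_a, range_b)
```

Only the two extreme values enter the calculation, which is one way of seeing why the range is an unstable descriptive measure.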
5.3 THE MEAN DEVIATION

Consider the following measurements:

Sample A   8  8   8   8   8
Sample B   1  4   7  10  13
Sample C   1  5  20  25  29

Intuitively, the measurements in sample A are less variable than those in B, which in turn are less variable than those in C. Indeed, the measurements in A exhibit no variation at all. The means of the three samples are 8, 7, and 16. If we express the measurements as deviations from their sample means, we obtain

Sample A     0    0   0   0    0
Sample B    -6   -3   0  +3   +6
Sample C   -15  -11  +4  +9  +13

Inspection of these numbers suggests that as variation increases, the departure of the observations from their sample mean increases. We may use this characteristic to define a measure of variation. One such measure is the mean deviation. The mean deviation is the arithmetic mean of the absolute deviations from the arithmetic mean. An absolute deviation is a deviation without regard to algebraic sign. To obtain the mean deviation we simply calculate the deviations from the arithmetic mean, sum these, disregarding algebraic sign, and divide by $N$. For sample A above, the mean deviation is 0. For sample B the mean deviation is (6 + 3 + 0 + 3 + 6)/5 = 18/5 = 3.6. For sample C the mean deviation is (15 + 11 + 4 + 9 + 13)/5 = 52/5 = 10.4.
The mean deviation is given in algebraic language by the formula

$$\mathrm{MD} = \frac{\sum |X - \bar{X}|}{N} \qquad [5.1]$$

Here $X - \bar{X}$ is a deviation from the mean and $|X - \bar{X}|$ is a deviation without regard to algebraic sign. The vertical bars mean that signs are ignored.
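The formula for the mean deviation can be sketched in a few lines of Python. The function name `mean_deviation` is our own label for illustration, and the three samples are those considered in this section:

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() plays the role of the vertical bars: algebraic signs are ignored.
    return sum(abs(x - mean) for x in values) / n


md_a = mean_deviation([8, 8, 8, 8, 8])      # no variation at all
md_b = mean_deviation([1, 4, 7, 10, 13])    # (6 + 3 + 0 + 3 + 6) / 5
md_c = mean_deviation([1, 5, 20, 25, 29])   # (15 + 11 + 4 + 9 + 13) / 5
print(md_a, md_b, md_c)
```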

Hitherto, symbols above and below the summation sign $\sum$ have been used to indicate the limits of the summation. In the above formula for the mean deviation these symbols have been omitted, the summation being clearly understood to extend over the $N$ members in the sample. In this and subsequent chapters symbols indicating the limits of summation will, for convenience, be omitted where these are understood clearly from the context to extend over $N$ sample members. Where any possibility of doubt could exist, the symbols above and below the summation sign will be inserted.

The mean deviation is infrequently used. It is not readily amenable to algebraic manipulation. This circumstance stems from the use of absolute values. In general, in statistical work the use of absolute values should be avoided, if at all possible.

The mean deviation is discussed here primarily for pedagogic reasons. It illustrates how one particular measure of variation may be defined.
5.4 THE SAMPLE VARIANCE AND STANDARD DEVIATION

Some of the deviations about the mean are positive; others are negative. The sum of deviations is 0. One method for dealing with the presence of negative signs is to use the absolute deviations, as in the calculation of the
mean deviation as defined in formula [5.1]. This procedure has little to recommend it. An alternative and generally preferable procedure is to square the deviations about the mean, sum these squares, and use this sum of squares in the definition of a measure of variation. For example, the mean of the measurements 1, 4, 7, 10, and 13 is 7. The deviations from the mean are -6, -3, 0, +3, and +6. The squares of these deviations are 36, 9, 0, 9, and 36. The sum of squares is 90.
The sum of squares of deviations about the mean is used in the definition of the variance. Two definitions are in common use. One method defines the variance by dividing the sum of squares of deviations about the mean by $N$, the number of cases. Denote this statistic by $s^2$. Thus

$$s^2 = \frac{\sum (X - \bar{X})^2}{N} \qquad [5.2]$$

In the illustrative example in the paragraph above, the sum of squares of deviations about the mean was 90, and the variance, according to formula [5.2], is $s^2 = 90/5 = 18$.

An alternative method of defining the variance is to divide the sum of squares by $N - 1$ rather than $N$. Thus, according to this definition, the variance $s^2$ is given by

$$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1} \qquad [5.3]$$
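The two definitions can be compared directly. The sketch below assumes Python; the function `variance` and its `unbiased` flag are hypothetical names chosen for illustration, while `statistics.pvariance` and `statistics.variance` are the standard-library counterparts of formulas [5.2] and [5.3] respectively.

```python
import statistics


def variance(values, unbiased=True):
    """Sum of squared deviations about the mean, divided by N - 1 or by N."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return sum_of_squares / (n - 1) if unbiased else sum_of_squares / n


data = [1, 4, 7, 10, 13]                    # sum of squares about the mean is 90
var_div_n = variance(data, unbiased=False)  # formula [5.2]: 90 / 5
var_div_n_minus_1 = variance(data)          # formula [5.3]: 90 / 4
print(var_div_n, var_div_n_minus_1)

# Cross-check against the standard library.
print(statistics.pvariance(data), statistics.variance(data))
```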

Both formulas, [5.2] and [5.3], provide alternative definitions of the variance. These formulas have no derivation but are obtained by a process of plausible reasoning. The reader will note that no notational distinction is made here between the variance defined by formula [5.2] and that defined by formula [5.3].
What is the essential difference between the variance as defined by formula [5.2] and the variance as defined by formula [5.3]? To understand the answer to this question the reader should recall that in Chapter 1 a distinction was made between sample values, or estimates, and population values, or parameters. Both formulas, [5.2] and [5.3], provide estimates of a population variance $\sigma^2$. For certain algebraic reasons when we divide $\sum (X - \bar{X})^2$ by $N$ we obtain a biased estimate of $\sigma^2$. This estimate will show a systematic tendency to be less than $\sigma^2$. It is biased. When we divide $\sum (X - \bar{X})^2$ by $N - 1$, however, we obtain an unbiased estimate of $\sigma^2$. Such an estimate will show no systematic tendency to be either greater than or less than $\sigma^2$.
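The bias can be made concrete with a toy case that is not in the original text. As an assumption for illustration only, take a population consisting of the three values 1, 2, and 3, and list every possible ordered sample of size 2 drawn with replacement. Averaging each estimate over all nine samples shows what its systematic tendency is. The helper `sum_sq` is a hypothetical name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product


def sum_sq(sample):
    """Sum of squared deviations about the sample mean, as an exact fraction."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample)


# Hypothetical toy population; its variance divides by the population size.
population = [1, 2, 3]
pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

n = 2
samples = list(product(population, repeat=n))  # all 9 ordered samples

# Average value of each estimate over every possible sample.
avg_div_n = sum(sum_sq(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq(s) / (n - 1) for s in samples) / len(samples)

print(pop_var, avg_div_n, avg_div_n_minus_1)
```

In this toy case the estimate that divides by $N$ averages to half the population variance, while the estimate that divides by $N - 1$ averages to the population variance exactly.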
In all situations where an estimate of a population variance $\sigma^2$ is required, the statistic $s^2$ which divides the sum of squares by $N - 1$ and not $N$ should be used. In the great majority of situations discussed in this book an unbiased estimate is required. In some situations involving descriptive statistics, convenience and simplicity dictate the use of an estimate which divides the sum of squares by $N$ and not $N - 1$. In general in this book the symbol $s^2$ will be used to refer to the unbiased estimate obtained by dividing the sum of squares by $N - 1$, as in formula [5.3]. In a few situations the definition of formula [5.2] will be used. All such instances will be clearly specified in the text.
The reader should note that a population variance is defined as $\sigma^2 = \sum (X - \mu)^2 / N_p$, where $\mu$ is the population mean and $N_p$ is the number of members in the population. Thus in any situation where the variance of a complete population is required we divide the sum of squares by the number of members in the population.
While $N$ is the number of measurements or observations, the quantity $N - 1$ is the number of deviations about the mean that are free to vary. To illustrate, consider the measurements 7, 8, and 15. The mean is 10, and the deviations about the mean are -3, -2, +5. The sum of deviations about the mean is 0; that is, (-3) + (-2) + (+5) = 0. Because this is so, if any two of the deviations are known, the third deviation is fixed. It cannot vary. In this example, the sum of squares of deviations about the mean is 9 + 4 + 25 = 38. Although this sum of squares is obtained by adding together three squared deviations, only two of these squared deviations can exhibit freedom of variation. The number of values that are free to vary is called the number of degrees of freedom. A quantity of the kind $\sum (X - \bar{X})^2$ is said to have associated with it $N - 1$ degrees of freedom, because $N - 1$ of the $N$ squared deviations of which it is composed can vary. Some intuitive plausibility attaches to the idea that in the definition of a measure of variation we should divide the sum of squares by the number of values that can exhibit freedom of variation. The concept of degrees of freedom is a very useful and general concept in statistics and is elaborated in more detail later in this book.
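The constraint that fixes the last deviation can be shown in a short sketch. It assumes Python, and `last_deviation` is a hypothetical helper introduced only for this illustration.

```python
def last_deviation(known_deviations):
    """Deviations about the mean sum to 0, so the remaining one is fixed."""
    return -sum(known_deviations)


values = [7, 8, 15]
mean = sum(values) / len(values)         # 10
deviations = [x - mean for x in values]  # -3, -2, +5

# Knowing any two deviations determines the third.
third = last_deviation(deviations[:2])

degrees_of_freedom = len(values) - 1     # N - 1 deviations are free to vary
sum_of_squares = sum(d ** 2 for d in deviations)
print(third, degrees_of_freedom, sum_of_squares)
```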
In the above discussion the definition of the variance evolves initially from a consideration of deviations about the arithmetic mean. An alternative, and perhaps more elemental, approach is to begin by considering the differences between each value and every other value. With two measurements only, $X_1$ and $X_2$, we may consider the difference between them, $X_1 - X_2$. With three measurements, $X_1$, $X_2$, and $X_3$, we may consider the differences $X_1 - X_2$, $X_1 - X_3$, and $X_2 - X_3$. In general, for $N$ measurements the number of such differences is $N(N - 1)/2$. To illustrate, for the measurements 1, 4, 7, 10, and 13, the differences between each measurement and every other measurement are -3, -6, -9, -12, -3, -6, -9, -3, -6, and -3. Note that the sign of the difference depends on the order of the measurements. If we obtain the sum of squares of the differences between each measurement and every other measurement and divide by the number of such differences, the result is closely related to $s^2$; in fact it is simply twice $s^2$. In our example the sum of squares of differences is 450. We divide this by 10 to obtain 45.0, which is seen to be twice the variance, 22.5, as calculated by formula [5.3]. In general, in algebraic notation it may be shown that

$$s^2 = \frac{1}{2} \cdot \frac{\sum (X_i - X_j)^2}{N(N - 1)/2} \qquad [5.4]$$

where the summation is understood to extend over $N(N - 1)/2$ differences. This result means that $s^2$ is a descriptive index of how different each value is from every other value; in fact it is an average of the squared differences divided by 2.
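The relation between the variance and the pairwise differences is easy to verify numerically. The sketch below assumes Python; `variance_from_pairs` is a hypothetical name, and `itertools.combinations` is used to list each pair of measurements once.

```python
import statistics
from itertools import combinations


def variance_from_pairs(values):
    """Half the average squared difference between every pair of values."""
    pairs = list(combinations(values, 2))      # N(N - 1)/2 pairs
    sum_sq_diff = sum((a - b) ** 2 for a, b in pairs)
    return sum_sq_diff / len(pairs) / 2


data = [1, 4, 7, 10, 13]
n_pairs = len(list(combinations(data, 2)))
sum_sq_diff = sum((a - b) ** 2 for a, b in combinations(data, 2))
s2_pairs = variance_from_pairs(data)

print(n_pairs, sum_sq_diff, s2_pairs, statistics.variance(data))
```

Squaring removes the sign of each difference, so the order in which the two members of a pair are taken does not affect the result.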
The variance is a statistic in squared units. If $X - \bar{X}$ is a deviation in feet, then $(X - \bar{X})^2$ is a deviation in feet squared. For many purposes it is desirable to use a measure of variation which is not in squared units but in units of the original measurements themselves. We obtain this result by taking the square root of either formula [5.2] or formula [5.3]. This statistic is called the standard deviation. Thus

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \qquad [5.5]$$

or

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}} \qquad [5.6]$$

5.5 AN ILLUSTRATIVE APPLICATION

Our understanding of the nature of the variance and the standard deviation will be enriched by considering illustrative situations where these statistics are of interest. Consider a simple experiment designed to investigate the effect of a drug on a cognitive task such as coding. An experimental group of subjects, who receive the drug, and a control group, who do not receive the drug, are used. Each group contains 10 subjects. Let us assume the scores on the coding task for the two groups are as follows:

Experimental    5   7  17  31  45  47  68  85  96  99
Control        29  36  37  42  49  58  62  63  69  70

The mean score for the experimental group is 50.0, and that for the control, 51.5. The investigator might be led to conclude from inspecting these means that the drug had little or no effect on the performance of the subjects. The standard deviations for the two groups are, respectively, 35.63 and 14.86, the experimental group being much more variable in performance than the control group. Quite clearly the treatment appears to be exerting a substantial influence on the variation in performance, although its influence on level of performance is negligible. In the analysis of experimental data the investigator must attend to, and if possible interpret, differences in the standard deviation, or variance, as well as differences in the arithmetic mean.
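The figures quoted above can be reproduced with Python's standard `statistics` module, whose `stdev` function divides by $N - 1$. The twenty scores listed in the code are an assumption: several digits are hard to read in the table, and the values used here are the ones that reproduce the stated means and standard deviations.

```python
import statistics

# Coding-task scores (assumed reading of the table; see the note above).
experimental = [5, 7, 17, 31, 45, 47, 68, 85, 96, 99]
control = [29, 36, 37, 42, 49, 58, 62, 63, 69, 70]

mean_exp = statistics.mean(experimental)
mean_ctl = statistics.mean(control)
sd_exp = statistics.stdev(experimental)  # divides the sum of squares by N - 1
sd_ctl = statistics.stdev(control)

print(f"means: {mean_exp:.1f} vs {mean_ctl:.1f}")
print(f"standard deviations: {sd_exp:.2f} vs {sd_ctl:.2f}")
```

The means are nearly equal while the standard deviations differ by a factor of more than two, which is the point of the illustration.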
5.6 CALCULATING THE SAMPLE VARIANCE AND THE STANDARD DEVIATION FROM UNGROUPED DATA

For purposes of calculation, it is convenient to write the variance and the standard deviation in a different form. The variance may be written as

$$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{\sum (X^2 + \bar{X}^2 - 2X\bar{X})}{N - 1} = \frac{\sum X^2 + N\bar{X}^2 - 2N\bar{X}^2}{N - 1} = \frac{\sum X^2 - N\bar{X}^2}{N - 1}$$

In this derivation note that the summation of $\bar{X}^2$ over $N$ is simply $N\bar{X}^2$; also the summation of $2X\bar{X}$ is $2\bar{X}\sum X = 2N\bar{X}^2$, since $\sum X = N\bar{X}$. The standard deviation is given by

$$s = \sqrt{\frac{\sum X^2 - N\bar{X}^2}{N - 1}}$$

Thus to calculate the standard deviation using this formula, we sum the squares of the original observations, subtract from this $N$ times the square of the arithmetic mean, divide by $N - 1$, and then take the square root. For example, the five observations 1, 4, 7, 10, and 13 have a mean of 7. The squares of these observations are 1, 16, 49, 100, and 169. The sum of these squared observations is 335. The variance is then


$$s^2 = \frac{\sum X^2 - N\bar{X}^2}{N - 1} = \frac{335 - 5 \times 7^2}{5 - 1} = 22.50$$

and the standard deviation is $\sqrt{22.50} = 4.74$.

An alternative formula for the standard deviation which avoids the calculation of the arithmetic mean and may, therefore, be useful for certain computational purposes is

$$s = \sqrt{\frac{N\sum X^2 - (\sum X)^2}{N(N - 1)}} \qquad [5.7]$$

This formula requires one operation of division only.
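Both computational forms can be sketched in Python. The function names `sd_from_mean` and `sd_single_division` are hypothetical labels for the formula based on the mean and for formula [5.7] respectively.

```python
import math


def sd_from_mean(values):
    """sqrt((sum of X^2 - N * mean^2) / (N - 1))."""
    n = len(values)
    mean = sum(values) / n
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - n * mean ** 2) / (n - 1))


def sd_single_division(values):
    """Formula [5.7]: sqrt((N * sum of X^2 - (sum of X)^2) / (N (N - 1)))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


data = [1, 4, 7, 10, 13]
sum_of_squared_observations = sum(x * x for x in data)
sd_a = sd_from_mean(data)
sd_b = sd_single_division(data)
print(sum_of_squared_observations, round(sd_a, 2), round(sd_b, 2))
```

A caution that goes beyond the text: when the mean is very large relative to the spread, subtracting two nearly equal sums can lose precision in floating-point arithmetic, so these forms suit hand calculation better than general-purpose software.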
5.7 THE EFFECT ON THE STANDARD DEVIATION OF ADDING OR MULTIPLYING BY A CONSTANT

If a constant is added to all the observations in a sample, the standard deviation remains unchanged. An examiner may conclude, for example, that an examination is too difficult. He may decide to add 10 points to all
the marks assigned. The standard deviation of the original marks will be the same as the standard deviation of marks with the 10 points added. This result follows directly from the fact that if $X$ is an observation, the corresponding observation with the constant $c$ added is $X + c$. If $\bar{X}$ is the mean of the original observations, the mean with the constant added is $\bar{X} + c$. A deviation from the mean of the observations with the constant added is then $(X + c) - (\bar{X} + c)$, which is readily observed to be equal to $X - \bar{X}$. Since the deviations about the mean are unchanged by the addition of a constant, the standard deviation will remain unchanged. To illustrate, by adding a constant, say, 5, to the measurements 1, 4, 7, 10, and 13, we obtain 6, 9, 12, 15, and 18. The mean of the original measurements is 7, and the mean of the measurements with the constant added is 7 + 5, or 12. The deviations from the mean are in both instances the same, namely, -6, -3, 0, +3, and +6. The standard deviation in both instances is 4.74.
If all measurements in a sample are multiplied by a constant, the standard deviation is also multiplied by the absolute value of that constant. If the standard deviation of examination marks is 4 and all marks are multiplied by the constant 3, then the standard deviation of the resulting marks is 3 x 4 = 12. To demonstrate this result, we observe that if $\bar{X}$ is the mean of a sample of measurements, the mean of the measurements multiplied by $c$ is $c\bar{X}$. A deviation from the mean is then $cX - c\bar{X} = c(X - \bar{X})$. By squaring, summing over $N$ observations, and dividing by $N - 1$, we obtain

$$\frac{\sum (cX - c\bar{X})^2}{N - 1} = \frac{c^2 \sum (X - \bar{X})^2}{N - 1} = c^2 s^2$$

Thus if all measurements are multiplied by a constant $c$, the variance is multiplied by $c^2$ and the standard deviation by the absolute value of $c$. If $c$ is a negative number, say, -3, $s$ is multiplied by the absolute value 3. By way of illustration, the measurements 1, 4, 7, 10, 13 have a mean of 7, a variance of 22.50, and a standard deviation of 4.74. If the measurements are multiplied by the constant 5, we obtain 5, 20, 35, 50, 65. The mean is now 5 x 7, or 35. The deviations from the mean are -30, -15, 0, +15, +30. Squaring these we obtain 900, 225, 0, 225, 900. The sum of squares is 2,250, the variance is 562.50, and the standard deviation is 23.72, whereas 5 times the original standard deviation of 4.74 is 23.70. The slight discrepancy results from the rounding of decimals.
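Both properties can be checked numerically. This is a minimal sketch assuming Python's `statistics.stdev`, which divides by $N - 1$; the variable names are our own.

```python
import math
import statistics

marks = [1, 4, 7, 10, 13]
sd = statistics.stdev(marks)

# Adding a constant leaves every deviation, and so the standard deviation, unchanged.
sd_shifted = statistics.stdev([x + 5 for x in marks])

# Multiplying by a constant c multiplies the standard deviation by |c|.
sd_scaled = statistics.stdev([5 * x for x in marks])
sd_negative = statistics.stdev([-3 * x for x in marks])

print(round(sd, 2), round(sd_shifted, 2), round(sd_scaled, 2), round(sd_negative, 2))
print(math.isclose(sd_scaled, 5 * sd), math.isclose(sd_negative, 3 * sd))
```

Carried at full precision, 5 times the original standard deviation agrees with the standard deviation of the multiplied scores; the 23.70 versus 23.72 discrepancy arises only from rounding 4.74 first.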
5.8 STANDARD SCORES

Hitherto we have considered scores or measurements in the form in which they were originally obtained. Such scores are represented by the symbol $X$ with mean $\bar{X}$ and standard deviation $s$. Such scores in their original
form are spoken of as raw scores. We have also considered deviations about the arithmetic mean, $x = X - \bar{X}$. These are known as deviation scores and have a mean of 0 and a standard deviation of $s$. If now we divide the deviation about the mean by the standard deviation, we obtain what is called a standard score, represented by the symbol $z$. Thus

$$z = \frac{X - \bar{X}}{s} = \frac{x}{s}$$

Standard scores have a mean of 0 and a standard deviation of 1. As previously shown, if all measurements in a sample are multiplied by a constant, the standard deviation is also multiplied by the absolute value of that constant. Deviation scores, $x = X - \bar{X}$, have a standard deviation $s$. Each score has a constant $-\bar{X}$ added. This leaves $s$ unchanged. If all the deviation scores are divided by $s$, which is the same thing as multiplying by the constant $1/s$, the standard deviation of the scores thus obtained is $s/s = 1$. To illustrate, the following observations have been expressed in raw-score, deviation-score, and standard-score form.

Individual      X       x       z
A               3      -7   -1.11
B               6      -4    -.63
C               7      -3    -.47
D               9      -1    -.16
E              15       5     .79
F              20      10    1.58
Sum            60     .00     .00
Mean           10     .00     .00
s            6.32    6.32    1.00

Because standard scores have zero mean and unit standard deviation, they are readily amenable to certain forms of algebraic manipulation. Many formulations can be derived more conveniently using standard scores than using raw or deviation scores.

The use of standard scores means, in effect, that we are using the standard deviation as the unit of measurement. In the above example individual A is 1.11 standard deviations, or standard deviation units, below the mean, while individual F is 1.58 standard deviation units above the mean.

Standard scores are frequently used to obtain comparability of observations obtained by different procedures. Consider examinations in English and mathematics applied to the same group of individuals, and assume the means and standard deviations to be as follows:

Examination    Mean    Standard deviation
English          65     8
Mathematics      52    12

In effect, in relation to the performance of the individuals in the group, a score of 65 on the English examination is the equivalent of a score of 52 on the mathematics examination. To illustrate, a score one standard deviation above the mean, that is, 65 + 8, or 73, on the English examination can be considered to be the equivalent of a score one standard deviation above the mean, that is, 52 + 12, or 64, on the mathematics examination. If an individual makes a score of 57 on the English examination and a score of 58 on the mathematics examination, we may compare his relative performance on the two subjects by comparing his standard scores. On English his standard score is (57 - 65)/8 = -1.0, and on mathematics his standard score is (58 - 52)/12 = .5. Thus on English his performance is one standard deviation unit below the average, while on mathematics his performance is .5 standard deviation unit above the average. Quite clearly, this individual did much more poorly in English than in mathematics relative to the performance of the group of individuals taking the examinations, although this is not reflected in the original marks assigned. To attain rigorous comparability of scores, the distributions of scores on the two tests should be identical in shape. The meaning of this statement will become clear as we proceed.
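The reader should note that the sum of squares of standard scores, $\sum z^2$, is equal to $N - 1$. We observe that $z^2 = (X - \bar{X})^2 / s^2$; hence

$$\sum z^2 = \frac{\sum (X - \bar{X})^2}{s^2} = \frac{\sum (X - \bar{X})^2}{\sum (X - \bar{X})^2 / (N - 1)} = N - 1$$

The reader should note that where the standard deviation is defined as $\sqrt{\sum (X - \bar{X})^2 / N}$, the sum of squares of standard scores is $N$ and not $N - 1$.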
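The properties of standard scores stated in this section can be checked with a short sketch. It assumes Python, a hypothetical helper `standard_scores`, and six raw scores (3, 6, 7, 9, 15, 20) taken as an assumed reading of the table of individuals A to F; `statistics.stdev` divides by $N - 1$.

```python
import math
import statistics


def standard_scores(values):
    """Convert raw scores to z scores using the N - 1 standard deviation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - mean) / s for x in values]


raw = [3, 6, 7, 9, 15, 20]  # assumed raw scores for individuals A to F
z = standard_scores(raw)

z_sum = sum(z)                      # mean of z is 0, so the sum is 0
z_sum_sq = sum(v * v for v in z)    # equals N - 1
z_sd = statistics.stdev(z)          # equals 1

# The examination comparison: raw score, group mean, group standard deviation.
z_english = (57 - 65) / 8
z_math = (58 - 52) / 12

print([round(v, 2) for v in z])
print(round(z_sum_sq, 6), round(z_sd, 6), z_english, z_math)
```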
5.9 ADVANTAGES OF THE VARIANCE AND STANDARD DEVIATION AS MEASURES OF VARIATION

The variance and standard deviation have many advantages over other measures of variation. Much statistical work involves their use. The variance has certain additive properties and may on occasion be partitioned into additive components, each of which may be related to some causal circumstance. The sample standard deviation is a more stable or accurate estimate of the population parameter than other measures of variation. Under certain assumptions it provides a more stable estimate of the standard deviation in the population than the sample mean deviation, for example, does of the mean deviation in the population. The variance and standard deviation are more amenable to mathematical manipulation than other measures. They enter into formulas for the computation of many types of statistics. They are widely used as measures of error. In later discussion on sampling statistics the reader will observe that the standard error is in effect the standard deviation of errors made in estimating population parameters from sample values. These errors result from the operation of
chance factors in random sampling. A full appreciation of the importance and meaning of the variance and standard deviation in their many ramifications requires considerable familiarity with statistical ideas.

5.10 MOMENTS ABOUT THE MEAN

The mean and the standard deviation are closely related to a family of descriptive statistics known as moments. The first four moments about the arithmetic mean are as follows:

$$m_1 = \frac{\sum (X - \bar{X})}{N} = 0$$

$$m_2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{N - 1}{N}\, s^2$$

$$m_3 = \frac{\sum (X - \bar{X})^3}{N}$$

$$m_4 = \frac{\sum (X - \bar{X})^4}{N} \qquad [5.8]$$

In general, the $r$th moment about the mean is given by

$$m_r = \frac{\sum (X - \bar{X})^r}{N} \qquad [5.9]$$

The term "moment" originates in mechanics. Consider a lever supported by a fulcrum. If a force $f_1$ is applied to the lever at a distance $x_1$ from the origin, then $f_1 x_1$ is called the moment of the force. Further, if a second force $f_2$ is applied at a distance $x_2$, the total moment is $f_1 x_1 + f_2 x_2$. If we square the distances $x$, we obtain the second moment; if we cube them, we obtain the third moment; and so on. When we come to consider frequency distributions, the origin is the analog of the fulcrum and the frequencies in the various class intervals are analogous to forces operating at various distances from the origin. Observe that the first moment about the mean is 0 and the second moment is $(N - 1)/N$ times the unbiased sample variance. The third moment is used to obtain a measure of skewness, and the fourth moment, a measure of kurtosis.
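The general formula for the $r$th moment translates directly into code. The sketch below assumes Python; `moment` is a hypothetical helper name, and the data are the five measurements used in earlier sections.

```python
import statistics


def moment(values, r):
    """r-th moment about the arithmetic mean: sum((X - mean)^r) / N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


data = [1, 4, 7, 10, 13]
n = len(data)

m1 = moment(data, 1)  # always 0
m2 = moment(data, 2)  # sum of squares divided by N
m3 = moment(data, 3)  # 0 here because the set is symmetrical
m4 = moment(data, 4)

# The second moment is (N - 1)/N times the unbiased sample variance.
unbiased_variance = statistics.variance(data)
print(m1, m2, m3, m4, (n - 1) / n * unbiased_variance)
```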

5.11 MEASURES OF SKEWNESS AND KURTOSIS

The commonly used measure of skewness makes use of the third moment and is defined as

$$g_1 = \frac{m_3}{m_2 \sqrt{m_2}} \qquad [5.10]$$

The rationale for this statistic is based on the observation that when a distribution (or any set of numbers) is symmetrical, the sum of deviations above the mean, when raised to the third power, will balance the sum of deviations below the mean, when raised to the third power. Thus for a symmetrical distribution, $m_3 = 0$, and $g_1 = 0$. If the distribution is asymmetrical, the sums of deviations above and below the mean, when raised to the third power, will not balance. Thus for an asymmetrical distribution $m_3 \neq 0$ and $g_1 \neq 0$. If the distribution, or set of numbers, is positively skewed, $g_1$ is positive; when negatively skewed $g_1$ is negative. The quantity $m_2 \sqrt{m_2}$ is introduced in order to ensure that $g_1$ is comparable for distributions that differ in variability. Thus $g_1$ is independent of the scale of measurement. The skewness of a set of measurements in grams, meters, pounds, or units of score on a psychological test can be directly compared using $g_1$. The reader will recall that a standard score is defined as $z = (X - \bar{X})/s$. One reason for using standard scores is to achieve comparability of scores from one set of measurements to another. The use of $m_2 \sqrt{m_2}$ in the definition of $g_1$ is directly analogous to the use of $s$ in the definition of a standard score.
As an illustration of $g_1$, consider two sets of numbers, A and B:

A    6   8  10  12  14
B    6   8  10  11  15

These numbers expressed as deviations from the mean become

A   -4  -2   0  +2  +4
B   -4  -2   0  +1  +5

Set A is a symmetrical set of numbers, and set B is asymmetrical. These deviations raised to the third power are as follows:

A   -64  -8   0   8   64
B   -64  -8   0   1  125

For set A, $m_3 = 0$ and $g_1 = 0$. For set B, $m_3 = 10.80$, $m_2 = 9.20$, and $g_1 = .387$. Set B is a positively skewed set of numbers.
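The skewness calculation can be sketched as follows, assuming Python and the hypothetical helper names `moment` and `skewness`. The two sets are A = 6, 8, 10, 12, 14 and B = 6, 8, 10, 11, 15.

```python
import math


def moment(values, r):
    """r-th moment about the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


def skewness(values):
    """g1 = m3 / (m2 * sqrt(m2))."""
    m2 = moment(values, 2)
    m3 = moment(values, 3)
    return m3 / (m2 * math.sqrt(m2))


set_a = [6, 8, 10, 12, 14]  # symmetrical
set_b = [6, 8, 10, 11, 15]  # longer tail above the mean

g1_a = skewness(set_a)
g1_b = skewness(set_b)
print(round(moment(set_b, 3), 2), round(moment(set_b, 2), 2))
print(round(g1_a, 3), round(g1_b, 3))

# Rescaling the measurements (a change of unit) leaves g1 unchanged.
g1_b_rescaled = skewness([100 * x for x in set_b])
```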
The commonly used measure of kurtosis involves the fourth moment, and is defined as

$$g_2 = \frac{m_4}{m_2^2} - 3 \qquad [5.11]$$

This definition is based on the observation that large deviations from the mean, when raised to the fourth power, will contribute substantially to the fourth moment. The concept of kurtosis is more closely linked to the relative thickness of the tails of distributions than to the idea that one distribution may be flatter or more peaked than another. Large deviations from the mean contribute much more to the fourth moment than smaller deviations. The term $m_2^2$ is used to achieve comparability. It serves the same purpose as $m_2 \sqrt{m_2}$ does in the definition of $g_1$. The number 3 comes about because the ratio $m_4 / m_2^2 = 3$ for a normal distribution. This means that $g_2 = 0$ for a normal distribution. For a leptokurtic distribution, $g_2$ is greater than zero, and for a platykurtic distribution, $g_2$ is less than zero.
As an illustration, consider the following sets of numbers.

A   6      8  10  12  14
B   5.64   9  10  11  14.36

Inspection suggests that both sets are platykurtic, but set A is more platykurtic than set B. The two sets of numbers have the same mean and the same standard deviations and both are symmetrical. Yet they differ in the property called kurtosis. Deviations about the mean, when raised to the fourth power, are as follows:

A   256     16   0   16  256
B   361.36   1   0    1  361.36

For A, $m_4 = 108.80$; for B, $m_4 = 144.94$. For both set A and B, $m_2 = 8.00$. For set A, $g_2 = -1.30$; for set B, $g_2 = -.74$. Both sets are platykurtic, but set A is more platykurtic than set B as shown by the statistic $g_2$.
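A corresponding sketch for kurtosis, again assuming Python and the hypothetical helper names `moment` and `kurtosis`, uses the sets A = 6, 8, 10, 12, 14 and B = 5.64, 9, 10, 11, 14.36.

```python
def moment(values, r):
    """r-th moment about the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


def kurtosis(values):
    """g2 = m4 / m2^2 - 3, which is 0 for a normal distribution."""
    m2 = moment(values, 2)
    m4 = moment(values, 4)
    return m4 / m2 ** 2 - 3


set_a = [6, 8, 10, 12, 14]
set_b = [5.64, 9, 10, 11, 14.36]

m4_a, m4_b = moment(set_a, 4), moment(set_b, 4)
g2_a, g2_b = kurtosis(set_a), kurtosis(set_b)

print(round(m4_a, 2), round(m4_b, 2))
print(round(g2_a, 2), round(g2_b, 2))
```

Both values are negative, so both sets are platykurtic, and the more negative value for set A marks it as the more platykurtic of the two.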
5.12 A SIMPLE DESCRIPTIVE SYSTEM

The mean, the standard deviation, and the measures of skewness and kurtosis constitute a simple system for describing collections of numbers and comparing one collection with another. When we have made these four statements, $\bar{X}$, $s$, $g_1$, and $g_2$, about any collection of numbers, we have said just about everything worth saying. The system is parsimonious and elegant. Few situations exist where, having made four precise and simple statements, just about everything has been said that is of any importance. It is true that on comparing two sets of numbers, $\bar{X}$, $s$, $g_1$, and $g_2$ may be equal, and yet the numbers may differ. Such differences may be explored using higher moments but for all practical purposes these differences will prove trivial.
BASIC TERMS AND CONCEPTS

Range
Mean deviation
Population variance, $\sigma^2$
Sample variance, $s^2$
Estimate: biased; unbiased
Degrees of freedom
Sample standard deviation
Standard score, $z$
Moments about the mean
Measure of skewness
Measure of kurtosis

EXERCISES

1 For the measurem
mean deviation, (
2 The variance calc
sum of squares of
3 A biased variance
What is the corres
4 The variance for
variance be if all
(b) divided by a c
5 Show that $\sum (X -$
6 Scholastic aptitude
of 100. A student
Express these scor
7 Express the meas
the sum of squa
8 The mean and stan
for a class of 26 st
make scores of 50,
scores?
9 Calculate the seco
6, 10, 14, 16. Cor
10 The following are
Group I 2 3
Group II 2 4
Calculate measures
