

Why Sample the Population? Why not study the whole population?

• The physical impossibility of checking all items in the population.
• The cost of studying all the items in a population.
• The sample results are usually adequate.
• Contacting the whole population would often be time-consuming.
• The destructive nature of certain tests (e.g., a study of light bulb life).
Statisticians advocate Probability Sampling (not judgment sampling)

A probability sample is a sample selected in such a way that each item or person in the population being studied has a known likelihood of being included in the sample.
If we use judgment sampling we will have no idea about the accuracy of our estimates, since we have no idea about the quality of the judgments. Probability sampling enables us to construct probabilistic error bounds (to be studied in a second course in Statistics). The aim of sampling is to get a sample which is representative of the population.

Methods of Probability Sampling

Simple Random Sample (SRS): A sample formulated so that each item or person, and each subset of the same size, in the population has the same chance of being included (e.g., when one item is drawn from N items, the probability that any particular item is selected = 1/N). A simple way to implement this is to use a lottery or a computer program. For example, we can mark N cards and write the names of the items on these cards, shuffle the cards and select n cards. This will yield a simple random sample of size n.

Systematic Random Sampling (SysRS): The items or individuals of the population are arranged in some order. A random starting point is selected (by lottery) and then every kth member of the population is selected.

If there are N=1000 stores along Fifth Avenue and we want to select n=100 stores in the sample, k=N/n or 10. We shuffle only the first k and select one, say #4. From then on we systematically select stores by adding k, 2k, 3k, 4k, etc. to 4. So a systematic sample will have stores #4, 14, 24, 34, 44, 54, etc.
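The two selection schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the store numbers 1..1000, the function names and the fixed seed are hypothetical, and Python's `random` module stands in for the "lottery or computer program" mentioned in the notes.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n items without replacement; every subset of size n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, n, start=None, rng=None):
    """Pick a random start among the first k items, then take every k-th item."""
    k = len(population) // n            # sampling interval k = N/n
    if start is None:
        start = rng.randrange(k)        # 0-based position within the first k items
    return population[start::k][:n]

stores = list(range(1, 1001))           # hypothetical store numbers 1..1000
rng = random.Random(0)                  # fixed seed only to make the sketch repeatable

srs = simple_random_sample(stores, 100, rng)
sys_sample = systematic_sample(stores, 100, start=3)   # position 3 is store #4
print(sys_sample[:6])                   # [4, 14, 24, 34, 44, 54]
```

Passing `start=3` reproduces the stores #4, 14, 24, ... of the example; leaving `start` unset and supplying `rng` draws the starting point at random.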

Stratified Random Sampling (StrRS): A population is first divided into subgroups, called strata, and a sample is selected from each stratum (e.g., 70% males, 30% females).
If a sample of 10 is selected (n=10), 70% of n = 7, so select 7 males and 3 females. In general, N=population size, N1=stratum 1 (females), N2=stratum 2 (males), n=sample size desired. The sample should have (N1/N)*n from stratum 1 and so on. Thus females in the sample = (N1/N)*n, males in the sample = (N2/N)*n.
A population has 25 students of whom 15 are white and 10 black. A stratified sample of size 10 should have how many whites / blacks? Answer: Let N=population size, N1=blacks=10, N2=whites=15, n=sample size=10. Note that (N1/N)*n = (10/25)*10 or 4 blacks. How many whites in the sample? (N2/N)*n = (15/25)*10 or 6. Verify that 6+4=10. We have a representative sample.
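The proportional allocation rule (N_i/N)*n can be written as a small helper. This is a minimal sketch; the function name and the dictionary layout are assumptions made for illustration.

```python
def proportional_allocation(strata_sizes, n):
    """Number to sample from each stratum: (N_i / N) * n."""
    total = sum(strata_sizes.values())          # N, the population size
    # Results may be fractional in general and would then need rounding.
    return {name: size * n / total for name, size in strata_sizes.items()}

print(proportional_allocation({"blacks": 10, "whites": 15}, 10))
print(proportional_allocation({"males": 70, "females": 30}, 10))
```

Both examples from the text come out as whole numbers (4 and 6, 7 and 3); with other stratum sizes the shares need not be integers.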

Cluster Sampling: A population is first divided into clusters and a sample of the clusters is selected (used in marketing). It works if the clusters are as heterogeneous as the population. For a large country like the US it is convenient to use cluster sampling and choose some geographical locations (Oshkosh, Wisconsin).

A sampling error is the difference between a sample statistic and its corresponding parameter. We can make probabilistic statements about this sampling error only if we have a probability sample (not a judgment sample).

In general, a sampling distribution is for any sample statistic (mean, median, mode, standard deviation, etc.) defined over a sample space consisting of all possible samples of size n from the available population of size N. Let us first study the sampling distribution of the sample mean as an example.

Sampling Distribution of the Sample Mean

The sampling distribution of the sample means is a probability distribution consisting of all possible sample means based on specified sample sizes selected from the population. The sampling distribution yields the probability of occurrence associated with each sample mean over the set of all possible sample means.

EXAMPLE 1
The law firm of Hoya and Associates has five partners (A, B, C, D, E). At their weekly partners meeting each reported the number of hours they charged clients for their services last week: A 22, B 26, C 30, D 26, E 22 (e.g., Mr. E charged 22 hrs). If n=2, that is, two partners are selected randomly, how many different samples are possible? This is the combination of 5 objects taken 2 at a time. That is, 5C2 = 5!/(2!3!) = 10. There are 10 possible samples.

Ten sample means are given below (e.g., if the sample has A and B, the sample mean is 24):

AB 24, AC 26, AD 24, AE 22, BC 28, BD 26, BE 24, CD 28, CE 26, DE 24

Exercise: draw a picture with freq on the vertical axis for the sampling distribution of means. Note above that the mean of A and C is 26, of B and D is 26, and of C and E is also 26, which means that xbar=26 repeats itself three times (has frequency 3). We find the following list of frequencies: xbar=22 with freq=1, xbar=24 with freq=4, xbar=26 with freq=3, xbar=28 with freq=2. This is almost the sampling distribution of means.

Total frequency = 10.
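The enumeration of the ten samples and their frequencies can be reproduced mechanically. The sketch below uses Python's `itertools.combinations`; the variable names are assumptions, while the hours are those of the five partners.

```python
from collections import Counter
from itertools import combinations

hours = {"A": 22, "B": 26, "C": 30, "D": 26, "E": 22}

# All 5C2 = 10 samples of size 2, keyed by the pair of partners, with their means.
sample_means = {
    "".join(pair): (hours[pair[0]] + hours[pair[1]]) / 2
    for pair in combinations(hours, 2)
}

# Frequency of each distinct sample mean.
freq = Counter(sample_means.values())
for mean in sorted(freq):
    print(mean, freq[mean])

print("total frequency:", sum(freq.values()))
```

Plotting `freq` as a bar chart gives the picture asked for in the exercise.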

If we divide the individual frequencies by the total frequency we get the "relative frequency" or probability. These probabilities add up to one, so we have a probability distribution. This is a sampling distribution of all possible sample means. Now the random variable is xbar; it is no longer just X.

The above information says, for example, that the probability that the sample mean is 22 is 1 out of 10 or 0.1. The sampling distribution is simply this probability distribution defined over all possible samples of size n from the population of size N. In real world problems N will be large (e.g., 200 million US population), n will also be large (e.g., 1000 people surveyed), and (N C n) will be an astronomical number. Then the sampling distribution can only be imagined. We have chosen a simple example of N=5, n=2 so that the entire sampling distribution can be explicitly computed and visualized.

What are the properties of the sampling distribution of the sample mean xbar? Properties include the mean and variance of xbar.

Compute the mean of the sample means and compare it with the population mean: For our simple example we can explicitly calculate the mean of means, or Expected value of means, E(xbar).
• The mean of the sample means is obtained by weighting each sample mean by its frequency: E(xbar) = [(22)(1) + (24)(4) + (26)(3) + (28)(2)]/10 = 25.2 [Read page 214 of your text]
• Since we know the value of every observation in the population in this (impractical) simple example, we have the directly calculated population mean μ = (22+26+30+26+22)/5 = 25.2. Note that in the real world we usually cannot find μ; we can only make inferences about it from the sample mean.

• Observe that the grand mean of all 10 sample means (25.2) is equal to the population mean (25.2). Since E(xbar)=μ, we say that the sample mean xbar is an UNBIASED estimator of the population mean μ. We verified this property above for the simple example of lawyer hours. In general, such verification is difficult and one needs to use advanced theory.
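For the lawyer-hours example the unbiasedness check can be carried out exactly. The following is a minimal sketch using exact fractions to avoid rounding; the variable names are assumptions.

```python
from fractions import Fraction
from itertools import combinations

population = [22, 26, 30, 26, 22]       # hours charged by partners A..E
n = 2

# Every possible sample of size n (partners are distinct even when hours tie).
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

mean_of_means = sum(means) / len(means)                      # E(xbar)
population_mean = Fraction(sum(population), len(population)) # mu

print(float(mean_of_means), float(population_mean))          # 25.2 25.2
```

The check is only feasible because N=5 and n=2 give just ten samples; for realistic N and n the equality rests on theory, as the notes say.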

Now we turn to the variance of xbar. It is possible to verify intuitively that the larger the sample size, the smaller the variance.

For example, if X is height (known to be a Normal random variable) we want to estimate the average height of all Fordham students from a small sample of only 10 students. When we consider all possible samples we cannot rule out the sample of very tall folks (e.g., all 10 from the Fordham basketball team who are, say, 7 ft tall). Now an average height of seven feet is large, and the upper limit of the range of averages will be seven feet. Similarly the average for the shortest 10 students will be smaller than five feet (say). Thus the range of variability from the smallest to the largest average heights based on n=10 will be spread over a wide range. Recall that a wide range means a large variance. By contrast, if we choose n=100, the average height for the tallest 100 will not be seven feet, but smaller. Similarly the average height of the shortest 100 will be higher than for the shortest 10, and the range for n=100 will not be as large as the range for n=10. Thus the range (spread) of the sampling distribution decreases as n increases. In fact the variance can be proved to be inversely proportional to n, as we see below.

Standard Error (SE) of the Sample Means (Sq. root of sampling variance, or standard deviation). It is customary to distinguish between the usual standard deviation (SD) and that of a sampling distribution (SE).

The standard error of the sample means is the standard deviation of the sampling distribution of the sample means. Here n is the size of the sample and σ is the standard deviation of the population (assumed known). It is computed by: σ_xbar = σ/√n, as a first approximation if N is not known or N is large (almost infinity). σ_xbar is the symbol for the standard error of the sample means. If σ is not known and n ≥ 30, the standard deviation of the sample, denoted by s, is used to approximate the population standard deviation. Then the formula for the standard error becomes: SE(xbar) = s_xbar = s/√n. Always think of SE(xbar) as the standard deviation of the Random Variable xbar.

What is the shape of the probability distribution of xbar? The following theorem says that it is Normal, and hence the theorem enables us to solve all kinds of practical problems.
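The standard error formula is short enough to state as code. This sketch assumes a large population (no finite population correction); the function name and the numbers passed to it are illustrative assumptions, not data from the notes.

```python
import math

def standard_error(sd, n):
    """SE of the sample mean: sd / sqrt(n), with no finite population correction."""
    return sd / math.sqrt(n)

# With a standard deviation of 10, quadrupling n from 25 to 100 halves the SE.
print(standard_error(10, 25))     # 2.0
print(standard_error(10, 100))    # 1.0
```

The same function applies when σ is unknown and n ≥ 30: pass the sample standard deviation s in place of σ.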
Central Limit Theorem (CLT). See the Hawkes textbook.
[Central means it is of central importance to Statistics. Limit theorem because it studies the behavior as n becomes large, namely as n tends to infinity; in practice, for n ≥ 30.]

This is a powerful result, given its name by the mathematician Polya in 1920, showing that EVEN IF x is NOT NORMAL, if n ≥ 30 the process of averaging (is so helpful) that it yields normality of the sampling distribution of xbar with the variance given below. For a population with mean μ and variance σ², the sampling distribution of all possible means of all possible samples of size n generated from that population will be approximately normally distributed:

xbar ~ N[ μ, (σ²/n) [(N-n)/(N-1)] ]

assuming sufficiently large n (n ≥ 30). If N is large the finite population correction term [(N-n)/(N-1)] is close to 1 and can be ignored. Then this formula simplifies to:

xbar ~ N[ μ, σ²/n ]
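The variance formula with the finite population correction can be checked exactly on the lawyer-hours example (N=5, n=2), where every sample can be listed. The sketch below is only a numerical check of the variance, not of normality, which needs large n; the variable names are assumptions.

```python
from fractions import Fraction
from itertools import combinations

population = [22, 26, 30, 26, 22]       # hours charged by the five partners
N, n = len(population), 2

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N        # population variance

# Variance of xbar computed directly from all 10 possible samples.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
var_of_means = sum((m - mu) ** 2 for m in means) / len(means)

# Variance of xbar from the formula (sigma^2 / n) * (N - n) / (N - 1).
with_fpc = (sigma2 / n) * Fraction(N - n, N - 1)

print(float(var_of_means), float(with_fpc), float(sigma2 / n))   # 3.36 3.36 4.48
```

With N this small the correction matters: ignoring it would give 4.48 instead of 3.36. For large N the factor (N-n)/(N-1) is close to 1, which is why it can be dropped.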
On pages 391-392 of your text there are several useful figures. They show that even if we start with bimodal, exponential decay or uniform distributions, which are decidedly not normal to begin with, the process of averaging gives us a normal distribution for the sample mean provided the sample size is at least 30. We may know that human intelligence or human height are normally distributed, but we have no reason to think that lawyers' hours are normally distributed. The central limit theorem says that as long as you are averaging over 30 lawyers, normality can be assumed. This is very useful since we do not have to verify the underlying shape of the distribution.

A good practice example, which highlights the difference between the ordinary distribution of X and the sampling distribution of Xbar with separate word problems, follows:

IQ = X ~ N(110, 10²). Find P(IQ < 80).
Intelligence Quotient (IQ) is normally distributed with mean 110 and standard deviation of 10. A moron is a person with IQ less than 80. Find the probability that a randomly chosen person is a moron. (Hint: this random variable is for a single person, X.)
Let idiot be defined as one with an IQ less than 90. Find the probability that a randomly chosen person is an idiot. (Hint: this random variable is for a single person, X.)

If a sample of 25 students is available, what is the probability that the average IQ exceeds 105? (Hint: this random variable is for an average over 25 persons, or Xbar.) What is the probability that the average IQ exceeds 115? (Hint: this random variable is for an average over 25 persons, or Xbar.) Answers are given below.

X = IQ ~ N(110, (10)²), mu = μ = 110, standard deviation = sd = σ = 10, 4 times sd = 4σ = 40. The plausible range of X has the lower limit μ-4σ = 110 - 40 or 70; the upper limit is μ+4σ = 110 + 40 = 150. This corresponds with the plausible range of the standard normal z (-4 to 4).

EXERCISE 1: Given that X = IQ ~ N(110, (10)²). If a dumb moron's IQ is 80 or less, find the probability that a randomly chosen person is a dumb moron.

ANSWER 1: This is just a normal distribution word problem. In symbols, we want to find P(x ≤ 80). Recall that probability is some area under the Normal bell shaped curve. We want to evaluate a shaded area between -∞ and 80. This shaded area has the lower limit of -∞ and the upper limit of 80. The mapping of -∞ to the z scale is obviously -4 for all practical purposes. Hence we need not bother with the lower limit of the desired shaded area. We still need to map the upper limit 80 to the z scale by using the z transform z = (x-μ)/σ. For our upper limit x=80 (the IQ of a moron), μ=110 and σ=10, so z = (80-110)/10 = -3. When z=3, the area between 0 and 3 is 0.4987 from Table A of your text. The tail area is 0.5-0.4987; hence the answer is 0.0013.
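The z transform in Answer 1 can be cross-checked numerically. The notes use R; the sketch below instead uses Python's standard-library `statistics.NormalDist` (an assumption made so the example is self-contained), whose `cdf` method returns the area to the left of a point.

```python
from statistics import NormalDist

iq = NormalDist(mu=110, sigma=10)

z = (80 - iq.mean) / iq.stdev            # z transform of the upper limit
p_direct = iq.cdf(80)                    # P(X <= 80) on the original IQ scale
p_via_z = NormalDist().cdf(z)            # same area on the standard normal scale

print(z, round(p_direct, 4), round(p_via_z, 4))   # -3.0 0.0013 0.0013
```

Both routes give the same left-tail area, which is the point of the z transform: it only relabels the horizontal axis.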

In R software we compute pnorm(-3) to get 0.0013 for the left tail.

EXERCISE 2: X = IQ ~ N(110, (10)²) is given. If a dumb idiot's IQ is 90 or less, find the probability that a randomly chosen person is a dumb idiot. In symbols, find P(x ≤ 90).

ANSWER 2: For our upper limit x=90 (the IQ of an idiot), μ=110 and σ=10. Mapping 90 to the z scale gives (90-110)/10 = -2. The tail area to the left of z=-2 is 0.5-0.4772 = 0.0228. In R software we compute pnorm(-2) to get 0.0228 for the left tail.

EXERCISE 3: X = IQ ~ N(110, (10)²) is given. Find the probability that the average IQ of 25 students exceeds 105.

ANSWER 3: Since the sample size n=25 is given, this is not a run-of-the-mill normal distribution word problem. The random variable under consideration here is the average. Hence, a sampling distribution is relevant when we consider the average IQ as the variable of interest: not the IQ of an individual student, but the average over 25 students.
Standard deviation of the sampling distribution = Standard Error = SE = σ/√n.
n=25, √n = √25 = 5, SE = σ/√n = 10/5 = 2, 4SE = 8.
The plausible range is 110-8 to 110+8, or 102 to 118, for xbar = average IQ.
The area to the right of 105 is to be found, so we must map 105 to the z scale. The mapping now is z = (xbar - μ)/SE = (105-110)/2 = -2.5.
The area between 0 and 2.5 is 0.4938. Total area 0.5 + 0.4938 = 0.9938 = probability that the average IQ exceeds 105.
In R software we compute pnorm(-2.5,lower.tail=FALSE) to get 0.9938.

Now find the probability that the average IQ exceeds 115. This is the tail area to the right of z = (115-110)/2 = 2.5, which is 0.5-0.4938 = 0.0062. In R software we compute pnorm(2.5,lower.tail=FALSE).

EXAMPLE 4
A library usually has 13% of its books checked out. Find the probability that in a sample of 588 books greater than 14% are checked out.

ANSWER: We have percentages here, so it is not the simple normal distribution word problem. It uses the fact that p^ ~ N(p, [pq/n]), with q = 1-p, which says that the sampling distribution of the proportion p^ is Normal with mean p and variance p(1-p)/n.
E(p^) = 0.13, n = 588, Var(p^) = σ²(p^) = (0.13)(1-0.13)/n or 0.00019235.

We need the square root of this variance for use in our z transform. SE(p^) = sqrt(0.00019235) = 0.01386903 = 0.0139 (where we round to 4 places). The plausible range is 0.13 ± 4*0.0139; 4*SE is 0.0556, so [0.0744 to 0.1856] is the plausible range.

Find the probability that in a sample of 588 books greater than 14% are checked out. The desired point is to the right of the center at 0.13. In symbols, we want to compute P(p^ > 0.14). Now let us apply the z transform to both sides of the inequality: P(p^ > 0.14) = P(z > (0.14 - 0.13)/SE), so we have to compute P(z > 0.7194) = P(z > 0.72). We must round to 2 places to the right of the decimal since z tables are that way. We want the tail area, but we can look up only the area from 0 to 0.72 for z. So the answer is 0.5 MINUS 0.2642, or ANS = 0.2358.
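The two sampling-distribution calculations above (Exercise 3 for a mean, Example 4 for a proportion) follow the same pattern: compute the standard error, then take a normal tail area. The sketch below uses Python's `statistics.NormalDist` in place of R's pnorm, which is an assumption made for self-containment; the variable names are also assumptions.

```python
import math
from statistics import NormalDist

# Exercise 3: average IQ of n = 25 students, population N(110, 10^2).
se_mean = 10 / math.sqrt(25)                      # standard error = 2
xbar = NormalDist(mu=110, sigma=se_mean)
p_over_105 = 1 - xbar.cdf(105)                    # area to the right of z = -2.5
p_over_115 = 1 - xbar.cdf(115)                    # area to the right of z = +2.5

# Example 4: sample proportion with p = 0.13 and n = 588.
p, n = 0.13, 588
se_prop = math.sqrt(p * (1 - p) / n)              # sqrt of p(1-p)/n
p_over_14pct = 1 - NormalDist(mu=p, sigma=se_prop).cdf(0.14)

print(round(p_over_105, 4), round(p_over_115, 4))     # 0.9938 0.0062
print(round(se_prop, 4), round(p_over_14pct, 4))
```

The last probability comes out at about 0.235, slightly below the table-based 0.2358, because the code does not round SE to 0.0139 or z to 0.72 before looking up the area.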

Copyright: Hrishikesh D. Vinod. Last updated 11/19/13 17:06
