
Basic Statistics

for the

University of Pretoria
Faculty of Economic & Management Science
10, 11, 14 & 15 September 2009

Presented by
Sumari O’Neil
sumari.oneil@up.ac.za
Table of Contents

1. STATISTICS AND ALL THAT JAZZ
2. DESCRIPTIVE STATISTICS
2.1 Frequencies
2.2 Central tendency
2.3 Statistics for variability
2.4 Working with percentages
3. PARAMETRIC AND NON-PARAMETRIC STATISTICS
3.1 Testing the assumption of normality
3.2 Equality of variances
4. FROM QUESTIONNAIRE TO DATASET
5. SCREENING AND CLEANING YOUR DATA
6. MANIPULATING YOUR DATA
6.1 Calculating the total scores of scales or indexes
6.2 Reversing negatively worded items
6.3 Collapsing a continuous variable into groups
7. CORRELATION ANALYSIS
7.1 Statistics to test relations between variables
7.2 How to interpret the results of the correlations
7.3 The coefficient of determination (r2)
7.4 How to write up the results of a correlation analysis in a research report
7.5 Graphically representing the relationship between variables
7.6 Other analyses that are grounded in correlation analysis
8. TESTING DIFFERENCES BETWEEN GROUPS (CAUSAL RELATIONSHIPS)
8.1 What does "testing for differences between groups" mean?
8.2 Testing differences between two independent groups: t-test for independent groups
8.3 The non-parametric alternative for the t-test for independent samples: Mann-Whitney U test
8.4 Testing differences between two dependent / related samples
8.4 The non-parametric alternative to the t-test for dependent/related samples: Wilcoxon Signed-Rank Test
8.5 Testing differences between more than 2 groups on one variable: One-way Analysis of Variance (One-way ANOVA)
8.6 The non-parametric alternatives for the One-way ANOVA
References
1. Statistics and all that jazz
Statistics is used in quantitative research to analyse and interpret the data collected during the data collection process. (Although very elementary statistics, such as frequency counts, are sometimes used in qualitative research, most hard-core qualitative researchers would RATHER DIE than use any form of statistics!) In short, it implies that you collected data from the "real world" by means of a questionnaire (the most commonly used instrument) and now you want to tell the story of the "real world" by using statistics. Field (2009) explains this by saying that we are actually building statistical "models" of reality. When you look at the model, you would like to be able to say: "this is what reality looks like!"

Of course, when you build a model you would want to use the best material to depict reality as accurately as possible. In terms of statistics, this material refers to, firstly, your data and, secondly, the statistics used.

The data comes first. Here the garbage-in, garbage-out principle applies. Make sure before the study that (1) your data answers the research question, (2) the data comes from a representative sample, and (3) the data meets the parameters of the statistics you want to use. In terms of the latter, every statistic has a set of criteria to be met for optimal usage. Should your data not meet the criteria for the statistic needed to answer your research question, you will end up with a statistic with very little power and little validity, or you may not be able to do the statistic at all.

Then, in terms of the statistics, you have to make sure that, if your data is good, you choose the best possible statistic to depict the "reality of your research question". There is probably more than one option to consider when selecting a statistic; make sure you choose the best one to increase the accuracy of your results.

It should be clear by now that although statistics is used for the analysis of the data, it should actually be considered from the start of the research process. Generally, research topics for explorative research (topics not explored in great depth) are better answered through qualitative research. By its nature, quantitative research gives more answers in terms of the breadth of a problem, for instance the prevalence of HIV/Aids in South Africa. Qualitative research gives a better depiction of the depth of a problem, e.g. the experience of cancer survivors.

After finding a topic, a research question should be stated at some point, and out of the question flows the purpose of the research. Some research questions are better answered by quantitative research. For instance, questions that revolve around determination (such as the prediction of one event by means of another), validity (e.g. the validity of a questionnaire) and causal relationships between variables (e.g. whether gender is the cause of a negative attitude) are all better answered through quantitative methods.

When stating the research question, you should already have an idea of what type of analysis you can possibly use. Most statistical analyses have some data requirements, for instance requirements of sample size and level of measurement (e.g. most parametric statistics require data to be at least on an interval scale of measurement).

[Fig. 1: The research process. The flow diagram shows: find a topic (Has the topic been explored in depth and breadth? What methodology would answer the research question best?); state the research question; design the study, including a plan for measurement, a sampling plan and procedures, and the data analysis (Does the design fit the criteria for the statistical analysis? Is the sample big enough for the statistical analysis?); then data analysis and interpretation; and finally the interpretation of results, conclusions and recommendations.]

BOX 1: Different approaches to research

2. Descriptive Statistics
Descriptive statistics tell you what your data looks like. Say, for instance, you used a questionnaire to gather data. Let's say the questionnaire asked biographical questions about the managers who completed it (e.g. age, years' experience, gender), as well as questions with regard to their management style. By doing descriptive statistics you will be able to draw a profile of the managers who took part in your research. You will also be able to get an idea of the management styles they use.

The first step of statistical analysis usually involves descriptive statistics. You can use it to describe the sample, to check whether the data is fit for a specific analysis, or to answer a specific descriptive or exploratory research question.

For different types of data, different descriptive statistics are used. In other words, different descriptive statistics are used for data from different levels of measurement. Nominal and ordinal data are henceforth referred to as categorical data/variables, since these two levels of measurement indicate different categorical answers in your data set. Interval and ratio level data, on the other hand, are referred to as scale data/continuous variables, because they indicate respondents' answers on a scale from 0/1, 2, 3, through to x.

A special type of categorical variable is the dichotomous variable. This is a variable that
represents only 2 categories. For instance, the variable of gender represents male and female.

Descriptive statistics include frequencies/frequency counts, statistics of central tendency and statistics that indicate variability/dispersion.

2.1 Frequencies

Frequencies indicate the number of cases (respondents) that fall into each of the available categories. Frequencies can be displayed in terms of counts or percentages.
Frequencies are usually displayed by means of frequency tables, but can also be displayed
graphically in graphs and charts. Suitable graphs to display frequencies for categorical data are
bar charts or pie charts.

Example of a frequency table:

VOTE FOR CLINTON, BUSH, PEROT

                  Frequency    Percent    Valid Percent    Cumulative Percent
Valid   Bush            661       35.8             35.8                  35.8
        Perot           278       15.1             15.1                  50.8
        Clinton         908       49.2             49.2                 100.0
        Total          1847      100.0            100.0

In this example, I wanted to see the frequency of people who voted for each one of the three candidates in the 1992 US presidential election. From the table it is very obvious that most of the voters (908) voted for Clinton in 1992.

Example of bar chart:

[Bar chart of the frequency distribution "VOTE FOR CLINTON, BUSH, PEROT": Bush = 661, Perot = 278, Clinton = 908.]

Here I drew up a bar chart of the frequency distribution in the above-mentioned example.

Example of Pie chart:

[Pie chart of the frequency percentages for "VOTE FOR CLINTON, BUSH, PEROT": Bush 35.79%, Perot 15.05%, Clinton 49.16%.]

Here is a pie chart displaying the percentages of the frequencies.

For a scale variable, one would display a frequency distribution graphically by means of a
histogram and not a bar chart or pie chart. In SPSS you also have the option of adding a normal
curve to the histogram to get an idea of the normality of the distribution. Another option is to
graphically represent it by means of a frequency polygon. Although a frequency polygon is
appropriate for ordinal data as well as other scale data, it is not appropriate for nominal data.

2.2 Central tendency

For variables measured on a nominal scale, the statistic for central tendency is the mode. The mode indicates the category with the greatest number of cases. Say, for instance, your question asked people's occupation from a list; if most of the people indicated they were medical doctors, that would be the mode of the dataset for that question.

As an example of the mode, look at the following (from Glosser, 2004: http://www.mathgoodies.com/lessons/vol8/mode.html):

Example 1: The following is the number of problems that Ms. Matty assigned for homework on 10 different days. What is the mode?

8, 11, 9, 14, 9, 15, 18, 6, 9, 10

Solution: Ordering the data from least to greatest, we get:

6, 8, 9, 9, 9, 10, 11, 14, 15, 18

Answer: The mode is 9.

For ordinal level data the best indicator of central tendency is the median. The median is the exact middle point of the data set: it indicates the value above and below which half of the cases fall. (From http://www.uwsp.edu/psych/stat/5/CT-Var.htm)

For interval and ratio data, one uses the mean (average score) as the indicator of central tendency. Thus, for categorical data the mode and median fulfil the same function as the mean does for scale data. The mean is not used with interval data if the distribution is skewed (not normal); in that case you use the median.

2.3 Statistics for variability

As mentioned above, another type of measure that can be used to summarise a data set is a measure of dispersion or variability. These measures summarise the size of the differences between each score and every other score. There are three measures of variability:
 Range: The difference between the largest and smallest score.
 Variance: The extent of the differences among scores. The greater the differences, the more the mean fails to represent the data set. The range takes only the largest and smallest score into account; the variance takes every score into account.
 Standard deviation: The dispersion of the scores around the mean, expressed in the same measurement unit as the original scores.

Since categorical variables have a restricted range (it will always be bound to the number of
categories), the variability is often not used as a description. One can rather look at the minimum
and maximum scores in the data set or the range. For scale data, the standard deviation is used.
Take note that if the standard deviation = 0, all the scores are the same (there is no variability). The higher the standard deviation, the higher the variability.

2.4 Working with percentages

In order to compare frequencies, most researchers work out the percentage of the frequency in each category. Percentages represent the proportion of responses within each category in your dataset, and serve two purposes: 1) they simplify the data by reducing the numbers to a range from 0 – 100, and 2) they translate the data into a standard form for relative comparison. To calculate a percentage, you need to know the number of observations in the category and the total number of observations in the data set. The formula for percentages is:

Percentage = f / N × 100

where f = the number of observations in the category and N = the total number of observations in the data set. N can also be described as the "base", "total" or universe.

For example: the number of people living in poverty in Johannesburg is 400 000, and the total number of people living in Johannesburg is 132 000 000. What is the percentage of people living in poverty? Of the total number of poor people, 260 000 are women; what percentage of the poor people in Johannesburg are women?
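Working these through with the formula above: the percentage of people living in poverty is 400 000 / 132 000 000 × 100 ≈ 0.3%, while the percentage of the poor who are women is 260 000 / 400 000 × 100 = 65% (note that the base N changes to the 400 000 poor people in the second calculation).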

There are some rules when it comes to interpreting percentages:


1. Percentages cannot be averaged unless each is weighted by the size of the group from
which it is computed. This is referred to as a weighted average.
2. When a very small base is used (say a percentage calculated out of 5 cases) it is easy to over-interpret the percentage. For instance, 60% may seem like a huge difference, while it may represent only 3 out of 5 cases.

3. Parametric and Non-parametric statistics


When we need to use inferential statistics, the ideal is to use parametric statistics. To use parametric tests, the data that we use should meet a number of assumptions. If it does not meet the assumptions, the results will be inaccurate. As such, it is extremely important that you test the assumptions of a specific statistic before you continue with the analysis.

Specific statistics have specific assumptions; however, they generally include:


• Normally distributed data: It is assumed that the data come from a normally distributed population. If you remember that inferential statistics is done to show that results from a sample are applicable to an entire population, you should also understand that the population's distribution should be normal. Exactly what this assumption refers to differs depending on the context in which it is used.
• Equal variances / homogeneity of variances: If two or more groups are compared, or used in the research, they should have equal variances or spread of scores.
• Independence: There must be independence of observations, except when the data are paired (paired data refers to data from the same respondents over more than one measurement, as in pre- and post-measurements, or respondents that are in some way related to each other). How do we know whether there is independence of observations? You will have to look at the design of the research. Where did the data come from? Was it observations from two entirely different groups, or was it a pre-post measurement of the same group? How do we prove that this assumption was met? Easy! By describing and explaining the research design. There are statistical ways to prove independence of observations; however, they are not used for the type of statistics we will go through in this course.
• Interval data: The variables (specifically the dependent variable) should be on at least an interval level of measurement (or, if categorical, it should have a minimum of 7 categories). This assumption is tested by common sense and not through a statistical analysis.

When the assumptions of parametric tests are not met, we should look at the non-parametric alternative to the parametric test (we will also look at the non-parametric alternative for each type of statistic in the following sections). Although non-parametric statistics also have some assumptions, there are fewer restrictions on the data that can be used. The general assumptions of non-parametric statistics are:
• Independence of observations, except when paired
• Few assumptions concerning the population's distribution
• The scale of measurement of the dependent variable may be categorical or ordinal
• The primary focus is either the rank ordering or the frequencies of the data
• Sample size requirements are less stringent than for parametric tests.
If we look at the assumptions above, it is clear why non-parametric statistics are often referred to as statistics for small samples and distribution-free tests.

3.1 Testing the assumption of normality


What is a normal distribution? The normal distribution has the following characteristics:
• It is unimodal – it has only one hump, with the mode in the middle of the distribution
• The mean, mode and median are equal
• It is symmetrical (not skewed)
• It is asymptotic (the tails of the distribution never touch the x-axis)
• It is neither too peaked nor too flat, thus the kurtosis is equal to 0.

An illustration of the normal distribution

The statistics to look at when you check for normality of the distribution include:
• Skewness

-9-
• Kurtosis
• Kolmogorov-Smirnov (or K-S from now on) (the vodka statistic)
• Shapiro-Wilk test
• Q-Q Plots
• Box-and-whiskers plots
• Histogram

Skewness refers to a lack of symmetry. A distribution with a long tail to the right is positively skewed, and vice versa.

How to see the skewness of a distribution with SPSS:


From the menu, choose: Analyse > Descriptive statistics > Descriptives > From the options…
box, select skewness.

The output will give you a number, e.g. -5.845. The +/- in front of the number indicates the direction of the skewness, and the size of the number indicates how skewed the distribution is. The higher the number, the more skewed the distribution.

Kurtosis on the other hand measures the flatness or peakedness of the distribution. Very
peaked distributions have positive kurtosis and very flat curves have a negative kurtosis. A
perfect normal distribution has kurtosis = 0. To check the kurtosis, you can follow the same
procedure as for skewness, but instead of selecting “skewness”, select “kurtosis”. (Both
skewness and kurtosis can also be computed by SPSS under the “Frequency” option of
“Analyse”.)
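If you prefer running syntax rather than the menus, a minimal sketch that requests both statistics at once could look like this (age is a hypothetical variable name):

DESCRIPTIVES VARIABLES=age
  /STATISTICS=MEAN STDDEV SKEWNESS KURTOSIS.

The output reports skewness and kurtosis together with their standard errors.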

To use skewness and kurtosis to judge whether the distribution is normal, you have to convert the given skewness and kurtosis scores to z-scores, using the formulas zskewness = (S - 0)/SEskewness and zkurtosis = (K - 0)/SEkurtosis, where S = skewness, K = kurtosis and SE = the standard error (of skewness or kurtosis respectively). If the absolute value is smaller than 1.96, the distribution can be regarded as normal. In larger samples this cut-off should be increased to 2.58, and in very large samples to 3.29. When a sample is larger than 200, one should rather look at the shape of the histogram than rely on significance testing. Significance tests of skewness and kurtosis should not be used with large samples because they are likely to be significant even when skew and kurtosis are not too different from normal (Field, 2009, p. 139).

Skewness and kurtosis give us numerical values by which we can judge whether a distribution is normal or not. When you draw up a histogram, you can see graphically whether the distribution is skewed, flat or peaked.
[Histogram of the variable "Counselling for Mental Problems" (frequency on the y-axis), accompanied by the following statistics: N valid = 1013, missing = 504; mean = 1.94; std. deviation = 0.232; skewness = -3.817 (std. error .077); kurtosis = 12.594 (std. error .154).]

Example of SPSS output
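Applying the z-score conversion described above to this output: zskewness = -3.817 / 0.077 ≈ -49.6 and zkurtosis = 12.594 / 0.154 ≈ 81.8, both far beyond even the 3.29 cut-off for very large samples. Since N = 1013 is well over 200, the shape of the histogram should carry more weight than these cut-offs, but either way this distribution is clearly not normal.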

How to draw a histogram with SPSS:

From the menu, choose: Analyse > Descriptive statistics > Frequencies > Charts… and select Histograms, ticking the "With normal curve" option.
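A rough syntax equivalent (again assuming a hypothetical variable age; /FORMAT=NOTABLE simply suppresses the frequency table itself):

FREQUENCIES VARIABLES=age
  /FORMAT=NOTABLE
  /HISTOGRAM NORMAL.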

Another plot that can be used is the P-P plot (probability-probability plot). For a normal distribution, the points on a P-P plot should lie on a straight diagonal line.

Drawing a P-P Plot with SPSS:
From the menu, choose: Analyse > Descriptive statistics > P-P Plots

Box 3: Describing the different groups in your sample: Using the split file command

Most of the time there are different subpopulations represented in the sample. In these cases you would most likely want to explore each of the subpopulations. One of the functions in SPSS that can help you do this is the split file function. The split file function allows you to identify a grouping variable (a variable that is used to specify categories of people).

When you select the split file function, any subsequent procedure that you run in SPSS will be carried out, in turn, on each category specified by the grouping variable. For this reason it is important to turn off the split file function after you have completed the computations you wanted done in that way. (To switch it off, follow the same path given below and click on the Reset button.)

To select the split file command: From the menu, choose Data > Split file. The split file dialogue box will open.

Select "Organise output by groups", then select the grouping variable (e.g. sex) and click OK.
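In syntax, the same idea looks roughly like this (sex is a hypothetical grouping variable):

SORT CASES BY sex.
SPLIT FILE SEPARATE BY sex.
* Run the analyses you want reported per group here.
SPLIT FILE OFF.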

Another way in which normality can be tested is by means of the Kolmogorov-Smirnov (K-S) and the Shapiro-Wilk tests. These tests compare the distribution with a comparable normal distribution. In both these tests we are actually testing a hypothesis, namely that the distribution of the sample is the same as a normal distribution with the same mean and standard deviation. Remember that we always test statistically to reject or accept the null hypothesis. The null hypothesis in this case is:
There is no difference between the distribution of the sample and the normal distribution (thus they are equal).
If this is true (if we accept the null hypothesis), it means that the sample is normally distributed.

The Shapiro-Wilk test is used for small sample sizes (less than 50); otherwise use the K-S test. The limitation of these tests is similar to that of the skewness and kurtosis significance tests: if the sample size is large, they will easily show significant differences (non-normality). For this reason one should always plot the data and use the graphs in conjunction with any other test used.

Kolmogorov-Smirnov & Shapiro-Wilk in SPSS:

How to do this? It is actually easy with SPSS. From the menu, choose Analyse > Explore… In the Dependent list, put all the variables of interest to you (that you want to test). If any of the variables is a grouping variable, you can put it in the Factor list; this will split the file so that your computations are done for the different subgroups (e.g. for males and females). Under Statistics select Descriptives and click Continue. Under Plots select, under Boxplots, the option Factor levels together; under Descriptive select Stem-and-leaf; also select Normality plots with tests and click Continue. Then OK.
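The syntax behind this dialogue looks roughly as follows (score and sex are hypothetical variable names):

EXAMINE VARIABLES=score BY sex
  /PLOT BOXPLOT STEMLEAF NPPLOT
  /STATISTICS DESCRIPTIVES
  /MISSING LISTWISE.
* NPPLOT produces the normality plots with the Kolmogorov-Smirnov and Shapiro-Wilk tests.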

There is a lot of output, but only some of it is important for the K-S and Shapiro-Wilk tests specifically. You may look at the descriptives per variable if you have not computed any before. The important part is the table of tests of normality.

Tests of Normality

                                        Kolmogorov-Smirnov(a)            Shapiro-Wilk
                    Respondent's Sex   Statistic    df     Sig.    Statistic    df     Sig.
To Be Well Liked    Male                    .398    408    .000         .643    408    .000
or Popular          Female                  .444    574    .000         .548    574    .000
To Obey             Male                    .227    408    .000         .865    408    .000
                    Female                  .266    574    .000         .857    574    .000
a. Lilliefors Significance Correction

Example of normality tests output

How do you interpret these tests? The statistic is the actual K-S statistic and the df is the degrees of freedom (which should be the same as the sample size). The one we look at to judge whether to accept or reject the null hypothesis is the sig. or significance value. If the sig. is less than 0.05, there is a significant difference between the sample distribution and the normal distribution – therefore we reject the null hypothesis and say that the distribution is not normal.

In the case of the table shown above, I will report:

The Kolmogorov-Smirnov statistic was significant (p < 0.05) and therefore the distribution is not normal.

You will see that the normality output also includes Q-Q plots, stem-and-leaf plots and even box plots (box-and-whiskers plots).

3.2 Equality of variances


You can see the variances by using the Descriptives and Frequencies commands in SPSS. These give you an indication of the variance of the different groups, but you do not know whether the differences you see at face value are statistically significant. There are other statistics that tell us whether the variances of different samples differ significantly. The most common of these are Levene's test of homogeneity of variance and Bartlett's test for homogeneity of variance.

The Levene's test in SPSS Explore:

Go to Analyse > Descriptive statistics > Explore… and put the dependent variable in the "Dependent List". The grouping variable should be in the "Factor list". Under the "Plots" options select Histogram and Normality plots with tests, and select "Untransformed" under "Spread vs Level with Levene's test".

Test of Homogeneity of Variance

                                                Levene Statistic    df1       df2     Sig.
Age   Based on Mean                                         .070      1        53     .792
      Based on Median                                       .033      1        53     .856
      Based on Median and with adjusted df                  .033      1    52.457     .856
      Based on trimmed mean                                 .052      1        53     .820

Read the statistics based on the mean. If the significance is smaller than 0.05, it indicates that the variances are not equal. A significance larger than 0.05 indicates that the variances are equal.
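If you prefer syntax, another route that also produces a Levene's test (based on the mean) is the One-way ANOVA procedure; age and sex here are hypothetical variable names:

ONEWAY age BY sex
  /STATISTICS HOMOGENEITY.
* The output includes a Test of Homogeneity of Variances table.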

To report the results of the Levene's test:

Levene's test is denoted by the letter F. The F value as well as the degrees of freedom (df) should be mentioned in the report. The general form of reporting is: F(df1, df2) = value, p. For the example above: F(1, 53) = 0.070, p = 0.792.

4. From questionnaire to dataset


The data collected during the research needs to be coded and entered into SPSS to create a dataset with which you can work. For the purposes of using statistical programmes, you have to define and label the variables you measured during data collection. For instance, if I measured level of statistical knowledge, the label may be STATKNOW and the levels of that variable were measured on a 5-point scale where 1 was no knowledge, 2 was some knowledge, 3 was average knowledge, 4 was above expected and 5 was exceeding knowledge. The levels would thus be the codes that I use to indicate the levels of statistical knowledge.

When you are measuring a lot of variables it is very easy to become confused with codes and labels. For this reason, researchers create codebooks. The codebook lists all the variables included for the statistics, as well as their labels and the codes ascribed to each answer category. For instance, if I measured gender in a questionnaire (in other words, a question asking each respondent's gender), "Gender" will be the variable name. In the SPSS data file, I will refer to gender as "SEX" and the codes that identify each respondent's gender are 1 or 2, where 1 indicates "Female" and 2 indicates "Male". In my codebook, I will illustrate this as:

Variable     SPSS Variable Name     Coding Instructions
Gender       SEX                    1 = Female
                                    2 = Male
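In SPSS syntax, roughly the same codebook entry can be captured with variable and value labels (using SEX as in the example above):

VARIABLE LABELS SEX 'Gender'.
VALUE LABELS SEX 1 'Female' 2 'Male'.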

The codebook can be created as soon as your data collection tool is finalised and contains only closed answer categories. If you use a qualitative data collection tool, such as an open-ended questionnaire, you will have to wait until after you have collected your data.

Variable names should:
• Be unique
• Begin with a letter (not a number)
• Not include full stops, blanks or other special characters
• Not include words used as commands by SPSS (all, ne, eq, to, lt, by, or, gt, and, not, ge, with)
• Not exceed 64 characters

The responses must all be coded with numbers, otherwise you would not be able to do any statistics with them. Even open-ended questions should be transformed into numerical codes to be used in SPSS.

Before you can analyse data with a statistics programme like SPSS, you need to create some form of dataset for it to work on; that is, the data you collected has to be read into the chosen programme. For this course we are using SPSS (Statistical Package for the Social Sciences), but you may decide to use MS Excel or SAS (Statistical Analysis System) for the data analysis, in which case you would have to read the data into that programme.

Since you will be working in SPSS, you will need to open or create an SPSS data set. When you are working with raw data (the answers of the respondents are on the questionnaires only), you need to create a template and enter the data into the SPSS spreadsheet. If your data is already in electronic form, it can be opened in SPSS. (Note that the data should be in an Excel spreadsheet or a text file to open it with SPSS.)

5. Screening and cleaning your data


Sally did research on managers' stress levels and blood pressure. She collected the data on stress using the General Stress Inventory, and a registered nurse took the blood pressure readings. As soon as Sally had all the data she read it into SPSS and started the analysis. To her amazement she found inconsistent results. Luckily she went back and checked her data before she started writing the report. It turned out that Sally had made a lot of mistakes while capturing the data, and that caused the inconsistent results!

As with Sally, it often happens that mistakes are made when capturing data. When the dataset is faulty, it can lead to wrong conclusions and therefore invalid and unreliable research! For this reason, the first step after capturing data is to screen and clean the dataset.

To screen data means that you explore the dataset for any errors, find the errors and correct them. To identify errors you have to know what the correct data should look like, right?

This is easy: you know what the data should look like, since a codebook is available that shows the valid range for each variable. For instance, if you measured home language in South Africa with a closed-ended question with 11 answer options (one for each official language), you know that the variable language has a range of 1 – 11. Anything outside this range is a mistake. See, easy!

Now, how do you screen for errors in SPSS? Well, basically you want SPSS to describe the
data.
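A minimal syntax sketch of such a range check, assuming the hypothetical language variable above:

DESCRIPTIVES VARIABLES=language
  /STATISTICS=MINIMUM MAXIMUM.
* A minimum below 1 or a maximum above 11 signals a capturing error.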

And what if you find that a variable is not in the range you expected: how will you know which of the cases is wrong? You can either search the variable or do more detailed descriptive statistics. Your choice! As soon as you have identified the error you can replace it with the correct value by going back to the raw data (questionnaires). If you do not know what the correct value is, you need to delete the value and replace it with a missing value (or just keep the cell empty).

6. Manipulating your data

With SPSS one can add up scores, for instance adding the scores on individual items of a questionnaire to get a scale score. Continuous scores may need to be collapsed into categories to create a categorical variable, or, if too few responses are present in a specific category, the number of categories of a questionnaire item can be reduced. Skewed distributions can also be transformed if needed.

6.1 Calculating the total scores of scales or indexes

In some questionnaires a number of questions (items) together measure a specific construct. In other words, you will not look at the single items alone; we would rather add the responses on these items to obtain a total for each person. We may also use a scale, in which case we want to add the responses on all the items together to obtain a scale score. To do this in SPSS go to Transform > Compute variable.
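A sketch of the syntax equivalent, assuming five hypothetical items named item1 to item5:

COMPUTE scale_total = item1 + item2 + item3 + item4 + item5.
EXECUTE.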

6.2 Reversing negatively worded items

In some scales, the wording of particular items has been reversed to help prevent response bias. Using the Transform > Recode functions in SPSS, such items can be recoded in the positive direction, as in the sketch below.
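For a 5-point item, a reverse-scoring sketch might look like this (item3 and item3_rev are hypothetical names):

RECODE item3 (1=5) (2=4) (3=3) (4=2) (5=1) INTO item3_rev.
EXECUTE.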

6.3 Collapsing a continuous variable into groups
Sometimes you will need to divide your sample according to scores to create groups. For instance, in terms of income you may want to create categories of low, middle and high income when the question on the questionnaire asked respondents to write in their income. The written-in answers give you a continuous income variable, but say you want to compare the three income groups on, for instance, the variable of hope. In such cases you will transform the continuous variable into a categorical variable.

You may ask: why not use categories from the beginning? Using an interval or ratio level of measurement gives you much more detail to work with. If you ask age in categories, every person in your sample just falls into a category, but if you ask the specific age, you have much more detail on your sample's age. It also gives you a wider variety of analyses to work with, since you may always collapse the continuous variable into a categorical one if needed. To do this in SPSS go to Transform > Recode into different variable.
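A sketch of what this could look like in syntax; the variable names and the cut-points (which you would base on your own data) are purely illustrative:

RECODE income (LOWEST THRU 4999=1) (5000 THRU 14999=2) (15000 THRU HIGHEST=3) INTO income_group.
VALUE LABELS income_group 1 'Low income' 2 'Middle income' 3 'High income'.
EXECUTE.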

7. Correlation analysis
When we talk about relationships between variables, we imply that the variables influence each other. Take note: influence does not imply a causal relationship! If ice cream sales in Bloemfontein are very high this month, and the number of drownings is also very high, there will be a correlation or relationship between ice cream sales and drownings. Does this mean that ice cream sales cause drowning? Or does it maybe mean that drownings cause ice cream sales? Of course not! There is no logical or theoretical link between these two events. A relationship simply implies that at a given time, in a given context, the rate or frequency of occurrence of the two variables (in this case ice cream sales and drownings) increases together.

Relationships between variables are also referred to as associations between variables. The nature of a relationship/association refers to its strength and its direction.

The strength of a relationship is indicated by a correlation coefficient (the symbol r is used to indicate the correlation coefficient in statistics output). The correlation coefficient is a number whose absolute value lies between 0 and 1 and indicates how strong the relationship between the variables is. A coefficient of 0 indicates no relationship and a coefficient of 1 indicates a perfect relationship.

The direction is whether the relationship is positive or negative. A positive relationship implies
that if the properties of the one variable increase, the properties in the other one will also
increase. Or if the properties in the one decrease, the properties in the other will also decrease.
THUS, a positive relationship means that the variables co-vary in the same direction. A positive
relationship is also referred to as a direct relationship.

A negative relationship means that if the scores in one variable increase the scores in the other
variable decrease. THUS, a negative relationship means that the variables co-vary in different
directions. A negative correlation is also referred to as an indirect relationship.

The positive and negative correlations refer to linear relationships – in other words, both of them
are fitted on a straight diagonal line (See the scatter gram examples below – you will see that a
positive and negative correlation both fit on a straight diagonal line).
In statistics, a correlation analysis is used to test the nature of the relationships between
variables. Therefore, relationships are also referred to as correlations – positive and negative
correlation.

As an example, if I want to know whether students with higher-order thinking skills understand statistics better, I will do a correlation analysis. That is, I will ask:
Is there a positive relationship between higher-order thinking skills and students' understanding of statistics?
For relationship questions I conduct a correlation analysis. If the analysis is significant, it tells me that the better the higher-order thinking skills, the better students understand statistics. It does not, however, tell me that higher-order thinking causes statistics understanding! There is a difference.

Questions about relationships between variables usually belong to descriptive research. In other words, the aim of the research when you are using correlations is to describe the relationships that exist between a and b.

7.1 Statistics to test relations between variables


Different statistics are used to test the relationship between variables. They are all referred to as
types of correlation analysis, but are used for different types of data. They include:
• Pearson / product-moment correlation;
• Spearman rank-order correlation;
• Point-biserial correlation;
• Phi coefficient, and so forth.

7.1.1 The Pearson / Product-Moment correlation


A Pearson correlation coefficient is used when you are working with continuous data, in
other words, data on the interval or ratio level of measurement. The Pearson correlation is
also a parametric test or a parametric statistic. In short, in statistics we have two legs or
two kinds of statistics – those that are parametric and those that are non-parametric.
Parametric indicates that there are certain assumptions or parameters (borders) that the data
should adhere to in order for it to qualify for parametric statistics. Should the data not adhere
to the parameters or assumptions, the equivalent but NON-parametric alternative should be
used.

The Pearson correlation coefficient is a parametric statistic. To use the Pearson product-
moment correlation your data should adhere to the following assumptions or parameters:
• Data must be on Interval level
• A linear relationship must exist (can be indicated by means of a scatter plot)
• The distributions must be similar (Thus, if they are skewed, they must be skewed in
the same direction), but preferably normal.
• Outliers must be identified and omitted from the computation (please note if you
delete the outliers, delete only the cell with the outlier value)

How do I know if there are outliers?

To see whether there are any outliers, we draw up a box-and-whiskers plot and a stem-and-leaf plot. Both of these can be drawn up under Analyse > Descriptive statistics > Explore…: under Statistics select Outliers and under Plots select Stem-and-leaf. To read stem-and-leaf plots use the following link: http://www.cmh.edu/stats/definitions/stem.htm.

The box plot gives you a good idea of the outliers and their identity. In other words, it does not only show you the outlier, but also which case in the data set has that particular value.

[Annotated box plot: the box spans the 1st to 3rd quartile of the distribution (the part that would be the bell of a normal distribution), with the median as the line inside the box; the whiskers end at the maximum and minimum values that are not outliers; points beyond the whiskers are outliers or extreme values that do not fit with the rest of the distribution.]

Outliers cannot be included in the analysis. There are different ways to deal with outliers:
1. Outliers can be removed.
2. Data can be transformed: outliers skew distributions, and the skewness can be reduced somewhat by transformations of the dataset. (See Field (2009), p. 155, for a short and understandable description of different transformation options.)
3. Change the score: should the transformation fail, the value can be replaced by:
a. the next highest score in the dataset plus 1, or
b. the mean plus two standard deviations.

To conduct a Pearson correlation in SPSS, use the following steps:

From the menu bar select Analyze > Correlate. The options you can choose from at this stage are Bivariate, Partial and Distance; a bivariate correlation is a correlation between 2 variables, so choose Bivariate.

In the dialogue box that follows, select the variables that you want to correlate. Select "Pearson" under Correlation coefficients.

Under Test of significance, two-tailed means that there is no specification of the direction of the correlation in the hypothesis stated. We will mostly work with this one. One-tailed is only chosen when you have specified the direction of the effect (relationship), in other words a directional hypothesis.

In the bottom left-hand corner, you can select "Flag significant correlations". This tells SPSS to mark the significant correlations in the output.
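The corresponding syntax looks roughly like this (thinking_skills and stats_marks are hypothetical variables; /PRINT=TWOTAIL NOSIG is what the dialogue typically generates when two-tailed tests and flagged correlations are selected):

CORRELATIONS
  /VARIABLES=thinking_skills stats_marks
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
* PAIRWISE uses all available cases for each pair of variables.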

7.1.2 The Spearman Rank-Order correlation / Spearman's Rho

Spearman's Rho is the non-parametric alternative to the Pearson correlation coefficient. It is used when one or both of the variables are measured on an ordinal scale (if only one, the other should be at least on interval scale). Spearman's Rho is indicated as rs. To do this in SPSS, use the same procedure as for the Pearson correlation, but select the Spearman option instead.

7.1.3 Kendall’s Tau


Kendall’s Tau is another non-parametric correlation and it should be used rather than
Spearman’s coefficient when you have a small data set (50 or less). It is stricter and if
you do both the Tau and Rho, you will probably find that the Tau is a bit lower than the
Rho. To do this on SPSS use the same procedure as with the Pearson correlation, but
select the Kendall’s Tau option instead.
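Both of these can also be requested through syntax, for example (rank1 and rank2 being hypothetical ordinal variables):

NONPAR CORR
  /VARIABLES=rank1 rank2
  /PRINT=BOTH TWOTAIL.
* BOTH requests Kendall's tau as well as Spearman's rho.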

7.1.4 The Point-biserial correlation

This statistic is computed when you want to see the relationship between a continuous variable and a dichotomous variable. E.g. females and males report the total number of years of education they have had, and we want to know whether there is any correlation between gender and years of education. It is indicated by rpb. The assumptions that your data must meet to compute a point-biserial correlation are:
• The dichotomous variable has mutually exclusive groups whose values have been coded 1 and 0
• The two groups created by the dichotomous variable are normally distributed
• The two groups created by the dichotomous variable have equal variances
• The continuous variable has equal variances across each level of the dichotomous variable
To compute rpb you use the normal Pearson correlation procedure.

How to test for equality of variances in SPSS?

An easy way to test for equality of variances is to select, from the menu bar, Analyze > Compare Means > Independent-Samples T Test. The grouping variable will obviously be the dichotomous variable and the test variable the continuous one for which you want to test differences. Then click OK. This procedure will give you a table in the output that looks like this:

Independent Samples Test

HIGHEST YEAR OF SCHOOL COMPLETED
  Equal variances assumed:     Levene's F = 3.090, Sig. = .079;  t = 1.943, df = 1843, Sig. (2-tailed) = .052, Mean Difference = .259, Std. Error Difference = .133, 95% CI [-.002, .521]
  Equal variances not assumed: t = 1.929, df = 1677.298, Sig. (2-tailed) = .054, Mean Difference = .259, Std. Error Difference = .134, 95% CI [-.004, .523]

RS HIGHEST DEGREE
  Equal variances assumed:     Levene's F = 5.685, Sig. = .017;  t = 1.154, df = 1845, Sig. (2-tailed) = .248, Mean Difference = .065, Std. Error Difference = .057, 95% CI [-.046, .177]
  Equal variances not assumed: t = 1.147, df = 1680.978, Sig. (2-tailed) = .252, Mean Difference = .065, Std. Error Difference = .057, 95% CI [-.047, .177]

Look under Levene’s test for equality of variances. If the significance value is more than
0.05 it means that the two groups have equal variances.

7.1.5 The Phi coefficient

When both variables are dichotomous, the phi coefficient is used (indicated as rphi). The assumptions that the data must meet to utilise the phi coefficient are:
• Variables must be dichotomous
• Observations are independent
• The observations are in the form of frequencies and not scores
• There must be at least 5 counts in each category for each variable.

To compute the phi coefficient with SPSS:
From the menu bar select Analyze > Descriptive Statistics > Crosstabs. Go through the same process as you would with cross tabulations, but also go to the Statistics option, select "Phi and Cramer's V", then Continue and OK.
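In syntax this could look roughly as follows (sex and employed are hypothetical dichotomous variables):

CROSSTABS
  /TABLES=sex BY employed
  /STATISTICS=PHI.
* The PHI keyword produces both the phi coefficient and Cramer's V.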

The output box should give you a table like this:

Symmetric Measures

                                  Value    Approx. Sig.
Nominal by Nominal   Phi           .208            .136
                     Cramer's V    .208            .136
N of Valid Cases                   1847
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

If the significance value is less than 0.05 there is a significant relationship between the two
variables. You look at the Phi statistic only.

7.1.6 Cramer's V coefficient

When you want to test the association between two categorical variables that are not dichotomous, you use the Cramer's V statistic. It is obtained with the same steps as above.

7.2 How to interpret the results of the correlations


A correlation coefficient tells you two things: 1) the strength of the relationship between the
variables and 2) the direction of that relationship. It does not tell you whether that relationship is
statistically significant or not. Here is a rough guide to interpret the correlation coefficients in
terms of strength of relationship:
Correlation coefficient (r)    Strength of relationship
0.0 – 0.2                      Very weak, negligible
0.2 – 0.4                      Weak, low
0.4 – 0.7                      Moderate
0.7 – 0.9                      Strong, high, marked
0.9 – 1.0                      Very strong, very high

You have to remember to look at the direction of the correlation as well. You can only interpret
the correlation in terms of strength if the correlation is statistically significant.

What is statistical significance?
A statistical concept indicating that the result is very unlikely to be due to chance and therefore likely represents a true relationship between the variables. Statistical significance is usually indicated by the alpha value (or probability value), which should be smaller than a chosen significance level. For most research studies a significance level of 0.05 or 0.01 is used, indicating that the results have only a 5% or 1% chance of being due to chance alone.

In SPSS, we look at the p-value to tell us whether results are statistically significant or not. If the p-value is smaller than 0.05, we know the results are statistically significant at the 0.05 level.

7.3 The coefficient of determination (r2)

When the correlation coefficient is squared, it gives us an indication of the amount of variability in the one variable that is explained by the other. For example, if the correlation coefficient between age and social intelligence is 0.78 (p < 0.05), then r2 = 0.6084. This can be interpreted as: the amount of variability in social intelligence that can be explained by age is about 61%. This r2 is also called the coefficient of determination (see http://www2.chass.ncsu.edu/garson/pa765/correl.htm).

7.4 How to write up the results of a correlation analysis in a research report


Mostly you will write something like: "The results of the chi-square analysis indicated a significant but weak association between group membership and post-intervention fear (chi-square = 0.40, p = 0.03)", depending on which analysis you used. Remember to interpret it in terms of the practical value of the research.

7.5 Graphically representing the relationship between variables:


It is probably easiest to see whether a relationship exists by drawing up scatter plots of the different variables that you would like to test. A scatter plot shows how the scores on the variables co-vary (go together). Since scatter plots give you such a good picture of what to expect from a correlation coefficient, drawing one up is the first step of a correlation analysis.

Examples of scatter plots:

[Three example scatter plots: a positive relationship, a negative relationship, and no relationship.]

To draw up a scatter plot in SPSS:

From the menu bar select Graphs > Scatter. A box with the different scatter plot options should appear; we will use the simple scatter plot for now. This type of scatter plot looks at the relationship between two variables. Click Define, select the variables for the analysis and place them on the x- and y-axis. If there is a grouping variable that defines different categories, you may place it in the "Set markers by" block. Select the Titles option below to give headings to the plot.
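A rough syntax equivalent for a simple scatter plot (age and social_intelligence are hypothetical variables):

GRAPH
  /SCATTERPLOT(BIVAR)=age WITH social_intelligence
  /TITLE='Age and social intelligence'.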

The different types of scatter plots that can be drawn up are:


• The simple scatter plot (as indicated above),
• The overlay scatter plot,
• The Matrix scatter plot and
• The 3-D scatter plot.
With the overlay scatter plot option, you can display the covariance between several
variables on the same axis / diagram. The Matrix scatter plot does the same but rather
than drawing it up on the same diagram, it is drawn up in a matrix. The 3D scatter plot is
used to draw a diagram of the relationship between 3 variables.

7.6 Other analyses that are grounded in correlation analysis
A lot of multivariate statistics is grounded in the logic of correlation analysis. These include factor analysis, cluster analysis, regression analysis and reliability analysis, to name but a few commonly used ones.

While correlation analysis tests whether a relationship exists and the strength of that relationship, regression analysis assesses the predictive ability of an independent variable on a continuous dependent variable. For instance, if we take high school achievement and university achievement, one can use a regression analysis to determine the extent to which high school achievement can be used to predict achievement at university.

While simple regression assesses the functional relationship between one dependent (criterion/outcome measure) and one independent (predictor) variable, multiple regression is used when you want to test the predictive value of a number of predictors for a single criterion (outcome measure), where the criterion should be a scale variable (a continuous variable on at least interval level of measurement). When the criterion is not on interval level of measurement, logistic regression should be used.

For more information on correlations and regression, see:

o http://bmj.bmjjournals.com/collections/statsbk/11.shtml
o Correlation and regression analysis for curve fitting @ http://helios.bto.ed.ac.uk/bto/statistics/tress11.html
o Sykes, A.O. (ND). An Introduction to Regression Analysis. Retrieved from: http://www.law.uchicago.edu/Lawecon/WkngPprs_01-25/20.Sykes.Regression.pdf
o http://www.valuebasedmanagement.net/methods_regression_analysis.html
o http://www.investorwords.com/4136/regression_analysis.html
o http://www.blackwellpublishing.com/specialarticles/jcn_10_462.pdf
o http://www.telecom.csuhayward.edu/~esuess/Links/Software/RegressionExplained/regression_explained.doc
o DAU Stats Refresher @ http://www.cne.gmu.edu/modules/dau/stat/dau2_frm.html
o Dallal, G.E. (2004). The Little Handbook of Statistical Practice @ http://www.tufts.edu/~gdallal/LHSP.HTM > Select the Regression pages on the menu page.
o http://www2.sjsu.edu/faculty/gerstman/StatPrimer/regression.pdf

Another procedure which is based on the logic of correlations is factor analysis. With a factor analysis you can determine the underlying structure of a large data set. In other words, when you have a large number of variables in your dataset and want to look at how these variables fall together, you can use a factor analysis. On such a dataset the factor analysis will group the variables that belong together, thus indicating the underlying structure (or reduced number of latent variables) present in the data set.

One analysis which is very important when questionnaires are used in research is the reliability analysis. One of the main principles of selecting a data collection instrument is that it should measure what you need it to measure and it should be a reliable indicator of whatever it is you are measuring. In other words, the validity and reliability of your data collection instrument are important. While a factor analysis can assess the construct validity of an instrument, Cronbach's alpha is one way to assess the reliability of a questionnaire. This method tests the internal consistency of the items that are supposed to measure the same thing.

All of the reliability analysis options are under Analyse > Scale…
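A syntax sketch for Cronbach's alpha on a set of hypothetical items item1 to item5:

RELIABILITY
  /VARIABLES=item1 item2 item3 item4 item5
  /SCALE('example scale') ALL
  /MODEL=ALPHA.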

8. Testing differences between groups (causal relationships)


Sometimes we hypothesise that one variable (the independent variable) may cause a change in another variable (the dependent variable). For instance, we think that gender can influence vocational interest. In other words, if you are a male, you will have certain interests that differ from the interests of females. Thus, to prove your hypothesis you have to show that the career interests of males and females differ from each other. In other words, you have to compare groups (in this case males and females).

8.1 What does "testing for differences between groups" mean?


Researchers often want to test the similarities or differences of the properties or characteristics
between groups. Take the following example:

Example 1:
A researcher wants to know whether there is a difference in the personalities of sales consultants
and sales managers. This would give important information for the recruitment of both groups.

The research question for this study would be:


Is there a difference between the personality profiles of sales consultants and sales managers?

Of course the researcher has to define each of the variables included in the study. They are:

Type of post, which is the independent variable or grouping variable (either sales consultant or sales manager), and personality profile (the dependent variable).

The researcher defines a sales consultant as a person who is responsible for the sales of
a specific product of a company. He is directly involved with the prospective buyer.

The sales manager is a person who is responsible for the sales of sales consultants
within a specific division of an organisation. He is not directly involved with the
prospective buyer, but rather with the management of sales consultants.

A personality profile is a profile that defines the personality dimensions important for a
specific group.

The above are the conceptualisations or conceptual definitions of the variables. However, the researcher needs to measure these concepts and will therefore specify the operational definitions / operationalise the variables.

For this, he will for instance define the groupings as:


For a person to fall into the category of a sales consultant, he or she has to be in a sales
consultant post for at least 1 year. And a sales manager has to be in a post specified as sales
manager for at least 1 year.

Personality profiles are measured by means of the 16 Personality Factor Questionnaire. The researcher will of course specify here what this instrument measures and how.

The hypotheses will be set out as follows:

H0 (null hypothesis): There is no difference in the personality profiles of sales consultants and
sales managers.

H1 (alternative hypothesis): There is a difference between the personality profiles of sales


consultants and sales managers.

The researcher can go so far as to set specific sub hypotheses for H1. These sub hypotheses
will specify how the personality profiles will differ. For instance, the researcher can say that:

H1 (a): A sales consultant will score high on dimension A, F and Q4, and low on Q2.

H1 (b): A sales manager will score high on dimension C, D and E, and lower on A and F.

If sub-hypotheses are specified, the researcher will have to substantiate from previous research why and how he arrived at these hypotheses.

8.2 Testing differences between two independent groups: t-test for independent groups
When a researcher wants to see whether statistically significant differences exist between two different groups with regard to a dependent variable, he will use the t-test for independent groups. For instance, if you want to test whether there is a difference in the level of language skills between a group of matriculants from Gauteng and one from Limpopo, you will use the t-test for independent groups.

The t-test is a parametric statistic. The following assumptions must be met:

1. The t-test uses the means to compare for differences. This implies that the data for the
dependent variable must be on at least interval scale.

2. It is not essential for this procedure that the sample sizes of the two groups are the same.
However, for the t-test, the sample size should at least be 30 per group.

3. Equal variances are assumed. For this, the Levene's test for homogeneity of variances is used; it is given with the t-test output. You will remember from earlier that the significance value of the Levene's test should be more than 0.05, which indicates that the variances are equal.

4. The data for each of the two groups must be distributed normally. This can be tested by
means of the descriptions of skewness and kurtosis or the Q-Q plots (or any other test for
normality).

To do an independent-samples t-test in SPSS select Analyse > Compare Means > Independent-Samples T Test. Select the dependent variable as the test variable and the grouping variable under Grouping Variable. You have to define the groups – use the codes of the data set, e.g. group 1 = 0; group 2 = 1. Run the analysis.
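In syntax, again with hypothetical names (language_skill as the test variable and province coded 0 and 1 as the grouping variable):

T-TEST GROUPS=province(0 1)
  /VARIABLES=language_skill
  /CRITERIA=CI(.95).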

The output will typically look like this:

Group Statistics

                             Gender    N      Mean     Std. Deviation   Std. Error Mean
Income before the program    Male      493    8.9939   1.68866          .07605
                             Female    507    8.9152   1.58510          .07040

The first table shows the number of cases (N) for each group, the mean score for each group and
the standard error of the mean.

The second table, the Independent Samples Test, gives the results of Levene's test for equality of
variances together with the t-test for equality of means:

Independent Samples Test

                                   Levene's Test for
                                   Equality of Variances        t-test for Equality of Means
                                                                                                                          95% Confidence Interval
                                                                            Sig.        Mean        Std. Error           of the Difference
                                   F        Sig.      t        df          (2-tailed)  Difference  Difference           Lower      Upper
Income before   Equal variances    .421     .517      .760     998         .447        .0787       .10354               -.12446    .28191
the program     assumed
                Equal variances                       .760     989.773     .448        .0787       .10363               -.12464    .28209
                not assumed

Read the row of output that corresponds to the outcome of the Levene's test. In this case the
Levene's test shows that the variances are equal (F = 0.421; p = 0.517), so the upper row
("Equal variances assumed") is used for the t-test. The Mean Difference column gives the
difference between the two group means (8.9939 - 8.9152 = 0.0787). The degrees of freedom (df)
for the equal-variances row are N - 2 (here 1000 - 2 = 998). The t-value in this case is 0.760, and
the significance value (Sig. 2-tailed) is 0.447; the significance value should be less than 0.05 to
indicate a significant difference, so for this example no significant difference exists.

When you write up the results of a t-test you must report the t-value as well as the significance
value (p). In this example, Levene's test showed that homogeneity of variances could be
assumed. Thus, from the results, it is evident that there are no significant differences between the
groups (t(998) = 0.760; p = 0.447). This reports the statistical significance. It is increasingly
required to report the effect sizes of statistical results as well. There are different methods for
calculating effect sizes; the most common, however, use r and Cohen's d.

Pearson's r can vary in magnitude from −1 to 1, with −1 indicating a perfect negative linear
relation, 1 indicating a perfect positive linear relation, and 0 indicating no linear relation between
two variables. Cohen gives the following guidelines for the social sciences: small effect size, r =
0.10 to 0.23; medium, r = 0.24 to 0.36; large, r = 0.37 or larger.
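As a rough sketch (assuming the t-value and degrees of freedom from the example above), r and an approximate Cohen's d can be computed directly from the t statistic:

import math

t, df = 0.760, 998                    # values from the t-test output above

r = math.sqrt(t**2 / (t**2 + df))     # effect size r = sqrt(t^2 / (t^2 + df))
d = 2 * t / math.sqrt(df)             # approximate Cohen's d for two groups

print(f"r = {r:.3f}, d = {d:.3f}")    # both values indicate a negligible effect here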

Box 3: Practical and statistical significance


In SPSS, we look at the p-value to tell us whether results are statistically significant or not. If the p-value is
smaller than 0.05, we know the results are statistically significant at the 0.05 level. What is statistical
significance? It is a statistical concept indicating that the result is very unlikely to be due to chance and,
therefore, likely represents a true relationship between the variables. Statistical significance is usually
indicated by the p-value (or probability value), which should be smaller than a chosen significance level. For
most research studies a significance level of 0.05 or 0.01 is used, indicating that the results have only a 5% or
1% chance of being due to chance alone.

We test the significance of our statistics by looking at the probability that our results may be due to other
factors. If this probability is larger than 5% we generally do not accept the result as "significant". When it is
smaller than 5%, we do accept it as "significant". However, statistical significance does not necessarily mean
that the result is important. Statistical significance can sometimes simply be due to large samples. For this
reason we also calculate the effect sizes of significant statistics.

8.3 The nonparametric alternative for the t-test for independent samples: Mann-Whitney U
test
This test is used if the assumptions of the t-test for independent samples are not met, i.e.
• The data is not normally distributed
• The dependent variable is measured on an ordinal scale
• Sample sizes are small (smaller than 30 but larger than 5 per group).

The hypotheses for a Mann-Whitney test will look like:

H0: there is no difference between the means of the samples (μ1 = μ2) (median1 = median2 for the
non-parametric test)

H1: there is a difference between the means of the two samples (μ1 ≠ μ2) (median1 ≠ median2)

The output will typically look like this:

Mann-Whitney Test

Ranks

                      Marital status    N       Mean Rank    Sum of Ranks
Level of education    Unmarried         504     512.13       258114.00
                      Married           496     488.68       242386.00
                      Total             1000

Test Statistics(a)

                            Level of education
Mann-Whitney U              119130.000
Wilcoxon W                  242386.000
Z                           -1.389
Asymp. Sig. (2-tailed)      .165
a. Grouping Variable: Marital status

When you report the results you must mention the Z score and the significance level. In the
example above, the difference between married and unmarried employees with regard to level
of education is not significant (z = -1.389; p = 0.165).
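A minimal sketch of the same test in Python using scipy (the education scores below are made-up ordinal data for illustration, not the data from the output above):

from scipy import stats

# Hypothetical ordinal data: level of education (coded 1-5) for two independent groups
unmarried = [3, 4, 2, 5, 3, 4, 3, 2, 4, 5]
married = [2, 3, 3, 4, 2, 3, 4, 2, 3, 3]

u_stat, p_val = stats.mannwhitneyu(unmarried, married, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_val:.3f}")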

8.4 Testing differences between two dependent / related samples


In some research designs, a researcher has two measurements of the same group taken at two
different points in time, for instance a pre- and post-measurement. In such cases the researcher
would like to see if there is a difference between the two measurements. A good example of such a
design is when a researcher wants to test the effectiveness of a communication skills training
programme. If the training programme is effective, the logical deduction would be that the scores
on the second measurement (after the training programme) will be higher than on the first (before
the training programme).

In the case described above, the researcher will use the t-test for related/dependent samples. The
assumptions are the same as for the t-test for independent samples, except for the independence
of observations.

Take the following example:

We compared the mean test scores before (pre-test) and after (post-test) the subjects completed
a test preparation course. We want to see if our test preparation course improved people's score
on the test.

First, we see the descriptive statistics for both variables.

The post-test mean scores are higher. However, this is just at face value – we still do not know if
this difference is statistically significant.

Next, we see the correlation between the two variables. Remember, the groups are paired / the
same and therefore, we assume that there is a correlation between the first and second
measurement.

There is a strong positive correlation. People who did well on the pre-test also did well on the
post-test.

Finally, we see the results of the Paired Samples T Test. Remember, this test is based on the
difference between the two variables. Under "Paired Differences" we see the descriptive statistics
for the difference between the two variables.

To the right of the Paired Differences, we see the T, degrees of freedom, and significance.

The T value = -2.171
We have 11 degrees of freedom
Our significance is .053

If the significance value is less than .05, there is a significant difference.

If the significance value is greater than .05, there is no significant difference.

Here, we see that the significance value (.053) is approaching significance, but the difference is
not statistically significant. There is no significant difference between the pre- and post-test scores:
the test preparation course did not help!

To conduct a t-test for related samples in SPSS you follow the same route as for the t-test for
independent samples, but select the paired-samples (dependent samples) t-test option.
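The equivalent test can be sketched in Python with scipy as follows (the pre- and post-test scores are invented for illustration; each position in the two lists belongs to the same person):

from scipy import stats

# Hypothetical pre- and post-test scores for the same 12 people
pre = [54, 61, 58, 49, 65, 60, 57, 52, 63, 59, 55, 62]
post = [57, 62, 60, 50, 66, 63, 58, 55, 64, 61, 56, 65]

t_stat, p_val = stats.ttest_rel(pre, post)
print(f"t = {t_stat:.3f}, p = {p_val:.3f}")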

8.4 The non-parametric alternative to the t-test for dependent/related samples: Wilcoxon
Signed-Rank Test
When the level of measurement for a one-group pre-post-test design is on an ordinal scale, the data
is not normally distributed, or the sample sizes are small, the Wilcoxon Signed-Rank Test is used to
test for differences. Where the t-test uses the mean to test for differences, the Wilcoxon Signed-
Rank test uses the median. For more info on the Wilcoxon Signed-Rank procedure see:
http://learn.lboro.ac.uk/sci/ma/mlsc/documents/wsrt.pdf
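A minimal Python sketch of the Wilcoxon Signed-Rank Test with scipy, reusing the same kind of invented paired scores as above:

from scipy import stats

# Hypothetical paired scores, treated as ordinal data
pre = [54, 61, 58, 49, 65, 60, 57, 52, 63, 59, 55, 62]
post = [57, 62, 60, 50, 66, 63, 58, 55, 64, 61, 56, 65]

w_stat, p_val = stats.wilcoxon(pre, post)
print(f"W = {w_stat:.1f}, p = {p_val:.3f}")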

8.5 Testing differences between more than 2 groups on one variable: One-way Analysis of
Variance (One way ANOVA)
Sometimes a researcher wants to compare the differences and similarities between more than 2
groups.

Example:
A researcher thinks that students’ research skills are influenced by their time management skills.
The research question here is: Do time management skills influence students’ research skills?
For this study, time management skills are the independent variable and will therefore be the
grouping variable. Research skills are the dependent variable.

She measures time management skills by means of the Kubic Time-management questionnaire.
This questionnaire categorises a person's time management skills as low, low-to-moderate,
moderate-to-high, or high.

Research skills are measured by means of the score a student obtains for a master's level
dissertation.

The hypotheses are as follows:

H0: There is no difference in students' performance on their master's dissertations between the
low, low-to-moderate, moderate-to-high and high time management groups.

H1: Students with high time management skills will perform significantly better in their master's
dissertations than students with moderate-to-high, low-to-moderate and low time management
skills.

H2: Students with moderate-to-high time management skills will perform better in their master's
dissertations than students with low-to-moderate and low time management skills, but worse than
students with high time management skills.

(H3 and H4 will follow in the same pattern)

To test these hypotheses, a one-way ANOVA can be performed. Note the use of "one-way" in
this type of statistic. This indicates that there is only one independent variable (grouping variable)
or factor involved. This is important because there are also two-way and three-way ANOVAs, or
factorial ANOVAs, which are computed when more than one factor or grouping variable is used in
the comparison. This is, however, beyond the scope of this module.

As the name indicates, the ANOVA looks at the variance between the different groups relative to
the variance within the groups. If the between-group differences are large enough, we conclude
that there are differences somewhere between the means of the different groups, or, put
differently, that at least two group means differ significantly from each other. At that point the H0
can be rejected.

The ANOVA output itself only tells you that there is a difference somewhere, or not. It does not
tell you between which groups these differences lie. To see between which groups differences
exist, post-hoc tests are used.

There are different post-hoc tests. The most commonly used is Tukey's Honestly Significant
Difference (HSD) test. The Bonferroni test is also used, since it controls for the Type I error
(finding significant differences when there are none). The chance of a Type I error is increased
because repeated comparisons are made between the groups. Both these tests are conducted
when equal variances of the groups are assumed (the parametric assumption).

The one-way ANOVA as explained above is a parametric test. The assumptions or requirements
for the data are the same as for the t-test for independent groups:
1. All observations must be independent from each other
2. The dependent variable must be measured on an interval or ratio scale
3. The dependent variable must be normally distributed in the population – for each group
being compared.
4. The variances of all the groups must be the same (homogeneity of variances)
5. Sample sizes need not be equal, but should preferably be larger than 30 for each group.

When equal variances are not assumed, but all other assumptions are met, SPSS gives you a
choice of post-hoc tests that adjust for the differences between group variances. For this, you
may select the Tamhane's, Dunnett's or Games-Howell post-hoc tests.
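To make the procedure concrete, here is a minimal Python sketch with scipy (not SPSS); the three groups of income scores are invented for illustration. It first checks homogeneity of variances with Levene's test and then runs the one-way ANOVA:

from scipy import stats

# Hypothetical income scores for three education groups
no_school = [7.2, 7.8, 7.5, 8.1, 7.6, 7.9]
high_school = [9.1, 9.4, 8.9, 9.6, 9.2, 9.0]
college = [11.2, 11.8, 11.5, 10.9, 11.6, 11.4]

# Levene's test for homogeneity of variances across the three groups
lev_stat, lev_p = stats.levene(no_school, high_school, college)

# One-way ANOVA (F-test) comparing the three group means
f_stat, p_val = stats.f_oneway(no_school, high_school, college)

print(f"Levene: statistic = {lev_stat:.3f}, p = {lev_p:.3f}")
print(f"ANOVA: F = {f_stat:.3f}, p = {p_val:.3f}")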

SPSS output will typically give you the following:

a. Descriptive statistics:

Descriptives

Income before the program

                                                                                95% Confidence Interval for Mean
                               N      Mean      Std. Deviation   Std. Error    Lower Bound    Upper Bound    Minimum   Maximum
Did not complete high school   459    7.6776    .82043           .03829        7.6023         7.7528         6.00      10.00
High school degree             348    9.2500    .72273           .03874        9.1738         9.3262         8.00      11.00
Some college                   193    11.4560   1.02030          .07344        11.3111        11.6008        10.00     14.00
Total                          1000   8.9540    1.63663          .05175        8.8524         9.0556         6.00      14.00

The descriptive statistics include the number of respondents per group (N), the mean or average
score per group, the standard deviation, the standard error of the mean, the 95% confidence
interval for the mean, and the minimum and maximum scores.

b. Test for homogeneity of variance:

Test of Homogeneity of Variances

Income before the program

Levene Statistic    df1    df2    Sig.
18.420              2      997    .000

As with the t-test, the ANOVA output (if you select this option) gives you the results of the
Levene's test for homogeneity of variances. If the Levene's test is significant (Sig./p < 0.05), the
null hypothesis (which states that the variances are equal) is rejected. THUS, the variances
between the groups are not equal. This gives you a fair idea of which post-hoc tests should be
interpreted.

c. The ANOVA / F-test

ANOVA

Income before the program

                  Sum of Squares    df     Mean Square    F           Sig.
Between Groups    1986.479          2      993.240        1436.399    .000
Within Groups     689.405           997    .691
Total             2675.884          999

This is the table you look at to decide whether there are any differences between the groups. In
the first column, you will see that the output shows the source of the variation – either between
groups (the variation between the group means) or within groups (the amount of variation that
exists within each of the groups). The amount of variation for each of these is given by the SUM
OF SQUARES (SS) and the DEGREES OF FREEDOM (df). For the between-groups row, df is the
number of groups minus 1 (here 3 - 1 = 2); for the within-groups row, df is N minus the number of
groups (here 1000 - 3 = 997). MS is the MEAN SQUARE (the variance), which is computed as
SS/df. F is the F-ratio. The ANOVA uses the F-test/F-distribution to test for differences between
groups. The F-ratio is computed as the between-groups MS divided by the within-groups MS. For
differences to be significant, the between-groups MS should be much larger than the within-groups
MS, so if the grouping variable has an effect (in other words, when there is a difference between
the groups) the F-ratio will be larger than 1. To see whether the differences are statistically
significant, you look at the Sig. (significance value). If the Sig. < 0.05, there are significant
differences between the groups.

The interpretation for this table will be written as:

A significant difference exists between the groups (F(2, 997) = 1436.399; p < 0.001).
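To see where the numbers in the ANOVA table come from, the mean squares and the F-ratio can be recomputed by hand from the sums of squares and degrees of freedom; a quick check in Python:

# Values taken from the ANOVA table above
ss_between, df_between = 1986.479, 2
ss_within, df_within = 689.405, 997

ms_between = ss_between / df_between      # 993.240
ms_within = ss_within / df_within         # approximately 0.691
f_ratio = ms_between / ms_within          # approximately 1436.4

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}, F = {f_ratio:.1f}")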

d. Results of the post-hoc tests:

Multiple Comparisons

Dependent Variable: Income before the program

                                                                              Mean Difference                          95% Confidence Interval
               (I) Level of education          (J) Level of education        (I-J)        Std. Error   Sig.            Lower Bound   Upper Bound
Tukey HSD      Did not complete high school    High school degree            -1.5724*     .05911       .000            -1.7112       -1.4337
                                               Some college                  -3.7784*     .07134       .000            -3.9458       -3.6110
               High school degree              Did not complete high school  1.5724*      .05911       .000            1.4337        1.7112
                                               Some college                  -2.2060*     .07463       .000            -2.3811       -2.0308
               Some college                    Did not complete high school  3.7784*      .07134       .000            3.6110        3.9458
                                               High school degree            2.2060*      .07463       .000            2.0308        2.3811
Bonferroni     Did not complete high school    High school degree            -1.5724*     .05911       .000            -1.7142       -1.4307
                                               Some college                  -3.7784*     .07134       .000            -3.9495       -3.6073
               High school degree              Did not complete high school  1.5724*      .05911       .000            1.4307        1.7142
                                               Some college                  -2.2060*     .07463       .000            -2.3849       -2.0270
               Some college                    Did not complete high school  3.7784*      .07134       .000            3.6073        3.9495
                                               High school degree            2.2060*      .07463       .000            2.0270        2.3849
Tamhane        Did not complete high school    High school degree            -1.5724*     .05447       .000            -1.7028       -1.4421
                                               Some college                  -3.7784*     .08283       .000            -3.9773       -3.5795
               High school degree              Did not complete high school  1.5724*      .05447       .000            1.4421        1.7028
                                               Some college                  -2.2060*     .08304       .000            -2.4053       -2.0066
               Some college                    Did not complete high school  3.7784*      .08283       .000            3.5795        3.9773
                                               High school degree            2.2060*      .08304       .000            2.0066        2.4053
Games-Howell   Did not complete high school    High school degree            -1.5724*     .05447       .000            -1.7004       -1.4445
                                               Some college                  -3.7784*     .08283       .000            -3.9735       -3.5833
               High school degree              Did not complete high school  1.5724*      .05447       .000            1.4445        1.7004
                                               Some college                  -2.2060*     .08304       .000            -2.4015       -2.0104
               Some college                    Did not complete high school  3.7784*      .08283       .000            3.5833        3.9735
                                               High school degree            2.2060*      .08304       .000            2.0104        2.4015
*. The mean difference is significant at the .05 level.

The post-hoc tests compare the specific groups with each other. Groups that differ significantly are
usually flagged with an asterisk (*). The significance value for the group comparison should also
be smaller than 0.05.

For this example, I have selected both the post-hoc tests that assume homogeneity of variances
and those that do not. From the Levene's statistic, I can now say that the assumption of
homogeneity of variances has not been met. Therefore, I need to look at the results of either the
Tamhane or the Games-Howell post-hoc test. Both these tests indicate that there are statistically
significant differences between all the groups.

The hypothesis that I tested in this example was that significant differences exist in the income
level of people who did not complete school, those who did complete school, and those with
post-matric training. I can now say that the ANOVA showed that significant differences do exist.
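Outside SPSS, an equal-variances post-hoc comparison can be sketched with the statsmodels library in Python (Games-Howell is not part of core statsmodels, so this sketch only covers the Tukey HSD case; the scores and group labels are invented for illustration):

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical income scores with a group label for each observation
scores = np.array([7.2, 7.8, 7.5, 8.1, 9.1, 9.4, 8.9, 9.6, 11.2, 11.8, 11.5, 10.9])
groups = np.array(["no school"] * 4 + ["high school"] * 4 + ["college"] * 4)

# Tukey's HSD compares every pair of groups (assumes equal variances)
result = pairwise_tukeyhsd(scores, groups, alpha=0.05)
print(result)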

From the means plot I can see which of the groups has the highest income:

[Means plot: mean of income before the program (y-axis) plotted against level of education
(x-axis: Did not complete high school, High school degree, Some college). The plot shows the
highest mean income for the Some college group.]

8.6 The non-parametric alternatives for the One-way ANOVA


When the data does not meet the requirements/assumptions of the parametric one-way ANOVA,
the Kruskal-Wallis H test, the Median test or the Jonckheere-Terpstra test can be used.

For the purpose of this module, we will only look at the Kruskal-Wallis H Test.

The Kruskal-Wallis H Test is an extension of the Mann-Whitney U test and is the more powerful
and preferable non-parametric alternative to use. Where the ANOVA uses the F-ratio, the Kruskal-
Wallis test uses the H statistic to assess whether differences exist. Basically, it compares the
medians of the samples/groups. Data used for the Kruskal-Wallis test should meet the following
requirements:
• Groups must be independent
• There should be more than 5 respondents per group (preferably 10)
• Sample sizes should be equal, or as equal as possible

The distribution need not be normal and variances need not be equal.
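A minimal Python sketch of the Kruskal-Wallis H test with scipy (three small, invented groups of ordinal scores for illustration):

from scipy import stats

# Hypothetical ordinal scores for three independent groups
group1 = [3, 4, 2, 5, 3, 4, 3]
group2 = [2, 3, 3, 4, 2, 3, 4]
group3 = [4, 5, 4, 5, 3, 5, 4]

h_stat, p_val = stats.kruskal(group1, group2, group3)
print(f"H = {h_stat:.3f}, p = {p_val:.3f}")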

In some situations, you may not want to compare groups on one independent variable alone; in
other words, you would like to see whether there are differences based on more than one grouping
variable. For example: do Black, Asian and White South Africans differ in terms of demographic
location, number of children, and the number of people living in one household? When two
independent variables are included we make use of the two-way ANOVA; when three independent
variables are included we make use of the three-way ANOVA. ANOVAs in which more than one
"factor" is tested for an effect can also be called factorial ANOVAs (a brief sketch follows the links
below). See more at:
o http://davidmlane.com/hyperstat/A134930.html
o http://pluto.fss.buffalo.edu/classes/psy/segal/2072001/anova2/ANOVA2.html
o http://arts.uwaterloo.ca/~djbrown/psych391/Test2/Factorial-Variance1.pdf
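As a brief sketch of what a factorial design looks like outside SPSS, the statsmodels formula interface in Python can fit a two-way ANOVA with an interaction term (the data frame below is entirely made up for illustration):

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical data: two grouping variables (factors) and one dependent variable
data = pd.DataFrame({
    "gender": ["m", "m", "f", "f", "m", "m", "f", "f"],
    "region": ["north", "south"] * 4,
    "score": [12.1, 10.4, 13.2, 11.8, 12.6, 10.9, 13.5, 11.2],
})

# Two-way (factorial) ANOVA: both main effects and their interaction
model = smf.ols("score ~ C(gender) * C(region)", data=data).fit()
print(anova_lm(model, typ=2))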

When more than one dependent variable is included, the Multivariate analysis of variance or
MANOVA is used. See:
o http://userwww.sfsu.edu/~efc/classes/biol710/manova/manovanew.htm
o http://www.utexas.edu/cc/docs/stat38.html

Remember when to use a partial correlation? You want to keep the effect of one variable constant
in order to see what the relationship between two other variables is without its interference.
Sometimes, when we want to test differences with an ANOVA, we may also want to control for the
effect of another variable. In such cases we use the Analysis of Covariance (ANCOVA). See also:
http://www-users.cs.umn.edu/~ludford/Stat_Guide/ANCOVA.htm

