
Chapter 9 Experimental Design Research Methods (pp. 187-232)
Overall teaching objective: To introduce undergraduate criminal justice research methods
students to the various experimental design research methods and to demonstrate their
applications.
Note to instructors: This chapter is presented in two sections. The first section provides an
overview of the research method. The second section uses a research report to demonstrate how
previous researchers applied this method to a project relevant to criminal justice practice. In
both sections the material is organized by the generic research process that was presented in
Chapter 2.

Experimental design research methods are considered one of the purest forms of social
science inquiry.
When done correctly, results from an experimental design provide very good insight into
the actual causes of social phenomena.
Because of their ability to isolate and measure the effect of a single independent variable
on a single dependent variable, experimental design models are especially useful in
explanatory research.
Making Research Real 9.1 Would a Speed Trap Reduce Traffic Crash Fatalities? (p. 187)
A police department conducts an experiment to determine if a speed trap would
reduce traffic fatalities on an interstate highway that traverses the community.
The experiment contains all of the features of an experimental design: a pretest,
a posttest, a treatment, an experimental group and a control group.
Experimental Design Basics (p. 189)
An experiment is a research method that measures the effect of an independent
variable on a dependent variable.
All experimental design models feature:
o an experimental group (the group that actually gets exposed to a treatment),
o a treatment (the independent variable that is alleged to cause change to the
dependent variable), and
o a posttest (a measurement of the dependent variable after the treatment).
More sophisticated experimental design models also include:
o a pretest (a measure of the dependent variable before the treatment), and
o a control group (the group, equivalent to the experimental group in terms
of the dependent and other variables, that is not exposed to the treatment).
Types of Experimental Designs (p. 191)
There are several types or variations of the experimental design model.
The five most commonly used experimental design models are:
o The one group no pretest experimental design model
o The one group pretest/posttest experimental design model
o The two group no pretest experimental design model
o The two group pretest/posttest experimental design model
o The Solomon Four Group experimental design model

The one group, no pretest experimental design only includes the basic elements of
an experimental design model: the experimental group, a treatment and a
posttest. This design does not include a pretest or a control group. Because of
this, the effect of the treatment cannot be accurately measured and the influence of
other factors on the dependent variable cannot be identified.

Figure 9.1 The one group, no pretest experimental design model. (p. 192)

                      PRETEST   TREATMENT   POSTTEST
EXPERIMENTAL GROUP    NO        YES         YES

The one group pretest/posttest experimental design model includes the basic
elements of an experimental design model: an experimental group, a treatment
and a posttest. In addition, this model includes a pretest that allows the researcher
to measure the actual effect of the treatment (independent variable) on the
dependent variable. This design does not include a control group so there is really
no way for the researcher to know whether something other than the treatment
caused a change to the dependent variable.

Figure 9.3 The one group pretest/posttest experimental design model. (p. 193)

                      PRETEST   TREATMENT   POSTTEST
EXPERIMENTAL GROUP    YES       YES         YES

The two group no pretest experimental design model includes the basic elements
of the experimental design model: an experimental group, a treatment and a
posttest. This design includes a control group so the researcher would be able to
determine that the independent variable, by itself, had some effect on the
dependent variable. However, because there is no pretest the researcher cannot
measure how much effect the independent variable had on the dependent variable.

Figure 9.5 The two group no pretest experimental design model. (p. 194)

                      PRETEST   TREATMENT   POSTTEST
EXPERIMENTAL GROUP    NO        YES         YES
CONTROL GROUP         NO        NO          YES

The two group pretest/posttest experimental design model includes the basic
elements of the experimental design model: an experimental group, a treatment
and a posttest. In addition, this model has both a pretest and a control group.
These two features enable the researcher to measure the actual effect of the
treatment on the dependent variable and to determine whether or not other factors
might have caused a change in the dependent variable.

Figure 9.7 The two group pretest/posttest experimental design model. (p. 195)

                      PRETEST   TREATMENT   POSTTEST
EXPERIMENTAL GROUP    YES       YES         YES
CONTROL GROUP         YES       NO          YES
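
A minimal sketch of how a pretest and a control group can be combined into a single estimate. It assumes the dependent variable is a numeric score and that the treatment effect is estimated as the experimental group's change minus the control group's change. This is one common calculation, not necessarily the one the chapter has in mind. The function name and all scores are hypothetical.

```python
from statistics import mean


def estimated_treatment_effect(exp_pre, exp_post, ctrl_pre, ctrl_post):
    """Return (experimental change, control change, net effect).

    Each argument is a list of scores on the dependent variable.
    """
    # Change in the group that received the treatment.
    exp_change = mean(exp_post) - mean(exp_pre)
    # Change in the group that did not; this reflects other factors
    # (history, maturation, testing and so on), not the treatment.
    ctrl_change = mean(ctrl_post) - mean(ctrl_pre)
    # What remains after removing the change seen without treatment.
    return exp_change, ctrl_change, exp_change - ctrl_change


# Hypothetical pretest and posttest scores, for illustration only.
exp_change, ctrl_change, net_effect = estimated_treatment_effect(
    exp_pre=[10, 12, 14], exp_post=[16, 18, 20],
    ctrl_pre=[11, 12, 13], ctrl_post=[12, 13, 14],
)
print(exp_change, ctrl_change, net_effect)
```

Without the control group, the whole experimental change would be credited to the treatment. Without the pretest, neither change could be computed at all.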

The Solomon Four Group experimental design model includes the basic elements
of the experimental design model: an experimental group, a treatment and a
posttest. This model also contains a pretest and a control group. More
significantly, this model contains an extra experimental group and an extra control
group. These additional features enable the researcher to determine how much, if
at all, the research subjects' exposure to the pretest affected their performance on
the posttest.

Figure 9.9 The Solomon Four Group experimental design model.

                                         PRETEST   TREATMENT   POSTTEST
PRETEST/POSTTEST EXPERIMENTAL GROUP      YES       YES         YES
PRETEST/POSTTEST CONTROL GROUP           YES       NO          YES
POSTTEST ONLY EXPERIMENTAL GROUP         NO        YES         YES
POSTTEST ONLY CONTROL GROUP              NO        NO          YES
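
The logic of the four groups can be sketched as a comparison of posttest means. This is an illustrative simplification: it assumes a numeric dependent variable and equivalent groups, and it reads the pretest effect off the two control groups, which differ only in whether they took the pretest. The group labels, function name and scores are hypothetical.

```python
from statistics import mean


def solomon_summary(posttests):
    """Summarize a Solomon Four Group experiment from posttest scores.

    `posttests` maps each of the four group labels to a list of scores.
    Returns (treatment effect among pretested groups,
             treatment effect among unpretested groups,
             pretest effect).
    """
    m = {group: mean(scores) for group, scores in posttests.items()}
    effect_pretested = m["pretest_experimental"] - m["pretest_control"]
    effect_unpretested = (
        m["posttest_only_experimental"] - m["posttest_only_control"]
    )
    # Neither control group was treated, so any gap between them is
    # attributed to exposure to the pretest.
    pretest_effect = m["pretest_control"] - m["posttest_only_control"]
    return effect_pretested, effect_unpretested, pretest_effect


# Hypothetical posttest scores, for illustration only.
scores = {
    "pretest_experimental": [20, 22, 24],
    "pretest_control": [14, 16, 18],
    "posttest_only_experimental": [18, 20, 22],
    "posttest_only_control": [13, 15, 17],
}
effect_pretested, effect_unpretested, pretest_effect = solomon_summary(scores)
print(effect_pretested, effect_unpretested, pretest_effect)
```

If the two treatment effects differ noticeably, the pretest may have changed how subjects responded to the treatment, which the simpler designs cannot detect.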

Threats to Internal Validity in Experiments (p. 199)


You may recall that validity is the extent to which a measure actually measures
the concept it purports to measure. For example, feet and inches would be valid
measures for height, but not for weight.
In experimental designs, the term validity is used in a slightly different sense.
There are two types of validity in experimental design research: internal and
external.
Internal validity refers to the ability of an experimental design to document the
causal relationship between an independent variable and a dependent variable.
There are seven common threats to the internal validity of an experiment: history,
maturation, mortality, testing, instrumentation, selection bias and regression.
Table 9.1 - Threats to internal validity in an experiment. (p. 201)

Threat to internal validity: History
Description: Major events happen during an experiment that affect the research subjects
and thus the dependent variable.
Example: During an experiment on the effect of an education program intended to reduce
underage drinking, a popular rap star dies from alcohol poisoning. Empathetic research
subjects may decide to drink less as a result. This will affect the researcher's ability to
measure the independent effect of the educational program.

Threat to internal validity: Maturation
Description: Natural developmental changes in the research subjects affect the outcome
of the experiment.
Example: A researcher is testing the effects of a new drug use prevention program on high
school students. Because young people change within a very short time span, some of the
research subjects may mature and become less likely to abuse drugs regardless of whether
they participated in the program.

Threat to internal validity: Mortality
Description: A loss of research subjects can occur over the course of an experiment and
affect the outcome of the experiment.
Example: A criminologist studies juvenile delinquents from their fifteenth through thirtieth
birthday to determine the effect of life course changes on criminal behavior. Some of these
research subjects move, drop out of the experiment or die over the 15-year study. As a
result, the researcher does not have enough research subjects to ensure her results are
significant.

Threat to internal validity: Testing
Description: Exposing research subjects to a pretest prior to the treatment can change the
outcome of the posttest.
Example: To evaluate the effect of a prison-based rehabilitation program, a researcher
administers a pretest to members of the experimental and control groups. Following the
six-week training program, in which only the subjects from the experimental group
participate, the researcher administers the same test to both groups. The members of the
experimental group report higher scores, suggesting that the training program has its
intended effect. But so do the members of the control group. It is possible that the members
of the control group improved because they had taken the pretest beforehand.

Threat to internal validity: Instrumentation
Description: Differences between the pretest and posttest instruments cause a change in the
dependent variable.
Example: In order to avoid a testing effect, the researcher in the example above decides to
make the pre- and posttest slightly different. The experiment's results indicate that the
rehabilitation program had an enormous effect, so big that a peer reviewer asks to review
the pre- and posttests (i.e., the instruments). The reviewer argues that the questions were
so different between the two tests that the results are not comparable.

Threat to internal validity: Selection bias
Description: Differences between the members of the experimental and control groups result
in different effects of the treatment.
Example: In this case the researcher either systematically or inadvertently assigned research
subjects to the groups so that the groups were not equivalent with respect to the dependent
or other variables that are affected by the treatment.

Threat to internal validity: Regression
Description: Although there may be an initial treatment effect, the effect diminishes over
time, indicating that the independent variable has no long-term effect.
Example: An educator conducts an experiment to evaluate a program similar to Head Start,
whereby impoverished children are allowed to enter school at an earlier age. His results
suggest that the program does improve reading and math scores in elementary school. But by
the time the children enter junior high school, their reading and math scores are more or
less equal to those of the children in the control group. In short, the independent variable
has no long-term effect.

Making Research Real 9.2 So Much for Community Relations (p. 199)
A media consultant is helping a large police department improve its public image.
She is evaluating the effect of directed public safety announcements (the
treatment) on the public perception of the police department (the dependent
variable).
Prior to the posttest (a survey to measure the effects of the public safety
announcements) the local news broadcasts a video of police officers beating a
subject following a high speed chase.
This is an example of how history affects the internal validity of an experiment's
results.

Making Research Real 9.3 Measuring the Effect of Pornography (p. 200)
A psychologist is conducting an experiment on the effect of pornography (the
independent variable) on the frequency of risky sexual behaviors (the dependent
variable).
He tests and retests a sample of adolescents and learns that most of them
experience an increase in risky sexual behaviors as they approach 18 years of
age, regardless of whether they were exposed to pornography.
Another psychologist reminds the researcher that sexual desire, and therefore
sexual behavior, naturally increases during late adolescence.
This is an example of maturation.
Making Research Real 9.4 Measuring the Effect of Video Games (p. 200)
A criminologist conducts an experiment on the effect of video games (the
independent variable) on aggressive behavior (the dependent variable).
Toward the end of the experiment she learns that several members of the control
group (who were prohibited from playing video games during the experiment)
forgot and played them anyway.
She removes these errant members from the experiment. Fortunately, there are
not enough of them to affect the outcome of the experiment.
This is an example of mortality.
Threats to External Validity in Experiments (p. 202)

External validity refers to the generalizability of an experiment's results to other
settings and situations. There are two common threats to external validity:
reactivity and selection bias.
Reactivity occurs when research subjects change their behavior when they
become aware that they are being watched or measured.
An interaction between selection bias and the experimental variable occurs when:
o The experimental and control groups are not equivalent with respect to the
dependent and other variables that could affect the outcome of the
experiment, and
o One of these groups includes members with characteristics that cause them
to be more or less susceptible to change caused by the treatment
(independent variable).

Table 9.2 - Threats to external validity in an experiment. (p. 204)

Threat to external validity: Reactivity
Description: An awareness that they are being measured causes a change in the behavior of
research subjects.
Example: A psychologist conducting an experiment on the effect of a new exercise program on
obese juvenile delinquents requires his research participants to report their weight on a
weekly basis. Overweight participants ashamed of their weight may report lower weights,
thwarting the researcher's ability to measure the effect of the exercise program.

Threat to external validity: Interaction between selection bias and the experimental variable
Description: There is a failure to ensure that the subjects assigned to the experimental and
control groups are more or less equivalent with respect to the variables that might
influence the dependent variable. One of these groups is either more or less susceptible to
change caused by the treatment (independent variable).
Example: A researcher is conducting an experiment to determine the effect of frequent shift
changes on police officers' cardiovascular health. She inadvertently assigns a larger
proportion of older officers to the experimental group. Because age is an influential
factor in cardiovascular health, age, and not frequent shift changes, may cause a change in
the dependent variable.

Making Research Real 9.5 Watching Out for Bullies (p. 203)
A school resource officer conducts a study on bullying.
This experiment involves observing students on a playground.
The officer realizes, almost too late, that his presence in uniform on the campus
would be reactive because potential bullies would be deterred.
He decides instead to use non-reactive security cameras.
Making Research Real 9.6 Self Esteem Among Child Abuse Victims (p. 203)
A researcher inadvertently assigns a higher percentage of girls to the
experimental group.
Gender plays a role in the effect of self-esteem.
This selection bias may affect the outcome of the research.

The Benefits and Limitations of Experimental Research (p. 204)


Table 9.3 - The benefits of experimental research. (p. 205)
Ability to isolate the effect of an independent variable on a dependent variable
Ability to measure how much of an effect a treatment has on an outcome
Ability to demonstrate causality, or cause and effect

Table 9.4 - The limitations of experimental design. (p. 206)
Requirement of much time, money and control
Potential for serious ethical concerns
Possible lack of feasibility

Note to instructors: In this section the research method is discussed within the context of a
relevant research project. The material is organized by the generic research process that was
presented in Chapter 2. The story (i.e. the research project that provides the context) is presented
in a series of set out boxes called Developing the Method. These are repeated here to allow
instructors to use the research to teach the concepts, tools and techniques related to this research
method.
The Experimental Research Process (p. 206)
Reading about how a research method should be conducted is important.
However, learning a new research method is easier when you can see how
another person applied the method in an actual research situation.
In this section, we will take a look at how a researcher might implement an
experimental design following the basic steps of the research process outlined in
Chapter 2.
At the end of each research step, you will notice a box called Developing the
Method.
Within these boxes, you will read about an actual research experiment in the field
of criminology. We begin with a general introduction of this study.
Developing the Method 9.1 - A Case Study in Experimental Research (p. 206)

One of the most difficult questions in American policing is how to effectively respond to
domestic violence incidents. Calls for service to locations experiencing domestic violence are
among the most dangerous of policing activities. Strong emotions, drug and alcohol abuse,
and/or severe economic problems are present in nearly every domestic violence incident. These
are very personal events and the police are often viewed, even by the victims, as interlopers into
a private matter. How to respond to these calls and how to handle cases of alleged domestic
violence, therefore, are important to the police.
In 1984, Lawrence W. Sherman and Richard A. Berk published an article called "The Specific
Deterrent Effects of Arrest for Domestic Assault" in the American Sociological Review (1984a).
The research considered whether the threat of an arrest had a deterrent effect on perpetrators of
domestic violence. The research was conducted from 1981 to 1982 in Minneapolis, Minnesota
with the cooperation of the Minneapolis Police Department and the Police Foundation. The
project was funded through a National Institute of Justice grant.
The results of this research had a profound effect on the policing procedures relating to domestic
assault and violence (Buzawa and Buzawa, 1990). Specifically, policing leaders throughout the
nation reconsidered their long standing policies and procedures for dealing with domestic abuse.
In short, this research was a game changer in criminal justice practice.
Ask a Research Question (p. 207)
Because of their ability to isolate the effect of a single variable on an outcome,
experimental designs are most often used in explanatory research. Experiments
are appropriate for both pure and applied research purposes.
Developing the Method 9.2 - Asking a Research Question in Experimental Research (p. 208)
Before the 1980s, the standard response to domestic violence in most police departments was to
not get involved. Police officers were routinely trained that domestic assault was a private
matter not warranting an official response, such as an arrest. In fact, in only the most egregious
cases were police officers even allowed to arrest an individual they suspected of being guilty of
domestic assault. In addition to being emotionally charged, domestic assault cases often involve
individuals who pose a real safety risk to the police officers who respond to the call. Moreover,
the victims of domestic violence, usually women, are often hesitant to testify against their
domestic partners in court. In many cases, the alleged abuser is the principal wage earner in the
family and without their income, the family might be destitute. Regardless of the reasons, a
victim's unwillingness to testify makes it very difficult for a police officer to convince a
prosecutor to file charges, especially since the officer does not witness the abuse in most cases.
Despite this standard response, many policing leaders became convinced that a more proactive
approach might be warranted since police officers were repeatedly called to the same houses.
The thinking was that a more preventive approach would reduce the overall number of calls for
service, thereby enabling the police to focus on other criminal matters. Advocates for battered
women also entered the debate and encouraged the police to take domestic assault more
seriously. Some of these groups even filed lawsuits against local police departments to compel
officers to arrest suspected domestic abusers.
At the height of this controversy, Sherman and Berk began the Minneapolis Domestic Violence
Experiment, or MDVE. According to Sherman and Berk, the purpose of the experiment was to
address an intense debate about how police should respond to cases of domestic violence
(1984b: 1). The researchers recognized three existing police responses to domestic violence: (1)
the traditional approach of doing as little as possible, (2) active mediation or arbitration of
disputes, and (3) arrest. In the researchers' words, "If the purpose of police responses to domestic
violence calls is to reduce the likelihood of that violence recurring, the question is which of these
approaches is more effective than the others?" (1984b: 1).
The MDVE is most appropriately classified as explanatory research in that it attempted to
explain which of the three police responses deterred future domestic violence incidents. Because
the researchers initially intended that the results would influence public policy, the experiment
would be an example of applied research. Some might even argue that given the common
policing response to domestic violence at the time, the experiment and its results could be
classified as action research.

Conduct a Literature Review (p. 208)


In preparing to conduct an experiment, researchers should review the previous
literature, paying particular attention to how past researchers measured the
dependent variable and what independent variables have been found to affect the
dependent variable.
Developing the Method 9.3 - Conducting a Literature Review in Experimental Research (p. 209)
The literature review for the MDVE drew on three broad areas of research: classic research on
police decision-making, research on mediation and arbitration in domestic violence prevention,
and more contemporary research on police responses to domestic violence (Sherman and Berk
1984a and 1984b).
The researchers' review of the literature confirmed that, for the most part, the police tended to
respond to misdemeanor domestic violence cases rather informally. Very few departments
reported a substantial number of arrests in these cases and some even had policies that prohibited
the police from making arrests for misdemeanor domestic violence. With respect to mediation
and arbitration, the researchers consistently found that mediation and arbitration did not reduce
repeat offending in cases of domestic violence. Finally, the researchers learned that a high
percentage of spousal homicides occurred in homes that the police had previously been called to
on allegations of domestic assault.

In short, the literature review revealed what the researchers suspected: domestic violence has a
strong potential for recidivism. Ignoring it and attempting to mediate it were doing very little to
reduce repeat offending. As for whether arrests would make a difference, the existing research
was not instructive. In the words of the researchers, "[I]t was impossible to determine from the
data whether making more or fewer arrests would have reduced the homicide rate" (Sherman and
Berk, 1984b: 2). Therefore, they decided to proceed with their own experiment.
Refine the Research Question (p. 210)
Because experimental models are used in explanatory research, creating specific
hypotheses is an essential step in the experimental research process.
Developing the Method 9.4 - Refining the Research Question in Experimental Research (p. 211)
Sherman and Berk identified three possible strategies that might reduce domestic violence
reoffending: (1) arrest, (2) separate (send the suspect from the scene for eight hours), and (3)
mediate. The objective of the research was to determine which of these strategies would reduce
the frequency and seriousness of future domestic violence incidents among the research subjects.
This research involved one independent variable: police response. The attributes of this
independent variable were arrest, separate and mediate. Although most research experiments
have only one dependent variable, Sherman and Berk decided to evaluate the effect of the
independent variable on two outcomes: the frequency of reoffending and the seriousness of future
domestic violence incidents. This resulted in six sets of alternative and null hypotheses.
Ha: Arresting individuals suspected of domestic violence reduces the frequency of reoffending.
Ho: Arresting individuals suspected of domestic violence does not affect the frequency of
reoffending.
Ha: Arresting individuals suspected of domestic violence reduces the seriousness of future
domestic violence incidents.
Ho: Arresting individuals suspected of domestic violence does not affect the seriousness of
future domestic violence incidents.
Ha: Separating individuals suspected of domestic violence from their domestic partner reduces
the frequency of reoffending.
Ho: Separating individuals suspected of domestic violence from their domestic partner does not
affect the frequency of reoffending.
Ha: Separating individuals suspected of domestic violence from their domestic partner reduces
the seriousness of future domestic violence incidents.
Ho: Separating individuals suspected of domestic violence from their domestic partner does not
affect the seriousness of future domestic violence incidents.

Ha: Mediating between individuals suspected of domestic violence and their domestic partner
reduces the frequency of reoffending.
Ho: Mediating between individuals suspected of domestic violence and their domestic partner
does not affect the frequency of reoffending.
Ha: Mediating between individuals suspected of domestic violence and their domestic partner
reduces the seriousness of future domestic violence incidents.
Ho: Mediating between individuals suspected of domestic violence and their domestic partner
does not affect the seriousness of future domestic violence incidents.
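
The six pairs above follow mechanically from crossing the three attributes of the independent variable with the two dependent variables. A short sketch of that bookkeeping follows; the variable and function names are hypothetical and are used only for illustration.

```python
from itertools import product

# The three attributes of the independent variable (police response),
# phrased as they appear in the hypotheses above.
TREATMENTS = [
    "Arresting individuals suspected of domestic violence",
    "Separating individuals suspected of domestic violence "
    "from their domestic partner",
    "Mediating between individuals suspected of domestic violence "
    "and their domestic partner",
]

# The two dependent variables.
OUTCOMES = [
    "the frequency of reoffending",
    "the seriousness of future domestic violence incidents",
]


def hypothesis_pairs():
    """Return one (alternative, null) pair per treatment and outcome."""
    return [
        (f"Ha: {treatment} reduces {outcome}.",
         f"Ho: {treatment} does not affect {outcome}.")
        for treatment, outcome in product(TREATMENTS, OUTCOMES)
    ]


pairs = hypothesis_pairs()
for alternative, null in pairs:
    print(alternative)
    print(null)
```

Three treatments crossed with two outcomes give the six pairs listed above. Adding a third dependent variable would raise the count to nine.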
Define the Concepts and Create the Measures (p. 212)
Experimental researchers should be concerned about conceptualization and
measurement of the independent and dependent variables, as well as the variables
used to determine equivalency between the experimental and control groups.
Developing the Method 9.5 - Conceptualization and Measurement in Experimental Research (p. 213)
Before embarking on their experiment, Sherman and Berk had to define their concepts and
operationalize their variables. They used the term domestic abuse broadly to include numerous
forms of abusive behaviors including psychological, economic, and physical abuse. Drawing
from a Minnesota state statute, they defined domestic assault as an assault on an individual by a
cohabitant or spouse (Sherman and Berk, 1984a: 263).
Sherman and Berk conceptualized the frequency of reoffending as the number of repeat offenses
that occurred within a six-month period following the first domestic violence incident. They do
not report how they conceptualized the seriousness of future domestic violence incidents
variable. They only indicate, in a footnote, that the protocols were based heavily on instruments
designed for an NIMH-funded study of spousal violence (Sherman and Berk, 1984a: 263). The
citation for this research in the bibliography is not apparent, so there is really no way for us to
know how they defined this variable. Given the importance of this dependent variable, this is a
rather stark omission. All we really know is that during the six-month follow-up interviews, the
researchers asked the victims whether subsequent domestic abuse incidents included actual
assault, threatened assault or property damage. It does not appear that the researchers attempted
to measure the seriousness of future domestic violence incidents more precisely than this.
In terms of measurement, the researchers had two important challenges. First, they had to
determine which domestic assaults to include in their research. Although all domestic assaults
are serious, they vary with respect to severity. Severe cases of domestic violence are relatively
rare. Had the researchers included only the most severe cases, it would have reduced their
sample size significantly. At the same time, the researchers did not want to include relatively
minor domestic disturbance cases, such as verbal confrontations. There were so many of these
cases that including them would have caused problems during the analysis phase. Instead,
Sherman and Berk chose to include:
simple (misdemeanor) domestic assaults, where both the suspect and victim
were present when the police arrived. Thus, the experiment included only those
cases in which police were empowered (but not required) to make an arrest under
a recently liberalized Minnesota state law; the police officer must have probable
cause to believe that a cohabitant or spouse had assaulted the victim within the
last four hours, but the police need not have witnessed the assault (Sherman and
Berk, 1984a: 263).
In order to avoid the risk of further injury, the researchers chose to exclude domestic assaults
involving life-threatening or severe injury (Sherman and Berk, 1984a: 263). Minnesota law
defined this as felony aggravated assault. These cases were too serious to include in the
experiment because someone's life was at stake and they demanded an immediate arrest. For
obvious ethical reasons, these cases could not be randomly assigned into the separation,
mediation and arrest groups.
Design a Method (p. 214)
Experimental researchers should determine which experimental design model
they want to use early in the research process. The choice of experimental design
depends on a number of factors, including how confident the researchers want to
be in their results, what is feasible and ethical in a given research setting and what
researchers want to know about the research subjects.
Table 9.5 - Threats to internal and external validity controlled in different experimental
models.1 (p. 216)

Threats           One group    One group          Two group    Two group          Solomon
                  no pretest   pretest/posttest   no pretest   pretest/posttest   Four Group
History           NO           NO                 YES          YES                YES
Maturation        NO           NO                 NO           YES                YES
Mortality         NO           YES                NO           YES                YES
Testing           NO           NO                 YES          YES                YES
Instrumentation   NO           NO                 YES          YES                YES
Regression        NO           NO                 YES          YES                YES
Reactivity        NO           NO                 NO           NO                 YES
Selection         NO           YES                NO           YES                YES

1 Adapted from Campbell, D.T. and Stanley, J.C. (1963). Experimental and quasi-experimental
designs for research. Boston, MA: Houghton Mifflin Company.
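
One way to use Table 9.5 when choosing a design is as a lookup: list the threats that matter most in a given setting, then see which models control all of them. The sketch below transcribes the YES entries of the table into sets. The names `CONTROLLED` and `designs_controlling` are hypothetical, introduced only for this illustration.

```python
# Threats marked YES for each design in Table 9.5.
CONTROLLED = {
    "one group no pretest": set(),
    "one group pretest/posttest": {"mortality", "selection"},
    "two group no pretest": {
        "history", "testing", "instrumentation", "regression",
    },
    "two group pretest/posttest": {
        "history", "maturation", "mortality", "testing",
        "instrumentation", "regression", "selection",
    },
    "Solomon Four Group": {
        "history", "maturation", "mortality", "testing",
        "instrumentation", "regression", "reactivity", "selection",
    },
}


def designs_controlling(*threats):
    """Return the designs that control every threat named."""
    needed = set(threats)
    return [
        design for design, controlled in CONTROLLED.items()
        if needed <= controlled  # every needed threat is controlled
    ]


print(designs_controlling("history", "testing"))
print(designs_controlling("reactivity"))
```

A lookup like this only narrows the candidates. Feasibility, cost and ethics still decide which of the remaining designs can actually be used.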

Developing the Method 9.6 - Designing Experimental Research (p. 216)


Early in the development of their research, Sherman and Berk decided that the best method for
their study was to use a true experimental design. Specifically, they opted to use a
pretest/posttest control group design. Recall that these researchers wanted to know which
response (i.e., arrest, separate, mediate) produced the highest deterrent effect (i.e., reduced the
frequency and seriousness of future domestic assault incidents). Therefore, Sherman and Berk
decided to create three equivalent experimental groups. Because they had three experimental
groups, their design was slightly different than most pretest/posttest control group designs, which
typically only have one experimental group.
Another important difference between Sherman and Berk's experimental design and the
pretest/posttest control group model was the lack of a pretest observation. The domestic assault
cases came to the attention of the researchers as the calls for service that met the researchers'
criteria came to the attention of the police. Because none of these research subjects were
available to the researchers prior to their exposure to one of the treatments (arrest, separate,
mediate), there was really no way to measure the dependent variables (frequency and
seriousness) before they were exposed to one of the treatments.
Of course, after the research subjects were identified and assigned to one of the three groups, the
researchers were able to determine whether the research subjects had been involved in previous
domestic assault incidents. They found that their research subjects had considerable experience
in the criminal justice system and with domestic assault in particular. In 80 percent of the cases,
the victims had been assaulted by the suspect within the past six months. The police had
intervened in 60 percent of these cases. Twenty-seven percent of the couples involved in the
experiment were already participating in a counseling program. And among the male suspects
involved in the research, 59 percent had been previously arrested for other offenses, 31 percent
had been previously arrested for an assaultive offense and 5 percent had been previously arrested
for a domestic violence offense (Sherman and Berk, 1984b: 5). This information substituted for
the information that would have been provided by a pretest.
You may recall that creating equivalent groups is one of the important requirements for an
experiment. In order to ensure equivalency between the three experimental groups, the
researchers devised a randomized group assignment plan:
The design called for each officer to carry a pad of report forms, color coded for
the three different police responses. Each time the officers encountered a
situation that fit the experiment's criteria, they were to take whatever action was
indicated by the report form on top of the pad. The forms were numbered and
arranged for each officer in an order determined by the lottery. The consistency
of the lottery assignment was to be monitored by research staff observers riding
on patrol for a sample of evenings (Sherman and Berk, 1984b: 3).
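
The lottery-ordered pad described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual procedure: the function name build_pad, the pad size, the seed, and the choice to give each response an equal number of forms are all assumptions made for the example.

```python
import random
from collections import Counter

# The three police responses tested in the experiment.
RESPONSES = ("arrest", "separate", "mediate")


def build_pad(n_forms, seed):
    """Return one officer's pad as (serial_number, response) pairs.

    Assumption for illustration: each response gets an equal share of
    the forms, and a seeded shuffle stands in for the lottery.
    """
    rng = random.Random(seed)
    forms = [RESPONSES[i % len(RESPONSES)] for i in range(n_forms)]
    rng.shuffle(forms)  # the officer cannot predict the next form
    # Serial numbers record the order in which forms must be used.
    return list(enumerate(forms, start=1))


pad = build_pad(n_forms=30, seed=1981)  # hypothetical pad size and seed
print(pad[:5])
print(Counter(response for _, response in pad))
```

Because the order is fixed before any call is answered, the officer's impression of a suspect cannot influence which group that suspect lands in, which is what makes the three groups equivalent on average.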

The final element of this experimental design was the posttest. During this phase, Sherman and
Berk determined which enforcement response (arrest, separation, or mediation) had the greatest
effect on the frequency and seriousness of future domestic violence incidents. The simplest way
to determine which of the three enforcement responses reduced the frequency and seriousness of
future domestic assaults was to compare the re-offending rates of the three groups during a
period of time following the treatment phase. Sherman and Berk chose to measure the
re-offending rates of the three groups within six months of the initial intervention. The
decision to use this time frame was based on previous research indicating the frequency of
domestic assault re-offending. You may recall that during the six-month period prior to the
treatment phase, 80 percent of the victims involved in the experiment had been assaulted by their
domestic partner.
Victims are often reluctant to call the police in subsequent cases of domestic assault, especially if
their domestic partner was previously arrested and/or was the major breadwinner for the family.
Hence, calls to the police for domestic violence were not a perfect measure for reoffending;
many domestic assaults may have occurred that did not result in calls to the police. Even when
calls were made, official records did not indicate the seriousness of the domestic assault incident.
To overcome these challenges, Sherman and Berk chose to conduct follow-up interviews with
the victims in addition to drawing on police records. For six months following the initial
domestic violence incident, the researchers attempted to contact each victim every two weeks:
Anticipating something of the victims' background, a predominantly minority,
female research staff was employed to contact the victims for a detailed
face-to-face interview, to be followed by telephone follow-up interviews every two weeks
for 24 weeks. The interviews were designed primarily to measure the frequency
and seriousness of victimizations caused by the suspect after the police
intervention. The research staff also collected criminal justice reports that
mentioned the suspect's name during the six-month follow-up period (Sherman
and Berk, 1984a: 263).
The researchers knew that their analyses would require data being collected at various levels of
measurement. For example, they wanted to know whether a subsequent domestic assault
occurred. This was measured at the nominal level (yes or no). In addition, they wanted to know
how much time had elapsed (in days) between the end of the intervention and the subsequent
domestic assault. This variable had to be measured at the ratio level of measurement because it
was possible for a subsequent domestic assault to take place immediately after the intervention
(i.e., zero days after the intervention). Finally, the researchers wanted to know something about
the seriousness of any subsequent domestic assaults. They measured seriousness at the ordinal
level by creating three attributes: actual assault, threatened assault, or property damage.
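
One way to picture the three levels of measurement is as the fields of a single follow-up record. The sketch below is illustrative only: the class and field names are hypothetical, and the rank order given to the three seriousness attributes (property damage lowest, actual assault highest) is an assumption, since the text lists the attributes without ranking them.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Seriousness(IntEnum):
    """Ordinal: the ranks can be ordered, but the gaps between them
    carry no meaning. The ordering here is assumed for illustration."""
    PROPERTY_DAMAGE = 1
    THREATENED_ASSAULT = 2
    ACTUAL_ASSAULT = 3


@dataclass
class FollowUpRecord:
    reoffended: bool                    # nominal: yes or no
    days_to_reoffense: Optional[int]    # ratio: 0 days is a true zero
    seriousness: Optional[Seriousness]  # ordinal: ranked categories


# Hypothetical case: a threatened assault on the day of the intervention.
record = FollowUpRecord(
    reoffended=True,
    days_to_reoffense=0,
    seriousness=Seriousness.THREATENED_ASSAULT,
)
print(record)
```

Keeping the level of measurement visible in the record is a reminder of which statistics are appropriate for each variable.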
Research involving human beings who are already vulnerable to physical, psychological and
legal harm must be evaluated extensively prior to its implementation to avoid additional harm to
the participants. In this case, the research subjects (both the suspects and their victims) were
potentially exposed to numerous types of harm. In addition, domestic assault cases often result
in injuries to police officers. As a result, this particular experiment was required to undergo
considerable scrutiny. The experiment was funded through a grant to the Police Foundation from
the National Institute of Justice's Crime Control Theory Program. Approval of the grant required
the grantees (Sherman, Berk and the Police Foundation) to present detailed plans on how they
intended to avoid potential harm to the individuals involved in the experiment. It is also likely
that the researchers themselves were required to submit their research plans for human subjects
review in their respective universities (University of Maryland, College Park and University of
California, Santa Barbara). Finally, the experiment was conducted with the cooperation of the
Minneapolis Police Department. Chief Anthony V. Bouza was no doubt required to seek
approval from the City of Minneapolis prior to agreeing to participate.
To identify and mitigate potential problems before the implementation of the experiment, the
researchers decided to conduct a three-day conference. Sherman and Berk had previously chosen to
focus on the two precincts in the city that had the highest historical incidence of domestic assaults.
The 34 officers assigned to these precincts attended this conference and were asked to participate
in the experiment for one year. All but one officer agreed to participate. Preliminary
conferences like these tend to be very helpful to researchers in that they identify potential
problems from the perspective of the individuals who are participating in the experiment. This
conference was no exception.
Collect the Data (p. 218)
The number and nature of data collection strategies used by experimental
researchers depend on the experimental design used by the researcher.
Developing the Method 9.7 - Collecting Data in Experimental Research (p. 219)
Often in field experiments, researchers conduct pilot tests. This is analogous to what survey
researchers do when they pretest their survey instrument on a small subsample of respondents.
The purpose of this process is to determine whether the survey will function as it was intended to
function. A pilot test for a field experiment does the same thing. It is an opportunity for
researchers to work out the bugs in an experimental design. Sherman and Berk do not report
whether or not they conducted a pilot test. Instead, the three day conference they conducted
prior to the experiment was intended to identify potential problems in the conduct of the
experiment. Even with this conference, it is evident in their report that they were forced to make
numerous changes in the conduct of the experiment as they confronted problems and challenges.
It is not uncommon for researchers to encounter problems in the field. Seldom do research plans
go as expected. Sherman and Berk initially determined that they would need at least 300 cases in
order to obtain meaningful results. Originally, they trained 34 officers in two of Minneapolis's
four patrol precincts. These two precincts had the highest density of domestic violence reports
and arrests, so the researchers were more likely to gather an acceptable sample quickly. In addition,
managing the experiment in two rather than four precincts would be easier. The experiment
began on March 17, 1981. The researchers estimated that it would take about one year to collect
300 cases. By November of that year, however, the researchers were disappointed with the
number of cases they had collected. Thus, they recruited an additional 18 officers to participate
in the experiment. By August 1, 1982, the researchers had collected 314 cases (Sherman and
Berk, 1984a and 1984b).

From the beginning, Sherman and Berk were concerned about whether or not the police officers
would adhere to the group assignment procedures they had developed. It was logical that a
police officer, upon encountering an overly aggressive or non-compliant suspect, might ignore
the lottery system's assignment of this person to the mediate or separate group and decide to
initiate an arrest. It was equally logical that a police officer, upon encountering a remorseful or
compliant suspect, might ignore the assignment of this person to the arrest group and decide to
mediate or separate the offender. To monitor this potential problem, Sherman and Berk assigned
researchers to ride along with police officers and observe the group assignment process.
Because of the infrequent nature of domestic assault cases and the expenses related to paying
these researchers, however, Sherman and Berk eventually abandoned this plan. In its place, they
assigned researchers to monitor the police radio and respond to domestic assault calls as they
occurred. Unfortunately, even this method failed since the monitors could not determine from
the cryptic nature of radio communications which cases were related to domestic assault. The
researchers finally settled on an alternative method to ensure that police officers adhered to the
assignment method: they printed serial numbers on each of the reporting forms. If an officer
submitted a reporting form that was out of sequential order, researchers would know that the
officer ignored the group assignment process. Of course, this method did not prevent an officer
from ignoring the group assignment process; it just indicated to the researchers if and when the
officers ignored the assignment procedures.
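
The serial-number check described above amounts to a simple audit of the order in which an officer's forms come back. The following sketch assumes forms are numbered from 1 and that an officer's submissions are available as a list in the order received; the function name audit_forms and the sample data are hypothetical.

```python
def audit_forms(submitted_serials):
    """Flag possible departures from the lottery order.

    Returns two lists:
      late    - forms handed in after a higher-numbered form
      missing - forms skipped and never handed in
    """
    late = []
    highest_seen = 0
    for serial in submitted_serials:
        if serial < highest_seen:
            late.append(serial)
        highest_seen = max(highest_seen, serial)
    expected = set(range(1, highest_seen + 1))
    missing = sorted(expected - set(submitted_serials))
    return late, missing


# Hypothetical officer who skipped form 3, used it later, and never used form 6.
late, missing = audit_forms([1, 2, 4, 3, 5, 7])
print("late:", late, "missing:", missing)
```

As the text notes, a check like this only detects departures after the fact; it does nothing to prevent them.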
Ultimately, Sherman and Berk found that 98.9 percent of the suspects who should have been
assigned to the arrest group were arrested, 77.8 percent who should have been assigned to the
mediate group were provided mediation services, and 72.8 percent who should have been
assigned to the separate group left the scene for the specified eight hours. They acknowledged
that many of the officers occasionally failed to follow the experimental design fully, but they
were confident that the majority of the officers followed the assignment instructions (Sherman
and Berk, 1984b: 3). Compliance with the assignment plan was due in large part to three
factors. First, the researchers encouraged the officers to actively participate in the design of the
experiment. The three day training conference demonstrated to the officers that the researchers
were interested in their input and provided the officers with a means to buy into the project.
Second, it is likely police officers in the Minneapolis Police Department were as frustrated by
domestic assault as the researchers. Providing these officers with a means to make a substantial
contribution to our understanding of domestic assault enforcement did a lot to encourage the
officers to adhere to experimental procedures. Finally, the researchers committed considerable
resources to supervision. Throughout the experiment, members of the research team were
available to the officers to answer questions and deal with problems.
A second challenge that emerged in the data collection phase centered on the follow-up
interviews with victims. The researchers were unable to contact many of the victims for the
follow-up interviews. Researchers made up to 20 attempts to contact victims, but many of the victims had
either moved or refused to respond to telephone calls or home visits (Sherman and Berk, 1984b:
4). Only 205 of the 330 victims were located and agreed to sit for interviews, representing a 62
percent completion rate. A 62 percent completion rate is not terrible, but the researchers were
understandably concerned that this attrition (mortality) would affect the validity of the
research findings. Had a substantially higher or lower proportion of victims in one group or
another chosen not to be interviewed, the strength of the statistical findings would have been
threatened. Fortunately, Sherman and Berk found that "there [was] absolutely no evidence that
the experimental treatment assigned to the offender affected the victim's decision to grant initial
interviews" (1984b: 5).
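
The completion and attrition figures follow directly from the two counts reported above. A minimal check, using only those two numbers (the variable names are ours):

```python
interviewed = 205  # victims located and interviewed (from the text)
eligible = 330     # victims the researchers tried to reach (from the text)

completion_rate = interviewed / eligible
attrition_rate = 1 - completion_rate

# prints "completion: 62.1%, attrition: 37.9%"
print(f"completion: {completion_rate:.1%}, attrition: {attrition_rate:.1%}")
```

The overall rate matters less than whether attrition differed across the three groups, which is the comparison Sherman and Berk report checking.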
Analyze the Data and Interpret the Results (p. 221)
Experimental designs tend to include variables that are measured at the interval or
ratio levels. As such, t-tests or analyses of variance are the most common
statistical techniques used to analyze data in experimental research. When
interpreting the findings from these and other analyses, experimental researchers
should be up front about the possible threats to internal and external validity
within the experimental design.
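
As a rough illustration of the t-test mentioned above, the sketch below computes a two-sample t statistic (Welch's version, which does not assume equal variances) using only the Python standard library. The scores are invented for the example and do not come from any study; a full analysis would also compute degrees of freedom and a p-value, which are omitted here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Two-sample t statistic without assuming equal variances."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical posttest scores on a ratio-level dependent variable.
experimental_group = [4, 6, 5, 7, 5]
control_group = [7, 8, 6, 9, 8]

t_statistic = welch_t(experimental_group, control_group)
print(f"t = {t_statistic:.2f}")
```

A t statistic far from zero suggests the two group means differ by more than chance alone would explain; an analysis of variance extends the same idea to three or more groups.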
Developing the Method 9.8 - Analyzing and Interpreting Data from Experimental Research
(p. 222)
Sherman and Berk do not report how they prepared their data for analysis. It is clear from their
reports, however, that they spent considerable time monitoring the data as it became available
(Sherman and Berk, 1984a, 1984b). In terms of the actual data analysis, the researchers used
three statistical techniques: a linear probability model, a logit formulation and a proportional
hazard approach. A description of these higher order multivariate techniques is beyond the scope
of this textbook. But interested readers will find a detailed description in an article published in
the American Sociological Review (Sherman and Berk, 1984a).
Overall, the statistical analysis conducted by Sherman and Berk revealed that 38.9 percent of the
suspects perpetrated a subsequent domestic assault within three months of their treatment. Three
independent statistical analyses also indicated that:

the suspects who were arrested were the least likely to re-offend within six months of the
treatment (10 percent re-offended within the reporting period);
the suspects who were separated were the most likely to re-offend within six months of
the treatment (24 percent re-offended within the reporting period); and
the suspects assigned to the mediation group were less likely to re-offend within six
months compared to the suspects who were separated, but more likely to re-offend
compared to the suspects who were arrested (19 percent re-offended within the reporting
period) (Sherman and Berk, 1984a and 1984b).
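
The three re-offending rates listed above can be ranked and compared against the arrest group in a few lines. The percentages are taken from the text; the variable names are ours, and because group sizes are not given here the sketch makes no claim about statistical significance.

```python
# Six-month re-offending rates (percent) as reported in the text.
reoffense_pct = {"arrest": 10, "mediate": 19, "separate": 24}

# Rank the responses from lowest to highest re-offending rate.
ranked = sorted(reoffense_pct, key=reoffense_pct.get)

for group in ranked:
    gap = reoffense_pct[group] - reoffense_pct["arrest"]
    print(f"{group}: {reoffense_pct[group]}% ({gap:+d} points vs. arrest)")
```

Expressing the differences in percentage points keeps the comparison honest: separation was followed by 14 more re-offenders per 100 suspects than arrest.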

Sherman and Berk concluded that a mandatory arrest policy was the most effective strategy for
deterring domestic assault suspects from re-offending.
Sherman and Berk had to have known that the results of this experiment would have profound
policy implications. As such, they devoted considerable time to discussing the potential
weaknesses of their research. One potential weakness was the officers' failure to adhere to the
rules of the experiment. As discussed previously, the researchers were concerned that the
officers would ignore the lottery system that was designed to randomly assign the cases into one
of the three experimental groups. Again, the researchers went to great lengths to avoid this
potential problem. As a result, they were able to identify most cases in which this occurred.
Second, Sherman and Berk recognized that the incapacitation effect of an arrest may have
explained the lower re-offending rate. As they explained, "if the arrested suspects spend a large
portion of the next six months in jail, they would be expected to have lower recidivism rates"
(Sherman and Berk, 1984a: 268). To address this potential weakness, they determined how long
each of the arrested suspects in their experiment spent in jail:
[Forty-three] percent were released within one day, 86 percent were released
within one week, and only 14 percent were released after one week or had not yet
been released at the time of the initial victim interview. Clearly, there had been
very little incapacitation, especially in the context of a six-month follow-up
(Sherman and Berk, 1984a: 268).
A third problem was sample size, especially when the sample was broken down into various
categories (e.g., age, race, employment status, etc.). The small subgroups made it impossible for the researchers
to determine whether some types of offenders responded differently to an arrest. For example, it
may be that individuals with more violent criminal histories are less affected by an arrest. Given
this weakness, the researchers concluded that it was premature for state legislatures to pass laws
requiring arrests in all misdemeanor domestic assaults (Sherman and Berk, 1984b: 8).
Fourth, the researchers recognized that the location of the experiment might have produced some
external validity problems. Recall that the external validity of an experiment deals with
generalizability. Minneapolis has a rather large Native-American population, a historically low
rate of violence, and low unemployment. Sherman and Berk concluded that the cultural context
of other cities may produce different effects of police actions in domestic violence cases
(Sherman and Berk, 1984b: 8).
Finally, Sherman and Berk acknowledged that the follow-up interviews might have had a
surveillance effect on the suspects (Sherman and Berk, 1984b: 8). That is, the suspects might
have known that their victims were being monitored and thereby might have been reluctant to
engage in future offending during the follow-up period. This is an example of reactivity.
Sherman and Berk's straightforward discussions about the potential weaknesses of this research
went a long way toward increasing the acceptance of this research among both the scholarly and
practitioner communities.
Communicate the Findings (p. 223)
How, when and where the results of an experiment are reported typically depends
on who is interested in the results of the experiment.
Developing the Method 9.9 - Communicating the Findings from Experimental Research (p.
224)

The results of the MDVE were published in two places. First, they appeared in an article in the
American Sociological Review, a scholarly journal published by the American Sociological
Association. This highly respected academic journal is distributed widely throughout the
Sociology and Criminal Justice academic communities. Remember that articles that appear in
academic journals are almost always peer reviewed. Although Professors Sherman and Berk are
highly qualified and well respected in their field, their research no doubt benefited from the peer
review process. In the report appearing in the American Sociological Review, the authors write:
We wish to express our thanks to the Minneapolis Police Department and its Chief,
Anthony V. Bouza, for their cooperation, and to Sarah Fenstermaker Berk, Peter H. Rossi,
Albert J. Reiss, Jr., James Q. Wilson, Richard Lempert, and Charles Tittle for comments
on an earlier draft of this paper (Sherman and Berk, 1984a: 261).
This list of reviewers contains some of the most respected scholars in the field. By
communicating this information, Sherman and Berk are letting the reader know that this research
was evaluated by others who have expertise in this subject and who know a great deal about
experimental design.
In order to reach a wider audience of policing practitioners, a shorter and less technical version
of the report was published by the Police Foundation, a co-sponsor of the research. This version
of the report was distributed to a large number of police chiefs and administrators. Normally,
scholars do not write up a separate report for laypersons or practitioners. But in this case, the
research was crucial to the policing community. Note that in writing up their results for two very
different audiences, the researchers had to communicate those results a bit differently. For
example, the following passages communicate the same finding, but one appeared in the
American Sociological Review and the other in Police Foundation Reports.
From the American Sociological Review
The official recidivism measures show that the arrested suspects manifested significantly
less subsequent violence than those who were ordered to leave (Sherman and Berk, 1984a:
261).

From Police Foundation Reports
[The experiment] found that arrest was the most effective of the three standard methods
police use to reduce domestic violence (Sherman and Berk, 1984b: 1).

To be sure, these were not the only times these researchers presented their findings. In fact, for
several years following the experiment, Sherman and Berk were asked to make presentations at
various scholarly conferences, police organizations and women's advocacy groups.
Ask another Research Question (p. 225)
Good research tends to produce as many questions as it answers. These new
questions are opportunities to continue the research process.

Developing the Method 9.10 - Asking another Research Question in Experimental Research
(p. 225)
The MDVE changed policing procedures in significant ways, especially in the Minneapolis
Police Department:
As a result of the experiment's findings, the Minneapolis Police Department
changed its policy on domestic assault in early March of 1984. The policy did not
make arrest 100 percent mandatory. But it did require officers to file a written
report explaining why they failed to make an arrest when it was legally possible to
do so. The initial impact of the policy was to double the number of domestic
assault arrests, from 13 the weekend before the policy took effect to 28 the first
weekend after. On one day in mid-March there were 42 people in the
Minneapolis jail on spouse assault charges, a record as far as local officials could
remember (Sherman and Berk, 1984b: 8).
One would think that with results this convincing, the controversy over whether a mandatory
arrest policy will reduce domestic assaults would have ended. But this is not exactly what
happened. One question that remained was whether the results were truly generalizable. Would a
mandatory arrest policy reduce domestic assault re-offending in other communities? Subsequent
experiments of a similar nature did not produce findings that were even closely consistent with the
Minneapolis experiment. For example, after the MDVE, the National Institute of Justice and the
Centers for Disease Control and Prevention co-sponsored five research programs designed to test
Sherman and Berk's findings. These studies, collectively referred to as the Spousal Assault
Replication Program, found that the use of arrest was only occasionally associated
with reductions in repeat offending. As a result, many police administrators and prosecutors
doubt the effectiveness of a one-size-fits-all response to domestic assault.
Sherman and Berk themselves admitted that more research was necessary before the results of
their experiment could be generalized to all communities. But they were clear that their results
were compelling enough to consider a change in the police response to domestic assaults:
A replication of the experiment in a different city is necessary to address these
questions. But police officers cannot wait for further research to decide how to
handle the domestic violence they face each day. They must use the best
information available. This experiment provides the only scientifically controlled
comparison of different methods of reducing repeat violence. And on the basis of
this study alone, police should probably employ arrest in most cases of minor
domestic violence (Sherman and Berk, 1984b: 8).
To be sure, Sherman and Berk made an important contribution to our understanding of domestic
violence and its prevention. But theirs is not the last word in this controversy.
Getting to the Point (Chapter Summary)

An experiment is a research method that measures the effect of an independent
variable on a dependent variable. All experimental design models feature an
experimental group, a treatment and a posttest. More sophisticated experimental
design models also include a pretest and a control group.

The one group, no pretest experimental design only includes the basic elements of
an experimental design model: the experimental group, a treatment and a
posttest. This design does not include a pretest or a control group. Because of
this, the effect of the treatment cannot be accurately measured and the influence of
other factors on the dependent variable cannot be identified.

The one group pretest/posttest experimental design model includes the basic
elements of an experimental design model: an experimental group, a treatment
and a posttest. In addition, this model includes a pretest that allows the researcher
to measure the actual effect of the treatment (independent variable) on the
dependent variable. This design does not include a control group, so there is
no way for the researcher to know whether something other than the treatment
caused a change to the dependent variable.

The two group no pretest experimental design model includes the basic elements
of the experimental design model: an experimental group, a treatment and a
posttest. This design includes a control group, so the researcher is able to
determine that the independent variable, by itself, had some effect on the
dependent variable. However, because there is no pretest, the researcher cannot
measure how much effect the independent variable had on the dependent variable.

The two group pretest/posttest experimental design model includes the basic
elements of the experimental design model: an experimental group, a treatment
and a posttest. In addition, this model has both a pretest and a control group.
These two features enable the researcher to measure the actual effect of the
treatment on the dependent variable and to determine whether or not other factors
might have caused a change in the dependent variable.

The Solomon Four Group experimental design model includes the basic elements
of the experimental design model: an experimental group, a treatment and a
posttest. This model also contains a pretest and a control group. More
significantly, this model contains an extra experimental group and an extra control
group. These additional features enable the researcher to determine how much, if
at all, the research subjects' exposure to the pretest affected their performance on
the posttest.

Internal validity refers to the ability of an experimental design to document the
causal relationship between an independent variable and a dependent variable.
There are six common threats to the internal validity of an experiment: history,
maturation, mortality, testing, instrumentation and regression.

External validity refers to the generalizability of an experiment's results to other
settings and situations. There are two common threats to external validity:
reactivity and selection bias. Reactivity occurs when research subjects change
their behavior when they become aware that they are being watched or measured.
Selection bias occurs when some members of the population are more or less
likely to be included in the experimental group, such that the experimental and
control groups are not equivalent.

Experimental research is effective at isolating and measuring the effect of a single
independent variable on a dependent variable. Experimental research is also
effective at demonstrating a causal relationship between two variables.

Experimental research requires considerable resources (e.g., time and money).
Often experimental research is not feasible because of the amount of control the
researcher must exert over the research subjects. In experiments involving human
subjects there is a potential for ethical violations.

Because of their ability to isolate the effect of a single variable on an outcome,
experimental designs are most often used in explanatory research. Experiments
are appropriate for both pure and applied research purposes.

In preparing to conduct an experiment, researchers should review the previous
literature, paying particular attention to how past researchers measured the
dependent variable and what independent variables have been found to affect the
dependent variable.

Because experimental models are used in explanatory research, creating specific
hypotheses is an essential step in the experimental research process.

Experimental researchers should be concerned about conceptualization and
measurement of the independent and dependent variables, as well as the variables
used to determine equivalency between the experimental and control groups.

Experimental researchers should determine which experimental design model
they want to use early in the research process. The choice of experimental design
depends on a number of factors, including how confident the researchers want to
be in their results, what is feasible and ethical in a given research setting and what
researchers want to know about the research subjects.

The number and nature of data collection strategies used by experimental
researchers depend on the experimental design used by the researcher.

Experimental designs tend to include variables that are measured at the interval or
ratio levels. As such, t-tests or analyses of variance are the most common
statistical techniques used to analyze data in experimental research. When
interpreting the findings from these and other analyses, experimental researchers
should be up front about the possible threats to internal and external validity
within the experimental design.

How, when and where the results of an experiment are reported typically depends
on who is interested in the results of the experiment.

Good research tends to produce as many questions as it answers. These new
questions are opportunities to continue the research process.
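
The five experimental design models summarized in this chapter differ mainly in whether they include a pretest, a control group, and how many groups they use. The lookup table below restates those summary points as data so the designs can be compared side by side. The dictionary layout and the helper name designs_with are illustrative choices, not part of the textbook.

```python
# Features of each design as stated in the chapter summary.
DESIGNS = {
    "one group, no pretest":      {"pretest": False, "control_group": False, "groups": 1},
    "one group pretest/posttest": {"pretest": True,  "control_group": False, "groups": 1},
    "two group no pretest":       {"pretest": False, "control_group": True,  "groups": 2},
    "two group pretest/posttest": {"pretest": True,  "control_group": True,  "groups": 2},
    # Adds an extra experimental group and an extra control group.
    "Solomon Four Group":         {"pretest": True,  "control_group": True,  "groups": 4},
}


def designs_with(pretest, control_group):
    """Return the names of designs matching the requested features."""
    return [
        name
        for name, features in DESIGNS.items()
        if features["pretest"] == pretest
        and features["control_group"] == control_group
    ]


# Designs that can both measure change and rule out outside influences.
print(designs_with(pretest=True, control_group=True))
```

Reading the table from top to bottom shows how each added feature closes off one more alternative explanation for a change in the dependent variable.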
