NBME Epid

A study is conducted to assess the effectiveness of a new blood test for early detection of
prostate cancer. Ten thousand healthy men over the age of 50 years are randomly assigned
to receive either annual rectal examination or annual screening with the new blood test.
After 5 years, results show that of the 50 men in the blood test group that were diagnosed
with prostate cancer, 40 were living 2 years after the diagnosis was made. In comparison,
only 15 out of 45 men in the rectal examination group survived 2 years after being
diagnosed with prostate cancer. Researchers conclude that the blood test increases survival
compared with rectal examination. Which of the following potential flaws is most likely to
invalidate this conclusion?
A) Age of the patients
B) Diagnostic bias
C) Lead time bias
D) Recall bias
E) Type II error
Answer is C
Which of the following statistical tests is most appropriately used to evaluate differences
among mean body weights of women in three different age groups?
(A) Dependent (paired) t-test
(B) Chi-square test
(C) Analysis of variance
(D) Independent (unpaired) t-test
(E) Fisher's exact test
Answer is C
A case-control study is conducted to assess the risk for intussusception in infants under the
age of 1 year who receive the rotavirus vaccine. The medical records of all those who
received the vaccine and those who did not receive the vaccine over a 6-month period are
reviewed. Results show 125 cases per 100,000 infant-years for infants who received the
vaccine compared to 45 cases per 100,000 infant-years for infants who did not receive the
vaccine. The investigators conclude that the relative risk for intussusception is 1.9 times
greater in infants who receive the rotavirus vaccine (95% confidence interval of 0.5-7.7 and
p=0.39). Which of the following is the most accurate interpretation of these results?
A) The results do not show an association between rotavirus vaccine and intussusception,
but they may be related
B) The results show sufficient statistical power to identify an
association between rotavirus vaccine and intussusception
C) Rotavirus vaccine is associated with a 39% risk for
intussusception
D) Rotavirus vaccine causes intussusception in 1.9% of infants
E) Rotavirus vaccine prevents 80 cases of intussusception per 100,000 infant-years

Answer is A
The PSA test has its limitations. When it is negative, 1 out of 200 times the patient actually
has prostate cancer; when it is positive, only 1 out of 5 times the patient actually has the
cancer. What is the relative risk of prostate cancer in patients with a positive PSA test?
1. 1/1000
2. 1/40
3. 40
4. 1000
5. 39/200
Answer is 3 (40)
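The arithmetic behind this answer can be sketched in Python. This is only an illustration: it treats a positive PSA result as the "exposure", and the helper name relative_risk is hypothetical rather than something taken from the question.

```python
from fractions import Fraction


def relative_risk(risk_exposed, risk_unexposed):
    """Ratio of the disease risk in the exposed group to the risk in the unexposed group."""
    return risk_exposed / risk_unexposed


# Risk of prostate cancer when the PSA test is positive: 1 in 5.
risk_positive = Fraction(1, 5)
# Risk of prostate cancer when the PSA test is negative: 1 in 200.
risk_negative = Fraction(1, 200)

rr = relative_risk(risk_positive, risk_negative)
print(rr)  # 40
```

Using exact fractions avoids floating-point rounding, so the result is exactly 40.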
In a study on the use of antibiotic therapy to prevent the occurrence of febrile neutropenia
during chemotherapy, 760 patients were randomly assigned to either oral levofloxacin
(Levaquin) or to placebo beginning at the time that chemotherapy was initiated and
continuing until the neutropenia resolved. Febrile neutropenia developed in 65% of patients
who received levofloxacin and 85% of those who received placebo. Mortality rates and side
effect profiles were similar in both groups. What is the number of patients needed to treat to
prevent 1 episode of febrile neutropenia?
A. 50 patients
B. 20 patients
C. 10 patients
D. 5 patients
E. 2 patients
Answer is D
Number needed to treat = 1/(0.85 - 0.65) = 5
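A minimal Python sketch of the same calculation follows. The function name number_needed_to_treat is a hypothetical helper; the event rates are the ones given in the question.

```python
from fractions import Fraction


def number_needed_to_treat(control_event_rate, treated_event_rate):
    """NNT is the reciprocal of the absolute risk reduction (ARR)."""
    absolute_risk_reduction = control_event_rate - treated_event_rate
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment does not reduce the event rate")
    return 1 / absolute_risk_reduction


# 85% febrile neutropenia with placebo, 65% with levofloxacin.
nnt = number_needed_to_treat(Fraction(85, 100), Fraction(65, 100))
print(nnt)  # 5
```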
A particular association determines membership on the basis of members' IQ scores. Only
those persons who have documented IQ scores at least 2 standard deviations above the
mean on the Wechsler Adult Intelligence Scale (WAIS) are eligible for admission. Of a group
of 200 people randomly selected from the population at large, how many would be eligible
for membership in this society?
A. 1
B. 2
C. 3
D. 4
E. 5
The correct answer is E. 95% of a normally distributed population will fall between plus or
minus 1.96 standard deviations from the mean. Since the population is normally distributed
with regard to IQ, this means that approximately 2.5% of the population will have IQ scores 2
standard deviations or more above the mean, and 2.5% of the population will have IQ scores
2 standard deviations or more below the mean. 2.5% of 200 people is 5 people.
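The 2.5% figure is a rounded rule of thumb. As a rough check, the upper tail of a normal distribution can be computed with Python's standard library. The sketch assumes IQ is exactly normally distributed and uses a cutoff of exactly 2 standard deviations.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Fraction of the population scoring 2 or more SD above the mean.
upper_tail = 1.0 - standard_normal.cdf(2.0)

# Expected number of eligible people in a random sample of 200.
expected_eligible = 200 * upper_tail

print(f"{upper_tail:.4f}")       # 0.0228
print(round(expected_eligible))  # 5
```

The exact tail is closer to 2.3% than 2.5%, but the expected count still rounds to 5.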
                Trisomy 21   Normal karyotype
Positive test      100            50
Negative test      100           250
An experimental diagnostic test is developed to noninvasively detect the presence of
trisomy 21, Down's syndrome. The test is administered to a group of 500 women considered
to be at risk for a Down's fetus based on blood tests. The results of this test are shown
above. What is the sensitivity of this new test?
A. 40%
B. 50%
C. 67%
D. 71%
E. 83%
The correct answer is B. Sensitivity is defined as the ability of a test to detect the presence
of a disease in those who truly have the disease. It is calculated as the number of people
with a disease who test positive (true positive) divided by the total number of people who
have the disease (true positive + false negative). In this case, sensitivity equals the number
of babies born with trisomy 21 whose mothers tested positive (100) divided by the total
number of babies born with trisomy 21. This yields 100/200 = 50% (not a very sensitive
test).
40% (choice A) corresponds to the prevalence of the disease in the tested population, which
in this case equals the total number of babies with Down's syndrome (true positive + false
negative = 100 + 100= 200) divided by the total number of people tested (500). This yields
200/500= 40%.
67% (choice C) corresponds to the positive predictive value of the test, which equals the
number of babies with Down's whose mothers test positive (true positives = 100) divided by
the total number of mothers testing positive (true positive + false positive = 150). This
yields 100/150 = 67%.
71% (choice D) corresponds to the negative predictive value of the test, which equals the
number of normal babies whose mothers tested negative (250) divided by the total number
of people testing negative (350). This yields 250/350 = 71%.
83% (choice E) corresponds to the specificity of the test, which equals the number of babies
without Down's whose mothers tested negative (250) divided by the total number of babies
without Down's (300). This yields 250/300 = 83%.
An easy way to remember these concepts is:
Sensitivity = true positives/all diseased
Specificity = true negatives/all normal
PPV = true positives/all positives
NPV = true negatives/all negatives
Prevalence = all diseased/total population
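The five definitions above can be sketched as one small Python function applied to the trisomy 21 table. The function name screening_metrics and the dictionary keys are hypothetical choices made for this illustration.

```python
from fractions import Fraction


def screening_metrics(tp, fp, fn, tn):
    """Compute the standard screening-test measures from a 2 x 2 table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": Fraction(tp, tp + fn),  # true positives / all diseased
        "specificity": Fraction(tn, tn + fp),  # true negatives / all normal
        "ppv": Fraction(tp, tp + fp),          # true positives / all positives
        "npv": Fraction(tn, tn + fn),          # true negatives / all negatives
        "prevalence": Fraction(tp + fn, total),  # all diseased / total population
    }


# Counts from the trisomy 21 table in the question.
metrics = screening_metrics(tp=100, fp=50, fn=100, tn=250)
for name, value in metrics.items():
    print(f"{name}: {float(value):.0%}")
```

Run on the question's table, this gives the five percentages discussed in the answer choices (50%, 83%, 67%, 71%, and 40%).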
A recent study was conducted to evaluate the efficacy of a new drug for preventing
hospitalization from pneumonia associated with HIV-infection. Reducing hospital stays
dramatically lowers the cost of medical care. One group of patients, who received the drug
in the year preceding the observation period, was identified from patient records of
participating physicians. A second group of patients who had not received the drug was also
identified from the records of these same physicians. Both groups were followed for a
three-year period to determine the number of hospitalizations, and any deaths from all causes.
The number of hospitalizations for the group given the new drug was significantly lower
(p < .001) than for the group not given the new drug. However, the mortality rate from all
causes was found to be significantly higher (p < .01) in the group who received the drug.
Based on these results, the physicians conducting the study decided that patients should not
be given the drug. This conclusion is most likely invalid because of which of the following?
A. Incorrect classification of patients into the appropriate research arm
B. Knowledge about whether patients had taken the drug or not biased the measurement of
the outcome variables
C. Patients who received the new drug may be less healthy than those who did not
D. Physicians who participated in the study may not be representative of the universe of
physicians at large
E. The existence of a Hawthorne effect
F. The sickest HIV patients died before they could be enrolled in the study
The correct answer is C. Because all patients enrolled in the study came from the records of
the same physicians, whether or not the patients received the new drug depended on the
assessment and treatment decisions made by each physician. It is very likely that those
patients to whom the physicians administered the new drug were different than those for
whom they prescribed some alternative. One likely difference is that patients who received
the new drug were sicker, and that this alone caused the physicians to use the new
drug. The higher mortality rate in the group who received the drug supports this possibility.
Failing to control for disease severity for patients in different treatment arms of a study is a
major confounding element in this type of research.
Although it is possible that the physicians' records contained errors as to whether patients
had received the new drug or not, this type of misclassification (choice A) is very unlikely.
For choice B to be true, one would need to argue that knowing which arm of the study a
patient was in made them more likely to go to the hospital or to die. The outcome variables
used here are clean, easily measured, and unlikely to be changed due to knowledge of which
study group the patients were in.
Choice D is incorrect. It is almost certain that the physicians who participated in this study
do not reflect all physicians everywhere. But, because both treatment and control subjects
come from the same patient pool, using selected physicians does not bias the comparison
that is at the core of the study design.
The Hawthorne effect (choice E) tells us that people act differently when they know they are
being watched or measured. As noted in the comment on choice B, this is unlikely in this
case. In addition, having a control group, as in this case, offers a simple solution to the
Hawthorne effect. Participants in both the treatment and the control group know they are
being watched, so this factor will not contaminate any comparison between the two groups.
Choice F would be true for both the treatment and the control groups, and so does not
disturb any research conclusion drawn by a comparison between them.
A screening test for sickle cell hemoglobin is available in most clinical laboratories. In which
of the following groups would this screening test have the highest positive predictive value?
A. American blacks
B. Ashkenazi Jews
C. Caucasians
D. Hispanics
E. Native Americans
The correct answer is A. The positive predictive value of a diagnostic test reflects the
likelihood that an individual actually has the condition if the test is positive. Positive
predictive value is mainly affected by two variables, namely the accuracy (combination of
sensitivity and specificity) and prevalence of the disease. For example, if a test has a
sensitivity of 99% and a specificity of 99% (i.e., a high degree of accuracy), and the disease
has a prevalence of 1/10,000, the positive predictive value will be only 1%. On the other
hand, if the same test is applied to a population in which the disease has a prevalence of
25/100, the positive predictive value will be 97%. Thus, the clinical usefulness of a screening
test is based not only on its intrinsic accuracy, but also on the prevalence of the disease
tested. It is useless to use a screening test (whether a questionnaire, an imaging study, or a
laboratory measurement) if the disease has a low prevalence. In the above example, the
disease for which the test is available (i.e., sickle cell hemoglobin) is highly prevalent in
American blacks, but not in the remaining ethnic groups, namely Ashkenazi Jews (choice B),
Caucasians (choice C), Hispanics (choice D), and Native Americans (choice E).
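The two predictive values quoted in the explanation can be reproduced with Bayes' rule. The following is a minimal Python sketch; the function name positive_predictive_value is a hypothetical helper, and the inputs are the sensitivity, specificity, and prevalences used in the example above.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test (Bayes' rule)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same test (99% sensitive, 99% specific) in two different populations.
rare = positive_predictive_value(0.99, 0.99, 1 / 10_000)
common = positive_predictive_value(0.99, 0.99, 25 / 100)

print(f"{rare:.1%}")    # 1.0%
print(f"{common:.1%}")  # 97.1%
```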
In a city with a population of 1,000,000, 10,000 individuals have HIV disease. There are 1000
new cases of HIV disease and 200 deaths each year from the disease. There are 2500
deaths per year from all causes. Assuming no net emigration from or immigration to the city,
the incidence of HIV disease in this city is given by which of the following?
A. 200/1,000,000
B. 800/1,000,000
C. 1000/990,000
D. 2500/1,000,000
E. 10,000/990,000
F. 10,000/1,000,000
The correct answer is C. The incidence of a disease is given by the number of new cases in a
given period divided by the total susceptible population. In this case, this is equal to
1000/990,000.
The disease-specific mortality rate is the number of deaths per year from a specific disease
divided by the population; in this case, 200/1,000,000 (choice A).
The rate of increase of a disease is given by the number of new cases per year, minus the
number of deaths (or cures) per year, divided by the total population. Since there is yet no
cure for HIV disease, the number of cures is 0. In this case, the rate of increase is
(1000 - 200)/1,000,000 = 800/1,000,000 (choice B).
The crude mortality rate is given by the number of deaths from all causes, divided by the
population; in this case, 2500/1,000,000 (choice D).
10,000/990,000 (choice E) is a distracter.
The prevalence of a disease is defined as the number of cases of the disease at a given time
divided by the total population. In this case, it would be 10,000/1,000,000 (choice F).
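The rates discussed in this explanation can be laid side by side in a short Python sketch. The variable names are hypothetical; the numbers are those given in the question, and results are printed per 100,000 population for readability.

```python
from fractions import Fraction

population = 1_000_000
existing_cases = 10_000
new_cases_per_year = 1_000
hiv_deaths_per_year = 200
all_deaths_per_year = 2_500

# People who already have the disease are not at risk of becoming new cases.
susceptible = population - existing_cases

incidence = Fraction(new_cases_per_year, susceptible)                    # choice C
prevalence = Fraction(existing_cases, population)                        # choice F
disease_specific_mortality = Fraction(hiv_deaths_per_year, population)   # choice A
crude_mortality = Fraction(all_deaths_per_year, population)              # choice D
rate_of_increase = Fraction(new_cases_per_year - hiv_deaths_per_year, population)  # choice B

for label, rate in [
    ("incidence", incidence),
    ("prevalence", prevalence),
    ("disease-specific mortality", disease_specific_mortality),
    ("crude mortality", crude_mortality),
    ("rate of increase", rate_of_increase),
]:
    print(f"{label}: {float(rate) * 100_000:.1f} per 100,000")
```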
The American Diabetes Association (ADA) recently lowered the cutoff value for fasting
glucose used in diagnosing diabetes mellitus from 140 mg/dL to 126 mg/dL. This reference
interval change would be expected to produce which of the following alterations?
A. Decrease the test's sensitivity
B. Increase the test's false negative rate
C. Increase the test's negative predictive value
D. Increase the test's positive predictive value
E. Increase the test's specificity
The correct answer is C. The negative predictive value of a test (PV-) refers to the percent
chance that a normal test result is a true negative (TN; a healthy person with a normal test
result) rather than a false negative (FN; a diseased person with a normal test result). It is
expressed by the following formula:
PV- = TN/(TN + FN)
Similarly, the positive predictive value of a test (PV+) reflects the probability that an
abnormal test result represents a true positive (TP) rather than a FP (a TP is an abnormal
test result in an individual with disease, while a FP is an abnormal test result in a healthy
person). PV+ is calculated using the following formula:
PV+ = TP/(TP + FP)
There is a relationship between the test's sensitivity (the ability of a test to detect disease in
those who truly have the disease), specificity (ability of the test to correctly identify those
without disease) and the PV- and PV+, respectively. Tests with 100% sensitivity (no FNs) and
tests with 100% specificity (no FPs) automatically have a PV- and PV+ of 100%, respectively.
Changing the reference interval of a test alters its sensitivity, specificity, PV+, and PV-. In
this question, lowering the upper limit of the reference interval of a fasting glucose from 140
mg/dL to 126 mg/dL increases the test's sensitivity, since a lower glucose cut-off approaches
the normal value for glucose in the normal population (70-110 mg/dL). Furthermore,
increasing the test's sensitivity automatically increases the test's PV-, since there are fewer
FNs.
The test's sensitivity is increased (choice A) rather than decreased by the given change in
the reference interval.
Since the test's sensitivity is increased, the FN rate at the new reference interval is
decreased (choice B).
The test's positive predictive value (choice D) decreases, since the test's specificity, which
ultimately determines its FP rate, decreases as the test's sensitivity increases. Stated
another way, a greater number of normal individuals will have FP fasting glucose levels
when the test's upper limit is decreased to 126 mg/dL.
A test's sensitivity is inversely related to its specificity. For example, changing the reference
interval of a test to increase its sensitivity automatically lowers its specificity (choice E),
since the number of FPs will increase. Similarly, when altering a reference interval to
increase a test's specificity, its sensitivity is reduced because of an increase in the number
of FNs.
In summary, lowering a test's upper limit of normal increases the test's sensitivity, which
decreases the FN rate, and increases the PV-. Increasing the test's sensitivity also decreases
the test's specificity. Decreasing a test's specificity increases the FP rate and decreases the
PV+.
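The summary above can be illustrated with a small simulation-free sketch in Python. Everything numeric here except the two cutoffs is an assumption made for illustration: the glucose distributions for healthy and diabetic people and the 10% prevalence are hypothetical, not values from the question.

```python
from statistics import NormalDist

# HYPOTHETICAL fasting-glucose distributions (mg/dL), chosen only for illustration.
healthy = NormalDist(mu=95, sigma=12)
diabetic = NormalDist(mu=160, sigma=30)
prevalence = 0.10  # assumed prevalence, not from the question


def properties_at_cutoff(cutoff):
    """Test properties when a glucose value at or above the cutoff counts as positive."""
    sensitivity = 1 - diabetic.cdf(cutoff)  # diseased people above the cutoff
    specificity = healthy.cdf(cutoff)       # healthy people below the cutoff
    tp = prevalence * sensitivity
    fn = prevalence * (1 - sensitivity)
    tn = (1 - prevalence) * specificity
    fp = (1 - prevalence) * (1 - specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


old = properties_at_cutoff(140)
new = properties_at_cutoff(126)
for name in old:
    print(f"{name}: {old[name]:.3f} -> {new[name]:.3f}")
```

Under these assumed distributions, lowering the cutoff from 140 to 126 mg/dL raises sensitivity and PV-, and lowers specificity and PV+, which matches the direction of change described in the explanation.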
The blood supply used by a local hospital for transfusions during surgical procedures is
discovered to be infected with hepatitis C. To protect the blood supply from future
contamination, all new donors are required to be screened for hepatitis C. The test used for
this screening has a sensitivity of 95%, and a specificity of 90%.
If this test is used on a sample of donors in which 10% are known to have hepatitis C, what
is the chance that a donor who tests negative is actually free from the disease?
A. About 99%
B. About 95%
C. About 90%
D. About 85%
E. About 50%
F. About 45%
The correct answer is A. To answer this question you must construct a 2 x 2 table. Start with
a sample of 1,000 because it is a nice, round number and, because screening tests are
about ratios, it does not matter what sample size you use. The table is then completed by
computing 10% of 1,000 for the total number of diseased in the marginal (100), taking 95%
of this 100 and 90% of the remaining 900 non-diseased part of the sample. With the table
complete, negative predictive value can be computed using the given numbers. The
complete table and the negative predictive value calculation are presented below:
                 Hepatitis C   No hepatitis C   Total
Positive test         95             90          185
Negative test          5            810          815
Total                100            900         1000
Negative predictive value = 810/815 or just over 99%. Note that you should be able to
estimate that without even actually completing the calculation. To complete the full set of
calculations:
Sensitivity = 95/100 = 95%
Specificity = 810/900 = 90%
Positive predictive value = 95/185 = about 51%
Negative predictive value = 810/815 = about 99%
Note: the test has excellent negative predictive value, but a positive predictive value just
slightly better than chance (50%).
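The table-building steps described in this explanation can be sketched in Python. The function name two_by_two is a hypothetical helper; the sample size of 1,000 and the test characteristics are those used in the explanation.

```python
def two_by_two(sample_size, prevalence, sensitivity, specificity):
    """Build the 2 x 2 table cells from prevalence, sensitivity and specificity."""
    diseased = round(sample_size * prevalence)
    healthy = sample_size - diseased
    tp = round(diseased * sensitivity)  # diseased who test positive
    fn = diseased - tp                  # diseased who test negative
    tn = round(healthy * specificity)   # healthy who test negative
    fp = healthy - tn                   # healthy who test positive
    return tp, fp, fn, tn


tp, fp, fn, tn = two_by_two(1000, 0.10, 0.95, 0.90)
npv = tn / (tn + fn)
ppv = tp / (tp + fp)

print(tp, fp, fn, tn)  # 95 90 5 810
print(f"{npv:.1%}")    # 99.4%
print(f"{ppv:.1%}")    # 51.4%
```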
Periodic measurements of prostate-specific antigen (PSA) have been proposed by several
health organizations as a screening test for early detection of prostate cancer in men older
than 50. Which of the following reasons makes the PSA test suboptimal for this purpose?
A. The condition does not have a presymptomatic stage
B. The condition is not sufficiently prevalent
C. The condition is not treatable if identified
D. The cost of the test is prohibitive
E. The test is not sufficiently accurate
F. Treatment of the disease does not improve outcome
The correct answer is E. Use of the serum marker PSA has become widespread as a
screening test for early detection of prostatic cancer. Currently, both the American Cancer
Society and the American Urological Society recommend annual PSA screening for prostate
cancer in all men older than 50, whereas the U.S. Preventive Services Task Force (USPSTF)
does not recommend it. The main difficulty is related to the poor sensitivity and specificity of
the PSA test. Its sensitivity for detecting prostate cancer is between 30% and 80%, with a
positive predictive value of less than 35%. This implies that if an individual has a PSA level
above the normal limit, the probability that he has prostate cancer is no more than 35%.
Furthermore, a significant portion of prostatic cancers do not behave aggressively, and
obviously the PSA test does not discriminate between those that will have an indolent course
and those that will result in potentially fatal disease.
Prostate cancer has a long preclinical (asymptomatic) period (compare with choice A), which
allows effective screening.
In men, prostate cancer is the most prevalent of cancers (compare with choice B) and the
second most common cause of cancer-related deaths.
Radical prostatectomy, radiation therapy, and hormonal therapy are effective therapeutic
interventions (compare with choice C).
The cost of the test is low (compare with choice D), but additional diagnostic work-up of
false positive results may lead to considerable expense.
Available treatments are associated with a good outcome (compare with choice F) if cancer
is detected at an early stage. However, as mentioned above, many small prostatic
neoplasms detected by PSA screening often do not progress to metastatic disease.
Therefore, improvement of outcome may not necessarily be related to PSA screening.
A randomized, double-blind clinical trial was conducted to compare the efficacy of two
antifungal drugs for treating a common dermatological problem. All participants in the study
had been previously diagnosed with the fungal infection. Half of the study participants
received a newly developed antifungal agent and half received the commonly prescribed
alternative. The sample size was selected to provide a statistical power of 70%. Results of
the study showed that the fungus disappeared in 50% of the patients receiving the newer
agent and in 40% of patients receiving the alternative treatment. Results of the study stated
p = .13.
Using the commonly accepted alpha criterion, the chance that the study results will not
reflect the results of the treatment for this type of fungus in the population at large is which
of the following?
A. 0
B. .10
C. .13
D. .30
E. .70
F. .87
The correct answer is D. This question asks for the probability of Type II or beta error. The
computed p-value of .13 is greater than the common alpha criterion of p < .05; therefore
we do not reject the null hypothesis. When we fail to reject the null hypothesis, the chance of
a Type I error is zero; however, a Type II error is a distinct possibility. Because the p-value is
about Type I error, it is not the key for answering the question. Instead, recall that Type II error
is related to statistical power in the following manner: 1 - Power = Type II error. In this case
statistical power is given at 70%, therefore: 1 - .70 = .30.
No, choice A is the chance of a Type I error in the presented case.
No, choice B is the difference between the 50% and the 40% results for the two anti-fungal
drugs. It gives us a sense of clinical significance, but nothing about the chance of Type II
error.
No, choice C is the computed p-value, which is greater than the .05 criterion, meaning we
cannot reject the null hypothesis. It tells us Type I error IF we did reject, but does not tell us
the chance of Type II.
Choice E is the statistical power, as given in the question. 1 - power = beta.
Choice F is derived by 1 - .13 =.87 but does not give the chance of Type II error. Simply if .13
is the chance of being wrong if we reject the null hypothesis (Type I error), then .87 is the
chance of being right if we reject the null hypothesis. Type II error is the chance of being
wrong if we do NOT reject the null hypothesis.
The National Council for the Prevention of Violence (NCPV) has long recommended that
primary care physicians screen for the presence of guns in the homes of their patients. The
recommended screening question is, "Do you have any type of gun or firearm in the
home where you live?" Recently, the Council has recommended a second question be
added to this original question. The second recommended question is, "Do you keep your
firearm locked so that a key is required to access it?" If patients respond
affirmatively to the first question and negatively to the second, the Council recommends
that the physician spend a couple of minutes reviewing some of the key aspects of firearm
safety. Materials for this discussion are provided free of charge on the Council's website.
This change in recommendation by the Council is most likely to have what effect on the
screening for potential gun violence?
A. Cannot be determined with any certainty from the information given
B. Decreased positive predictive value
C. Decreased sensitivity
D. Decreased specificity
E. Increased efficiency of the screening procedure
F. Increased negative predictive value
The correct answer is C. Moving from a one to a two question sequence will decrease the
sensitivity, increase the specificity, increase the positive predictive value, and decrease the
negative predictive value of the screening test. The two-question sequence makes it harder
to classify someone as at risk.
Certain people who would have been called at risk by a "yes" to question 1 will now
be excluded if they say "yes" to question 2. Excluding more people reduces the
sensitivity but increases specificity. However, people who have a gun and keep it unlocked
are more likely to use the gun in violence than those who keep the gun locked, so positive
predictive value increases, while negative predictive value decreases. The impact on efficiency
(accuracy) is difficult to determine without more information, so this cannot be the best
answer.
Physicians in a particular group practice wanted to agree on a standard treatment protocol
for patients suffering from multiple sclerosis (MS). A review of the literature uncovered over
40 different studies on the topic of treatment for MS published in peer-reviewed journals in
the past five years.
To summarize this literature and reach a conclusion about what constitutes the current
standard of care for MS, the physicians in this practice would most likely be assisted by the
use of which of the following?
A. Analysis of variance
B. Chi-square analysis
C. Meta-analysis
D. Odds-ratio analysis
E. Pearson correlation
F. Regression analysis
G. Relative risk analysis
The correct answer is C. A meta-analysis is a set of methods for conducting a
mathematically based literature review. Simplistically, the results of all studies are
combined, weighted by such issues as sample size and design quality, and summarized to
yield a single p-value. The results of a meta-analysis tell us, based on the weight of existing
data, what treatment, among many, works best, and the degree of benefit that treatment
provides. That is, a meta-analysis summarizes both statistical and clinical significance.
Analysis of variance (choice A) is used to analyze data from a single study where the design
provides one or more nominal variables and one interval variable.
Chi-square analysis (choice B) is used to analyze data from a single study where the design
provides two nominal variables.
Odds-ratio analysis (choice D) is used to analyze data from a single case-control study.
Pearson correlation (choice E) is used to analyze data from a single study where the design
provides two interval variables.
Regression analysis (choice F) is used to analyze data from a single study where the design
provides two interval variables. Think of it as a more complicated correlation analysis.
Relative risk analysis (choice G) is used to analyze data from a single cohort study.
A standard urine test, used to detect the presence of cocaine, is determined to have a
sensitivity of 97%, a specificity of 94%, a positive predictive value of 90%, and a negative
predictive value of 95%. These values are based on the use of the screening test in the year
1995. Recent data suggest that the use of cocaine in all age groups has declined since that
year.
This change in the use of cocaine will most likely have which of the following impacts on the
results of the screening test?
A. Decrease the sensitivity
B. Decrease the specificity
C. Increase the negative predictive value
D. Increase the positive predictive value
E. Increase the sensitivity
F. Increase the specificity
The correct answer is C. This question provides a lot of numbers to ask a simple question:
What happens to screening test values when prevalence declines? The numbers given are a
distraction, and are not needed to answer this question. As prevalence declines, sensitivity
and specificity are unchanged. On the other hand, a decline in prevalence will increase
negative predictive value, and decrease positive predictive value. This is true because
predictive value is a function of the properties of the screening test in conjunction with the
disease base-rate in the population.
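The direction of these changes can be checked with a short Python sketch. The sensitivity and specificity are the values from the question; the three prevalence values are hypothetical, chosen only to show a declining trend, and predictive_values is a hypothetical helper name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


sensitivity, specificity = 0.97, 0.94  # from the question; unchanged by prevalence

# HYPOTHETICAL declining prevalence of cocaine use.
results = [predictive_values(sensitivity, specificity, p) for p in (0.30, 0.20, 0.10)]
for (ppv, npv), p in zip(results, (0.30, 0.20, 0.10)):
    print(f"prevalence {p:.0%}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

As prevalence falls, the printed PPV values fall and the NPV values rise, while sensitivity and specificity stay fixed.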
To assess the effects of cigarette smoking on rates of automobile accidents, three groups of
volunteer subjects were followed for a four-year period. One group was composed of 1,000
current cigarette smokers, a second group was composed of 800 former smokers, and the
third group was composed of 750 people who had never smoked cigarettes. All three groups
contained participants of both genders and across all ethnic/racial groups. At the end of the
four years, the automobile accident rates for the three groups were compared. The results
showed an automobile accident rate for the current smokers that was higher than the rate
for nonsmokers (p < .05), but not statistically different than that for the former smokers
(p > .05).
When interpreting the results of this study, one should be most concerned with the effects of
what type of bias?
A. Expectancy bias
B. Late-look bias
C. Measurement bias
D. Proficiency bias
E. Recall bias
F. Selection bias
The correct answer is F. This question describes a cohort study in which three different
groups of people are followed over time to see if they have different incidence rates of
automobile accidents. For this type of study, the most common type of bias is selection, or
sampling bias. The problem is that the people who volunteer to participate in this study may
be very different from the general population. One would need to check the demographic
variables of the participants and compare them to those not participating in order to be sure
the results were at all generalizable. Some selection bias is probably inevitable because
people get to decide to participate or not to participate in any given study. It is very likely
that the different decisions that people make reflect the different types of people that they
are.
Expectancy bias (choice A) is when a researcher or physician knows which subjects are in a
treatment vs. a placebo group, which may, unwittingly, cause him to interact with them
differentially, based on that information. If you think someone is getting a better treatment,
you are more likely to think they get better, and perceive effects over and above the
physiological effects of the drugs administered. The solution to the expectancy effect is a
double blind design, in which neither subjects nor the researchers who have contact with
them know which arm of the study the subjects are in. This is not the answer here because
there is no intervention, but merely observation.
Late-look bias (choice B) is a problem when gathering information about some types of
severe diseases. The problem is that the most severe cases will be dead or inaccessible
before you can gather their information. This is not the answer here, because classification
as to cigarette smoking was done at the start of the study, and the outcome variable of
automobile accidents is public and easy to confirm.
With measurement bias (choice C), something about how the information is gathered affects
the information collected. This can be true because survey questions use inappropriate
wording that slants respondents to a particular answer, or because just knowing that you are
being measured causes people to act differently than they would if they were not observed.
Although it is possible that the study participants may all try to be better drivers because
they are in the study, this would be true of all of the groups in the study, and would not
distort the comparison among them.
Proficiency bias (choice D) is an issue when comparing the effects of different treatments
administered at multiple sites. Simply put, the physicians at one site may have more skill
with a given procedure than others. This means that the different levels of skill of the
physicians delivering treatment might impact patient outcomes more than the treatment
selection itself.
Recall bias (choice E) is a problem in retrospective studies (think case-control study) in which
people are asked to remember what happened in the past and report it in the present. If
people do not remember, and say so, then we have missing data. But often, people will
invent answers either from a desire to please the researcher, or because our memory of the
past changes over time.
A respected national medical journal recently published a study detailing the signs and
symptoms of a new type of viral infection. The infection was first reported by passengers on
a cruise ship in the Caribbean Sea, and subsequently by passengers on cruise ships off the
Pacific coast as well. The results of the study provided details as to the symptoms, timing of
onset, duration of illness, and treatments, if any, used by the passengers. In the discussion
section of the article, the authors catalogue various pathogens that may be at fault. Based
on the comparison between the passengers' reported symptoms and the expected
symptom profiles of other infectious agents, the authors concluded that the passengers'
symptoms were most likely the result of a new and, as yet, uncatalogued pathogen. The
type of study described here is most likely to be regarded as which of the following?
A. Case series study
B. Case-control study
C. Cohort study
D. Cross sectional study
E. Cross-over study
F. Phase I clinical trial
The correct answer is A. A case series study collects detailed information about people who
are all believed to have the same disease or condition. This information allows for a clear
description of the common elements that define the disease. Note that, as in this case, there
is no control group of non-diseased people in the sample. Case series studies are critical first
steps to generating a profile by which a new disease can be recognized, so that appropriate
treatment can be administered.
A case-control study (choice B) starts by classifying study participants as either diseased
(cases) or non-diseased (controls), and then looking backward in time for the presence or
absence of suspected risk factors. We analyze data from this type of study by using an odds
ratio.
A cohort study (choice C) classifies study participants as either having or not having a given
risk factor, and then follows them forward in time to assess the incidence (disease onset) of
one or more disease conditions. We analyze data from this type of study by using a relative
risk calculation.
A cross-sectional study (choice D) assesses the prevalence of a disease in a defined
population. The number of people with the disease is first counted. Then, factors that are
more common in the diseased than in the non-diseased are identified based on the current
characteristics of the study participants. We usually analyze data from this type of study by
using a chi-square analysis.
A cross-over study (choice E) is an intervention study in which all study participants get the
treatment being tested, but at different times. The central idea is to leave no untreated
study participants. Said differently, the treatment and control groups are switched at some
predetermined point in the study.
A Phase I clinical trial (choice F) is the first in a series of studies that must be concluded
before the Food and Drug Administration will consider approving a new drug for general
usage. A Phase I trial is about safety. The new drug is given to a small group of healthy
volunteers to make sure that it has no adverse consequences. In other words, first do no
harm.
A number of articles in the past ten years have examined the relationship between alcohol
consumption and performance in medical school. In the most recent of these studies, all
students in a large Eastern medical school were classified as either abstainers, light drinkers,
or heavy drinkers using Research Diagnostic Criteria. At the same time, all students were
classified as being in either the top, middle, or bottom of their class. Results showed that
students in the top or the bottom of the class were more likely to be heavy drinkers and
included the statement p < .01.
Which statistical test was most likely used to generate this result?
A. Analysis of variance
B. Chi-square
C. Matched pairs t-test
D. Meta-analysis
E. Pearson correlation coefficient
F. Pooled t-test
The correct answer is B. Questions about which statistical test to run are answered by first
deciding what type of data is to be analyzed. Nominal data are counts of things sorted into
groups or categories. Interval data are measured along a dimension graded in equal intervals.
The present case gives us three levels of drinking and three levels of class performance. Both
will be treated as nominal, so the association between them is tested with chi-square. The
key to the question is in the following short table:
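The choice of test by data type can be sketched as a small lookup. This is only a rule-of-thumb summary as commonly taught in biostatistics review, not a copy of the original table; the function name choose_test and its parameters are hypothetical.

```python
def choose_test(var1, var2, groups=2, paired=False):
    """Suggest a test from the measurement level of two variables.

    var1 and var2 are "nominal" or "interval". When one variable is
    interval and the other nominal, `groups` is the number of categories
    of the nominal variable and `paired` says whether the two groups
    are matched.
    """
    kinds = sorted([var1, var2])
    if kinds == ["nominal", "nominal"]:
        return "chi-square"
    if kinds == ["interval", "interval"]:
        return "Pearson correlation"
    if kinds == ["interval", "nominal"]:
        if groups > 2:
            return "analysis of variance"
        return "matched pairs t-test" if paired else "pooled t-test"
    raise ValueError("expected 'nominal' or 'interval'")


# Drinking level (nominal) against class standing (nominal).
print(choose_test("nominal", "nominal"))  # chi-square
```

Under the same assumptions, comparing mean age across three patient groups would be choose_test("interval", "nominal", groups=3), which returns analysis of variance.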
The intelligence quotient (IQ) scores are obtained for a sample of 100 patients diagnosed
with various types of schizophrenia who completed a standard IQ test battery. An additional
20 patients had to be dropped from the sample because they lacked the functional capacity
to complete two or more portions of the test. Four other patients refused to take the test
battery when offered the opportunity. Results for the patients who completed the test
battery gave an average IQ of 110 and a standard deviation of 20.
Using this information, compute the 95% confidence interval for this estimate of the mean.
A. 70 to 130
B. 70 to 150
C. 85 to 115
D. 90 to 130
E. 105 to 115
F. 106 to 114
The correct answer is F. The formula for the confidence interval of the mean is:
Mean ± Z × (S / √N)
The mean is given as 110. To achieve a 95% confidence interval we use a Z-score = 1.96 (or
2.0 to make the calculation easier). The standard deviation (S) is given as 20 and the sample
size (N) is given as 100. Inserting these values into the formula returns the result of 110 ± 4
or 106 to 114. Note that the information about those who either were unable or refused to
participate in the study is not relevant to answering the question asked here.
Choice A is incorrect. Asking for a 95% confidence interval is not the same thing as asking
for 95% of the cases in a normal distribution. The first is an inferential statistic, trying to
decide what the true mean might be, while the second is a descriptive statistic, looking for
95% of the cases. This choice tells us the answer to the question: in the general
population, 95% of the population has an IQ in what range? With a standard mean of 100
and a standard deviation of 15, 95% of the cases would fall between 70 and 130 (mean
± 2S).
Choice B gives the given mean of 110 ± 2S. But the question asked for the 95%
confidence interval, not for the scores of 95% of the people who took the test.
See comment on choice A. Choice C is the result for 68% of the cases in the population
(mean ± 1S).
For choice D, see comment on choice B. In this case the result is for 110 ± 1S.
Choice E is simply the given mean (110) ± 5. If this seemed to be the right answer to you,
you made a calculation error.
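As a quick check of the arithmetic, the confidence interval can be computed directly. This is a minimal sketch; the function name mean_confidence_interval is hypothetical, and it assumes the usual large-sample interval of mean ± Z × S / √N.

```python
import math


def mean_confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) bounds of the confidence interval of the mean."""
    margin = z * sd / math.sqrt(n)  # Z times the standard error
    return mean - margin, mean + margin


low, high = mean_confidence_interval(110, 20, 100)
print(round(low, 2), round(high, 2))  # 106.08 113.92
```

Rounding Z to 2.0, as the explanation suggests, gives exactly 106 to 114.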
An African community with a population of 10,000 people has a 5% prevalence of HIV. A
screening test for HIV has a 98% sensitivity and 95% specificity. If the prevalence of HIV
doubles in the community, how will the sensitivity and positive predictive value (PPV) of the
screening test be affected?
A. Increased sensitivity; Increased PPV
B. Increased sensitivity, Decreased PPV
C. Increased sensitivity; No change in PPV
D. Decreased sensitivity; Increased PPV
E. Decreased sensitivity; Decreased PPV
F. No change in sensitivity; Increased PPV
G. No change in sensitivity; Decreased PPV
The correct answer is F. Sensitivity does not change, but positive predictive value (PPV)
increases. PPV rises as prevalence rises, whereas negative predictive value (NPV) falls as
prevalence rises. Sensitivity and specificity are properties of the test itself and are not
affected by a change in prevalence.
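The effect of prevalence can be checked with the numbers from the question. This is a minimal sketch; the function name predictive_values is hypothetical, and it works with proportions of the population rather than head counts.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    true_neg = (1 - prevalence) * specificity
    false_pos = (1 - prevalence) * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity (98%) and specificity (95%) stay fixed; only prevalence changes.
for prevalence in (0.05, 0.10):
    ppv, npv = predictive_values(prevalence, 0.98, 0.95)
    print(prevalence, round(ppv, 3), round(npv, 3))
```

With these inputs PPV rises from about 51% to about 69% when prevalence doubles, while NPV falls slightly.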
A group of researchers has designed a study to determine whether or not pesticide use
among farmers is associated with an increased risk of cancer. 100 farmers with known
pesticide exposure are followed for 10 years. 100 organic farmers with no pesticide exposure
are followed for ten years. At the end of 10 years, it is determined that 12 of the farmers
with pesticide exposure developed cancer compared to 4 of the organic farmers with no
pesticide exposure. What type of study is this?
A. Case-Control study
B. Cohort study
C. Cross-sectional study
D. Case Study
Answer is B
A statistician is reviewing a double-blinded placebo control study for type I error. Which of
the following statements best describes type I error?
A. Failure to reject the null hypothesis when it is false
B. Rejecting the null hypothesis when it is true
C. Accepting the alternative hypothesis when it is true.
D. Accepting the null hypothesis when it is true
E. Rejecting the null hypothesis unconditionally
Answer is B
A group of 10,000 cigarette smokers are followed prospectively for 20 years. Of the 10,000
smokers, 2,000 develop lung cancer. A matched control group of 10,000 non-smokers is also
followed prospectively during the same time period. 200 non-smokers in the control group
develop lung cancer within the 20 year period. What is the relative risk of developing lung
cancer if you smoke?
A. 0.1
B. 0.2
C. 0.6
D. 1
E. 5
F. 10
G. 15
Answer is F
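The calculation behind this answer is the incidence in the exposed group divided by the incidence in the unexposed group. A minimal sketch follows; the function name relative_risk is hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence in the exposed divided by incidence in the unexposed.

    Written as a single cross-multiplied fraction, which is algebraically
    the same as (a / n1) / (c / n2).
    """
    return (exposed_cases * unexposed_total) / (unexposed_cases * exposed_total)


print(relative_risk(2000, 10000, 200, 10000))  # 10.0
```

The same function applied to the pesticide cohort above (12 of 100 exposed, 4 of 100 unexposed) gives a relative risk of 3.0.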
1000 adults are selected randomly to participate in an intelligence test. The test is scored
between 1 and 200 with a mean score of 100 and a standard deviation of 15. The scores of
the 1000 adults have a normal, Gaussian distribution. How many of the adults scored
between 100 and 115 on the intelligence test?
A. 220
B. 340
C. 400
D. 500
E. 680
Answer is B. 340
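The answer follows from the rule that about 68% of a normal distribution lies within one standard deviation of the mean, so about 34% lies between the mean and one standard deviation above it. A minimal check using Python's standard library (statistics.NormalDist, Python 3.8 or later):

```python
from statistics import NormalDist

scores = NormalDist(mu=100, sigma=15)

# Fraction of the distribution between the mean (100) and +1 SD (115).
fraction = scores.cdf(115) - scores.cdf(100)
print(round(fraction * 1000))  # 341
```

The exact figure is about 341 of 1000, which the 68% rule rounds to 340.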
In a group of 100 patients with known occupational asbestos exposure, 9 developed
malignant mesothelioma. In a matched control group of 100 patients without asbestos
exposure, 1 patient developed malignant mesothelioma. What is the attributable risk of
asbestos exposure?
A. 1%
B. 2%
C. 3%
D. 5%
E. 8%
F. 9%
G. 10%
Answer is E. 8 %
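Attributable risk is the incidence in the exposed group minus the incidence in the unexposed group. A minimal sketch follows; the function name attributable_risk is hypothetical.

```python
def attributable_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk difference: incidence in the exposed minus incidence in the unexposed."""
    return exposed_cases / exposed_total - unexposed_cases / unexposed_total


risk_difference = attributable_risk(9, 100, 1, 100)
print(round(risk_difference * 100, 1))  # 8.0 (percent)
```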
Which statistical test would be most appropriate for assessing the difference in age among
three groups of patients?
A) Student's t-test
B) Analysis of variance
C) Correlation coefficient
D) Chi-squared test
E) Logistic regression
Answer is B
Raising the cutoff point of a test:
what happens to sensitivity and specificity?
Specificity increases and sensitivity decreases: more people without the disease test negative,
and fewer people with the disease test positive. PPV increases and NPV decreases.
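A small made-up example shows the trade-off. The test values below are hypothetical, and the sketch assumes that higher values point toward disease, so a result is positive when it is at or above the cutoff.

```python
def sensitivity_specificity(diseased, healthy, cutoff):
    """A result counts as positive when the value is >= cutoff."""
    true_pos = sum(value >= cutoff for value in diseased)
    true_neg = sum(value < cutoff for value in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)


# Hypothetical test values for five diseased and five healthy people.
diseased = [5, 6, 7, 8, 9]
healthy = [2, 3, 4, 5, 6]

print(sensitivity_specificity(diseased, healthy, cutoff=5))  # (1.0, 0.6)
print(sensitivity_specificity(diseased, healthy, cutoff=7))  # (0.6, 1.0)
```

Raising the cutoff from 5 to 7 lowers sensitivity from 1.0 to 0.6 and raises specificity from 0.6 to 1.0.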
An aspiring medical student opens up his Step 1 score report and is elated to find that his
score is 230, a 92 on the 1-100 scale! The score report indicates that the mean on the exam
was 215 and the standard deviation is 20. Assuming a normal distribution, in which
percentile is the student's score?
A. 61.3%
B. 72.6%
C. 84.1%
D. 97.7%
E. 99.9%
Answer is B
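The percentile can be checked from the z-score, (230 - 215) / 20 = 0.75. A minimal sketch using Python's standard library (statistics.NormalDist, Python 3.8 or later), assuming the stated mean and standard deviation:

```python
from statistics import NormalDist

exam = NormalDist(mu=215, sigma=20)

z_score = (230 - 215) / 20
percentile = exam.cdf(230) * 100  # share of scores at or below 230
print(z_score, round(percentile, 1))  # 0.75 77.3
```

With these numbers the score falls at about the 77th percentile. None of the listed choices matches that exactly; B (72.6%) is the nearest.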
