
Pharmacological modulation of subliminal learning in Parkinson’s and Tourette’s syndromes


Stefano Palminteri (a,b,c), Maël Lebreton (a,b,c), Yulia Worbe (a,b,c), David Grabli (a,b,c,d), Andreas Hartmann (a,b,c,e), and Mathias Pessiglione (a,b,c,1)

(a) Institut du Cerveau et de la Moëlle épinière (CR-ICM), F-75013 Paris, France; (b) Institut de la Santé et de la Recherche Médicale (INSERM), Unité Mixte de Recherche (UMR 975), F-75013 Paris, France; (c) Université Pierre et Marie Curie (UPMC-Paris 6), F-75013 Paris, France; (d) Fédération de Neurologie, Groupe Hospitalier Pitié-Salpêtrière, Assistance Publique-Hôpitaux de Paris, F-75013 Paris, France; and (e) Centre de Référence “Syndrome Gilles de la Tourette”, F-75013 Paris, France

Edited by Mortimer Mishkin, National Institutes of Health, Bethesda, MD, and approved September 16, 2009 (received for review April 12, 2009)

Theories of instrumental learning aim to elucidate the mechanisms that integrate success and failure to improve future decisions. One computational solution consists of updating the value of choices in proportion to reward prediction errors, which are potentially encoded in dopamine signals. Accordingly, drugs that modulate dopamine transmission were shown to impact instrumental learning performance. However, whether these drugs act on conscious or subconscious learning processes remains unclear. To address this issue, we examined the effects of dopamine-related medications in a subliminal instrumental learning paradigm. To assess the generality of dopamine involvement, we tested both dopamine enhancers in Parkinson’s disease (PD) and dopamine blockers in Tourette’s syndrome (TS). During the task, patients had to learn from monetary outcomes the expected value of a risky choice. The different outcomes (rewards and punishments) were announced by visual cues, which were masked such that patients could not consciously perceive them. Boosting dopamine transmission in PD patients improved reward learning but worsened punishment avoidance. Conversely, blocking dopamine transmission in TS patients favored punishment avoidance but impaired reward seeking. These results thus extend previous findings in PD to subliminal situations and to another pathological condition, TS. More generally, they suggest that pharmacological manipulation of dopamine transmission can subconsciously drive us to either get more rewards or avoid more punishments.

dopamine | instrumental learning | subliminal perception | reward | punishment

NEUROSCIENCE

How we learn from success and failure is a long-standing question in neuroscience. Instrumental learning theories explain how outcomes can be used to modify the value of choices, such that better decisions are made in the future. A basic learning mechanism consists of updating the value of the chosen option according to a reward prediction error, which is the difference between the actual and the expected reward (1, 2). This learning rule, using prediction error as a teaching signal, has provided a good account of instrumental learning in a variety of species including both human and nonhuman primates (3, 4). Single-cell recordings in monkeys suggest that reward prediction errors are encoded by the phasic discharge of dopamine neurons (5, 6). In humans, dopamine-related drugs have been shown to bias prediction error encoding in the striatum to modulate reward-based learning (7). One of these drugs, levodopa (a metabolic precursor of dopamine), is used to alleviate motor symptoms in idiopathic Parkinson’s disease (PD), which is primarily caused by degeneration of nigral dopamine neurons. PD patients were shown to learn better from positive feedback when on levodopa and from negative feedback when off levodopa (8, 9). This double dissociation led Frank and colleagues to propose a computational model of fronto-striatal circuits where dopamine bursts (encoding positive prediction errors) reinforce approach pathways, while dopamine dips (encoding negative prediction errors) reinforce avoidance pathways (10).

Instrumental learning may involve both conscious and subconscious processes. We recently demonstrated that healthy subjects can learn associations between cues and choice outcomes, even if the cues are masked and hence not consciously perceived (11). During performance of this subliminal conditioning task, prediction errors generated with a standard reinforcement learning algorithm were reflected in striatal activity, possibly due to dopaminergic inputs. However, the assumption that subconscious learning is actually driven by dopamine release in the striatum remains to be tested. It is noteworthy that learning is dramatically reduced in the subliminal compared to the unmasked condition, where the associations can be trivially acquired in one trial. Thus, conscious processes, notably the ability to keep in mind the cues and outcomes seen previously, seem important for a good learning performance, but are not necessary for a more limited acquisition of instrumental responses.

To our knowledge, the question of whether dopamine-related drugs affect conscious or subconscious learning-related processes has not been addressed so far. Here, we examined this issue by administering our subliminal conditioning paradigm to PD patients. The hypothesis was that the above-mentioned double dissociation, between reinforcement valence (reward or punishment) and medication status (off or on levodopa), could be replicated in subliminal conditions. To strengthen the demonstration, we also tested whether a reverse double dissociation could be observed in patients with Gilles de la Tourette’s syndrome (TS), which can be opposed to PD in terms of both symptoms and treatments. TS is characterized by hyperkinetic symptoms (motor and vocal tics) alleviated by neuroleptics (dopamine receptor antagonists), whereas PD is a hypokinetic syndrome alleviated by dopamine receptor agonists. Medication effects were assessed between two groups of 12 TS patients on the one hand and within one group of 12 PD patients on the other. Matched healthy controls (24 young and 12 older subjects) were also administered the same experimental paradigm. Disease effects were assessed by comparing each group of patients off medication with their matched control group. Subjects’ demographic and clinical features are displayed in Tables 1 and 2, respectively.

The subliminal conditioning task used three abstract cues that were paired with different monetary outcomes (−1€, 0€, +1€).

Author contributions: D.G., A.H., and M.P. designed research; S.P. and M.L. performed research; Y.W., D.G., and A.H. contributed new reagents/analytic tools; S.P. and M.P. analyzed data; and S.P. and M.P. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1To whom correspondence should be addressed. E-mail: mathias.pessiglione@gmail.com.
This article contains supporting information online at www.pnas.org/cgi/content/full/0904035106/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.0904035106 PNAS | November 10, 2009 | vol. 106 | no. 45 | 19179–19184
Table 1. Demographic data

Demographic features   PD (n = 12)   Seniors (n = 12)   TS Off (n = 12)   TS On (n = 12)   Juniors (n = 24)
Age (years)            57.0 ± 3.1    60.7 ± 2.7         21.3 ± 2.6        19.8 ± 2.6       22.3 ± 0.9
Sex (female/male)      1/11          5/7                3/9               2/10             12/12
Education (years)      10.3 ± 1.3    16.4 ± 1.0         11.3 ± 1.4        10.0 ± 0.9       15.1 ± 0.5

The cues were briefly flashed between two mask images, after which subjects had to choose between safe and risky options (Fig. 1). The safe choice means a null outcome for sure: no gain, no loss. A risky choice may result in a gain (+1€), a loss (−1€), or a neutral outcome (0€), depending on the cue. As they would not see the cues, subjects were encouraged to follow their intuition: to make a risky choice if they had the feeling they were in a winning trial or to make a safe choice if they felt it was a losing trial. For half of the subjects, the risky response was a "Go" (key press), and for the other half it was a "Nogo" (no key press). Thus the experimental design allowed measuring dependent variables for three orthogonal dimensions: the rate of Go response (motor impulsivity), risky choice (cognitive impulsivity), and monetary payoff (reinforcement learning). Note that if subjects always made the same response, or if they performed at chance, their final payoff would be zero. Hence a positive payoff indicates that some representation of cue–outcome contingencies had been acquired through conditioning. A separate visual discrimination task was subsequently conducted to assess the subjects’ sensitivity to differences between cues, presented with the same masking procedure as during conditioning. The rationale is that if subjects are unable to discriminate between cues, then they are a fortiori unable to build conscious representations of cue–outcome associations.

Results

All dependent measures in the different groups have been summarized in Table 3. We first tested motor and cognitive impulsivity measures (Go response and risky choice). There was no significant difference between PD and TS groups (all P > 0.1, two-tailed t tests) and no significant effect of medication, either in PD or TS (all P > 0.05, two-tailed t tests). These results were not necessarily expected given the motor and cognitive signs associated with the diseases and treatments, but they suggest that performance was not driven by a difficulty in pressing keys or a propensity to take risks.

Then we examined learning performance (monetary payoff) and discrimination sensitivity (d′). Monetary payoffs were significantly above zero, indicating a conditioning effect, in both PD and TS patients (PD, 1.1 ± 0.5€, t(11) = 2.1, P < 0.05; TS, 1.8 ± 0.5€, t(23) = 3.7, P < 0.001, one-tailed t test). In contrast, performance did not improve in the visual discrimination test, where subjects remained at chance level throughout the entire series of trials [see Fig. S1]. Like the impulsivity measures, payoffs and d′ were not affected by dopamine enhancers in PD or by dopamine blockers in TS (all P > 0.1, two-tailed t test). Note, however, that d′ values were numerically above zero in all situations, suggesting that learning effects may have been driven by some occasional conscious perception. To address this issue, we calculated correlations between d′ and payoffs: Pearson’s coefficients were around zero and nonsignificant (PD Off, r = 0.22, P > 0.5; PD On, r = 0.17, P > 0.5; TS Off, r = −0.13, P > 0.1; TS On, r = −0.29, P > 0.5), suggesting that learning effects were not driven by patients with above-chance discrimination performance.

After controlling for these potential confounding effects, we next examined the hypothesized double dissociation between reinforcement valence and medication status. We distinguished between reward and punishment learning in the calculation of monetary payoffs. Relative to the neutral condition, additional correct choices were considered as an index of reward learning in the gain condition and as an index of punishment learning in the loss condition. Note that subtracting the neutral condition removes the potential effects of motor and cognitive impulsivity. The number of correct choices was expressed as euros that subjects won for reward learning or avoided losing for punishment learning (Fig. 2A).

As expected, we observed that off-medication PD patients significantly learned to avoid punishments (1.3 ± 0.5€, t(11) = 2.8, P < 0.01, one-tailed t test) but not to get rewards (−0.3 ± 0.7€, t(11) = −0.5, P > 0.1, one-tailed t test). On-medication PD patients exhibited the opposite pattern: no punishment learning (−0.3 ± 0.5€, t(11) = −0.6, P > 0.1, one-tailed t test) but significant reward learning (1.5 ± 0.5€, t(11) = 2.9, P < 0.01, one-tailed t test). The reverse double dissociation was observed in TS patients: When off medication, they learned to obtain rewards (1.9 ± 1.0€, t(11) = 2.0, P < 0.05, one-tailed t test) but not to avoid punishments (0.0 ± 0.5€, t(11) = 0.1, P > 0.5, one-tailed t test) and when on medication, they failed to obtain rewards (0.1 ± 0.4€, t(11) = 0.3, P > 0.1, one-tailed t test) but successfully avoided punishments (1.6 ± 0.5€, t(11) = 3.0, P < 0.01, one-tailed t test). Having identified the combinations of medication status and reinforcement valence where patients did learn, we checked the correlations between d′ and learning in these situations (Fig. 2B). They were again close to zero and not significant in both PD patients (Off/punishment, r = 0.01, P > 0.5; On/reward, r = 0.01, P > 0.5) and TS patients (Off/reward, r = −0.20, P > 0.5; On/punishment, r = −0.29, P > 0.5). Moreover, regression lines crossed the y axis (d′ = 0) for positive payoffs in all situations, demonstrating the presence of conditioning effects in the absence of visual discrimination.

To verify that the double dissociations were due to differences in learning rates, we plotted the cumulative money won (for reward learning) and not lost (for punishment learning) as a function of trials (Fig. 3B). Linear regression coefficients (slopes) of these learning curves were extracted and tested for

Table 2. Clinical data

Clinical features          PD (n = 12)
Disease duration (years)   10.7 ± 1.2
UPDRSIII score Off         28.7 ± 4.5
UPDRSIII score On          6.9 ± 1.6
Treatment                  Levodopa*
Daily dose (mg/day)        850 ± 116

Clinical features          TS Off (n = 12)   TS On (n = 12)
Disease duration (years)   13.7 ± 2.9        12.3 ± 2.8
YGTSS/50 score             15.9 ± 1.6        18.3 ± 2.1
YGTSS/100 score            33.4 ± 3.8        42.4 ± 4.0
Treatment                  -                 Risperidone / Pimozide
Daily dose (mg/day)        -                 2.3 ± 0.7 / 3.3 ± 2.3

*Dose is expressed as dopa-equivalent, taking into account both levodopa (all patients) and dopamine agonists (seven patients).



Fig. 1. Subliminal learning task. Successive screenshots displayed during a given trial are shown from Left to Right, with durations in milliseconds. After seeing a masked contextual cue flashed on a computer screen, subjects choose to press or not to press a response key and subsequently observe the outcome. In this example, "Go" appears on the screen because the subject has pressed the key, following the cue associated with reward (winning 1€).
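
As a reading aid, the payoff rule of the task can be sketched in a few lines of Python. This is an illustrative sketch only, not the task software used in the study: the names `CUE_OUTCOME`, `trial_outcome`, and `session_payoff` are hypothetical, and the balanced sequence of 30 trials per cue is an assumption made for the example. It shows why a subject who always gives the same response ends a session with a payoff of zero.

```python
# Hypothetical encoding of the task's payoff rule (names are illustrative).
CUE_OUTCOME = {"reward": +1, "neutral": 0, "punishment": -1}  # euros if the risky option is chosen


def trial_outcome(cue, response, risky_response="Go"):
    """Payoff in euros for one trial: 0 for the safe response, cue-dependent otherwise."""
    if response not in ("Go", "Nogo"):
        raise ValueError("response must be 'Go' or 'Nogo'")
    if response != risky_response:
        return 0  # safe choice: no gain, no loss
    return CUE_OUTCOME[cue]


def session_payoff(cues, responses, risky_response="Go"):
    """Sum of trial payoffs over a session."""
    return sum(trial_outcome(c, r, risky_response) for c, r in zip(cues, responses))


# Assumption for the example: 90 trials with the three cues equally frequent.
cues = ["reward", "neutral", "punishment"] * 30

always_go = ["Go"] * len(cues)
always_nogo = ["Nogo"] * len(cues)
# A subject who had fully learned the contingencies (Go is the risky response here).
ideal = ["Go" if c == "reward" else "Nogo" for c in cues]

print(session_payoff(cues, always_go))    # gains and losses cancel out
print(session_payoff(cues, always_nogo))  # only safe choices
print(session_payoff(cues, ideal))        # one euro per reward trial
```

Under these assumptions, any constant response strategy nets zero, so a positive payoff can only arise from responses that covary with the cue.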

the different groups and medications (Fig. 3A). These slopes exhibited a profile very similar to what was obtained with payoffs (compare with Fig. 2A). They were significantly positive (P < 0.05, one-tailed t test) only in Off PD with punishments, in On PD with rewards, and in On TS with punishments.
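
The slope measure described above can be illustrated with a minimal sketch. It assumes that each trial contributes a signed amount in euros (money won or not lost relative to the neutral condition); the exact per-trial scoring is not spelled out in the text, so the input format and the function name `accumulation_slope` are hypothetical. The slope is obtained by ordinary least squares on the cumulative sum.

```python
from itertools import accumulate


def accumulation_slope(per_trial_euros):
    """Least-squares slope (euros per trial) of the cumulative sum against trial number."""
    cumulative = list(accumulate(per_trial_euros))
    n = len(cumulative)
    if n < 2:
        raise ValueError("need at least two trials to fit a slope")
    trials = range(1, n + 1)
    mean_x = sum(trials) / n
    mean_y = sum(cumulative) / n
    # Ordinary least squares: slope = cov(x, y) / var(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(trials, cumulative))
    sxx = sum((x - mean_x) ** 2 for x in trials)
    return sxy / sxx


# Made-up example: a steady gain of 0.5 euro per trial gives a slope of about 0.5,
# whereas no net gain gives a flat curve.
print(accumulation_slope([0.5] * 10))
print(accumulation_slope([0.0] * 10))
```

A positive slope indicates that money accumulated over trials, which is how the text reads the learning curves of Fig. 3.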
Fig. 2. Monetary payoffs. (Left) Idiopathic Parkinson’s disease (PD) patients. (Right) Gilles de la Tourette’s syndrome (TS) patients. (A) Reward bias. Histograms in each graph show additional correct choices (in euros) in the gain (Left) and loss (Right) condition relative to the neutral condition. Solid histograms represent medicated patients (on dopamine enhancers or blockers) whereas open histograms represent unmedicated patients. Error bars are plus or minus between-subjects standard errors of the mean. (B) Learning vs. discrimination performance. Graphs represent for each individual the euros won (reward learning) or not lost (punishment learning) as a function of discrimination sensitivity (d′). Medicated patients (on dopamine enhancers or blockers) are represented by solid squares and solid regression lines and unmedicated patients by open squares and dashed lines. Only situations where learning was significant are shown: On PD and Off TS patients for rewards, Off PD and On TS patients for punishments.

We then tested the effects of medication on the reward bias, defined as the difference between the money won (correct choices following reward cues) and the money not lost (correct choices following punishment cues). This measure can hence be considered an index of the difference between reward and punishment learning performance. We found that the reward bias was significantly increased by dopamine enhancers in PD patients (Off, −1.7 ± 0.9€; On, 1.8 ± 0.8€; t(11) = 3.0, P < 0.05, two-tailed t test) and significantly decreased by dopamine blockers in TS patients (Off, 1.9 ± 1.0€; On, −1.5 ± 0.4€; t(22) = 2.2, P < 0.05, two-tailed t test). Thus, the reward bias was the only dependent variable sensitive to medication, with reciprocal effects in PD and TS showing that dopamine enhancers favored reward learning, whereas dopamine blockers favored punishment avoidance.

Finally, all experimental data were systematically compared between patients and controls. Note that controls were matched with patients in terms of age but not sex or education. However, taking into account all of the 36 healthy subjects, we found no significant effect of sex on monetary payoff or reward bias (both P > 0.5, two-tailed t tests) and no significant correlation between education level and monetary payoff or reward bias (r = 0.21 and r = 0.04, both P > 0.1). Thus reinforcement learning performance was not dependent on sex or education. In both control groups our crucial measure, the reward bias, was found between those obtained for the on and off medication status in the corresponding patient group. In other words, the trend was that relative to healthy old subjects, PD patients had a lower reward bias when off medication and a higher one when on medication. And relative to healthy young subjects, TS patients had a higher reward bias when off medication and a lower one when on medication. However, the differences being smaller than when comparing on and off states, the comparison with control subjects was significant only for Off PD patients (t(22) = 2.6, P < 0.05; all other P > 0.1; two-tailed t test). Medication effects on the reward bias therefore appear much more reliable than disease effects.

Discussion

To summarize, we extended the double dissociation between reinforcement valence and dopamine medication status, which

Table 3. Experimental data

Behavioral measures          PD Off (n = 12)   PD On (n = 12)   Seniors (n = 12)   TS Off (n = 12)   TS On (n = 12)   Juniors (n = 24)
Monetary payoff (€)          1.0 ± 0.8         1.3 ± 0.6        0.6 ± 0.6          1.9 ± 0.6         1.8 ± 0.8        2.8 ± 0.9
Visual discrimination (d′)   0.14 ± 0.25†      0.33 ± 0.15      0.43 ± 0.14        0.37 ± 0.11       0.07 ± 0.14      0.05 ± 0.12
Payoff/d′ correlation (r)    0.22              0.17             −0.29              −0.13             0.29             0.22
Go responses (%)             50.9 ± 6.4        47.5 ± 6.2       48.1 ± 2.5         51.2 ± 5.1        49.6 ± 3.8       46.7 ± 3.8
Risky choices (%)            70.1 ± 2.4        55.6 ± 6.0       67.7 ± 2.2         65.7 ± 1.9        58.8 ± 2.8       63.4 ± 2.6
Reward obtained (€)          −0.3 ± 0.7        1.5 ± 0.5        0.9 ± 0.4          1.9 ± 1.0         0.1 ± 0.4        1.5 ± 0.8
Punishment avoided (€)       1.3 ± 0.5*        −0.3 ± 0.5       −0.2 ± 0.4         0.0 ± 0.5         1.6 ± 0.5        1.3 ± 0.7

*P < 0.05, significant difference with the control group (two-tailed t test).
†Data were collected in 11 patients only.
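
The reward bias reported in the text (money won minus money not lost) can be recomputed approximately from the group means in Table 3. The sketch below is only an arithmetic check on rounded means: the dictionary `TABLE3_MEANS` and the function `reward_bias` are hypothetical names, and the published values were presumably computed from unrounded individual data, so small discrepancies are expected (for PD Off, the rounded means give −1.6€ where the text reports −1.7€).

```python
# Group means (euros) copied from the "Reward obtained" and "Punishment avoided" rows of Table 3.
TABLE3_MEANS = {
    "PD Off":  {"reward_obtained": -0.3, "punishment_avoided": 1.3},
    "PD On":   {"reward_obtained": 1.5,  "punishment_avoided": -0.3},
    "Seniors": {"reward_obtained": 0.9,  "punishment_avoided": -0.2},
    "TS Off":  {"reward_obtained": 1.9,  "punishment_avoided": 0.0},
    "TS On":   {"reward_obtained": 0.1,  "punishment_avoided": 1.6},
    "Juniors": {"reward_obtained": 1.5,  "punishment_avoided": 1.3},
}


def reward_bias(reward_obtained, punishment_avoided):
    """Reward bias: money won minus money not lost, in euros."""
    return reward_obtained - punishment_avoided


biases = {group: round(reward_bias(**means), 1) for group, means in TABLE3_MEANS.items()}
for group, bias in biases.items():
    print(f"{group}: {bias:+.1f} EUR")
```

On these rounded means, each control group falls between the off and on values of the corresponding patient group, in line with the trend described in the Results.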

Fig. 3. Learning rates. (Left) Idiopathic Parkinson’s disease (PD) patients. (Right) Gilles de la Tourette’s syndrome (TS) patients. (A) Accumulation rates. Histograms in each graph show linear regression coefficients of corresponding learning curves below. Solid histograms represent medicated patients (on dopamine enhancers or blockers) whereas open histograms show unmedicated patients. Error bars are plus or minus between-subjects standard errors of the mean. (B) Accumulation curves. Graphs represent for each individual the cumulative sum of euros won (reward learning) or not lost (punishment learning) as a function of trials. The curves have been averaged across sessions and subjects. Medicated patients (on dopamine enhancers or blockers) are represented by solid squares and solid regression lines and unmedicated patients by open squares and dashed lines.

was originally demonstrated in PD patients by Frank and colleagues (8), to the subliminal case and to TS patients. In short, reinforcement learning was biased toward reward seeking when boosting dopamine transmission and toward punishment avoidance when blocking dopamine transmission. The effects were independent from factors such as discrimination sensitivity and motor or cognitive impulsivity, which were orthogonal to the reinforcement valence in our design. Moreover, these factors were not significantly affected by medication, suggesting that patients did not perceive the cues, press the button, or choose the risky response any more in the on- than in the off-medication state.

Despite the use of short duration and backward masking, we cannot formally ensure that all cues remained subliminal in all trials, as there is no direct window to the conscious mind. We nonetheless provide standard criteria that are generally considered as indirect evidence for nonconscious perception (12, 13). Verbal reports were recorded to assess the subjective criterion: When shown the unmasked cues, all subjects reported not having seen them previously. Discrimination performance was measured to assess the objective criterion: Learning effects were obtained even for a null d′, which indicates that subjects were unable to correctly decide whether two consecutive cues were the same or different. We therefore conclude that the learning processes affected by medications were largely subconscious. Masking was undoubtedly helped by the fact that subjects had no prior representation to guide visual search, since, contrary to most subliminal perception studies, the cues were never shown until the debriefing at the end of the experiment. Although they did not provide the above criteria for absence of awareness, some previous studies in PD reported deficits in implicit learning (8, 9, 14, 15). In these paradigms the cues are consciously perceived, but subjects fail to report explicitly the cue–outcome contingencies at debriefing, even if they previously expressed some knowledge of these contingencies in their motor responses. Debriefing tests have, however, been criticized as confounded by memory decay (16–18), so masking cues serves as a more stringent approach to limit conscious associations between cues and outcomes. Compared to implicit learning paradigms, such as probabilistic classification or transitive inference tasks, the Go/Nogo mode of response used here makes reinforcement learning more direct, with no need for building high-level representations of cue–outcome contingencies.

Our findings are in line with a growing body of evidence that reinforcement learning can operate subconsciously (19–23). More specifically, they extend a previous functional neuroimaging study using the same subliminal conditioning paradigm (11), which showed that reward prediction errors were reflected in the ventral striatum. A parsimonious explanation may be that dopamine enhancers and blockers, because they interfere with dopamine transmission, modulate the magnitude of prediction error signals, as was previously demonstrated during conscious instrumental learning (7). This would be compatible with Frank’s model (10), if we assume that dopamine enhancers and blockers have opposite effects both on positive prediction errors following rewards and on negative prediction errors following punishments. The drugs may impact the reinforcement of fronto-striatal synapses, which allegedly underlies the formal process of using prediction error as a teaching signal to update the value of the current cue, according to Rescorla and Wagner’s rule (1). At a lower level, the underlying mechanisms remain speculative, however, as it is unclear which dopamine receptors (D1, D2, or others) and which component of dopamine release (tonic, phasic, or a combination of both) are impacted by medications. Although we argue that the reinforcement process modulated by medications was subconscious, we do not imply that conscious feelings, when seeing the masks or the outcomes, remained unaffected. It remains, for instance, possible that subjects, even if not perceiving the cue itself, had a conscious positive feeling following a reward-predicting cue or a negative one after a punishment-predicting cue. Further experiments are needed to determine whether we can develop a conscious access to the value of cues that we do not consciously perceive.

The replication of the double dissociation in a second pathological condition (TS) suggests that our manipulation tapped into general dopamine-related mechanisms and not into peculiar dysfunction restricted to PD. Our findings potentially facilitate understanding not only dopamine-related drug effects but also dopamine-related disorders. The case for dopamine neuron degeneration in PD is well established (24), so from Frank’s model (10) it could be predicted that off-medication PD patients were impaired in reward learning but not in punishment avoidance. A lack of positive reinforcement following rewards might explain action selection deficits that are frequently reported in PD (14, 15, 25). Indeed, if an action is not reinforced when rewarded, selection of that action will not be facilitated in the future. A deficit in movement selection could also account for some motor symptoms, such as akinesia and rigidity, that are the hallmarks of PD. The double dissociation evidenced in PD may also provide insight into compulsive behaviors, such as pathological gambling, induced in these patients by dopamine agonists (26, 27). The explanation would be that due to dopamine agonists, repetitive behaviors would be more reinforced by rewarding outcomes than impeded by punishing consequences.
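
The idea that medication shifts the balance between learning from positive and from negative prediction errors can be made concrete with a toy simulation. The sketch below is not the model fitted or proposed in this paper: it is a Rescorla–Wagner update in which the learning rate depends on the sign of the prediction error, an assumption introduced here purely for illustration. The function names, the learning-rate values, and the simplification that the cue's outcome is observed on every trial are all hypothetical.

```python
def update_value(value, outcome, alpha_pos, alpha_neg):
    """One Rescorla-Wagner step with a learning rate chosen by the sign of the prediction error."""
    delta = outcome - value  # prediction error: actual minus expected outcome
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return value + alpha * delta


def learn(outcomes, alpha_pos, alpha_neg, initial_value=0.0):
    """Cue value after a sequence of observed outcomes."""
    value = initial_value
    for outcome in outcomes:
        value = update_value(value, outcome, alpha_pos, alpha_neg)
    return value


# Illustrative (made-up) parameter sets:
# "enhanced" weights positive errors more, "blocked" weights negative errors more.
enhanced = {"alpha_pos": 0.30, "alpha_neg": 0.05}
blocked = {"alpha_pos": 0.05, "alpha_neg": 0.30}

reward_cue = [+1] * 10       # ten outcomes of +1 euro
punishment_cue = [-1] * 10   # ten outcomes of -1 euro

for label, params in (("enhanced", enhanced), ("blocked", blocked)):
    v_reward = learn(reward_cue, **params)
    v_punish = learn(punishment_cue, **params)
    print(f"{label}: reward cue value {v_reward:.2f}, punishment cue value {v_punish:.2f}")
```

With these arbitrary parameters, the "enhanced" agent acquires the reward cue's value faster than the punishment cue's value, and the "blocked" agent shows the opposite pattern, which mirrors the direction of the double dissociation discussed above without claiming anything about its actual mechanism.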



In contrast, the case for an overactive dopamine transmission in TS has not reached general agreement (28–30), despite supporting evidence from both genetic and neuroimaging studies (31–34). That TS patients mirrored PD patients would further support the idea of underlying dopaminergic hyperactivity. Of course this does not necessarily imply that dopaminergic hyperactivity is causal to the pathology of TS. It is nonetheless tempting to speculate that tics may come from excessive reinforcement of certain cortico-striatal pathways. We must remain cautious, however, because we observed only a trend and not a significant difference between off-medication TS patients and matched healthy controls.

More generally, because they eliminate conscious strategies that could confound potential deficits, subliminal stimulations may allow targeting more specific cognitive processes, just as was done here for reinforcement learning, and hence provide insight into a variety of neurological or psychiatric conditions. For the same reasons, subliminal conditions might also prove useful in identifying specific effects of drugs, other than those of dopamine enhancers and blockers on reinforcement learning. To our knowledge, pharmacological studies have not so far attempted to distinguish between drug effects on conscious and subconscious processes. Indeed, a huge literature is devoted to understanding how drugs modify conscious experience, but little is known about how drugs play on processes occurring outside conscious awareness. We believe that the present study opens the door to research on the pharmacology of subconscious processing.

Experimental Procedures

Subjects. The study was approved by the Ethics Committee for Biomedical Research of the Pitié-Salpêtrière Hospital, where the study was conducted. A total of 72 subjects, including 36 patients and 36 controls, were included in the study. All subjects gave written informed consent before their participation. They were not paid for their voluntary participation and were told that the money won in the task was purely virtual. Previous studies have shown that using real money is not mandatory to obtain robust motivational or conditioning effects (8, 35). In our case using real money would be unethical since it would mean paying patients according to their handicap or treatment. In total, 12 patients with idiopathic PD and 24 patients with TS were included in the study. We also tested 12 old (seniors) and 24 young (juniors) healthy controls, who were screened to exclude any history of neurological or psychiatric conditions and selected for age to match that of either PD or TS patients. We checked that age was not significantly different between old subjects and PD patients (t(22) = 0.9, P > 0.1, two-tailed t test) or between young subjects and TS patients (t(46) = −0.9, P > 0.1, two-tailed t test).

PD patients were consecutive candidates for deep brain stimulation, hospitalized for a clinical preoperative examination. Inclusion criteria were a diagnosis of idiopathic PD, with a good response to levodopa [>50% improvement on the Unified Parkinson’s Disease Rating (UPDRSIII) Scale], in the absence of dementia [Mini Mental State (MMS) score >25] and depression [Montgomery and Asberg Depression Rating Scale (MADRS) score <20]. Consequently, average MMS score was 27.7 ± 0.3, average MADRS score was 4.3 ± 0.8, and Hoehn and Yahr stage was 2.46 ± 0.10 in the "off" state and 2.17 ± 0.15 in the "on" state. Among the 12 patients, 5 were on levodopa alone, and 7 were also taking dopamine receptor agonists. For the sake of simplicity, we converted all medications to levodopa equivalents (Table 2) and we used the term dopamine enhancers to designate both levodopa and receptor agonists. Every patient was assessed twice, on the morning of 2 different days: once in the off state, after overnight (>12 h) withdrawal of levodopa and a full day

not represent a huge handicap. We included medicated and unmedicated patients in equal numbers, such that we could make on–off comparisons with the same number of data points (n = 24) as in Parkinson’s disease. The difference was that comparisons were made within patients in PD and between patients in TS. All On TS patients were treated with neuroleptics only: eight with risperidone, four with pimozide. For the sake of simplicity, we referred to neuroleptics as dopamine blockers. There was no significant difference between on-medication (treated) and off-medication (untreated) TS patients regarding age (t(22) = 0.4, P > 0.5, two-tailed t test), sex (χ² = 0.25; P > 0.5, chi-square test), disease duration (t(22) = 0.4, P > 0.5, two-tailed t test), and education (t(22) = 0.7, P > 0.1, two-tailed t test). The Yale Global Tic Severity Scale (YGTSS) showed no significant difference between On and Off patients, either with the 50-item (motor tics) or with the 100-item (complex tics) version (respectively t(22) = 1.6, P > 0.1; t(22) = 0.9, P > 0.1, two-tailed t test).

Experimental Task and Design. The behavioral tasks used in our previous study (11) were slightly shortened and translated into French and euros. Subjects first read the instructions (see SI Text), which were later explained again step by step. They were first trained to perform the conditioning task on a 16-trial practice version. Then, they had to perform three sessions of this conditioning task, each containing 90 trials and lasting 10 min, and one session of the perception task, containing 60 trials and lasting ≈5 min. The abstract cues were letters taken from the Agathodaimon font. The same two masking patterns, one displayed before and the other after the cue, were used in all task sessions (Fig. 1). Assignment of cues to the different task sessions, and associations of cues with the different outcomes, were fixed so that all subjects underwent the exact same experimental procedure. For similar purposes, duration of cue display was fixed at 50 ms and not adapted to each individual, such that subliminal stimulations were identical for all subjects.

As 50 ms is near the threshold for conscious perception, however, some subjects (three TS patients and three junior and one senior controls) could not be included because they managed to discriminate some part of the cues. Indeed they reported having spotted discriminative parts (both during task performance and at debriefing), had abnormally high discrimination sensitivity (d′ > 1.5), and won unusually high amounts of money (payoff >10€). Note that without excluding the TS patients who saw the cues, the double dissociation reported in this condition would fail to reach significance. Indeed, these TS patients were on medication and nonetheless learned to get rewards, consistent with the intuitive idea that the task gets trivial as soon as subjects can discriminate the cues.

The instrumental conditioning task involved choosing between pressing or not pressing a key, in response to masked cues. After showing the fixation cross and the masked cue, the response interval was indicated on the computer screen by a question mark. The interval was fixed to 3 s and the response was taken at the end: Go if the key was being pressed, and Nogo if the key was released. The response was written on the screen as soon as the delay had elapsed. Subjects were told that one response was safe (you do not win or lose anything) while the other was risky (you can win 1€, lose 1€, or get nothing). Subjects were also told that the outcome of the risky response would depend on the cue that was displayed between the mask images. In fact, three cues were used: One was rewarding (+1€), one was punishing (−1€), and the last was neutral (0€). Because subjects were not informed about the associations, they could learn them only by observing the outcome, which was displayed at the end of the trial. This was a circled coin image (meaning +1€), a barred coin image (meaning −1€), or a gray square (meaning 0€).

The risky response was assigned to Go for half of task completions and to Nogo for the other half, such that motor aspects were counterbalanced between reward and punishment conditions. TS patients and junior controls were assessed only once and hence performed either the Go or the Nogo version of the task. Junior controls were randomly assigned to either the Go version for one half or the Nogo version for the other half. In TS, the task version was balanced with respect to the medication status, such that each of
(24 h) withdrawal of dopamine agonists, and once in the on state, 1 h after the four combinations (Off/Nogo, Off/Go, On/Nogo, and On/Go) was admin-
intake of habitual medication dose (levodopa in all patients ⫹ dopamine istrated in the same number of patients (n ⫽ 6). PD patients and senior controls
agonists in 7 of them). One patient included in the study could not complete were assessed twice, once on the Go version and once on the Nogo version. For
the visual discrimination task in the off state due to excessive motor fatigue. senior controls the order of Go and Nogo task versions was simply alternated.
Three patients were unable to perform the conditioning task in the off state In PD, the order was balanced with respect to the medication status, such that
and were therefore not included in the study. each of the four combinations (Off/Nogo–On/Go, Off/Go–On/Nogo, On/
TS patients were consecutive candidates screened for the French Reference Nogo–Off/Go, and On/Go–Off/Nogo) was administrated in the same number
Center for Gilles de la Tourette’s syndrome. Patients were at least 10 years old of patients (n ⫽ 3).
and did not present relevant comorbid conditions (depression, obsessive- The perceptual discrimination task was used as a control for awareness at
compulsive disorder, and/or attention deficit with hyperactivity disorder). the end of conditioning sessions. Hence it was administrated once in TS
Treatment usually cannot be stopped in these patients for ethical reasons: It patients and junior controls and twice in PD patients and senior controls. In
would leave patients in discomfort for too long during washout. However, this task, subjects were flashed two masked cues, 3 s apart, displayed on the
some patients diagnosed with TS remain unmedicated, because their tics do center of a computer screen, each following a fixation cross. As there were 60

Palminteri et al. PNAS | November 10, 2009 | vol. 106 | no. 45 | 19183
trials, each cue was presented 40 times, which is more than in conditioning sessions (30 times). Subjects had to report whether or not they perceived any difference between the two visual stimulations. The response was given manually, by pressing one of two keys assigned to “same” and “different” choices. Importantly, subjects had no opportunity to see the cues unmasked, so they could not get any prior information about what these cues look like. Note that the three cues used in the perceptual discrimination control were different from those used in instrumental learning sessions, to avoid subjects distinguishing cues on the basis of their learned values. At the end of the experiment, subjects were debriefed about whether or not they could perceive any part of the cues. They were also shown the cues unmasked one by one and asked whether or not they had seen them before. No included subject reported having seen any cue.

Statistical Analysis. From the conditioning task we extracted the percentages of Go and risky responses, which can be taken as indirect measures of motor and cognitive impulsivity, respectively. We also extracted the number of correct choices, which is equivalent to the monetary payoff. The payoff can then be split into euros won for the reward condition and euros not lost for the punishment condition. To correct for motor and cognitive bias, we subtracted the correct choices made in the neutral condition, which captures the propensity to make a Go response and a risky choice. To display learning progression, we plotted the cumulative money won (reward learning) or not lost (punishment learning) across trials. A linear regression was fitted on these learning curves, and coefficients (betas) were considered as an index of learning rates. From the visual discrimination task we calculated a sensitivity index (d′), as the difference between normalized rates of hits (correct different responses) and false alarms (incorrect different responses).

All data (demographic, clinical, or experimental) are reported as mean ± between-subjects standard error of the mean (SEM). To assess instrumental conditioning, we used one-tailed paired t tests comparing individual performances with chance level (which corresponds to a zero payoff). Similarly, to assess visual discrimination, we compared individual d′ with chance level (which is also zero), using one-tailed paired t tests. Within each pathological condition (PD or TS), we assessed medication effects by comparing dependent variables between On and Off states. We used within-group comparisons (paired two-tailed t tests) for PD patients, who were tested in the two medication states, and between-group comparisons (unpaired two-tailed t tests) for TS patients, who were either medicated or not. To assess disease effects relative to controls we performed between-group comparisons (unpaired two-tailed t tests). Finally, to assess significance of linear correlation between learning (payoff) and discrimination (d′) measures, we calculated Pearson’s coefficients. For all statistical tests the threshold for significance was set at P < 0.05.

ACKNOWLEDGMENTS. We are grateful to Helen Bates for helping with behavioral task administration and to Virginie Czernecki and Priscilla Van Meerbeeck for providing clinical data. We also thank Arlette Welaratne and all of the staff of the Centre d’Investigation Clinique for taking care of patients. Aman Saleem, Shadia Kawa, and Beth Pavlicek checked the English. S.P. received a Ph.D. fellowship from the Neuropôle de Recherche Francilien. The study was funded by the Ecole de Neurosciences de Paris.
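
The two derived measures defined under Statistical Analysis, the sensitivity index d′ and the learning-rate slope, can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes that "normalized rates" means the inverse-normal (z) transform, as is conventional in signal detection theory, and it adds a log-linear correction (0.5 added to each count) so that rates of exactly 0 or 1 stay finite; that correction is an assumption and is not described in the text. The function names `d_prime` and `learning_slope` are hypothetical.

```python
from statistics import NormalDist, mean


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index: z(hit rate) minus z(false-alarm rate).

    A hit is a correct "different" response; a false alarm is an
    incorrect "different" response.
    """
    # Assumption (not in the source): log-linear correction keeps the
    # rates strictly between 0 and 1 so the z transform is finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def learning_slope(outcomes):
    """Least-squares slope (beta) of cumulative payoff against trial number.

    `outcomes` holds the per-trial payoff in euros; at least two trials
    are needed for the slope to be defined.
    """
    cumulative = []
    total = 0.0
    for outcome in outcomes:
        total += outcome
        cumulative.append(total)
    trials = range(1, len(cumulative) + 1)
    t_mean = mean(trials)
    c_mean = mean(cumulative)
    sxy = sum((t - t_mean) * (c - c_mean) for t, c in zip(trials, cumulative))
    sxx = sum((t - t_mean) ** 2 for t in trials)
    return sxy / sxx
```

A subject who answers "different" equally often on same and different trials gets d′ = 0, the chance level used in the t tests above. A subject whose payoff does not change across trials gets a slope of 0.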

1. Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. Classical Conditioning II: Current Research and Theory, eds Black AH, Prokasy WF (Appleton-Century-Crofts, New York), pp 64–99.
2. Sutton RS, Barto AG (1998) Reinforcement Learning. (MIT Press, Cambridge, MA).
3. Daw ND, Doya K (2006) The computational neurobiology of learning and reward. Curr Opin Neurobiol 16(2):199–204.
4. O’Doherty JP, Hampton A, Kim H (2007) Model-based fMRI and its application to reward learning and decision making. Ann N Y Acad Sci 1104:35–53.
5. Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275(5306):1593–1599.
6. Waelti P, Dickinson A, Schultz W (2001) Dopamine responses comply with basic assumptions of formal learning theory. Nature 412(6842):43–48.
7. Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD (2006) Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442(7106):1042–1045.
8. Frank MJ, Seeberger LC, O’Reilly RC (2004) By carrot or by stick: Cognitive reinforcement learning in parkinsonism. Science 306(5703):1940–1943.
9. Cools R, Altamirano L, D’Esposito M (2006) Reversal learning in Parkinson’s disease depends on medication status and outcome valence. Neuropsychologia 44(10):1663–1673.
10. Frank MJ (2005) Dynamic dopamine modulation in the basal ganglia: A neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. J Cogn Neurosci 17(1):51–72.
11. Pessiglione M, et al. (2008) Subliminal instrumental conditioning demonstrated in the human brain. Neuron 59(4):561–567.
12. Kouider S, Dehaene S (2007) Levels of processing during non-conscious perception: A critical review of visual masking. Philos Trans R Soc Lond B Biol Sci 362(1481):857–875.
13. Dehaene S, Changeux JP, Naccache L, Sackur J, Sergent C (2006) Conscious, preconscious, and subliminal processing: A testable taxonomy. Trends Cogn Sci 10(5):204–211.
14. Knowlton BJ, Mangels JA, Squire LR (1996) A neostriatal habit learning system in humans. Science 273(5280):1399–1402.
15. Shohamy D, et al. (2004) Cortico-striatal contributions to feedback-based learning: Converging data from neuroimaging and neuropsychology. Brain 127(Pt 4):851–859.
16. Lagnado DA, Newell BR, Kahan S, Shanks DR (2006) Insight and strategy in multiple-cue learning. J Exp Psychol Gen 135(2):162–183.
17. Lovibond PF, Shanks DR (2002) The role of awareness in Pavlovian conditioning: Empirical evidence and theoretical implications. J Exp Psychol Anim Behav Process 28(1):3–26.
18. Wilkinson L, Shanks DR (2004) Intentional control and implicit sequence learning. J Exp Psychol Learn Mem Cogn 30(2):354–369.
19. Morris JS, Ohman A, Dolan RJ (1998) Conscious and unconscious emotional learning in the human amygdala. Nature 393(6684):467–470.
20. Olsson A, Phelps EA (2004) Learned fear of “unseen” faces after Pavlovian, observational, and instructed fear. Psychol Sci 15(12):822–828.
21. Knight DC, Nguyen HT, Bandettini PA (2003) Expression of conditional fear with and without awareness. Proc Natl Acad Sci USA 100(25):15280–15283.
22. Seitz AR, Kim D, Watanabe T (2009) Rewards evoke learning of unconsciously processed visual stimuli in adult humans. Neuron 61(5):700–707.
23. Li W, Howard JD, Parrish TB, Gottfried JA (2008) Aversive learning enhances perceptual and cortical discrimination of indiscriminable odor cues. Science 319(5871):1842–1845.
24. Braak H, Del Tredici K (2008) Invited article: Nervous system pathology in sporadic Parkinson disease. Neurology 70(20):1916–1925.
25. Pessiglione M, et al. (2005) An effect of dopamine depletion on decision-making: The temporal coupling of deliberation and execution. J Cogn Neurosci 17(12):1886–1896.
26. Voon V, Potenza MN, Thomsen T (2007) Medication-related impulse control and repetitive behaviors in Parkinson’s disease. Curr Opin Neurol 20(4):484–492.
27. Lawrence AD, Evans AH, Lees AJ (2003) Compulsive use of dopamine replacement therapy in Parkinson’s disease: Reward systems gone awry? Lancet Neurol 2(10):595–604.
28. Singer HS (2005) Tourette’s syndrome: From behaviour to biology. Lancet Neurol 4(3):149–159.
29. Albin RL, Mink JW (2006) Recent advances in Tourette syndrome research. Trends Neurosci 29(3):175–182.
30. Leckman JF (2002) Tourette’s syndrome. Lancet 360(9345):1577–1586.
31. Wong DF, et al. (2008) Mechanisms of dopaminergic and serotonergic neurotransmission in Tourette syndrome: Clues from an in vivo neurochemistry study with PET. Neuropsychopharmacology 33(6):1239–1251.
32. Tarnok Z, et al. (2007) Dopaminergic candidate genes in Tourette syndrome: Association between tic severity and 3′ UTR polymorphism of the dopamine transporter gene. Am J Med Genet B Neuropsychiatr Genet 144B(7):900–905.
33. Gilbert DL, et al. (2006) Altered mesolimbocortical and thalamic dopamine in Tourette syndrome. Neurology 67(9):1695–1697.
34. Yoon DY, et al. (2007) Dopaminergic polymorphisms in Tourette syndrome: Association with the DAT gene (SLC6A3). Am J Med Genet B Neuropsychiatr Genet 144B(5):605–610.
35. Schmidt L, et al. (2008) Disconnecting force from money: Effects of basal ganglia damage on incentive motivation. Brain 131(Pt 5):1303–1310.

19184 | www.pnas.org/cgi/doi/10.1073/pnas.0904035106 Palminteri et al.


Supporting Information
Palminteri et al. 10.1073/pnas.0904035106
SI Text

Task Instructions 1: “Go-Risky” Version. The aim of the game is to win money, by guessing the outcome of a key press.
At the beginning of each trial you must orient your gaze toward the central cross and pay attention to the masked cue. You will not be able to perceive the cue that is hidden behind the mask.
When the question mark appears you have 3 seconds to make your choice between
- holding the key down
- leaving the key up.
If you change your mind you can still release or press the key until the 3 seconds have elapsed.
“GO!” will be written in red if, at the end of the 3-second delay, the key is being pressed.
Then we display the outcome of your choice. Not pressing the key is safe: You will always get a neutral outcome (0€). Pressing the key is of interest but risky: You can equally win 1€, get nil (0€), or lose 1€. This depends on which cue was hidden behind the mask.
There is no logical rule to find in this game. If you never press the key, or if you press it every trial, your overall payoff will be nil. To win money you must guess if the ongoing trial is a winning or a losing trial. Your choices should be improved trial after trial by your unconscious emotional reactions. Just follow your gut feelings and you will win, and avoid losing, a lot of euros!

Task Instructions 2: “Nogo-Risky” Version. The aim of the game is to win money, by guessing the outcome of a key press.
At the beginning of each trial you must orient your gaze toward the central cross and pay attention to the masked cue. You will not be able to perceive the cue that is hidden behind the mask.
When the question mark appears you have 3 seconds to make your choice between
- holding the key down
- leaving the key up.
If you change your mind you can still release or press the key until the 3 seconds have elapsed.
“NO!” will be written in red if, at the end of the 3-second delay, the key is being released.
Then, we display the outcome of your choice. Pressing the key is safe: You will always get a neutral outcome (0€). Releasing the key is of interest but risky: You can equally win 1€, get nil (0€), or lose 1€. This depends on which cue was hidden behind the mask.
There is no logical rule to find in this game. If you never press the key, or if you press it every trial, your overall payoff will be nil. To win money you must guess if the ongoing trial is a winning or a losing trial. Your choices should be improved trial after trial by your unconscious emotional reactions. Just follow your gut feelings and you will win, and avoid losing, a lot of euros!
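
The payoff rule stated in both sets of instructions (a fixed strategy yields nothing, and only cue-dependent choices pay) can be checked with a short Python sketch. It is an illustration, not the task software. It assumes 30 presentations of each of the three cues, as reported for conditioning sessions, and the names `CUE_VALUES`, `trial_outcome`, and `total_payoff` are hypothetical.

```python
# Euro value attached to each masked cue when the risky response is made.
CUE_VALUES = {"reward": 1, "punish": -1, "neutral": 0}


def trial_outcome(cue, key_pressed, risky_response="go"):
    """Payoff in euros for one trial.

    `risky_response` is "go" in the Go-Risky version and "nogo" in the
    Nogo-Risky version; the other response is safe and always pays 0.
    """
    response = "go" if key_pressed else "nogo"
    if response != risky_response:
        return 0
    return CUE_VALUES[cue]


def total_payoff(cues, presses, risky_response="go"):
    """Sum of trial payoffs over a session."""
    return sum(
        trial_outcome(cue, pressed, risky_response)
        for cue, pressed in zip(cues, presses)
    )


# Assumed session: 30 presentations of each cue (order is irrelevant here).
cues = ["reward", "punish", "neutral"] * 30

always = total_payoff(cues, [True] * len(cues))             # press every trial
never = total_payoff(cues, [False] * len(cues))             # never press
ideal = total_payoff(cues, [c == "reward" for c in cues])   # press on reward cues only
```

Under these assumptions, pressing on every trial and never pressing both sum to zero, because wins and losses cancel. A subject who takes the risky response only on rewarding cues would earn the maximum.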

Palminteri et al. www.pnas.org/cgi/content/short/0904035106 1 of 2


Fig. S1. Visual discrimination across trials. Graphs represent performance in the visual discrimination task (solid squares; percentage of correct responses) plotted against trials. Dashed lines represent chance-level behavior (50% correct). Error bars are between-subjects standard errors of the mean. To formally test for the presence of perceptual learning we tested whether performance slopes were significantly positive across subjects. Mean slopes were close to zero and nonsignificant (PD, −0.06 ± 0.10; TS, 0.14 ± 0.09; Juniors, −0.09 ± 0.06; Seniors, 0.07 ± 0.09; all P > 0.05, one-tailed t test).
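
The slope test described in the caption amounts to a one-sample t test of per-subject slopes against zero. The Python sketch below computes only the t statistic and its degrees of freedom. The function name `one_sample_t` is hypothetical, and the example slopes are invented for illustration; they are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, popmean=0.0):
    """Return (t statistic, degrees of freedom) for a test against popmean."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # sample SD divided by sqrt(n)
    return (mean(sample) - popmean) / standard_error, n - 1


# Hypothetical per-subject slopes, invented for illustration only.
example_slopes = [0.10, -0.05, 0.20, 0.00, -0.10, 0.15]
t_value, df = one_sample_t(example_slopes)
```

The one-tailed P value is the upper-tail probability of a t distribution with `df` degrees of freedom evaluated at `t_value`; a statistics library would normally supply it.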
