Still at the Choice-Point
Action Selection and Initiation in Instrumental Conditioning
BERNARD W. BALLEINE
Department of Psychology and the Brain Research Institute, University of California, Los Angeles, California, USA
SEAN B. OSTLUND
Department of Psychology and the Brain Research Institute, University of California, Los Angeles, California, USA
Abstract: Contrary to classic stimulus–response (S-R) theory, recent evidence suggests that, in instrumental conditioning, rats encode the relationship between their actions and the specific consequences that these actions produce. It has remained unclear, however, how encoding this relationship acts to control instrumental performance. Although S-R theories were able to give a clear account of how learning translates into performance, the argument that instrumental learning constitutes the acquisition of information of the form “response R leads to outcome O” does not directly imply a particular performance rule or policy; this information can be used either to perform R or to avoid performing R. Recognition of this problem has forced the development of accounts that allow the outcome (O) and stimuli that predict it (i.e., S-O associations) to play a role in the initiation of specific responses. In recent experiments, we have used a variety of behavioral procedures, including outcome devaluation, reinstatement, and Pavlovian–instrumental transfer, in an attempt to isolate the processes that contribute to instrumental performance. Our results, particularly from experiments assessing outcome-selective reinstatement, suggest that both “feed-forward” (O-R) and “feed-back” (R-O) associations are critical: although the former appear to be important to response selection, the latter, together with processes that determine outcome value, mediate response initiation. We discuss a conceptual model that integrates these processes and consider its neural implementation.