Speaker: Professor Kevin Gurney, University of Sheffield
How can animals acquire a repertoire of actions enabling the achievement of their goals? Moreover, how can this be done in an intrinsically motivated way, without the animal being instructed and without some overt, primary reward being assigned to successful learning? The relations between actions and outcomes are presumed to be held in internal models, encoded in associative neural networks. In order for these associations to be learned, representations of the motor action, the sensory context, and the sensory outcome must be repeatedly activated in the relevant neural systems. This requires a transient change in the action selection policy of the agent, so that the to-be-learned action is selected more often than other competing actions. A programme of work seeking the biological underpinning of this computational framework requires an understanding of action selection in the brain, a key component of which is a set of sub-cortical nuclei, the basal ganglia. The basal ganglia are subject to reinforcement learning, mediated by phasic activity in midbrain dopamine neurons that constitutes a reinforcement signal. We propose that this signal encodes a sensory prediction error, initiated when the agent's actions elicit 'surprising' events, thereby fostering intrinsically motivated exploration of the environment. I will describe models of intrinsically motivated action learning in the basal ganglia based on these ideas. They are tested in a simple autonomous agent whose behaviour is constrained to mimic that of rats in an in vivo experiment. The model shows a complex interplay of several mechanisms that we believe are responsible for biological action discovery.