Structured Belief Learning for Meta-Reinforcement Learning
Issue Date
2022-12-01
Language
en
Abstract
Meta-reinforcement learning is an important subfield of reinforcement
learning (RL) in which a decision-making agent leverages experience
acquired on related tasks to adapt rapidly to new, unknown tasks. To
perform well in a meta-learning setup, an agent must intelligently trade
off exploration and exploitation. A recently proposed method, variBAD,
achieves this trade-off near-optimally by constructing a belief over
tasks and conditioning action selection on this belief, and thus on the
agent's task uncertainty. Analysis of a variBAD agent in a deterministic
task, however, reveals potentially problematic behaviour: the belief
variance increases as the agent gains information, and non-zero reward
probabilities are predicted for non-goal cells even after the belief has
converged. variBAD and most other model-based meta-RL methods use, by
default, unstructured RNN memory units, despite many known challenges
associated with them. In this work we build on variBAD and propose an
alternative memory unit, metaMU, that learns the belief parameters in a
structured manner. At its core, metaMU follows an intuitive idea:
acquiring information should increase certainty. We test metaMU on
deterministic and stochastic tasks and show that it achieves comparable
performance at a smaller computational cost, yet lacks the flexibility
needed to solve more complex tasks.
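The structural constraint at the heart of metaMU, that acquiring information can only increase certainty, can be sketched as a belief update in which the mean is updated freely while the standard deviation is multiplied by a gate in (0, 1). This is a minimal illustrative sketch, not the thesis's actual architecture; the update rule, the names `metamu_step`, `W_mu`, and `W_gate`, and the multiplicative-gate parameterisation are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def metamu_step(mu, sigma, obs, W_mu, W_gate):
    """One hypothetical structured belief update.

    The belief mean is updated freely from the observation, but the
    standard deviation is multiplied by a gate in (0, 1), so the belief
    variance is monotonically non-increasing as observations arrive.
    (Illustrative parameterisation only, not the thesis's exact unit.)
    """
    mu_new = mu + W_mu @ obs          # unconstrained mean update
    gate = sigmoid(W_gate @ obs)      # strictly in (0, 1)
    sigma_new = sigma * gate          # certainty can only increase
    return mu_new, sigma_new


rng = np.random.default_rng(0)
mu, sigma = np.zeros(2), np.ones(2)
W_mu = rng.normal(size=(2, 3))
W_gate = rng.normal(size=(2, 3))

for _ in range(5):
    obs = rng.normal(size=3)
    mu, sigma = metamu_step(mu, sigma, obs, W_mu, W_gate)

# After any number of steps, the variance has shrunk from its prior.
assert np.all(sigma < 1.0)
```

An unstructured RNN memory unit has no such constraint, which is consistent with the observed failure mode above: nothing prevents its predicted belief variance from growing even as the agent gathers more information.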
Faculty
Faculteit der Sociale Wetenschappen