Structured Belief Learning for Meta-Reinforcement Learning

Issue Date

2022-12-01

Language

en

Abstract

Meta-reinforcement learning is an important subfield of reinforcement learning (RL) in which a decision-making agent learns to perform well across related tasks by leveraging previously acquired information and rapidly adapting to new, unknown tasks. To perform well in a meta-learning setup, an agent must intelligently trade off exploration and exploitation. A recently proposed method, variBAD, achieves this trade-off near-optimally by constructing a belief over tasks and conditioning action selection on this belief, and thus on the agent's uncertainty about the task. Analysis of a variBAD agent in a deterministic task, however, reveals two potentially problematic behaviours: the belief variance increases as the agent gains information, and the agent predicts non-zero reward probabilities for non-goal cells even after the belief has converged. By default, variBAD and most other model-based meta-RL methods use unstructured RNN memory units, despite the many known challenges associated with them. In this work we build on variBAD and propose an alternative memory unit, metaMU, that learns the belief parameters in a structured manner. At its core, metaMU follows an intuitive idea: acquiring information should entail an increase in certainty. We test metaMU on deterministic and stochastic tasks and show that it achieves comparable performance at a smaller computational cost, yet lacks the flexibility needed to solve more complex tasks.
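The core idea behind metaMU — that acquiring information should only increase certainty — can be illustrated with a toy recurrent belief cell whose update rule structurally forbids the belief variance from growing. The sketch below is a minimal, hypothetical illustration of this principle, not the thesis's actual metaMU equations: the cell names (`StructuredBeliefCell`), the additive mean update, and the multiplicative variance shrinkage are all assumptions chosen to make the monotonicity property explicit.

```python
import numpy as np


def softplus(x):
    """Numerically simple softplus; always returns a positive value."""
    return np.log1p(np.exp(x))


class StructuredBeliefCell:
    """Toy belief-memory cell in which each observation can only shrink
    the belief variance, encoding "more information => more certainty".

    Hypothetical sketch of the structured-update idea, not the actual
    metaMU architecture. The belief is a diagonal Gaussian (mean, var).
    """

    def __init__(self, obs_dim, belief_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random weights stand in for learned parameters.
        self.W_mu = rng.normal(scale=0.1, size=(belief_dim, obs_dim))
        self.W_var = rng.normal(scale=0.1, size=(belief_dim, obs_dim))
        self.belief_dim = belief_dim

    def init_belief(self):
        # Prior: zero mean, unit variance.
        return np.zeros(self.belief_dim), np.ones(self.belief_dim)

    def step(self, mu, var, obs):
        # Mean: simple additive correction driven by the observation.
        mu_new = mu + self.W_mu @ obs
        # Variance: multiplied by a factor in (0, 1], since
        # softplus(x) > 0 implies exp(-softplus(x)) < 1.
        # The variance can therefore never increase, by construction.
        shrink = np.exp(-softplus(self.W_var @ obs))
        var_new = var * shrink
        return mu_new, var_new
```

Contrast this with an unstructured RNN memory unit, where the variance head is a free function of the hidden state and nothing prevents the pathology described above, in which the reported variance grows even as the agent accumulates evidence.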

Faculty

Faculteit der Sociale Wetenschappen