Improving Behaviour by Modelling Irrelevant Environmental Features

Issue Date

2022-06-20

Language

en

Abstract

DreamerV2 is a recently introduced model-based reinforcement learning algorithm. It uses a compact latent state to store information about the current state of the environment, which helps it obtain higher rewards. We observed that DreamerV2 struggles in more complex environments because it is unable to properly encode all of their features. This research introduces a modification of DreamerV2 that works better in such environments. We created a CNN that receives multiple images and uses them to predict the irrelevant features of the environment. By adding this CNN to the DreamerV2 framework, we expected the model to obtain higher rewards as the complexity of the environment increases: because the CNN predicts the irrelevant features, only the relevant features need to be processed by the auto-encoder of DreamerV2, which should result in a higher reward when playing a game. Our results support this expectation. The added CNN predicts some of the irrelevant features, and our model obtains a significantly higher reward than the original DreamerV2 when trained on a game with an added distractor.

Faculty

Faculteit der Sociale Wetenschappen