Learning representations for simultaneous localisation and mapping

Issue Date

2022-09-01

Language

en

Abstract

Building a cognitive map from a continuous stream of visual inputs is a non-trivial undertaking. While humans perform this task effortlessly, current robotic systems often fail to do so. Possessing a rich internal representation of the spatial layout is important for downstream tasks such as navigation, scene understanding, or exploration. We start by highlighting connections between different fields dealing with navigation, such as simultaneous localization and mapping, animal foraging, and world model learning. To determine which architecture is best suited to build spatial representations, we compared eight different network types trained in a self-supervised fashion on datasets generated using 3D maze environments of varying size and complexity. The compared networks comprise both feedforward and recurrent models, which perform action-sequence prediction and coordinate prediction tasks. We find that models that integrate the largest amount of previous information perform best. Local single-state information is insufficient to distinguish identical-looking states and to uniquely determine the global position. Our results imply that model-based navigation agents profit from integrating trajectory information and that agents should thus be endowed with mechanisms to do so.

Keywords: Spatial Memory · Scene Representation · Representation Learning · Self-supervised Learning · SLAM
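The abstract's central observation can be made concrete with a toy example. The sketch below is purely illustrative (it is not the paper's code, and all names in it are hypothetical): in a corridor whose wall textures repeat, a single observation is consistent with several positions, whereas folding the action history into a state, in the spirit of the recurrent models compared in the paper, pins down the position uniquely.

```python
# Illustrative sketch (not the paper's code): why a single observation is
# ambiguous in a maze with identical-looking states, and how integrating
# the action trajectory disambiguates the global position.

# A 1-D corridor whose wall textures repeat: cells 1 and 3 look identical.
observations = {0: "door", 1: "brick", 2: "window", 3: "brick", 4: "exit"}

def locate_from_observation(obs):
    """Feedforward-style lookup: all cells consistent with one local view."""
    return [cell for cell, o in observations.items() if o == obs]

def locate_from_trajectory(start, actions):
    """Recurrent-style path integration: fold actions (+1/-1) into a state."""
    pos = start
    for a in actions:
        pos += a
    return pos

print(locate_from_observation("brick"))  # ambiguous: [1, 3]
actions = [+1, +1, +1]                   # walked right three times from cell 0
pos = locate_from_trajectory(0, actions)
print(pos, observations[pos])            # unique: 3 brick
```

The same ambiguity arises in the 3D mazes the paper studies whenever two locations yield identical visual inputs, which is why trajectory-integrating models outperform purely local ones.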

Faculty

Faculteit der Sociale Wetenschappen