Optimal Control and Reinforcement Learning for Partially Observable Dynamical Systems
Issue Date
2023-06-01
Language
en
Abstract
Dynamical systems can be found everywhere around us. They range from complex
systems such as the weather to smaller systems like a bead accelerating towards
the earth. How agents or controllers can influence these systems to best suit their
interests is a central question in both Artificial Intelligence and Control Theory.
This problem is of interest to research, industry, and society. We observe the ability
to control the internal or external state at every level of life. Insights into these
control mechanisms can help us gain a better understanding of natural intelligence,
constituting a scientific interest. Furthermore, many processes can be described
as dynamical systems. Optimizing controllers can benefit industrial interest in
making processes more cost and time efficient. In society, controllers already
play an enormous role. Think of your thermostat, washing machine, 'smart'
traffic lights, or cruise control. Research into control can make our world safer
(driver assistance), more reliable (energy grid controllers), and more sustainable
(automated building management). Although this field has been studied extensively,
one great open problem has stood since the 1960s: how can we simultaneously
regulate a system optimally and learn its dynamics? Because real-world systems
are often only partially observable, we want controllers that can learn from their
observations. While some fully observable systems admit known optimal solutions,
this problem, known as dual control, is intractable. This work aims to
investigate how both Control Theory and Reinforcement Learning can solve the
dual control problem. We show that an existing method for solving the dual control
problem can be extended to the two-dimensional case. Furthermore, we provide
both a recurrent and a regular Soft Actor-Critic agent implemented in the JAX
ecosystem. We show that the Recurrent Soft Actor-Critic provides more reliable
and effective control of partially observable dynamical systems.
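The abstract's central technical claim is that a recurrent policy handles partial observability better than a memoryless one: a hidden state carried across time steps can summarize the observation history and stand in for the latent system state. As an illustrative sketch only (this is not the thesis implementation; all layer names, sizes, and the simple-RNN cell are assumptions for demonstration), that idea can be written with plain JAX primitives:

```python
# Hypothetical sketch of a recurrent policy for a partially observable
# system, using only core JAX. Dimensions and the plain tanh-RNN cell
# are illustrative choices, not the thesis's actual architecture.
import jax
import jax.numpy as jnp


def init_params(key, obs_dim=2, hidden_dim=8, act_dim=1):
    # Small random weights for an input-to-hidden, hidden-to-hidden,
    # and hidden-to-action map.
    k1, k2, k3 = jax.random.split(key, 3)
    scale = 0.1
    return {
        "W_xh": scale * jax.random.normal(k1, (obs_dim, hidden_dim)),
        "W_hh": scale * jax.random.normal(k2, (hidden_dim, hidden_dim)),
        "W_ha": scale * jax.random.normal(k3, (hidden_dim, act_dim)),
    }


def policy_step(params, h, obs):
    # The hidden state h accumulates information from past observations,
    # acting as a learned summary of the unobserved system state.
    h = jnp.tanh(obs @ params["W_xh"] + h @ params["W_hh"])
    action = jnp.tanh(h @ params["W_ha"])  # bounded action, as in SAC
    return h, action


def rollout_policy(params, observations, hidden_dim=8):
    # Unroll the recurrent policy over an observation sequence.
    h0 = jnp.zeros(hidden_dim)
    _, actions = jax.lax.scan(
        lambda h, o: policy_step(params, h, o), h0, observations
    )
    return actions


params = init_params(jax.random.PRNGKey(0))
obs_seq = jnp.zeros((5, 2))  # five observations of a 2-D system
actions = rollout_policy(params, obs_seq)
print(actions.shape)  # (5, 1): one action per observation
```

A memoryless (regular) Soft Actor-Critic agent would instead map each observation directly to an action, discarding history; the hidden state above is exactly what the recurrent variant adds.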
Faculty
Faculteit der Sociale Wetenschappen