Navigating Aperiodicity: Challenges of Reinforcement Learning

dc.contributor.advisor: Kachman, Tal
dc.contributor.advisor: Thill, Serge
dc.contributor.author: Gerding, Fynn
dc.date.issued: 2024-08-05
dc.description.abstract: This thesis investigates the application of reinforcement learning (RL) in environments with uncertain action spaces, specifically focusing on the Penrose P3 tiling environment. Traditional RL approaches fail in such situations due to the violation of the Markov property, caused by the variable action space. To address this, I employ proximal policy optimisation (PPO) and explore various neural network architectures as the acting policy to allow for the integration of contextual information. Specifically, I use various recurrent network architectures and a Transformer encoder to study the influence of incorporating the agent’s past trajectory into the policy optimisation process. Additionally, I evaluate the impact of attention mechanisms and positional embeddings on convergence rates and attention scores. Through extensive experiments, I analyse the performance of these architectures in navigating the aperiodic and non-Markovian Penrose P3 environment. The findings reveal that all model architectures use the context to enhance their decision-making. The attention scores show that the local context matters particularly. Augmenting the agent’s context with positional embeddings helps, particularly with the Transformer’s convergence speed, but also reveals interesting artefacts from the P3 environment in the other architectures’ attention scores. This research contributes to the understanding of how context can help RL algorithms to navigate aperiodicity and provides insights into designing RL agents for complex real-world applications with dynamic action spaces.
dc.identifier.uri: https://theses.ubn.ru.nl/handle/123456789/19492
dc.language.iso: en
dc.thesis.faculty: Faculteit der Sociale Wetenschappen
dc.thesis.specialisation: specialisations::Faculteit der Sociale Wetenschappen::Artificial Intelligence::Bachelor Artificial Intelligence
dc.thesis.studyprogramme: studyprogrammes::Faculteit der Sociale Wetenschappen::Artificial Intelligence
dc.thesis.type: Bachelor
dc.title: Navigating Aperiodicity: Challenges of Reinforcement Learning

Files

Name: Gerding, F. s-1075217-BSc-BKI340-Thesis-2024.pdf
Size: 4.3 MB
Format: Adobe Portable Document Format