Guided belief updates in Deep Bayesian Meta-Reinforcement Learning

dc.contributor.advisor: Ambrogioni, Luca
dc.contributor.advisor: Hinne, Max
dc.contributor.author: Börker, Jeremy
dc.date.issued: 2022-09-01
dc.description.abstract: Balancing exploration and exploitation is a key challenge of reinforcement learning. The Bayes-adaptive policy achieves the optimal balance by conditioning on a posterior belief over the reward and transition functions. The current state-of-the-art approach, VariBad, meta-trains a recurrent neural network to perform approximate Bayesian inference over this posterior belief. Inspecting the posterior variance, however, reveals behavior dissimilar to exact posterior updates, suggesting that learning the desired inference behavior entirely a posteriori from data is problematic. This work therefore equips the belief inference model of a Bayesian RL agent with Bayesian inference mechanics a priori and investigates how this influences performance.
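As a point of reference for the abstract, a minimal sketch of the "exact posterior updates" it contrasts the learned inference network with, assuming a conjugate Beta-Bernoulli reward model; the model choice, parameter values, and variable names below are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

# Minimal exact Bayesian belief update: a Beta prior over the success
# probability of a Bernoulli reward, updated in closed form after each
# observation. This closed-form posterior is the kind of exact update
# against which a learned (recurrent) inference network can be compared.

rng = np.random.default_rng(0)
true_p = 0.7                 # hidden per-task reward probability (illustrative)
alpha, beta = 1.0, 1.0       # Beta(1, 1) prior: uniform belief over true_p

for step in range(10):
    r = float(rng.random() < true_p)   # observe a Bernoulli reward
    alpha += r                         # conjugate update: add a success...
    beta += 1.0 - r                    # ...or a failure
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    print(f"step {step}: posterior mean {mean:.3f}, variance {var:.4f}")
```

In this conjugate setting the posterior variance falls toward zero as observations accumulate, providing the reference behavior against which a learned posterior variance can be checked.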
dc.identifier.uri: https://theses.ubn.ru.nl/handle/123456789/16314
dc.language.iso: en
dc.thesis.faculty: Faculteit der Sociale Wetenschappen
dc.thesis.specialisation: specialisations::Faculteit der Sociale Wetenschappen::Artificial Intelligence::Master Artificial Intelligence
dc.thesis.studyprogramme: studyprogrammes::Faculteit der Sociale Wetenschappen::Artificial Intelligence
dc.thesis.type: Master
dc.title: Guided belief updates in Deep Bayesian Meta-Reinforcement Learning
Files
Original bundle
Name: Börker, C. s-4414683-MSc-MKI94-Thesis-2022.pdf
Size: 1.39 MB
Format: Adobe Portable Document Format