Connecting the Demons: How connection choices of a Horde implementation affect Demon prediction capabilities.
The reinforcement learning framework Horde, developed by Sutton et al., is a network of Demons that processes sensorimotor data into general knowledge about the world. These Demons can be connected to each other and to data streams from specific sensors. This paper examines whether, and how, the capability of Demons to learn general knowledge is affected by the number of connections they have to other Demons and to sensors. Several experiments were conducted and analyzed to map these effects and to provide insight into how they arise.
Keywords: Artificial Intelligence, value function approximation, temporal difference learning, reinforcement learning, predictions, prediction error, pendulum environment, parallel processing, off-policy learning, network connections, knowledge representation, Horde Architecture, GQ(λ), general value functions
Faculteit der Sociale Wetenschappen