Using Deep Probabilistic Reinforcement Learning to Improve Traffic Flow with Partial Vehicle Detection for Intelligent Traffic Signal Control

Vleuten, van der, Noah

Files

Vleuten, vd, N. s-1018323-BSc-Thesis-2021.pdf (3.99 MB)

Authors

Vleuten, van der, Noah

Issue Date

2021-06-18

Language

en

URI

https://theses.ubn.ru.nl/handle/123456789/15810

Abstract

With the number of vehicles on the streets constantly rising, traffic flow needs to be as close to optimal as possible. Improving traffic flow requires intelligent traffic signal control (TSC), the field concerned with controlling traffic lights. Mismanaged traffic lights can cause congestion, which in turn can lead to accidents and wasted productivity and, most importantly, harms the environment when cars burn fossil fuels. Several deep reinforcement learning methods have been suggested to improve TSC, but they all assume perfectly observed data. The assumption that the number of cars on each lane can always be observed is not realistic: real-world observations are inherently noisy, sensors can fail, and some sensors might be left out as a cost-saving measure. This work therefore focuses on partial vehicle detection in traffic signal control, specifically on improving average travel time given only partial observations of cars at intersections. To address this problem, a variational autoencoder (VAE) has been developed that uses graph neural networks (GNNs) for the encoder and decoder networks. Given lane counts at a mix of observed and unobserved intersections, this VAE tries to reconstruct the real-world situation together with the uncertainty of its prediction. A state-of-the-art reinforcement learning method called CoLight is then used to test this VAE in the CityFlow simulator. Results show that the CoLight model performs better when using the reconstructed modes from the VAE (515) than when given no information at all at the unobserved intersections (1146). The CoLight model with the additional uncertainty measure performs marginally better (490) than the model that only gets the reconstruction. However, the MaxPressure + FixedTime baseline still outperforms all models on the average travel time metric (402).
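
To put the reported figures side by side, the short Python sketch below computes the relative reduction between configurations. It assumes that all four numbers are average travel times (lower is better), as the last sentence of the abstract suggests. The dictionary keys and the helper `relative_reduction` are illustrative names, not taken from the thesis, and the abstract does not state the unit.

```python
# Figures as reported in the abstract; assumed to be average travel times
# (lower is better). The unit is not stated in the abstract.
reported = {
    "CoLight, no information": 1146,
    "CoLight + VAE reconstruction": 515,
    "CoLight + VAE reconstruction + uncertainty": 490,
    "MaxPressure + FixedTime": 402,
}


def relative_reduction(reference: float, value: float) -> float:
    """Fraction by which `value` is lower than `reference`."""
    return (reference - value) / reference


# Compare each configuration with the next, better-performing one.
pairs = [
    ("CoLight, no information", "CoLight + VAE reconstruction"),
    ("CoLight + VAE reconstruction", "CoLight + VAE reconstruction + uncertainty"),
    ("CoLight + VAE reconstruction + uncertainty", "MaxPressure + FixedTime"),
]

for reference_name, other_name in pairs:
    reduction = relative_reduction(reported[reference_name], reported[other_name])
    print(f"{other_name} vs. {reference_name}: {reduction:.1%} lower")
```

Under these assumptions, the reconstruction accounts for most of the gap to the uninformed model, the uncertainty measure adds a comparatively small further reduction, and the baseline remains ahead of both.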

Supervisor

Ambrogioni, L.

Hinne, M.

Faculty

Faculteit der Sociale Wetenschappen