Changing the U-net in a diffusion model

Hill, Alan

Files

Hill, A. s-1036334-Bsc-Thesis-2023.pdf (1.37 MB)

Authors

Hill, Alan

Issue Date

2023-01-27

Language

en

URI

https://theses.ubn.ru.nl/handle/123456789/16032

Abstract

Diffusion models are an important development in the field of image generation: they perform better than the previous state of the art, big GANs, while also being easier to train and implement. As a result, diffusion has replaced GANs as the current state of the art and is used in projects such as DALL-E and Google's Imagen. However, diffusion is still very new. Much has yet to be explored, and most diffusion models use similar implementations, typically built on U-nets that differ little from one another. Exploring and modifying these U-nets is therefore important for discovering potential improvements or, at the least, for explaining why certain components of the U-net are crucial. Three implementations were attempted: a baseline with no changes; a U-net with the time embeddings removed; and a U-net on top of a U-net. Each attempted to generate images using the Stanford Cars data set. The baseline performed decently, creating a mostly complete picture of a car at 100 epochs. The U-net without time embeddings produced only noise, though this result is probably incorrect and most likely caused by a programming error. Finally, the U-net on U-net was far too slow to produce anything, taking about three hours to reach five epochs.

Supervisor

Ambrogioni, Luca

Shahsavari, Mahyar

Faculty

Faculteit der Sociale Wetenschappen