Changing the U-net in a diffusion model
Issue Date
2023-01-27
Language
en
Abstract
Diffusion models are an important development in the field of image generation: they
outperform the previous state of the art, big GANs, while also being easier to train
and implement. As a result, diffusion has replaced GANs as the current state of the art
and is used in projects such as DALL-E and Google's Imagen. However, diffusion is still
very new. Much has yet to be explored, and many diffusion models share similar
implementations, often built on U-nets that differ little from one another. Exploring
and modifying these U-nets is therefore important, either to discover potential
improvements or, at the least, to explain why certain components of the U-net are
crucial. Three implementations were attempted: a baseline with no changes; a U-net with
the time embeddings removed; and a U-net on top of a U-net. Each was used to generate
images from the Stanford Cars dataset. The baseline performed decently, producing a
mostly complete picture of a car at 100 epochs. The U-net without time embeddings
produced only noise, although this result is likely incorrect and most probably stems
from a programming error. Finally, the U-net on U-net was far too slow to produce
anything, taking about three hours to reach five epochs.
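To make the second variant concrete, the following is a minimal sketch of what a time embedding does and what removing it takes away. The abstract does not say how the time embeddings were implemented, so this assumes the sinusoidal form commonly used in diffusion U-nets. The names `sinusoidal_time_embedding`, `condition_features`, and `use_time_embedding` are hypothetical and chosen only for illustration.

```python
import math
from typing import List


def sinusoidal_time_embedding(t: int, dim: int, max_period: float = 10000.0) -> List[float]:
    """Return a dim-length sinusoidal embedding of timestep t (assumed form)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometrically spaced frequencies, starting at 1 and decaying towards 1/max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    # First half holds the sines, second half the cosines.
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


def condition_features(features: List[float], t: int,
                       use_time_embedding: bool = True) -> List[float]:
    """Add the time embedding to a feature vector (hypothetical helper).

    With use_time_embedding=False the features are returned unchanged,
    so the output carries no information about the timestep.
    """
    if not use_time_embedding:
        return list(features)
    emb = sinusoidal_time_embedding(t, len(features))
    return [x + e for x, e in zip(features, emb)]


if __name__ == "__main__":
    feats = [0.5] * 8
    print(condition_features(feats, t=10))
    print(condition_features(feats, t=10, use_time_embedding=False))
```

With the flag switched off, the same features come back for every timestep, so a network built this way has no explicit signal for how much noise it is expected to remove at a given step.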
Faculty
Faculteit der Sociale Wetenschappen