This week's iteration focuses on Diffusion models. Here's every vision application where Diffusion models were a game changer in 2022: image, text, video, 3D, and more! We cover what they are and how they were used in all those amazing applications, with lots of videos and featured articles. We introduce most approaches with a short explanation, but we invite you to watch or read the featured content for a complete understanding. We hope you enjoy it.
1️⃣ To start: What are Diffusion models?
Nothing's better for explaining something than a concrete example. Let's go over Diffusion models using one of the most popular models of the past few months: Stable Diffusion. Stable Diffusion is a powerful text-to-image model based on a recent technique called Latent Diffusion. Latent Diffusion makes Diffusion more efficient, but the process stays exactly the same, so it will do fine for this explanation. That efficiency is also what makes it much more accessible to "non-Google" entities, like us.
Diffusion models are iterative models that take random noise as input. This noise can be conditioned on text, an image, or any other modality (type of input), so the result is not completely random. The model iteratively removes this noise, having learned what parameters to apply to it to end up with a final image. So a basic diffusion model takes random noise the size of the image and removes a little of it at each step until we get back to a realistic image.
This is possible because the model has access to real images during training. There, noise is applied to each image iteratively until it is completely unrecognizable, and the model learns the right parameters from these noised versions. Once the noise we get from all images is similar, meaning it follows a similar distribution, we are ready to use the model in reverse: we feed it similar noise and undo the steps in reverse order to get an image similar to the ones used during training.
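To make the two directions described above more concrete, here is a minimal NumPy sketch. Everything in it is an assumption for illustration, not Stable Diffusion's actual code: the names (`add_noise`, `sample`, `oracle_noise`), the 1000 steps, and the linear noise schedule from 1e-4 to 0.02 are just common choices. In particular, `oracle_noise` is a hypothetical stand-in for the trained neural network: it "knows" a single training image, so it can compute the noise exactly instead of having learned to predict it. The sketch also works on raw pixel arrays and skips text conditioning and the latent space entirely.

```python
import numpy as np

NUM_STEPS = 1000                             # assumed number of diffusion steps
BETAS = np.linspace(1e-4, 0.02, NUM_STEPS)   # assumed linear noise schedule
ALPHAS = 1.0 - BETAS
ALPHA_BARS = np.cumprod(ALPHAS)              # share of the original signal left after t steps


def add_noise(x0, t, rng):
    """Training direction: noise a clean image x0 up to step t in one jump."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(ALPHA_BARS[t]) * x0 + np.sqrt(1.0 - ALPHA_BARS[t]) * noise
    # A real model would be trained to predict `noise` given (x_t, t).
    return x_t, noise


def sample(predict_noise, shape, rng):
    """Generation direction: start from pure noise and denoise step by step."""
    x = rng.standard_normal(shape)
    for t in reversed(range(NUM_STEPS)):
        eps = predict_noise(x, t)
        # Remove the predicted noise for this step.
        x = (x - BETAS[t] / np.sqrt(1.0 - ALPHA_BARS[t]) * eps) / np.sqrt(ALPHAS[t])
        if t > 0:
            # Every step except the last re-injects a little fresh noise.
            x = x + np.sqrt(BETAS[t]) * rng.standard_normal(shape)
    return x


rng = np.random.default_rng(0)
train_image = np.full((4, 4), 0.5)           # toy "dataset" made of one flat image


def oracle_noise(x, t):
    """Hypothetical stand-in for the trained network (exact for this one image)."""
    return (x - np.sqrt(ALPHA_BARS[t]) * train_image) / np.sqrt(1.0 - ALPHA_BARS[t])


noisy, true_noise = add_noise(train_image, NUM_STEPS - 1, rng)
generated = sample(oracle_noise, train_image.shape, rng)

print("signal left at the last step:", np.sqrt(ALPHA_BARS[-1]))
print("max error of generated image:", np.abs(generated - train_image).max())
```

With a real model, `oracle_noise` would be replaced by a network trained on many images, which is why the output would resemble the training images rather than copy one of them.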
Learn more in my article about Stable Diffusion or in the video below!