Stable Diffusion Architecture: A Brief Overview

(Many thanks to Steins for presenting a clear explanation of the legacy SD 1.5 architecture! You can find his article here.)

Stable Diffusion was originally called a "Latent Diffusion Model", which means that the diffusion process happens in a compressed latent space rather than directly on the image pixels.
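To give a sense of what working in latent space means in practice, here is a minimal sketch of the size arithmetic. The downscale factor of 8 and the 4 latent channels are the commonly cited SD 1.5 defaults and are assumptions here, not values stated in this article; the function names are hypothetical.

```python
def latent_shape(height, width, downscale=8, latent_channels=4):
    """Return (channels, height, width) of the latent for a given image size.

    downscale and latent_channels are assumed SD 1.5 defaults.
    """
    if height % downscale or width % downscale:
        raise ValueError("image size must be divisible by the downscale factor")
    return (latent_channels, height // downscale, width // downscale)


def compression_ratio(height, width, image_channels=3, **kwargs):
    """How many times fewer values the latent holds than the RGB image."""
    c, h, w = latent_shape(height, width, **kwargs)
    return (image_channels * height * width) / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

Under these assumptions, the U-Net works on 48 times fewer values than it would on the raw pixels, which is the main reason for diffusing in latent space.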

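Diffusion is the process of either adding noise to an image or removing noise from it. In Stable Diffusion, a U-Net handles the removal: it predicts the noise present in the image so that it can be subtracted.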
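Adding and removing noise can be illustrated with a few lines of plain Python. This is only a sketch, assuming the common DDPM-style formulation in which a noisy sample is a weighted mix of the clean data and Gaussian noise; the names `add_noise`, `remove_noise`, and `alpha_bar` are illustrative and do not come from this article.

```python
import math
import random


def add_noise(x0, alpha_bar, rng):
    """Noising: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)], eps


def remove_noise(xt, eps_pred, alpha_bar):
    """Denoising: invert the formula above, given an estimate of the noise."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(xt, eps_pred)]


rng = random.Random(0)
clean = [0.5, -0.25, 1.0, 0.0]           # stand-in for image (or latent) values
noisy, true_eps = add_noise(clean, alpha_bar=0.5, rng=rng)
# Here we pass the true noise; in practice the U-Net has to predict it.
recovered = remove_noise(noisy, true_eps, alpha_bar=0.5)
```

If the noise estimate is exact, `recovered` matches `clean` up to floating-point error. The better the U-Net's prediction, the closer the result is to a clean image.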

Stable Diffusion starts by generating random latent noise, then denoises it step by step while applying conditioning to the U-Net, so that the latent noise gradually turns into an image representing your text prompt.
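The generation loop described above can be sketched as a toy program. This is not the real Stable Diffusion code: `toy_text_encoder` and `toy_unet` are hypothetical stand-ins, the noise schedule is made up for illustration, the update rule is a deterministic DDIM-style step chosen for simplicity, and the final decoding of the latent into pixels is omitted.

```python
import math
import random


def toy_text_encoder(prompt, size=8):
    """Hypothetical text encoder: maps a prompt to a fixed conditioning vector."""
    rng = random.Random(prompt)  # same prompt -> same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def toy_unet(latent, alpha_bar, cond):
    """Hypothetical conditioned U-Net: predicts the noise inside `latent`.

    A real U-Net learns this from data. Here we cheat and treat `cond` as the
    clean latent the prompt describes, which yields an exact noise estimate.
    """
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - a * c) / b for x, c in zip(latent, cond)]


def generate(prompt, steps=20, seed=0, size=8):
    cond = toy_text_encoder(prompt, size)
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(size)]  # random latent noise

    # Illustrative schedule: alpha_bar climbs from near 0 (noisy) to 1 (clean).
    alpha_bars = [(i + 1) / (steps + 1) for i in range(steps)] + [1.0]

    for i in range(steps):
        ab_now, ab_next = alpha_bars[i], alpha_bars[i + 1]
        eps = toy_unet(latent, ab_now, cond)              # conditioned noise guess
        x0_pred = [(x - math.sqrt(1.0 - ab_now) * e) / math.sqrt(ab_now)
                   for x, e in zip(latent, eps)]          # estimate of clean latent
        latent = [math.sqrt(ab_next) * x0 + math.sqrt(1.0 - ab_next) * e
                  for x0, e in zip(x0_pred, eps)]         # move to a less noisy level
    return latent


result = generate("a photo of a cat")
```

Because the toy U-Net is an oracle, `result` ends up equal to the conditioning vector for the prompt. With a trained network the loop has the same shape, but each noise prediction is only approximate, which is why multiple steps are needed.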

[Figure: Stable Diffusion Architecture]