Stable Diffusion Architecture: A Brief Overview
(Many thanks to Steins for presenting a clear explanation of the legacy SD 1.5 architecture! You can find his article here.)
Stable Diffusion is built on the "Latent Diffusion Model" architecture, so named because the diffusion process happens in a compressed latent space rather than directly on image pixels:
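Diffusion is the process of gradually adding noise to an image (or its latent) or removing it again. Adding noise is a simple mathematical operation; removing it is the job of a U-Net, which is trained to predict the noise present in its input.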
To generate an image, Stable Diffusion starts from random latent noise and denoises it step by step. At each step the text prompt is applied to the U-Net as conditioning, so the noise gradually turns into a latent that is then decoded into an image representing your prompt.
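The two directions of diffusion described above can be sketched in a few lines of plain Python. This is a toy illustration of the loop structure only, not Stable Diffusion's actual sampler. The names `toy_text_encoder`, `toy_unet`, and `generate` are hypothetical, as are the step size of 0.5 and the 16-value "latent". In particular, the stand-in U-Net simply treats the conditioning vector as the target latent, whereas a real U-Net is a trained network that receives text embeddings as conditioning. The `add_noise` function uses the standard forward-noising formula from DDPM-style models, with `alpha_bar` as the fraction of signal kept.

```python
import math
import random


def add_noise(x0, noise, alpha_bar):
    """Forward diffusion: blend a clean latent with Gaussian noise.

    alpha_bar = 1.0 keeps the clean latent, alpha_bar = 0.0 gives pure noise.
    """
    keep = math.sqrt(alpha_bar)
    mix = math.sqrt(1.0 - alpha_bar)
    return [keep * x + mix * n for x, n in zip(x0, noise)]


def toy_text_encoder(prompt, size):
    """Hypothetical stand-in for the text encoder: prompt -> fixed vector."""
    rng = random.Random(prompt)  # string seeds are deterministic
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def toy_unet(latent, conditioning):
    """Hypothetical stand-in for the U-Net.

    A real U-Net is a trained network; this toy just reports whatever
    separates the current latent from the conditioned target as "noise".
    """
    return [z - c for z, c in zip(latent, conditioning)]


def generate(prompt, size=16, steps=20, seed=0):
    """Reverse diffusion: start from random latent noise and denoise it."""
    rng = random.Random(seed)
    conditioning = toy_text_encoder(prompt, size)
    latent = [rng.gauss(0.0, 1.0) for _ in range(size)]  # pure latent noise
    for _ in range(steps):
        predicted_noise = toy_unet(latent, conditioning)
        # Remove a fraction of the predicted noise at each step (assumed 0.5).
        latent = [z - 0.5 * n for z, n in zip(latent, predicted_noise)]
    return latent, conditioning


latent, target = generate("a photo of a cat")
```

In this toy the latent ends up very close to `target` because each step halves the remaining difference. In the real model, the final latent would instead be passed to a decoder to produce the image.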