Latent space
Latent space is the compressed, abstract representation a diffusion model works in. Instead of manipulating millions of pixels, the model generates in this smaller space and then decodes it into an image.
A full-resolution image is millions of pixels - too much to generate efficiently. Latent space is a much smaller, encoded representation of an image where each point captures meaning (shapes, textures, structure) rather than individual pixels. Latent diffusion models do all their heavy lifting here, which is what makes everyday AI image generation fast enough to be practical.
How it works
The VAE encoder compresses images into latents, and its decoder turns them back into pixels. The diffusion model never sees raw pixels during generation - it starts from random latents and denoises them step by step. Only at the very end does the VAE decode the finished latents into the picture you see.
Why it matters
Working in latent space is the breakthrough that made high-quality generation affordable on consumer hardware. It also explains a lot of behavior: the seed sets the initial latent noise, and operations like image-to-image and inpainting happen by editing latents, not pixels.
Try it in the generator
Put latent space to work right now - free daily generations, commercial license included.
Related terms
- VAEA VAE (Variational Autoencoder) is the component that converts images between full-resolution pixels and the compressed latent space a diffusion model works in - encoding on the way in and decoding on the way out.
- Diffusion modelA diffusion model is the type of AI that powers most modern image generators. It learns to turn random noise into a coherent image by reversing a step-by-step noising process.
- DenoisingDenoising is the core operation of a diffusion model: at each step it predicts and removes a little noise, gradually turning a random field into a clear image.
- SeedA seed is the number used to initialize the random noise an image is built from. The same seed plus the same prompt and settings produces the same image every time, which makes results reproducible.