Core concepts

Latent space

Latent space is the compressed, abstract representation that a latent diffusion model works in. Instead of manipulating every pixel directly, the model generates in this smaller space and then decodes the result into an image.

A full-resolution image holds hundreds of thousands to millions of pixel values - too much to process efficiently at every generation step. Latent space is a much smaller, encoded representation of the image in which each value summarizes local content (shapes, textures, structure) rather than a single pixel. Latent diffusion models do their heavy computation here, which is what makes everyday AI image generation fast enough to be practical.
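To make the size difference concrete, here is a small arithmetic sketch. The figures are assumptions borrowed from Stable Diffusion 1.x-style models (8x downsampling per side, 4 latent channels); other models use different factors, and the function names are purely illustrative.

```python
# Assumed figures for a Stable Diffusion 1.x-style VAE; other models differ.
DOWNSAMPLE = 8        # each latent position covers an 8x8 pixel patch
LATENT_CHANNELS = 4   # values stored per latent position
RGB_CHANNELS = 3      # values stored per pixel


def pixel_values(width: int, height: int) -> int:
    """Number of values in an RGB image of the given size."""
    return width * height * RGB_CHANNELS


def latent_values(width: int, height: int) -> int:
    """Number of values in the latent for an image of the given size."""
    return (width // DOWNSAMPLE) * (height // DOWNSAMPLE) * LATENT_CHANNELS


if __name__ == "__main__":
    w, h = 512, 512
    print(pixel_values(w, h))                          # 786432
    print(latent_values(w, h))                         # 16384
    print(pixel_values(w, h) // latent_values(w, h))   # 48
```

Under these assumptions the model handles roughly 48 times fewer values per denoising step than it would in pixel space.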

How it works

A variational autoencoder (VAE) connects the two spaces: its encoder compresses images into latents, and its decoder turns latents back into pixels. The diffusion model never sees raw pixels during generation - it starts from random latents and denoises them step by step. Only at the very end does the VAE decoder turn the finished latents into the picture you see.
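The flow described above can be sketched in a few lines. This is a toy under loud assumptions, not a real model: the "encoder" is plain 2x2 average pooling, the "decoder" is nearest-neighbour upsampling, and the "denoiser" simply nudges latents toward a fixed target that stands in for whatever a trained network would predict from a prompt. All names are hypothetical; only the order of operations mirrors the text.

```python
import random

# Toy stand-ins only: no neural network is involved anywhere in this sketch.
Grid = list[list[float]]


def encode(image: Grid) -> Grid:
    """Toy encoder: average each 2x2 pixel block into one latent value."""
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, len(image[0]), 2)
        ]
        for y in range(0, len(image), 2)
    ]


def decode(latents: Grid) -> Grid:
    """Toy decoder: repeat each latent value over a 2x2 pixel block."""
    image = []
    for row in latents:
        wide = [value for value in row for _ in range(2)]
        image.append(wide)
        image.append(list(wide))
    return image


def denoise_step(latents: Grid, target: Grid, strength: float = 0.5) -> Grid:
    """Toy denoiser: move every latent part of the way toward the target."""
    return [
        [z + strength * (t - z) for z, t in zip(z_row, t_row)]
        for z_row, t_row in zip(latents, target)
    ]


def generate(target: Grid, steps: int = 20, seed: int = 0) -> Grid:
    """Start from random latents, denoise step by step, decode once at the end."""
    rng = random.Random(seed)
    latents = [[rng.gauss(0.0, 1.0) for _ in row] for row in target]
    for _ in range(steps):
        latents = denoise_step(latents, target)
    return decode(latents)  # the only point where pixels are produced


if __name__ == "__main__":
    reference = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # 4x4 "image"
    target = encode(reference)                            # 2x2 latents
    picture = generate(target)
    print(len(picture), len(picture[0]))                  # 4 4
```

Note that `encode` is not called inside `generate`: in text-to-image generation the encoder is only needed when an existing image has to be brought into latent space.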

Why it matters

Working in latent space is a key breakthrough that made high-quality generation affordable on consumer hardware. It also explains a lot of behavior: the seed determines the initial latent noise, which is why the same seed and settings typically reproduce the same image, and operations like image-to-image and inpainting work on the latents of an encoded image rather than on its pixels.
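A minimal sketch of both behaviors follows, using only Python's standard library. The function names are hypothetical, and the linear blend used for image-to-image is a deliberate simplification: real pipelines add noise according to a scheduler rather than mixing linearly.

```python
import random

Grid = list[list[float]]


def initial_latents(seed: int, height: int = 4, width: int = 4) -> Grid:
    """Draw the starting latent noise from a generator fixed by the seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]


def noised_for_img2img(image_latents: Grid, seed: int, strength: float) -> Grid:
    """Simplified image-to-image start: blend encoded latents with seeded noise.

    strength=0.0 keeps the source latents unchanged; strength=1.0 discards
    them and starts from pure noise, as in ordinary text-to-image.
    """
    noise = initial_latents(seed, len(image_latents), len(image_latents[0]))
    return [
        [(1.0 - strength) * z + strength * n for z, n in zip(z_row, n_row)]
        for z_row, n_row in zip(image_latents, noise)
    ]


if __name__ == "__main__":
    print(initial_latents(42) == initial_latents(42))  # True: same seed, same noise
    print(initial_latents(42) == initial_latents(43))  # False: different seed
```

The strength parameter in this sketch shows why image-to-image results stay close to the source at low values and drift away from it at high values: it controls how much of the encoded image survives in the starting latents.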
