Text-to-image

Text-to-image is the AI workflow where you type a written prompt and the model generates a brand-new image from it - no source image required.

Text-to-image (sometimes written "txt2img") is the most common way to use an AI image generator. You describe what you want in words and the model paints a fresh image that matches your description. Nothing is traced or copied from an existing picture - the result is synthesized from scratch.

How it works

Under the hood, a text encoder turns your prompt into numbers (an embedding). A diffusion model then starts from pure random noise and removes a little of it at each step, with the embedding steering every denoising step toward an image that fits your words. After a set number of steps, the noise has become a coherent picture.
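The loop can be sketched in a few lines of Python. This is a toy illustration, not a real diffusion model: `embed_prompt` and `toy_denoise_step` are hypothetical stand-ins for a trained text encoder and a trained denoising network, and a list of 8 numbers stands in for an image. In a real model the embedding and the image live in different spaces; the toy collapses them so the structure stays readable.

```python
import random

def embed_prompt(prompt, dim=8):
    """Hypothetical stand-in for a text encoder: prompt -> list of numbers."""
    rng = random.Random(prompt)  # the same prompt always gives the same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def toy_denoise_step(x, embedding, rate=0.2):
    """Hypothetical stand-in for the denoiser: remove part of the 'noise'.

    Here 'noise' is simply the gap between the current values and the
    embedding; a real model predicts the noise with a trained network.
    """
    return [xi - rate * (xi - ei) for xi, ei in zip(x, embedding)]

def generate(prompt, seed=0, steps=20, dim=8):
    embedding = embed_prompt(prompt, dim)
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure random noise
    for _ in range(steps):                          # set number of denoising steps
        x = toy_denoise_step(x, embedding)
    return x

def distance(a, b):
    """Euclidean distance between two equal-length lists."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

target = embed_prompt("a red fox in snow")
before = generate("a red fox in snow", seed=1, steps=0)   # the raw noise
after = generate("a red fox in snow", seed=1, steps=20)   # after denoising
print(distance(before, target), distance(after, target))
```

Each pass through the loop shrinks the gap to what the prompt asks for, which is the sense in which the noise "becomes" the picture.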

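How closely the result follows your wording is controlled by the CFG (classifier-free guidance) scale: higher values stick to the prompt more strictly, lower values give the model more freedom. The exact image you get for a given prompt depends on the random starting point, which is set by the seed - reuse a seed with the same prompt, model and settings and you can reproduce the same image.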
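Both knobs are easy to see in miniature. The sketch below uses the standard classifier-free guidance formula on made-up two-number "predictions", and shows that a seeded random generator always produces the same starting noise. The function names `apply_cfg` and `initial_noise` are illustrative, not taken from any particular generator.

```python
import random

def apply_cfg(uncond, cond, cfg_scale):
    """Classifier-free guidance: start from the prediction made without
    the prompt and move toward (or past) the prompt-conditioned one."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]

def initial_noise(seed, dim=4):
    """The random starting point is fully determined by the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

uncond = [0.0, 0.0]  # made-up prediction without the prompt
cond = [1.0, 2.0]    # made-up prediction with the prompt

print(apply_cfg(uncond, cond, 1.0))  # [1.0, 2.0]: exactly the conditioned prediction
print(apply_cfg(uncond, cond, 7.5))  # [7.5, 15.0]: pushed further toward the prompt
print(initial_noise(42) == initial_noise(42))  # True: same seed, same noise
print(initial_noise(42) == initial_noise(43))  # different seed, different noise
```

A scale of 1 just uses the prompt-conditioned prediction; larger values exaggerate the difference the prompt makes at every step.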

Why it matters

Text-to-image is what makes AI generators feel like magic: you go from an idea to a usable visual in seconds, with no drawing skill required. Writing clear, specific prompts - subject, style, lighting, composition - is one of the biggest levers on quality.

Example

A descriptive prompt that names the subject, style, lighting and framing gives the model far more to work with than a single word.

Example prompt

A cozy Scandinavian reading nook by a rain-streaked window, warm morning light, soft film grain, photorealistic, 35mm, shallow depth of field
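If you build prompts in code, it can help to keep those parts separate and join them at the end. A minimal sketch, assuming a hypothetical `build_prompt` helper (not part of any generator's API):

```python
def build_prompt(subject, lighting=(), style=(), framing=()):
    """Join the parts of a prompt into one comma-separated string,
    skipping any part that is empty."""
    parts = [subject, *lighting, *style, *framing]
    return ", ".join(p.strip() for p in parts if p and p.strip())

prompt = build_prompt(
    subject="A cozy Scandinavian reading nook by a rain-streaked window",
    lighting=["warm morning light"],
    style=["soft film grain", "photorealistic"],
    framing=["35mm", "shallow depth of field"],
)
print(prompt)
```

This reproduces the example prompt above, and makes it easy to swap one element (say, the lighting) while holding the rest fixed.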

Frequently asked questions

What is the difference between text-to-image and img2img?

Text-to-image starts from random noise and builds a brand-new picture from your words alone. Img2img starts from an existing image and transforms it according to your prompt, keeping some of the original structure.
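The difference comes down to where denoising starts. The sketch below is a simplification: it blends a source "image" (a short list of numbers) with noise using a hypothetical `strength` parameter, whereas real pipelines add noise according to the model's noise schedule. With no source image, the starting point is pure noise, which is the text-to-image case.

```python
import random

def starting_point(seed, source=None, strength=0.6, dim=4):
    """Return the values that denoising would start from.

    source=None  -> text-to-image: pure random noise.
    source given -> img2img: the source partly replaced by noise;
                    strength 0.0 keeps it intact, 1.0 discards it.
    """
    if source is not None:
        dim = len(source)
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    if source is None:
        return noise
    return [(1.0 - strength) * s + strength * n for s, n in zip(source, noise)]

photo = [0.5, 0.5, 0.5, 0.5]  # made-up stand-in for an existing image

print(starting_point(7))                              # text-to-image: pure noise
print(starting_point(7, source=photo, strength=0.6))  # img2img: noisy version of the photo
```

Because some of the source survives in the starting point, the finished image keeps part of its structure; at full strength nothing survives and the result behaves like text-to-image.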