How to Generate a Short Video with Veo in Gemini

Use the Veo video model inside the Gemini app to turn a text prompt into a short clip, then refine it.

7 min · Beginner

Veo is Google's text-to-video model, available to Gemini subscribers inside the chat app. You describe a scene and it generates a short clip with motion and sound. This guide shows how to write a prompt that produces a usable result and how to iterate when the first take is off.

What you need

  • A Gemini plan that includes Veo video generation
  • A clear visual idea: subject, setting, and motion
  • A few minutes per clip for rendering

Step 1: Open the video tool

In the Gemini app, open the tools menu near the prompt box and pick the video option (often labelled Video or Veo). The interface switches to a mode built for generating clips rather than text replies.

Gemini - tools menu
Tools
------------------
Deep Research
Canvas
> Video (Veo)
Image (Imagen)
Select the video tool to switch into Veo mode.

Step 2: Write a specific prompt

Good video prompts name the subject, the action, the camera movement, and the style; vague prompts produce generic footage. Treat the prompt like a one-line shot description a director would hand a camera operator.

veo-prompt.txt
A golden retriever puppy running across a sunny beach at sunrise,
slow motion, camera tracking alongside, warm cinematic lighting,
soft waves in the background.
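
If you draft prompts outside the app, one way to keep the four parts from being forgotten is a small template. The sketch below is only an illustration: the `ShotPrompt` class and its field names are hypothetical, not part of Gemini or Veo, and it produces plain text that you paste into the prompt box yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotPrompt:
    """Hypothetical container for the parts of a video prompt."""

    subject: str       # who or what is on screen
    action: str        # what the subject does
    camera: str        # how the camera moves
    style: str         # lighting, mood, overall look
    setting: str = ""  # optional background detail

    def render(self) -> str:
        # Subject and action read as one phrase; the rest are comma-separated.
        lead = f"{self.subject} {self.action}".strip()
        parts = [lead, self.setting, self.camera, self.style]
        return ", ".join(p for p in parts if p) + "."


prompt = ShotPrompt(
    subject="A golden retriever puppy",
    action="running across a sunny beach at sunrise",
    camera="slow motion, camera tracking alongside",
    style="warm cinematic lighting",
    setting="soft waves in the background",
)
print(prompt.render())
```

Because every field except `setting` is required, leaving out the camera movement or the style raises an error before you spend time on a render.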

Step 3: Review, then refine

When the clip finishes, watch it and decide what to change. Rather than rewriting from scratch, adjust one element at a time, such as the camera angle or time of day, so you can tell what each change does.

Gemini - refine the clip
You
Same scene but at night with a full moon and cooler blue tones.
Agent
Generating a new version with night lighting...
Iterate by changing a single attribute.
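
The one-change-per-take habit can also be sketched in a few lines of Python if you keep your prompt variations in a notes file or script. Everything here is an assumption made for illustration: the `vary` helper and the attribute names are hypothetical, and nothing in this snippet talks to Gemini.

```python
def vary(base: dict[str, str], **change: str) -> dict[str, str]:
    """Return a copy of `base` with exactly one attribute replaced."""
    if len(change) != 1:
        raise ValueError("change one attribute per take")
    ((key, value),) = change.items()
    if key not in base:
        raise KeyError(f"unknown attribute: {key}")
    return {**base, key: value}


base = {
    "subject": "A golden retriever puppy running across a beach",
    "camera": "camera tracking alongside",
    "lighting": "sunrise, warm cinematic lighting",
}

# Second take: only the lighting changes, as in the refinement message above.
night = vary(base, lighting="night with a full moon, cooler blue tones")

changed = [key for key in base if base[key] != night[key]]
print(changed)  # ['lighting']
```

Keeping each take as a separate record makes it easy to see afterwards which single change produced which difference in the clip.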

Describe motion, not just the scene

The thing that separates video prompts from image prompts is movement. Spell out what moves and how the camera behaves (pan, zoom, track) to avoid a clip that looks like a barely-animated still.

Note

Clips are short by design, often around 8 seconds. For a longer sequence, generate several clips and stitch them in a video editor rather than expecting one long render.
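
To plan a longer sequence, a rough clip count is a ceiling division. The sketch below assumes the 8-second figure mentioned above as a default; actual clip length may differ, and the `clips_needed` function is a hypothetical helper rather than anything provided by Gemini.

```python
import math


def clips_needed(target_seconds: float, clip_seconds: float = 8.0) -> int:
    """Estimate how many clips cover `target_seconds` of footage."""
    if target_seconds <= 0 or clip_seconds <= 0:
        raise ValueError("durations must be positive")
    # Round up: a partial clip still costs one full generation.
    return math.ceil(target_seconds / clip_seconds)


print(clips_needed(30))  # 4
```

A 30-second sequence therefore needs about four generations, before counting any retakes.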

Result

You get a short, downloadable clip from a text description, and a fast loop for refining it. Generating two or three variations and picking the best one is usually quicker than perfecting a single prompt.

Tags
#video #veo #generation #text-to-video