How to Generate Video with Veo in the Gemini App

Use Google's Veo model inside the Gemini app to create a clip from a text prompt and save it.

6 min | Beginner

Veo is Google's video generation model, available to subscribers through the Gemini app and through Google's video tools. The easiest entry point is the Gemini app, where you select Video and type a prompt. This guide covers generating, reviewing, and saving a Veo clip.

What you need

  • A Google account with access to Veo (a Google AI Pro or Ultra plan)
  • The Gemini app or gemini.google.com in a browser
  • A descriptive scene prompt
  • About 5 minutes plus render time

Step 1: Switch Gemini to Video

Open Gemini and look for the tool selector near the prompt bar. Choose Video (or the Veo option if your plan exposes it directly). The composer changes to accept a video prompt and shows aspect ratio choices.

Gemini - Video tool
Gemini Model: Veo
------------------------------------------------------------
Tools: [ Image ] [ *Video* ] [ Deep Research ]
Aspect: 16:9 | 9:16
| A surfer riding a glassy wave at golden hour...
[ Create ]
Selecting the Video tool before typing a prompt.

Step 2: Write a vivid scene prompt

Veo handles natural scenes and motion well and can also generate matching audio. Describe the subject, the action, the setting, and the mood. If you want ambient sound, mention it in the prompt.

veo-prompt.txt
A surfer rides a glassy turquoise wave at golden hour,
spray catching the light, cinematic wide shot,
sound of crashing waves and wind.
Veo can include audio
Unlike many other video generators, Veo can generate a soundtrack and ambient effects from your prompt. Naming the sounds you want (waves, footsteps, the tone of any dialogue) often produces a usable audio bed alongside the video.
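
If you write many prompts, the checklist from this step (subject, action, setting, mood, and optional sounds) can be sketched as a small helper that assembles the text for you. This is only an illustration: the function name `build_veo_prompt` and its parameters are hypothetical and not part of Gemini or Veo. You still paste the resulting text into the Gemini prompt bar yourself.

```python
def build_veo_prompt(subject, action, setting, mood="", sounds=()):
    """Assemble a one-sentence scene prompt from its parts.

    Hypothetical helper for illustration only; it just builds a string
    and does not call Gemini or Veo.
    """
    # Join the visual core, skipping any part left empty.
    visual = " ".join(p.strip() for p in (subject, action, setting) if p.strip())
    parts = [visual]
    if mood.strip():
        parts.append(mood.strip())
    # Name the sounds explicitly, since Veo can generate matching audio.
    named = [s.strip() for s in sounds if s.strip()]
    if named:
        parts.append("sound of " + " and ".join(named))
    return ", ".join(parts) + "."


if __name__ == "__main__":
    print(build_veo_prompt(
        "A surfer",
        "rides a glassy turquoise wave",
        "at golden hour",
        mood="cinematic wide shot",
        sounds=["crashing waves", "wind"],
    ))
```

Keeping the parts separate makes it easy to change one element, such as the mood or the sounds, while leaving the rest of the scene untouched between takes.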

Step 3: Create and wait

Click Create. Veo renders in the background and notifies you when the clip is ready, usually within a couple of minutes. The result appears inline in the conversation with playback controls.

Step 4: Review and refine the prompt

Play the clip and judge the motion and framing. To iterate, reply in the same thread with a refinement such as 'make the camera lower and closer to the water'. Gemini keeps the context and generates a new take.

Gemini - Result
[ > 0:00 / 0:08 ] surfer_goldenhour.mp4
------------------------------------------------------------
Audio: on (waves, wind)
[ Download ] [ Share ] [ Regenerate ]
A finished Veo clip with download and share options.

Step 5: Download the clip

Use the download control under the clip to save the MP4 to your device. Clips carry a SynthID watermark identifying them as AI-generated, which is expected and does not affect normal editing.
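
If you want to confirm that the saved file really is an MP4 before importing it into an editor, a quick check is possible. MP4 files normally begin with an `ftyp` box, so bytes 4 to 8 of the file spell `ftyp`. The sketch below assumes that layout; the function names and the example filename are hypothetical, and the check does not inspect the SynthID watermark or validate the rest of the file.

```python
def header_looks_like_mp4(header: bytes) -> bool:
    """Return True if the first bytes contain an 'ftyp' box marker."""
    # Bytes 0-3 hold the box size; bytes 4-7 hold the box type.
    return len(header) >= 8 and header[4:8] == b"ftyp"


def file_looks_like_mp4(path: str) -> bool:
    """Read the first 12 bytes of a file and apply the header check."""
    with open(path, "rb") as f:
        return header_looks_like_mp4(f.read(12))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical filename): python check_mp4.py surfer_goldenhour.mp4
    for name in sys.argv[1:]:
        print(name, "looks like MP4" if file_looks_like_mp4(name) else "not recognised as MP4")
```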

Result

You now have an eight-second surfing clip with a matching audio bed, generated entirely from text inside Gemini. Because you can refine the result in the same conversation, working with Veo feels more like directing than prompting.

Tags
#veo #gemini #google #text-to-video