
How to Stream Gemini Responses in Node.js

Use the JavaScript SDK to stream a Gemini answer chunk by chunk instead of waiting for the full reply.

7 min | Intermediate

In chat interfaces and CLIs, waiting for a complete Gemini response feels slow. Streaming prints text as it is generated, so the first words appear almost immediately even though the full answer takes just as long to finish. This guide uses the official @google/genai package in Node.js to stream a response chunk by chunk.

What you need

Node.js 20 or newer
A Gemini API key in GEMINI_API_KEY
A project set to ES modules ("type": "module" in package.json), or a file with the .mjs extension as in the example below
About 6 minutes
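
The first two requirements can be checked from code before any request is made. The following is a minimal sketch, assuming a hypothetical helper named checkPrerequisites that is not part of @google/genai; it only inspects the Node.js version and the GEMINI_API_KEY variable listed above.

```javascript
// Hypothetical preflight helper (an illustration, not part of the SDK).
// Returns a list of human-readable problems; an empty list means all good.
function checkPrerequisites(env = process.env, nodeVersion = process.versions.node) {
  const problems = [];

  // process.versions.node looks like "20.11.0"; only the major part matters here.
  const major = Number.parseInt(nodeVersion.split(".")[0], 10);
  if (Number.isNaN(major) || major < 20) {
    problems.push(`Node.js 20 or newer is required, found ${nodeVersion}`);
  }

  if (!env.GEMINI_API_KEY) {
    problems.push("GEMINI_API_KEY is not set");
  }

  return problems;
}

// Report anything that is missing in the current environment.
for (const problem of checkPrerequisites()) {
  console.error(`Prerequisite missing: ${problem}`);
}
```

Passing the environment and version as parameters keeps the check easy to try with made-up values.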

Step 1: Install the JavaScript SDK

Add the official Google GenAI package to your project. It works in both ES module and CommonJS projects, but the example below uses modern import syntax.

terminal

npm install @google/genai

Step 2: Use the streaming method

Instead of generateContent, call generateContentStream. It returns an async iterable, so you loop over chunks with for await and write each piece to stdout as it arrives.

stream.mjs

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({}); // reads GEMINI_API_KEY

const stream = await ai.models.generateContentStream({
  model: "gemini-2.5-flash",
  contents: "Explain what a race condition is, briefly.",
});

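for await (const chunk of stream) {
  // chunk.text can be undefined for chunks that carry no text,
  // and process.stdout.write() throws on undefined, so fall back to "".
  process.stdout.write(chunk.text ?? "");
}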
process.stdout.write("\n");
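
The loop in stream.mjs only relies on each chunk exposing a text property, so the same pattern can be wrapped in a helper that prints the text and also keeps the full reply for later use. The following is a minimal sketch under stated assumptions: streamToString and fakeStream are hypothetical names, and fakeStream is a stand-in that only mimics the shape of the SDK's chunks so the example runs without an API key.

```javascript
// Hypothetical helper: writes each chunk as it arrives and returns the full text.
// `stream` can be any async iterable whose items have an optional `text` field.
async function streamToString(stream, write = (s) => process.stdout.write(s)) {
  let full = "";
  for await (const chunk of stream) {
    const piece = chunk.text ?? ""; // a chunk may carry no text at all
    write(piece);
    full += piece;
  }
  return full;
}

// Stand-in for the SDK stream (assumption: real chunks expose `text` the same way).
async function* fakeStream() {
  yield { text: "A race condition happens " };
  yield {}; // a chunk without text
  yield { text: "when timing decides the result." };
}

streamToString(fakeStream()).then((full) => {
  console.log(`\n[${full.length} characters received]`);
});
```

With the real SDK, the value returned by generateContentStream would take the place of fakeStream().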

Step 3: Run and watch it stream

Run the file. Rather than appearing all at once, the explanation types itself out across the terminal as Gemini produces it.

node - stream

$ node stream.mjs
A race condition happens when two
operations access shared state at the
same time and the result depends on
their timing...

Text appearing chunk by chunk as it streams.

Streaming does not change the total cost

You are billed for the same input and output tokens whether you stream or not. Streaming only changes how fast the user sees the first words.
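
The difference between time to first words and total time can be measured directly. The following is a rough sketch, assuming a hypothetical timeStream helper and a slowFakeStream generator with artificial delays that stands in for a real Gemini stream; actual numbers depend on the model, the prompt, and the network.

```javascript
// Hypothetical timing helper; works with any async iterable of chunks.
async function timeStream(stream) {
  const start = performance.now();
  let firstChunkMs = null;
  let chunks = 0;
  for await (const _chunk of stream) {
    // Record the delay once, when the very first chunk arrives.
    if (firstChunkMs === null) firstChunkMs = performance.now() - start;
    chunks += 1;
  }
  return { firstChunkMs, totalMs: performance.now() - start, chunks };
}

// Stand-in stream: yields `count` chunks, waiting `delayMs` before each one.
async function* slowFakeStream(count = 5, delayMs = 20) {
  for (let i = 0; i < count; i++) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    yield { text: `chunk ${i} ` };
  }
}

timeStream(slowFakeStream()).then(({ firstChunkMs, totalMs, chunks }) => {
  console.log(
    `first chunk after ${firstChunkMs.toFixed(0)} ms, ` +
      `all ${chunks} chunks after ${totalMs.toFixed(0)} ms`
  );
});
```

Without streaming, the user would wait roughly the total time before seeing anything; with streaming, the wait is closer to the first-chunk time.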

Result

Your Node program now renders Gemini output progressively. In a web app, you would forward each chunk to the browser over server-sent events or a WebSocket so the page updates live, which is how chat UIs produce their typing effect.
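
As a rough illustration of the server-sent events route, the sketch below forwards chunks to an HTTP response. It rests on several assumptions: toSseFrame and pipeToSse are hypothetical helpers, the JSON payload shape { text } is an arbitrary choice, and res is expected to behave like Node's http.ServerResponse (writeHead, write, end).

```javascript
// Hypothetical helper: formats one piece of text as a server-sent event frame.
function toSseFrame(text) {
  // JSON encoding keeps newlines inside the text from breaking the
  // line-based SSE format, where a blank line ends the event.
  return `data: ${JSON.stringify({ text })}\n\n`;
}

// Hypothetical helper: forwards chunks from any async iterable to a
// response-like object that has writeHead, write and end.
async function pipeToSse(stream, res) {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  for await (const chunk of stream) {
    // Skip chunks that carry no text instead of sending empty events.
    if (chunk.text) res.write(toSseFrame(chunk.text));
  }
  res.end();
}
```

Inside a request handler, stream would be the result of generateContentStream and res the handler's response object. On the browser side, an EventSource listener could append JSON.parse(event.data).text to the page as each event arrives.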