How to Stream AI Replies Live in Discord by Editing Messages
Make Discord answers feel like ChatGPT by streaming tokens and editing the message as they arrive.
Waiting ten seconds for a wall of text feels broken. The fix is to stream: post a placeholder, then edit the message as tokens arrive so the answer appears to type itself. This guide shows the pattern with a throttle so you do not hit Discord's rate limits.
What you need
- A working discord.js bot (see the auto-reply guide)
- An AI provider that supports streaming responses
- Node 18+ installed
Step 1: Open a streaming request
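Set stream: true on the chat completion request. Instead of a single response, you get an async iterable of chunks, each carrying a small delta of text. In the snippet, ai is your provider's client instance (the call follows the shape of the OpenAI Node SDK) and prompt is the user's message text.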
const stream = await ai.chat.completions.create({
  model: "gpt-5-mini",
  stream: true,
  messages: [{ role: "user", content: prompt }],
});
Step 2: Post a placeholder and edit on a timer
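Send a short placeholder first to get a message you can edit. Accumulate the streamed text in a buffer and flush it to the message on an interval, skipping ticks where nothing new has arrived. Clear the interval and do a final edit when the stream ends. The slice(0, 1990) keeps every edit under Discord's 2,000-character message limit.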
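// Post a placeholder so there is a message to edit.
const sent = await msg.reply("...");
let buffer = "";
let dirty = false; // true while the buffer holds text that has not been flushed
// Flush at most once per second to stay under Discord's rate limits.
const timer = setInterval(() => {
  if (!dirty) return;
  dirty = false;
  // Catch here: a rejected edit inside a timer callback would otherwise go unhandled.
  sent.edit(buffer.slice(0, 1990) || "...").catch(console.error);
}, 1000);
try {
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content || "";
    if (delta) {
      buffer += delta;
      dirty = true;
    }
  }
} finally {
  clearInterval(timer); // stop the timer even if the stream throws
}
// Final edit with the complete text; keep the placeholder if nothing arrived,
// because Discord rejects an empty message.
await sent.edit(buffer.slice(0, 1990) || "...");
Step 3: Run and watch it type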
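One way to watch the pattern work before connecting a real bot is a dry run against stand-ins. The sketch below is only an illustration: fakeStream, fakeMessage, and streamToMessage are hypothetical names, not part of discord.js or any provider SDK. The fake stream yields chunks in the same shape the loop in Step 2 reads, and the fake message just logs each edit.

```javascript
// Hypothetical stand-in for a provider stream: yields chunks shaped like
// { choices: [{ delta: { content } }] } with a short pause between them.
async function* fakeStream(words, delayMs = 50) {
  for (const word of words) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    yield { choices: [{ delta: { content: word + " " } }] };
  }
}

// Hypothetical stand-in for a Discord message: records and logs every edit.
function fakeMessage() {
  const edits = [];
  return {
    edits,
    async edit(content) {
      edits.push(content);
      console.log(`[edit ${edits.length}] ${content}`);
    },
  };
}

// The Step 2 loop, wrapped in a function so it can run against the stand-ins.
async function streamToMessage(stream, sent, intervalMs = 1000) {
  let buffer = "";
  let dirty = false;
  const timer = setInterval(() => {
    if (!dirty) return;
    dirty = false;
    sent.edit(buffer.slice(0, 1990) || "...").catch(console.error);
  }, intervalMs);
  try {
    for await (const chunk of stream) {
      const delta = chunk.choices[0]?.delta?.content || "";
      if (delta) {
        buffer += delta;
        dirty = true;
      }
    }
  } finally {
    clearInterval(timer);
  }
  await sent.edit(buffer.slice(0, 1990) || "...");
  return buffer;
}

// Dry run: one word every 50 ms, flushed every 200 ms.
const words = "streaming makes long answers feel fast because text shows up early".split(" ");
streamToMessage(fakeStream(words), fakeMessage(), 200).then((text) => {
  console.log("final:", text.trim());
});
```

Run it with node and the log should show a handful of growing edits rather than one edit per word, which is the throttling at work. With a real bot, pass the provider stream and the message returned by msg.reply in place of the stand-ins.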
Result
Long answers now feel responsive because users see text appear about a second after the first tokens arrive and then watch it grow. The timer caps edits at one per second no matter how fast tokens stream in, which keeps you under Discord's rate limits even for paragraph-length replies.
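One caveat worth knowing: setInterval fires on schedule whether or not the previous edit has finished, so a slow response from Discord can leave several edits in flight at once. A possible variant, sketched here under the assumption that the message object exposes a promise-returning edit method, skips a tick while an edit is still pending. The createFlusher helper and its parameters are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: throttled editing that never runs two edits at once.
function createFlusher(sent, intervalMs = 1000, limit = 1990) {
  let buffer = "";
  let dirty = false;
  let inFlight = null; // promise for the edit currently running, if any

  const flush = () => {
    // Skip the tick when nothing changed or the previous edit is still pending.
    if (!dirty || inFlight) return;
    dirty = false;
    inFlight = sent
      .edit(buffer.slice(0, limit) || "...")
      .catch((err) => console.error("edit failed:", err))
      .finally(() => {
        inFlight = null;
      });
  };
  const timer = setInterval(flush, intervalMs);

  return {
    // Add streamed text; it goes out on the next free tick.
    push(delta) {
      if (!delta) return;
      buffer += delta;
      dirty = true;
    },
    // Stop the timer, wait for any pending edit, then send the full text.
    async finish() {
      clearInterval(timer);
      if (inFlight) await inFlight;
      await sent.edit(buffer.slice(0, limit) || "...");
      return buffer;
    },
  };
}

// Usage with a stand-in message whose edit takes longer than one tick.
const slowMessage = {
  async edit(content) {
    await new Promise((resolve) => setTimeout(resolve, 300));
    console.log("edited:", content);
  },
};

(async () => {
  const flusher = createFlusher(slowMessage, 100);
  for (const delta of ["Hel", "lo ", "wor", "ld"]) {
    flusher.push(delta);
    await new Promise((resolve) => setTimeout(resolve, 150));
  }
  console.log("final:", await flusher.finish());
})();
```

In the bot, the for await loop would call flusher.push(delta) for each chunk and await flusher.finish() once the stream ends. The trade-off is that a skipped tick delays that text until the next free tick, which is usually invisible to readers.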