How to Fix Rate Limit Errors from an AI API

Diagnose a 429 rate-limit error and add backoff, batching, and concurrency control so your calls stop getting throttled.

9 min · Intermediate

A 429 Too Many Requests error means you hit the provider's rate limit. It is rarely a sign of a bug; it usually means your code sends requests faster than your tier allows. This guide shows how to read the error correctly and fix it with backoff, a concurrency cap, and batching.

What you need

Code that calls an AI API and sometimes fails with 429
Access to the provider's rate-limit docs for your tier
About 10 minutes

Step 1: Read the error and headers

The response usually tells you how long to wait. Log the status and the retry-after header before you change any logic.

zsh - the error

$ node app.js

Error 429: rate limit exceeded

retry-after: 2

The provider is telling you to wait 2 seconds.
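To make the header usable in code, it helps to turn it into a wait time. The sketch below is one way to do that, assuming a fetch-style response object with `status` and `headers.get()`; the function names `retryAfterMs` and `logRateLimit` are illustrative and not part of any provider SDK. Per the HTTP specification, Retry-After can be either a number of seconds or an HTTP date, so both forms are handled.

```javascript
// Convert a Retry-After header value into milliseconds to wait.
// The value may be a number of seconds ("2") or an HTTP date.
// Returns null when the header is missing or unparseable.
function retryAfterMs(headerValue, now = Date.now()) {
  if (headerValue == null || headerValue === '') return null;
  const seconds = Number(headerValue);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const date = Date.parse(headerValue);
  return Number.isNaN(date) ? null : Math.max(0, date - now);
}

// Log what the provider said before changing any retry logic.
// Assumption: `res` is a fetch-style Response.
function logRateLimit(res) {
  const waitMs = retryAfterMs(res.headers.get('retry-after'));
  console.warn(`status=${res.status} retry-after=${waitMs ?? 'none'} ms`);
  return waitMs;
}
```

With the response shown above, `retryAfterMs('2')` returns 2000. Some providers send additional rate-limit headers; check your provider's docs before relying on any header other than the one you actually see in the response.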

Step 2: Retry with exponential backoff

On a 429, wait and try again, doubling the delay each time with a little randomness. This clears short bursts without hammering the API.

withRetry.js

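// Call fn, retrying on 429 with exponential backoff plus jitter.
async function withRetry(fn, max = 5) {
  for (let i = 0; i < max; i++) {
    try {
      return await fn();
    } catch (e) {
      // Rethrow anything that is not a 429, or a 429 on the last attempt.
      if (e.status !== 429 || i === max - 1) throw e;
      // 500 ms, 1 s, 2 s, ... plus up to 250 ms of random jitter.
      const wait = (2 ** i) * 500 + Math.random() * 250;
      await new Promise(r => setTimeout(r, wait));
    }
  }
}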
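If the provider sends a retry-after value, as in Step 1, the retry loop can respect it instead of relying on backoff alone. The variant below is a sketch under one loud assumption: that your client exposes the header on the thrown error as `e.retryAfter` (in seconds). That property name is hypothetical, so adapt it to whatever your client actually provides. The `wait` option exists only so the delay can be swapped out in tests.

```javascript
// Default sleep; injectable so tests can skip real waiting.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay in ms before retry number `attempt` (0-based).
// Assumption: `retryAfterSec` is the retry-after header in seconds, if any.
function backoffDelay(attempt, retryAfterSec, random = Math.random) {
  const backoff = 2 ** attempt * 500 + random() * 250;
  const hinted = Number(retryAfterSec) * 1000;
  // Never retry sooner than the provider asked.
  return Number.isFinite(hinted) ? Math.max(backoff, hinted) : backoff;
}

async function withRetryAfter(fn, { max = 5, wait = sleep } = {}) {
  for (let i = 0; i < max; i++) {
    try {
      return await fn();
    } catch (e) {
      if (e.status !== 429 || i === max - 1) throw e;
      // Assumption: the client exposes the header as e.retryAfter.
      await wait(backoffDelay(i, e.retryAfter));
    }
  }
}
```

Taking the larger of the two delays keeps the exponential growth for repeated failures while ensuring the first retry does not arrive before the provider's stated wait has passed.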

Step 3: Cap concurrency

Firing a hundred calls at once is almost certain to trigger a 429. Limit how many run in parallel so bursts are smoothed out and your request rate stays under the per-minute ceiling.

Logs - paced requests

concurrency = 4

[ok] req 1 [ok] req 2 [ok] req 3 [ok] req 4

[ok] req 5 [ok] req 6 ...

0 rate-limit errors

With a concurrency cap, requests stay under the limit.
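A concurrency cap like the one in the log above can be sketched without any library. The helper below is a minimal illustration, not a definitive implementation; the name `mapWithConcurrency` is made up for this example. It starts up to `limit` runners, and each runner pulls the next item only after its previous call has finished.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight.
// Results keep the same order as `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from(
    { length: Math.min(limit, items.length) },
    () => runner()
  );
  // Rejects as soon as any worker call rejects.
  await Promise.all(runners);
  return results;
}
```

A typical call would look like `await mapWithConcurrency(prompts, 4, (p) => withRetry(() => callApi(p)))`, where `callApi` stands in for your own request function. The right value for `limit` depends on your tier and on how long each request takes, so treat 4 as a starting point rather than a recommendation.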

Step 4: Batch where the API allows it

Fewer, bigger calls

If the API accepts multiple inputs per request, batch them. One call doing ten items uses far less of your request-rate budget than ten separate calls, though any token-based limit still counts the full payload.
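As a rough sketch of batching, the code below splits the inputs into fixed-size groups and sends one request per group. Both helper names are illustrative, and `sendBatch` is a placeholder for your own function: it is assumed to take an array of inputs and resolve to an array of outputs in the same order. Whether your provider supports this, and the maximum batch size, are things to confirm in its docs.

```javascript
// Split `items` into arrays of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Send one request per batch instead of one per item.
// Assumption: `sendBatch` is your own function; it takes an array of
// inputs and resolves to an array of outputs in the same order.
async function processInBatches(items, size, sendBatch) {
  const outputs = [];
  for (const batch of chunk(items, size)) {
    outputs.push(...(await sendBatch(batch)));
  }
  return outputs;
}
```

With 25 items and a batch size of 10, this makes 3 requests instead of 25.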

Result: transient limits are absorbed by backoff, sustained load stays under the ceiling via concurrency caps, and 429 errors stop reaching your users.