How to Add Error Handling and Retries to n8n AI Workflows

Protect AI workflows from rate limits and flaky responses with node retries, an error workflow, and alerts so failures never go silent.

8 min | Intermediate

LLM APIs rate-limit, time out, and occasionally return junk. A workflow that ignores this looks fine in testing and quietly drops data in production. This guide adds three layers of protection: per-node retries, a global error workflow, and an alert so you always know when something failed.

What you need

  • An existing AI workflow that calls an LLM
  • A notification destination such as Slack, Telegram, or email
  • A few minutes to configure node and workflow settings

Step 1: Turn on retries for the LLM node

Open the OpenAI or chat model node, go to its Settings tab, and enable Retry On Fail. Set Max Tries to 3 and Wait Between Tries to a few seconds (5000 ms in the example settings). Short-lived rate-limit errors often clear by the next attempt, so this setting alone can remove a large share of failures.

n8n - Node settings
Settings (OpenAI node)
  Retry On Fail: [x]
  Max Tries: 3
  Wait Between Tries: 5000 ms
  Continue On Fail: [ ]
Per-node retry settings under the Settings tab.

Continue On Fail hides errors
Continue On Fail lets the workflow proceed past a failed node, but it can mask real problems. Use it only when a downstream node explicitly handles the missing or empty result, not as a blanket fix. In recent n8n versions this option may appear as an On Error setting instead of a checkbox.
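
To make the retry behavior concrete, here is a minimal Python sketch of a fixed-wait retry loop. It is an illustration of the idea behind Retry On Fail, not n8n's actual implementation. The function name call_with_retries and its parameters are hypothetical, and it assumes Max Tries counts the first attempt as well as the retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    max_tries: int = 3,      # assumed to mean total attempts, like Max Tries
    wait_ms: int = 5000,     # fixed pause, like Wait Between Tries
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call` up to `max_tries` times, pausing `wait_ms` between attempts."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return call()
        except Exception:
            # Out of attempts: re-raise so the failure stays visible
            # instead of being swallowed.
            if attempt == max_tries:
                raise
            sleep(wait_ms / 1000)
    raise AssertionError("unreachable")
```

Re-raising after the last attempt mirrors leaving Continue On Fail off: once the retries are exhausted, the error still surfaces so something else can react to it.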

Step 2: Build a dedicated error workflow

Create a new workflow whose first node is an Error Trigger. This workflow runs automatically whenever a linked workflow fails, and it receives details about which node broke and why.

Error workflow - Slack message
Workflow failed: {{ $json.workflow.name }}
Node: {{ $json.execution.lastNodeExecuted }}
Error: {{ $json.execution.error.message }}
Time: {{ $now.toISO() }}
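
The template above reads a handful of fields from the data the Error Trigger passes in. The Python sketch below builds the same message from a plain dictionary, which can help when reasoning about what the alert looks like if a field is missing. The function format_alert is hypothetical, and the payload shape is an assumption inferred from the expressions in the template (workflow.name, execution.lastNodeExecuted, execution.error.message).

```python
from typing import Any, Mapping


def format_alert(payload: Mapping[str, Any]) -> str:
    """Build an alert message from an assumed Error Trigger payload shape."""
    workflow = payload.get("workflow") or {}
    execution = payload.get("execution") or {}
    error = execution.get("error") or {}
    lines = [
        # Fall back to placeholders so a partial payload still yields an alert.
        f"Workflow failed: {workflow.get('name', 'unknown')}",
        f"Node: {execution.get('lastNodeExecuted', 'unknown')}",
        f"Error: {error.get('message', 'no message')}",
    ]
    return "\n".join(lines)
```

Falling back to placeholder text is a deliberate choice here: an alert with a missing field is more useful than an alert that fails to send.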

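Step 3: Link the error workflow to your main workflow

Open your main AI workflow, go to its Settings, and set Error Workflow to the one you just built. Now any unhandled failure in the main workflow fires the error workflow and posts an alert. The error workflow runs only for automatic (production) executions, so a manual test run that fails will not trigger it.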

Step 4: Validate model output before using it

Even a successful API call can return malformed content. Add an IF node after the LLM that checks the result is present and well-formed, and route bad output to a fallback or alert instead of passing garbage downstream.
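
As a rough illustration of what "present and well-formed" can mean, the Python sketch below checks a model response that is expected to be a JSON object. It assumes you prompted the model to return JSON. The function check_model_output and the required key "summary" are hypothetical placeholders for whatever your workflow actually expects.

```python
import json
from typing import Any, Optional, Sequence, Tuple


def check_model_output(
    raw: Optional[str],
    required_keys: Sequence[str] = ("summary",),  # hypothetical expected field
) -> Tuple[bool, Any]:
    """Return (True, parsed_data) for usable output, else (False, reason)."""
    if raw is None or not raw.strip():
        return False, "empty response"
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return False, "expected a JSON object"
    missing = [key for key in required_keys if key not in data]
    if missing:
        return False, "missing keys: " + ", ".join(missing)
    return True, data
```

The boolean in the result maps onto the two branches of the IF node: the true branch continues with the parsed data, and the false branch carries a reason you can include in the fallback or alert.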

n8n - execution log
Simulated rate limit on first attempt
OpenAI node: 429 rate_limit_exceeded
Retry 1/3 in 5s...
OpenAI node: success
Workflow finished without firing error trigger

Result

Transient errors now self-heal through retries, genuine failures trigger an instant alert with the exact node and message, and malformed model output is caught before it spreads. Your AI workflows fail loudly and recover quietly.

Tags
#n8n #error-handling #retries #reliability