How to Fix Out-of-Memory and Slow Generations in ComfyUI
Diagnose CUDA out-of-memory errors and sluggish runs, then apply launch flags and workflow tweaks that let big models run on modest cards.
The most common wall ComfyUI users hit is the CUDA out-of-memory error, usually with Flux or SDXL on an 8 GB card. The fix is rarely buying a new GPU; it is a mix of launch flags, lighter model files, and smaller batches. This guide works through the levers in order of effort.
What you need
- A ComfyUI install that is throwing memory errors or running slowly
- Access to your launch command or the run batch file
Step 1: Read the actual error
When a run fails, look at the console. A CUDA out of memory message tells you VRAM ran out, which is different from a missing-file error. Knowing which one you have decides the fix.
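To make the distinction concrete, here is a minimal Python sketch that sorts a chunk of console output into the two cases described above. The function name and the sample messages are hypothetical and exist only for illustration; the sketch assumes PyTorch's usual "out of memory" wording and Python's FileNotFoundError for missing files.

```python
def classify_console_error(log_text: str) -> str:
    """Return 'vram', 'missing_file', or 'other' for a chunk of console output."""
    lowered = log_text.lower()
    # PyTorch reports VRAM exhaustion with the phrase "out of memory".
    if "out of memory" in lowered:
        return "vram"
    # Missing model or input files usually surface as FileNotFoundError
    # or the OS-level message "No such file or directory".
    if "filenotfounderror" in lowered or "no such file or directory" in lowered:
        return "missing_file"
    return "other"


# Illustrative messages, not copied from a real ComfyUI log.
oom_line = "torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB"
missing_line = "FileNotFoundError: [Errno 2] No such file or directory: 'model.safetensors'"

print(classify_console_error(oom_line))      # vram
print(classify_console_error(missing_line))  # missing_file
```

A "vram" result points to the remaining steps in this guide; a "missing_file" result means the fix is a path or download problem, not memory.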
Step 2: Lower the easy levers first
Drop batch_size back to 1, reduce the resolution (1024 instead of 1536), and close other GPU apps like games or a second browser running video. These cost nothing and often clear the error on their own.
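A rough way to see why these levers matter: the work per run grows at least in proportion to the number of pixels times the batch size. The Python sketch below compares two settings on that basis only. It is an approximation under that stated assumption, not a VRAM calculator, and the function name is made up for this example.

```python
def relative_pixel_load(width, height, batch_size, base_width, base_height, base_batch=1):
    """Ratio of (pixels x batch) between a candidate setting and a baseline.

    Assumes memory for intermediate images grows roughly with pixel count;
    real usage also depends on the model and attention implementation.
    """
    candidate = width * height * batch_size
    baseline = base_width * base_height * base_batch
    return candidate / baseline


# Dropping from 1536x1536 to 1024x1024 at batch size 1.
ratio = relative_pixel_load(1024, 1024, 1, base_width=1536, base_height=1536)
print(f"{ratio:.2f}")  # 0.44, i.e. under half the pixels per run
```

By the same measure, going from a batch of 4 back to 1 at a fixed resolution cuts the pixel load to a quarter.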
Step 3: Switch to fp8 or GGUF model files
Large models ship in lighter variants. For Flux, use the fp8 unet and the t5xxl_fp8 text encoder in place of the fp16 files; fp8 stores each weight in one byte instead of two, which roughly halves the memory the weights need. For very tight cards, quantized GGUF model files, loaded through the GGUF custom nodes, go lower still.
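The saving follows from simple arithmetic on bytes per weight. The Python sketch below estimates the size of the weights alone at each precision. The 12-billion parameter count is an assumption used for illustration (roughly the size commonly quoted for the Flux transformer), and the estimate ignores text encoders, the VAE, and working memory.

```python
# Bytes used to store one weight at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0}


def weight_size_gib(param_count, precision):
    """Approximate size of the model weights alone, in GiB."""
    return param_count * BYTES_PER_PARAM[precision] / 1024**3


ASSUMED_PARAMS = 12e9  # assumption for illustration, not a measured value

for precision in ("fp16", "fp8"):
    print(precision, f"{weight_size_gib(ASSUMED_PARAMS, precision):.1f} GiB")
```

Under that assumption the fp16 weights come to roughly 22 GiB and the fp8 weights to roughly 11 GiB, which is why neither fits an 8 GB card outright and why quantized files and offloading flags still matter.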
Step 4: Add a memory launch flag
ComfyUI offers launch flags that trade speed for lower VRAM use. Add --lowvram for cards that keep running out, or --normalvram if auto-detection wrongly put your card into low-VRAM mode. On the Windows portable build, edit the .bat file and append the flag to the line that starts main.py.
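Editing the file by hand is enough for most people. If you would rather script the change, the Python sketch below appends a flag to the line that launches main.py and leaves every other line alone. The helper name is hypothetical, and the sketch works on the file's text in memory, so reading and writing your actual .bat file (after making a backup) is left to you.

```python
def add_launch_flag(bat_text: str, flag: str) -> str:
    """Append `flag` to the line that launches main.py, unless already present."""
    updated = []
    for line in bat_text.splitlines():
        # Only touch the launch line, and skip it if the flag is already there.
        if "main.py" in line and flag not in line.split():
            line = line.rstrip() + " " + flag
        updated.append(line)
    return "\n".join(updated) + "\n"


# Example launch script contents before the edit.
original = (
    r".\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build"
    + "\npause\n"
)

print(add_launch_flag(original, "--lowvram"))
```

Running the helper twice does not duplicate the flag. Edited by hand or by script, the launch line ends up as follows.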
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram
pause
Step 5: Address slowness separately
If runs finish but crawl, the usual cause is that the model is spilling out of VRAM into system RAM, or that you are still on an aggressive --lowvram mode you no longer need. Once your other fixes have freed memory, remove --lowvram so ComfyUI can keep more of the model on the GPU.
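One way to check for spilling is to watch how full VRAM is while a slow run is in progress. The Python sketch below reads memory use from nvidia-smi, which ships with the NVIDIA driver. It assumes an NVIDIA card with nvidia-smi on the PATH, and the function names are invented for this example. A reading near 100% during a slow run is consistent with spilling but does not prove it.

```python
import subprocess


def parse_memory_csv(csv_line):
    """Turn a 'used, total' line (MiB) from nvidia-smi into a usage fraction."""
    used_str, total_str = csv_line.strip().split(",")
    return int(used_str) / int(total_str)


def query_vram_usage():
    """Ask nvidia-smi for the first GPU's memory use as a fraction of total."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_memory_csv(output.splitlines()[0])


# On a machine with an NVIDIA driver, call this while a generation is running:
#   print(f"VRAM in use: {query_vram_usage():.0%}")
```

If usage stays well below the card's capacity after the earlier fixes, that is a good sign --lowvram can come off.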
Result: heavier models running on a modest card without crashing. Apply the cheap fixes first, reach for fp8 or GGUF next, and use launch flags as the final adjustment.