Aider
Teach a small open model your task on a single GPU
Fine-Tune an Open Model on One GPU (QLoRA)
setuproll (@setuproll)
84.0 Overall score
A budget fine-tuning pipeline that adapts an open model to your domain with QLoRA, fits training onto a single consumer GPU, and exports straight to a GGUF file you can run in Ollama. Aimed at builders who have a few thousand labeled examples and want a specialized model without renting a cluster.
84.0 Score
980 Votes
5 Components
Install this build
terminal
pip install unsloth && python train.py --model llama-3.1-8b --4bit
Components
Model
- Llama 3.1 8B
- Qwen3 8B
- Gemma 3 12B
Stack
- Unsloth
- TRL
- PEFT
- bitsandbytes
- Weights & Biases
Hardware
- 1x RTX 4090 24GB
- 16 GB of VRAM works for 8B models in 4-bit
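A rough back-of-envelope check on why an 8B model fits on these cards. This is a sketch only: it counts just the quantized weights and the LoRA adapter parameters. The rank of 16 and the choice of adapted projections are assumptions for illustration, not settings taken from this build, and the nominal 8e9 parameter count is a round number. Activations, optimizer state, quantization constants, and CUDA overhead are ignored, so real usage is higher.

```python
# Back-of-envelope memory arithmetic for QLoRA on an 8B model.
# Assumptions (not from the build itself): LoRA rank 16 on every attention
# and MLP projection, and a nominal 8e9 base parameters.

def weight_gb(n_params: float, bits: int) -> float:
    """Size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits / 8 / 1e9


def lora_params(shapes, rank: int, n_layers: int) -> int:
    """Each adapted (d_in, d_out) matrix adds rank * (d_in + d_out) parameters."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes)
    return per_layer * n_layers


# Llama 3.1 8B projection shapes (d_in, d_out) per transformer layer.
LLAMA_8B_SHAPES = [
    (4096, 4096),    # q_proj
    (4096, 1024),    # k_proj (grouped-query attention, 8 KV heads)
    (4096, 1024),    # v_proj
    (4096, 4096),    # o_proj
    (4096, 14336),   # gate_proj
    (4096, 14336),   # up_proj
    (14336, 4096),   # down_proj
]

base_4bit = weight_gb(8e9, bits=4)
base_fp16 = weight_gb(8e9, bits=16)
adapters = lora_params(LLAMA_8B_SHAPES, rank=16, n_layers=32)

print(f"base weights, 4-bit : {base_4bit:.1f} GB")
print(f"base weights, fp16  : {base_fp16:.1f} GB")
print(f"LoRA adapter params : {adapters:,}")
print(f"adapter size, 16-bit: {weight_gb(adapters, bits=16):.3f} GB")
```

Under these assumptions the frozen base drops from about 16 GB to about 4 GB, and the trainable adapters come to roughly 42 million parameters, around 0.5% of the base model. That gap is what leaves room for activations and optimizer state on a 16 GB or 24 GB card.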
Export
- llama.cpp GGUF convert
- Ollama Modelfile
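For the Ollama side of the export, a minimal sketch of generating a Modelfile from Python. `FROM`, `PARAMETER`, and `SYSTEM` are standard Modelfile instructions; the GGUF path, system prompt, temperature, and the helper name `render_modelfile` are hypothetical placeholders, not values from this build.

```python
# Sketch: render a minimal Ollama Modelfile for an exported GGUF.
# The path, prompt, and temperature below are illustrative placeholders.

def render_modelfile(gguf_path: str, system_prompt: str,
                     temperature: float = 0.7) -> str:
    """Return Modelfile text pointing Ollama at a local GGUF file."""
    lines = [
        f"FROM {gguf_path}",                       # local GGUF produced by the export step
        f"PARAMETER temperature {temperature}",    # sampling temperature
        f'SYSTEM """{system_prompt}"""',           # default system message
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_modelfile(
        gguf_path="./model.gguf",                  # hypothetical output path
        system_prompt="You answer questions about our product docs.",
        temperature=0.2,
    )
    print(text)
```

Saved as `Modelfile`, this would typically be registered with `ollama create my-tuned-model -f Modelfile` and run with `ollama run my-tuned-model`, where `my-tuned-model` is a placeholder name. Depending on the base model and the Ollama version, a `TEMPLATE` instruction matching the chat format used in training may also be needed.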
How it works
- Format your examples as instruction or chat JSONL
- Unsloth loads the base model in 4-bit and trains LoRA adapters
- Track loss and eval samples in Weights & Biases as it runs
- Merge the adapters into the base model, convert to GGUF, and serve the tuned model in Ollama