Run a strong open model offline on a MacBook
Best Local LLM on a Mac (Apple Silicon)
setuproll (@setuproll), overall score 91.0
A simple way to run a capable open model on Apple Silicon: Ollama, a clean chat UI (Open WebUI), and a quant sized to your unified memory. Built for Mac owners who want private, offline coding and chat with minimal time in the terminal.
3.1k votes
5 components
Install this build
terminal
brew install ollama && brew services start ollama && ollama run qwen3:32b
Components
Model
- Qwen3 32B (Q4_K_M)
- Llama 3.3 70B for 64GB+ Macs
Stack
- Ollama (Metal backend)
- Open WebUI
Hardware
- M-series, 36GB+ unified memory
- 70B needs 64GB+
Quantization
- Q4_K_M for balance
- Q5_K_M if memory allows
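The hardware and quantization figures above can be sanity-checked with some quick arithmetic. This is a rough sketch, not an exact sizing tool. It assumes about 4.8 bits per weight for Q4_K_M and about 5.7 for Q5_K_M (approximations, not exact GGUF figures), counts weights only (context and KV cache add more), and uses the 8GB of OS headroom this build recommends. The function names `estimate_gb` and `fits` are made up for illustration.

```shell
# Rough weights-only estimate: params (billions) x bits per weight / 8 = GB.
estimate_gb() {
  # $1 = parameters in billions, $2 = approximate bits per weight (assumption)
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Exit 0 when model size plus headroom fits in unified memory, else exit 1.
fits() {
  # $1 = unified memory in GB, $2 = model size in GB, $3 = headroom in GB (default 8)
  awk -v t="$1" -v m="$2" -v h="${3:-8}" 'BEGIN { exit !(m + h <= t) }'
}

# Usage:
#   estimate_gb 32 4.8   # Qwen3 32B at Q4_K_M     -> 19.2
#   estimate_gb 32 5.7   # Qwen3 32B at Q5_K_M     -> 22.8
#   estimate_gb 70 4.8   # Llama 3.3 70B at Q4_K_M -> 42.0
#   fits 36 19.2 && echo "32B Q4_K_M fits in 36GB"
#   fits 36 42.0 || echo "70B Q4_K_M does not fit in 36GB"
```

On a Mac, `sysctl -n hw.memsize` reports installed memory in bytes if you want to supply the first argument to `fits` from the machine itself.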
How it works
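- Install Ollama; it uses the Metal GPU backend automatically
- Pull a quant sized to leave 8GB headroom for the OS
- Run Open WebUI in Docker and point it at Ollama on port 11434 (inside the container the Mac is reachable as host.docker.internal, not localhost)
- Once the model is downloaded, chat fully offline; no data leaves the machine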
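The steps above might look like the following once Ollama is installed and a model is pulled. Treat it as a sketch under stated assumptions rather than the build author's exact setup. The helper names (`ollama_url`, `start_webui`, `list_models`) are hypothetical. The `3000:8080` port mapping and the `open-webui` volume name follow the example in Open WebUI's own docs and can be changed. The container address assumes Docker Desktop for Mac, where `host.docker.internal` resolves to the Mac.

```shell
OLLAMA_PORT=11434

# Base URL for Ollama as seen from the Mac ("host") or from inside a container.
ollama_url() {
  case "${1:-host}" in
    container) echo "http://host.docker.internal:${OLLAMA_PORT}" ;;
    *)         echo "http://localhost:${OLLAMA_PORT}" ;;
  esac
}

# Start Open WebUI in Docker, pointed at the Ollama server running on the Mac.
start_webui() {
  docker run -d --name open-webui \
    -p 3000:8080 \
    -e OLLAMA_BASE_URL="$(ollama_url container)" \
    -v open-webui:/app/backend/data \
    ghcr.io/open-webui/open-webui:main
}

# Ask the local Ollama server which models are already pulled.
list_models() {
  curl -s "$(ollama_url host)/api/tags"
}

# Usage (nothing runs until you call these):
#   list_models    # JSON list of local models if the server is up
#   start_webui    # then open http://localhost:3000 in a browser
```

Pulling the Docker image and the model both need a network connection the first time; after that, the chat loop stays on the machine.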