Claude CodeRun a strong open model offline on a MacBook

Best Local LLM on a Mac (Apple Silicon)

91.0Overall score

The simplest way to get a capable open model running on Apple Silicon, using Ollama with a clean chat UI and the right quant for your unified memory. Built for Mac owners who want private, offline coding and chat without touching the terminal much.

91.0Score

3.1kVotes

5Components

Install this build

Export

terminal

brew install ollama && ollama run qwen3:32b

Components

Model

Qwen3 32B (Q4_K_M)
Llama 3.3 70B for 64GB+ Macs

Stack

Ollama (Metal backend)
Open WebUI

Hardware

M-series, 36GB+ unified memory
70B needs 64GB+

Quantization

Q4_K_M for balance
Q5_K_M if memory allows

How it works

Install Ollama, it auto-uses the Metal GPU backend
Pull a quant sized to leave 8GB headroom for the OS
Run Open WebUI in Docker and point it at localhost:11434
Chat fully offline, no data leaves the machine

Summary

91.0 score 3.1k votes

0 Reviews

Your rating

Loading discussion...

← All builds