Claude CodeHeavy reasoning and long context, fully self-hosted

Local Reasoning Model on a Workstation

84.0Overall score

Runs a strong open reasoning model with a long context window on a single beefy workstation, using LM Studio for an easy server toggle and model management. For people who want deep step-by-step reasoning on hard problems without sending anything to an API.

84.0Score

1.6kVotes

5Components

Install this build

Export

terminal

lms get deepseek-r1-distill-qwen-32b && lms server start

Components

Model

DeepSeek-R1 Distill Qwen 32B
Qwen3 32B (thinking mode)

Stack

LM Studio
Local server mode
MLX or CUDA runtime

Hardware

RTX 4090 24GB or 64GB Apple Silicon
48GB+ for higher quants

Quantization

Q4_K_M fits 24GB
Q6_K on 48GB for sharper reasoning

How it works

Browse and download the model inside LM Studio
Set a long context window and enable the local server
Toggle thinking mode so it shows its reasoning trace
Call the OpenAI-style endpoint from your own scripts

Summary

84.0 score 1.6k votes

0 Reviews

Your rating

Loading discussion...

← All builds