A
Claude Code logoClaude CodeHeavy reasoning and long context, fully self-hosted

Local Reasoning Model on a Workstation

setuproll@setuproll
84.0Overall score

Runs a strong open reasoning model with a long context window on a single beefy workstation, using LM Studio for an easy server toggle and model management. For people who want deep step-by-step reasoning on hard problems without sending anything to an API.

84.0Score
1.6kVotes
5Components

Install this build

Export
terminal
lms get deepseek-r1-distill-qwen-32b && lms server start

Components

Model

  • DeepSeek-R1 Distill Qwen 32B
  • Qwen3 32B (thinking mode)

Stack

  • LM Studio
  • Local server mode
  • MLX or CUDA runtime

Hardware

  • RTX 4090 24GB or 64GB Apple Silicon
  • 48GB+ for higher quants

Quantization

  • Q4_K_M fits 24GB
  • Q6_K on 48GB for sharper reasoning

How it works

  • Browse and download the model inside LM Studio
  • Set a long context window and enable the local server
  • Toggle thinking mode so it shows its reasoning trace
  • Call the OpenAI-style endpoint from your own scripts

Summary

Runs a strong open reasoning model with a long context window on a single beefy workstation, using LM Studio for an easy server toggle and model management. For people who want deep step-by-step reasoning on hard problems without sending anything to an API.

84.0 score 1.6k votes

0 Reviews

Your rating
Sign in to post

Loading discussion...