How to Run Continue with a Local Model Using Ollama
Install Ollama, pull a code model, and connect it to Continue so your AI assistant runs fully offline.
Running a model locally means your code never leaves your machine and there is no per-request bill. Ollama makes this practical by serving open models behind a simple local endpoint. This guide installs Ollama, pulls a coding model, and wires it into Continue for both chat and autocomplete.
What you need
- The Continue extension installed
- A machine with at least 8 GB of RAM, more for larger models
- A few gigabytes of free disk space per model
- About 10 minutes plus download time
Step 1: Install Ollama
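Download Ollama from its website and install it, or use the install script on Linux. Once installed, it runs as a background service listening on localhost port 11434. To confirm it is running, run ollama --version in a terminal, or request http://localhost:11434, which replies "Ollama is running" when the service is up.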
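The same check can be scripted. The following is a minimal sketch, assuming Ollama is listening on its default address from this step; the helper name ollama_is_running is hypothetical and only the Python standard library is used.

```python
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def ollama_is_running(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: treat as "not running".
        return False


if __name__ == "__main__":
    state = "reachable" if ollama_is_running() else "not reachable"
    print(f"Ollama at {OLLAMA_URL} is {state}")
```

If the script reports that the service is not reachable, start Ollama before continuing to Step 2.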
Step 2: Pull a model
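Use ollama pull to download a model, for example ollama pull qwen2.5-coder:7b. A general coding model like qwen2.5-coder works well for chat and edits. For autocomplete, a smaller model such as qwen2.5-coder:1.5b-base gives faster responses, so pull that one as well. The first pull downloads several gigabytes; after that the model is stored on disk and is not downloaded again.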
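Before editing the Continue config, it can help to confirm that both tags are actually present locally. The sketch below assumes the default Ollama address and reads the list of local models from the /api/tags endpoint; the helper names installed_tags and missing_tags are hypothetical, and REQUIRED_TAGS simply mirrors the two tags used in this guide.

```python
import json
import urllib.request

# Assumption: default Ollama address and the two tags used in this guide.
OLLAMA_URL = "http://localhost:11434"
REQUIRED_TAGS = ["qwen2.5-coder:7b", "qwen2.5-coder:1.5b-base"]


def installed_tags(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list:
    """Ask the local Ollama server which model tags are stored on disk."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["name"] for model in payload.get("models", [])]


def missing_tags(required: list, installed: list) -> list:
    """Return the required tags that are not installed, in their original order."""
    have = set(installed)
    return [tag for tag in required if tag not in have]


if __name__ == "__main__":
    try:
        missing = missing_tags(REQUIRED_TAGS, installed_tags())
    except OSError as exc:
        print(f"Could not reach Ollama: {exc}")
    else:
        for tag in missing:
            print(f"Not pulled yet: ollama pull {tag}")
        if not missing:
            print("All required models are available locally.")
```

Comparing exact tag strings matters here because the model value in the Continue config has to match the tag you pulled.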
Step 3: Add the model to Continue
Open ~/.continue/config.yaml and add an entry with provider set to ollama and the model id matching the tag you pulled. No API key is needed because the model runs locally. Add a second entry with the autocomplete role pointing at the small base model.
models:
  - name: Qwen Coder (local)
    provider: ollama
    model: qwen2.5-coder:7b
    roles:
      - chat
      - edit
  - name: Qwen Autocomplete
    provider: ollama
    model: qwen2.5-coder:1.5b-base
    roles:
      - autocomplete
Step 4: Test offline
Select the local model in the chat panel and ask a question. Then, to prove it is truly local, turn off your network and ask again. The response still arrives because everything runs on your own machine.
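The offline check can also be run outside the editor by sending a prompt straight to Ollama. This is a rough sketch that bypasses Continue entirely: it assumes the default Ollama address, the qwen2.5-coder:7b tag from Step 2, and the /api/generate endpoint with streaming disabled. The helper names build_generate_request and ask_local_model are hypothetical.

```python
import json
import urllib.request

# Assumption: default Ollama address; the model tag is the one pulled in Step 2.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]


if __name__ == "__main__":
    try:
        print(ask_local_model("qwen2.5-coder:7b", "Write a one-line Python hello world."))
    except OSError as exc:
        print(f"Local model did not answer: {exc}")
```

Running this with the network turned off should still produce an answer, since the request only ever goes to localhost.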
Result
Continue now uses a model served by Ollama on localhost, with a larger model for chat and a smaller one for autocomplete. You have an AI assistant that costs nothing per request and keeps your code on your machine.