Continue.dev: The Open-Source AI Coding Assistant Guide

 

You're paying $10-20/month for an AI coding assistant that locks you into one model provider, sends your code to servers you don't control, and gives you zero visibility into how it works. Meanwhile, Continue.dev does the same job for free, runs any model you want (including ones on your own machine), and publishes every line of its source code on GitHub.

Continue has quietly crossed 20,000+ GitHub stars and become the most popular open-source AI coding assistant in 2026. It works as a plugin for both VS Code and JetBrains -- meaning you keep your existing editor, keybindings, and extensions. And unlike proprietary alternatives, you can point it at a local Ollama instance and never send a single line of code to the cloud.

This guide covers everything from installation to advanced configuration, with practical examples you can run today.


📋 What You'll Need

  • A code editor -- VS Code (1.70+) or any JetBrains IDE (IntelliJ, PyCharm, WebStorm, etc.)
  • An AI model provider -- at least one of: an API key from Anthropic/OpenAI/Google, or Ollama installed locally
  • Basic terminal comfort -- you'll edit a YAML config file, nothing scary
  • 10 minutes -- that's genuinely all the setup takes

🚀 Installation and First Run

Getting Continue running takes about three clicks. The harder part is choosing which AI model to wire it up to -- but we'll get there.

VS Code

  1. Open the Extensions panel (Cmd+Shift+X on macOS, Ctrl+Shift+X on Windows/Linux)
  2. Search for "Continue"
  3. Click Install on the extension by Continue.dev
  4. Open the Continue sidebar with Cmd+L (macOS) or Ctrl+L (Windows/Linux)

JetBrains

  1. Open Settings → Plugins → Marketplace
  2. Search for "Continue"
  3. Click Install and restart your IDE
  4. Open the Continue panel with Cmd+J (macOS) or Ctrl+J (Windows/Linux)

After installing, Continue will walk you through connecting your first model. You can pick from a dropdown of providers or skip ahead to manual configuration.

Tip: If you already have an Anthropic or OpenAI API key, the fastest path is to select the provider from the setup wizard and paste your key. You'll be coding with AI in under 60 seconds.

Your First Interaction

Once a model is connected, highlight any code in your editor and press Cmd+L / Ctrl+L (VS Code) or Cmd+J / Ctrl+J (JetBrains) to send it to Continue's chat. Try something simple:

# Highlight this function, then ask Continue: "Add error handling"
import requests

def fetch_user(user_id):
    response = requests.get(f"https://api.example.com/users/{user_id}")
    return response.json()

Continue will stream back a modified version with try/except blocks, status code checks, and timeout handling. You can apply the suggestion directly to your file or copy it manually.
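
What comes back depends on your model, but the suggestion typically looks something like this (a sketch, not verbatim output):

# One plausible result -- details will vary by model
import requests

def fetch_user(user_id):
    try:
        response = requests.get(
            f"https://api.example.com/users/{user_id}", timeout=10
        )
        response.raise_for_status()  # raise on 4xx/5xx status codes
        return response.json()
    except requests.exceptions.Timeout:
        raise RuntimeError(f"Request for user {user_id} timed out")
    except requests.exceptions.RequestException as exc:
        raise RuntimeError(f"Failed to fetch user {user_id}: {exc}")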


⚙️ Configuration Deep Dive

Continue's configuration lives in a YAML file that controls everything: which models to use, what context providers are available, custom rules, and tool integrations. This is where Continue's flexibility leaves proprietary tools in the dust.

Where the Config Lives

Continue stores its configuration at ~/.continue/config.yaml. You can also open it directly from the Continue sidebar by clicking the gear icon.

Here's a practical starter configuration that sets up Claude for chat and a fast model for autocomplete:

# ~/.continue/config.yaml
models:
  - name: "Claude Sonnet"
    provider: anthropic
    model: claude-sonnet-4-20250514
    apiKey: YOUR_ANTHROPIC_API_KEY
    roles:
      - chat
      - edit

  - name: "GPT-4o Mini"
    provider: openai
    model: gpt-4o-mini
    apiKey: YOUR_OPENAI_API_KEY
    roles:
      - autocomplete

rules:
  - Always use TypeScript strict mode
  - Prefer functional components in React
  - Include error handling in all async functions

Model Roles Explained

Continue lets you assign different models to different tasks. This is genuinely smart -- you don't need a $15/million-token model for autocomplete suggestions:

Role         | What It Does                      | Recommended Model
chat         | Powers the chat sidebar           | Claude Sonnet 4, GPT-4o
edit         | Inline code edits                 | Claude Sonnet 4, GPT-4o
autocomplete | Tab completion suggestions        | GPT-4o Mini, Codestral, Starcoder2
apply        | Applies chat suggestions to files | Claude Sonnet 4, GPT-4o

# Use an expensive model for thinking, a cheap one for typing
models:
  - name: "Claude Opus (heavy thinking)"
    provider: anthropic
    model: claude-opus-4-20250918
    apiKey: YOUR_KEY
    roles:
      - chat

  - name: "Codestral (fast autocomplete)"
    provider: mistral
    model: codestral-latest
    apiKey: YOUR_MISTRAL_KEY
    roles:
      - autocomplete

Context Providers

Context providers control what information Continue can pull into its context window when you're chatting. Type @ in the chat input to see available providers:

Provider | Trigger   | What It Does
File     | @file     | Reference any file in your workspace
Code     | @code     | Reference specific functions or classes
Docs     | @docs     | Search indexed documentation
Web      | @web      | Search the web for context
Terminal | @terminal | Include recent terminal output
Codebase | @codebase | Search your entire codebase semantically

The @codebase provider is particularly powerful -- it uses embeddings to find relevant code across your entire project, not just the file you have open.
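
For example, in a hypothetical project (the file and function names here are illustrative):

You: @codebase Where do we validate JWT tokens?

Continue: Token validation lives in `src/auth/middleware.ts`. The
`verifyToken()` helper checks the signature and expiry, and it's applied
to every route under `/api` via the `requireAuth` middleware...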


🏠 Running Local Models with Ollama

This is Continue's killer feature for privacy-conscious developers and teams working with sensitive codebases. You can run everything locally -- zero data leaves your machine.

Step 1: Install Ollama

# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Windows -- download from https://ollama.ai

# Verify installation
ollama --version

Step 2: Pull Models

Use ollama pull (not ollama run) to download models. Here are the best choices for coding tasks in 2026:

# Best all-around coding model (fits on 16GB RAM)
ollama pull qwen2.5-coder:7b

# Stronger reasoning, needs 32GB+ RAM
ollama pull deepseek-r1:32b

# Fast autocomplete specialist
ollama pull starcoder2:3b

# General-purpose with good coding ability
ollama pull llama3.1:8b

Step 3: Configure Continue for Ollama

# ~/.continue/config.yaml
models:
  - name: "Qwen Coder (Local)"
    provider: ollama
    model: qwen2.5-coder:7b
    roles:
      - chat
      - edit

  - name: "StarCoder (Local Autocomplete)"
    provider: ollama
    model: starcoder2:3b
    roles:
      - autocomplete

That's it. No API keys, no billing, no data leaving your network.

Autodetect Mode

If you want Continue to automatically discover all models you have in Ollama:

models:
  - name: "Autodetect"
    provider: ollama

This scans your local Ollama installation and lists every available model in the Continue sidebar. Convenient for experimentation, though you'll want explicit configuration for serious work.

Warning: Local models are significantly slower than cloud APIs unless you have a capable GPU. On a MacBook Pro M3 with 36GB RAM, expect ~20 tokens/second with qwen2.5-coder:7b. Cloud APIs deliver 80-150 tokens/second. The tradeoff is privacy vs. speed.

Local vs. Cloud: When to Use Each

Scenario                     | Use Local (Ollama)         | Use Cloud API
Proprietary codebase         | ✅                          | ⚠️ Check compliance first
Air-gapped environment       | ✅                          | ❌ Not possible
Speed-critical autocomplete  | ❌ Noticeable latency       | ✅
Complex multi-file reasoning | ⚠️ Smaller context windows  | ✅
Cost-sensitive team          | ✅ Free after hardware      | ⚠️ API costs add up
Offline development          | ✅                          | ❌

🤖 Chat, Agent, and Plan Modes

Continue offers three interaction modes, each designed for different types of tasks. Understanding when to use each one is the difference between fighting the tool and flowing with it.

Chat Mode

The simplest mode. You ask questions, get answers. No tools, no file modifications -- just conversation with code context.

Best for: Quick questions about your code, explaining unfamiliar patterns, brainstorming approaches.

You: @file:auth.py What authentication strategy does this use?

Continue: This file implements JWT-based authentication with refresh
token rotation. The `authenticate()` method validates the access token,
and if expired, checks for a valid refresh token in the HttpOnly cookie...

Plan Mode

Plan mode gives the model read-only access to your codebase. It can explore files, understand structure, and create a step-by-step plan -- but it won't modify anything.

Best for: Understanding unfamiliar codebases, planning refactors before executing them, architectural reviews.
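
A Plan session looks like a chat, but the model explores the codebase on its own (illustrative):

You: I want to migrate these Express routes from callbacks to async/await.
     What's the safest order?

Plan: [reads /routes and /middleware] I'd suggest three steps:
1. Convert the utility modules nothing else depends on
2. Migrate route handlers one file at a time, running tests after each
3. Update the error-handling middleware last, since everything touches it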

Agent Mode

Agent mode is where Continue gets powerful. The model can read files, write code, run terminal commands, and iterate until the task is done. It's the closest thing to having a pair programmer who actually types.

Best for: Multi-file edits, new feature implementation, refactoring, debugging.

You: Add input validation to all API endpoints in the /routes directory.
     Use Zod schemas and return 400 responses for invalid input.

Agent: I'll scan the routes directory, create Zod schemas for each
endpoint's expected input, and add validation middleware...
[reads files, writes schemas, modifies routes, runs tests]

Mode Comparison

Capability            | Chat          | Plan         | Agent
Answer questions      | ✅            | ✅           | ✅
Read your files       | Via @ context | ✅ Automatic  | ✅ Automatic
Write/edit code       | ❌            | ❌           | ✅
Run terminal commands | ❌            | ❌           | ✅
Use MCP tools         | ❌            | ⚠️ Read-only  | ✅

MCP Integration

Continue supports the Model Context Protocol (MCP), which lets you give Agent mode access to external tools -- databases, APIs, documentation servers, and more.

# ~/.continue/config.yaml
mcpServers:
  - name: "filesystem"
    command: "npx"
    args:
      - "-y"
      - "@modelcontextprotocol/server-filesystem"
      - "/path/to/allowed/directory"

  - name: "postgres"
    command: "npx"
    args:
      - "-y"
      - "@modelcontextprotocol/server-postgres"
      - "postgresql://localhost:5432/mydb"

With MCP servers configured, Agent mode can query your database, read documentation, or interact with external services -- all within the Continue chat.
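
For example, with the Postgres server above configured (illustrative; the table name is hypothetical):

You: How many users signed up in the last 7 days?

Agent: I'll query the database through the postgres MCP server...
[runs: SELECT COUNT(*) FROM users WHERE created_at > now() - interval '7 days']
342 users signed up in the last 7 days.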


⚔️ Continue.dev vs. GitHub Copilot

This is the comparison everyone asks about. Both tools live in your editor and help you write code faster. The similarities end there.

Feature            | Continue.dev                  | GitHub Copilot
Price              | Free (open-source core)       | Free tier / $10-39/mo
Source code        | ✅ Fully open (Apache 2.0)     | ❌ Proprietary
IDE support        | VS Code, JetBrains            | VS Code, JetBrains, Neovim
Model choice       | ✅ Any model (cloud or local)  | ⚠️ GPT + limited Claude/Gemini
Local/offline mode | ✅ Via Ollama, LM Studio       | ❌ Cloud only
Autocomplete       | ✅                             | ✅ (faster out of the box)
Agent mode         | ✅                             | ✅
GitHub integration | ⚠️ Basic Git                   | ✅ Deep (PRs, Issues, Reviews)
Team management    | Hub ($10/user/mo)             | Business ($19/user/mo)
Data privacy       | ✅ Full control                | ⚠️ Enterprise tier only
MCP support        | ✅                             | ⚠️ Limited
Custom rules       | ✅ config.yaml                 | ✅ .github/copilot-instructions.md

When Continue Wins

  • You need model flexibility. Want to use Claude for chat and a local model for autocomplete? Continue does that. Copilot locks you into its supported models.
  • Privacy is non-negotiable. Regulated industries, defense contracts, healthcare codebases -- if code can't leave your network, Continue + Ollama is the only option in this tier.
  • Budget is tight. Continue's core is free forever. Copilot's free tier limits you to 2,000 completions/month.
  • You want transparency. Continue's source code is auditable. You can verify exactly what data gets sent where.

When Copilot Wins

  • You live in the GitHub ecosystem. Features like the Coding Agent that drafts PRs from Issues, code review suggestions, and pull request summaries are genuinely useful -- and Continue has nothing comparable.
  • Zero-config setup matters. Copilot works out of the box. Continue requires you to choose and configure at least one model provider.
  • Enterprise procurement. Microsoft backing makes Copilot an easier sell to compliance teams than an open-source project.

Tip: You don't have to choose one. Many developers run Copilot for autocomplete (it's faster and requires zero thought) and Continue for chat and agent work (where model choice matters). The two extensions coexist in VS Code without conflicts.

Pricing Comparison

Tier       | Continue.dev                | GitHub Copilot
Free       | ✅ Full features, BYO model  | ✅ 2,000 completions, 50 chats
Individual | $0 (+ API costs)            | $10-39/mo
Team       | $10/user/mo (Hub)           | $19/user/mo (Business)
Enterprise | Custom                      | $39/user/mo

The real cost of Continue depends on your API usage. If you're sending 50 chat messages a day through Claude Sonnet, expect $5-15/month in API costs. If you're running everything locally with Ollama, the cost after hardware is $0.
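
A rough back-of-envelope for that 50-messages-a-day case, assuming ~1,000 input and ~300 output tokens per message and Claude Sonnet rates of about $3 per million input tokens and $15 per million output tokens (check current pricing -- both the token counts and rates here are assumptions):

50 msgs/day × 30 days        = 1,500 messages/month
1,500 × ~1,000 input tokens  = 1.5M input   × $3/M   ≈ $4.50
1,500 × ~300 output tokens   = 0.45M output × $15/M  ≈ $6.75
Total                        ≈ $11/month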


🔧 Troubleshooting Common Issues

"Continue can't connect to Ollama."
Make sure Ollama is running (ollama serve in a terminal). Verify the default port is accessible at http://localhost:11434. If you changed the port, update apiBase in your config:

models:
  - name: "Local Model"
    provider: ollama
    model: qwen2.5-coder:7b
    apiBase: "http://localhost:11434"
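
To confirm the server is actually reachable, you can hit Ollama's model-list endpoint directly (a quick check, assuming the default port):

# Should return JSON listing your pulled models
curl http://localhost:11434/api/tags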

"My model doesn't appear in the Continue dropdown."
Check that the model name in your config exactly matches the output of ollama list. Names are case-sensitive and include the tag (e.g., qwen2.5-coder:7b, not qwen2.5-coder).

"Autocomplete is slow or not showing up."
Tab autocomplete needs to be explicitly enabled. In VS Code, click the Continue button in the status bar and toggle "Enable Tab Autocomplete." For better performance, use a smaller model for autocomplete -- starcoder2:3b or gpt-4o-mini respond much faster than large reasoning models.

"Agent mode asks permission for every action."
This is the default safety behavior. You can configure tool policies in your config to auto-approve specific tools:

toolPermissions:
  readFile: automatic
  writeFile: ask
  runTerminal: ask

"Chat responses reference outdated code."
Continue caches codebase embeddings. If you've made significant changes, trigger a re-index from the command palette: Continue: Reindex Codebase. In JetBrains, use the Continue settings panel.


🚀 What's Next

  • Try the local setup first. Install Ollama, pull qwen2.5-coder:7b, and connect it to Continue. You'll have a fully private AI coding assistant running in 5 minutes -- no API key, no subscription, no cloud dependency.
  • Layer in a cloud model for heavy tasks. Keep Ollama for autocomplete and daily chat, but add a Claude Sonnet or GPT-4o entry in your config for complex reasoning and multi-file refactors.
  • Explore MCP servers. The Model Context Protocol turns Continue's Agent mode from a code editor into a development platform. Start with the filesystem server and expand from there.
  • Read the Claude Code Workflow Guide if you want to complement Continue with a terminal-first agent for deep codebase work.
  • Check out AI Coding Agents Compared to see how Continue stacks up against Cursor, Windsurf, and other tools in the 2026 landscape.

For a deep dive into running AI models entirely on your own hardware, see the Local LLM + Ollama RAG Guide. And if you want to understand what GitHub's alternative offers, read the GitHub Copilot Agent Mode Guide.




