Cline Architecture

Autonomous AI coding agent for VS Code. Uses Claude and other LLMs to plan, edit files, run terminal commands, and automate browsers -- all with human-in-the-loop approval at every step.

VS Code Extension Human-in-the-Loop Plan/Act Workflow MCP Tools Diff-Based Editing Browser Automation
01

System Overview

Cline is an open-source VS Code extension that turns large language models into autonomous coding agents. It combines file editing, terminal execution, browser automation, and extensible MCP tools into a single agentic loop, with the human always in control of what gets approved.

High-Level Architecture
graph TB
  subgraph VSCode["VS Code"]
    direction TB
    EH["Extension Host<br/>(Node.js process)"]
    WV["Webview Panel<br/>(React UI)"]
  end
  subgraph Core["Cline Core"]
    Agent["Agent Loop"]
    Tools["Tool System"]
    Context["Context Manager"]
    Checkpoint["Checkpoint System"]
  end
  subgraph Integrations["Integrations"]
    FS["File System"]
    Term["Terminal"]
    Browser["Browser<br/>(Headless)"]
    MCP["MCP Servers"]
  end
  subgraph Providers["API Providers"]
    Anthropic["Anthropic"]
    OpenRouter["OpenRouter"]
    OpenAI["OpenAI"]
    Others["Gemini / Bedrock /<br/>Azure / Local"]
  end
  WV <-->|"postMessage"| EH
  EH --> Agent
  Agent --> Tools
  Agent --> Context
  Agent --> Checkpoint
  Tools --> FS
  Tools --> Term
  Tools --> Browser
  Tools --> MCP
  Agent <-->|"API calls"| Providers
  style VSCode fill:#1c2030,stroke:#00d4aa,stroke-width:2px,color:#e8ecf4
  style Core fill:#141821,stroke:#4d9cf5,stroke-width:2px,color:#e8ecf4
  style Integrations fill:#141821,stroke:#a78bfa,stroke-width:2px,color:#e8ecf4
  style Providers fill:#141821,stroke:#f59e0b,stroke-width:2px,color:#e8ecf4
🧠

Autonomous Agent

Plans multi-step tasks, executes them tool-by-tool, and self-corrects when errors occur. Monitors linter and compiler output to proactively fix issues.

🛡

Human-in-the-Loop

Every file change and terminal command requires explicit user approval. Diff views let you inspect and edit proposed changes before accepting.

💰

Cost Tracking

Real-time token usage and cost display per request. Supports all major API providers with transparent pricing per model.

02

VS Code Extension Architecture

Cline follows the standard VS Code extension pattern: a Node.js Extension Host handles all backend logic, while a Webview Panel renders the React-based chat UI. Communication flows through VS Code's postMessage API.

Extension Host vs Webview
graph LR
  subgraph WebviewPanel["Webview Panel (React)"]
    ChatUI["Chat UI"]
    Settings["Settings Panel"]
    History["Task History"]
    Approval["Approval Buttons"]
  end
  subgraph ExtHost["Extension Host (Node.js)"]
    ExtEntry["extension.ts<br/>(activation)"]
    ClineAgent["Cline Agent"]
    Registry["Service Registry"]
    Config["Configuration"]
    APIClients["API Clients"]
  end
  subgraph VSCodeAPIs["VS Code APIs"]
    FileAPI["workspace.fs"]
    TermAPI["Terminal API"]
    DiffAPI["Diff Editor"]
    DiagAPI["Diagnostics"]
  end
  ChatUI -->|"postMessage"| ExtEntry
  ExtEntry -->|"postMessage"| ChatUI
  ExtEntry --> ClineAgent
  ClineAgent --> Registry
  ClineAgent --> APIClients
  ClineAgent --> VSCodeAPIs
  Approval -->|"approve/reject"| ExtEntry
  style WebviewPanel fill:#1c2030,stroke:#00d4aa,stroke-width:2px,color:#e8ecf4
  style ExtHost fill:#1c2030,stroke:#4d9cf5,stroke-width:2px,color:#e8ecf4
  style VSCodeAPIs fill:#1c2030,stroke:#a78bfa,stroke-width:2px,color:#e8ecf4

Source Code Organization

// Repository structure
src/
  core/           // Agent loop, tool execution, state machine
  integrations/   // Terminal, browser, file system bridges
  services/       // API provider clients, cost tracking
  hosts/          // VS Code host adapter (webview, commands)
  shared/         // Types, utilities shared across layers
  packages/       // Internal sub-packages
  exports/        // Public extension API
  standalone/     // Non-VS Code deployment target
  extension.ts    // Entry point, activates extension
  config.ts       // Configuration schema and defaults
  registry.ts     // Service/component registry
webview-ui/         // React frontend for the sidebar panel
03

Plan/Act Workflow

Cline operates in two distinct modes. Plan Mode focuses on information gathering and strategy without making changes. Act Mode executes the plan by invoking tools. Users can toggle between modes at any time, and the agent can be configured to start in either mode.

Plan/Act State Machine
stateDiagram-v2
  [*] --> PlanMode
  state PlanMode {
    [*] --> Analyze
    Analyze --> ReadFiles: gather context
    ReadFiles --> SearchCode: find patterns
    SearchCode --> FormPlan: synthesize
    FormPlan --> PresentPlan: show to user
  }
  state ActMode {
    [*] --> SelectTool
    SelectTool --> ExecuteTool: invoke
    ExecuteTool --> WaitApproval: needs approval
    WaitApproval --> ProcessResult: user approves
    WaitApproval --> Abort: user rejects
    ProcessResult --> SelectTool: next step
    ProcessResult --> Complete: done
  }
  PlanMode --> ActMode: user switches / auto
  ActMode --> PlanMode: user switches
  ActMode --> [*]: task complete
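The core of the mode gating can be sketched in a few lines. This is a minimal illustration, not Cline's real controller (which tracks far more state); the class and tool names are assumptions:

```typescript
// Minimal sketch of the Plan/Act mode toggle.
type Mode = "plan" | "act";

class ModeController {
  private mode: Mode = "plan";

  get current(): Mode {
    return this.mode;
  }

  // Plan mode is restricted to read-only tools; Act mode may mutate the workspace.
  allows(tool: string): boolean {
    const readOnly = ["read_file", "search_files", "list_files"];
    return this.mode === "act" || readOnly.includes(tool);
  }

  toggle(): Mode {
    this.mode = this.mode === "plan" ? "act" : "plan";
    return this.mode;
  }
}
```

The key invariant is that Plan Mode can gather any amount of context but can never mutate the workspace; switching modes only widens or narrows the allowed tool set.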
📋

Plan Mode

The agent reads files, searches code with AST and regex, and analyzes project structure. It formulates a step-by-step plan without making any changes to the workspace.

Act Mode

Executes the plan by calling tools: editing files, running commands, launching browsers. Each action requires user approval before proceeding.

🔄

Self-Correction

When linter errors or test failures occur after an edit, the agent detects them via VS Code diagnostics and automatically proposes fixes in subsequent steps.

04

Tool System & Approval Flow

Cline's agent loop works by selecting and invoking tools. The LLM chooses which tool to call, constructs the arguments, and presents the action to the user. No workspace-modifying tool executes until the human clicks "Approve" in the webview; read-only tools such as file reads and searches can be auto-approved.

Tool Execution Pipeline
sequenceDiagram
  participant LLM as LLM Provider
  participant Agent as Agent Loop
  participant UI as Webview UI
  participant User as Human
  participant Tool as Tool Handler
  Agent->>LLM: Send context + available tools
  LLM-->>Agent: Tool call (name + args)
  Agent->>UI: Present action for approval
  UI->>User: Show diff / command / action
  alt Approved
    User->>UI: Click Approve
    UI->>Agent: Approval signal
    Agent->>Tool: Execute tool
    Tool-->>Agent: Result / output
    Agent->>LLM: Feed result back
  else Rejected
    User->>UI: Click Reject
    UI->>Agent: Rejection signal
    Agent->>LLM: Tool was rejected by user
  end
Tool               | Purpose                                 | Approval Required
read_file          | Read file contents for context          | Auto
write_to_file      | Create or overwrite a file              | Yes
apply_diff         | Apply targeted diff edits to files      | Yes
execute_command    | Run shell commands in terminal          | Yes
search_files       | Regex search across files               | Auto
list_files         | List directory contents                 | Auto
browser_action     | Launch browser, click, type, screenshot | Yes
use_mcp_tool       | Call a tool from an MCP server          | Configurable
ask_followup       | Ask the user a clarifying question      | Auto
attempt_completion | Mark the task as done with summary      | Auto
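The approval column above implies a simple gate around tool execution. A hedged sketch, with the approval prompt injected as a callback (in Cline the real prompt round-trips through the webview UI; these names are illustrative):

```typescript
// Sketch of the approval gate around tool execution.
type ApprovalDecision = "approved" | "rejected";

// Read-only / interactive tools that skip the approval prompt, per the table above.
const AUTO_APPROVED = new Set([
  "read_file", "search_files", "list_files", "ask_followup", "attempt_completion",
]);

async function runTool(
  name: string,
  execute: () => Promise<string>,
  requestApproval: (tool: string) => Promise<ApprovalDecision>
): Promise<string> {
  if (!AUTO_APPROVED.has(name)) {
    const decision = await requestApproval(name);
    if (decision === "rejected") {
      // The rejection is fed back to the LLM so it can replan.
      return `[user rejected ${name}]`;
    }
  }
  return execute();
}
```

Because rejection produces a result string rather than an exception, the agent loop can hand it straight back to the LLM as a tool result, and the model replans on the next turn.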
05

File Editing System

Cline uses two approaches to file modification: full file writes for new files, and diff-based edits for surgical changes to existing files. Both present a VS Code diff view so the user can inspect every change before approving.

Diff-Based Edit Flow
graph TB
  LLM["LLM generates<br/>apply_diff tool call"] --> Parse["Parse diff<br/>search/replace blocks"]
  Parse --> Locate["Locate target lines<br/>in source file"]
  Locate --> Apply["Construct<br/>modified content"]
  Apply --> DiffView["Open VS Code<br/>Diff Editor"]
  DiffView --> UserReview["User reviews<br/>side-by-side diff"]
  UserReview -->|"Approve"| Write["Write to disk"]
  UserReview -->|"Edit"| InlineFix["User edits in<br/>diff view directly"]
  InlineFix --> Write
  UserReview -->|"Reject"| Discard["Discard changes"]
  Write --> Diagnostics["VS Code checks<br/>linter / compiler"]
  Diagnostics -->|"Errors found"| AutoFix["Agent proposes fix<br/>in next iteration"]
  Diagnostics -->|"Clean"| Next["Continue to<br/>next tool call"]
  style LLM fill:#1c2030,stroke:#a78bfa,stroke-width:2px,color:#e8ecf4
  style DiffView fill:#1c2030,stroke:#00d4aa,stroke-width:2px,color:#e8ecf4
  style Diagnostics fill:#1c2030,stroke:#f59e0b,stroke-width:2px,color:#e8ecf4

Key insight: Users can directly edit within the diff view before approving. This means you can correct small mistakes without rejecting the entire edit, making the collaboration feel more like pair programming than a request-response cycle.
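The "parse and apply search/replace blocks" step can be sketched as a pure function. This is a deliberate simplification: the real apply_diff handles multiple blocks, fuzzy matching, and match-failure reporting, and the interface here is hypothetical:

```typescript
// Sketch of applying one search/replace edit, the core of a diff-based edit tool.
interface SearchReplaceBlock {
  search: string;  // exact text expected in the file
  replace: string; // text to substitute for it
}

function applyEdit(source: string, block: SearchReplaceBlock): string {
  const idx = source.indexOf(block.search);
  if (idx === -1) {
    // In the real tool a failed match is reported back to the LLM so it can retry.
    throw new Error("search block not found in file");
  }
  // Replace only the first occurrence so the edit stays surgical.
  return source.slice(0, idx) + block.replace + source.slice(idx + block.search.length);
}
```

Anchoring edits on exact search text (rather than line numbers) makes them robust to the file shifting between when the LLM read it and when the edit applies, as long as the anchor text itself is unchanged.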

06

Terminal Command Execution

Cline executes commands directly in the user's VS Code terminal using the Shell Integration API (VS Code v1.93+). This gives it access to the user's full environment -- shell config, PATH, virtual environments, and installed tools.

Terminal Integration Flow
sequenceDiagram
  participant Agent as Agent Loop
  participant UI as Webview UI
  participant User as Human
  participant Shell as VS Code Terminal
  participant Process as Child Process
  Agent->>UI: Propose command
  UI->>User: Show command for approval
  User->>UI: Approve
  UI->>Agent: Approved
  Agent->>Shell: Send via Shell Integration API
  Shell->>Process: Execute in user's shell
  Process-->>Shell: stdout + stderr stream
  Shell-->>Agent: Output captured
  alt Long-running (dev server)
    Agent->>UI: Show "Proceed While Running"
    User->>UI: Click proceed
    Note over Agent,Process: Agent continues while server runs
  end
  alt Command fails
    Process-->>Shell: Non-zero exit code
    Shell-->>Agent: Error output
    Agent->>Agent: Self-correct, propose fix
  end

Shell Integration API

Uses VS Code's v1.93 shell integration to run commands in the actual terminal, capturing output streams and exit codes programmatically.

Background Execution

"Proceed While Running" lets the agent continue working while long-running processes (dev servers, builds) run in the background.

Environment Access

Commands run in the user's actual shell environment with full access to PATH, aliases, virtual environments, and local tooling.
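The capture-and-feed-back shape can be approximated with Node's child_process. To be clear, this is a stand-in: Cline itself goes through VS Code's shell integration API so commands run in the user's real terminal, which this sketch does not do:

```typescript
// Stand-in for command capture using Node's child_process.
import { spawnSync } from "node:child_process";

interface CommandResult {
  output: string;   // combined stdout + stderr, fed back to the LLM
  exitCode: number;
  needsFix: boolean; // a non-zero exit triggers the self-correction path
}

function runCommand(cmd: string, args: string[]): CommandResult {
  const proc = spawnSync(cmd, args, { encoding: "utf8" });
  const exitCode = proc.status ?? 1;
  return {
    output: (proc.stdout ?? "") + (proc.stderr ?? ""),
    exitCode,
    needsFix: exitCode !== 0,
  };
}
```

The essential idea is the same in both cases: output and exit code are captured programmatically, so a failure becomes structured input for the agent's next iteration rather than a dead end.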

07

Browser Automation

Cline can launch a headless browser to test web applications, debug UI issues, and verify changes visually. Using the Computer Use capability, it captures screenshots and console logs to feed back into the agent loop.

Browser Automation Loop
graph LR
  Agent["Agent Loop"] -->|"browser_action"| Launch["Launch<br/>headless browser"]
  Launch --> Navigate["Navigate to URL<br/>(e.g. localhost:3000)"]
  Navigate --> Screenshot["Capture<br/>screenshot"]
  Screenshot --> Console["Capture<br/>console logs"]
  Console --> Analyze["LLM analyzes<br/>visual + logs"]
  Analyze -->|"click / type / scroll"| Interact["Interact with<br/>page elements"]
  Interact --> Screenshot
  Analyze -->|"issue found"| Fix["Agent edits<br/>source code"]
  Fix -->|"re-check"| Navigate
  style Agent fill:#1c2030,stroke:#4d9cf5,stroke-width:2px,color:#e8ecf4
  style Analyze fill:#1c2030,stroke:#00d4aa,stroke-width:2px,color:#e8ecf4
  style Fix fill:#1c2030,stroke:#f59e0b,stroke-width:2px,color:#e8ecf4

Dev server workflow: The agent can start a dev server in the terminal (with "Proceed While Running"), launch a browser to that server, screenshot the result, identify visual bugs, edit the source, and re-check -- a fully autonomous code-build-test-debug loop.
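The screenshot/analyze/interact loop can be sketched with the browser and the LLM verdict injected as interfaces, so only the control flow is shown. BrowserSession, Verdict, and browserLoop are illustrative names; the real tool drives an actual headless Chromium instance:

```typescript
// Dependency-injected sketch of the browser automation loop.
interface BrowserSession {
  screenshot(): string;    // base64 image in the real tool
  consoleLogs(): string[];
}

// What the LLM decides after seeing a screenshot + logs.
type Verdict = { done: boolean; action?: string };

function browserLoop(
  session: BrowserSession,
  analyze: (shot: string, logs: string[]) => Verdict,
  maxSteps = 5
): string[] {
  const actions: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const verdict = analyze(session.screenshot(), session.consoleLogs());
    if (verdict.done) break;
    if (verdict.action) actions.push(verdict.action); // click / type / scroll
  }
  return actions;
}
```

The maxSteps bound matters in practice: each iteration sends a screenshot to the LLM, so an unbounded loop would burn tokens on a page the model cannot make progress with.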

08

MCP (Model Context Protocol)

The Model Context Protocol lets users extend Cline with custom tool servers. MCP servers are separate processes that expose tools over a standardized protocol. Cline can even build and install MCP servers autonomously when a user describes what they need.

MCP Server Architecture
graph TB
  subgraph ClineExt["Cline Extension"]
    Agent["Agent Loop"]
    MCPClient["MCP Client"]
  end
  subgraph MCPServers["MCP Servers (separate processes)"]
    Jira["Jira Server"]
    AWS["AWS Server"]
    DB["Database Server"]
    Custom["Custom Server"]
  end
  subgraph Protocol["MCP Protocol"]
    ListTools["list_tools"]
    CallTool["call_tool"]
    ListRes["list_resources"]
  end
  Agent -->|"use_mcp_tool"| MCPClient
  MCPClient <-->|"stdio / HTTP"| Protocol
  Protocol <--> Jira
  Protocol <--> AWS
  Protocol <--> DB
  Protocol <--> Custom
  style ClineExt fill:#1c2030,stroke:#00d4aa,stroke-width:2px,color:#e8ecf4
  style MCPServers fill:#1c2030,stroke:#a78bfa,stroke-width:2px,color:#e8ecf4
  style Protocol fill:#1c2030,stroke:#f59e0b,stroke-width:2px,color:#e8ecf4
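At the wire level, MCP is JSON-RPC: the client lists a server's tools, then calls one by name. The sketch below shows those two request shapes with a toy in-process dispatcher; the get_weather tool is invented for illustration, and real servers run as separate processes over stdio or HTTP:

```typescript
// Minimal sketch of the MCP tools/list and tools/call exchange.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// A toy tool registry standing in for a real MCP server.
const tools = {
  get_weather: (args: Record<string, unknown>) => `sunny in ${args.city}`,
};

function handleRequest(req: JsonRpcRequest): unknown {
  switch (req.method) {
    case "tools/list":
      return { tools: Object.keys(tools).map((name) => ({ name })) };
    case "tools/call": {
      const { name, arguments: args } = req.params as {
        name: keyof typeof tools;
        arguments: Record<string, unknown>;
      };
      return { content: [{ type: "text", text: tools[name](args) }] };
    }
    default:
      return { error: { code: -32601, message: "method not found" } };
  }
}
```

Because every server speaks these same methods, Cline's use_mcp_tool only needs the server name, tool name, and arguments; the protocol hides whatever the server does internally.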
09

Context Management & Checkpoints

Large codebases can easily exceed LLM context windows. Cline carefully manages what information enters the context, using selective file reading, AST-based code search, and intelligent summarization to stay within token limits.

Context Window Management
graph TB
  subgraph Inputs["Context Sources"]
    SysPrompt["System Prompt<br/>(tools + rules)"]
    UserMsg["User Message"]
    FileContent["File Contents<br/>(selective reads)"]
    ToolResults["Tool Results"]
    Diagnostics["Linter/Compiler<br/>Diagnostics"]
    BrowserData["Screenshots +<br/>Console Logs"]
  end
  subgraph Manager["Context Manager"]
    Budget["Token Budget<br/>Tracking"]
    Truncate["Smart Truncation"]
    Summarize["Conversation<br/>Summarization"]
  end
  subgraph Window["Context Window"]
    Active["Active Context<br/>(fits in window)"]
  end
  Inputs --> Manager
  Manager --> Active
  Budget -.->|"approaching limit"| Summarize
  Summarize -.->|"compress history"| Active
  style Inputs fill:#1c2030,stroke:#4d9cf5,stroke-width:2px,color:#e8ecf4
  style Manager fill:#1c2030,stroke:#00d4aa,stroke-width:2px,color:#e8ecf4
  style Window fill:#1c2030,stroke:#f59e0b,stroke-width:2px,color:#e8ecf4

Checkpoint System

At each step of the agent loop, Cline captures a workspace snapshot. This enables three capabilities:

  • Compare -- View a diff of the current workspace against any prior checkpoint to see exactly what changed.
  • Restore -- Revert the entire workspace to a previous checkpoint. Useful for trying different approaches or undoing a sequence of changes.
  • Non-destructive experimentation -- Users can let the agent try an approach, inspect the result, and roll back if it does not work out.
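The compare/restore behavior can be sketched with an in-memory store. Cline snapshots the actual workspace on disk; the Map of file contents here is only a stand-in, and the class name is illustrative:

```typescript
// In-memory sketch of checkpoint compare/restore.
type Workspace = Map<string, string>; // path -> file contents

class CheckpointStore {
  private snapshots: Workspace[] = [];

  save(ws: Workspace): number {
    this.snapshots.push(new Map(ws)); // copy, so later edits don't mutate it
    return this.snapshots.length - 1;
  }

  // Paths whose contents differ between the checkpoint and the current state.
  compare(id: number, current: Workspace): string[] {
    const snap = this.snapshots[id];
    const paths = new Set([...snap.keys(), ...current.keys()]);
    return [...paths].filter((p) => snap.get(p) !== current.get(p));
  }

  restore(id: number): Workspace {
    return new Map(this.snapshots[id]);
  }
}
```

Because restore returns a copy, rolling back one experiment never corrupts the checkpoint itself, which is what makes trying several approaches from the same starting point safe.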
10

API Providers & Model Support

Cline supports a wide range of LLM providers and models. Users can switch providers at any time, and the extension normalizes the API differences behind a unified client layer.

Provider Architecture
graph LR
  subgraph Agent["Agent Loop"]
    Unified["Unified API Client"]
  end
  subgraph CloudProviders["Cloud Providers"]
    Anthropic["Anthropic<br/>(Claude)"]
    OpenRouter["OpenRouter<br/>(multi-model)"]
    OpenAI["OpenAI<br/>(GPT-4o+)"]
    Gemini["Google Gemini"]
  end
  subgraph Enterprise["Enterprise / Cloud"]
    Bedrock["AWS Bedrock"]
    Azure["Azure OpenAI"]
    Vertex["GCP Vertex"]
  end
  subgraph Local["Local / Fast"]
    Ollama["Ollama"]
    LMStudio["LM Studio"]
    Cerebras["Cerebras"]
    Groq["Groq"]
  end
  Unified --> Anthropic
  Unified --> OpenRouter
  Unified --> OpenAI
  Unified --> Gemini
  Unified --> Bedrock
  Unified --> Azure
  Unified --> Vertex
  Unified --> Ollama
  Unified --> LMStudio
  Unified --> Cerebras
  Unified --> Groq
  style Agent fill:#1c2030,stroke:#00d4aa,stroke-width:2px,color:#e8ecf4
  style CloudProviders fill:#1c2030,stroke:#4d9cf5,stroke-width:2px,color:#e8ecf4
  style Enterprise fill:#1c2030,stroke:#a78bfa,stroke-width:2px,color:#e8ecf4
  style Local fill:#1c2030,stroke:#f59e0b,stroke-width:2px,color:#e8ecf4
Provider           | Type       | Key Models
Anthropic          | Cloud      | Claude Sonnet, Claude Opus, Claude Haiku
OpenRouter         | Aggregator | Multi-provider access (100+ models)
OpenAI             | Cloud      | GPT-4o, o1, o3
Google             | Cloud      | Gemini 2.0 Flash, Gemini Pro
AWS Bedrock        | Enterprise | Claude via AWS, data stays in-region
Azure OpenAI       | Enterprise | GPT models via Azure deployment
Ollama / LM Studio | Local      | Any GGUF model, fully offline
Groq / Cerebras    | Fast       | Hardware-accelerated inference

OpenAI-compatible API: Any provider that implements the OpenAI chat completions API can be used with Cline, making it compatible with virtually any LLM backend including self-hosted inference servers.
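The shape of such a request can be sketched as follows; any backend accepting this body slots in behind the unified client. The baseUrl and model values in the usage example are placeholders, and buildRequest is an illustrative helper, not Cline's API:

```typescript
// Shape of an OpenAI-compatible chat completions request.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  stream: boolean;
}

function buildRequest(
  baseUrl: string,
  model: string,
  prompt: string
): { url: string; body: ChatRequest } {
  return {
    url: `${baseUrl}/chat/completions`,
    body: {
      model,
      messages: [
        { role: "system", content: "You are a coding agent." },
        { role: "user", content: prompt },
      ],
      stream: true, // responses stream token-by-token into the chat UI
    },
  };
}

// Usage against a hypothetical local Ollama endpoint:
// buildRequest("http://localhost:11434/v1", "llama3", "Refactor this function")
```

Since only the base URL and model name vary between backends, switching from a cloud provider to a self-hosted server is a configuration change rather than a code change.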
