Architecture map of Bolt.diy, the open-source prompt-to-app generator — multi-model LLM pipeline, WebContainer runtime, streaming code generation, and one-click deployment, compiled from the public GitHub repository.
Bolt.diy is the open-source fork of Bolt.new by StackBlitz. It enables prompt-to-app development entirely in the browser: users describe an application in natural language, an LLM generates the full-stack code, and a WebContainer sandbox compiles and runs it live, all without a backend server. The platform supports 22 LLM providers, live code editing, an integrated terminal, and one-click deployment to Netlify, Vercel, or GitHub Pages.
```mermaid
graph TD
subgraph User["User Layer"]
BROWSER["Browser Client"]
ELECTRON["Electron Desktop App"]
end
subgraph Frontend["Frontend · Remix + React"]
CHAT["Chat Interface"]
WB["Workbench Panel"]
EDITOR["CodeMirror Editor"]
TERM["Integrated Terminal"]
PREVIEW["Live Preview"]
end
subgraph Server["Server Layer · Remix SSR"]
API_CHAT["api.chat Action"]
API_MODELS["api.models Route"]
API_DEPLOY["api.deploy Routes"]
API_GIT["api.git Routes"]
end
subgraph LLM["LLM Provider Layer"]
MGR["LLM Manager"]
REG["Provider Registry"]
STREAM["Stream Text Engine"]
PROVIDERS["22 Provider Adapters"]
end
subgraph Runtime["Runtime Engine"]
PARSER["Message Parser"]
RUNNER["Action Runner"]
end
subgraph Sandbox["WebContainer Sandbox"]
FS["Virtual Filesystem"]
NODE["Node.js Runtime"]
DEV["Dev Server"]
end
subgraph Deploy["Deploy Targets"]
NETLIFY["Netlify"]
VERCEL["Vercel"]
GHPAGES["GitHub Pages"]
end
BROWSER --> CHAT
ELECTRON --> CHAT
CHAT --> API_CHAT
API_CHAT --> MGR
MGR --> REG
MGR --> STREAM
STREAM --> PROVIDERS
STREAM --> PARSER
PARSER --> RUNNER
RUNNER --> FS
NODE --> DEV
DEV --> PREVIEW
WB --> EDITOR
WB --> TERM
WB --> PREVIEW
EDITOR --> FS
API_DEPLOY --> NETLIFY
API_DEPLOY --> VERCEL
API_DEPLOY --> GHPAGES
```
Bolt.diy runs entirely in the browser with no traditional backend server for code execution. The Remix server handles API proxying for LLM calls and deployment auth, but all code compilation, dependency installation, and execution happens inside StackBlitz's WebContainer — a browser-native Node.js sandbox using WebAssembly and Service Workers.
The core innovation of Bolt.diy is its end-to-end prompt-to-app pipeline. A user types a natural language prompt, which flows through the LLM provider layer, gets parsed into structured artifacts and actions, and is executed in real time inside a WebContainer sandbox. The entire cycle — from prompt to running app — happens in seconds.
```mermaid
sequenceDiagram
participant U as User
participant C as Chat UI
participant S as Remix Server
participant L as LLM Provider
participant P as Message Parser
participant R as Action Runner
participant W as WebContainer
U->>C: Types prompt
C->>S: POST /api/chat
S->>S: Extract model + provider from message
S->>S: Optional: summarize context + select files
S->>L: streamText() via Vercel AI SDK
L-->>S: Streaming response chunks
S-->>C: Data stream with progress annotations
C->>P: Feed streaming tokens to StreamingMessageParser
P->>P: Detect boltArtifact and boltAction tags
P->>R: Emit actions (file, shell, start, build, supabase)
R->>W: Write files to virtual filesystem
R->>W: Spawn shell commands (npm install, etc.)
W->>W: Node.js compiles + runs dev server
W-->>C: Preview iframe shows running app
C-->>U: Live app preview + editable code
```
The user enters a natural language description of the application they want. The Chat UI packages this with the conversation history, selected files, and chosen LLM provider, sending it to the Remix server as a POST request to /api/chat.
The server optionally runs a pre-pass: generating a conversation summary with createSummary() and selecting the most relevant code files via selectContext(). This reduces token consumption and improves generation quality, especially for long conversations.
The Vercel AI SDK's streamText() sends the optimized prompt to the selected provider. Responses stream back token-by-token through a SwitchableStream with a StreamRecoveryManager that retries on 45-second timeouts (up to 2 retries).
When the LLM hits its token limit, the server automatically reconstructs the message chain by appending the partial output plus a continuation prompt, then re-invokes streamText(). This lets a single prompt generate output exceeding the model's context window.
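This auto-continue loop can be sketched as follows. The function and field names here are hypothetical simplifications for illustration, not Bolt.diy's actual identifiers; the real implementation works on the Vercel AI SDK's message and finish-reason types.

```typescript
// Sketch of the auto-continue loop: when a generation stops because it hit
// the token limit ("length"), append the partial output plus a continuation
// prompt and re-invoke the model.

interface ChunkResult {
  text: string;
  finishReason: "stop" | "length"; // "length" means the token limit was hit
}

type GenerateFn = (messages: string[]) => ChunkResult;

const CONTINUE_PROMPT = "Continue your prior response exactly where you left off.";
const MAX_SEGMENTS = 5; // safety cap so a misbehaving model cannot loop forever

function generateWithContinuation(generate: GenerateFn, prompt: string): string {
  const messages = [prompt];
  let output = "";
  for (let i = 0; i < MAX_SEGMENTS; i++) {
    const { text, finishReason } = generate(messages);
    output += text;
    if (finishReason === "stop") break;
    // Rebuild the chain: original prompt, partial answer, continue request.
    messages.push(text, CONTINUE_PROMPT);
  }
  return output;
}
```

The cap on segments is a defensive assumption here; the important property is that the concatenated segments exceed what any single call could return.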
The client-side StreamingMessageParser detects structured XML tags (<boltArtifact> and <boltAction>) in the streaming response. Each action is dispatched to the ActionRunner, which writes files, runs shell commands, or starts dev servers in the WebContainer — all while the LLM is still generating.
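A heavily simplified sketch of this idea is shown below. The real StreamingMessageParser handles partial tags split mid-token, nesting, and per-message state; this version just buffers chunks and emits actions once their closing tag arrives, which is enough to show why parsing can run while the LLM is still generating.

```typescript
// Minimal resumable tag parser: feed it streaming chunks, and it emits a
// callback for each completed <boltAction> even when tags span chunk
// boundaries. Tag and attribute names follow the format described above.

interface ParsedAction {
  type: string;
  filePath?: string;
  content: string;
}

class MiniStreamingParser {
  private buffer = "";
  constructor(private onAction: (a: ParsedAction) => void) {}

  // Feed one streaming chunk; may emit zero or more completed actions.
  push(chunk: string): void {
    this.buffer += chunk;
    const re =
      /<boltAction\s+type="([^"]+)"(?:\s+filePath="([^"]+)")?>([\s\S]*?)<\/boltAction>/;
    let m: RegExpExecArray | null;
    while ((m = re.exec(this.buffer)) !== null) {
      this.onAction({ type: m[1], filePath: m[2], content: m[3] });
      // Drop everything up to and including the parsed action.
      this.buffer = this.buffer.slice(m.index + m[0].length);
    }
  }
}

// Usage: an artifact arriving in two chunks, split mid-file-content.
const actions: ParsedAction[] = [];
const parser = new MiniStreamingParser(a => actions.push(a));
parser.push('<boltArtifact title="Demo"><boltAction type="file" filePath="index.js">console.l');
parser.push('og(1)</boltAction><boltAction type="shell">npm install</boltAction></boltArtifact>');
// actions now holds one file action and one shell action.
```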
Bolt.diy supports 22 LLM providers through a registry-based architecture. Each provider extends a BaseProvider class and is registered in a central Registry. The LLMManager orchestrates provider selection, model lookup, and token limit configuration. Users can switch providers per-prompt via the UI or configure them through environment variables.
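The registry pattern can be sketched like this. Class and method names are simplified stand-ins for Bolt.diy's BaseProvider and LLMManager, and the model entries are illustrative examples rather than the repository's actual model lists.

```typescript
// Registry-based provider architecture: each provider extends a common base
// class, and a manager resolves providers by name at request time.

interface ModelInfo {
  name: string;
  maxTokenAllowed: number;
}

abstract class BaseProvider {
  abstract name: string;
  abstract staticModels: ModelInfo[];

  getModel(modelName: string): ModelInfo | undefined {
    return this.staticModels.find(m => m.name === modelName);
  }
}

class OpenAIProvider extends BaseProvider {
  name = "OpenAI";
  staticModels = [{ name: "gpt-4o", maxTokenAllowed: 8000 }];
}

class OllamaProvider extends BaseProvider {
  name = "Ollama";
  staticModels = [{ name: "llama3", maxTokenAllowed: 8000 }];
}

// The manager is the single lookup point; the UI can switch per prompt.
class LLMManager {
  private providers = new Map<string, BaseProvider>();
  register(p: BaseProvider) {
    this.providers.set(p.name, p);
  }
  getProvider(name: string): BaseProvider {
    const p = this.providers.get(name);
    if (!p) throw new Error(`Unknown provider: ${name}`);
    return p;
  }
}
```

Adding a 23rd provider under this scheme means writing one subclass and one registration call, which is what makes the per-prompt provider switch cheap.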
```mermaid
graph LR
subgraph Registry["Provider Registry"]
BP["BaseProvider"]
REG["registry.ts"]
MGR["LLMManager"]
end
subgraph Cloud["Cloud Providers"]
OAI["OpenAI"]
ANT["Anthropic"]
GGL["Google Gemini"]
GRQ["Groq"]
DS["DeepSeek"]
MST["Mistral"]
COH["Cohere"]
TOG["Together"]
PPX["Perplexity"]
HF["HuggingFace"]
OR["OpenRouter"]
XAI["xAI"]
MS["Moonshot"]
HYP["Hyperbolic"]
FW["Fireworks"]
GH["GitHub Models"]
BED["Amazon Bedrock"]
ZAI["z.AI"]
CEL["Cerebras"]
end
subgraph Local["Local Providers"]
OLL["Ollama"]
LMS["LM Studio"]
OAL["OpenAI-Like"]
end
subgraph SDK["Vercel AI SDK"]
ST["streamText()"]
MC["Model Config"]
end
BP --> OAI & ANT & GGL & GRQ & DS
BP --> MST & COH & TOG & PPX & HF
BP --> OR & XAI & MS & HYP & FW
BP --> GH & BED & ZAI & CEL
BP --> OLL & LMS & OAL
MGR --> REG
REG --> BP
MGR --> ST
ST --> MC
```
Bolt.diy includes special handling for reasoning models like OpenAI's o1 and GPT-5. These models use maxCompletionTokens instead of maxTokens, and unsupported parameters (temperature, topP, frequency/presence penalties) are automatically filtered from the request to prevent API errors.
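A sketch of that normalization, assuming OpenAI-style option names; the reasoning-model detection regex here is illustrative, not the repository's actual check.

```typescript
// For reasoning models, map maxTokens to maxCompletionTokens and strip the
// sampling parameters those models reject.

interface CallOptions {
  maxTokens?: number;
  maxCompletionTokens?: number;
  temperature?: number;
  topP?: number;
  frequencyPenalty?: number;
  presencePenalty?: number;
}

// Illustrative heuristic: treat o1/o3/gpt-5 families as reasoning models.
const isReasoningModel = (model: string) => /^(o1|o3|gpt-5)/.test(model);

function normalizeOptions(model: string, opts: CallOptions): CallOptions {
  if (!isReasoningModel(model)) return opts;
  // Destructuring removes the unsupported keys; rest keeps everything else.
  const { maxTokens, temperature, topP, frequencyPenalty, presencePenalty, ...rest } = opts;
  return { ...rest, maxCompletionTokens: opts.maxCompletionTokens ?? maxTokens };
}
```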
The system resolves completion token limits through a three-tier priority chain:
| Priority | Source | Description |
|---|---|---|
| 1st | Model-specific limit | The maxCompletionTokens value defined on the individual model configuration |
| 2nd | Provider default | A per-provider fallback from PROVIDER_COMPLETION_LIMITS map |
| 3rd | Global safety cap | Hard-coded fallback of 16,384 tokens if neither model nor provider defines a limit |
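The three tiers collapse into a single nullish-coalescing chain. The constant names mirror the text, but the per-provider values below are illustrative examples, not the real map.

```typescript
// Three-tier completion-limit resolution: model-specific limit, then
// provider default, then a hard-coded global safety cap.

const PROVIDER_COMPLETION_LIMITS: Record<string, number> = {
  Anthropic: 64000, // example values, not the actual map contents
  OpenAI: 32768,
};

const GLOBAL_COMPLETION_CAP = 16384; // 3rd tier: hard-coded safety fallback

interface ModelConfig {
  name: string;
  maxCompletionTokens?: number;
}

function resolveCompletionLimit(provider: string, model: ModelConfig): number {
  return (
    model.maxCompletionTokens ??            // 1st: model-specific limit
    PROVIDER_COMPLETION_LIMITS[provider] ?? // 2nd: provider default
    GLOBAL_COMPLETION_CAP                   // 3rd: global safety cap
  );
}
```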
The StreamingMessageParser is the bridge between raw LLM output and executable actions. It processes streaming token chunks in real time, detecting structured XML-like tags that the LLM embeds in its responses. The parser is resumable across chunks, maintaining internal state as tokens arrive one by one.
```mermaid
stateDiagram-v2
[*] --> Scanning: Token arrives
Scanning --> QuickAction: Detect bolt-quick-actions tag
Scanning --> ArtifactOpen: Detect boltArtifact tag
QuickAction --> Scanning: Close tag
ArtifactOpen --> ActionScan: Extract title + type
ActionScan --> ActionOpen: Detect boltAction tag
ActionOpen --> ActionStreaming: Stream content
ActionStreaming --> ActionClose: Detect close tag
ActionClose --> ActionScan: More actions?
ActionScan --> ArtifactClose: Detect close boltArtifact
ArtifactClose --> Scanning: Continue parsing
Scanning --> [*]: Stream ends
```
The LLM generates structured output wrapped in XML-like tags. Each artifact represents a logical unit of work (such as creating a project), and contains one or more actions that are executed sequentially.
The parser emits events through a callback interface, allowing the UI and runtime to react incrementally as the LLM generates code:
| Callback | Trigger | Use Case |
|---|---|---|
| `onArtifactOpen` | Opening `<boltArtifact>` tag parsed | Show artifact title in chat, open workbench |
| `onActionOpen` | Opening `<boltAction>` tag parsed | Create file tab, start terminal output |
| `onActionStream` | Each content chunk inside an action | Live-update editor content, stream terminal output |
| `onActionClose` | Closing action tag detected | Finalize file write, execute shell command |
| `onArtifactClose` | Closing artifact tag detected | Mark artifact complete in chat |
The ActionRunner is the execution backbone of Bolt.diy. It receives parsed actions from the message parser and executes them against the WebContainer sandbox. Actions are serialized through a promise chain to prevent race conditions, and each action tracks its lifecycle through a state machine (pending, running, complete, failed, aborted).
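The promise-chain serialization can be sketched as follows. This is a simplification: the real ActionRunner also carries action IDs, abort signals, and per-type dispatch, but the core trick — each new action awaits the previous chain tail — is the same.

```typescript
// Serialize action execution through a promise chain so that file writes
// and shell commands never run concurrently against the sandbox.

type ActionStatus = "pending" | "running" | "complete" | "failed";

interface Action {
  run: () => Promise<void>;
  status: ActionStatus;
}

class MiniActionRunner {
  // The chain tail; every new action is appended behind it.
  private chain: Promise<void> = Promise.resolve();

  addAction(action: Action): Promise<void> {
    action.status = "pending";
    this.chain = this.chain.then(async () => {
      action.status = "running";
      try {
        await action.run();
        action.status = "complete";
      } catch {
        action.status = "failed"; // a failure must not break the chain
      }
    });
    return this.chain;
  }
}
```

Because each action is appended to the same chain, a slow `npm install` cannot race a subsequent file write even though both are queued immediately as the parser emits them.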
```mermaid
graph TD
subgraph Input["Parser Output"]
PA["Parsed Action"]
end
subgraph Queue["Action Queue"]
ADD["addAction()"]
CHAIN["Promise Chain"]
EXEC["executeAction()"]
end
subgraph Dispatch["Type Dispatcher"]
FILE["runFileAction()"]
SHELL["runShellAction()"]
BUILD["runBuildAction()"]
START["runStartAction()"]
SUPA["runSupabaseAction()"]
end
subgraph WC["WebContainer"]
VFS["Virtual FS Write"]
SPAWN["spawn() Process"]
DEVS["Dev Server"]
end
subgraph State["Action State"]
PENDING["pending"]
RUNNING["running"]
COMPLETE["complete"]
FAILED["failed"]
ABORTED["aborted"]
end
PA --> ADD
ADD --> PENDING
ADD --> CHAIN
CHAIN --> EXEC
EXEC --> RUNNING
EXEC --> FILE & SHELL & BUILD & START & SUPA
FILE --> VFS
SHELL --> SPAWN
BUILD --> SPAWN
START --> DEVS
SUPA --> SPAWN
VFS --> COMPLETE
SPAWN --> COMPLETE
SPAWN --> FAILED
DEVS --> COMPLETE
```
File actions write to the WebContainer's virtual filesystem. Content is streamed character by character as the LLM generates it, so the editor shows live updates before the action completes.
Shell actions execute terminal commands via WebContainer's spawn() API, with output streamed to the integrated terminal. Common uses: npm install, npx create-..., and file operations.
Build actions run npm run build and locate the output directory. The runner inspects the build result to determine where compiled assets land, enabling deployment workflows.
Start actions launch dev servers without blocking the queue, inserting a 2-second delay between consecutive starts. The server URL is captured and routed to the preview iframe for live rendering.
Supabase actions handle database migrations and queries against a connected Supabase instance, supporting migration operations (with file paths) and direct query execution.
The ActionRunner includes defensive error handling: it wraps shell failures in a formatted ActionCommandError, auto-adds -f flags to rm commands when target files don't exist, and provides callbacks for UI alerts (onAlert, onSupabaseAlert, onDeployAlert) so failures surface to the user without crashing the runtime.
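The `rm` normalization described above might look like the sketch below. The exact heuristic is an assumption; only the behavior (an `rm` on a missing file should not fail the action chain) comes from the text, and ActionCommandError is named there.

```typescript
// Defensive shell handling: wrap failures in a typed error and make `rm`
// tolerant of missing targets by adding -f.

class ActionCommandError extends Error {
  constructor(public header: string, public output: string) {
    super(`${header}\n\n${output}`);
    this.name = "ActionCommandError";
  }
}

// Adds -f so `rm` on a nonexistent file exits 0 instead of failing the chain.
// (Heuristic simplified for illustration; it only inspects the first token.)
function normalizeShellCommand(command: string): string {
  const parts = command.trim().split(/\s+/);
  if (parts[0] === "rm" && !parts.includes("-f") && !parts.includes("-rf")) {
    return ["rm", "-f", ...parts.slice(1)].join(" ");
  }
  return command;
}
```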
WebContainer is StackBlitz's browser-native runtime that provides a full Node.js environment without a remote server. It uses WebAssembly to run a lightweight OS and Node.js process, with Service Workers intercepting network requests to simulate localhost. This is how Bolt.diy runs npm, compiles TypeScript, and serves applications — entirely client-side.
```mermaid
graph TD
subgraph Browser["Browser Environment"]
APP["Bolt.diy App"]
SW["Service Worker"]
WASM["WebAssembly Runtime"]
end
subgraph WC["WebContainer"]
VFS["Virtual Filesystem<br/>(in-memory)"]
NODE["Node.js Process"]
NPM["npm / pnpm"]
VITE["Vite / Next.js / etc."]
end
subgraph IO["I/O Layer"]
TERM["Terminal Stream"]
PREV["Preview iframe"]
EDIT["Editor Sync"]
end
APP -->|"boot()"| WASM
WASM --> VFS
WASM --> NODE
APP -->|"writeFile()"| VFS
APP -->|"spawn()"| NODE
NODE --> NPM
NPM -->|"install deps"| VFS
NODE --> VITE
VITE -->|"dev server"| SW
SW -->|"localhost proxy"| PREV
NODE -->|"stdout/stderr"| TERM
VFS -->|"file changes"| EDIT
```
The virtual filesystem is an in-memory filesystem that supports standard Node.js fs operations. Files written by the ActionRunner appear instantly in the CodeMirror editor and persist for the duration of the session.
The spawn() API creates Node.js child processes that run npm scripts, build tools, and dev servers. Each process gets isolated stdio streams routed to the terminal UI.
Dev servers running inside WebContainer are accessible via Service Worker interception. The preview iframe loads localhost URLs that the Service Worker routes to the in-browser server.
WebContainer authentication is handled client-side via auth.client.ts. Dedicated routes (/webcontainer.connect.$id and /webcontainer.preview.$id) manage sandbox connections and preview rendering.
The Bolt.diy workbench is a split-pane IDE experience built with React components. The left panel shows the chat conversation; the right panel contains the code editor, terminal, and live preview. Users can edit AI-generated code directly, view diffs, lock files from modification, and switch between multiple preview tabs.
```mermaid
graph LR
subgraph Header["Header Bar"]
LOGO["Logo + Nav"]
PROV["Provider Selector"]
SETTINGS["Settings"]
end
subgraph Left["Left Panel"]
SIDEBAR["Chat History Sidebar"]
CHATUI["Chat Messages"]
INPUT["Prompt Input + File Attach"]
end
subgraph Right["Right Panel · Workbench"]
subgraph Tabs["Tab Bar"]
CODE["Code Tab"]
PREVTAB["Preview Tab"]
end
subgraph Editor["Code Editor"]
CM["CodeMirror"]
FILETREE["File Tree"]
DIFF["Diff View"]
end
subgraph Bottom["Bottom Panel"]
TERMINAL["Terminal"]
LOGS["Console Output"]
end
subgraph Preview["Preview Panel"]
IFRAME["Live Preview iframe"]
QRCODE["QR Code (Mobile)"]
end
end
CHATUI --> Right
CM --> FILETREE
IFRAME --> QRCODE
```
The code editor is a full-featured CodeMirror instance with syntax highlighting, file-tree navigation, and real-time sync with the WebContainer filesystem; a diff view tracks AI-generated changes.
The chat panel is a conversational UI with message history, file attachments, per-message model selection, and streaming response display. Artifacts appear inline as collapsible panels.
The settings panel provides configurable provider toggles, API key management, local model health monitoring, MCP (Model Context Protocol) configuration, and theme preferences.
Git integration can clone repositories, import projects from GitHub or GitLab, and push generated apps to new repos; dedicated components handle authentication, branch selection, and repo creation.
Bolt.diy supports one-click deployment to three platforms: Netlify, Vercel, and GitHub Pages. The deployment flow runs a build action inside the WebContainer, packages the output, and pushes it to the selected platform via API. Each platform has dedicated API routes for authentication and deployment.
```mermaid
graph LR
subgraph Build["Build Phase"]
BA["Build Action"]
WC["WebContainer"]
OUT["Build Output<br/>(dist/ or build/)"]
end
subgraph Package["Package Phase"]
ZIP["ZIP Archive"]
API["Deploy API Routes"]
end
subgraph Targets["Deploy Targets"]
N["Netlify<br/>api.netlify-deploy"]
V["Vercel<br/>api.vercel-deploy"]
G["GitHub Pages<br/>api.git Routes"]
end
subgraph Auth["Auth Routes"]
NA["api.netlify-user"]
VA["api.vercel-user"]
GA["api.github-*"]
end
BA --> WC --> OUT
OUT --> ZIP
ZIP --> API
API --> N & V & G
NA --> N
VA --> V
GA --> G
```
| Platform | Auth Method | API Routes | Features |
|---|---|---|---|
| Netlify | OAuth token via cookie | `api.netlify-deploy`, `api.netlify-user` | Direct site deployment, automatic SSL, CDN distribution |
| Vercel | API token via cookie | `api.vercel-deploy`, `api.vercel-user` | Project deployment, serverless functions, edge network |
| GitHub Pages | GitHub OAuth | `api.git`, `api.github-*` | Push to repo, automatic Pages build, branch selection |
Users can also download their entire project as a ZIP file without deploying. The download packages all files from the WebContainer's virtual filesystem into a client-side archive, with no server round-trip required.
Bolt.diy uses nanostores for lightweight, framework-agnostic reactive state management. The application maintains 20 discrete stores covering everything from chat history and editor state to deployment connections and streaming progress. Stores are organized by domain and use MapStore for complex state and atom for simple values.
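The contract these stores rely on is small enough to sketch. Below is a minimal re-implementation of the nanostores-style `atom` used for simple values; the real stores import `atom` and `map` from the nanostores package rather than defining them, and the theme store shown is only an example of the shape.

```typescript
// Minimal reactive atom in the style of nanostores: holds a value, notifies
// subscribers on change, and returns an unsubscribe function.

type Listener<T> = (value: T) => void;

function atom<T>(initial: T) {
  let value = initial;
  const listeners = new Set<Listener<T>>();
  return {
    get: () => value,
    set(next: T) {
      value = next;
      listeners.forEach(fn => fn(next));
    },
    subscribe(fn: Listener<T>) {
      fn(value); // subscribe() delivers the current value immediately
      listeners.add(fn);
      return () => listeners.delete(fn); // unsubscribe
    },
  };
}

// e.g. a theme store like app/lib/stores/theme.ts might expose:
const themeStore = atom<"light" | "dark">("dark");
```

Because stores are plain objects with no framework dependency, the same store can back a React component (via a hook) and the Electron shell.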
```mermaid
graph TD
subgraph Core["Core Stores"]
CHAT["chat.ts<br/>Conversation state"]
FILES["files.ts<br/>Project files"]
EDITOR["editor.ts<br/>Editor config"]
WB["workbench.ts<br/>Workspace layout"]
TERM["terminal.ts<br/>Terminal output"]
end
subgraph Integration["Integration Stores"]
GH["github.ts"]
GHC["githubConnection.ts"]
GLC["gitlabConnection.ts"]
NET["netlify.ts"]
VER["vercel.ts"]
SUP["supabase.ts"]
end
subgraph Feature["Feature Stores"]
SET["settings.ts"]
THM["theme.ts"]
PRF["profile.ts"]
LOG["logs.ts"]
STR["streaming.ts"]
end
subgraph UI["UI Stores"]
PRV["previews.ts"]
QR["qrCodeStore.ts"]
MCP["mcp.ts"]
TAB["tabConfigurationStore.ts"]
end
CHAT --> FILES
CHAT --> STR
FILES --> EDITOR
FILES --> WB
WB --> PRV
WB --> TERM
SET --> THM
```
| Store | Domain | Purpose |
|---|---|---|
| `chat.ts` | Core | Conversation messages, active chat ID, message history persistence |
| `files.ts` | Core | Project file tree, file contents, file locking state, modified flags |
| `workbench.ts` | Core | Panel layout, active tabs, split pane sizes, workbench visibility |
| `streaming.ts` | Feature | Real-time streaming progress, token counts, generation status |
| `mcp.ts` | Integration | Model Context Protocol server configs, tool availability, MCP status |
| `netlify.ts` / `vercel.ts` | Deploy | Platform auth state, deployment progress, site URLs |
| `supabase.ts` | Integration | Supabase connection state, migration history, query results |
The Bolt.diy codebase is organized as a Remix application with clear separation between client components, server-side API logic, and shared modules. The app/ directory contains all application code, with framework-specific routing in routes/, reusable UI in components/, and business logic in lib/.
```mermaid
graph TD
subgraph Root["Repository Root"]
PKG["package.json"]
VITE["vite.config.ts"]
DOCKER["Dockerfile + docker-compose"]
ELEC["electron/"]
FN["functions/"]
end
subgraph App["app/"]
ENTRY_C["entry.client.tsx"]
ENTRY_S["entry.server.tsx"]
ROOT_TSX["root.tsx"]
end
subgraph Routes["app/routes/ · 38 routes"]
R_IDX["_index.tsx"]
R_CHAT["chat.$id.tsx"]
R_API["api.chat.ts"]
R_MOD["api.models.ts"]
R_DEP["api.*-deploy.ts"]
R_GIT["api.git*.ts"]
R_WC["webcontainer.*.tsx"]
end
subgraph Components["app/components/"]
C_CHAT["chat/"]
C_WB["workbench/"]
C_ED["editor/codemirror/"]
C_DEP["deploy/"]
C_GIT["git/"]
C_SET["@settings/"]
C_HEAD["header/"]
C_SIDE["sidebar/"]
C_UI["ui/"]
end
subgraph Lib["app/lib/"]
L_SRV[".server/llm/"]
L_MOD["modules/llm/"]
L_RT["runtime/"]
L_WC["webcontainer/"]
L_STORE["stores/ (20 files)"]
L_SVC["services/"]
L_HOOKS["hooks/"]
L_PERSIST["persistence/"]
end
App --> Routes
App --> Components
App --> Lib
Routes --> L_SRV
Components --> L_STORE
L_RT --> L_WC
L_SRV --> L_MOD
```
| Layer | Technology | Role |
|---|---|---|
| Framework | Remix | Full-stack React framework with SSR, file-based routing, and server actions |
| UI | React | Component library for the entire workbench, chat, and settings UI |
| Styling | UnoCSS | Atomic CSS engine for utility-first styling with zero runtime |
| Build | Vite | Development server and production bundler |
| State | nanostores | Lightweight reactive state management (framework-agnostic) |
| AI SDK | Vercel AI SDK | Unified interface for LLM streaming across providers |
| Editor | CodeMirror 6 | Extensible code editor with syntax highlighting and file navigation |
| Runtime | WebContainer | Browser-native Node.js sandbox via WebAssembly + Service Workers |
| Desktop | Electron | Native desktop wrapper for Windows, macOS, and Linux |
| Edge | Cloudflare Workers | Serverless deployment via Wrangler (optional production hosting) |
| Package Manager | pnpm | Fast, disk-efficient package manager |
| Service | File | Responsibility |
|---|---|---|
| GitHub API | `githubApiService.ts` | Repository CRUD, branch listing, template access, user authentication |
| GitLab API | `gitlabApiService.ts` | Project listing, branch management, GitLab-specific integration |
| Import/Export | `importExportService.ts` | Project serialization, ZIP download, repo import |
| Model Health | `localModelHealthMonitor.ts` | Monitors local LLM availability (Ollama, LM Studio) and connection status |
| MCP Service | `mcpService.ts` | Model Context Protocol tool invocation, server configuration management |
The LLM integration is split across two directories: app/lib/.server/llm/ contains server-only code (streaming, context selection, recovery) that runs in the Remix server action, while app/lib/modules/llm/ contains shared code (provider registry, base provider class, types) used by both client and server. The .server convention in Remix ensures that streaming logic and API keys never leak to the browser bundle.