Interactive architecture map of xterm.js -- the terminal emulator component powering VS Code, Hyper, and Azure Cloud Shell. Four-layer architecture with pluggable renderers and an addon ecosystem.
xterm.js is a frontend-only terminal emulator written in TypeScript. It implements VT100/VT220/xterm escape sequences with zero runtime dependencies, separating platform-agnostic core logic from browser-specific rendering through a four-layer architecture.
graph TD
subgraph API["Layer 4: Public API"]
TYPES["xterm.d.ts
Stable API surface"]
end
subgraph Browser["Layer 3: Browser Integration"]
TERM["Terminal class"]
REND["RenderService"]
SEL["SelectionService"]
MOUSE["MouseService"]
A11Y["AccessibilityManager"]
end
subgraph Core["Layer 2: Core Terminal Logic"]
CT["CoreTerminal"]
IH["InputHandler"]
BS["BufferService"]
OPTS["OptionsService"]
end
subgraph Foundation["Layer 1: Foundation"]
PARSER["EscapeSequenceParser"]
BUF["Buffer / BufferLine"]
DI["InstantiationService (DI)"]
end
TYPES --> TERM
TERM --> CT
REND --> CT
SEL --> BS
A11Y --> BS
CT --> IH
CT --> BS
IH --> PARSER
BS --> BUF
CT --> DI
style TYPES fill:#242424,stroke:#B4009E,color:#F2F2F2
style TERM fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style REND fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style SEL fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style MOUSE fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style A11Y fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style CT fill:#242424,stroke:#3A96DD,color:#F2F2F2
style IH fill:#242424,stroke:#3A96DD,color:#F2F2F2
style BS fill:#242424,stroke:#3A96DD,color:#F2F2F2
style OPTS fill:#242424,stroke:#3A96DD,color:#F2F2F2
style PARSER fill:#0C0C0C,stroke:#3B78FF,color:#F2F2F2
style BUF fill:#0C0C0C,stroke:#3B78FF,color:#F2F2F2
style DI fill:#0C0C0C,stroke:#3B78FF,color:#F2F2F2
Projects built on xterm.js include VS Code (integrated terminal), Hyper, Eclipse Theia, Azure Cloud Shell, Tabby, Google Cloud Shell, CoCalc, StackBlitz, and many more.
The codebase enforces a strict dependency hierarchy: common is the foundation, and browser and headless both depend on common but never on each other. This separation is what allows xterm-headless to run in Node.js environments.
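The dependency rule can be written down as a small check. The sketch below is illustrative only: `ToyLayer`, `ALLOWED_IMPORTS`, `layerOf`, and `isAllowedImport` are hypothetical names and are not part of the xterm.js build tooling.

```typescript
// Hypothetical model of the layering rule; not xterm.js tooling.
type ToyLayer = "common" | "browser" | "headless";

// Each layer lists the other layers it may import from.
const ALLOWED_IMPORTS: Record<ToyLayer, readonly ToyLayer[]> = {
  common: [],
  browser: ["common"],
  headless: ["common"],
};

// Map a source path such as "src/browser/Terminal.ts" to its layer.
function layerOf(path: string): ToyLayer | undefined {
  const match = /^src\/(common|browser|headless)\//.exec(path);
  return match ? (match[1] as ToyLayer) : undefined;
}

function isAllowedImport(fromPath: string, toPath: string): boolean {
  const from = layerOf(fromPath);
  const to = layerOf(toPath);
  if (from === undefined || to === undefined) return false; // unknown layer: reject
  return from === to || ALLOWED_IMPORTS[from].indexOf(to) !== -1;
}
```

Under this model, an import from `src/browser/` into `src/common/` passes, while any import between `src/browser/` and `src/headless/`, or from `src/common/` upward, is rejected.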
graph TD
subgraph Public["typings/"]
XTERMD["xterm.d.ts"]
end
subgraph BrowserPkg["src/browser/"]
BTERM["Terminal.ts"]
BSVC["services/
RenderService, SelectionService"]
BREND["renderer/
DOM, Canvas, WebGL"]
end
subgraph HeadlessPkg["src/headless/"]
HTERM["Headless Terminal"]
end
subgraph CommonPkg["src/common/"]
CTERM["CoreTerminal.ts"]
CINPUT["InputHandler.ts"]
CPARSER["parser/
EscapeSequenceParser"]
CBUF["buffer/
Buffer, BufferLine, CellData"]
CSVC["services/
Core DI services"]
end
subgraph AddonPkg["addons/"]
ADDONS["12 official addons"]
end
XTERMD --> BTERM
XTERMD --> HTERM
BTERM --> CTERM
HTERM --> CTERM
BSVC --> CSVC
BREND --> CBUF
CTERM --> CINPUT
CINPUT --> CPARSER
CTERM --> CBUF
CTERM --> CSVC
ADDONS -.-> XTERMD
style XTERMD fill:#242424,stroke:#B4009E,color:#F2F2F2
style BTERM fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style BSVC fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style BREND fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style HTERM fill:#1A1A1A,stroke:#C19C00,color:#F2F2F2
style CTERM fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style CINPUT fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style CPARSER fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style CBUF fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style CSVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style ADDONS fill:#242424,stroke:#C19C00,color:#F2F2F2
Layer 1 (Foundation): platform-agnostic infrastructure, comprising the escape sequence parser state machine, buffer data structures, and the dependency injection framework.
Layer 2 (Core Terminal Logic): CoreTerminal, InputHandler, BufferService, and OptionsService. This layer runs independently as xterm-headless in Node.js.
Layer 3 (Browser Integration): DOM manipulation, rendering coordination, viewport scrolling, keyboard/mouse input, and accessibility.
Layer 4 (Public API): the stable API surface in xterm.d.ts that external applications and addons depend on.
The parser is based on Paul Flo Williams' state machine for DEC-compatible terminals. It processes input character-by-character with deterministic state transitions, recognizing VT sequences (CSI, OSC, DCS, ESC, APC) and dispatching to specialized handlers.
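A much-reduced sketch of a character-by-character state machine of this kind follows. It handles only printable text, two-character ESC sequences, and CSI sequences with numeric parameters; intermediates, private markers, OSC, DCS, and APC are left out. All names (`toyParse`, `ToyParserState`, `ToyParserEvent`) are hypothetical and do not correspond to the EscapeSequenceParser source.

```typescript
// Toy parser: three states instead of the full DEC-compatible state table.
type ToyParserState = "ground" | "escape" | "csi";

type ToyParserEvent =
  | { kind: "print"; char: string }
  | { kind: "esc"; final: string }
  | { kind: "csi"; params: number[]; final: string };

function toyParse(input: string): ToyParserEvent[] {
  const events: ToyParserEvent[] = [];
  let state: ToyParserState = "ground";
  let paramText = "";
  for (let i = 0; i < input.length; i++) {
    const char = input.charAt(i);
    if (state === "ground") {
      if (char === "\x1b") state = "escape";
      else events.push({ kind: "print", char });
    } else if (state === "escape") {
      if (char === "[") {
        state = "csi"; // ESC [ introduces a CSI sequence
        paramText = "";
      } else {
        events.push({ kind: "esc", final: char });
        state = "ground";
      }
    } else if ((char >= "0" && char <= "9") || char === ";") {
      paramText += char; // still collecting parameters
    } else {
      // Any other character ends the CSI sequence in this toy model.
      const params = paramText === "" ? [] : paramText.split(";").map((p) => Number(p));
      events.push({ kind: "csi", params, final: char });
      state = "ground";
    }
  }
  return events;
}
```

Feeding it `"A\x1b[2J"` yields a print event for `A` followed by a CSI event with parameter `2` and final character `J`, which a dispatcher could then route to an erase-display handler.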
graph LR
INPUT["Input Source
(API / keyboard)"]
WRITE["terminal.write()"]
WBUF["WriteBuffer
(async batching)"]
PARSE["InputHandler.parse()"]
ESC["EscapeSequenceParser
(state machine)"]
HANDLERS["Sequence Handlers
(CSI/OSC/DCS/ESC)"]
BUFUPD["Buffer Updates
via BufferService"]
EVENTS["Event Emission"]
RENDER["RenderService
(debounced)"]
INPUT --> WRITE
WRITE --> WBUF
WBUF --> PARSE
PARSE --> ESC
ESC --> HANDLERS
HANDLERS --> BUFUPD
BUFUPD --> EVENTS
EVENTS --> RENDER
style INPUT fill:#242424,stroke:#767676,color:#F2F2F2
style WRITE fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style WBUF fill:#1A1A1A,stroke:#C19C00,color:#F2F2F2
style PARSE fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style ESC fill:#0C0C0C,stroke:#3B78FF,color:#F2F2F2
style HANDLERS fill:#0C0C0C,stroke:#B4009E,color:#F2F2F2
style BUFUPD fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style EVENTS fill:#1A1A1A,stroke:#61D6D6,color:#F2F2F2
style RENDER fill:#242424,stroke:#E74856,color:#F2F2F2
The parser exposes five handler categories, each registered through a registerXxxHandler() API. Handlers for a sequence are probed in reverse registration order (most recently registered first): returning true stops the chain, while returning false passes the sequence on to the next handler.
| Type | Prefix | Example | Usage |
|---|---|---|---|
| CSI | ESC [ | ESC [ 2 J | Erase display, SGR attributes, cursor movement |
| OSC | ESC ] | ESC ] 0 ; title ST | Window title, hyperlinks (OSC 8) |
| DCS | ESC P | ESC P ... ST | Device control with payload |
| ESC | ESC | ESC 7 | Save cursor, charset switching |
| APC | ESC _ | ESC _ ... ST | Application program commands |
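The probing order described above can be modeled with a small registry. This is a sketch under assumptions: `ToyHandlerRegistry`, `ToySequenceHandler`, and `ToyRegistration` are hypothetical, and sequences are keyed only by their final character, which is simpler than the real handler identifiers.

```typescript
// Hypothetical registry that mimics reverse-order probing.
type ToySequenceHandler = (params: number[]) => boolean;

interface ToyRegistration {
  dispose(): void;
}

class ToyHandlerRegistry {
  private readonly handlers = new Map<string, ToySequenceHandler[]>();

  // Register a handler for sequences ending in the given final character.
  register(final: string, handler: ToySequenceHandler): ToyRegistration {
    const list = this.handlers.get(final) ?? [];
    list.push(handler);
    this.handlers.set(final, list);
    return {
      dispose: () => {
        const index = list.indexOf(handler);
        if (index !== -1) list.splice(index, 1);
      },
    };
  }

  // Probe newest-first; stop at the first handler that returns true.
  dispatch(final: string, params: number[]): boolean {
    const list = this.handlers.get(final) ?? [];
    for (let i = list.length - 1; i >= 0; i--) {
      const handler = list[i];
      if (handler !== undefined && handler(params)) return true;
    }
    return false;
  }
}
```

Because later registrations run first, a custom handler can observe a sequence and return false to let the earlier (for example built-in) handler still process it, or return true to override it entirely.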
Incoming data is queued in WriteBuffer and processed asynchronously so that large writes do not block the UI. The maximum chunk size is 131,072 characters. Async handlers are supported but reduce throughput. OSC and DCS payloads are capped at 10 MB.
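A minimal sketch of chunked, time-budgeted processing is shown below. It is not the WriteBuffer implementation: `ToyWriteBuffer`, `splitIntoChunks`, the injected clock, and the caller-supplied budget are all assumptions made for illustration. Only the 131,072-character chunk size comes from the text above.

```typescript
const MAX_CHUNK_LENGTH = 131072; // chunk size quoted in the text

function splitIntoChunks(data: string, maxLength: number = MAX_CHUNK_LENGTH): string[] {
  const chunks: string[] = [];
  for (let offset = 0; offset < data.length; offset += maxLength) {
    chunks.push(data.slice(offset, offset + maxLength));
  }
  return chunks;
}

class ToyWriteBuffer {
  private readonly queue: string[] = [];
  private readonly parseChunk: (chunk: string) => void;
  private readonly budgetMs: number;
  private readonly now: () => number;

  // `now` is injectable so the time budget can be tested deterministically.
  constructor(parseChunk: (chunk: string) => void, budgetMs: number, now: () => number = () => Date.now()) {
    this.parseChunk = parseChunk;
    this.budgetMs = budgetMs;
    this.now = now;
  }

  write(data: string): void {
    for (const chunk of splitIntoChunks(data)) this.queue.push(chunk);
  }

  // Parse queued chunks until the time budget is used up.
  // Returns the number of chunks left for a later slice.
  processSlice(): number {
    const start = this.now();
    while (this.queue.length > 0 && this.now() - start < this.budgetMs) {
      this.parseChunk(this.queue.shift() as string);
    }
    return this.queue.length;
  }
}
```

A caller would typically schedule another `processSlice()` (for example with `setTimeout`) while the return value is non-zero, which is what keeps a large write from monopolizing the main thread.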
xterm.js maintains a dual buffer system -- normal and alternate screen. The normal buffer holds command output with scrollback history. The alternate buffer provides a clean slate for full-screen applications like vim and htop.
graph TD
subgraph BufferSet["BufferSet (manages active pointer)"]
NORMAL["Normal Buffer
(default screen)"]
ALT["Alternate Buffer
(full-screen apps)"]
end
subgraph NormalDetail["Normal Buffer Internals"]
SCROLL["Scrollback
(CircularList)"]
VIEWPORT["Visible Viewport
(rows x cols)"]
MARKERS["Markers
(survive line trimming)"]
end
subgraph DataStructs["Data Structures"]
BLINE["BufferLine
(Uint32Array, 3 per col)"]
CELL["CellData
(char + attrs + width)"]
end
NORMAL --> NormalDetail
SCROLL --> BLINE
VIEWPORT --> BLINE
BLINE --> CELL
style NORMAL fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style ALT fill:#1A1A1A,stroke:#C19C00,color:#F2F2F2
style SCROLL fill:#0C0C0C,stroke:#3B78FF,color:#F2F2F2
style VIEWPORT fill:#0C0C0C,stroke:#16C60C,color:#F2F2F2
style MARKERS fill:#0C0C0C,stroke:#B4009E,color:#F2F2F2
style BLINE fill:#242424,stroke:#E74856,color:#F2F2F2
style CELL fill:#242424,stroke:#E74856,color:#F2F2F2
Buffer keeps its lines in a CircularList for bounded-memory scrollback. Once the scrollback limit (default 1000) is reached, the oldest lines are overwritten.
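The overwrite-when-full behavior is that of a ring buffer. The following is a simplified sketch, assuming a fixed capacity and only `push` and `get`; `ToyCircularList` is a hypothetical stand-in, and the real CircularList offers a richer API.

```typescript
// Simplified fixed-capacity ring buffer.
class ToyCircularList<T> {
  private readonly items: (T | undefined)[];
  private readonly capacity: number;
  private startIndex = 0; // physical index of the oldest item
  private count = 0;

  constructor(capacity: number) {
    this.capacity = capacity;
    this.items = new Array<T | undefined>(capacity);
  }

  get length(): number {
    return this.count;
  }

  // Append an item; once full, the oldest item is overwritten in place.
  push(item: T): void {
    if (this.count === this.capacity) {
      this.items[this.startIndex] = item;
      this.startIndex = (this.startIndex + 1) % this.capacity;
    } else {
      this.items[(this.startIndex + this.count) % this.capacity] = item;
      this.count++;
    }
  }

  // Logical index 0 is always the oldest surviving item.
  get(index: number): T | undefined {
    if (index < 0 || index >= this.count) return undefined;
    return this.items[(this.startIndex + index) % this.capacity];
  }
}
```

Memory use stays constant once the list is full: pushing a new line never allocates or shifts elements, it only advances the start index.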
BufferLine uses a Uint32Array with 3 elements per column, providing O(1) column-to-cell mapping. Attributes are bit-packed into the fg and bg elements.
CellData is a packed structure storing the character code, combined attributes (flags, foreground, background), and character width in a compact format.
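To make the three-slots-per-column layout concrete, here is a sketch of a line backed by a Uint32Array. The bit positions (`WIDTH_SHIFT`, `CODEPOINT_MASK`) and the class `ToyBufferLine` are assumptions chosen for illustration; the exact bit assignments in xterm.js may differ, and the fg/bg slots are stored here as opaque numbers.

```typescript
// Illustrative layout: three 32-bit slots per column.
const CELL_SIZE = 3;
const SLOT_CONTENT = 0;
const SLOT_FG = 1;
const SLOT_BG = 2;
const WIDTH_SHIFT = 22;          // assumed: width stored above the code point bits
const CODEPOINT_MASK = 0x1fffff; // 21 bits cover every Unicode code point

class ToyBufferLine {
  private readonly data: Uint32Array;

  constructor(cols: number) {
    this.data = new Uint32Array(cols * CELL_SIZE);
  }

  setCell(col: number, codePoint: number, width: number, fg: number, bg: number): void {
    const base = col * CELL_SIZE; // O(1) column-to-cell mapping
    this.data[base + SLOT_CONTENT] = (codePoint & CODEPOINT_MASK) | (width << WIDTH_SHIFT);
    this.data[base + SLOT_FG] = fg;
    this.data[base + SLOT_BG] = bg;
  }

  getCodePoint(col: number): number {
    return this.data[col * CELL_SIZE + SLOT_CONTENT] & CODEPOINT_MASK;
  }

  getWidth(col: number): number {
    return this.data[col * CELL_SIZE + SLOT_CONTENT] >>> WIDTH_SHIFT;
  }

  getFg(col: number): number {
    return this.data[col * CELL_SIZE + SLOT_FG];
  }

  getChars(col: number): string {
    return String.fromCodePoint(this.getCodePoint(col));
  }
}
```

Keeping a whole line in one typed array avoids allocating an object per cell, which matters when scrollback holds many lines of many columns each.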
viewportY tracks the visible top line. baseY tracks where the bottom page starts. Markers survive line trimming via index tracking.
The rendering system abstracts multiple backends through the IRenderer interface. RenderService coordinates rendering by debouncing updates, batching changes into single frames, and supporting DEC synchronized output (mode 2026).
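The debounce-and-batch idea can be sketched as follows: refresh requests that arrive before the next frame are merged into a single row range, and the renderer is called once. `ToyRenderDebouncer` and `ToyFrameScheduler` are hypothetical names, and the scheduler is injected so the sketch runs outside a browser.

```typescript
// Hypothetical debouncer: one render callback per frame, covering all requested rows.
type ToyFrameScheduler = (callback: () => void) => void;

class ToyRenderDebouncer {
  private rowStart: number | undefined;
  private rowEnd: number | undefined;
  private frameRequested = false;
  private readonly renderRows: (start: number, end: number) => void;
  private readonly schedule: ToyFrameScheduler;

  constructor(renderRows: (start: number, end: number) => void, schedule: ToyFrameScheduler) {
    this.renderRows = renderRows;
    this.schedule = schedule;
  }

  // Request a redraw of rows [start, end]; ranges accumulate until the frame fires.
  refresh(start: number, end: number): void {
    this.rowStart = this.rowStart === undefined ? start : Math.min(this.rowStart, start);
    this.rowEnd = this.rowEnd === undefined ? end : Math.max(this.rowEnd, end);
    if (this.frameRequested) return;
    this.frameRequested = true;
    this.schedule(() => this.flush());
  }

  private flush(): void {
    this.frameRequested = false;
    if (this.rowStart === undefined || this.rowEnd === undefined) return;
    const start = this.rowStart;
    const end = this.rowEnd;
    this.rowStart = undefined;
    this.rowEnd = undefined;
    this.renderRows(start, end);
  }
}
```

In a browser the scheduler would usually wrap `requestAnimationFrame`; whichever IRenderer backend is active then receives one call per frame regardless of how many writes touched the buffer.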
graph TD
RSVC["RenderService
(debounce + batch)"]
subgraph DOM["DOM Renderer (Default)"]
DOMR["HTML elements
per visible row"]
end
subgraph Canvas["Canvas Renderer (Addon)"]
TEXT["Text Layer
(bg + fg, opaque)"]
SELC["Selection Layer"]
LINK["Link Layer"]
CURS["Cursor Layer"]
ATLAS["Texture Atlas
(ImageBitmap cache)"]
end
subgraph WebGL["WebGL Renderer (Addon)"]
WGLP["WebGL2 Program
(vertex + fragment)"]
FLOAT["Float32Array
(full terminal data)"]
WATL["Texture Atlas
(512x512 to 4096x4096)"]
end
RSVC --> DOMR
RSVC --> TEXT
RSVC --> WGLP
TEXT --> ATLAS
SELC --> ATLAS
WGLP --> FLOAT
WGLP --> WATL
style RSVC fill:#242424,stroke:#E74856,color:#F2F2F2
style DOMR fill:#1A1A1A,stroke:#767676,color:#F2F2F2
style TEXT fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style SELC fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style LINK fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style CURS fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style ATLAS fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style WGLP fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style FLOAT fill:#0C0C0C,stroke:#16C60C,color:#F2F2F2
style WATL fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
The canvas renderer uses multiple layered elements, each handling a specific concern. This separation allows the engine to repaint only changed layers rather than the entire terminal.
The texture atlas uses multiple active rows, adding each glyph to the most suitable row based on its pixel height. It supports multiple 512x512 texture pages that merge up to 4096x4096. All characters, including emoji, are cached, and glyphs are trimmed to minimal rectangles for space efficiency. When capacity is reached, the atlas clears and restarts.
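The "clear and restart" policy is the simplest possible eviction strategy. The sketch below models only that policy, with numbered slots standing in for atlas rectangles; `ToyGlyphAtlas` and its cache key (characters plus fg and bg) are assumptions, and row packing and page merging are not modeled.

```typescript
// Toy glyph cache: models only the clear-when-full policy.
class ToyGlyphAtlas {
  private readonly slots = new Map<string, number>();
  private readonly capacity: number;
  clearCount = 0;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Return the slot for a glyph, allocating one on a cache miss.
  getSlot(chars: string, fg: number, bg: number): number {
    const key = `${chars}|${fg}|${bg}`;
    const cached = this.slots.get(key);
    if (cached !== undefined) return cached;
    if (this.slots.size >= this.capacity) {
      this.slots.clear(); // capacity reached: drop everything and start over
      this.clearCount++;
    }
    const slot = this.slots.size;
    this.slots.set(key, slot);
    return slot;
  }
}
```

The trade-off is that a clear forces every visible glyph to be rasterized again, but it avoids the bookkeeping of per-glyph eviction and keeps lookups to a single map access.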
DOM renderer: simplest but slowest. Canvas renderer: 5x-45x faster than DOM depending on scenario. WebGL renderer: fastest, offloads drawing entirely to the GPU, freeing the main thread.
xterm.js uses a service-oriented architecture with dependency injection through InstantiationService, a simplified version of VS Code's DI system. Components depend on interfaces rather than concrete implementations.
graph TD
INST["InstantiationService
(DI container)"]
subgraph CoreSvc["Core Services (src/common/)"]
OPTSVC["OptionsService
(config + validation)"]
BUFSVC["BufferService
(normal + alt buffers)"]
CORESVC["CoreService
(terminal modes)"]
LOGSVC["LogService
(configurable levels)"]
UNISVC["UnicodeService
(char widths)"]
CHARSVC["CharsetService
(G0, G1 switching)"]
OSCSVC["OscLinkService
(OSC 8 hyperlinks)"]
end
INST --> CoreSvc
BUFSVC --> OPTSVC
CORESVC --> OPTSVC
UNISVC --> OPTSVC
style INST fill:#242424,stroke:#B4009E,color:#F2F2F2
style OPTSVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style BUFSVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style CORESVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style LOGSVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style UNISVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style CHARSVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style OSCSVC fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
| Service | Responsibility |
|---|---|
| OptionsService | Manages configuration with validation; fires change events when options are modified |
| BufferService | Manages normal/alternate buffers, scrolling, viewport tracking |
| CoreService | Tracks terminal modes, DEC private modes, keyboard state |
| LogService | Configurable logging at multiple levels for debugging |
| UnicodeService | Unicode version handling and character width calculations (wcwidth) |
| CharsetService | Character set switching (G0, G1, etc.) for legacy terminal compatibility |
| OscLinkService | Processes OSC 8 hyperlink sequences for clickable terminal links |
| CoreMouseService | Mouse tracking modes and event handling for terminal mouse support |
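The pattern of resolving services through a container, rather than constructing them directly, can be sketched as below. Note the substitution: InstantiationService is described as a simplified version of VS Code's DI system, whereas this sketch uses explicit factory functions to stay short. `ToyServiceContainer`, `ToyServiceId`, `createServiceId`, and the two example service interfaces are all hypothetical.

```typescript
// Minimal container: services are created lazily, once, via registered factories.
interface ToyServiceId<T> {
  readonly name: string;
  readonly _type?: T; // phantom field that carries the service type
}

function createServiceId<T>(name: string): ToyServiceId<T> {
  return { name };
}

class ToyServiceContainer {
  private readonly factories = new Map<string, (container: ToyServiceContainer) => unknown>();
  private readonly instances = new Map<string, unknown>();

  register<T>(id: ToyServiceId<T>, factory: (container: ToyServiceContainer) => T): void {
    this.factories.set(id.name, factory);
  }

  get<T>(id: ToyServiceId<T>): T {
    if (!this.instances.has(id.name)) {
      const factory = this.factories.get(id.name);
      if (factory === undefined) throw new Error(`Unknown service: ${id.name}`);
      this.instances.set(id.name, factory(this));
    }
    return this.instances.get(id.name) as T;
  }
}

// Example wiring: a buffer service that depends on an options service.
interface ToyOptions { readonly scrollback: number; }
interface ToyBuffers { readonly maxLines: number; }

const OptionsId = createServiceId<ToyOptions>("OptionsService");
const BuffersId = createServiceId<ToyBuffers>("BufferService");

const toyContainer = new ToyServiceContainer();
toyContainer.register(OptionsId, () => ({ scrollback: 1000 }));
toyContainer.register(BuffersId, (c) => ({ maxLines: c.get(OptionsId).scrollback + 24 }));
```

Consumers ask for `BuffersId` without knowing which concrete class backs it, which is what lets the browser and headless builds share the same core services.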
Addons extend the terminal through the ITerminalAddon interface: activate(terminal) is called on loadAddon(), and dispose() handles cleanup. Third-party addons can be built using only the public API.
graph LR
APP["Application Code"]
TERM["Terminal Instance"]
LOAD["terminal.loadAddon()"]
subgraph Addons["ITerminalAddon"]
FIT["addon-fit
(resize to container)"]
SEARCH["addon-search
(buffer search)"]
WEBGL["addon-webgl
(GPU renderer)"]
IMAGE["addon-image
(inline images)"]
SERIAL["addon-serialize
(export buffer)"]
ATTACH["addon-attach
(WebSocket PTY)"]
end
APP --> TERM
TERM --> LOAD
LOAD --> Addons
style APP fill:#242424,stroke:#767676,color:#F2F2F2
style TERM fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style LOAD fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style FIT fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style SEARCH fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style WEBGL fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style IMAGE fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style SERIAL fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style ATTACH fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
| Addon | Purpose | Category |
|---|---|---|
| addon-attach | Connects to a process via WebSocket | IO |
| addon-clipboard | Browser clipboard integration | IO |
| addon-fit | Resizes terminal to fit container element | Layout |
| addon-image | Inline image rendering support | Render |
| addon-ligatures | Font ligature rendering | Render |
| addon-progress | OSC 9;4 progress reporting API | Extend |
| addon-search | Search through terminal buffer content | Extend |
| addon-serialize | Export buffer to VT sequences or HTML | Extend |
| addon-unicode-graphemes | Enhanced grapheme clustering (experimental) | Unicode |
| addon-unicode11 | Unicode 11 character width standards | Unicode |
| addon-web-fonts | Web font loading and integration | Render |
| addon-webgl | WebGL2 GPU-accelerated rendering | Render |
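The addon lifecycle (activate on loadAddon(), dispose for cleanup) can be illustrated without installing xterm.js by using local stand-ins. Everything here is hypothetical: `ToyTerm` and `ToyAddon` only imitate the shape of the Terminal and ITerminalAddon contract, and `ToyFitAddon.fit()` takes explicit pixel sizes where a real fit addon would measure the DOM.

```typescript
// Local stand-ins so the sketch is self-contained.
interface ToyAddon {
  activate(terminal: ToyTerm): void;
  dispose(): void;
}

class ToyTerm {
  cols = 80;
  rows = 24;
  private readonly addons: ToyAddon[] = [];

  resize(cols: number, rows: number): void {
    this.cols = cols;
    this.rows = rows;
  }

  // Loading an addon hands it the terminal instance.
  loadAddon(addon: ToyAddon): void {
    this.addons.push(addon);
    addon.activate(this);
  }

  dispose(): void {
    for (const addon of this.addons) addon.dispose();
    this.addons.length = 0;
  }
}

class ToyFitAddon implements ToyAddon {
  private terminal: ToyTerm | undefined;

  activate(terminal: ToyTerm): void {
    this.terminal = terminal;
  }

  dispose(): void {
    this.terminal = undefined;
  }

  // Resize so that whole cells fill the given pixel area.
  fit(widthPx: number, heightPx: number, cellWidthPx: number, cellHeightPx: number): void {
    if (this.terminal === undefined) throw new Error("addon is not activated");
    const cols = Math.max(1, Math.floor(widthPx / cellWidthPx));
    const rows = Math.max(1, Math.floor(heightPx / cellHeightPx));
    this.terminal.resize(cols, rows);
  }
}
```

The addon touches the terminal only through its public methods, which mirrors the point that third-party addons can be built against the public API alone.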
xterm.js is a frontend-only library. It connects to backend processes (shells) via a WebSocket relay architecture. The server-side component (typically node-pty) spawns a pseudo-terminal and bridges it to the browser.
graph LR
subgraph BrowserSide["Browser"]
XTJS["xterm.js
(terminal emulator)"]
end
WS["WebSocket
(bidirectional)"]
subgraph ServerSide["Server"]
RELAY["WebSocket Server
(relay process)"]
NPTY["node-pty
(pseudo-terminal)"]
SHELL["Shell
(bash / zsh / fish)"]
end
XTJS <-->|"onData / write()"| WS
WS <-->|"data relay"| RELAY
RELAY <-->|"pty I/O"| NPTY
NPTY <-->|"stdin/stdout"| SHELL
style XTJS fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style WS fill:#242424,stroke:#C19C00,color:#F2F2F2
style RELAY fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style NPTY fill:#0C0C0C,stroke:#3B78FF,color:#F2F2F2
style SHELL fill:#0C0C0C,stroke:#767676,color:#F2F2F2
1. The user types in xterm.js; the onData event fires and the data is sent over the WebSocket.
2. The server receives the data and passes it to ptyProcess.write().
3. The shell attached to the PTY runs the command, and its output arrives through ptyProcess.onData.
4. The server sends the output over the WebSocket, and the browser passes it to terminal.write().
xterm.js processes data at 5-35 MB/s, constrained to avoid blocking the UI thread (targeting less than 16ms per frame). A hardcoded 50MB buffer limit prevents memory exhaustion.
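The simplest flow-control strategy pauses the PTY on each write and resumes it in the write callback. It is easy to implement but creates excessive context switches.
A watermark strategy pauses above a high watermark and resumes below a low one (e.g., 100K/10K bytes), which reduces pause/resume frequency significantly.
A further refinement attaches a callback only every ~100K bytes instead of to every chunk, using the count of pending callbacks as the backpressure signal.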
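The watermark strategy can be sketched as below. This is a sketch under assumptions: `ToyWatermarkFlowControl`, `ToyPausablePty`, and `ToyTerminalWrite` are hypothetical, the write function merely mirrors the data-plus-callback shape of `terminal.write()`, and string length is used as a stand-in for byte count. The default watermarks are the example values from the text.

```typescript
interface ToyPausablePty {
  pause(): void;
  resume(): void;
}

type ToyTerminalWrite = (data: string, callback: () => void) => void;

class ToyWatermarkFlowControl {
  private pendingBytes = 0; // written but not yet processed by the terminal
  private paused = false;
  private readonly pty: ToyPausablePty;
  private readonly write: ToyTerminalWrite;
  private readonly highWatermark: number;
  private readonly lowWatermark: number;

  constructor(pty: ToyPausablePty, write: ToyTerminalWrite, highWatermark = 100000, lowWatermark = 10000) {
    this.pty = pty;
    this.write = write;
    this.highWatermark = highWatermark;
    this.lowWatermark = lowWatermark;
  }

  // Call this for every chunk the PTY produces.
  onPtyData(chunk: string): void {
    this.pendingBytes += chunk.length; // approximation: code units, not bytes
    this.write(chunk, () => {
      this.pendingBytes -= chunk.length;
      if (this.paused && this.pendingBytes < this.lowWatermark) {
        this.paused = false;
        this.pty.resume();
      }
    });
    if (!this.paused && this.pendingBytes > this.highWatermark) {
      this.paused = true;
      this.pty.pause();
    }
  }
}
```

The gap between the two watermarks provides hysteresis: the PTY is paused once per burst and resumed once the terminal has nearly caught up, instead of toggling on every chunk.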
Character width calculations use a wcwidth implementation managed by UnicodeService. Full grapheme cluster support is complex because a cluster joins several code points, which wcwidth would measure and place in separate cells, into a single perceived character.
graph TD
INPUT["Incoming Bytes"]
DECODE["StringToUtf32 /
Utf8ToUtf32"]
USERVICE["UnicodeService
(version tables)"]
WCWIDTH["wcwidth
(char width calc)"]
subgraph GraphemeAddon["addon-unicode-graphemes"]
GRAPHEME["Grapheme Clustering
(experimental)"]
PROPS["Unicode Properties
(generated file)"]
end
FUTURE["Future: WASM
acceleration"]
INPUT --> DECODE
DECODE --> USERVICE
USERVICE --> WCWIDTH
USERVICE -.-> GRAPHEME
GRAPHEME --> PROPS
USERVICE -.-> FUTURE
style INPUT fill:#242424,stroke:#767676,color:#F2F2F2
style DECODE fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style USERVICE fill:#0C0C0C,stroke:#3B78FF,color:#F2F2F2
style WCWIDTH fill:#0C0C0C,stroke:#16C60C,color:#F2F2F2
style GRAPHEME fill:#242424,stroke:#C19C00,color:#F2F2F2
style PROPS fill:#242424,stroke:#C19C00,color:#F2F2F2
style FUTURE fill:#1A1A1A,stroke:#B4009E,color:#F2F2F2
A following character in a cluster might have wcwidth != 0, breaking cursor movement if widths are summed. The total of individual wcwidth values typically exceeds the final cluster's display width. This is why full grapheme support remains experimental.
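The mismatch is easy to reproduce with a toy width function. The table below is deliberately tiny and is an assumption made for illustration (`toyWcwidth` and `summedWidth` are hypothetical); a real wcwidth table covers all of Unicode. The cluster width of 2 is likewise an assumption about how a terminal would display these emoji.

```typescript
// Deliberately tiny width table for illustration only.
function toyWcwidth(codePoint: number): number {
  if (codePoint === 0x200d) return 0;                         // zero width joiner
  if (codePoint >= 0x1f300 && codePoint <= 0x1faff) return 2; // common emoji blocks
  return 1;
}

// Sum the width of every code point, as a per-character model would.
function summedWidth(text: string): number {
  let total = 0;
  for (let i = 0; i < text.length; ) {
    const codePoint = text.codePointAt(i) as number;
    total += toyWcwidth(codePoint);
    i += codePoint > 0xffff ? 2 : 1; // skip the second half of a surrogate pair
  }
  return total;
}

// Thumbs up + skin tone modifier: the following character has a non-zero width.
const thumbsUp = "\u{1F44D}\u{1F3FD}";
// Family: MAN + ZWJ + WOMAN + ZWJ + GIRL, drawn as a single glyph.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";

const assumedClusterWidth = 2; // assumed: each cluster occupies one wide glyph
const thumbsUpSum = summedWidth(thumbsUp); // 2 + 2
const familySum = summedWidth(family);     // 2 + 0 + 2 + 0 + 2
```

Both sums exceed the assumed cluster width, so a terminal that advances the cursor by the summed value ends up to the right of where the application expects it to be.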
xterm.js employs multiple strategies across the rendering and write pipelines to maintain 60fps while processing high-throughput terminal output.
graph TD
subgraph RenderPerf["Rendering Performance"]
DIRTY["Dirty Row Tracking
(inlined in InputHandler)"]
TEXATL["Texture Atlas
(GPU-friendly cache)"]
LAYERS["Layered Canvas
(selective repaint)"]
DEBOUNCE["Render Debouncing
(batch to single frame)"]
CLIP["Viewport Clipping
(only visible rows)"]
end
subgraph WritePerf["Write Pipeline Performance"]
ASYNC["Async Batching
(131K char chunks)"]
BUDGET["Frame Budget
(16ms target for 60fps)"]
BURST["Burst Buffering
(power-efficient coalescing)"]
end
subgraph Results["Measured Results"]
CANVAS_R["Canvas: 5x-45x speedup"]
WEBGL_R["WebGL: GPU offload"]
SIZE_R["Bundle: 265KB (-30%)"]
end
RenderPerf --> Results
WritePerf --> Results
style DIRTY fill:#0C0C0C,stroke:#16C60C,color:#F2F2F2
style TEXATL fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style LAYERS fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style DEBOUNCE fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style CLIP fill:#0C0C0C,stroke:#3A96DD,color:#F2F2F2
style ASYNC fill:#1A1A1A,stroke:#3B78FF,color:#F2F2F2
style BUDGET fill:#1A1A1A,stroke:#3B78FF,color:#F2F2F2
style BURST fill:#1A1A1A,stroke:#3B78FF,color:#F2F2F2
style CANVAS_R fill:#242424,stroke:#16C60C,color:#F2F2F2
style WEBGL_R fill:#242424,stroke:#16C60C,color:#F2F2F2
style SIZE_R fill:#242424,stroke:#16C60C,color:#F2F2F2
Dirty row tracking is inlined into InputHandler so that only rows changed since the last frame are redrawn; no full-line reconstruction is needed.
The texture atlas caches pre-rendered character glyphs in a GPU-friendly format, replacing per-character fillText calls with drawImage calls from the atlas.
RenderService batches multiple state changes into single animation frames. Supports DEC mode 2026 for synchronized output.
WriteBuffer processes chunks targeting under 16ms per frame. Burst buffering reduces GPU usage for power-sensitive environments.
xterm.js creates an off-screen accessibility tree alongside rendered terminal rows, using ARIA attributes and a live region to support screen readers without sacrificing rendering performance.
graph TD
subgraph A11yTree["Accessibility Tree (off-screen)"]
VLIST["Virtual List
(aria-posinset / aria-setsize)"]
LIVE["Assertive Live Region
(announces output)"]
FOCUS["Focus Listeners
(boundary scrolling)"]
end
subgraph EchoLogic["Character Echo Suppression"]
QUEUE["Keystroke Queue"]
ECHO["Echo Detection"]
SUPPRESS["Suppress if match"]
ANNOUNCE["Announce if different"]
end
KEYPRESS["User Keypress"] --> QUEUE
QUEUE --> ECHO
ECHO --> SUPPRESS
ECHO --> ANNOUNCE
ANNOUNCE --> LIVE
VLIST --> LIVE
style VLIST fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style LIVE fill:#1A1A1A,stroke:#16C60C,color:#F2F2F2
style FOCUS fill:#1A1A1A,stroke:#3A96DD,color:#F2F2F2
style QUEUE fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style ECHO fill:#0C0C0C,stroke:#C19C00,color:#F2F2F2
style SUPPRESS fill:#0C0C0C,stroke:#767676,color:#F2F2F2
style ANNOUNCE fill:#0C0C0C,stroke:#16C60C,color:#F2F2F2
style KEYPRESS fill:#242424,stroke:#767676,color:#F2F2F2
When a key is pressed, the character is queued. When the echo returns from the process, xterm.js compares it to the queue. If it matches, the announcement is suppressed (the user already heard the keypress). If they differ, the queue clears and the output is announced.
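The suppression rule described above fits in a few lines. `ToyEchoFilter` is a hypothetical model that records what would be announced in an array, where the real implementation writes to the live region.

```typescript
// Toy model of character echo suppression.
class ToyEchoFilter {
  private readonly pendingKeys: string[] = [];
  readonly announced: string[] = [];

  // Called when the user presses a key.
  onKey(char: string): void {
    this.pendingKeys.push(char);
  }

  // Called for each character written back by the process.
  onOutput(char: string): void {
    if (this.pendingKeys.length > 0 && this.pendingKeys[0] === char) {
      this.pendingKeys.shift(); // echo of a typed key: stay silent
      return;
    }
    this.pendingKeys.length = 0; // mismatch: stop expecting echoes
    this.announced.push(char);
  }
}
```

Typing `ls` at a shell that echoes input produces no announcements, while typing into a password prompt that echoes `*` clears the queue and announces the asterisk, since the output differs from what the user typed.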
The live region announces a maximum of 20 rows and clears when the user types. Tab characters are substituted with spaces. The accessibility tree updates at up to 60fps, balancing responsiveness with performance.