Product Launch
OpenAI Releases GPT-5.4 with Native Computer Use and 1M-Token Context
Three model variants — standard, Thinking, and Pro — arrive with agentic computer-use capabilities, record-setting knowledge-work benchmarks, and a new Codex desktop app for Windows that runs parallel coding agents.
OpenAI released GPT-5.4 on March 5 in three variants designed to cover the full spectrum of enterprise and developer use cases: a standard model optimized for general-purpose tasks, a Thinking variant with extended chain-of-thought reasoning for complex problem-solving, and a Pro tier with the highest capability ceiling for research and professional workloads. The release comes just one day after GPT-5.3 Instant shipped, an unprecedented cadence signaling that OpenAI is running multiple model pipelines simultaneously rather than iterating sequentially, a release strategy that would have been logistically impractical even a year ago.
The headline capability is native computer use: GPT-5.4 agents can interact with desktop software through screenshots, mouse movements, and keyboard inputs without requiring any API integration or custom tooling on the target application. This puts OpenAI in direct competition with Anthropic’s Claude computer-use feature, which launched months earlier and has been steadily gaining enterprise traction. OpenAI’s implementation differs in its emphasis on parallelism — the accompanying Codex Windows desktop app allows developers to run multiple computer-use agents simultaneously, each operating in its own sandboxed environment, enabling workflows like running automated tests in one agent while refactoring code in another.
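The screenshot-in, action-out loop described above can be sketched in a few lines. Everything below is a hypothetical illustration, not OpenAI's actual API: the `stub_model` function stands in for a real model call, and the action schema is invented for the example.

```python
import base64
from dataclasses import dataclass

# Hypothetical action schema a computer-use agent might emit.
@dataclass
class Action:
    kind: str            # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""

def take_screenshot() -> str:
    """Stand-in for a real screen capture; returns a fake encoded image."""
    return base64.b64encode(b"fake-pixels").decode()

def stub_model(step: int, screenshot_b64: str, goal: str) -> Action:
    """Stand-in for the model call: clicks once, then reports completion.
    A real agent would send the screenshot to the model API instead."""
    if step == 0:
        return Action(kind="click", x=120, y=240)
    return Action(kind="done")

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    """Core loop: capture screen -> ask model for next action -> execute."""
    log = []
    for step in range(max_steps):
        action = stub_model(step, take_screenshot(), goal)
        if action.kind == "done":
            log.append("done")
            break
        # A real implementation would dispatch the click/keystroke to the OS
        # inside a sandbox; here we just record it.
        log.append(f"{action.kind}@({action.x},{action.y})")
    return log

print(run_agent("open the settings dialog"))
# → ['click@(120,240)', 'done']
```

Running several such loops in separate sandboxed environments, one per task, is the pattern the Codex desktop app's parallel-agent workflow describes.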
On benchmarks, GPT-5.4 scored a record 83% on GDPval, a knowledge-work evaluation that measures performance across legal analysis, financial modeling, medical reasoning, and technical writing — tasks that represent the highest-value enterprise applications. OpenAI also claims the model is 33% less likely to produce factual errors than GPT-5.2, building on the hallucination-reduction work that defined GPT-5.3 Instant. The 1M-token context window, available through the API, positions the model for document-heavy enterprise workflows like contract analysis, codebase understanding, and regulatory compliance review where the ability to hold an entire corpus in context eliminates the need for retrieval-augmented generation pipelines.
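The "whole corpus in context" workflow amounts to concatenating every document into a single prompt instead of retrieving passages. A minimal sketch, assuming a hypothetical 1M-token window and a crude 4-characters-per-token estimate (a real pipeline would use an actual tokenizer):

```python
# Assumed limits for illustration only; not published model parameters.
CONTEXT_TOKENS = 1_000_000
CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer

def fits_in_context(docs: list[str], reserve_for_answer: int = 8_000) -> bool:
    """Estimate whether the concatenated docs fit in the assumed window,
    leaving headroom for the model's answer."""
    est_tokens = sum(len(d) for d in docs) // CHARS_PER_TOKEN
    return est_tokens + reserve_for_answer <= CONTEXT_TOKENS

def build_prompt(docs: list[str], question: str) -> str:
    """Concatenate every document plus the question into one prompt,
    skipping the retrieval step a RAG pipeline would otherwise need."""
    if not fits_in_context(docs):
        raise ValueError("corpus exceeds assumed context window")
    sections = [f"[Document {i + 1}]\n{d}" for i, d in enumerate(docs)]
    return "\n\n".join(sections) + f"\n\nQuestion: {question}"

contracts = ["Clause A: payment terms ...", "Clause B: termination ..."]
print(fits_in_context(contracts))  # → True for this tiny corpus
prompt = build_prompt(contracts, "Which clauses conflict?")
```

The trade-off is cost and latency: sending a full corpus on every query is simpler than maintaining a retrieval index, but each call processes far more tokens.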
The competitive implications are significant. Anthropic’s Claude has held the lead in computer use and agentic capabilities for several months, and Google’s Gemini 2.5 Pro has been the context-window leader. GPT-5.4 attempts to close both gaps simultaneously while also pushing on reliability metrics that enterprise buyers increasingly cite as the primary factor in model selection. The speed of this release — arriving before enterprise customers have fully evaluated GPT-5.3 Instant — suggests OpenAI views the current moment as a land-grab where shipping velocity matters more than orderly product cadence.