Volume 1, No. 4 Wednesday, March 4, 2026 Daily Edition

The AI Dispatch

“All the AI News That’s Fit to Compile”


National Security & AI Policy

Pentagon Uses Claude in Iran Campaign Despite Blacklisting Anthropic

The U.S. military continued deploying Claude AI during strikes on Iran over the weekend — even as the Trump administration officially blacklisted Anthropic as a “supply chain risk.”

The U.S. military continued to deploy Anthropic’s Claude AI in operational planning during its campaign of strikes against Iran over the weekend, according to three defense officials with direct knowledge of the systems in use — even as the Trump administration’s executive order designating Anthropic a “supply chain risk” remained in effect. The contradiction underscores the gap between political directives and practical reality on the ground: military commanders have become dependent on Claude’s analytical capabilities and have no near-term alternative that matches its performance in the intelligence-synthesis tasks for which it was deployed.

The blacklisting was triggered when Anthropic CEO Dario Amodei refused Defense Secretary Pete Hegseth’s demands to remove safeguards preventing mass domestic surveillance and fully autonomous weapons targeting. Amodei’s refusal, which the company framed as a matter of principle rather than a negotiating position, prompted the administration to move swiftly to cut Anthropic out of the federal technology stack — a process that is proving far harder in practice than it appeared on paper.

Meanwhile, OpenAI has revised its own Pentagon contract in the wake of the controversy. CEO Sam Altman admitted in a CNBC interview that the deal, struck within 72 hours of Anthropic’s standoff, “looked opportunistic and sloppy.” The amended agreement now includes explicit prohibitions on domestic surveillance of U.S. persons, though critics warn that significant loopholes remain — particularly around commercially acquired geolocation data and financial records, which do not fall under the same legal protections as communications intercepts.

Open Source

OpenAI Releases GPT-OSS: First Open-Weight Models Under Apache 2.0

The 120B model achieves near-parity with o4-mini on reasoning benchmarks while running on a single 80GB GPU. The 20B model matches o3-mini and targets edge deployment.

OpenAI entered the open-weights arena on Tuesday with the release of gpt-oss-120b and gpt-oss-20b, its first models published under the permissive Apache 2.0 license with no commercial restrictions. The move represents a dramatic strategic reversal for a company that spent the better part of three years arguing that open-weight releases of frontier models posed unacceptable safety risks — a position it maintained even as Meta, Mistral, and a growing roster of competitors built substantial ecosystems around freely available weights.

The 120-billion-parameter model achieves near-parity with OpenAI’s own o4-mini on standard reasoning benchmarks while fitting on a single 80GB GPU, a deployment profile that makes it accessible to a broad range of academic and enterprise users without requiring multi-node clusters. The smaller 20B variant matches o3-mini performance and is explicitly designed for edge and on-device deployment, requiring only 16GB of memory — putting it within reach of high-end consumer hardware.
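Those deployment figures are consistent with a common assumption: that the published weights are quantized to roughly 4 bits each (the article does not state the format, so treat this as a back-of-envelope sketch, not a spec).

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in decimal GB.
    Ignores KV cache, activations, and runtime overhead."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Assumed ~4-bit quantization; not confirmed by the article.
print(round(weight_footprint_gb(120, 4), 1))  # 60.0 GB -> fits a single 80GB GPU
print(round(weight_footprint_gb(20, 4), 1))   # 10.0 GB -> fits within 16GB
```

KV cache and activations add several more gigabytes at inference time, which would explain why the 20B variant is quoted at 16GB rather than the bare 10GB of weights.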

OpenAI also released gpt-oss-safeguard variants tuned specifically for safety use cases, including content moderation, policy compliance checking, and adversarial-input detection. The company said it intends gpt-oss to serve as a “foundation for the ecosystem” rather than a competitive threat to its commercial API, betting that broad adoption of its architecture will pull developers toward its proprietary tooling and infrastructure.


Architecture

Mercury 2: Diffusion-Based Reasoning Hits 1,000 Tokens Per Second

Inception Labs released Mercury 2, the first reasoning LLM built on a diffusion architecture rather than sequential token prediction. The model generates responses through parallel refinement, achieving over 1,000 tokens per second — more than ten times faster than Claude 4.5 Haiku Reasoning at approximately 89 tokens per second and GPT-5 Mini at around 71 tokens per second.

Benchmark scores are competitive with established reasoning models: 91.1 on AIME 2025, 73.6 on GPQA Diamond, and 67.3 on LiveCodeBench. The inference cost advantage is equally striking — Inception claims Mercury 2 can deliver frontier-adjacent reasoning at a fraction of the per-token cost of autoregressive alternatives, making it particularly attractive for high-throughput applications where latency is the binding constraint.
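The article does not describe Mercury 2’s internals, but the best-known diffusion approach for text, masked-diffusion decoding, can be sketched as iterative parallel unmasking: start from an all-mask draft and commit several high-confidence tokens per step, instead of one token per step left to right. The denoiser below is a hypothetical stand-in so the demo is self-contained; a real system would use a trained network.

```python
MASK = "_"

def toy_denoiser(draft, target, conf):
    # Hypothetical stand-in for the learned denoiser: for each masked
    # position, propose a token and a confidence. Here proposals are read
    # from `target` and confidences are fixed scores, purely so the demo
    # runs without a model.
    return {i: (target[i], conf[i]) for i, t in enumerate(draft) if t == MASK}

def diffusion_decode(target, conf, tokens_per_step=4):
    # Each step commits the `tokens_per_step` highest-confidence proposals
    # in parallel — the core contrast with autoregressive decoding, which
    # commits exactly one token per step.
    draft = [MASK] * len(target)
    steps = 0
    while MASK in draft:
        proposals = toy_denoiser(draft, target, conf)
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        for i in best[:tokens_per_step]:
            draft[i] = proposals[i][0]
        steps += 1
    return draft, steps

target = list("diffusion models decode in parallel")
conf = [(i * 7) % 13 for i in range(len(target))]  # arbitrary fixed scores
out, steps = diffusion_decode(target, conf)
print(steps)  # 9 steps for a 35-token sequence vs 35 autoregressive steps
```

Committing four tokens per step cuts 35 sequential steps to nine, which is the mechanism behind the throughput numbers above, whatever Mercury 2’s exact recipe turns out to be.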

The release challenges a foundational assumption of the current generation of AI systems: that sequential, left-to-right token generation is the only viable path to high-quality reasoning. If Mercury 2’s performance holds up under independent evaluation, diffusion-based language models could open a second front in the architecture wars that have so far been dominated by the transformer paradigm.

Robotics

Google Absorbs Intrinsic, Pursues ‘Android for Robots’ Vision

Google has officially absorbed Intrinsic — the robotics software company that spent five years incubating inside Alphabet’s X moonshot division — directly into Google’s core business. The integration combines Intrinsic’s Flowstate manufacturing platform with DeepMind AI models and Google Cloud infrastructure, creating what the company describes as an end-to-end stack for intelligent robotic systems.

CEO Sundar Pichai compared the strategic vision to “Android for robotics” — a software layer that abstracts away hardware differences and allows developers to build applications that run across a wide range of robotic platforms. The analogy is deliberate: just as Android gave Google a dominant position in mobile by becoming the default operating system for non-Apple devices, the company is betting that a similar platform play in robotics could define the next decade of physical-world AI.

The move follows a series of aggressive robotics investments across Alphabet, including DeepMind’s integration of Gemini into Boston Dynamics’ Atlas humanoid robots and the high-profile poaching of Boston Dynamics’ former CTO. Taken together, the signals point to a company that views embodied AI not as a research curiosity but as a core business priority on par with search and cloud.

Models & Benchmarks

Mistral Large 3 Ships at 675B Parameters, Matches 92% of GPT-5.2 at 15% Cost

Mistral released Large 3, anchored by a 675-billion-parameter Mixture-of-Experts model that achieves 92% of GPT-5.2’s benchmark performance at roughly 15% of the inference cost. The release cements Mistral’s position as the leading European open-weights provider and challenges assumptions about the cost ceiling for frontier-class performance.

The economics are the headline: enterprises that have been priced out of GPT-5.2’s API tier can now access near-frontier reasoning and generation quality at a cost structure that makes previously impractical applications viable. Mistral has positioned Large 3 as the model for organizations that need 90th-percentile capability without 99th-percentile budgets — a segment of the market that has been underserved as the leading labs have focused their commercial efforts on the top of the performance curve.

The MoE architecture means that despite its 675B total parameter count, only a fraction of the model is active for any given inference pass, keeping per-token costs dramatically lower than a comparably sized dense model. Mistral has published full weights alongside the announcement, continuing its commitment to open access as a competitive differentiator against OpenAI, Google, and Anthropic, all of which keep their largest models proprietary.
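The active-parameter arithmetic is easy to sketch, though the expert count and routing top-k below are illustrative assumptions; the article gives only the 675B total.

```python
def moe_active_params(expert_params_b: float, shared_params_b: float,
                      num_experts: int, top_k: int) -> float:
    """Parameters touched per token in a Mixture-of-Experts model:
    shared (attention/embedding) weights plus only the top-k routed
    experts. All figures in billions."""
    return shared_params_b + expert_params_b * top_k / num_experts

# Illustrative assumptions only; Mistral has not published Large 3's
# expert count, routing top-k, or shared/expert split.
total = 675.0
shared = 25.0             # assumed non-expert parameters
experts = total - shared  # parameters held in expert FFNs
active = moe_active_params(experts, shared, num_experts=64, top_k=4)
print(round(active, 1))  # 65.6B active of 675B total per token
```

Under these assumptions, each token touches roughly a tenth of the model, which is the lever that keeps per-token cost closer to a ~65B dense model than a 675B one.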


Society & Regulation

London’s “March Against the Machines” Is Largest Anti-AI Protest Yet

On February 28, hundreds of protesters marched through London’s King’s Cross tech corridor — home to UK offices of OpenAI, Meta, and Google DeepMind — in what organizers called the largest anti-AI demonstration ever held in Britain. The coalition spanned Pause AI, climate activists, and disability advocates, ending with a People’s Assembly drafting demands for the UK government.

MIT Technology Review embedded with the march and found the crowd united less by a single grievance than by a shared sense that AI deployment is outpacing democratic accountability. The breadth of the coalition — spanning labor organizers worried about automation, environmental groups concerned about data center energy consumption, and disability advocates fighting algorithmic bias in benefits decisions — suggests that opposition to unchecked AI deployment is coalescing into a broader political movement rather than remaining siloed in specialist policy circles.


Model Release

xAI Ships Grok 4.20 Beta 2 with Four-Agent Architecture

xAI released Grok 4.20 Beta 2 on March 3, featuring a four-agent parallel processing system. A coordinating Grok model routes queries to specialized sub-agents: Harper for real-time X data and fact-checking, Benjamin for logic and coding, and Lucas for creative reasoning. New capabilities include medical document photo analysis and improved engineering reasoning.

Separately, xAI faces legal pressure from the California Attorney General over Grok’s “spicy mode” generating nonconsensual deepfakes — a regulatory confrontation that highlights the tension between xAI’s permissive approach to content generation and the emerging legal consensus around synthetic media harms.


In Brief

vLLM Ships Major Release with 30% Throughput Gains

The vLLM project shipped a major release with async scheduling and pipeline parallelism delivering 30.8% end-to-end throughput improvement. A new WebSocket-based Realtime API enables streaming audio, and a major XPU platform overhaul adds MoE and FP8 kernel support. The release also patches CVE-2026-0994.
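The async-scheduling gain comes from overlapping CPU-side batch preparation with accelerator execution. A toy asyncio sketch of the idea follows; the names and timings are illustrative, not vLLM’s actual API.

```python
import asyncio

log = []

async def schedule(i):
    # Stand-in for CPU-side work: assembling batch i's metadata.
    log.append(f"sched{i}")
    await asyncio.sleep(0.01)
    log.append(f"sched{i}-done")

async def execute(i):
    # Stand-in for the accelerator forward pass on batch i.
    log.append(f"exec{i}")
    await asyncio.sleep(0.03)
    log.append(f"exec{i}-done")

async def run(n):
    # Async scheduling: start preparing batch i+1 while batch i is still
    # executing, so scheduler latency is hidden behind compute.
    pending = asyncio.create_task(schedule(0))
    for i in range(n):
        await pending
        running = asyncio.create_task(execute(i))
        if i + 1 < n:
            pending = asyncio.create_task(schedule(i + 1))
        await running

asyncio.run(run(2))
# sched1 begins before exec0 finishes: scheduling overlapped execution.
assert log.index("sched1") < log.index("exec0-done")
```

In a serial loop the scheduler’s time adds directly to each step; overlapped as above, it disappears whenever compute takes longer than scheduling, which is the shape of gain the 30.8% figure describes.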

Google Launches Developer Knowledge API and MCP Server

Google released a public preview of the Developer Knowledge API with an official Model Context Protocol server, giving AI assistants a canonical, machine-readable gateway to all Google developer docs. Documentation is re-indexed within 24 hours of any update.

NIST AI Agent Standards — Public Comment Deadline March 9

NIST’s Center for AI Standards and Innovation launched an AI Agent Standards Initiative targeting interoperability, identity, and security for autonomous agents. The Request for Information on AI Agent Security has a public comment deadline of March 9, 2026 — days away.

DEFIANCE Act Creates Federal Right to Sue Over AI Deepfakes

The Senate unanimously passed the DEFIANCE Act, creating a federal civil right of action for victims of nonconsensual AI-generated intimate imagery. Separately, the one-year compliance clock on the TAKE IT DOWN Act approaches its May 2026 deadline.

Google DeepMind Joins DOE Genesis for National Labs

Google DeepMind joined the Department of Energy’s Genesis program, giving all 17 U.S. National Laboratories access to frontier AI for Science models, starting with the AI co-scientist multi-agent system built on Gemini.


Trending on GitHub

Repo (Language, Stars / Growth): Description
openclaw/openclaw (Python, 253,500 / +15K): Open-source personal AI assistant with 5,700+ community skills for WhatsApp, Telegram, Slack, Discord, and Signal
bytedance/deer-flow (Python, 24,400 / +3.7K): ByteDance’s SuperAgent framework for research, coding, and creation with sandboxes and subagents
clockworklabs/SpacetimeDB (Rust, 22,200 / +2.9K): Database-as-server platform with application logic in the database for realtime/multiplayer apps
muratcankoylan/Agent-Skills-for-Context-Engineering (Python, 13,300 / +3.8K): Agent skills collection for context management and multi-agent production architectures
abhigyanpatwari/GitNexus (TypeScript, 9,536 / +6.5K): Browser-based knowledge graph creator with Graph RAG Agent for deep code exploration
KeygraphHQ/shannon (TypeScript, ~3,000 / +1.8K): Autonomous AI hacker for finding real exploits in web applications