NVIDIA Vera CPU Hits Full Production and Targets x86’s Moat

Published

June 2, 2026

NVIDIA just put its first standalone data center processor into full production, and it walked straight into territory Intel and AMD have owned since the 1990s. The NVIDIA Vera CPU, an 88-core Arm-based chip purpose-built for AI agents, is now in mass production, chief executive Jensen Huang confirmed at the GTC Taipei keynote on June 1. NVIDIA says the central processing unit (CPU, the chip that coordinates a server’s work) runs agentic workloads up to 50% faster than the best x86 server silicon from its rivals.

There is a catch that the launch-day excitement glossed over. You cannot drop a Vera chip into a generic server the way you can an Intel or AMD part. It comes only as part of NVIDIA’s own rack systems, which changes who should actually be nervous.

Why x86 Suddenly Has a Problem in the AI Rack

For the better part of three decades the server room has run on x86, the instruction set behind Intel’s Xeon and AMD’s EPYC lines. NVIDIA owned the accelerator slot with its graphics processors, but the host chip that fed those accelerators was almost always somebody else’s. Vera is NVIDIA’s move to take that slot too.

The targeting is not subtle. In its own technical material, NVIDIA benchmarks Vera against AMD’s EPYC “Turin” and Intel’s Xeon 6 “Granite Rapids,” the two flagship x86 server families, and claims a 50% edge on agentic sandbox work. Investors read the message quickly. On the day of the keynote, with NVIDIA also unveiling a new PC chip, Intel shares fell about 6% and AMD about 5% at the open.

What makes the threat credible is that NVIDIA is not chasing the whole server market. It is going after the fastest-growing, highest-margin slice of it, the racks being built to run AI, where it already controls the most valuable component.

88 cores of custom Arm silicon per Vera socket, branded the Olympus core.
1.2 TB/s of memory bandwidth, roughly 14 gigabytes per second per core.
Up to 1.5 TB of LPDDR5X memory attached to a single processor.

NVIDIA Vera CPU mass production challenges Intel and AMD x86 server chips.

What Sits Inside the 88-Core Olympus Design

NVIDIA built Vera from a custom core it calls Olympus, which is compatible with the Arm v9.2 instruction set but is NVIDIA’s own design rather than an off-the-shelf Arm blueprint. Each chip carries 88 of these cores on a single monolithic compute die, with separate dielets handling memory and input/output. NVIDIA pairs that with its Spatial Multithreading feature, letting the chip flex between throughput and per-thread speed depending on the job.

The headline engineering number is bandwidth. Vera moves data at 1.2 terabytes per second and, according to NVIDIA’s Vera CPU engineering breakdown, sustains more than 90% of that peak under heavy load. That matters because agent software thrashes memory constantly, jumping between code execution, tool calls and database lookups rather than grinding through one neat calculation.
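
As a quick sanity check on those figures, the arithmetic can be sketched in Python. The inputs are the numbers quoted in this article (1.2 TB/s peak, 88 cores, more than 90% sustained). The decimal unit conversion and the even split across cores are simplifying assumptions made here, not NVIDIA’s methodology.

```python
# Back-of-envelope check of the bandwidth figures quoted in the article.
# Assumptions (ours, not NVIDIA's): decimal units and an even split across cores.

PEAK_BANDWIDTH_TBPS = 1.2   # peak memory bandwidth in TB/s (from the article)
CORES = 88                  # Olympus cores per Vera chip (from the article)
SUSTAINED_FRACTION = 0.90   # "more than 90%" of peak, treated as a lower bound

GB_PER_TB = 1000            # decimal units, an assumption


def per_core_gbps(total_tbps: float, cores: int) -> float:
    """Bandwidth share per core in GB/s, assuming an even split."""
    return total_tbps * GB_PER_TB / cores


peak_per_core = per_core_gbps(PEAK_BANDWIDTH_TBPS, CORES)
sustained_tbps = PEAK_BANDWIDTH_TBPS * SUSTAINED_FRACTION
sustained_per_core = per_core_gbps(sustained_tbps, CORES)

print(f"Peak per core:      {peak_per_core:.1f} GB/s")
print(f"Sustained floor:    {sustained_tbps:.2f} TB/s")
print(f"Sustained per core: {sustained_per_core:.1f} GB/s")
```

Under those assumptions the peak share comes to about 13.6 GB/s per core, in line with the "roughly 14" figure cited earlier, and the 90% floor works out to about 1.08 TB/s for the chip.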

A second-generation Scalable Coherency Fabric ties the cores together at 3.4 TB/s of bisection bandwidth, alongside a neural branch predictor that NVIDIA says can evaluate two taken branches per cycle. The memory itself comes as LPDDR5X modules, the low-power standard, which keeps energy draw down while feeding all 88 cores.

Strip away the spec sheet and the design choice is plain. This is a chip built to keep busy, not to win single-threaded races against a desktop part.

Vera Ships as a Rack, Not as a Socket

Here is where the disruption narrative needs a brake. You will not find Vera sold as a loose chip you slot into a Dell or Supermicro server of your choosing. It ships inside NVIDIA’s own integrated rack systems, which is a fundamentally different product from the merchant CPUs Intel and AMD sell by the tray.

The flagship home for Vera is the liquid-cooled Vera Rubin NVL72 platform, which pairs 36 Vera chips with 72 Rubin graphics processors over a sixth-generation NVLink interconnect. NVIDIA also offers a CPU-centric Vera rack for general compute, but in both cases you are buying an NVIDIA system, not a component for somebody else’s box.
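
The rack-level totals implied by those counts can be worked out with a few multiplications. The sketch below uses only figures quoted in this article. The derived totals are our own arithmetic rather than published NVIDIA specifications, and the memory figure assumes every chip carries the full "up to 1.5 TB", which the article does not state.

```python
# Rack-level totals derived from figures quoted in the article.
# These are simple multiplications, not published NVIDIA specifications.

VERA_CPUS_PER_RACK = 36       # Vera chips in a Vera Rubin NVL72
RUBIN_GPUS_PER_RACK = 72      # Rubin graphics processors in the same rack
CORES_PER_CPU = 88            # Olympus cores per Vera chip
MAX_MEMORY_TB_PER_CPU = 1.5   # "up to" LPDDR5X per chip; full population is assumed

gpus_per_cpu = RUBIN_GPUS_PER_RACK / VERA_CPUS_PER_RACK
total_cores = VERA_CPUS_PER_RACK * CORES_PER_CPU
cores_per_gpu = total_cores / RUBIN_GPUS_PER_RACK
max_cpu_memory_tb = VERA_CPUS_PER_RACK * MAX_MEMORY_TB_PER_CPU

print(f"GPUs per Vera CPU:       {gpus_per_cpu:.0f}")
print(f"CPU cores per rack:      {total_cores}")
print(f"CPU cores per GPU:       {cores_per_gpu:.0f}")
print(f"Upper bound, CPU memory: {max_cpu_memory_tb:.0f} TB")
```

On these numbers each Vera chip hosts two Rubin processors, and the rack carries 3,168 CPU cores, or 44 per graphics processor.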

Attribute	NVIDIA Vera	AMD EPYC Turin	Intel Xeon 6 Granite Rapids
Architecture	Custom Arm v9.2 (Olympus)	x86-64	x86-64
Memory	LPDDR5X, up to 1.5 TB	DDR5	DDR5
Sold as	NVIDIA rack system	Merchant socket chip	Merchant socket chip
NVIDIA’s pitch	Up to 50% faster on agents	Baseline	Baseline

So the part of the x86 business most exposed is narrow: the host CPUs that go into AI training and inference racks. The millions of general-purpose servers running web apps, databases and enterprise software stay on familiar ground for now.

The Agentic Bottleneck NVIDIA Says It Solved

The whole pitch rests on a shift in how AI actually runs. Models have moved from answering a prompt and stopping to acting as agents that write code, call tools and check sandboxed results in a loop. Each of those steps lands on the CPU, not the graphics chip.

The Coordination Problem

Graphics processors handle the parallel math of training and inference. But an agent spends much of its time on serial housekeeping: launching a Python runtime, querying a database, parsing a result, deciding the next move. When the host chip stalls, the expensive accelerators sit idle waiting for instructions. NVIDIA’s argument is that a slow CPU now wastes graphics capacity that costs far more.
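
A toy model makes that argument concrete. The sketch below treats one agent loop as a serial CPU phase followed by a graphics phase with no overlap, which is a deliberate simplification: real systems batch and pipeline work across many agents. The timings are hypothetical placeholders, and reading "50% faster" as a 1.5x cut in CPU time is our assumption.

```python
# Toy model of one agent loop: a serial CPU phase, then a GPU phase.
# All timings are hypothetical placeholders, not measurements.


def gpu_utilization(cpu_seconds: float, gpu_seconds: float) -> float:
    """Fraction of each loop the GPU is busy when the phases do not overlap."""
    return gpu_seconds / (cpu_seconds + gpu_seconds)


def faster_cpu_time(cpu_seconds: float, speedup: float) -> float:
    """CPU phase duration after a speedup (1.5 is read here as '50% faster')."""
    return cpu_seconds / speedup


BASE_CPU_SECONDS = 3.0   # hypothetical time spent on serial housekeeping
GPU_SECONDS = 2.0        # hypothetical time spent on model inference
SPEEDUP = 1.5            # NVIDIA's "up to 50%" claim, taken at face value

before = gpu_utilization(BASE_CPU_SECONDS, GPU_SECONDS)
after = gpu_utilization(faster_cpu_time(BASE_CPU_SECONDS, SPEEDUP), GPU_SECONDS)

print(f"GPU busy before: {before:.0%}")
print(f"GPU busy after:  {after:.0%}")
```

With these made-up inputs the accelerator goes from busy 40% of the time to 50%. The point is the direction of the effect, not the size: how much a faster host chip helps depends on how CPU-bound the loop really is.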

Sandboxes by the Thousand

NVIDIA measures Vera in “sandboxes,” the isolated environments where an agent safely runs untrusted code. It claims a Vera rack delivers four times the capacity and twice the performance per watt of an x86 rack, with room for more than 22,500 sandboxes in a single rack. At the system level, the company says the Rubin platform cuts cost per token tenfold and trains mixture-of-experts models, the sparse architecture behind many frontier systems, with one-fourth as many graphics chips as the prior generation.

“Rubin arrives at exactly the right moment, as AI computing demand for both training and inference is going through the roof.”

That was Huang in NVIDIA’s launch material, framing the platform that Vera anchors. The quote is a sales line, but the underlying bet is real: agents change the math of the data center, and whoever controls the host chip captures some of that value.

Who Lined Up to Buy First

The customer roster is the strongest evidence that this is more than a spec-sheet flex. NVIDIA named a long list of cloud providers and AI labs queued up for Rubin systems, with the first deliveries expected in the second half of 2026. The same June 1 keynote also introduced an Arm chip for PCs, a separate push detailed in our coverage of NVIDIA’s RTX Spark move into Windows on Arm.

According to NVIDIA’s Rubin platform announcement, the early names break down across three groups:

Cloud providers: AWS, Google Cloud, Microsoft, Oracle Cloud Infrastructure, CoreWeave, Lambda, Nebius and Nscale.
AI labs: Anthropic, OpenAI, xAI, Meta, Mistral AI, Cohere and Perplexity, among others.
Hardware builders: Cisco, Dell Technologies, HPE, Lenovo and Supermicro, the same OEMs (original equipment manufacturers, the firms that assemble finished systems) that already build NVIDIA racks.

That last group is worth noting, because Dell, HPE and Lenovo also sell plenty of x86 servers. They are now shipping systems that displace some of their own Intel and AMD volume, a sign of where they think the money is heading.

What the x86 Camp Still Holds

None of this means Intel and AMD are cornered. Their moat was never just performance; it was the enormous base of software, tooling and operators built around x86 over three decades. An Arm chip locked inside an NVIDIA rack does not touch the bank, the airline or the retailer running its core systems on EPYC and Xeon.

The merchant model is also a real advantage. A buyer who wants choice can mix and match x86 chips, motherboards and accelerators from different vendors. Vera offers no such flexibility; it is NVIDIA’s rack or nothing, and some enterprises will pay a premium specifically to avoid that kind of single-vendor lock-in. European challengers are circling the same opening, as our report on Axelera’s funding round to take on NVIDIA shows.

What changed on June 1 is that x86’s grip on the AI host slot, once treated as automatic, is now contested by the company that supplies the accelerators those hosts feed. The disruption is genuine, but it is fenced inside the AI factory rather than loose across the whole server market.

If Rubin ships on schedule in the second half of the year and the agentic workloads keep growing, NVIDIA captures a slice of the host market it never held before. If supply slips or the cheaper x86 path proves good enough for most agent work, Intel and AMD keep the slot they have defended since the last century.

Frequently Asked Questions

When will the NVIDIA Vera CPU be available to buy?

NVIDIA says Vera-based systems will be available from partners in the second half of 2026. The chip ships inside finished rack systems built by OEMs including Cisco, Dell, HPE, Lenovo and Supermicro rather than as a standalone retail part.

Can you buy the Vera CPU without NVIDIA GPUs?

Partly. Vera anchors the GPU-paired Vera Rubin NVL72 platform, but NVIDIA also offers a CPU-centric Vera rack aimed at general compute and agent sandboxes. In both cases the chip comes inside an NVIDIA rack system, not as a loose socket part you fit to a third-party server.

How does Vera compare to Intel Xeon and AMD EPYC?

NVIDIA benchmarks Vera against AMD’s EPYC Turin and Intel’s Xeon 6 Granite Rapids and claims up to 50% faster performance on agentic sandbox workloads. Vera uses a custom Arm v9.2 architecture, while both rivals use x86, so software compatibility differs.

What is the Olympus core?

Olympus is NVIDIA’s custom CPU core, compatible with the Arm v9.2 instruction set rather than based on an off-the-shelf Arm design. Each Vera chip packs 88 Olympus cores on a single monolithic compute die.

What is the Vera Rubin NVL72?

It is NVIDIA’s liquid-cooled data center platform that combines 72 Rubin graphics processors with 36 Vera CPUs over a sixth-generation NVLink interconnect. NVIDIA says the system cuts cost per token tenfold versus the prior generation.