Nvidia introduces Rubin CPX, a GPU built for AI video generation and coding


Nvidia has announced Rubin CPX, a specialized GPU the company claims is purpose-built for massive-context processing. That covers demanding jobs like large-scale coding and generative video.

The hardware is designed to separate the task of understanding an AI prompt from the task of generating a response, which Nvidia says will make the whole process more efficient for its customers.
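To make that split concrete, here is a minimal toy sketch in Python of a two-phase pipeline: one function reads the whole prompt once and builds a cache, and a second function produces output one token at a time from that cache. All names here (`KVCache`, `context_phase`, `generation_phase`) are hypothetical and the "model" is a trivial echo rule; this is not Nvidia's software or API, only an illustration of why the two phases can run on different hardware.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Hypothetical stand-in for the per-request state built from the prompt."""
    tokens: list[str] = field(default_factory=list)


def context_phase(prompt: str) -> KVCache:
    """Read the whole prompt in one pass and build the cache (the 'understanding' step)."""
    return KVCache(tokens=prompt.split())


def generation_phase(cache: KVCache, max_new_tokens: int) -> list[str]:
    """Emit tokens one at a time, extending the cache after every step."""
    prompt_len = len(cache.tokens)
    if prompt_len == 0:
        return []
    output = []
    for step in range(max_new_tokens):
        # Toy rule (assumption): echo prompt tokens in upper case.
        # A real model would run a forward pass against the cache here.
        next_token = cache.tokens[step % prompt_len].upper()
        cache.tokens.append(next_token)
        output.append(next_token)
    return output


# The two phases only share the cache, so in principle each could run
# on a different device, with the cache handed over in between.
cache = context_phase("explain this repository")
print(generation_phase(cache, max_new_tokens=4))
```

The only thing passed between the two functions is the cache, which is the property a hardware split like the one Nvidia describes would rely on.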

Rubin CPX is meant to work as part of the larger Vera Rubin platform, which combines Vera CPUs and Rubin GPUs. Nvidia claims the full-rack version, the Vera Rubin NVL144 CPX, packs 8 exaflops of AI performance.

The standalone Rubin CPX GPU carries 128GB of GDDR7 memory. Nvidia promises that the new hardware has 3x faster attention capabilities and delivers up to 30 petaflops of compute using the company's 4-bit NVFP4 precision.

Nvidia founder and CEO Jensen Huang compared it to RTX, saying, "Just as RTX revolutionized graphics and physical AI, Rubin CPX is the first CUDA GPU purpose-built for massive-context AI, where models reason across millions of tokens of knowledge at once." He also tried to quantify the return on investment for customers, suggesting that a $100 million deployment of the new hardware could generate $5 billion in revenue.
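For scale, the two figures Huang cited work out to a 50x revenue multiple. The short Python snippet below just does that division; the variable names are illustrative, and the result is revenue per dollar deployed, not profit, since the article gives no operating costs.

```python
# Figures as quoted in the article (Nvidia's own projection, not measured results).
investment_usd = 100_000_000      # $100 million deployment
revenue_usd = 5_000_000_000       # $5 billion in projected revenue

multiple = revenue_usd / investment_usd
print(f"Revenue multiple: {multiple:.0f}x")
```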

Nvidia claims that companies like Cursor and Runway are already exploring the potential of the new chip system, with Cristóbal Valenzuela, the CEO of Runway, saying it represents a "leap in performance" for creative workflows:

We see Rubin CPX as a major leap in performance, supporting these demanding workloads to build more general, intelligent creative tools. This means creators — from independent artists to major studios — can gain unprecedented speed, realism and control in their work.

According to Nvidia, the hardware will be supported by its full software stack. This includes Nemotron, its family of open, multimodal models designed for building enterprise AI agents (systems meant to handle complex tasks autonomously). The Nemotron models are offered in different sizes, from Nano for on-device applications to Super for single-GPU setups and Ultra for large data centers.

Nvidia says it expects the new chip to be available at the end of 2026.

Via: Seeking Alpha
