Nemotron 3: Nvidia's Open-Source Gambit
Nvidia unveiled Nemotron 3 Ultra, a very large open-source AI model. It has 550 billion total parameters, of which up to 55 billion are active per token, and it is built on a hybrid architecture that combines Mamba state-space model (SSM) layers with a Transformer Mixture-of-Experts (MoE) design. Because only a fraction of the weights runs for each token, this combination is aimed at high processing speed and model efficiency.
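To put the two parameter counts in perspective, here is a minimal back-of-the-envelope sketch. It uses only the 550 billion and 55 billion figures quoted above; the function names are illustrative, and the "dense equivalent" comparison assumes that per-token compute scales roughly with the number of active parameters, which is a simplification rather than an Nvidia claim.

```python
# Figures quoted in the article for Nemotron 3 Ultra.
TOTAL_PARAMS = 550e9   # total parameters
ACTIVE_PARAMS = 55e9   # upper bound on parameters active per token


def active_fraction(total: float, active: float) -> float:
    """Share of the model's weights that participate in a single token."""
    return active / total


def dense_equivalent_ratio(total: float, active: float) -> float:
    """How many times more weights a dense model of the same total size
    would touch per token (assumption: compute scales with active weights)."""
    return total / active


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
ratio = dense_equivalent_ratio(TOTAL_PARAMS, ACTIVE_PARAMS)

print(f"Active share per token: {frac:.0%}")
print(f"Dense model of equal size touches about {ratio:.0f}x more weights per token")
```

In other words, at most about one tenth of the model is exercised for any given token, which is the basic reason a sparse MoE design can be cheaper to run than a dense model of the same total size.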
Nvidia demonstrates its commitment to open AI development with Nemotron 3 Ultra. Rather than publishing weights alone, Nvidia provides a comprehensive open-source package, releasing:
- Model weights
- Training scripts
- Full dataset
This transparency allows developers worldwide to freely inspect, build upon, and customize the model, fostering innovation across the AI community.
Jensen Huang underscored Nemotron 3 Ultra's performance. He said it runs inference 5x faster and 30% cheaper than the world's best open models, including the most cost-effective ones. This architectural leap lets AI agents execute more complex, longer-running tasks at the same cost, effectively allowing them to "think longer" and more deeply within existing budgets.
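The "think longer within existing budgets" claim follows from simple arithmetic. The sketch below assumes that "30% cheaper" refers to cost per generated token and that "5x faster" refers to token throughput; the article does not spell out either definition, and the function names are illustrative.

```python
def tokens_for_same_budget(cost_reduction: float) -> float:
    """Multiplier on tokens affordable at a fixed budget when the
    cost per token drops by `cost_reduction` (0.30 means 30% cheaper)."""
    if not 0.0 <= cost_reduction < 1.0:
        raise ValueError("cost_reduction must be in [0, 1)")
    return 1.0 / (1.0 - cost_reduction)


def tokens_for_same_time(speedup: float) -> float:
    """Multiplier on tokens generated in a fixed wall-clock window
    when throughput improves by `speedup` (5.0 means 5x faster)."""
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return speedup


budget_multiplier = tokens_for_same_budget(0.30)
time_multiplier = tokens_for_same_time(5.0)

print(f"Same budget buys about {budget_multiplier:.2f}x the tokens")
print(f"Same wall-clock time yields about {time_multiplier:.0f}x the tokens")
```

Under these assumptions, the cost figure is the tighter constraint: an agent with a fixed budget gets roughly 1.4x as many tokens to reason with, even though it could generate them about five times faster.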
Vera: The CPU Built For Your AI Assistant
Traditional CPUs present a significant bottleneck for AI, particularly within the 'agentic loop' where a CPU must efficiently manage and feed data to powerful GPUs. This traditional architecture, designed for a different era, directly impedes GPU utilization, throttling token throughput, increasing latency, and degrading user experience in sophisticated AI applications.
Nvidia unveiled Vera, a CPU purpose-built for the age of agents. At its core lies the custom Olympus Core, engineered for modern data center workloads like branch-heavy Python runtimes and sandbox code execution. A scalable coherency fabric unifies all 88 Olympus cores on a monolithic mesh, enabling 50% faster core-to-core communication than traditional chiplet designs. Vera is also the first CPU to integrate LPDDR5X memory, delivering 40% lower peak memory latency compared to x86, crucial for timely data retrieval and analytics.
Vera delivers 1.8 times the agentic sandbox performance of x86 CPUs. This substantial boost directly translates to higher token throughput and a superior user experience for complex AI applications. Tightly coupled with Rubin GPUs via memory-coherent NVLink chip-to-chip connections, Vera ensures accelerated workflows. Vera BlueField 4 STX further powers context memory and AI storage, providing a complete compute, networking, and storage solution for the age of agents.
Cosmos 3: The AI That Understands Reality
Nvidia unveiled Cosmos 3, an omnimodal world model designed to redefine Physical AI. This foundational system grants robots and autonomous vehicles a profound understanding of the physical world by processing a rich tapestry of data: video, sound, text, and critical action inputs. It establishes a robust, holistic perception of reality for intelligent agents.
Cosmos 3 operates at very large scale, trained on 20 trillion tokens of multimodal data. Its training corpus encompasses nearly 4 billion images and 400 million real and synthetic videos, alongside vast sound, text, and action datasets. This lets Cosmos 3 go beyond mere observation: it outputs "action data," enabling systems to predict outcomes and make sophisticated decisions. The approach unifies traditional world models and action models into a single framework.
Nvidia bolsters its commitment to open innovation by making Cosmos 3 an open model. Developers can readily access its weights on Hugging Face and the complete source code on GitHub. This democratizes access to a powerful starting point for advancements in robotics, complex simulations, and autonomous systems, directly accelerating the broader physical AI revolution. For deeper insights into Nvidia's agentic AI ecosystem, including the Vera CPU, refer to "NVIDIA Unveils Vera, the CPU for Agents."
Your Next PC is an AI Agent
Nvidia and Microsoft are partnering to "reinvent the PC for the first time in 40 years," introducing the groundbreaking RTX Spark superchip. This collaboration marks a profound shift, transforming the personal computer from a device that merely executes applications into one that hosts and runs native AI agents seamlessly. This initiative fundamentally redefines the user experience and the very purpose of personal computing.
Spark's specifications are nothing short of monstrous, designed for unparalleled local AI capabilities. A single chip fuses a powerful Blackwell RTX GPU, boasting an immense 6,144 CUDA cores, with a custom 20-core Grace CPU. This integrated powerhouse delivers a staggering one petaFLOP of AI performance, all backed by a massive 128GB of unified memory, eliminating traditional data bottlenecks and enabling complex AI tasks.
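For a rough sense of what 128GB of unified memory means for local models, the sketch below estimates how many parameters' worth of weights would fit at a few common precisions. These are assumptions for illustration, not Nvidia figures: it treats 1GB as 10^9 bytes, counts weights only, and ignores the memory needed by the operating system, activations, and the context cache, so real limits would be lower.

```python
# Unified memory quoted in the article; 1GB taken as 10^9 bytes (assumption).
UNIFIED_MEMORY_BYTES = 128e9

# Storage cost per parameter at common weight precisions.
BYTES_PER_PARAM = {
    "16-bit": 2.0,
    "8-bit": 1.0,
    "4-bit": 0.5,
}


def max_params_that_fit(memory_bytes: float, bytes_per_param: float) -> float:
    """Upper bound on parameter count if memory held nothing but weights."""
    if bytes_per_param <= 0.0:
        raise ValueError("bytes_per_param must be positive")
    return memory_bytes / bytes_per_param


for precision, size in BYTES_PER_PARAM.items():
    limit = max_params_that_fit(UNIFIED_MEMORY_BYTES, size)
    print(f"{precision} weights: up to about {limit / 1e9:.0f} billion parameters")
```

Even under these generous assumptions, the ceiling sits well below a 550-billion-parameter model, which suggests that on-device agents would rely on considerably smaller models than the largest data center ones.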
Future PCs will run personal AIs continuously and securely, operating entirely within a local sandbox environment directly on the device. This ensures both robust privacy and always-on functionality for individual users, empowering them with intelligent assistance without relying on the cloud. RTX Spark provides the essential hardware foundation for a new Windows platform, purpose-built to enable this profound paradigm shift towards pervasive, agentic personal computing.
Frequently Asked Questions
What is Nvidia Nemotron 3 Ultra?
Nemotron 3 Ultra is Nvidia's new, completely open-source large language model with 550 billion parameters. It's designed to be 5x faster and 30% cheaper to run than comparable open models.
Why did Nvidia create the Vera CPU?
Nvidia created the Vera CPU specifically for the 'age of agents.' It's designed to eliminate the performance bottleneck of traditional CPUs in AI workflows, acting as a conductor for GPU-heavy tasks.
What is Nvidia Cosmos 3 used for?
Cosmos 3 is an open foundation model for 'physical AI.' It helps robots, self-driving cars, and other physical systems understand, predict, and act within the real world using multimodal data.
What is RTX Spark?
RTX Spark is a new 'superchip' developed by Nvidia and Microsoft to reinvent the PC for the AI era. It combines a powerful Blackwell RTX GPU and a Grace CPU to run sophisticated AI agents locally on your computer.