NVIDIA Launches Nemotron 3 Super: Open Model for Agentic AI

NVIDIA has introduced Nemotron 3 Super, a large language model engineered for "agentic AI": autonomous systems that can reason, plan, and execute multi-step tasks on their own. The model has 120 billion parameters but is optimized for efficiency, and NVIDIA says it raises both throughput and accuracy for the next generation of AI agents.

What It's For: Powering Autonomous AI Agents

Imagine an AI system that doesn't just answer questions but can autonomously perform a series of complex actions, learn from its environment, and adapt to new information. That's agentic AI, and it's where much of current AI development is headed. These systems face two significant hurdles: "context explosion" and the "thinking tax," the cost of having a large model reason at every step of a workflow.

Multi-agent workflows often generate massive amounts of context because they repeatedly resend full conversation histories, tool outputs, and intermediate reasoning steps. This quickly becomes expensive and slow, and it can lead to "goal drift," where the agent loses sight of its original objective. Nemotron 3 Super addresses this with a 1-million-token context window, which lets agents keep their entire workflow state in memory.
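To see why resending full histories gets expensive, here is a minimal sketch, not NVIDIA's own accounting. It assumes every turn adds a fixed number of tokens and every call resends everything accumulated so far. The function names and the example figures (50 turns, 2,000 tokens per turn) are illustrative assumptions; only the 1-million-token window comes from the article.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, the window size cited in the article


def cumulative_tokens_sent(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every turn resends the full history so far."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn  # the history grows by one turn
        total += history            # and the whole history is sent again
    return total


def fits_in_window(turns: int, tokens_per_turn: int,
                   window: int = CONTEXT_WINDOW) -> bool:
    """True if the full history after `turns` still fits in one context window."""
    return turns * tokens_per_turn <= window


if __name__ == "__main__":
    # Hypothetical workload: 50 turns of 2,000 tokens each.
    print(cumulative_tokens_sent(50, 2_000))  # 2550000
    print(fits_in_window(50, 2_000))          # True
```

Under these assumptions the final history is only 100,000 tokens, yet the agent has sent more than 2.5 million input tokens in total, because the cost grows roughly with the square of the number of turns.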

The long context has practical consequences. A software development agent can load an entire codebase into context, enabling end-to-end code generation and debugging without splitting the code into fragments. In financial analysis, an agent can load thousands of pages of reports into memory, so it does not have to re-reason across lengthy conversations. The model's high-accuracy tool calling also helps agents navigate massive function libraries reliably, preventing execution errors in critical environments such as cybersecurity orchestration. Companies including Perplexity, CodeRabbit, Greptile, Edison Scientific, Lila Sciences, Amdocs, Palantir, Cadence, Dassault Systèmes, and Siemens are already integrating Nemotron 3 Super into their agentic AI applications, from search and software development to life sciences and manufacturing.

Why It Matters: Unprecedented Performance and Openness

Nemotron 3 Super combines a new architecture with strong reported performance. It is a hybrid mixture-of-experts (MoE) model that uses only 12 billion active parameters at inference time and is optimized for NVIDIA Blackwell. According to NVIDIA, it delivers up to 5x higher throughput and up to 2x higher accuracy than its predecessor.
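The two parameter counts can be put side by side with simple arithmetic. The sketch below uses only the figures stated in this article (120 billion total, 12 billion active); the function name is hypothetical, and the ratio says nothing by itself about real-world speed or memory use.

```python
TOTAL_PARAMS = 120_000_000_000   # total parameters, as stated in the article
ACTIVE_PARAMS = 12_000_000_000   # parameters active at inference time


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters that take part in an inference step."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"{frac:.0%} of parameters active")  # 10% of parameters active
```

In other words, about one parameter in ten is used at inference time, which is the basic reason a mixture-of-experts model can be large in capacity while staying cheaper to run than a dense model of the same size.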

Under the hood, the model combines Mamba layers, which provide 4x higher memory and compute efficiency, with transformer layers for advanced reasoning. A new "Latent MoE" technique further improves accuracy by activating four expert specialists for the cost of one to generate the next token. "Multi-Token Prediction" lets the model predict multiple future tokens simultaneously, resulting in 3x faster inference. When run in NVFP4 precision on Blackwell, it achieves up to 4x faster inference than FP8 on NVIDIA Hopper, which NVIDIA says comes with no loss in accuracy.
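For readers new to mixture-of-experts models, the sketch below shows generic top-k expert routing in plain Python. It is not NVIDIA's Latent MoE implementation, whose details this article does not describe. The router scores, the number of experts, and the function name are all illustrative assumptions; only the choice of four active experts echoes the text.

```python
import math


def top_k_route(router_logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Returns a mapping of expert index -> mixing weight (weights sum to 1).
    """
    if not 0 < k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected experts only (shifted for numerical stability).
    peak = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    norm = sum(exps.values())
    return {i: e / norm for i, e in exps.items()}


if __name__ == "__main__":
    # Hypothetical router scores for six experts on one token.
    weights = top_k_route([0.1, 2.0, -1.0, 1.5, 0.3, 1.9], k=4)
    for expert, weight in weights.items():
        print(f"expert {expert}: weight {weight:.3f}")
```

Only the selected experts run for a given token, and their outputs are blended using these weights. The remaining experts are skipped, which is how the active parameter count stays well below the total.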

The model has already posted strong results, claiming the top spot on Artificial Analysis for efficiency and openness and achieving leading accuracy among models of its size. It also powers the NVIDIA AI-Q research agent, which holds the #1 position on the DeepResearch Bench II Leaderboard, a sign of its ability to conduct thorough, multi-step research.

NVIDIA is releasing Nemotron 3 Super with open weights under a permissive license. The release includes the complete methodology, more than 10 trillion tokens of training data, 15 training environments for reinforcement learning, and evaluation recipes, so developers can customize and deploy the model freely. You can explore the broader family of models at NVIDIA Nemotron Foundation Models.

Where You Get It: Widespread Availability

NVIDIA Nemotron 3 Super is available now through a broad ecosystem of deployment options. Developers can access the model directly at build.nvidia.com. It is also available through Perplexity, OpenRouter, and Hugging Face.

For enterprises and developers who need production deployment, Nemotron 3 Super is available from leading cloud service providers, including Google Cloud’s Vertex AI and Oracle Cloud Infrastructure. Support on Amazon Web Services (through Amazon Bedrock) and Microsoft Azure is coming soon. NVIDIA Cloud Partners such as CoreWeave, Crusoe, Nebius, and Together AI, along with inference service providers such as Baseten, Cloudflare, DeepInfra, Fireworks AI, Inference.net, Lightning AI, Modal, and FriendliAI, are also offering the model. It is packaged as an NVIDIA NIM microservice, enabling flexible deployment from on-premises systems to the cloud.

Read more: "New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI" offers a deep dive into Nemotron 3 Super's capabilities and ecosystem integrations.
