
NVIDIA Jetson: Run Open Generative AI Models at the Edge

Bringing Generative AI to the Edge with NVIDIA Jetson

The world of generative AI is rapidly evolving, with powerful open-source models moving beyond data centers and into the physical world. At the forefront of this shift is the NVIDIA Jetson family, including both Jetson Orin and Jetson Thor platforms, which are becoming the go-to choice for running these advanced models directly on edge devices. This allows for real-time, private, and low-latency applications that were once confined to the cloud.

Developers can now deploy a wide array of models like NVIDIA Nemotron, Cosmos, and NVIDIA Isaac GR00T, alongside community favorites such as Qwen, Gemma, Mistral AI, GPT-OSS, and PI. This capability transforms how machines interact with their environments, making them more autonomous and responsive. One key benefit is the ability to run applications like OpenClaw on any Jetson developer kit, enabling private, always-on AI assistants at the edge with zero API costs and full data privacy. These systems support open models ranging from 2 billion to an impressive 30 billion parameters, putting frontier-class AI assistance directly into the hands of users.

Real-World Impact: Edge AI in Action

The practical applications of running generative AI at the edge are already making headlines. At CES earlier this year, a striking demonstration featured a Cat 306 CR mini-excavator with an in-cab Cat AI Assistant running on NVIDIA Jetson Thor. This system leveraged NVIDIA Nemotron speech models for natural voice interactions and Qwen3 4B (served locally via vLLM) to interpret requests and generate responses with incredibly low latency, all without needing a cloud connection. This showcases the power of local processing for industrial machinery, enhancing operator guidance and safety features.
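Serving a model locally with vLLM exposes an OpenAI-compatible HTTP endpoint, so applications talk to the on-device model the same way they would talk to a cloud API. The sketch below shows how such a request might be built; the endpoint URL, port, and model name are illustrative assumptions (vLLM defaults), not details from the demonstration.

```python
# Minimal sketch: building a request for a locally served model, e.g. one
# started with `vllm serve Qwen/Qwen3-4B`. The base URL and model name are
# assumptions (vLLM's defaults), not taken from the Cat AI Assistant demo.
import json
import urllib.request

def build_chat_request(prompt, model="Qwen/Qwen3-4B",
                       base_url="http://localhost:8000/v1"):
    """Construct an HTTP request for an OpenAI-compatible local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("Raise the boom by ten degrees.")
# Actually sending it requires the local vLLM server to be running:
# resp = urllib.request.urlopen(req)
```

Because no traffic ever leaves the device, the same client code works with zero cloud connectivity, which is what makes the in-cab assistant viable on machinery in the field.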

Beyond industrial innovations, the impact extends to advanced robotics and research. Franka Robotics, for instance, demonstrated their FR3 Duo dual-arm system running the NVIDIA GR00T N1.6 model end-to-end onboard, handling everything from perception to motion without any task scripting. Similarly, the SONIC project from NVIDIA's GEAR Lab deploys humanoid controller policies on Jetson Orin, with the kinematic planner executing at approximately 12 milliseconds per pass and the policy loop at 50 Hz, all running locally. Even developer communities are leveraging this power; a UIUC SIGRobotics team won an NVIDIA embodied AI hackathon with a dual-arm matcha-making robot built on Jetson Thor, running the GR00T N1.5 model.
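The timing figures above imply a strict budget: a 50 Hz policy loop leaves 20 milliseconds per tick, so a ~12 millisecond planner pass must fit inside it with headroom to spare. The following is a generic fixed-rate loop sketch illustrating that budget arithmetic; it is not NVIDIA's SONIC code, and the function names are hypothetical.

```python
# Illustrative fixed-rate control loop, not actual SONIC code.
# At 50 Hz each tick has a 20 ms budget, so a ~12 ms planner pass fits.
import time

def run_policy_loop(step_fn, rate_hz=50.0, steps=10):
    """Call step_fn at a fixed rate; return total elapsed wall time."""
    period = 1.0 / rate_hz          # 20 ms per tick at 50 Hz
    start = time.monotonic()
    next_tick = start
    for _ in range(steps):
        step_fn()                   # planner + policy work goes here
        next_tick += period
        sleep_for = next_tick - time.monotonic()
        if sleep_for > 0:           # idle out the remainder of the budget
            time.sleep(sleep_for)
    return time.monotonic() - start

# Ten ticks at 50 Hz should take roughly 0.2 seconds of wall time.
elapsed = run_policy_loop(lambda: None, rate_hz=50.0, steps=10)
```

If the work inside `step_fn` ever exceeds the 20 ms period, the loop falls behind its schedule, which is why the ~12 ms planner pass matters: it leaves margin for the rest of the control stack.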

Getting Started and Unleashing Potential

For developers looking to explore entry-level generative AI at the edge, the NVIDIA Jetson Orin Nano 8GB is an accessible starting point. The flexibility of the Jetson platform means developers can easily switch between a variety of open models and AI frameworks, tailoring their edge solutions to almost any generative AI workload.

This paradigm shift moves AI from centralized cloud deployments, which incur latency and ongoing compute costs, to localized devices optimized for low latency, limited power, and consistent behavior. By bringing compute and memory together in a system-on-module, Jetson accelerates hardware design and simplifies sourcing. For those eager to dive deeper, model benchmarks and tutorials from the open model community are readily available at Jetson AI Lab. Whether it's powering smart home systems or automating daily tasks, Jetson offers a robust, private, and cost-effective solution for bringing generative AI to life at the edge.

Read more: NVIDIA Jetson Brings Generative AI to the Edge to explore the full capabilities and developer resources.