OncoAgent: 2-Tier Oncology AI Achieves 56x Fine-Tuning Speed on AMD
Written by Lilac
Drafted with AI; edited and reviewed by a human.
TL;DR
- OncoAgent is a new open-source oncology decision support system designed for privacy and accuracy.
- It uses a dual-tier LLM architecture and a multi-agent approach to handle complex clinical queries.
- Fine-tuning on AMD Instinct MI300X hardware achieved a 56x speedup, completing in around 50 minutes.
- The system prioritizes patient data sovereignty by enabling on-premises deployment.
In the fast-evolving field of oncology, keeping clinical decision support systems up-to-date with the latest research and guidelines is a monumental challenge. Today, we're introducing OncoAgent, an open-source system that tackles this head-on. It is a privacy-preserving clinical decision support system for oncology, built on a dual-tier, multi-agent Large Language Model (LLM) architecture, and designed to run efficiently and securely on specialized hardware such as AMD's Instinct MI300X.
At its core, OncoAgent employs a LangGraph topology featuring eight specialized nodes. This decomposed approach allows for a more nuanced and auditable breakdown of clinical reasoning. The system incorporates a four-stage Corrective RAG (Retrieval-Augmented Generation) pipeline grounded in a corpus of more than 70 NCCN and ESMO guidelines, ensuring that the AI's recommendations are anchored in established medical knowledge. Furthermore, a three-layer reflexion safety validator enforces a strict Zero-PHI policy, meaning no Protected Health Information is exposed, a critical requirement in healthcare settings.
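To make the Corrective RAG idea concrete, here is a minimal, hypothetical sketch of a four-stage loop (retrieve, grade, correct, generate). The keyword retriever, the fixed relevance scores, the `CONFIDENCE_FLOOR` threshold, and the query-rewrite rule are all illustrative stand-ins, not OncoAgent's actual pipeline:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    relevance: float  # stand-in for a learned grader's score in [0, 1]

CONFIDENCE_FLOOR = 0.7  # illustrative threshold; docs below it are discarded

def retrieve(query, corpus):
    # Stage 1: naive keyword match stands in for a vector-store lookup.
    words = query.lower().split()
    return [d for d in corpus if any(w in d.text.lower() for w in words)]

def grade(docs):
    # Stage 2: keep only documents the grader judges relevant.
    return [d for d in docs if d.relevance >= CONFIDENCE_FLOOR]

def correct(query):
    # Stage 3: on grading failure, rewrite the query (toy rewrite rule).
    return query + " guideline recommendation"

def generate(query, docs):
    # Stage 4: answer only when grounded in surviving documents.
    if not docs:
        return "No guideline support found; deferring to clinician."
    return f"Answer to {query!r} grounded in {len(docs)} guideline passage(s)."

def corrective_rag(query, corpus):
    docs = grade(retrieve(query, corpus))
    if not docs:  # corrective branch: rewrite and retry once
        docs = grade(retrieve(correct(query), corpus))
    return generate(query, docs)

corpus = [
    Doc("NCCN guideline: first-line therapy options for NSCLC", 0.9),
    Doc("Unrelated cardiology note", 0.2),
]
print(corrective_rag("first-line NSCLC therapy", corpus))
```

The key property the sketch preserves is that generation never proceeds from ungraded context: documents that fail the grade are dropped, the query is corrected and retried, and if nothing survives the system abstains rather than answer ungrounded.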
The system's intelligence is tiered. Clinical queries are first assessed by an additive complexity scorer. Based on this assessment, queries are routed to either a 9B parameter speed-optimized model (Tier 1) for faster responses or a 27B deep-reasoning model (Tier 2) for more complex cases requiring deeper analysis. Both models underwent fine-tuning using QLoRA on a vast corpus of 266,854 oncological cases. This fine-tuning process was significantly accelerated by leveraging the Unsloth framework on AMD Instinct MI300X hardware, boasting 192 GB of HBM3 memory.
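An additive complexity scorer of the kind described can be sketched as a weighted sum of query signals compared against a routing threshold. The signal terms, weights, threshold, and tier labels below are invented for illustration; the paper's actual rubric may differ:

```python
TIER2_THRESHOLD = 3  # illustrative cutoff: at or above, route to the 27B model

def complexity_score(query: str) -> int:
    """Additive scorer: each matched signal contributes a fixed weight."""
    signals = {
        "comorbid": 2,        # comorbidities complicate treatment choice
        "metasta": 2,         # metastatic disease needs deeper reasoning
        "interaction": 1,     # drug-drug interaction check
        "trial": 1,           # clinical-trial eligibility question
        "line of therapy": 1,
    }
    q = query.lower()
    return sum(w for term, w in signals.items() if term in q)

def route(query: str) -> str:
    """Tier 1 (9B, speed) below the threshold; Tier 2 (27B, depth) otherwise."""
    return "tier2-27b" if complexity_score(query) >= TIER2_THRESHOLD else "tier1-9b"

print(route("First-line therapy for stage II disease?"))
print(route("Metastatic NSCLC with comorbid CKD and drug interactions?"))
```

Because the score is additive and the signal list is explicit, the routing decision stays auditable: one can read off exactly which features of a query pushed it into the deep-reasoning tier.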
The results of this optimization are staggering. By employing sequence packing on the MI300X, the full dataset fine-tuning was completed in approximately 50 minutes, representing a remarkable 56× throughput acceleration compared to traditional methods. This boost in efficiency means faster updates and improvements to the model. Beyond speed, the Corrective RAG document grading achieved a perfect 100% success rate with a mean RAG confidence score of 2.3+, underscoring the system's reliability. Critically, OncoAgent is 100% open source and designed for on-premises deployment, directly addressing the need for patient data sovereignty and eliminating reliance on proprietary cloud APIs.
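Sequence packing, the technique credited for much of the throughput gain, avoids wasting compute on padding by concatenating short training examples into shared fixed-length buffers. The greedy first-fit packer below is a simplified illustration of the idea, not the scheme Unsloth actually uses, and the example lengths are made up:

```python
def pack(lengths, max_len):
    """Greedy first-fit packing of sequence lengths into buffers of max_len."""
    bins = []  # each bin: remaining capacity plus the sequences placed in it
    for n in sorted(lengths, reverse=True):
        for b in bins:
            if b["free"] >= n:       # fits in an existing buffer
                b["free"] -= n
                b["seqs"].append(n)
                break
        else:
            bins.append({"free": max_len - n, "seqs": [n]})  # open a new buffer
    return bins

lengths = [512, 300, 200, 1800, 96, 1024, 700]
packed = pack(lengths, max_len=2048)
# Without packing, each example occupies its own buffer padded to max_len;
# with packing, the same tokens fit in far fewer buffers.
print(f"{len(lengths)} padded buffers -> {len(packed)} packed buffers")
```

Every forward pass then processes mostly real tokens instead of padding, which is where the throughput multiplier comes from; attention masking (not shown) keeps the packed sequences from attending to each other.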
Developers and healthcare institutions can explore the technical details and underlying code behind OncoAgent on Hugging Face. The project's architecture, including the complexity router and model tiering, the Corrective RAG pipeline, and the safety mechanisms, is available for review and adaptation. This commitment to openness allows for wider adoption and community-driven enhancements, fostering a collaborative environment for advancing AI in oncology. You can find more information and technical documentation on the OncoAgent Official Paper page.
Summary
- OncoAgent is a novel open-source oncology AI system employing a dual-tier multi-agent LLM architecture.
- It achieves a 56x fine-tuning speedup on AMD Instinct MI300X hardware, completing the process in under an hour.
- Key features include a Corrective RAG pipeline, a Zero-PHI safety validator, and on-premises deployment for patient data sovereignty.
- The system aims to improve clinical decision support by providing accurate, guideline-grounded, and private AI assistance.