Claude Opus 4.6: Early Access Customer Insights
Before Anthropic launched Claude Opus 4.6 to everyone, a small group of customers — including Harvey, bolt.new, Shopify, and Lovable — got early access to see how it handled their hardest real-world workloads. Their tests show a model that’s noticeably stronger at reasoning, more autonomous, and closer to a true collaborator than previous generations.
What it's for
Opus 4.6 is Anthropic’s top-tier model for teams that need deep reasoning and high reliability on serious work:
- Legal and enterprise workflows: Harvey ran Opus 4.6 on its BigLaw Bench of real-world legal tasks and saw it score 90.2%, making it the first Anthropic model to cross 90%, with 40% of tasks scored perfectly. Internal lawyers described its output as “smart and analytical, like it's actually thinking.”
- Complex coding and debugging: bolt.new used its automated eval platform plus hands-on tests and watched Opus 4.6 diagnose a long-standing waterfall graph bug on the first try, spotting eight parallel HubSpot API calls and extra raw fetches that bypassed rate limiting. Shopify engineers had it port a large TypeScript library to Ruby, where it built a shim, ran against existing tests, and migrated almost the entire spec in one shot.
- Product and app building: At Lovable, engineers used Opus 4.6 for “vibe checks” by building real apps. One stress test involving tricky subway mapping and itinerary logic went further than previous models had ever managed, and the team noticed a clear shift in how autonomously the model could explore and test ideas.
If you’re working on high-stakes legal analysis, complex codebases, or rich product experiences, Opus 4.6 is designed to take on more of the heavy lifting.
Why it matters
Across teams, the theme was the same: the relationship with the model is changing.
- Deeper reasoning and reliability: As bolt.new’s Garrett Serviss put it, “The jump in reasoning depth is real.” Opus 4.6 can trace through messy systems, pinpoint root causes, and fix issues that previous models repeatedly missed.
- Feels like a real teammate: Shopify’s Paulo Arruda described asking Opus 4.6 to move something into another menu with almost no detail — and it not only did the move but filled in design details he hadn’t thought to specify. He found himself saying “you’re absolutely right” to the model instead of correcting it. Ben Lafferty called Opus 4.6 “the first model from Anthropic that feels like a true collaborator in my day-to-day work.”
- Better instruction following and autonomy: Early testers reported far fewer prompt iterations and less micromanagement. At Lovable, testers could “feel a difference in autonomy” as the model used tools like the browser and ran tests on its own inside their stack. For teams, that means more time on strategy and product direction, and less time debugging the AI itself.
Opus 4.6 isn’t just faster or bigger — it’s a step toward models that you can actually trust with longer-horizon, complex tasks.
Where you get it
Claude Opus 4.6 is available through Anthropic’s platform for teams and enterprises that want a flagship reasoning model at the center of their workflows. If you’re already using Claude, you can start testing Opus 4.6 on your own benchmarks, debug sessions, and product ideas to see where it changes what your team can ship.
Try it: Read Behind the model launch: What customers discovered testing Claude Opus 4.6 early, then explore Claude Opus 4.6 in your Claude workspace or Anthropic developer tools.