Claude Opus 4.8 is here: faster, smarter, and significantly better at the work that matters
Anthropic's Claude Opus 4.8 brings stronger coding, agentic tasks, and professional workflows — at the same price as its predecessor.
Anthropic has released Claude Opus 4.8, its most capable generally available model to date. The update arrives just 41 days after Opus 4.7, which is a notably fast turnaround for Anthropic. The short cycle is likely no accident: Opus 4.7 received a lukewarm reception from users who found it underwhelming, and the competitive pressure from OpenAI and Google has only intensified since then.
The short version: Opus 4.8 is meaningfully better at coding, autonomous task completion, and professional work, and it costs the same as 4.7. If you use Claude for anything serious, the upgrade is worth paying attention to.
Better at the things developers actually care about
On agentic coding benchmarks, Opus 4.8 scores 69.2% on SWE-Bench Pro, up from 64.3% on 4.7, and outperforms GPT-5.5 and Gemini 3.1 Pro on that test. Multidisciplinary reasoning with tools climbs from 54.7% to 57.9%. On Anthropic’s Super-Agent benchmark, Opus 4.8 is the only model to complete every case end-to-end, including tasks that GPT-5.5 at cost parity could not finish.
For developers using Claude Code, the practical improvement is that the model asks better clarifying questions before making large changes, catches its own mistakes more reliably, and is about four times less likely to let flaws in generated code go unremarked compared to 4.7. It pushes back when a plan is not sound, which sounds like a small thing but matters a lot on complex, multi-service work where a silent bad assumption compounds quickly.
Dynamic Workflows: Claude coordinating Claude
Alongside the model itself, Anthropic has introduced Dynamic Workflows, currently in research preview for Claude Code users on Enterprise, Team, and Max plans.
The idea is straightforward: for large tasks, Claude plans the work, spins up hundreds of parallel subagents within a single session to handle different parts simultaneously, then verifies the outputs before returning the result to you. This is meaningful for tasks that would previously require you to manage multiple prompts or break a project into manually sequenced chunks. With Dynamic Workflows, you hand off the coordination problem along with the task itself.
Effort controls: spending tokens deliberately
Opus 4.8 introduces effort controls across claude.ai and Claude Code. You can set how much the model “thinks” before responding. Higher effort produces better results on difficult tasks but uses more tokens. Lower effort is faster and draws down rate limits more slowly, which matters if you are running Claude in an agentic loop or billing customers for usage.
Developers using the API can set effort levels directly. The model uses adaptive thinking, meaning it only triggers extended reasoning when a turn actually needs it, so you are not wasting tokens on simple requests the way a fixed extended-thinking model might.
Fast mode for Opus 4.8 runs 2.5 times faster than its predecessor and costs three times less, at $10 per million input tokens and $50 per million output. Standard pricing starts at $5 per million input tokens and $25 per million output, with up to 90% savings through prompt caching and 50% through batch processing.
What this means for legal and professional teams
Claude Opus 4.8 sets a new record on Anthropic’s Legal Agent Benchmark and is the first model to break 10% overall on the all-pass standard. That benchmark measures end-to-end accuracy on substantive legal tasks, not just retrieval or summarisation. For teams using AI to assist with legal work, the difference between a model that “mostly” completes a task correctly and one that passes end-to-end accuracy tests is the difference between a tool you trust and one you constantly double-check.
More broadly, Anthropic says Opus 4.8 handles multi-day projects with stronger context retention, and performs well on spreadsheets, slides, and documents. The claim is that it can manage complex professional workflows end-to-end with less need for you to re-explain context mid-task.
A note on honesty and alignment
This release puts some emphasis on model behaviour, not just capability. Anthropic reports that Opus 4.8 has lower rates of deceptive behaviour compared to 4.7, and performs at a level similar to the still-unreleased Mythos Preview model on measures of whether the system acts in line with user interests and instructions.
The code honesty improvement is the most concrete: the model is four times less likely to let its own coding errors go unremarked. For anyone who has been burned by a model confidently delivering broken code, that improvement is genuinely useful.
A few technical details worth knowing
- Context window: 1 million tokens on the Claude API, Amazon Bedrock, and Vertex AI (200k on Microsoft Foundry), with 128k max output tokens.
- Prompt caching: The minimum cacheable prompt length drops to 1,024 tokens, lower than on 4.7. Prompts that could not be cached before may now be eligible without any code changes.
- No extended thinking budgets: Opus 4.8 does not support the
budget_tokensparameter. Use adaptive thinking and the effort parameter instead. - Messages API: The API now accepts
role: "system"messages mid-conversation, so you can append updated instructions later in a long-running session without restating the full system prompt. This preserves prompt cache hits on earlier turns and reduces cost in agentic loops. - Refusal categories: The Messages API now returns refusal categories in
stop_details, making it easier to route different types of declined requests appropriately in your application.
Where this sits on the roadmap
Opus 4.8 is Anthropic’s best publicly available model, but the more powerful Mythos model is still held back. A preview raised cybersecurity concerns, and Anthropic says Mythos-class models are expected in the coming weeks once necessary safeguards are in place.
If your work lives in the middle of the capability spectrum, Opus 4.8 is a solid, fairly priced upgrade right now. If you are waiting for frontier-level performance on the hardest tasks, Mythos is apparently not far off.
Opus 4.8 is available now on claude.ai for Pro, Max, Team, and Enterprise subscribers, and on the Claude API, Amazon Web Services, Google Cloud, and Microsoft Foundry.