Claude Sonnet 5 Launch Guide: Specs, Pricing & Benchmarks [2026]
The Fennec leak was right: Claude Sonnet 5 launched June 30, 2026. What the leak got right (coding focus, speed, default model) and wrong (2M context; the actual window is 1M). Full specs, benchmarks, and API details.
Update, June 30, 2026: Sonnet 5 launched. The Fennec codename was real. The model shipped as the default on Free/Pro. Context window: 1M tokens (the 2M leak was wrong). API: claude-sonnet-5. Intro pricing: $2/MTok input, $10/MTok output through August 31, 2026.
What the Leak Got Right vs Wrong
| Claim | Leaked | Actual |
| --- | --- | --- |
| Codename | Fennec | Correct: Fennec was the internal codename |
| Launch timing | "as early as next week" from June 23 | Correct: launched June 30 |
| Coding focus | Emphasized coding speed | Correct: most agentic Sonnet yet |
| Default model | Not specified | Confirmed: default on Free and Pro |
| Context window | 2 million tokens | Wrong: actual is 1 million tokens |
| Better price-performance | Yes | Confirmed: $2/MTok intro pricing |
Benchmark Charts
| Benchmark | Sonnet 5 | Sonnet 4.6 | Opus 4.8 |
| --- | --- | --- | --- |
| SWE-bench Pro (agentic coding) | 63.2% | 58.1% | 69.2% |
| Terminal-Bench 2.1 (agentic coding) | 80.4% | 67.0% | 82.7% |
| Humanity's Last Exam (no tools) | 43.2% | 34.6% | 49.8% |
| Humanity's Last Exam (with tools) | 57.4% | 46.8% | 57.9% |
| OSWorld-Verified (computer use) | 81.2% | 78.5% | 83.4% |
| GDPval-AA v2 (knowledge work) | 1618 | 1395 | 1615 |
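To make the table easier to read at a glance, here is a minimal Python sketch that computes how far Sonnet 5 sits from Sonnet 4.6 and from Opus 4.8 on each benchmark. The scores are copied from the table above; the names `SCORES` and `deltas` are illustrative only. Note that GDPval-AA v2 is reported as a score rather than a percentage, so its differences are not comparable to the percentage-point rows.

```python
# Scores copied from the benchmark table: (Sonnet 5, Sonnet 4.6, Opus 4.8).
SCORES = {
    "SWE-bench Pro": (63.2, 58.1, 69.2),
    "Terminal-Bench 2.1": (80.4, 67.0, 82.7),
    "Humanity's Last Exam (no tools)": (43.2, 34.6, 49.8),
    "Humanity's Last Exam (with tools)": (57.4, 46.8, 57.9),
    "OSWorld-Verified": (81.2, 78.5, 83.4),
    "GDPval-AA v2": (1618, 1395, 1615),  # a score, not a percentage
}


def deltas(scores):
    """Return {benchmark: (difference vs Sonnet 4.6, difference vs Opus 4.8)}."""
    return {
        name: (round(s5 - s46, 1), round(s5 - opus, 1))
        for name, (s5, s46, opus) in scores.items()
    }


if __name__ == "__main__":
    for name, (vs_sonnet, vs_opus) in deltas(SCORES).items():
        print(f"{name}: {vs_sonnet:+} vs Sonnet 4.6, {vs_opus:+} vs Opus 4.8")
```

On these figures the largest jump over Sonnet 4.6 is on Terminal-Bench 2.1 (13.4 points), and Sonnet 5 edges past Opus 4.8 only on GDPval-AA v2.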
An X post from developer account @Mr_Salio on June 23, 2026 dropped what the AI community immediately flagged as a credible-sounding Anthropic internal leak: Claude Sonnet 5 is coming, possibly this week, and it carries a set of reported specs that would make it the most significant Sonnet release since the model series began.
The leak landed while Fable 5 and Mythos 5 remained suspended under US export controls, which made any new Sonnet tier especially interesting for developers who lost access to Anthropic's reasoning flagship.
The source claims to have knowledge of an internal codename, Fennec, and a set of reported capabilities that, if accurate, would reshape how developers and enterprises think about the Anthropic model lineup.
Here is what the leak says, what is credible, what to be skeptical of, and what it would mean for AI buyers if it is real.
What the Leak Claims
The reported details from @Mr_Salio:
Internal codename: Fennec
Expected timeline: As early as this week (from June 23, 2026)
Focus: Coding, speed, and efficiency
Competitive positioning: Better price-performance than Claude Opus 4.8 and Fable 5
Context window: 2 million tokens
Five data points. No benchmark numbers, no pricing, no architecture details. The post has accumulated 38,300 views as of the time of writing, a level typical of well-sourced AI model leaks rather than pure speculation.
Why This Is Plausible
The Timeline Fits
Anthropic's model release cadence in 2026 has been aggressive. Fable 5 launched in early June 2026 before the export ban. Releasing a new Sonnet model within weeks of Fable would follow a pattern Anthropic has used before: a powerful flagship launches first, followed quickly by an efficient sibling model designed for different workloads and buyers.
For the latest on whether Sonnet 5 actually shipped vs rumor-only activity, see the June 30 Sonnet 5 rumor tracker, which notes the claude-sonnet-5 slug appeared in February and ultimately became Sonnet 4.6, not a generational leap.
The "Fennec" codename surfacing ahead of release is also consistent with how model leaks typically work: internal names get mentioned in API responses, documentation drafts, or developer tooling before the public announcement.
The 2M Context Window Makes Strategic Sense
Google's Gemini models have long offered multi-million token context windows. Anthropic's Claude currently tops out at 200,000 tokens on standard API access (with some enterprise arrangements at higher limits). A 2 million token context window would close a significant competitive gap (see LLM context window explained for how window size translates to real workloads), particularly for:
Codebases: processing an entire large monorepo in a single prompt
Long-horizon agentic tasks: agents that need to hold weeks of context without truncation (Qwen 3.7-Max long-horizon autonomy is the open-weight benchmark for this axis)
Research and analysis: full academic literature sets on a topic
The timing also makes sense. Context window size has become a practical differentiator rather than a theoretical one as agentic workloads mature. A 2M window on a fast, efficient Sonnet-tier model is exactly what production agent pipelines need.
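Whether a given codebase fits in a window can be estimated before any API call. The sketch below is a rough check under loud assumptions: it uses a heuristic of about 4 characters per token rather than a real tokenizer, and the function names, the file extensions, and the output reserve are all hypothetical choices for illustration.

```python
import os

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(root, extensions=(".py", ".md", ".toml")):
    """Walk `root` and estimate tokens from the character count of matching files."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(extensions):
                path = os.path.join(dirpath, filename)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN


def fits(token_estimate, window_tokens, reserve_for_output=8_000):
    """True if the estimate plus a reserve for the reply stays within the window."""
    return token_estimate + reserve_for_output <= window_tokens


if __name__ == "__main__":
    tokens = estimate_tokens(".")
    # Windows discussed in this article: 200K (current), 1M (shipped), 2M (leaked).
    for window in (200_000, 1_000_000, 2_000_000):
        print(f"{window:>9,} tokens: fits={fits(tokens, window)} (estimate {tokens:,})")
```

Real token counts vary by language and file type, so an estimate close to the limit should be confirmed with the provider's own token counting.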
Coding Focus Is the Right Differentiation
Anthropic's Claude Code CLI has seen significant developer adoption. Sonnet-class models are the workhorse of most Claude Code workloads: fast enough for interactive coding, capable enough for non-trivial refactors. A Sonnet 5 with explicit optimization for coding tasks would directly target:
GitHub Copilot's market position: Copilot runs on GPT-4 variants; a faster, cheaper Sonnet optimized for code is a direct competitive move
The Claude Code user base: currently using Sonnet 4.6 as the default model, who would immediately benefit from a Sonnet 5 upgrade (context limit management becomes less painful at 2M)
Enterprise AI coding deployments: teams running self-hosted or API-powered coding assistants at scale where token cost is a real budget line
Better Price-Performance Than Opus and Fable 5
This is the positioning that matters most for enterprise buyers. The current Anthropic lineup has a problem: Fable 5 is the most capable model but also the most expensive, and Claude Opus 4.8 commands premium pricing for general intelligence. Most production workloads do not need maximum intelligence; they need adequate intelligence at acceptable speed and cost.
Teams blocked from Fable during the ban already moved to GLM-5.2, Kimi K2.7, or open-weight stacks, so Sonnet 5 would need to win them back on price-performance, not raw benchmark peaks.
Sonnet has historically been the "80% of the performance at 40% of the price" position. Sonnet 5 with improved efficiency would further stretch that ratio, which is particularly appealing for:
High-throughput document processing: where you run thousands of prompts per day
Real-time coding assistance: where latency matters as much as accuracy
Customer-facing AI products: where token cost flows directly to margin
What to Be Skeptical Of
Unconfirmed Source
@Mr_Salio is a developer known for building apps in public. The account is not a verified insider, does not have a strong track record of confirmed Anthropic leaks, and the post provides no source detail. The 38K views reflect interest, not validation.
Model Leaks Are Frequently Wrong on Details
The history of AI model leaks is that the general shape often proves correct (a new model is coming) while the specifics are frequently off (timeline, context window size, pricing). Leaks often reflect aspirational internal specs rather than final released specs.
"Fennec" Has Not Appeared in API Responses or Documentation
One reliable signal of an imminent model release is the internal codename or model ID appearing in API responses, error messages, or SDK documentation before the announcement. There is no confirmed sighting of "claude-sonnet-5" or "fennec" in any public-facing Anthropic tooling as of this writing.
The Anthropic Model Lineup If Sonnet 5 Drops
Assuming the leak is broadly accurate, the post-release lineup would look roughly like this:
Sonnet 5 would carve out the critical middle space: fast enough and cheap enough for production scale, capable enough for coding tasks that break Haiku. This is the slot where most enterprise API spend lives.
The 2M Context Window in Practice
If the 2 million token context window is real, the practical implications deserve more than a passing mention.
For developers: A 2M context window means you can send an entire repository, including tests, documentation, and configuration, as a single prompt. Code generation and refactoring tasks that currently require chunking and summarization could run end-to-end in a single call. Claude Code workloads would be directly affected.
For document-heavy enterprises: Legal teams reviewing contract libraries, financial analysts working through SEC filings, and compliance teams reviewing audit trails all currently require chunking pipelines. With 2M tokens, many of those pipelines collapse to a single call.
For agentic systems: Agent pipelines that need to maintain state over long time horizons currently hit context window limits and resort to compression or summarization strategies that lose information. A 2M window gives agents dramatically more room, though Apodex-1.0-mini and open Agents-A1 show 35B open models already competing on long-horizon research without waiting for Sonnet 5.
The cost implication: processing 2M input tokens per call is expensive. But for workloads where the alternative is a multi-step pipeline with its own engineering and operational overhead, a single large call may be both cheaper and more reliable.
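The trade-off can be put into rough numbers. The sketch below uses the intro pricing quoted in this article ($2/MTok input, $10/MTok output) and compares one hypothetical 2M-token call with a chunked pipeline. Everything else is an assumption for illustration: flat per-token pricing at long context, the chunk size, the repeated shared prompt, the per-chunk output, and the function names.

```python
import math

INPUT_USD_PER_MTOK = 2.00    # intro input price quoted in this article
OUTPUT_USD_PER_MTOK = 10.00  # intro output price quoted in this article


def call_cost(input_tokens, output_tokens):
    """USD cost of one call, assuming flat per-token pricing at any context length."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def chunked_cost(total_input_tokens, chunk_tokens, shared_prompt_tokens,
                 output_per_chunk, final_output_tokens):
    """USD cost of a map-then-merge pipeline (hypothetical shape).

    Every chunk call repeats the shared prompt; a final call merges the
    per-chunk outputs. The last chunk is billed as a full chunk, so this
    slightly overestimates when the total is not a multiple of the chunk size.
    """
    chunks = math.ceil(total_input_tokens / chunk_tokens)
    per_chunk = call_cost(chunk_tokens + shared_prompt_tokens, output_per_chunk)
    merge = call_cost(chunks * output_per_chunk, final_output_tokens)
    return chunks * per_chunk + merge


if __name__ == "__main__":
    single = call_cost(2_000_000, 4_000)
    chunked = chunked_cost(2_000_000, 100_000, 2_000, 1_000, 4_000)
    print(f"single call: ${single:.2f}")
    print(f"chunked pipeline: ${chunked:.2f}")
```

Under these assumptions the single call comes to about $4.04 and the 20-chunk pipeline to about $4.36, so the token bill is similar and the deciding factor is the engineering and reliability overhead of the pipeline. A long-context price premium, if one applies, would shift the comparison.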
What Anthropic Has Not Said
Anthropic has not confirmed any of the following:
A model named Claude Sonnet 5
A model with the internal codename Fennec
A 2M token context window on any upcoming model
A release timeline for any model this week or this month
Anthropic's practice for model releases is to announce with little or no advance notice: either a blog post on the morning of release or, occasionally, a same-day API update before the blog publishes. The absence of official information does not confirm or deny the leak.
When to Expect Official Confirmation
The strongest signals that a model release is imminent:
Model ID appearing in Anthropic API responses: API errors or responses sometimes reference model IDs before public release
Prompt cache documentation updates: model-specific prompt caching instructions are typically updated at release
Claude.ai UI changes: the model selector on Claude.ai updates at release
Anthropic blog post: typically published on the morning of release
Until one of those appears, the Fennec leak remains an unconfirmed report from a single source.
The Bigger Picture
Whether or not this specific leak is accurate, the competitive pressure on Anthropic to keep releasing is real.
OpenAI has released multiple models in 2026. Google's Gemini lineup has expanded. Meta's Llama continues to improve. The competitive half-life of a frontier model is measured in weeks, not months. For Anthropic, releasing Fable 5 in early June and following it with Sonnet 5 before July would be evidence that their development pipeline is running at competitive speed, which is itself a message to the market.
The codename Fennec (a small, fast, efficient desert fox) suggests someone at Anthropic has a sense of the positioning. If that is the intended message, the model is designed to win on speed and economy, not size. For production AI workloads, that is often the right optimization.
We will update this post when Anthropic makes an official announcement. Benchmark and ban status change frequently; verify against anthropic.com/news and our Fable status hub. Last updated: July 2, 2026.