Claude Sonnet 4.6: Features, Benchmarks, Pricing and Guide
Anthropic released Claude Sonnet 4.6 on 17 February 2026, and within hours it became the default model on claude.ai. It represents a qualitative shift in the Sonnet line, offering frontier-level intelligence at a fraction of the cost.
The pitch is straightforward: performance that previously required paying the Opus premium is now available at the balanced Sonnet price point. For roughly 90% of daily tasks, Sonnet 4.6 is the optimal choice.
The 1M Token Context Window Comes to Sonnet
Like Opus 4.6, Sonnet 4.6 gets the massive 1M token context window. This allows users to feed the model entire document libraries or multi-file codebases.
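To see what that looks like in practice, here is a minimal sketch using the Anthropic Python SDK to push a multi-file codebase into a single request. The model ID `claude-sonnet-4-6` and the repository path are assumptions, not confirmed identifiers, and earlier Sonnet releases gated 1M-token requests behind a long-context beta flag, so check the current docs before relying on this.

```python
# Minimal sketch: feed an entire codebase to Sonnet 4.6 in one request.
# The model ID "claude-sonnet-4-6" is an assumption; verify the exact
# identifier (and any long-context beta flag) in Anthropic's documentation.
from pathlib import Path

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Concatenate every Python file in a project into a single prompt payload.
repo = Path("./my_project")  # placeholder path
corpus = "\n\n".join(
    f"# FILE: {path}\n{path.read_text()}" for path in sorted(repo.rglob("*.py"))
)

response = client.messages.create(
    model="claude-sonnet-4-6",  # assumed model ID
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": f"{corpus}\n\nSummarize the architecture of this codebase.",
        }
    ],
)
print(response.content[0].text)
```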
Memory Retrieval Performance
On MRCR v2 (8 needles at 1M tokens), Sonnet 4.6 scores a 65% Mean Match Ratio, roughly 3.5x Sonnet 4.5's 18.5%. That jump makes long-context reasoning genuinely viable at scale.
Computer Use: Best-in-Class Automation
Anthropic's latest OSWorld benchmarks position Sonnet 4.6 as its strongest model for computer use. It excels at navigating complex spreadsheets, filling multi-step web forms, and managing cross-tab browser workflows.
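For readers who want to try this themselves, the sketch below shows the general shape of a computer-use request via the Anthropic Python SDK. The model ID, tool version string, and beta flag are carried over from earlier Claude releases and are assumptions here; verify them against the current documentation.

```python
# Sketch of a computer-use request with the Anthropic Python SDK.
# The model ID and the tool/beta version strings reflect earlier Claude
# releases and are assumptions; check the current docs before use.
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-sonnet-4-6",  # assumed model ID
    max_tokens=2048,
    tools=[
        {
            "type": "computer_20250124",  # tool version from an earlier release
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 800,
        }
    ],
    betas=["computer-use-2025-01-24"],  # beta flag from an earlier release
    messages=[
        {
            "role": "user",
            "content": "Open the quarterly spreadsheet and total column C.",
        }
    ],
)

# The model replies with tool_use blocks (screenshot, click, type, ...) that
# your agent loop must execute and return as tool_result messages.
for block in response.content:
    print(block.type, getattr(block, "input", None))
```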
Prompt Injection Resistance
Sonnet 4.6 matches Opus 4.6 on security, refusing 99.38% of malicious operational requests, which is critical for safe web-browsing agents.
Benchmarks: Closing the Gap
| Benchmark | Sonnet 4.6 | Opus 4.6 | GPT-5.2 |
|---|---|---|---|
| Sway Bench (Coding/Finance) | 79.6 | - | - |
| Instruction Following | Beats GPT-5.2 | - | Baseline |
| Latency per Quality | Best Ratio | Higher | Mixed |
When to Use Sonnet 4.6 vs Opus 4.6
Use Sonnet 4.6 for:
- 90% of daily coding tasks
- High-volume data processing
- Web & desktop automation agents
- Routine knowledge work
Reserve Opus 4.6 for:
- Complex, zero-error architecture
- Extended 14h+ agentic sessions
- Deep disciplinary research
- High-stakes infrastructure logic
Pricing advantage: Sonnet 4.6 is roughly 1.7x cheaper per million tokens than Opus 4.6, with faster response times.
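As a rough sanity check on that multiplier, the sketch below compares per-request cost under assumed list prices of $3/$15 per million input/output tokens for Sonnet and $5/$25 for Opus; neither figure is stated in this article, so treat them as placeholders.

```python
# Rough cost comparison for a single request, using assumed list prices
# ($ per million tokens); the ~1.7x figure falls out of the ratio.
PRICES = {
    "sonnet-4.6": {"input": 3.00, "output": 15.00},  # assumed
    "opus-4.6": {"input": 5.00, "output": 25.00},    # assumed
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request under the assumed prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

sonnet = request_cost("sonnet-4.6", 200_000, 4_000)
opus = request_cost("opus-4.6", 200_000, 4_000)
print(f"Sonnet: ${sonnet:.2f}  Opus: ${opus:.2f}  ratio: {opus / sonnet:.2f}x")
```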
Frequently Asked Questions
What are the key features and improvements of Claude Sonnet 4.6?
Claude Sonnet 4.6 is Anthropic's new default model, offering frontier-level intelligence at a balanced price point. Key features include a 1M token context window, adaptive thinking, best-in-class computer use capabilities, and significant improvements in memory retrieval and agentic coding performance. It approaches Opus-level intelligence for most daily tasks.
What is the context window capacity of Claude Sonnet 4.6 and its memory retrieval performance?
Claude Sonnet 4.6 features a massive 1M token context window, allowing it to process entire document libraries or multi-file codebases. On the MRCR v2 benchmark (8 needles at 1M), it achieves a 65% Mean Match Ratio, a 3.5x improvement over Sonnet 4.5, making long-context reasoning truly viable.
How does Claude Sonnet 4.6 perform in computer use and automation tasks, including prompt injection resistance?
Sonnet 4.6 is Anthropic's strongest model for computer use, excelling at tasks like navigating spreadsheets, filling web forms, and cross-tab browser workflows, as shown by OSWorld benchmarks. It also matches Opus 4.6 in security, with a 99.38% refusal rate for malicious operational requests, crucial for safe web-browsing agents.
How does Claude Sonnet 4.6 compare to Claude Opus 4.6 and GPT-5.2 in terms of benchmarks and performance?
Sonnet 4.6 significantly closes the gap with Opus 4.6 and even surpasses GPT-5.2 in certain areas. It scores 79.6 on Sway Bench (Coding/Finance), beats GPT-5.2 in instruction following, and offers the best latency-per-quality ratio among the models mentioned, making it highly efficient.
What are the recommended use cases for Claude Sonnet 4.6 versus Claude Opus 4.6, considering cost and performance?
Sonnet 4.6 is recommended for 90% of daily coding, high-volume data processing, web/desktop automation, and routine knowledge work due to its balanced performance and cost-effectiveness (roughly 1.7x cheaper per million tokens than Opus 4.6). Opus 4.6 should be reserved for complex, zero-error architecture, extended agentic sessions, deep disciplinary research, and high-stakes infrastructure logic where absolute frontier performance is critical.
