Released February 17, 2026 · Anthropic

Claude Sonnet 4.6

The Model That Changed The Math

Opus-level intelligence. Sonnet pricing. A full upgrade across coding, computer use, reasoning, agents, and design — at exactly zero extra cost.

FEB 17 2026  ·  API: claude-sonnet-4-6

72.5% Computer Use
79.6% SWE-bench
70% Dev Preference
1M Token Context
Nearly 5× Computer Use Growth
$3 / M input tokens
Introduction

What Is Claude Sonnet 4.6?

The most capable mid-tier AI model ever released — and it's now your default.

Claude Sonnet 4.6 is the latest — and most capable — model in Anthropic's Sonnet line. Released February 17, 2026, it represents a full upgrade across six pillars: coding, computer use, long-context reasoning, agent planning, knowledge work, and design.

What makes it remarkable is not just that it's better at everything. It's that it delivers capability that previously required reaching for Anthropic's flagship Opus class — at a price that hasn't changed since Sonnet 4.5. Same $3 / $15 per million tokens. Materially smarter in every measurable dimension.
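To make the pricing concrete, here is a minimal sketch of a per-request cost estimate at the quoted $3 / $15 rates. It assumes the $15 figure is the output-token price, that billing is linear in token count, and that no discounts or surcharges apply; the function name and the example token counts are illustrative, not taken from Anthropic.

```python
# Rates quoted above, in USD per million tokens.
# Assumption: $3 applies to input tokens and $15 to output tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming simple linear pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return round(input_cost + output_cost, 6)


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
print(estimate_cost_usd(200_000, 4_000))
```

Under these assumptions the hypothetical request comes to about $0.66, with most of the cost coming from the input side.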

If you're on a Free or Pro plan at claude.ai or Claude Cowork, Sonnet 4.6 is already your default model. You've already upgraded — for free.

💡
Did you know?

In blind head-to-head tests inside Claude Code, developers preferred Sonnet 4.6 over the previous flagship model (Opus 4.5) 59% of the time — despite Opus 4.5 being a tier above and more expensive. A mid-tier model beating the previous flagship is historically rare.

Capability Overview

Sonnet 4.5 vs. Sonnet 4.6 across all six pillars (illustrative, index 0–100)

Based on benchmark data and reported improvements. Computer use: OSWorld-Verified. Coding: SWE-bench Verified. Math: Box/enterprise evaluations.

Six Upgrade Pillars

What Got Better — And By How Much

⌨️

Coding

79.6% on SWE-bench Verified. Preferred by developers 70% of the time over its own predecessor. Less overengineering, fewer hallucinations, better context retention in long sessions.

🖥️

Computer Use

72.5% on OSWorld-Verified — nearly 5× higher than where the Sonnet line started just 16 months ago. Real software. No special APIs. 94% accuracy in insurance workflows.

🧠

Long-Context Reasoning

1 million token context window (beta). Reasons — not just retrieves — across entire codebases, legal contracts, or dozens of research papers simultaneously.

🤖

Agent Planning

Nearly tripled earnings vs Sonnet 4.5 in Vending-Bench Arena — a competitive, multi-model business simulation. Adaptive thinking decides when deep reasoning is needed.

📊

Knowledge Work

89% math accuracy (up from 62%). Matches Opus 4.6 on OfficeQA — enterprise document comprehension. +15 percentage points on Box's heavy reasoning Q&A benchmark.

🎨

Design

Independently described by multiple customers as having "perfect design taste." Better layouts, animations, and visual outputs — with fewer iteration rounds needed.

By The Numbers

Key Stats At A Glance

72.5% · OSWorld Computer Use Score · Up from 14.9% in Oct 2024
79.6% · SWE-bench Coding Benchmark · Averaged over 10 trials
89% · Math Accuracy · Up from 62% in Sonnet 4.5
1M Tokens · Context Window (Beta) · Full codebase in one request
70% · Developer Preferred vs Sonnet 4.5 · Claude Code head-to-head
59% · Preferred vs Opus 4.5 · Previous flagship model
$3 / M · Input Token Price · Unchanged from Sonnet 4.5
94% · Pace Insurance Computer Use Benchmark · Highest score of any Claude model
Trivia & Fun Facts

Surprising Things About Sonnet 4.6

🚀
The Leap

In October 2024, Anthropic called computer use "still experimental — at times cumbersome and error-prone." The score then was 14.9%. Sonnet 4.6 scores 72.5%. That's a 57.6 percentage point leap in sixteen months.
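The two ways of describing this jump (percentage points versus a multiple of the starting score) are easy to mix up, so here is a small sketch that computes both from the scores quoted above. The variable names are illustrative.

```python
# OSWorld scores quoted above, in percent.
score_oct_2024 = 14.9
score_sonnet_4_6 = 72.5

# Absolute gain, measured in percentage points.
point_gain = round(score_sonnet_4_6 - score_oct_2024, 1)

# Relative gain, measured as a multiple of the starting score.
multiple = round(score_sonnet_4_6 / score_oct_2024, 2)

print(point_gain)  # 57.6
print(multiple)    # 4.87
```

The 57.6 figure is an absolute difference between two percentages; the "nearly 5×" framing used elsewhere on this page is the ratio of the same two scores.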

🎯
The Strategy

In the Vending-Bench Arena, where AI models compete to run a simulated business, Sonnet 4.6 developed its own novel strategy — invest heavily for 10 months, then pivot sharply to profit. It wasn't instructed to do this. It figured it out.

🧮
The Math Surprise

Math accuracy jumped from 62% to 89%, a 27 percentage point improvement. For context, on a typical US percentage grading scale that is roughly the jump from a D to a high B.

📖
Context In Real Terms

A 1 million token context window can hold roughly 750,000 words. That's approximately 10 full-length novels, or a substantial software codebase, in a single conversation.
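As a rough sanity check on those figures, the sketch below converts between tokens and words. The 0.75 words-per-token ratio is simply what the 750,000-word figure implies, and the 75,000-word novel length is what "10 novels" implies; both are rules of thumb that vary by language and content, and the helper names are hypothetical.

```python
CONTEXT_TOKENS = 1_000_000
WORDS_PER_TOKEN = 0.75    # assumption: implied by the 750,000-word figure
WORDS_PER_NOVEL = 75_000  # assumption: implied by "10 novels"; real novels vary


def words_that_fit(context_tokens: int = CONTEXT_TOKENS) -> int:
    """Approximate number of words a context window can hold."""
    return int(context_tokens * WORDS_PER_TOKEN)


def fits_in_context(word_count: int, context_tokens: int = CONTEXT_TOKENS) -> bool:
    """Roughly check whether a document of word_count words would fit."""
    estimated_tokens = word_count / WORDS_PER_TOKEN
    return estimated_tokens <= context_tokens


print(words_that_fit())                     # 750000
print(words_that_fit() // WORDS_PER_NOVEL)  # 10
print(fits_in_context(900_000))             # False
```

Source code usually tokenizes less efficiently than prose, so for a codebase it is safer to count tokens with a real tokenizer than to rely on this word-based estimate.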

💰
The Price Paradox

Sonnet 4.6 is 40% cheaper per token than Opus 4.6 — yet in coding evaluations, developers preferred it over the previous flagship (Opus 4.5) more than half the time. The era of paying more for better code is over.
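The "40% cheaper" figure can be checked against the $3 input price quoted on this page. The sketch below derives the comparison price that the claim implies; this page does not state the Opus 4.6 price directly, so the derived number is an inference from the two quoted figures, not a quoted rate.

```python
SONNET_INPUT_USD_PER_MTOK = 3.00  # quoted on this page
DISCOUNT_VS_OPUS = 0.40           # "40% cheaper", as quoted above

# If Sonnet is 40% cheaper, it costs 60% of the comparison price.
implied_opus_input_price = round(
    SONNET_INPUT_USD_PER_MTOK / (1 - DISCOUNT_VS_OPUS), 2
)


def percent_cheaper(price: float, reference_price: float) -> float:
    """How much cheaper price is than reference_price, in percent."""
    return round((1 - price / reference_price) * 100, 1)


print(implied_opus_input_price)                                              # 5.0
print(percent_cheaper(SONNET_INPUT_USD_PER_MTOK, implied_opus_input_price))  # 40.0
```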

🛡️
Safety Character

Anthropic's safety researchers described Sonnet 4.6 as having "a broadly warm, honest, prosocial, and at times funny character." That's not marketing copy — that's a formal safety evaluation finding.