AI Tools Review
GPT-5.6: What's New, Benchmarks & Pricing

21 June 2026

Quick Answer:

GPT-5.6 is not officially released. As of 21 June 2026 there is no openai.com announcement, no system card, no API model page and no published benchmarks. What exists is a credible trail of leaks: a brief gpt-5.6 reference spotted in OpenAI Codex routing logs, a chain of internal codenames running up to a release candidate, and reporting that chief scientist Jakub Pachocki called it a meaningful improvement over GPT-5.5. The strongest signals point to a late-June 2026 launch, with a GPT-5.6 Pro variant expected to follow the GPT-5.5 split. Everything about its capability, pricing and context window is, for now, informed speculation. This piece separates the confirmed from the rumoured.

Every few months the AI world holds its breath for the next OpenAI model. In June 2026 that model is GPT-5.6, a release that has been leaked, codenamed and traded on prediction markets, but never actually announced.

This is a careful what-we-know, what-is-rumoured breakdown. We will draw a hard line between the handful of facts that can be sourced and the much larger pile of expectation built on top of them, and we will use GPT-5.5, a model that genuinely exists, as the anchor for what GPT-5.6 might become.

Executive Summary

GPT-5.6 sits in an unusual place: it is one of the most discussed models of 2026, and one of the least documented. OpenAI has kept up a rapid release cadence through the GPT-5 line this year, with GPT-5.4 on 5 March and GPT-5.5 on 23 April, and the pattern points to another step soon. But a pattern is not a press release. The substantive, sourceable facts about GPT-5.6 are few, and honesty about that gap is the whole point of this article.

Here is what can actually be stood behind today, with the rest flagged clearly as speculation.

  • Confirmed: OpenAI has not officially announced or documented GPT-5.6 as of 21 June 2026. The newest confirmed flagship is GPT-5.5.
  • Leaked: a brief gpt-5.6 reference appeared in Codex routing logs, then vanished; internal codenames progressed toward a release candidate.
  • Reported: Jakub Pachocki reportedly told staff GPT-5.6 is a meaningful improvement over GPT-5.5, with the headline gains aimed at long-horizon agentic and Codex work.
  • Expected: a GPT-5.6 Pro deliberative variant, a possible GPT-5.6 Mini, and a context window that some users report behaves as if it were larger than GPT-5.5's.
  • Unknown: every benchmark number, the exact price list, the precise context limit and the official release date.

What's New in GPT-5.6

Because nothing has been announced, this section describes what the leaks suggest rather than what OpenAI has confirmed. Read every claim here with that caveat attached.

The leak trail

The clearest single piece of evidence is a brief, reproducible reference to gpt-5.6 that surfaced in OpenAI's Codex backend routing logs in May 2026 before disappearing. It was just a model name, with no configuration or capability data attached. On its own that proves only that a build with this label existed internally, which is exactly what you would expect mid-development. Alongside it, reporting has tracked a chain of internal codenames, such as iris-alpha, ember-alpha and beacon-alpha, leading up to a release candidate that briefly appeared on a public model-testing arena before being pulled.

The agentic focus

The recurring theme across leak coverage is that GPT-5.6's headline improvements target multi-hour agentic and Codex computer-use tasks rather than single-turn chat. The pitch, as reported, is better long-horizon coding, fewer wasted tokens on extended jobs and quicker Codex responses. This is plausible because it is precisely where OpenAI's previous generation trailed Claude, but plausible is not confirmed. No demo, eval or technical note from OpenAI backs it yet.

Context window and knowledge cutoff

Several reports describe Pro subscribers observing behaviour consistent with a context window pushed toward roughly 1.5 million tokens, up from the 1 million GPT-5.5 shipped with. This is a behavioural observation, not an API-documented figure, so it should be treated as a rumour with a number attached rather than a spec. A refreshed knowledge cutoff covering early-to-mid 2026 events has also been mentioned, again without confirmation. For reference, GPT-5.5's documented cutoff is December 2025.

Variants and a voice model

The leaks point to the same family structure OpenAI used for GPT-5.5: a standard GPT-5.6, a deliberative GPT-5.6 Pro for long-horizon research and codebase-wide work, and a possible GPT-5.6 Mini. A separate audio model, referred to as GPT-Bidi-1, has also been mentioned, described as a bidirectional voice system with interruption handling. None of these names should be taken as final product names until OpenAI confirms them.

Benchmarks: The Real Numbers

Here the honest answer is short: there are no GPT-5.6 benchmark numbers. No public eval data exists for SWE-bench Verified, SWE-bench Pro, GPQA, AIME, Humanity's Last Exam, Terminal-Bench or any other suite. Any specific GPT-5.6 figure circulating online is unsourced speculation, and we will not invent one to fill the gap.

What we can do is give you the confirmed baseline GPT-5.6 must beat. These are GPT-5.5's own reported results, vendor-stated and not independently audited, but real and on the record:

  • SWE-bench Pro: 58.6% (agentic, multi-file coding).
  • GPQA Diamond: 93.6% (graduate-level science reasoning).
  • Terminal-Bench 2.0: 82.7% (command-line agent tasks).
  • Humanity's Last Exam: 41.4% without tools.
  • GDPval: 84.9% (economically valuable knowledge work).

If GPT-5.6 is the agentic-coding upgrade the leaks describe, the SWE-bench Pro and Terminal-Bench 2.0 entries are where you should expect the biggest reported jumps when OpenAI finally publishes a system card. Until that card lands, there are no GPT-5.6 figures to set beside them, and we recommend treating any chart that claims otherwise with suspicion. For the wider competitive picture, our Claude vs ChatGPT, Gemini and Grok comparison tracks the confirmed numbers across vendors.

Pricing & Access

OpenAI has not published GPT-5.6 pricing, so this section is inference from the GPT-5.5 launch, not a price list. We flag it as such because pricing is one of the easiest things to get wrong before launch.

Likely API pricing

GPT-5.5 launched on the API at £4 (about 5 US dollars) per million input tokens and £24 (about 30 US dollars) per million output tokens, with the deliberative GPT-5.5 Pro at roughly £24 (about 30 US dollars) per million input tokens and £142 (about 180 US dollars) per million output tokens. The reasonable default expectation is that GPT-5.6 holds or lands close to that structure, since OpenAI has tended to keep per-token rates stable across point releases while improving token efficiency. If the leaks about reduced token usage on long agentic jobs hold up, the effective cost per completed task could fall even if the headline rate does not.
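To make the "effective cost per task" point concrete, here is a minimal Python sketch of the arithmetic. The per-token rates are GPT-5.5's launch prices in US dollars as quoted above, used only as a stand-in. The token counts and the 25% reduction are invented for illustration and are not leaked or measured figures for any model.

```python
# Illustrative only: rates are GPT-5.5's launch API prices in US dollars,
# used as a proxy because no GPT-5.6 price list exists.
INPUT_USD_PER_MILLION = 5.0
OUTPUT_USD_PER_MILLION = 30.0


def task_cost_usd(input_tokens: int, output_tokens: int,
                  input_rate: float = INPUT_USD_PER_MILLION,
                  output_rate: float = OUTPUT_USD_PER_MILLION) -> float:
    """Cost of one job from token counts and per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical long agentic job; these token counts are made up.
baseline = task_cost_usd(input_tokens=2_000_000, output_tokens=400_000)

# Same headline rates, but assume the job needs 25% fewer tokens.
leaner = task_cost_usd(input_tokens=1_500_000, output_tokens=300_000)

saving = 1 - leaner / baseline

print(f"baseline: ${baseline:.2f}")
print(f"leaner:   ${leaner:.2f}")
print(f"saving:   {saving:.0%}")
```

With unchanged rates, the saving simply tracks the drop in tokens used, which is why token efficiency matters as much as the price list for long-running jobs.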

Likely ChatGPT access

GPT-5.5 reached ChatGPT as a Thinking model on Plus, Pro, Business and Enterprise, with the Pro variant restricted to the Pro, Business and Enterprise tiers, and Codex available broadly including on the cheaper Go plan. GPT-5.6 would most likely follow the same shape: a GPT-5.6 Thinking option for Plus subscribers and above, with GPT-5.6 Pro gated behind the Pro plan. None of this is confirmed, and OpenAI sometimes changes tier availability at launch, so check the official help centre once the model ships.

How It Compares

A like-for-like comparison is impossible while GPT-5.6 has no numbers. What follows is the confirmed lay of the land that GPT-5.6 will arrive into, so you can judge its claims against reality once they appear.

Versus GPT-5.5

GPT-5.5 is the model GPT-5.6 has to beat, and the internal framing reported so far, a meaningful improvement focused on agentic and Codex tasks, suggests an iterative step rather than a generational leap. If that holds, expect single-digit to low double-digit percentage-point gains on coding and agent benchmarks, plus efficiency wins, rather than a wholesale reinvention. The headline question is whether GPT-5.6 finally moves OpenAI's SWE-bench Pro figure out of the 50s.

Versus Claude Opus 4.8 and Fable 5

On the confirmed numbers for currently shipping models, Anthropic holds a clear agentic-coding lead. Claude Opus 4.8 scored 69.2% on SWE-bench Pro, and the frontier Claude Fable 5 reached 80% on the same benchmark, against GPT-5.5's 58.6%. OpenAI countered with strength on some terminal and tool-use tasks, where GPT-5.5's 82.7% Terminal-Bench 2.0 score was competitive. If GPT-5.6's agentic-coding focus is real, this Claude lead is exactly the gap OpenAI is targeting, but until GPT-5.6 posts a number, Claude remains ahead on the metric that matters most for serious software work.
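For readers who want the size of that gap in percentage points, the short Python sketch below does the subtraction using only the SWE-bench Pro figures quoted in this article. It says nothing about what GPT-5.6 will score; it only shows how far GPT-5.5's vendor-stated result sits from each Claude figure and from the 60% mark.

```python
# SWE-bench Pro scores (percent) as quoted in this article; vendor-stated.
swe_bench_pro = {
    "GPT-5.5": 58.6,
    "Claude Opus 4.8": 69.2,
    "Claude Fable 5": 80.0,
}


def gap_to(target: str, baseline: str = "GPT-5.5") -> float:
    """Percentage-point gap between two quoted scores, rounded to 1 dp."""
    return round(swe_bench_pro[target] - swe_bench_pro[baseline], 1)


gaps = {name: gap_to(name) for name in swe_bench_pro if name != "GPT-5.5"}

# Points GPT-5.5's score would need to gain to reach 60%.
points_to_sixty = round(60.0 - swe_bench_pro["GPT-5.5"], 1)

for name, gap in gaps.items():
    print(f"{name}: {gap} points ahead of GPT-5.5")
print(f"Points needed to reach 60%: {points_to_sixty}")
```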

Versus Gemini

Google's Gemini 3.1 Pro rounds out the top tier, strongest where work is long-context, multimodal or already wired into Google's cloud and tooling. On the previous generation it trailed both Claude and GPT-5.5 on hard agentic coding while staying competitive on multimodal and long-document tasks. GPT-5.6's rumoured larger context window, if it materialises, would be a direct shot at one of Gemini's traditional advantages. The realistic 2026 takeaway, which predates GPT-5.6 and will outlast it, is that the leading models are converging and the smart move is routing each task to the cheapest capable model rather than betting on one winner.
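The routing idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production router: the model names, prices and capability tags below are all hypothetical placeholders, and in practice "capable" should be decided by your own evaluations rather than hand-written tags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str                      # placeholder name, not a real product
    usd_per_million_output: float  # invented price for illustration
    capabilities: frozenset        # hand-assigned tags, an assumption


def cheapest_capable(task_needs: set,
                     options: list) -> Optional[ModelOption]:
    """Return the lowest-priced option covering every need, or None."""
    capable = [m for m in options if task_needs <= m.capabilities]
    return min(capable, key=lambda m: m.usd_per_million_output, default=None)


# Hypothetical catalogue; none of these entries describe a real model.
catalogue = [
    ModelOption("small-model", 2.0, frozenset({"chat"})),
    ModelOption("mid-model", 10.0, frozenset({"chat", "long-context"})),
    ModelOption("large-model", 30.0,
                frozenset({"chat", "long-context", "agentic-coding"})),
]

chat_pick = cheapest_capable({"chat"}, catalogue)
coding_pick = cheapest_capable({"agentic-coding", "long-context"}, catalogue)
no_pick = cheapest_capable({"video"}, catalogue)

print(chat_pick.name)    # cheapest option that can chat
print(coding_pick.name)  # only one option covers both needs
print(no_pick)           # nothing in the catalogue handles this task
```

The design choice worth noting is that price is only compared among models that already meet the task's needs, so a new release changes the routing only once it has earned a capability tag on confirmed evidence.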

Limitations & What's Unconfirmed

This is the most important section in the piece, because the safest thing to say about GPT-5.6 is how much we do not know. Here is the unconfirmed list, kept explicit so nothing here is mistaken for fact.

  • No official existence yet: OpenAI has not announced, documented or priced GPT-5.6. It may launch under a different name, on a different date, or with a different family structure than the leaks suggest.
  • No benchmarks: every capability claim, including the agentic-coding focus, is leak-based. There is no system card, no eval table and no independent testing.
  • Context window is observed, not documented: the roughly 1.5 million token figure comes from user behaviour reports, not an API spec, and could be wrong.
  • Pricing is inferred: the figures above are GPT-5.5's, used as a proxy. OpenAI has published no GPT-5.6 rates.
  • Date is a market guess: the late-June window rests on prediction markets and leak timing, not an OpenAI commitment, and release dates slip routinely.
  • Codenames are not products: internal build names appearing in logs or test arenas do not guarantee a public release or a final name.

When the official announcement and system card arrive, this article will need updating against the real figures. Until then, anyone publishing confident GPT-5.6 benchmark tables or price lists is, at best, guessing.

Who Should Use It

Strictly speaking, nobody can use GPT-5.6 yet, because it is not out. But the practical question, who should care about it and what should they do now, is worth answering.

Engineering and Codex-heavy teams have the most reason to watch this release. If the long-horizon agentic-coding gains are real, GPT-5.6 is aimed squarely at the work where OpenAI has been losing ground to Claude. The sensible move is to keep your current setup running and benchmark GPT-5.6 on your own tasks the moment it ships, rather than acting on leaked numbers.
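One way to be ready is to keep a small task suite with automatic checks, so a new model can be compared against your current one on day one. The Python sketch below is a hypothetical harness under simple assumptions: a "model" is any function from prompt to reply, and the two stub models return canned answers instead of calling a real API. The prompts, checks and stub replies are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A "model" is any function from prompt text to reply text (an assumption;
# in real use you would wrap your vendor's client in such a function).
Model = Callable[[str], str]
# A task pairs a prompt with a check that accepts or rejects the reply.
Task = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Model, tasks: List[Task]) -> float:
    """Fraction of tasks whose reply passes its check."""
    if not tasks:
        raise ValueError("task suite is empty")
    passed = sum(1 for prompt, check in tasks if check(model(prompt)))
    return passed / len(tasks)


def make_stub(answers: Dict[str, str]) -> Model:
    """Build a fake model that replies from a fixed lookup table."""
    return lambda prompt: answers.get(prompt, "")


# Toy suite; replace with tasks drawn from your own codebase or workload.
tasks: List[Task] = [
    ("Reverse the string 'abc'.", lambda reply: reply.strip() == "cba"),
    ("What is 2 + 2?", lambda reply: "4" in reply),
]

current_model = make_stub({"Reverse the string 'abc'.": "cba",
                           "What is 2 + 2?": "5"})
candidate_model = make_stub({"Reverse the string 'abc'.": "cba",
                             "What is 2 + 2?": "4"})

current_score = pass_rate(current_model, tasks)
candidate_score = pass_rate(candidate_model, tasks)

print(f"current:   {current_score:.0%}")
print(f"candidate: {candidate_score:.0%}")
```

Because the harness only depends on the prompt-to-reply function, swapping in a newly released model means writing one wrapper and rerunning the same suite.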

Everyone else can wait. For general chat, writing, analysis and research, GPT-5.5, Claude and Gemini already cover the ground, and there is nothing to rush toward in a model that has no confirmed capabilities. The right posture for most readers is to treat GPT-5.6 as a watch item, not a buy item, until OpenAI publishes the system card. If you are choosing a model today, our cross-vendor comparison is built on confirmed numbers rather than rumours.

Frequently Asked Questions

Is GPT-5.6 officially released?

No. As of 21 June 2026 there is no OpenAI announcement, system card or API page for GPT-5.6. The newest confirmed flagship is GPT-5.5, from 23 April 2026. GPT-5.6 exists only as a Codex log leak, a codename trail and reporting that Jakub Pachocki called it a meaningful improvement.

When will GPT-5.6 be released?

There is no official date. Leaks and prediction markets point to a window of roughly 15 June to 5 July 2026, centred on late June, with high market odds for a release by 30 June. These are expectations, not commitments, so the date may slip.

What benchmarks does GPT-5.6 score?

None are published. Any specific number is speculation. As a confirmed baseline, GPT-5.5 scored 58.6% on SWE-bench Pro, 93.6% on GPQA Diamond, 82.7% on Terminal-Bench 2.0 and 41.4% on Humanity's Last Exam without tools.

How much will GPT-5.6 cost?

Unconfirmed. The likely structure, inherited from GPT-5.5, is around £4 (5 US dollars) per million input and £24 (30 US dollars) per million output, with a Pro variant near £24 (30 US dollars) input and £142 (180 US dollars) output.

How does GPT-5.6 compare to Claude?

It has no numbers yet. On the confirmed prior generation, Claude led on agentic coding: Opus 4.8 at 69.2% and Fable 5 at 80% on SWE-bench Pro, versus GPT-5.5's 58.6%. Closing that gap is the obvious target for GPT-5.6.

The Bottom Line

GPT-5.6 is real enough to leak and rumoured enough to fill a thousand headlines, but as of 21 June 2026 it is not real enough to benchmark, price or properly review. The confirmed facts fit in a paragraph: OpenAI has not announced it, the newest official model is GPT-5.5, a gpt-5.6 label briefly appeared in Codex logs, and senior staff reportedly see it as a meaningful step forward aimed at agentic coding. Everything richer than that, the 1.5 million token window, the Pro and Mini variants, the late-June date, is informed speculation.

That is not a reason to ignore it. The leak trail is consistent, the cadence supports an imminent release, and the agentic focus would put OpenAI's effort exactly where it has been trailing Anthropic. The right move is to stay ready and stay sceptical: bookmark the official channels, ignore the confident benchmark charts that have no source, and test the model on your own work the day it actually ships. When OpenAI publishes the system card, we will replace the rumours here with the real numbers.

Last updated: June 2026. GPT-5.6 is unconfirmed as of 21 June 2026. Confirmed figures here are GPT-5.5's own reported results; all GPT-5.6 capability, pricing and date claims are leak-based and clearly flagged as such, and will be revised against OpenAI's official announcement and system card when published.

AI Tools Review Editorial Team
