Luna, new modes, pricing, and government-reviewed rollout¶

For: Developers and team operators using LLMs through APIs or coding agents who need to know what GPT-5.6 changes, how pricing works, and when broader access may arrive.

Key points:

GPT-5.6 introduces three model tiers: Sol, Terra, and Luna, plus max reasoning and ultra mode.
Pricing per one million tokens (input / output) is Sol $5 / $30, Terra $2.50 / $15, and Luna $1 / $6.
The unusual part is the release process: a limited preview after U.S. government review because of cyber capability concerns.

Published: 2026-06-28

OpenAI released GPT-5.6 as a limited preview on June 26, 2026.¹ The significance goes beyond a stronger model: model selection, reasoning depth, sub-agent execution, pricing, and release governance all changed at once.

This article asks one practical question: how should developers read GPT-5.6 right now? The short answer is to treat it as a preview for cost planning and agent architecture, not as an immediate final migration decision.

Sol: Flagship model. Supports max reasoning and ultra mode. $5 input / $30 output.
Terra: Intended to offer GPT-5.5-level performance at half the cost. $2.50 input / $15 output.
Luna: Lowest-cost option for high-volume work. $1 input / $6 output.

What OpenAI announced¶

GPT-5.6 ships as three capability tiers: Sol, Terra, and Luna. The number marks the generation, while the tier name marks the capability and cost layer.

OpenAI positions Sol as the flagship model, Terra as a lower-cost model near GPT-5.5 capability, and Luna as the cheapest tier.¹ The release is not broadly available on day one. During preview, access is limited to selected trusted partners through the API and Codex, with broader ChatGPT, Codex, and API availability described as coming within weeks.

The model names matter less than the selection logic. Sol is for the hardest tasks, Terra is the likely replacement candidate for existing high-end workloads, and Luna is the volume-oriented option.
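
The selection logic above can be written down as a small routing rule. This is a minimal sketch, not an OpenAI API: the difficulty labels, the `high_volume` flag, and the enum values are all hypothetical, and the enum values are not real model identifiers.

```python
from enum import Enum


class Tier(Enum):
    # Illustrative labels only; these are not confirmed API model IDs.
    SOL = "sol"
    TERRA = "terra"
    LUNA = "luna"


def choose_tier(difficulty: str, high_volume: bool = False) -> Tier:
    """Pick a tier from a coarse task label.

    difficulty is one of "hard", "standard", "simple" (hypothetical labels).
    """
    if difficulty == "hard":
        # Hardest tasks go to the flagship tier regardless of volume.
        return Tier.SOL
    if high_volume or difficulty == "simple":
        # Volume-oriented or simple work goes to the cheapest tier.
        return Tier.LUNA
    # Everything else is a candidate for the mid tier.
    return Tier.TERRA
```

A real router would replace the string labels with whatever signal a team already has, such as task type or historical failure rate.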

What changes for developers¶

Four changes affect implementation directly. The most important split is between deeper reasoning and multi-agent execution.

Max reasoning: a new highest reasoning effort for harder design, debugging, and research tasks.
Ultra mode: a sub-agent mode that decomposes complex work, runs parts in parallel, and integrates results.
New naming system: generation numbers and capability tiers are separated, making speed, intelligence, and cost choices clearer.
Cache changes: explicit cache breakpoints and a minimum cache lifetime of 30 minutes.

Cache pricing also changes. For GPT-5.6 and later, cache writes cost 1.25x the uncached input price, while cache reads remain discounted by 90%.¹ For long-context and iterative agent workflows, cache design can matter as much as the base model price.
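
To see why cache design matters, the multipliers above can be applied to a shared prompt prefix. The sketch below assumes one cache write on the first call and cache reads on every later call, all within the cache lifetime. It ignores the uncached suffix and output tokens, and the function and constant names are illustrative rather than part of any API.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # cache write: 1.25x the uncached input price
CACHE_READ_MULTIPLIER = 0.10   # cache read: 90% discount on the input price


def prefix_cost(input_price_per_million: float, prefix_tokens: int,
                calls: int, cached: bool) -> float:
    """Cost in USD of sending the same prefix on `calls` requests.

    Assumes one write followed by (calls - 1) reads when cached.
    """
    base = input_price_per_million * prefix_tokens / 1_000_000
    if not cached or calls == 0:
        return base * calls
    return base * (CACHE_WRITE_MULTIPLIER + CACHE_READ_MULTIPLIER * (calls - 1))
```

Under those assumptions, a 100,000-token prefix on Sol reused across 10 calls costs about $1.08 cached versus $5.00 uncached. A prefix used only once costs more cached ($0.625) than uncached ($0.50), so caching pays off from the second call onward.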

How to read the pricing¶

OpenAI lists prices per one million tokens. Because output tokens cost six times as much as input tokens on every tier, workloads that generate long answers are more sensitive to model choice.

Model	Input ($ per 1M tokens)	Output ($ per 1M tokens)	Practical reading
Sol	$5	$30	Hardest tasks and expensive decisions
Terra	$2.50	$15	Standard high-performance candidate
Luna	$1	$6	Low-cost high-volume processing

For a simple workload of one million input tokens plus one million output tokens, Sol costs $35, Terra costs $17.50, and Luna costs $7. The more output-heavy the task, the more the model and prompting strategy affect cost.
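
The same arithmetic generalizes to any token mix. The following is a minimal sketch using the listed preview prices; the dictionary and function names are illustrative, and cache pricing is left out.

```python
from decimal import Decimal

# (input, output) price in USD per one million tokens, from the table above.
PRICES_PER_MILLION = {
    "Sol": (Decimal("5"), Decimal("30")),
    "Terra": (Decimal("2.50"), Decimal("15")),
    "Luna": (Decimal("1"), Decimal("6")),
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the uncached cost in USD for one request or batch."""
    input_price, output_price = PRICES_PER_MILLION[tier]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)
```

For example, an output-heavy request of 2,000 input tokens and 8,000 output tokens works out to $0.25 on Sol and $0.05 on Luna, with output accounting for 96% of the cost in both cases.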

Terra is the practical hinge. If it really offers GPT-5.5-level capability at half the price, it becomes a direct cost-optimization candidate. That still needs validation on independent benchmarks and real workloads.

Why the preview is limited¶

The stated reason is cyber capability. OpenAI says the U.S. government requested a limited start with a small set of trusted partners whose participation was shared with the government.¹

That is not a normal model launch. OpenAI also says this type of government access process should not become the long-term default. So the rollout is also a test of how frontier models may be staged when security-sensitive capability rises.

The safety stack is layered. OpenAI describes training-time refusals, real-time output classification, pausing suspicious generations for review by a larger reasoning model, account-level review, and differential access.² The system card also says automated red teaming used the equivalent of 700,000 A100e GPU hours.

Treat benchmark claims as provisional¶

OpenAI claims improvements across several benchmarks. The announcement mentions Terminal-Bench 2.1, GeneBench v1, and ExploitBench, including cases where GPT-5.6 uses fewer output tokens for comparable or better results.¹

The caveat is simple. These are preview-period claims, and OpenAI says expanded benchmark results will come with general availability. For now, the right wording is that OpenAI claims these results, not that the market has independently confirmed them.

In practice, teams should evaluate their own tasks. Code repair, long-form research, multi-step agent work, and security-sensitive validation should compare success rate, output tokens, latency, and the cost of recovering from failures.
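
One way to fold those four measures into a comparable number is cost per successful task. The sketch below is an assumption about how a team might record results, not a published evaluation method: the `RunResult` fields and the `summarize` helper are hypothetical, and `recovery_cost_usd` is whatever estimate a team assigns to cleaning up after a failed attempt.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    succeeded: bool
    output_tokens: int
    latency_s: float
    cost_usd: float                  # model spend for this attempt
    recovery_cost_usd: float = 0.0   # estimated cost of recovering from a failure


def summarize(runs: list[RunResult]) -> dict[str, float]:
    """Aggregate per-model results into the metrics named above."""
    if not runs:
        raise ValueError("at least one run is required")
    n = len(runs)
    successes = sum(r.succeeded for r in runs)
    total_cost = sum(r.cost_usd + r.recovery_cost_usd for r in runs)
    return {
        "success_rate": successes / n,
        "mean_output_tokens": sum(r.output_tokens for r in runs) / n,
        "mean_latency_s": sum(r.latency_s for r in runs) / n,
        # Failed attempts still cost money, so divide total spend by successes.
        "cost_per_success_usd": total_cost / successes if successes else float("inf"),
    }
```

Running `summarize` once per tier on the same task set gives a side-by-side view in which a cheaper model with a lower success rate can turn out to be more expensive per completed task.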

Summary¶

GPT-5.6 is a generational update that shifts both capability and price efficiency. But preview access is limited, so it is not yet time for a blanket migration.

There are three practical decisions to prepare.

Test whether Terra can reduce cost for existing high-end model workloads.
Measure whether Sol's max reasoning and ultra mode reduce wall-clock time on difficult work.
Use Luna and cache design to estimate high-volume processing cost.

The release process itself is the additional signal. A government-reviewed staged rollout creates a governance precedent separate from model quality. Future frontier models may be compared on performance, pricing, safety controls, and release governance at the same time.

¹ OpenAI, Previewing GPT-5.6 Sol: a next-generation model, 2026-06-26.
² OpenAI, GPT-5.6 Preview system card, 2026-06-26.