Microsoft MAI-Code-1-Flash in GitHub Copilot: Availability, Pricing, and Performance

For: Developers tracking GitHub Copilot model selection, AI Credits billing, and Microsoft's in-house coding model strategy.

Key Points:

MAI-Code-1-Flash is rolling out to Copilot Free, Pro, Pro+, and Max, starting with a limited set of VS Code users².
The model has 137B total parameters and 5B active parameters; both figures are correct because they describe different aspects of its MoE architecture³.
Its Copilot pricing sits in the lightweight tier, so it is best read as an efficient everyday coding model, not a full replacement for frontier models⁵.

On June 2, 2026, Microsoft announced MAI-Code-1-Flash for GitHub Copilot¹. The easy headline is that Copilot now has a free Microsoft-built coding model. The practical reading is narrower.

Three questions matter: who can use MAI-Code-1-Flash, what it costs, and where its performance actually fits.

What Microsoft Added

MAI-Code-1-Flash is a lightweight coding model built for GitHub Copilot. Microsoft says it trained the model end to end using Microsoft infrastructure, data pipelines, and curated data³.

The key specification is the parameter pair: 137B total and 5B active. In a Mixture-of-Experts model these measure different things: total parameters count every expert in the network, while active parameters count only those used to process a given token. The official model card lists both numbers³, so neither one is a typo.

Item	Detail
Developer	Microsoft
Model type	text-to-text coding model
Architecture	sparse Mixture-of-Experts
Parameters	137B total / 5B active
Context length	256K tokens
Supported language	English
Training dates	March-May 2026
Release date	June 2, 2026
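
To make the total/active distinction concrete, the short sketch below computes what share of the parameters is active per token. The two figures come from the table above; the function name is made up for illustration, and the result describes only the parameter ratio, not speed or cost.

```python
# Parameter counts from the spec table, in billions.
TOTAL_PARAMS_B = 137.0   # every expert in the network
ACTIVE_PARAMS_B = 5.0    # parameters used to process one token


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters exercised for a single token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%}")  # Active per token: 3.6%
```

Under 4% of the network is active for any one token, which is why a 137B model can sit in a lightweight tier.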

This is not just another third-party model appearing in the Copilot picker. It is a Microsoft-controlled model lane being inserted directly into Copilot.

Availability

The plan coverage is broad, but the rollout is gradual. GitHub says MAI-Code-1-Flash is beginning to roll out to Copilot Free, Pro, Pro+, and Max, first for a limited set of users and then more broadly over the following weeks².

Area	Status
Plans	Free / Pro / Pro+ / Max
Initial client	Visual Studio Code
Selection	Model picker or Auto picker
Copilot CLI	Planned for a later rollout
API access	Awaiting future documentation

Being included in your plan does not mean the model appears in your picker today. If you do not see it in VS Code, the likely explanation is rollout timing, not a local setup issue.

The word "free" also needs care. Copilot moved to AI Credits-based usage billing on June 1, 2026, and each plan is governed by included allowances and additional spending settings⁴. MAI-Code-1-Flash is better described as a low-cost model available inside Copilot's existing usage system, not as unlimited free compute.

Performance

Microsoft's directly comparable benchmark is against Claude Haiku 4.5 in the same production harness. In that setup, MAI-Code-1-Flash beats Haiku 4.5 across the core coding evaluations Microsoft reports³.

Benchmark	MAI-Code-1-Flash	Claude Haiku 4.5
SWE-Bench Verified	71.6%	66.6%
SWE-Bench Pro	51.2%	35.2%
SWE-Bench Multilingual	65.5%	62.7%
Terminal Bench 2	54.8%	41.6%
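
The size of the gap varies a lot by benchmark. As a quick sketch, the snippet below turns the table into percentage-point differences; the scores are copied from the table, and the helper name is illustrative only.

```python
# (MAI-Code-1-Flash, Claude Haiku 4.5) scores in percent, from the table above.
scores = {
    "SWE-Bench Verified": (71.6, 66.6),
    "SWE-Bench Pro": (51.2, 35.2),
    "SWE-Bench Multilingual": (65.5, 62.7),
    "Terminal Bench 2": (54.8, 41.6),
}


def point_gaps(table: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return the MAI minus Haiku difference in percentage points."""
    return {name: round(mai - haiku, 1) for name, (mai, haiku) in table.items()}


gaps = point_gaps(scores)
for name, gap in gaps.items():
    print(f"{name}: +{gap:.1f} points")
```

The lead ranges from 2.8 points on SWE-Bench Multilingual to 16.0 points on SWE-Bench Pro.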

On SWE-Bench Verified, Microsoft reports 10.8K average solution tokens for MAI-Code-1-Flash versus 27.3K for Haiku 4.5³. That is the basis for Microsoft's claim that the model can solve harder problems with up to 60% fewer tokens¹.
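
The "up to 60%" figure can be checked against the two reported averages. This is a minimal sketch that assumes the rounded 10.8K and 27.3K values stand in for the exact token counts; the function name is illustrative.

```python
# Average solution tokens on SWE-Bench Verified, as reported (rounded).
MAI_TOKENS = 10_800
HAIKU_TOKENS = 27_300


def token_reduction(new: float, baseline: float) -> float:
    """Return the fractional reduction of `new` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 1 - new / baseline


saving = token_reduction(MAI_TOKENS, HAIKU_TOKENS)
print(f"Token reduction: {saving:.1%}")  # Token reduction: 60.4%
```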

That does not make it a frontier model. Google DeepMind's Gemini 3.1 Pro model card lists separate SWE-Bench Verified results for several models⁶. It reports 80.6% for Gemini 3.1 Pro, 80.8% for Claude Opus 4.6, and 79.6% for Claude Sonnet 4.6. Those are not directly comparable to Microsoft's production harness, but they are enough to avoid overstating MAI-Code-1-Flash as Opus- or Sonnet-class.

The useful positioning is simple. Use MAI-Code-1-Flash for completions, small refactors, repository Q&A, and short fixes. Keep larger design changes and long autonomous implementation work on stronger models unless your own measurements show otherwise.

Pricing

GitHub Docs list MAI-Code-1-Flash as a GA Microsoft model in the lightweight category⁵. Its per-1M-token Copilot prices are $0.75 input, $0.075 cached input, and $4.50 output.

Model	Category	Input ($/1M tokens)	Cached input ($/1M tokens)	Output ($/1M tokens)
MAI-Code-1-Flash	Lightweight	$0.75	$0.075	$4.50
Claude Haiku 4.5	Versatile	$1.00	$0.10	$5.00
GPT-5 mini	Lightweight	$0.25	$0.025	$2.00
Gemini 3 Flash	Lightweight	$0.50	$0.05	$3.00
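
To see how these rates combine for a single request, the sketch below prices one hypothetical workload against each row of the table. The prices are taken from the table. The token counts are invented for illustration, and the code assumes cached input is billed separately from uncached input. It reports US dollars only and does not model how dollars map to AI Credits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens, as listed in the table above."""
    input: float
    cached_input: float
    output: float


PRICES = {
    "MAI-Code-1-Flash": Price(0.75, 0.075, 4.50),
    "Claude Haiku 4.5": Price(1.00, 0.10, 5.00),
    "GPT-5 mini": Price(0.25, 0.025, 2.00),
    "Gemini 3 Flash": Price(0.50, 0.05, 3.00),
}


def request_cost(model: str, uncached_in: int, cached_in: int, out: int) -> float:
    """Estimate the USD cost of one request (assumes the three buckets are disjoint)."""
    p = PRICES[model]
    total = uncached_in * p.input + cached_in * p.cached_input + out * p.output
    return total / 1_000_000


# Hypothetical workload: 20K fresh input, 80K cached input, 2K output tokens.
costs = {m: request_cost(m, 20_000, 80_000, 2_000) for m in PRICES}
for model, cost in costs.items():
    print(f"{model}: ${cost:.4f}")
```

Price per token is only half of the estimate. A model that needs fewer output tokens or fewer turns can cost less per task even at a higher rate, so substitute your own measured token counts.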

The signal is not that Microsoft is trying to win every benchmark. It is trying to build a model with the cost, latency, and quality profile needed for everyday Copilot traffic. Under AI Credits billing, both model price and token usage determine how quickly a user's allowance is consumed⁵.

For annual Pro and Pro+ users who remain on legacy request-based billing, GitHub Docs list a 0.33 model multiplier for MAI-Code-1-Flash⁷. That same page notes that the multiplier is promotional. Do not mix that legacy multiplier with the newer AI Credits price table when estimating long-term cost.

Why This Matters

MAI-Code-1-Flash landed one day after GitHub activated usage-based billing for Copilot⁴. That timing matters.

Copilot benefits from offering OpenAI, Anthropic, Google, and other models. But from Microsoft's side, relying only on external frontier models limits control over cost, latency, routing, and product-specific tuning. An in-house model gives Microsoft more control over the default lane for high-volume coding tasks.

Microsoft announced a broader family of MAI models on the same day⁸. MAI-Code-1-Flash is the coding piece of that strategy. The strategic shift is from "Copilot as a picker of external models" toward "Copilot as a routing system with Microsoft-owned models in the efficient default path."

Summary

This is a measurement moment. When MAI-Code-1-Flash appears in your VS Code model picker, test it on the same lightweight tasks you already give to other models.

Track three things:

Whether it reduces the number of back-and-forth turns.
Whether concise output still passes tests and review.
Whether AI Credits consumption drops for similar tasks.
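
One lightweight way to track these three signals is a per-task log that you summarize by model. The sketch below is an assumption about how you might structure it, not a Copilot feature. The `TaskRun` fields, the model labels, and the sample values are all placeholders, and credits would have to be entered by hand from your own usage report.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRun:
    model: str           # label for the model under test (placeholder names below)
    turns: int           # back-and-forth turns until the task was done
    passed_review: bool  # whether the output passed tests and review
    credits: float       # AI Credits consumed, read manually from usage data


def summarize(runs: list[TaskRun]) -> dict[str, dict[str, float]]:
    """Group runs by model and average the three tracked signals."""
    by_model: dict[str, list[TaskRun]] = defaultdict(list)
    for run in runs:
        by_model[run.model].append(run)
    return {
        model: {
            "avg_turns": mean(r.turns for r in group),
            "pass_rate": sum(r.passed_review for r in group) / len(group),
            "avg_credits": mean(r.credits for r in group),
        }
        for model, group in by_model.items()
    }


# Placeholder data only; replace with your own observations.
runs = [
    TaskRun("model-a", turns=2, passed_review=True, credits=1.0),
    TaskRun("model-a", turns=4, passed_review=False, credits=2.0),
    TaskRun("model-b", turns=3, passed_review=True, credits=3.0),
]
summary = summarize(runs)
print(summary)
```

Run the same tasks through each model so the averages compare like with like.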

Do not start by using it as the default model for complex architecture changes. Its value is not replacing frontier models. Its value is reducing the cost and latency of frequent, smaller development tasks.

Competition within Copilot is no longer only about benchmark peaks. The next important question is which tasks the Auto picker routes to which model tier. MAI-Code-1-Flash is the first visible sign that Microsoft wants to control that routing layer with its own models.

1. Introducing MAI-Code-1-Flash, Microsoft AI, June 2, 2026.
2. MAI-Code-1-Flash is now available for GitHub Copilot, GitHub Changelog, June 2, 2026.
3. MAI-Code-1-Flash model card, Microsoft AI, June 2, 2026.
4. Updates to GitHub Copilot billing and plans, GitHub Changelog, June 1, 2026.
5. Models and pricing for GitHub Copilot, GitHub Docs.
6. Gemini 3.1 Pro - Model Card, Google DeepMind, February 2026.
7. Model multipliers for annual plans on request-based billing (legacy), GitHub Docs.
8. Building a hill-climbing machine: Launching seven new MAI models, Microsoft AI, June 2, 2026.