Microsoft MAI-Code-1-Flash in GitHub Copilot: Availability, Pricing, and Performance
For: Developers tracking GitHub Copilot model selection, AI Credits billing, and Microsoft's in-house coding model strategy.
Key Points:
- MAI-Code-1-Flash is rolling out to Copilot Free, Pro, Pro+, and Max, starting with a limited set of VS Code users [2].
- The model has 137B total parameters and 5B active parameters; both figures are correct because they measure different things in the MoE architecture [3].
- Its Copilot pricing sits in the lightweight tier, so it is best read as an efficient everyday coding model, not a full replacement for frontier models [5].
On June 2, 2026, Microsoft announced MAI-Code-1-Flash for GitHub Copilot [1]. The easy headline is that Copilot now has a free Microsoft-built coding model. The practical reading is narrower.
Three questions matter: who can use MAI-Code-1-Flash, what does it cost, and where does its performance actually fit?
What Microsoft Added
MAI-Code-1-Flash is a lightweight coding model built for GitHub Copilot. Microsoft says it trained the model end to end using Microsoft infrastructure, data pipelines, and curated data [3].
The most important specification detail is 137B total / 5B active. In a Mixture-of-Experts model, total parameters count every expert in the network, while active parameters count only the subset used to process a given token. The official model card includes both numbers, so treating one of them as an error misses the point.
| Item | Detail |
|---|---|
| Developer | Microsoft |
| Model type | text-to-text coding model |
| Architecture | sparse Mixture-of-Experts |
| Parameters | 137B total / 5B active |
| Context length | 256K tokens |
| Supported language | English |
| Training dates | March-May 2026 |
| Release date | June 2, 2026 |
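
To make the total/active split concrete, here is a minimal sketch of the arithmetic using the two parameter counts from the table above. The function name is illustrative, and the ratio is only a rough proxy for per-token compute, not a measured figure.

```python
# Parameter counts from the spec table, in billions.
TOTAL_PARAMS_B = 137
ACTIVE_PARAMS_B = 5


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of all parameters that is active for one token."""
    return active_b / total_b


fraction = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
print(f"Active per token: {fraction:.1%} of total parameters")
# prints "Active per token: 3.6% of total parameters"
```

Roughly speaking, the total count drives how much memory the weights need, while the active count drives how much compute each token costs.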
This is not just another third-party model appearing in the Copilot picker. It is a Microsoft-controlled model lane being inserted directly into Copilot.
Availability
The plan coverage is broad, but the rollout is gradual. GitHub says MAI-Code-1-Flash is beginning to roll out to Copilot Free, Pro, Pro+, and Max, first for a limited set of users and then more broadly over the following weeks [2].
| Area | Status |
|---|---|
| Plans | Free / Pro / Pro+ / Max |
| Initial client | Visual Studio Code |
| Selection | Model picker or Auto picker |
| Copilot CLI | Planned for a later rollout |
| API access | Awaiting future documentation |
Being included in your plan does not mean the model appears in your picker today. If you do not see it in VS Code, the likely explanation is rollout timing, not a local setup issue.
The word "free" also needs care. Copilot moved to AI Credits-based usage billing on June 1, 2026, and each plan is governed by included allowances and additional spending settings [4]. MAI-Code-1-Flash is better described as a low-cost model available inside Copilot's existing usage system, not as unlimited free compute.
Performance
Microsoft's directly comparable benchmark is against Claude Haiku 4.5 in the same production harness. In that setup, MAI-Code-1-Flash beats Haiku 4.5 across the core coding evaluations Microsoft reports [3].
| Benchmark | MAI-Code-1-Flash | Claude Haiku 4.5 |
|---|---|---|
| SWE-Bench Verified | 71.6% | 66.6% |
| SWE-Bench Pro | 51.2% | 35.2% |
| SWE-Bench Multilingual | 65.5% | 62.7% |
| Terminal Bench 2 | 54.8% | 41.6% |
On SWE-Bench Verified, Microsoft reports 10.8K average solution tokens for MAI-Code-1-Flash versus 27.3K for Haiku 4.5 [3]. That is the basis for Microsoft's claim that the model can solve harder problems with up to 60% fewer tokens [1].
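
The "up to 60% fewer tokens" figure can be checked against the two averages quoted above. This is a small sketch that treats the rounded 10.8K and 27.3K values as exact, which is an assumption; the function name is illustrative.

```python
# Average solution tokens on SWE-Bench Verified, as quoted above.
# The source gives rounded values (10.8K and 27.3K); treated as exact here.
MAI_AVG_TOKENS = 10_800
HAIKU_AVG_TOKENS = 27_300


def relative_reduction(baseline: float, candidate: float) -> float:
    """Return the fraction by which candidate is smaller than baseline."""
    return 1 - candidate / baseline


reduction = relative_reduction(HAIKU_AVG_TOKENS, MAI_AVG_TOKENS)
print(f"Token reduction vs Haiku 4.5: {reduction:.1%}")
# prints "Token reduction vs Haiku 4.5: 60.4%"
```

The result applies to this one benchmark and harness; it says nothing about token savings on other workloads.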
That does not make it a frontier model. Google DeepMind's Gemini 3.1 Pro model card lists separate SWE-Bench Verified results for several models [6]. It reports 80.6% for Gemini 3.1 Pro, 80.8% for Claude Opus 4.6, and 79.6% for Claude Sonnet 4.6. Those figures are not directly comparable to results from Microsoft's production harness, but they are enough to show that MAI-Code-1-Flash should not be described as Opus- or Sonnet-class.
The useful positioning is simple. Use MAI-Code-1-Flash for completions, small refactors, repository Q&A, and short fixes. Keep larger design changes and long autonomous implementation work on stronger models unless your own measurements show otherwise.
Pricing
GitHub Docs list MAI-Code-1-Flash as a GA Microsoft model in the lightweight category [5]. Its per-1M-token Copilot prices are $0.75 input, $0.075 cached input, and $4.50 output.
| Model | Category | Input ($/1M tokens) | Cached input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| MAI-Code-1-Flash | Lightweight | $0.75 | $0.075 | $4.50 |
| Claude Haiku 4.5 | Versatile | $1.00 | $0.10 | $5.00 |
| GPT-5 mini | Lightweight | $0.25 | $0.025 | $2.00 |
| Gemini 3 Flash | Lightweight | $0.50 | $0.05 | $3.00 |
The signal is not that Microsoft is trying to win every benchmark. It is trying to build a model with the cost, latency, and quality profile needed for everyday Copilot traffic. Under AI Credits billing, both model price and token usage determine how quickly a user's allowance is consumed [5].
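
To see how price and token usage combine, the sketch below estimates the dollar cost of one task from the per-1M-token prices in the table above. The `Price` class, the `task_cost` helper, and the token counts in the usage loop are hypothetical and only illustrate the arithmetic. The conversion from dollars to AI Credits is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Copilot prices in US dollars per 1M tokens."""
    input: float
    cached_input: float
    output: float


# Values copied from the pricing table above.
PRICES = {
    "MAI-Code-1-Flash": Price(0.75, 0.075, 4.50),
    "Claude Haiku 4.5": Price(1.00, 0.10, 5.00),
    "GPT-5 mini": Price(0.25, 0.025, 2.00),
    "Gemini 3 Flash": Price(0.50, 0.05, 3.00),
}


def task_cost(model: str, input_tokens: int, cached_tokens: int,
              output_tokens: int) -> float:
    """Estimate the dollar cost of one task from its token counts."""
    price = PRICES[model]
    total = (input_tokens * price.input
             + cached_tokens * price.cached_input
             + output_tokens * price.output)
    return total / 1_000_000


# Hypothetical workload: 20K fresh input, 60K cached input, 3K output tokens.
for model in PRICES:
    print(f"{model}: ${task_cost(model, 20_000, 60_000, 3_000):.4f}")
```

Holding token counts equal across models is a simplification. In practice each model emits a different number of tokens for the same task, so measured usage matters as much as list price.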
For annual Pro and Pro+ users who remain on legacy request-based billing, GitHub Docs list a 0.33 model multiplier for MAI-Code-1-Flash [7]. That same page notes that the multiplier is promotional. Do not mix that legacy multiplier with the newer AI Credits price table when estimating long-term cost.
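
For the legacy path, a rough sketch of how a multiplier would scale request consumption looks like this. It assumes each request counts as the multiplier's worth of requests against the plan allowance; the function name and the request count are hypothetical. Because the 0.33 value is promotional, treat any result as temporary.

```python
# Legacy request-based multiplier for MAI-Code-1-Flash, as listed above.
LEGACY_MULTIPLIER = 0.33


def requests_counted(request_count: int,
                     multiplier: float = LEGACY_MULTIPLIER) -> float:
    """Return how many allowance requests a batch consumes.

    Assumes each request is charged as `multiplier` requests.
    """
    return request_count * multiplier


# Hypothetical month of 100 requests sent to the model.
print(f"Counted against allowance: {requests_counted(100):.0f} requests")
```

This number lives in request units, not dollars or AI Credits, which is why it should not be combined with the token price table.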
Why This Matters
MAI-Code-1-Flash landed one day after GitHub activated usage-based billing for Copilot [4]. That timing matters.
Copilot benefits from offering OpenAI, Anthropic, Google, and other models. But from Microsoft's side, relying only on external frontier models limits control over cost, latency, routing, and product-specific tuning. An in-house model gives Microsoft more control over the default lane for high-volume coding tasks.
Microsoft announced a broader family of MAI models on the same day [8]. MAI-Code-1-Flash is the coding piece of that strategy. The strategic shift is from "Copilot as a picker of external models" toward "Copilot as a routing system with Microsoft-owned models in the efficient default path."
Summary
This is a measurement moment. When MAI-Code-1-Flash appears in your VS Code model picker, test it on the same lightweight tasks you already give to other models.
Track three things:
- Whether it reduces the number of back-and-forth turns.
- Whether concise output still passes tests and review.
- Whether AI Credits consumption drops for similar tasks.
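
One way to track those three things is to log each task run and compare per-model averages. The sketch below is a minimal version of that idea; the `TaskRun` fields, the `summarize` helper, and the placeholder model names are hypothetical, and the sample rows are dummy values, not measurements.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskRun:
    """One logged task: which model ran it and how it went."""
    model: str
    turns: int        # back-and-forth turns needed to finish
    passed: bool      # did the result pass tests and review?
    credits: float    # AI Credits consumed, read from your usage data


def summarize(runs: list[TaskRun]) -> dict[str, dict[str, float]]:
    """Group runs by model and average the three tracked metrics."""
    by_model: dict[str, list[TaskRun]] = {}
    for run in runs:
        by_model.setdefault(run.model, []).append(run)
    return {
        model: {
            "avg_turns": mean(r.turns for r in group),
            "pass_rate": mean(1.0 if r.passed else 0.0 for r in group),
            "avg_credits": mean(r.credits for r in group),
        }
        for model, group in by_model.items()
    }


# Placeholder rows only; replace with your own logged runs.
sample = [
    TaskRun("model-a", turns=2, passed=True, credits=1.0),
    TaskRun("model-a", turns=4, passed=False, credits=3.0),
    TaskRun("model-b", turns=3, passed=True, credits=2.0),
]
for model, stats in summarize(sample).items():
    print(model, stats)
```

Comparing models on the same set of tasks keeps the averages meaningful; mixing easy tasks for one model with hard tasks for another would skew all three metrics.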
Do not start by using it as the default model for complex architecture changes. Its value is not replacing frontier models. Its value is reducing the cost and latency of frequent, smaller development tasks.
The Copilot competition is no longer only about benchmark peaks. The next important question is which tasks Auto picker routes to which model tier. MAI-Code-1-Flash is the first visible sign that Microsoft wants to control that routing layer with its own models.
[1] Introducing MAI-Code-1-Flash, Microsoft AI, June 2, 2026.
[2] MAI-Code-1-Flash is now available for GitHub Copilot, GitHub Changelog, June 2, 2026.
[3] MAI-Code-1-Flash model card, Microsoft AI, June 2, 2026.
[4] Updates to GitHub Copilot billing and plans, GitHub Changelog, June 1, 2026.
[5] Models and pricing for GitHub Copilot, GitHub Docs.
[6] Gemini 3.1 Pro - Model Card, Google DeepMind, February 2026.
[7] Model multipliers for annual plans on request-based billing (legacy), GitHub Docs.
[8] Building a hill-climbing machine: Launching seven new MAI models, Microsoft AI, June 2, 2026.