GitHub Copilot is moving to usage-based billing:
Starting June 1, your Copilot usage will consume GitHub AI Credits.
The subscription prices stay the same: Copilot Pro is $10/month and Pro+ is $39/month.
The change is in how heavier usage gets counted. Chat, CLI usage, agent sessions, Spark, Spaces, and third-party coding agents now consume credits. Code completions and next edit suggestions stay included for paid plans.
The line that made me stop was not in the announcement post. It was in the annual-plan multiplier table:
Claude Opus 4.7 3 27
For annual Copilot Pro and Pro+ subscribers who stay on premium-request billing until their plan expires, the examples look like this:
- Claude Opus 4.7: 3x to 27x
- Claude Opus 4.6: 3x to 27x
- Claude Sonnet 4.6: 1x to 9x
- GPT-5.4 mini: 0.33x to 6x
Twenty-seven times made me blink.
I do not read this as “stop using frontier models.” If a model saves hours on a production bug, a migration, or some miserable debugging session, use the expensive model and move on with your life.
The default is where I would be more careful.
A lot of coding tasks are not frontier-model tasks:
- explain a stack trace
- draft a few tests
- rewrite a small function
- summarize a diff
- search project docs
- clean up a script
- turn a rough note into a checklist
For that work, local and self-hosted models become much easier to justify.
The good news is that open-weight models have not dried up everywhere, even if the US big-company side has been weirdly quiet after Meta’s rough Llama 4 cycle.
Some releases I am watching:
- Qwen3.6: Apache 2.0 open-weight models, including Qwen3.6-27B and Qwen3.6-35B-A3B, with support across llama.cpp, MLX, vLLM, and SGLang.
- Kimi K2.6: Moonshot describes it as an open-source model for coding, long-horizon execution, and agent workflows.
- GLM-4.6: MIT-licensed weights on Hugging Face. GLM-5.1 is aimed at longer agentic engineering tasks.
I would not want a local-model plan to depend on one vendor anyway.
My likely setup is:
- local or self-hosted model for routine or private work
- cheap fast hosted model for convenience
- expensive frontier model when the task is worth it
The expensive model can still be there. I just do not want it burning credits every time I ask for a small refactor.