GitHub Copilot is moving to usage-based billing:
Starting June 1, your Copilot usage will consume GitHub AI Credits.
The subscription prices stay the same: Copilot Pro is $10/month and Pro+ is $39/month.
The important change is that heavier usage is now metered by tokens: chat, CLI usage, agent sessions, Spark, Spaces, and third-party coding agents all consume credits. Code completions and next edit suggestions remain included in paid plans.
The part that stood out to me is the annual-plan multiplier table.
That table applies to annual Copilot Pro and Pro+ subscribers who stay on premium-request billing until their plan expires. Some examples from the table:
- Claude Opus 4.7: 3x to 27x
- Claude Opus 4.6: 3x to 27x
- Claude Sonnet 4.6: 1x to 9x
- GPT-5.4 mini: 0.33x to 6x
Those are big jumps: a ninefold increase for the Claude models, and roughly eighteenfold for GPT-5.4 mini.
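To make the multipliers concrete, here is a toy calculation. The monthly allowance of 300 base requests and the one-credit-per-request accounting are my own assumptions for illustration, not numbers from GitHub's pricing docs:

```python
# Toy illustration of how multipliers shrink an allowance.
# ASSUMPTIONS (not from GitHub's pricing): a hypothetical budget of
# 300 base requests per month, where each request against a model
# costs (1 request * that model's multiplier).

MONTHLY_ALLOWANCE = 300  # hypothetical base-request budget

def requests_within_allowance(multiplier: float,
                              allowance: float = MONTHLY_ALLOWANCE) -> int:
    """How many requests fit in the allowance at a given multiplier."""
    return int(allowance // multiplier)

for name, multiplier in [
    ("GPT-5.4 mini (0.33x)", 0.33),
    ("Claude Sonnet 4.6 (9x)", 9),
    ("Claude Opus 4.7 (27x)", 27),
]:
    print(f"{name}: {requests_within_allowance(multiplier)} requests")
```

Under those assumed numbers, the same budget covers 909 cheap-model requests but only 11 Opus requests at the 27x multiplier, which is the whole argument for not defaulting to the frontier model.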
I do not think this means “stop using frontier models.” If a model saves hours on a hard migration or production bug, use it.
But it does make the default model choice matter more. Plenty of coding tasks are routine:
- Explain a stack trace
- Draft tests
- Rewrite a small function
- Summarize a diff
- Search project docs
- Clean up a script
For those, local or self-hosted models look more practical now.
Some relevant open-weight releases:
- Qwen3.6: Apache 2.0 open-weight models, including Qwen3.6-27B and Qwen3.6-35B-A3B, with support in llama.cpp, MLX, vLLM, and SGLang.
- Kimi K2.6: Moonshot describes this as an open-source model for coding, long-horizon execution, and agent workflows.
- GLM-4.6: MIT-licensed weights on Hugging Face. GLM-5.1 is aimed at longer agentic engineering tasks.
Meta’s Llama 4 launch was less convincing for me because of the benchmark confusion and mixed reports after release. I would not want my local-model plan to depend on one vendor anyway.
My likely setup:
- Local or self-hosted model for routine/private work.
- Cheap fast cloud model for convenience.
- Expensive frontier model when the task justifies it.
The expensive model can still be there. It just should not be the default for every small task.
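The tiered setup above can be sketched as a simple router. The task taxonomy, flags, and model names here are illustrative placeholders I am inventing for the sketch, not real endpoints or a real API:

```python
# Minimal sketch of tiered model routing: local for routine/private
# work, cheap cloud by default, frontier only when the task justifies
# it. All names below are placeholders, not real model endpoints.

ROUTINE_TASKS = {"explain_trace", "draft_tests", "rewrite_function",
                 "summarize_diff", "search_docs", "cleanup_script"}

def pick_model(task: str, private: bool = False, hard: bool = False) -> str:
    """Route a task to a tier: local, cheap cloud, or frontier."""
    if private or task in ROUTINE_TASKS:
        return "local-model"       # local/self-hosted: routine or private work
    if hard:
        return "frontier-model"    # expensive model when the task justifies it
    return "cheap-cloud-model"     # fast cloud model for convenience

print(pick_model("summarize_diff"))         # routine task -> local
print(pick_model("migrate_db", hard=True))  # hard task -> frontier
print(pick_model("brainstorm_names"))       # everything else -> cheap cloud
```

Note the precedence: private work routes local even when the task is hard, which matches keeping sensitive code off third-party APIs entirely.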