Claude Opus 4.6 Fast Mode
https://code.claude.com/docs/en/fast-mode
Anthropic has added a Fast Mode to Opus 4.6 that increases token output speed by roughly 2.5x without changing response quality. It costs significantly more (about 6x) and ships as a research preview. The mode is also available in GitHub Copilot.
GPT‑5.3‑Codex‑Spark
https://openai.com/index/introducing-gpt-5-3-codex-spark/
GPT-5.3-Codex-Spark is a smaller version of GPT-5.3-Codex, optimized for real-time code generation (exceeding 1,000 tokens per second) through a partnership with Cerebras. It is a step toward a hybrid Codex with two modes: long-horizon tasks (hours or days) and real-time interaction. The API is currently available only to partners, and pricing has not been disclosed.
Following the updates to top proprietary models, leading models from Chinese companies have also been updated.
MiniMax M2.5
https://www.minimax.io/news/minimax-m25
The new flagship model from Chinese company MiniMax runs at about 100 tokens per second, nearly twice the speed of other leading models. It completes complex tasks 37% faster than M2.1 and is on par with Claude Opus 4.6. On average, M2.5 is 10-20 times cheaper than Claude Opus, Gemini 3 Pro, and GPT-5.
The model is fully deployed in the MiniMax Agent product, where users can create their own "Experts" for specific tasks using "Office Skills."
The model will be available for free for 7 days in OpenCode.
GLM-5
https://z.ai/blog/glm-5
The new flagship open-source model from Chinese company Zhipu AI (now branded Z.ai) focuses on "agentic engineering": long-horizon complex tasks and coding at the level of frontier models. It offers a low hallucination rate, improved reasoning, and long-context support. Training is reported to have been conducted on Huawei chips.
https://www.youtube.com/watch?v=vtWMgVCMsx8
It is the leader among open-weights models according to Artificial Analysis. The model is compatible with Claude Code and OpenClaw, and is currently free on Kilo Code and OpenCode.
Ollama Cloud
https://docs.ollama.com/cloud and https://ollama.com/pricing
https://ollama.com/library/glm-5
Ollama has added commands such as ollama launch opencode --model minimax-m2.5:cloud and ollama launch claude --model glm-5:cloud, which let you run popular coding CLIs against new models pulled from the Ollama cloud. The feature is free to start, with subscription plans at $20 and $100 per month.
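A minimal sketch of the new flow, using the command names and model tags exactly as quoted in the announcement (they are taken from the source, not independently verified):

```shell
# Run OpenCode backed by MiniMax M2.5 served from Ollama's cloud
ollama launch opencode --model minimax-m2.5:cloud

# Run Claude Code backed by GLM-5 served from Ollama's cloud
ollama launch claude --model glm-5:cloud
```

The :cloud tag is what tells Ollama to serve the model from its cloud rather than pull the weights locally, which is why these large models can be used on machines that could not host them.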
#newllmmodel #claudecode #glm #minimax