Cerebras GLM 4.6
https://inference-docs.cerebras.ai/support/change-log
Cerebras announced that the new GLM 4.6 is replacing the Qwen3 Coder 480B model; the change also applies to the Cerebras Code subscription ($50 or $200/month). The model is suitable for fast UI iterations and refactoring.
- GLM 4.6 runs at 1000 tokens/second, which is fast but still roughly half the speed of Qwen3 Coder
- Code quality approaches that of Claude Sonnet 4.5, which makes it competitive, but the model gets confused easily on complex tasks
- It makes fewer tool-call errors than Qwen3, but it sometimes switches to Chinese or cuts its output short
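To put the speed difference in concrete terms, here is a rough back-of-the-envelope sketch. It assumes a steady decode rate and ignores time to first token; the 2000 tokens/second figure for Qwen3 Coder is inferred from the "roughly twice" comparison rather than quoted from a source, and the 4000-token response size is purely hypothetical.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate (ignores time to first token)."""
    return num_tokens / tokens_per_second


GLM_46_TPS = 1000.0       # figure quoted in the note above
QWEN3_CODER_TPS = 2000.0  # assumption: roughly twice the GLM 4.6 rate

OUTPUT_TOKENS = 4000      # hypothetical size of one refactoring response

glm_time = generation_seconds(OUTPUT_TOKENS, GLM_46_TPS)
qwen_time = generation_seconds(OUTPUT_TOKENS, QWEN3_CODER_TPS)

print(f"GLM 4.6: {glm_time:.1f} s, Qwen3 Coder: {qwen_time:.1f} s")
```

Under these assumptions a response of that size takes a couple of seconds longer, so for interactive work the quality and reliability differences probably matter more than raw throughput.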
https://news.ycombinator.com/item?id=45852751
The discussion concluded that the replacement makes sense for Cerebras (GLM 4.6 is an open model with a clear roadmap), but for users it is a sidestep rather than a step forward: Qwen3 was the better choice for many tasks.