CodeWithLLM-Updates
-

Claude Max in Opencode
https://github.com/anomalyco/opencode/issues/7410
OpenCode users have started reporting that their $200/month Claude Max subscription plan can no longer be used with the tool. Some thought it was an account-specific issue, but it appears Anthropic has decided to crack down on third-party CLIs.

One commenter noted that using the "Max" plan within Opencode was never officially authorized by Anthropic, so the block was only a matter of time.

Many developers in the comments say they have cancelled their paid Claude subscriptions because the plan is not worth it to them without OpenCode. Some users are switching to "Zen," a first-party service from the OpenCode developers that works via API, with payment based on actual token usage rather than a fixed monthly fee.

https://news.ycombinator.com/item?id=46549823
Many developers on HN claim that OpenCode has recently become technically superior to Anthropic's official tool. The community believes the blocking decision is tied to telemetry: by using the official Claude Code CLI, you agree by default (or via manipulative UI) to provide Anthropic with data on how you accept or reject code. This is invaluable for training future models. Third-party clients like OpenCode "steal" this data from Anthropic.

Later, it was revealed that the "blocking" was quite primitive: OpenCode simply mimicked official behavior by sending a system prompt: "You are Claude Code, the official CLI from Anthropic." The community has already found "workarounds": changing tool names (e.g., using different capitalization) or updating plugins restores functionality for the time being.
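
The check, as described, can be illustrated with a short sketch. Everything below is illustrative: the field names follow the public Anthropic Messages API, the model id is hypothetical, and the exact payload OpenCode builds is an implementation detail of the project.

```python
import json

# Sketch of a spoofed Anthropic Messages API request body (illustrative only).
# The "system" field mimics the official CLI's identity string, which was
# reportedly all the server-side check looked for.
payload = {
    "model": "claude-opus-4-5",  # hypothetical model id
    "system": "You are Claude Code, the official CLI from Anthropic.",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Refactor utils.py to remove duplication."}
    ],
}

body = json.dumps(payload)
print("Claude Code" in payload["system"])  # the string the check matched on
```

If a simple string match like this really was the whole check, it explains why trivial changes (different capitalization, renamed tools) were enough to restore access.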

https://github.com/anomalyco/opencode/releases/tag/v1.1.11
Opencode has added support for OpenAI's Codex pricing plan authentication.

Decline in Code Generation Quality
https://spectrum.ieee.org/ai-coding-degrades
Data scientist Jamie Twiss says that in his experience, AI code generation agents reached a plateau in 2025 and the quality of their work has now begun to decline.

His hypothesis: earlier models often made obvious mistakes in syntax or structure, which users caught and rejected. But a growing share of AI code-generator users are "vibe" programmers who accept code without reading it closely. If a user accepts the code, the model considers its job done. This is how Reinforcement Learning from Human Feedback (RLHF) works: it "poisons" new model iterations and teaches them to "please" the user by masking problems instead of writing correct and secure code.
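
The feedback loop he describes reduces to a toy sketch (purely illustrative; this is not any vendor's actual RLHF pipeline): if the reward signal is simply "did the user accept the diff," code that merely looks right earns the same reward as code that is right.

```python
# Toy illustration of acceptance-based reward: code that merely *looks*
# correct earns the same reward as code that *is* correct.
def acceptance_reward(accepted: bool) -> int:
    return 1 if accepted else 0

# A careless ("vibe") reviewer accepts anything that looks plausible.
def careless_review(code: dict) -> bool:
    return code["looks_correct"]

samples = {
    "masked_bug":   {"looks_correct": True,  "is_correct": False},
    "real_fix":     {"looks_correct": True,  "is_correct": True},
    "obvious_typo": {"looks_correct": False, "is_correct": False},
}

rewards = {name: acceptance_reward(careless_review(c))
           for name, c in samples.items()}
print(rewards)  # masked_bug and real_fix receive identical reward
```

Under this signal the optimal policy is to maximize plausibility, not correctness, which is exactly the degradation Twiss describes.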

https://news.ycombinator.com/item?id=46542036
A significant portion of HN commenters believe that models are not getting worse; rather, their "capability architecture" is changing, and old prompting methods no longer work. Developers must constantly adapt. Working with AI is not about magically correct answers every time, but a separate engineering discipline that requires thorough auditing and complex control tools.

Some suggest that large subscription providers dynamically swap large models for smaller (distilled) ones during peak loads. Because of this, users periodically experience AI "stupidity."

Autonomous Coding Experiment
https://cursor.com/blog/scaling-agents
Cursor launched hundreds of AI agents simultaneously to work on a single shared project for weeks without human intervention. The idea is to move from the "one chatbot solves one task" format to a "virtual IT company" model, where agents work in parallel without interfering with each other.

The main takeaway: simply increasing the number of agents helps on complex tasks only if prompts and models are configured properly (Opus 4.5 tends to "cut corners," while GPT-5.2 is better at long-term planning). The solution was a hierarchical "Planners and Workers" approach: Planners continuously explore the code and create tasks, while Workers implement them without being distracted by overall coordination.
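
The hierarchy described above can be sketched in a few lines (illustrative only; Cursor has not published its orchestration code). A planner scans the project and emits tasks into a queue; workers drain the queue independently, so neither role has to coordinate with the other.

```python
# Minimal sketch of a "Planners and Workers" split (not Cursor's actual code).
from queue import Queue
from threading import Thread

tasks: Queue = Queue()
done: list[str] = []

def planner(files: list[str]) -> None:
    for f in files:                   # "explore the code, create tasks"
        tasks.put(f"add tests for {f}")
    tasks.put(None)                   # sentinel: planning finished

def worker() -> None:
    while (task := tasks.get()) is not None:
        done.append(f"done: {task}")  # "implement without coordinating"
    tasks.put(None)                   # pass the sentinel on to other workers

planner(["parser.rs", "layout.rs"])
threads = [Thread(target=worker) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(sorted(done))
```

The queue is the only shared state: planners never wait on workers, and adding more workers scales the "company" without extra coordination logic.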

Agents wrote over a million lines of code, building a web browser, a Windows 7 emulator, and an Excel clone from scratch.

https://www.youtube.com/watch?v=U7s_CaI93Mo

Agents created a browser, but it doesn't work
https://emsh.cat/cursor-implied-success-without-evidence/
A blog post by embedding-shapes debunks this "success." The author claims that Cursor's experiment is a marketing illusion and fiction, and the agents' output is non-working garbage: the project cannot be built. The cargo build command returns dozens of errors. Agents spent weeks writing code but seemingly never tested it for functionality and ignored compilation errors.

This is "AI slop"—generated text that looks like code but lacks real logic or a working structure. The agents simply "inflated" the volume (a million lines) but failed the basic minimum: creating a program that at least launches and opens a simple HTML file. In other words, they created code, not a program.

https://news.ycombinator.com/item?id=46646777
Users (specifically nindalf) looked into the dependency file (Cargo.toml) and discovered that the "browser" uses ready-made components from Servo (a browser engine developed by Mozilla/Igalia) for HTML and CSS parsing, as well as the QuickJS library for JavaScript. Cursor's claim that agents wrote all of this "from scratch" was deemed a lie. The code generated by the agents is mostly "glue" connecting existing third-party libraries.

The community confirmed the findings of the embedding-shapes author: the code does not compile, tests fail, and the commit history shows that agents simply generated gigabytes of text without functional verification. The claims about "millions of lines of code" and "autonomous agents" are targeted at managers and investors who won't check the repository. The situation is being compared to fraud.

Getting Started with Codex
https://www.youtube.com/watch?v=px7XlbYgk7I
OpenAI released a detailed 53-minute workshop on how to start working with Codex, their code generation tool. The presentation covers all stages: from installation to advanced use cases.

The workshop explains the differences between Codex in the terminal (CLI), as a VS Code extension, and in the cloud; what the AGENTS.md file does; and how to connect external services (e.g., Jira, Figma, documentation databases) via MCP servers.

Effective Prompting: Using @ to reference specific files. Ability to add screenshots (e.g., UI mockups) for code generation. Session restoration (codex resume) to continue working on complex tasks.

Advanced Scenarios: Code Review. Writing unit tests and documentation. Automated fixing of failed tests in CI/CD pipelines. Generating diagrams (Mermaid sequence diagrams) to explain code logic.

How Codex Works
https://openai.com/index/unrolling-the-codex-agent-loop/
Recently, distrust towards Anthropic has been growing, with many highlighting that Claude Code is not an open-source project. Against this backdrop, OpenAI has an opportunity to promote Codex. They released an article emphasizing that their project is open-source, allowing anyone to audit the code, and explained how it works.

At the core of Codex CLI is an "agent loop" that coordinates interaction between the user, the AI model, and tools. This loop repeats until the model provides a final text response. Constructing the initial prompt is a complex procedure: it consists of system instructions, a list of available tools (both built-in and external via MCP servers), and a description of the local environment.
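
The loop described above might look roughly like this (a simplified sketch; the real codex-rs implementation is in Rust and far more involved, and the message format here is invented for illustration):

```python
# Hedged sketch of an agent loop: feed tool results back to the model
# until it returns a plain text answer.
def agent_loop(model, tools: dict, user_prompt: str, max_turns: int = 10):
    # Initial prompt: system instructions + tool list + environment description.
    messages = [
        {"role": "system", "content": "instructions + tool schemas + env info"},
        {"role": "user", "content": user_prompt},
    ]
    for _ in range(max_turns):
        reply = model(messages)           # one model call per turn
        if reply["type"] == "text":       # final text response: loop ends
            return reply["content"]
        # Otherwise the model asked to run a tool (built-in or via MCP).
        result = tools[reply["tool"]](reply["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("no final answer within max_turns")

# Fake model for demonstration: first asks to run `ls`, then answers.
def fake_model(messages):
    if any(m["role"] == "tool" for m in messages):
        return {"type": "text", "content": "two files found"}
    return {"type": "tool_call", "tool": "shell", "args": "ls"}

answer = agent_loop(fake_model, {"shell": lambda a: "main.rs\nlib.rs"}, "count files")
print(answer)
```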

Architecturally, Codex uses a stateless approach, moving away from the previous_response_id parameter. This means all necessary information is resent in every request, supporting a "Zero Data Retention" policy for enterprise clients. It is possible to use the gpt-oss model via Ollama 0.13.4+ or LM Studio 0.3.39+ entirely locally.
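
The stateless design can be sketched as follows (illustrative; this is not the actual Codex wire format, and the model id is a placeholder): the client re-sends the entire transcript on every request instead of referencing server-side state.

```python
# Sketch of the stateless approach: no previous_response_id, the full
# conversation goes out with every request, so the server can store nothing.
history = []

def build_request(new_user_message: str) -> dict:
    history.append({"role": "user", "content": new_user_message})
    # Snapshot the whole transcript into the request -- this is what makes
    # a Zero Data Retention policy possible on the server side.
    return {"model": "gpt-5.2-codex", "input": list(history)}  # model id hypothetical

r1 = build_request("open main.rs")
history.append({"role": "assistant", "content": "opened"})
r2 = build_request("now run the tests")
print(len(r1["input"]), len(r2["input"]))  # second request carries the full transcript
```

The cost of this design is that requests grow with the conversation, which is why context compaction (mentioned below in the HN discussion) matters so much.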

https://news.ycombinator.com/item?id=46737630
Many were pleasantly surprised by the transition to Rust (the codex-rs project), which has become the primary version, though some are confused by the npm installation method. The context compaction mechanism (/responses/compact) was highly praised as being superior to competitors.

Agent Skills Adaptation
https://agentskills.io/home
Following Anthropic's rollout of the Skills API (skills-2025-10-02) and the release of the standard on December 18, 2025, OpenAI quietly responded almost immediately: GPT-5.2 Thinking gained /home/oai/skills in ChatGPT, and Codex gained skills.md support. Microsoft integrated support into VS Code in December as well, as did Cursor.

https://opencode.ai/docs/skills/ in OpenCode CLI v1.0.186, December 22, 2025.
https://qwenlm.github.io/qwen-code-docs/en/users/features/skills/ in Qwen code v0.6, December 26, 2025.
https://geminicli.com/docs/cli/skills/ in Gemini CLI v0.23, January 7, 2026.

Clawdbot
https://molt.bot/ and https://www.clawhub.ai/
Skills are exactly what make Clawdbot/Moltbot such a powerful tool.

Atlassian, Figma, Canva, Stripe, Notion, Zapier—just as they did with the Model Context Protocol (MCP) a year ago—have also released their own skills.

Catalogs have started to emerge:

https://github.com/runkids/skillshare - synchronization of skills between Claude Code, ClawdBot, OpenCode, etc.

Ollama Updates
https://github.com/ollama/ollama/releases
Ollama is a project for automating and simplifying the deployment of open LLMs locally. It allows generation to take place directly on your own hardware, protecting private data and removing dependence on network access.

v0.14 - added compatibility with the Anthropic API. Now any open model can be connected to Claude Code.
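
Based on that integration, wiring a client to a local Ollama server might look like the sketch below. The env-var names are those Claude Code reads for custom endpoints, and the model name is a placeholder; treat the details as assumptions and check Ollama's integration docs.

```python
import json
import os

# Sketch of pointing an Anthropic-style client at local Ollama (v0.14+
# exposes an Anthropic-compatible API). Values here are assumptions.
os.environ["ANTHROPIC_BASE_URL"] = "http://localhost:11434"  # local Ollama server
os.environ["ANTHROPIC_MODEL"] = "qwen3-coder"                # placeholder local model

# A request body in Anthropic Messages format that such a server would accept:
body = json.dumps({
    "model": os.environ["ANTHROPIC_MODEL"],
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "write a hello-world in Rust"}],
})
print(json.loads(body)["model"])
```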

https://docs.ollama.com/integrations/claude-code
v0.15 - a new convenient launch command, ollama launch, for using Ollama models with Claude Code, Codex, OpenCode, and Droid without separate configuration.

https://www.youtube.com/watch?v=3x2q6-5XbQ8

Of course, the generation quality will be lower than with Anthropic models, but it's 100% private and works without the internet.

https://docs.ollama.com/integrations/clawdbot
Later, ollama launch clawdbot was added to run Clawdbot/Moltbot/OpenClaw with local models.