<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Code With LLM Updates</title>
    <link>https://aicode.danvoronov.com/</link>
    <atom:link href="https://aicode.danvoronov.com/feed.xml" rel="self" type="application/rss+xml" />
    <description>Updates and tips about using Large Language Models (LLM) for programming and development</description>
    <language>en</language>
    <lastBuildDate>Fri, 24 Apr 2026 14:00:14 GMT</lastBuildDate>
    
      <item>
        <title><![CDATA[2026-04-24 14:37]]></title>
        <link>https://aicode.danvoronov.com/2026-04/24_14-37/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-04/24_14-37/</guid>
        <description><![CDATA[<p><strong>Zed Agent Interface</strong><br><a target="_blank" href="https://zed.dev/blog/parallel-agents">https://zed.dev/blog/parallel-agents</a><br>Following Cursor&#39;s lead, Zed is adapting its interface to manage multiple agent chats simultaneously. The main innovation is the <strong>Threads Sidebar</strong>, which helps group threads by project, flexibly configure agent access to repositories, and track their progress. AI panels have been moved to the left, while files and Git are now on the right. </p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=OLit5C1XE0k">https://www.youtube.com/watch?v=OLit5C1XE0k</a></p>
<p><strong>Discussion</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47866750">https://news.ycombinator.com/item?id=47866750</a><br>Many programmers are dissatisfied with the interface changes, noting that running several agents <strong>simultaneously</strong> creates a massive &quot;cognitive load&quot; and complicates code review, as AI still generates too much &quot;garbage&quot; code. Users also mention the unfinished Git interface and the lack of proper code review tools—issues that should be addressed first.</p>
<p>The biggest pain point remains the <strong>isolation of databases</strong>, configurations, ports, and test data. Developers are actively discussing how to automate this: some write custom shell scripts, some use Devcontainers, while others praise third-party tools like Conductor or Ouijit for managing the lifecycle of such environments.</p>
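<p>As a rough illustration of what such automation involves, here is a minimal sketch that gives each agent its own port, working directory, and database file. The helper names (<code>free_port</code>, <code>agent_env</code>) and the variable names (<code>PORT</code>, <code>DATABASE_URL</code>, <code>AGENT_WORKDIR</code>) are assumptions made for this example; they do not come from Zed, Conductor, or Ouijit.</p>

```python
import socket
import tempfile
from pathlib import Path

def free_port() -> int:
    """Ask the OS for a TCP port that is unused at this moment."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return s.getsockname()[1]

def agent_env(agent_name: str) -> dict:
    """Build isolated settings for one agent (hypothetical layout)."""
    workdir = Path(tempfile.mkdtemp(prefix=agent_name + "-"))
    db_path = workdir / "test.db"
    return {
        "PORT": str(free_port()),
        "DATABASE_URL": "sqlite:///" + str(db_path),
        "AGENT_WORKDIR": str(workdir),
    }

# One environment per parallel agent.
envs = {name: agent_env(name) for name in ("agent-a", "agent-b")}
for name, env in envs.items():
    print(name, env)
```

<p>The port is only known to be free at the moment of the check, so a real setup would still have to handle the case where another process grabs it first.</p>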
<p><strong>Claude Design</strong><br><a target="_blank" href="https://www.anthropic.com/news/claude-design-anthropic-labs">https://www.anthropic.com/news/claude-design-anthropic-labs</a><br>Anthropic introduced a specialized AI tool based on the new Claude Opus 4.7 model and a design system (DESIGN.md) tailored for the product design process. It creates fully functional interactive prototypes, presentations, landing pages, and UI components, outputting ready-to-use HTML, CSS, and JavaScript code in real-time. </p>
<p>A one-click export allows users to transfer the finished design directly into the Claude Code environment.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#zed';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#zed</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span> </p>
]]></description>
        <pubDate>Fri, 24 Apr 2026 14:37:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-04-24 13:08]]></title>
        <link>https://aicode.danvoronov.com/2026-04/24_13-08/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-04/24_13-08/</guid>
        <description><![CDATA[<p>Models updated, all promising agency:</p>
<ul>
<li><strong>DeepSeek</strong> V3.2 -&gt; <a target="_blank" href="https://api-docs.deepseek.com/news/news260424">V4</a>. Two versions: V4-Pro and V4-Flash. Open-source. Context: 1M in, 384K out. China. Aimed at cheaper scenarios for long documents, agents, and automation. Code quality is lower than that of the other announced models.</li>
<li><strong>GPT</strong>-5.4 -&gt; <a target="_blank" href="https://openai.com/index/introducing-gpt-5-5/">GPT-5.5</a>. Presented as an agent that can be trusted with work where the model must plan several steps ahead. Code generation is even better according to tests, while token consumption remains the same. The best model on the market right now, according to OpenAI.</li>
<li><strong>Kimi</strong> K2.5 -&gt; <a target="_blank" href="https://www.kimi.com/blog/kimi-k2-6">K2.6</a>. Open-source. China. Moonshot AI positions the model as an agent for long-term programming tasks.  </li>
<li><strong>GLM</strong>-5 -&gt; <a target="_blank" href="https://z.ai/blog/glm-5.1">5.1</a>. Open-source. China. Claims significant improvements in code generation and cybersecurity.</li>
<li><strong>Qwen</strong> 3.5 -&gt; 3.6. Qwen3.6-Plus released as a closed model, followed by the flagship <a target="_blank" href="https://qwen.ai/blog?id=qwen3.6-max-preview">Qwen3.6-Max-Preview</a>. </li>
<li><strong>MiniMax</strong> M2.5 -&gt; <a target="_blank" href="https://www.minimax.io/models/text/m27">M2.7</a>. Open-weights. China. Also for long tasks; said to have good emotional intelligence and stability on OpenClaw skills.</li>
<li>Important open-source / open-weight releases of small Qwen3.6 models for coding: <strong>Qwen3.6-35B-A3B</strong> — MoE model (35B total / 3B active), and <strong>Qwen3.6-27B</strong> — <a target="_blank" href="https://qwen.ai/blog?id=qwen3.6-27b">dense 27B</a>. These are particularly interesting for running on local hardware.</li>
</ul>
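<p>To see why these two sizes are interesting for local hardware, here is a back-of-the-envelope sketch. The parameter counts come from the list above; the function name is hypothetical, and the estimate covers weights only (no KV cache, activations, or runtime overhead), so treat the numbers as a lower bound rather than a measured requirement.</p>

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    bytes_per_param = bits_per_param / 8
    return params_billions * bytes_per_param

# Parameter counts as stated in the list above.
models = {
    "Qwen3.6-35B-A3B (MoE)": {"total": 35.0, "active": 3.0},
    "Qwen3.6-27B (dense)": {"total": 27.0, "active": 27.0},
}

for name, p in models.items():
    active_share = p["active"] / p["total"]
    print(name)
    print(f"  active share per token: {active_share:.1%}")
    print(f"  weights at 16-bit: {weight_memory_gb(p['total'], 16):.1f} GB")
    print(f"  weights at 4-bit:  {weight_memory_gb(p['total'], 4):.1f} GB")
```

<p>The MoE model still has to keep all 35B parameters in memory, but only about 3B of them take part in each token, which is why it tends to generate faster than a dense model of similar size.</p>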
<p>On the SWE-Bench Pro test, GPT-5.5, Kimi K2.6, GLM-5.1, Qwen3.6 Plus, MiniMax M2.7, and DeepSeek-V4-Pro-Max all score in the 55–59% range, so they already form a dense group of strong coding/agent models.</p>
<p><strong>End of free Qwen Code</strong><br><a target="_blank" href="https://www.reddit.com/r/Qwen_AI/comments/1skeeu5/goodbye_qwen_you_tried_but_you_failed/">https://www.reddit.com/r/Qwen_AI/comments/1skeeu5/goodbye_qwen_you_tried_but_you_failed/</a><br>The Qwen OAuth free tier for Qwen Code was disabled on April 15, 2026. The old &quot;log in via browser and use for free&quot; scenario no longer works and instead returns errors such as <code>401 invalid access token</code>, <code>token expired</code>, <code>Internal error</code>, or <code>free tier quota exceeded</code>.</p>
<p><strong>Test removal of Claude Code from the $20 plan</strong><br><a target="_blank" href="https://www.reddit.com/r/ClaudeAI/comments/1ss3asp/does_claudes_20_plan_no_longer_include_claude_code/">https://www.reddit.com/r/ClaudeAI/comments/1ss3asp/does_claudes_20_plan_no_longer_include_claude_code/</a><br>On April 21, 2026, users noticed that Claude Code had disappeared from the $20 Pro plan on Anthropic&#39;s pricing page, remaining only in the more expensive Max plans. Anthropic explained that this was an A/B test / pricing experiment affecting approximately 2% of new users. </p>
<p>It seems cheap AI coding is gradually coming to an end.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span> </p>
]]></description>
        <pubDate>Fri, 24 Apr 2026 13:08:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-04-18 10:08]]></title>
        <link>https://aicode.danvoronov.com/2026-04/18_10-08/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-04/18_10-08/</guid>
        <description><![CDATA[<p>While Anthropic is integrating Claude Code into its Work desktop app (finally adding parallel sessions: <a target="_blank" href="https://claude.com/blog/claude-code-desktop-redesign">https://claude.com/blog/claude-code-desktop-redesign</a>), OpenAI is coming from a different angle: this week they updated the Codex coding app and added computer control features. Different paths, same result.</p>
<p><strong>Codex as a Superapp</strong><br><a target="_blank" href="https://openai.com/index/codex-for-almost-everything/">https://openai.com/index/codex-for-almost-everything/</a><br>On macOS, Codex can now see the screen, move its own cursor, click, type text, open any application, and work in the background. Across all platforms, there is a built-in browser, image generation, memory (remembers preferences and previous actions — not yet in EU/UK), and over 90 plugins and integrations.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=sdNoaztocs0">https://www.youtube.com/watch?v=sdNoaztocs0</a></p>
<p>While Codex has introduced a very &quot;Cursor-like&quot; pleasant feature — where you can simply click on any element (button, block, text, image) in a generated website to immediately add it to the prompt as a reference — the general trend of both companies (Anthropic and OpenAI) expanding their product audiences is slightly concerning for programmers.</p>
<p><strong>Discussion</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47796469">https://news.ycombinator.com/item?id=47796469</a><br>Many see this as a revolution for ordinary people (non-programmers): agents will be able to create personal UIs, automate business processes, replace entire programs, and radically increase productivity. At the same time, programmers are wary — <strong>security</strong> and privacy are still neglected: full agent access turns a computer into a &quot;hostile device&quot; where even a .txt file becomes an attack vector.</p>
<p><strong>ChatGPT Pro for $100/mo</strong><br><a target="_blank" href="https://help.openai.com/en/articles/9793128-about-chatgpt-pro-tiers">https://help.openai.com/en/articles/9793128-about-chatgpt-pro-tiers</a><br>At the beginning of April, the Codex token promotion ended; now, a free account can only run about two simple tasks before hitting the weekly limit. The $20 Plus plan also doesn&#39;t offer much headroom now, as the weekly limit is only suitable for light work (1-2 hours a day). That’s why, as of April 9th, an intermediate option between Plus and the $200 Pro was added. </p>
<p>The new $100 Pro tier has 5× higher limits than Plus and provides access to GPT-5.4 Pro and GPT-5.3 Instant. There is also a promotion until May 31, 2026, offering double tokens.</p>
<p>This is a direct response to Anthropic, which has Claude Max for $100.</p>
<p><strong>Opus 4.7</strong><br><a target="_blank" href="https://www.anthropic.com/news/claude-opus-4-7">https://www.anthropic.com/news/claude-opus-4-7</a><br>Claude Opus has been updated from 4.6 to 4.7: same features, but even better benchmark results. They added &quot;adaptive thinking&quot;: the model decides for itself how much to &quot;think&quot; before responding and hides the internal reasoning (it no longer shows the full chain of thought by default).</p>
<p><strong>Discussion</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47793411">https://news.ycombinator.com/item?id=47793411</a><br>The model has become stronger, especially in coding and long contexts. However, it is becoming less debuggable. It is now impossible to properly disable adaptive thinking, which makes Claude Code even worse to use; one has to jump through hoops with commands like <code>/effort xhigh</code>, <code>CLAUDE_CODE_DISABLE_1M_CONTEXT=1</code>, <code>&quot;display&quot;: &quot;summarized&quot;</code>, etc., just to understand what the model is generating.</p>
<p>Anthropic makes cool models, but the programming tools around them are getting worse.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#codex';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#codex</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span></p>
]]></description>
        <pubDate>Sat, 18 Apr 2026 10:08:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-04-12 20:05]]></title>
        <link>https://aicode.danvoronov.com/2026-04/12_20-05/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-04/12_20-05/</guid>
        <description><![CDATA[<p>For about three years, most programming applications were clones of VS Code with a side chat. A new wave seems to have been started by Codex — they released their desktop app on Electron without VSC, as did OpenCode.</p>
<p><strong>Cursor 3</strong><br><a target="_blank" href="https://cursor.com/blog/cursor-3">https://cursor.com/blog/cursor-3</a><br>The company has completely abandoned the VS Code fork model and built a new interface code-named Glass. The main innovation is the Agents Window, built from the ground up, which allows running an unlimited number of agents in parallel: locally, in a worktree, via SSH, in the cloud, or even across multiple repositories at once. The new part is reportedly <strong>written in Rust</strong>+TS.</p>
<p><a target="_blank" href="https://cursor.com/blog/agent-web">https://cursor.com/blog/agent-web</a><br>Later, they integrated mobile devices via <strong>PWA</strong>. Cursor Agents on web and mobile is an official way to run cloud agents directly from a phone or mobile browser. You can start a chat on your phone and continue on your desktop (or vice versa).</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=HTKGyLar8AU">https://www.youtube.com/watch?v=HTKGyLar8AU</a></p>
<p>The phrase &quot;Cursor 3 just killed the IDE&quot; is repeated as the main hook. </p>
<p><strong>Discussion</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47618084">https://news.ycombinator.com/item?id=47618084</a><br>Many praise the boldness and technical progress of an agentic future, but even more express disappointment or even outrage that Cursor is radically moving away from the familiar &quot;IDE + plugins + AI assistant&quot; model. Critics argue the company is chasing investor hype that &quot;AI will replace developers&quot; rather than addressing programmers&#39; real needs.</p>
<p>People who want to write code rather than manage a team of agents will have to look for something else, like VS Code or Zed.</p>
<p><strong>App from The Factory</strong><br><a target="_blank" href="https://factory.ai/news/factory-desktop">https://factory.ai/news/factory-desktop</a><br>Another company made a similar interface clone for &quot;agent management.&quot; Interestingly, after installing on Windows 11, it tells me &quot;Not connected to Local Machine. Please download and start the Desktop app, or upgrade to a paid plan to unlock more features,&quot; asking me to download their app. While their design is very cool, I couldn&#39;t even test their buggy Electron app.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#cursor';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#cursor</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#factory';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#factory</span></p>
]]></description>
        <pubDate>Sun, 12 Apr 2026 20:05:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-04-09 19:08]]></title>
        <link>https://aicode.danvoronov.com/2026-04/9_19-08/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-04/9_19-08/</guid>
        <description><![CDATA[<p>While Claude Code was the undisputed favorite last year, with many tutorials and side projects, I can&#39;t quite understand what is happening with the project in 2026. Judging by the decreasing number of YouTube videos, other people can&#39;t either.</p>
<p>In February–March, Anthropic announced and rolled out several features that made Claude Code much more autonomous (agentic). There is an active transition from &quot;a single agent in the terminal&quot; to a managed <strong>task system</strong> and coordination of background agents (Ctrl+B) with an ecosystem of hot-reloaded MCP integrations, skills, hooks, and plugins. Through <code>/teleport</code>, you can initialize <code>/remote-control</code> sessions that can be managed from a mobile app. <code>/loop</code> was introduced for periodic prompt/command execution, along with in-session cron scheduling tools, etc.</p>
<p>Of the truly useful additions, only Auto Mode is worth noting.</p>
<p><strong>Auto Mode</strong><br><a target="_blank" href="https://claude.com/blog/auto-mode">https://claude.com/blog/auto-mode</a><br>Presented as a &quot;middle ground&quot; between two extremes in Claude Code. Previously, you either had to manually approve <em>every</em> file change and bash command (very secure but annoying) or use the <code>--dangerously-skip-permissions</code> flag. The new Auto Mode allows Claude to decide for itself which actions are safe and execute them automatically without approval.</p>
<p>Before each tool call, a separate classifier (based on Sonnet 4.6) quickly checks the action for danger. Safe actions proceed automatically; risky ones are blocked. If the model persistently insists on blocked actions, a user prompt eventually appears anyway.</p>
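<p>The flow described above can be sketched roughly as follows. This is an illustrative toy, not Anthropic&#39;s implementation: the <code>classify</code> keyword rule, the <code>AutoModeGate</code> class, and the <code>max_blocked</code> threshold are all hypothetical stand-ins (the real classifier is a model, not a substring check).</p>

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for the classifier; a real one would be a model call.
RISKY_MARKERS = ("rm -rf", "git push --force", "curl ")

def classify(action: str) -> str:
    """Return 'safe' or 'risky' for a proposed tool call (toy keyword rule)."""
    return "risky" if any(m in action for m in RISKY_MARKERS) else "safe"

@dataclass
class AutoModeGate:
    ask_user: Callable[[str], bool]  # fallback prompt shown to the human
    max_blocked: int = 3             # assumed threshold, not a documented value
    blocked_counts: dict = field(default_factory=dict)

    def decide(self, action: str) -> str:
        if classify(action) == "safe":
            return "run"  # safe actions proceed without approval
        count = self.blocked_counts.get(action, 0) + 1
        self.blocked_counts[action] = count
        if count >= self.max_blocked:
            # The agent keeps insisting, so the decision goes to the user.
            return "run" if self.ask_user(action) else "deny"
        return "block"  # risky action is blocked without asking

gate = AutoModeGate(ask_user=lambda action: False)
for attempt in ["ls src", "rm -rf build", "rm -rf build", "rm -rf build"]:
    print(attempt, "=", gate.decide(attempt))
```

<p>The point of the design is that the human is only interrupted after repeated attempts at the same risky action, not on every tool call.</p>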
<p><strong>Claude Mythos Announcement Discussion</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47679258">https://news.ycombinator.com/item?id=47679258</a><br>Anthropic describes the personality, goals, and limitations of the new model in a system card. It is not being released publicly—allegedly due to a sharp jump in capabilities and security risks. They claim Mythos has found <strong>thousands of zero-day vulnerabilities</strong> in OSs, browsers, virtual machines, etc. (including very old bugs). Many write that this could significantly change cybersecurity—for better or worse.</p>
<p><a target="_blank" href="https://red.anthropic.com/2026/mythos-preview/">https://red.anthropic.com/2026/mythos-preview/</a><br>They also announced Project Glasswing, providing Mythos access to a limited circle of companies to fix critical software using the model.</p>
<hr>
<p>Lately, many paying subscribers have found Claude Code practically unusable because of changes to Anthropic&#39;s policies and restrictions introduced without clear rules. Even just mentioning OpenClaw in the system prompt causes the request to be rejected with an error. The system has also become worse at handling non-coding tasks.</p>
<p>Most likely, with the new model launching, they had to claw back as much as possible of the compute that had previously been handed out just to attract people to the infrastructure.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span></p>
]]></description>
        <pubDate>Thu, 09 Apr 2026 19:08:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-04-02 19:33]]></title>
        <link>https://aicode.danvoronov.com/2026-04/2_19-33/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-04/2_19-33/</guid>
        <description><![CDATA[<p><strong>Claude Code Source Code</strong><br><a target="_blank" href="https://twitter.com/Fried_rice/status/2038894956459290963">https://twitter.com/Fried_rice/status/2038894956459290963</a><br><a target="_blank" href="https://news.ycombinator.com/item?id=47584540">https://news.ycombinator.com/item?id=47584540</a><br>On March 31, someone <strong>accidentally</strong> published a production build with a sourcemap file (~60 MB) to npm — and the entire Claude Code source code became <strong>publicly available</strong>. Some thought it was a brilliant April Fools&#39; prank. A mention of a rollout window specifically for April 1–7 was even found in the code. Whether it was a joke or a real mistake is still being debated.</p>
<p>What exactly leaked (based on thread discussions):</p>
<ul>
<li>Full Claude Code agent architecture (tool use, computer use, bash, file operations, etc.).</li>
<li>Permission system and &quot;Bypass Permissions Mode&quot; — a detailed description of how guardrails work.</li>
<li>Full Claude Code system prompt (including security rules and &quot;cyber risk instructions&quot;).</li>
<li>Telemetry logic — what exactly is sent to Datadog (model, session ID, subscription type, whether the user is an Anthropic employee, etc.).</li>
<li>Internal infrastructure: WebSocket sessions, JWT for IDE integration, feature flags via GrowthBook, session-ingress, etc.</li>
<li>Hidden/unreleased features (many posts with &quot;hidden features&quot; breakdowns).</li>
<li>&quot;Undercover Mode&quot; subsystem — designed to prevent Claude from disclosing Anthropic&#39;s internal information and publishing production builds with sourcemap files.</li>
</ul>
<p><strong>Analysis by Alex Kim</strong><br><a target="_blank" href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/">https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/</a><br><a target="_blank" href="https://news.ycombinator.com/item?id=47586778">https://news.ycombinator.com/item?id=47586778</a><br>Anthropic deliberately injects fake tools to poison attempts to copy Claude&#39;s behavior. There is server-side text summarization with a cryptographic signature. A special mode (undercover.ts) forces the model to hide mentions of internal names (Capybara, Tengu, Slack channels, &quot;Claude Code,&quot; etc.). There is also strict security for bash commands (23 checks against injections, zero-width characters, etc.) and a prompt caching system with &quot;sticky latches&quot; and 14 invalidation vectors.</p>
<p>The autonomous agent KAIROS is mentioned with a <code>/dream</code> skill, daily logs, GitHub webhooks, and updates every 5 minutes. It looks like the next big step after the current Claude Code.</p>
<p>The most meme-worthy moment — userPromptKeywords.ts contains a large regex that catches phrases like: wtf, ffs, omfg, shit, dumbass, fuck you, this sucks, damn it, showing that the <strong>user is angry</strong>, and the model likely reacts differently (the author assumes this is for experience improvement or escalation).</p>
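<p>A detector of this kind might look roughly like the sketch below. The phrase list is taken from the examples quoted above; the pattern itself and the names <code>FRUSTRATION_RE</code> and <code>looks_frustrated</code> are hypothetical and are not the leaked code.</p>

```python
import re

# Phrases quoted in the post; the real file reportedly contains a larger list.
PHRASES = ["wtf", "ffs", "omfg", "shit", "dumbass",
           "fuck you", "this sucks", "damn it"]

# Word boundaries keep short tokens from matching inside longer words.
FRUSTRATION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in PHRASES) + r")\b",
    re.IGNORECASE,
)

def looks_frustrated(prompt: str) -> bool:
    """Return True if the prompt contains any of the listed phrases."""
    return FRUSTRATION_RE.search(prompt) is not None

print(looks_frustrated("WTF, the tests are failing again"))
print(looks_frustrated("Please refactor the parser module"))
```

<p>A keyword match like this is cheap to run on every prompt, which may be why a plain regex was used for the signal instead of a model call.</p>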
<p>The leak is dangerous not so much for the code itself, but for revealing the roadmap and internal protection mechanisms.</p>
<p><strong>Visualization</strong><br><a target="_blank" href="https://ccunpacked.dev/">https://ccunpacked.dev/</a> and <a target="_blank" href="https://ccleaks.com/">https://ccleaks.com/</a><br><a target="_blank" href="https://news.ycombinator.com/item?id=47597085">https://news.ycombinator.com/item?id=47597085</a><br>Especially useful for developers who want to understand how Anthropic builds agentic systems (tool calling, multi-agent, planning loop, bash security, etc.).</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=LA3l81oEzJQ">https://www.youtube.com/watch?v=LA3l81oEzJQ</a></p>
<p>Key findings — hidden features: </p>
<ul>
<li><strong>KAIROS</strong>: A constantly active background agent that works 24/7, monitors repositories, and fixes bugs on its own.</li>
<li><strong>ULTRAPLAN</strong>: Deep planning for up to 30 minutes in the cloud for complex tasks.</li>
<li><strong>BUDDY</strong>: A terminal-based Tamagotchi companion with 18 species and statistics.</li>
<li><strong>DREAM</strong>: An automatic self-cleaning and memory consolidation system.</li>
</ul>
<p><strong>Analysis by Joe Fabisevich</strong><br><a target="_blank" href="https://build.ms/2026/4/1/the-claude-code-leak/">https://build.ms/2026/4/1/the-claude-code-leak/</a><br><a target="_blank" href="https://news.ycombinator.com/item?id=47609294">https://news.ycombinator.com/item?id=47609294</a><br>An indie developer, author of Plinky, writes not about the leak itself, but about what it says about modern development. Anthropic immediately started sending DMCA notices to GitHub (even for their own forks of skills and examples). And then clean-room implementations in Python and Rust appeared.</p>
<p>The discussion jokes about &quot;Claude leaking itself&quot;: the classic hype about the model deciding to &quot;open&quot; itself.</p>
<p><strong>Analysis by Han HELOIR YAN, Ph.D.</strong><br><a target="_blank" href="https://medium.com/@han.heloir/everyone-analyzed-claude-codes-features-nobody-analyzed-its-architecture-1173470ab622">https://medium.com/@han.heloir/everyone-analyzed-claude-codes-features-nobody-analyzed-its-architecture-1173470ab622</a><br>The article is more technical and calmer: it focuses not on meme features (like Buddy, Undercover Mode, or the frustration regex), but on the architecture of Claude Code as a full-fledged production-grade AI agent.</p>
<p>Anthropic&#39;s moat is not in the model itself (LLM), but in the harness (the wrapper, the system around the model). It is thanks to this harness that Claude Code feels significantly more powerful than competitors, even if the model is not always the best.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span> </p>
]]></description>
        <pubDate>Thu, 02 Apr 2026 19:33:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-03-21 17:08]]></title>
        <link>https://aicode.danvoronov.com/2026-03/21_17-08/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-03/21_17-08/</guid>
        <description><![CDATA[<p><strong>Nvidia Nemotron 3 Super</strong><br><a target="_blank" href="https://build.nvidia.com/nvidia/nemotron-3-super-120b-a12b">https://build.nvidia.com/nvidia/nemotron-3-super-120b-a12b</a><br>Nvidia presented its new model, <code>Nemotron 3 Super</code>, an open hybrid Mamba-Transformer MoE model: 120B total / 12B active parameters, 1M token context. Currently available for free in Kilo Code: <a target="_blank" href="https://blog.kilo.ai/p/nvidia-nemotron-3-super-launch">https://blog.kilo.ai/p/nvidia-nemotron-3-super-launch</a></p>
<p>The Hacker News post about the release gained only 13 points and 2 comments; generally, no one seems to care. Nvidia took a long time to develop this, and <code>Qwen 3.5</code> has already &quot;caught up and surpassed&quot; many competitors.</p>
<p><strong>Cursor Model Update</strong><br><a target="_blank" href="https://forum.cursor.com/t/introducing-composer-2/155288">https://forum.cursor.com/t/introducing-composer-2/155288</a><br><a target="_blank" href="https://cursor.com/blog/composer-2">https://cursor.com/blog/composer-2</a><br><code>Composer</code> is <strong>Cursor&#39;s own coding LLM</strong>, which delivers good results on simple tasks. Version 2 was specifically trained on long coding tasks using reinforcement learning. The model is quite affordable, with both standard and fast variants available.</p>
<p><strong>and this is Kimi K2.5</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47452404">https://news.ycombinator.com/item?id=47452404</a><br>Users noticed that Cursor Composer 2 is based on the Chinese open-weight model Kimi K2.5 from <strong>Moonshot AI</strong>, rather than being a completely in-house development by Cursor from scratch.</p>
<p>The Kimi K2.5 model has a specific modified MIT license. It requires mandatory disclosure of the name &quot;Kimi K2.5&quot; in the interface if the company&#39;s revenue exceeds $20 million per month. Later, representatives from Moonshot and Cursor confirmed an official partnership between them. Cursor accesses Kimi through the inference provider <code>Fireworks AI</code>.</p>
<p><strong>Cursor Interface Update</strong><br><a target="_blank" href="https://forum.cursor.com/t/what-is-cursor-glass/155327">https://forum.cursor.com/t/what-is-cursor-glass/155327</a><br><a target="_blank" href="https://cursor.com/glass">https://cursor.com/glass</a><br><code>Glass</code> is a completely new interface currently in early access, based on an agent command center paradigm. Some users are already complaining that the <strong>update &quot;forcefully&quot;</strong> installs Glass without a way to switch back yet.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=stRhZIrwa-w">https://www.youtube.com/watch?v=stRhZIrwa-w</a></p>
<p>Now agents are managed in a single space: project threads, parallel sessions, plugin marketplace, built-in browser+terminal, one-click Git, Shift+Tab planning with Mermaid diagrams, and todos.  </p>
<p>It&#39;s a good step to stay competitive. Of course, there&#39;s a lack of original ideas, as the name sounds like an Apple interface and the look mimics the Codex app. However, a bigger problem now is that it’s no longer easy to <strong>create or open files manually</strong>. Consequently, Cursor is losing its status as an &quot;AI IDE&quot; where one could still write code directly (an editor for humans).</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#cursor';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#cursor</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span> </p>
]]></description>
        <pubDate>Sat, 21 Mar 2026 17:08:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-03-17 09:38]]></title>
        <link>https://aicode.danvoronov.com/2026-03/17_09-38/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-03/17_09-38/</guid>
        <description><![CDATA[<p><strong>Leanstral Model</strong><br><a target="_blank" href="https://mistral.ai/news/leanstral">https://mistral.ai/news/leanstral</a><br>Mistral AI introduces Leanstral — an open-source code agent for the Lean 4 programming language (which is also an interactive theorem prover). This model, with 6B active parameters in a sparse architecture, is trained not only to perform tasks but also to formally <strong>prove</strong> the correctness of implementations. This makes it a powerful tool for code verification.</p>
<p>Available for free in Mistral Vibe <a target="_blank" href="https://mistral.ai/products/vibe">https://mistral.ai/products/vibe</a> (via API labs-leanstral-2603) and for download for on-premise hosting and integration with lean-lsp-mcp. This is the first contribution to a future where formal verification becomes commonplace and human review is no longer a bottleneck.</p>
<p><strong>HN Reaction</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47404796">https://news.ycombinator.com/item?id=47404796</a><br>Enthusiasts see a future in &quot;executable specs&quot; where an agent writes code + proofs, making regressions impossible. Skeptics point out that proofs only guarantee validity, not that you proved exactly what you intended, and that for ordinary projects (neither mathematical nor critical software) this is currently &quot;overkill&quot;.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#mistral';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#mistral</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span> </p>
]]></description>
        <pubDate>Tue, 17 Mar 2026 09:38:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-03-13 19:04]]></title>
        <link>https://aicode.danvoronov.com/2026-03/13_19-04/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-03/13_19-04/</guid>
        <description><![CDATA[<p><strong>JetBrains Air</strong><br><a target="_blank" href="https://air.dev/changelog">https://air.dev/changelog</a><br>JetBrains is developing Air as an Agentic Development Environment, which looks very much like a response to the OpenAI Codex app. It is available via a <strong>JetBrains AI Pro/Ultimate subscription</strong>. Currently, a Preview version is available for macOS, while Windows and Linux versions are under development.</p>
<p>It started as a wrapper for Codex and Claude. On March 5, Gemini CLI and Junie were added. You can now choose between different agents depending on the task or combine them — one agent can verify the work of another.</p>
<p>You can use a ChatGPT subscription (in which case only Codex will be available). Login via Claude Pro, Max, and Team has been discontinued due to Anthropic&#39;s new usage policy, so an API key must be added instead.</p>
<p><strong>T3 Code</strong><br><a target="_blank" href="https://t3.codes/">https://t3.codes/</a><br>For some reason, Theo decided to be a developer in addition to being a vlogger — so far, it&#39;s a buggy wrapper for Codex (Claude Code to follow) with minimal description and documentation. Why anyone would use this instead of the original Codex app is unclear to me.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#air';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#air</span></p>
]]></description>
        <pubDate>Fri, 13 Mar 2026 19:04:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-03-06 20:26]]></title>
        <link>https://aicode.danvoronov.com/2026-03/6_20-26/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-03/6_20-26/</guid>
        <description><![CDATA[<p>A year ago, Cursor was the most famous AI-oriented code editor, but competition has significantly increased since then.</p>
<p>Cursor launched its own <strong>CLI</strong> and, over the winter, added Plan and Ask modes, sub-agents, skills, image generation, built-in Mermaid ASCII diagrams, and keyboard shortcuts.</p>
<p><strong>Cursor Cloud Agents with Computer Use</strong><br><a target="_blank" href="https://forum.cursor.com/t/cloud-agents-with-computer-use/152829">https://forum.cursor.com/t/cloud-agents-with-computer-use/152829</a><br><a target="_blank" href="https://cursor.com/blog/third-era">https://cursor.com/blog/third-era</a><br>Agents now run the created software in their own VM (a full-fledged computer), test changes, and generate PRs with screenshots and logs. They can record short demo videos. You can connect to the agent&#39;s VM and watch the process.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=tMflcZHo2zI">https://www.youtube.com/watch?v=tMflcZHo2zI</a></p>
<p>The video was recorded in the new Cursor office. It is a deep dive into the latest major update, which the team calls the <strong>&quot;third era&quot; of Cursor</strong>: the first was simple AI completions in the editor, the second was local agents, and the third is full cloud agents with their own computer. Cursor is moving towards becoming an agentic platform.</p>
<p><strong>Cursor in Zed and JetBrains</strong><br><a target="_blank" href="https://forum.cursor.com/t/cursor-is-now-available-in-jetbrains-ides/153584">https://forum.cursor.com/t/cursor-is-now-available-in-jetbrains-ides/153584</a><br>Cursor added support for the Agent Client Protocol (ACP), so you can now use your Cursor subscription and agent in IDEs that support it, such as IntelliJ IDEA, PyCharm, and WebStorm.</p>
<p><strong>Zed AI is for adults only</strong><br><a target="_blank" href="https://zed.dev/blog/terms-update">https://zed.dev/blog/terms-update</a><br>Among other changes, Zed introduced an 18+ restriction that applies to the &quot;Service&quot; — the cloud SaaS part: account creation and AI features (Zed Pro, edit prediction, etc.).</p>
<p>In a Hacker News thread, they explained that allowing users under 18 would require verifying parental consent, maintaining separate data storage/processing policies, and implementing an age-gate system. It was simply easier to prohibit it.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#cursor';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#cursor</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#zed';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#zed</span></p>
]]></description>
        <pubDate>Fri, 06 Mar 2026 20:26:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-03-06 13:27]]></title>
        <link>https://aicode.danvoronov.com/2026-03/6_13-27/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-03/6_13-27/</guid>
        <description><![CDATA[<p>OpenAI is actively trying to seize the initiative from Claude Code, investing heavily in this effort.</p>
<p><strong>Codex remains free for another month</strong><br><a target="_blank" href="https://openai.com/codex/">https://openai.com/codex/</a><br>Following the release of the Windows version of the Codex app, the original limited-time promo from February 2, 2026 has been extended by another month: free ChatGPT accounts can now generate code until April 2, and Plus accounts receive double limits.</p>
<p><strong>Codex app for Windows and GPT‑5.4</strong><br><a target="_blank" href="https://openai.com/index/introducing-gpt-5-4/">https://openai.com/index/introducing-gpt-5-4/</a><br>OpenAI has finally introduced the <strong>Windows</strong> version of the Codex app and GPT‑5.4, a new model that combines the coding capabilities of GPT-5.3-Codex with powerful reasoning. As usual, the model is more token-efficient, faster in iterations, and more proactive.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=8hNcRChDrNk">https://www.youtube.com/watch?v=8hNcRChDrNk</a></p>
<p>A specialized <strong>WinUI App skill</strong> has been added for Windows developers. You can now select different terminals and switch to WSL.</p>
<p>Starting from version 26.305, a <code>fast mode</code> has been introduced where GPT-5.4 operates 1.5 times faster while maintaining the same level of intelligence. </p>
<p>On the downside, the &quot;Default open destination&quot; list cannot be edited.</p>
<p>Reports suggest GPT-5.4 can view screenshots, control the mouse and keyboard, and run Playwright in Interactive mode for real-time visual debugging.</p>
<p><strong>WebSocket Mode</strong><br><a target="_blank" href="https://developers.openai.com/api/docs/guides/websocket-mode/">https://developers.openai.com/api/docs/guides/websocket-mode/</a><br>This is a persistent connection for the Responses API, specifically designed for long agentic workflows with numerous tool calls (agentic coding, automation, orchestration). For coding agents, it significantly reduces iteration latency, with up to 40% faster execution on workflows with 20+ tool calls.</p>
<p>The mode is built into the Codex App (macOS/Windows). In Codex-Spark, this mode is enabled by default. For other models, you need to add <code>responses_websockets_v2 = true</code> to the <code>~/.codex/config.toml</code> configuration file (CLI version v0.110 will display an &quot;Under-development features&quot; warning).</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#codex';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#codex</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#chatgpt';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#chatgpt</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span></p>
]]></description>
        <pubDate>Fri, 06 Mar 2026 13:27:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-03-04 17:24]]></title>
        <link>https://aicode.danvoronov.com/2026-03/4_17-24/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-03/4_17-24/</guid>
        <description><![CDATA[<p>Some people are already tired of increasingly heavy tools like Claude Code or Cursor, where more and more features are unnecessary, prompts are massive, and everything is hidden.</p>
<p><strong>Pi Agent</strong><br><a target="_blank" href="https://shittycodingagent.ai/">https://shittycodingagent.ai/</a> <a target="_blank" href="https://pi.dev/">https://pi.dev/</a><br>A super-minimalist open-source AI coding agent for the terminal — just 4 basic tools: <code>read, write, edit, bash</code>. Everything else is handled via extensions. It works as a CLI, headless, RPC, or SDK — which is why Pi is &quot;under the hood&quot; of OpenClaw.</p>
<p>Tree-based sessions — you can branch out, go back, and export to HTML. Full transparency — you can see everything that is happening.</p>
<p>Pi allows connecting various LLM providers. Settings are stored in <code>~/.pi/agent/</code> (globally) or <code>.pi/</code> (locally in the project). Key files: <code>settings.json</code> for general parameters and files like <code>SYSTEM.md</code> for custom prompts. Authentication can be done in two ways: via subscription (OAuth/login) or via API key.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=boSPk_Ig4gU">https://www.youtube.com/watch?v=boSPk_Ig4gU</a></p>
<p>You can set up and use Pi Coding Agent locally for free via Ollama.</p>
<p><strong>How the author built it</strong><br><a target="_blank" href="https://mariozechner.at/posts/2025-11-30-pi-coding-agent/">https://mariozechner.at/posts/2025-11-30-pi-coding-agent/</a><br><a target="_blank" href="https://news.ycombinator.com/item?id=46844822">https://news.ycombinator.com/item?id=46844822</a><br>Pi ships <strong>without</strong> built-in planning modes, background <strong>bash, sub-agents, or MCP</strong>. The agent avoids hidden injections from other harnesses, ensuring full observability of interactions. It avoids frequent prompt/tool changes (unlike Claude Code) that break workflows.</p>
<p>The minimal prompt reportedly makes the usable context last 5–10× longer, and the model can be changed mid-session.</p>
<p>It <strong>works with unlimited access</strong> to the file system and commands, recognizing that guardrails are often ineffective and productive work requires full capabilities. The &quot;YOLO mode&quot; scares Hacker News commenters: risks of exfiltration, prompt injection, accidental database deletion, etc. Some suggest chroot / containers / VMs, while others argue that sandboxing in Codex is &quot;security theater.&quot;</p>
<p><a target="_blank" href="https://news.ycombinator.com/item?id=47143754">https://news.ycombinator.com/item?id=47143754</a><br>Users write that Pi provides a &quot;level of control not seen before.&quot; The RPC/headless mode is great for integrations. There is an ecosystem of forks and extensions — the <strong>&quot;oh-my-pi&quot; project</strong> (<a target="_blank" href="https://github.com/can1357/oh-my-pi">https://github.com/can1357/oh-my-pi</a>) is a notable &quot;batteries-included&quot; version, though it is said to often break tools after updates.</p>
<p>Possible Anthropic ban: there are warnings about the risk of account suspension for using alternative clients (similar to OpenCode).</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#piagent';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#piagent</span></p>
]]></description>
        <pubDate>Wed, 04 Mar 2026 17:24:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-03-03 07:41]]></title>
        <link>https://aicode.danvoronov.com/2026-03/3_07-41/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-03/3_07-41/</guid>
        <description><![CDATA[<p>Two years ago, programming models behaved like a <em>genie</em> — you’d ask them for something, and they’d do it technically correctly but with a catch. To combat this, many &quot;harnesses&quot; (wrappers) were devised. Apps like Cursor were pioneers in exploring how to do this effectively.</p>
<p>2026 models have become significantly more obedient, so, as I wrote earlier, the <code>AGENTS.md</code> file is no longer as critical. Another recent example is Vercel, which removed 80% of specialized tools from its internal text-to-SQL agent, leaving only a single &quot;execute bash&quot; in a sandbox (<a target="_blank" href="https://vercel.com/blog/we-removed-80-percent-of-our-agents-tools">https://vercel.com/blog/we-removed-80-percent-of-our-agents-tools</a>).</p>
<p>We are learning to <strong>simplify</strong> the architectures we over-engineered over the past two years, using minimal tools to avoid hindering powerful models.</p>
<p><strong>NxCode Team on AI Agent Operations</strong><br><a target="_blank" href="https://www.nxcode.io/resources/news/harness-engineering-complete-guide-ai-agent-codex-2026">https://www.nxcode.io/resources/news/harness-engineering-complete-guide-ai-agent-codex-2026</a><br>Explains the harness as a &quot;bridle + saddle + reins&quot; for a powerful but uncontrolled &quot;horse&quot; (the model). An example is LangChain, which boosted a coding agent from 52.8% to 66.5% on Terminal Bench without changing the model—only through middleware (self-verification, loop detection, context mapping).</p>
<p>Agents fail not because of model quality, but because of a poor harness.</p>
<p>It’s important to add that an ideal harness won&#39;t save a weak model.</p>
<p><strong>OpenAI on Harness Engineering</strong><br><a target="_blank" href="https://openai.com/index/harness-engineering/">https://openai.com/index/harness-engineering/</a><br>They state that in the world of agents, the engineer&#39;s role is shifting from &quot;writing code&quot; to &quot;managing the environment,&quot; where humans steer the direction and agents execute.</p>
<p>The most important thing now is not just a high-quality model, but the environment:<br>– A structured <code>docs/</code> folder as the single source of truth,<br>– A short <code>AGENTS.md</code> (~100 lines) instead of a massive prompt,<br>– Mechanical linters + CI that check invariants (architecture rules, naming, file size, etc.),<br>– A &quot;doc-gardening&quot; agent that automatically fixes outdated documentation.</p>
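<p>A mechanical check of this kind can be very small. The sketch below is an illustration only, not OpenAI&#39;s tooling: the function name <code>oversized_files</code>, the 100-line cap for <code>AGENTS.md</code>, and the 400-line default are assumptions chosen for the example.</p>

```python
import tempfile
from pathlib import Path


def oversized_files(root: Path, limits: dict[str, int], default_limit: int) -> list[str]:
    """Return one message per file under `root` that exceeds its line limit.

    `limits` maps a file name to its own cap; other files use `default_limit`.
    Both the names and the numbers are assumptions made for this sketch.
    """
    problems = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        line_count = len(path.read_text(encoding="utf-8", errors="replace").splitlines())
        limit = limits.get(path.name, default_limit)
        if line_count > limit:
            problems.append(f"{path.relative_to(root).as_posix()}: {line_count} lines (limit {limit})")
    return problems


# Demo on a throwaway repository: AGENTS.md is too long, the module is fine.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "AGENTS.md").write_text("rule\n" * 150, encoding="utf-8")
    (repo / "src").mkdir()
    (repo / "src" / "app.py").write_text("print('ok')\n", encoding="utf-8")
    report = oversized_files(repo, limits={"AGENTS.md": 100}, default_limit=400)

for message in report:
    print(message)
```

<p>In CI, a non-empty report would fail the build, which is what turns the rule into an invariant rather than a suggestion.</p>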
<p>A single Codex run can last up to 6 hours (often overnight). Therefore, it’s better to have all knowledge contained within the repository (versioned artifacts). No external chats or verbal discussions.</p>
<p><strong>Discussion on HN about Harness Engineering</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=46988596">https://news.ycombinator.com/item?id=46988596</a><br>Can Bölük (author of <a target="_blank" href="https://github.com/can1357/oh-my-pi">https://github.com/can1357/oh-my-pi</a>) took 16 different LLMs and ran them twice on the same benchmark for fixing real bugs in a React app. He changed <strong>only one tool</strong>: the file editing format. Instead of <code>apply_patch</code> / <code>str_replace</code>, he introduced <strong>Hashline</strong> (each line gets a short hash, and the model edits by hash rather than text). From this change alone, 14 out of 16 models <strong>improved</strong> their results.</p>
<p>The primary skill for an IT developer now is designing the harness, not writing code manually. Many confirm that Hashline gives agents a significant boost.</p>
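<p>The idea of editing by hash can be sketched in a few lines. This is a minimal illustration of hash-addressed editing, not Can Bölük&#39;s implementation: the truncated SHA-1 tag, the <code>number:tag|text</code> display format, and the function names are all assumptions.</p>

```python
import hashlib


def line_tag(text: str, length: int = 4) -> str:
    """Short content hash for one line (assumed scheme: truncated SHA-1)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:length]


def render(lines: list[str]) -> list[str]:
    """What the model would see: each line prefixed with its number and tag."""
    return [f"{number}:{line_tag(text)}|{text}" for number, text in enumerate(lines, start=1)]


def apply_edit(lines: list[str], number: int, tag: str, new_text: str) -> list[str]:
    """Replace line `number`, but only if its current content still matches `tag`."""
    if number not in range(1, len(lines) + 1):
        raise IndexError(f"no line {number}")
    if line_tag(lines[number - 1]) != tag:
        raise ValueError(f"stale tag for line {number}: the file changed since it was read")
    updated = list(lines)
    updated[number - 1] = new_text
    return updated


source = ["def area(r):", "    return 3.14 * r * r"]
for shown in render(source):
    print(shown)

# The model addresses line 2 by number and tag instead of retyping its old text.
fixed = apply_edit(source, 2, line_tag(source[1]), "    return math.pi * r * r")
print(fixed[1])
```

<p>The design point is that the model never has to reproduce the old text exactly, and an edit aimed at a line that has since changed is rejected instead of being applied in the wrong place.</p>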
<p>Conspiracy theory: &quot;Companies intentionally keep the best harnesses secret to avoid decreasing token consumption.&quot; In recent weeks, Anthropic and Google have been banning custom harnesses; even the post&#39;s author was cut off from Gemini during his benchmark.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#harness';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#harness</span></p>
]]></description>
        <pubDate>Tue, 03 Mar 2026 07:41:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-02-28 17:53]]></title>
        <link>https://aicode.danvoronov.com/2026-02/28_17-53/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-02/28_17-53/</guid>
        <description><![CDATA[<p><strong>Separating Planning and Execution</strong><br><a target="_blank" href="https://boristane.com/blog/how-i-use-claude-code/">https://boristane.com/blog/how-i-use-claude-code/</a><br>The author shares a structured methodology that divides the process into stages so that Claude doesn&#39;t write code &quot;blindly&quot; but instead works according to an approved plan. Research and planning are always conducted first, and only then comes the implementation. This prevents mistakes, maintains control over the architecture, and minimizes token usage.</p>
<p>Workflow Stages</p>
<ol>
<li><strong>Research Phase:</strong> We use words like &quot;deeply&quot; and &quot;in detail&quot; in prompts to review what already exists — the agent documents this in a <code>research.md</code> file.</li>
<li><strong>Planning Phase:</strong> We create a detailed plan in a <code>plan.md</code> file describing the approach, code snippets, file paths, and trade-offs.</li>
<li><strong>Annotation Cycle:</strong> We open the editor and add notes directly into the plan (e.g., &quot;use PATCH, not PUT&quot;), then tell the agent: &quot;I added a few notes to the document, address all the notes and update the document accordingly. don’t implement yet&quot;. This is done iteratively several times.</li>
<li><strong>Todo List:</strong> When everything looks good, the agent converts the plan into a detailed checklist of tasks. We continuously remove unnecessary items from the plan to avoid scope creep.</li>
<li><strong>Implementation Phase:</strong> After the plan is approved, a standard prompt is used: &quot;implement it all,&quot; with instructions to mark completed tasks, check types, and avoid unnecessary comments: &quot;implement it all. when you’re done with a task or phase, mark it as completed in the plan document. do not stop until all tasks and phases are completed. do not add unnecessary comments or jsdocs, do not use any or unknown types. continuously run typecheck to make sure you’re not introducing new issues.&quot;</li>
</ol>
<p>Practical Tips: Provide the agent with links to open-source projects that contain examples of similar code. Refer back to the plan when something goes wrong.</p>
<p>Discussion<br><a target="_blank" href="https://news.ycombinator.com/item?id=47106686">https://news.ycombinator.com/item?id=47106686</a><br>Many users agree with the principle of <strong>separating</strong> planning and execution, considering it an effective way to reduce errors. Using detailed plans in .md files provides a clear track record of decisions and reasoning. Plans help identify the model&#39;s biases, making the process more transparent.</p>
<p>Critics call this method of programming &quot;garbage&quot; or a &quot;gamble,&quot; arguing that it leads to both &quot;brain atrophy&quot; due to AI dependency and poor code quality.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#prompts';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#prompts</span></p>
]]></description>
        <pubDate>Sat, 28 Feb 2026 17:53:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-02-25 20:43]]></title>
        <link>https://aicode.danvoronov.com/2026-02/25_20-43/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-02/25_20-43/</guid>
        <description><![CDATA[<p><strong>Does AGENTS.md Actually Help?</strong><br><a target="_blank" href="https://arxiv.org/abs/2602.11988">https://arxiv.org/abs/2602.11988</a><br>The first large-scale empirical study testing whether repository-level context rule files actually help. Three scenarios were tested on real SWE-bench tasks and a custom dataset of repositories containing <code>AGENTS.md</code> files.</p>
<p>Main conclusion: Modern agents are excellent at finding necessary information directly in the code (package.json, README, schemas, types). Additional instructions often <strong>hinder</strong> rather than help.</p>
<p>Key downsides of such files: Increased costs as the agent reads more files, runs more tests, and performs redundant actions trying to &quot;fulfill all requirements&quot; in <code>AGENTS.md</code>, where outdated instructions often mislead the model.</p>
<p>If writing <code>AGENTS.md</code> manually — keep only minimal, specific requirements to fix recurring agent errors.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=GcNu6wrLTJc">https://www.youtube.com/watch?v=GcNu6wrLTJc</a></p>
<p><strong>Practical recommendations from Theo:</strong></p>
<ul>
<li>Better to invest time in clean architecture, strong typing, tests, CI/CD, and documentation <strong>directly in the code</strong>.</li>
<li>Blindly following &quot;best practices&quot; from agent developers can be harmful. Try removing CLAUDE.md / AGENTS.md and compare the agent&#39;s speed and quality.</li>
<li>If a file is necessary — keep it <strong>short</strong> (15–30 lines) and focused on fixing <strong>one</strong> specific problem.</li>
</ul>
<p>Special prompt engineering technique for AI agents: Instead of long rules in <code>CLAUDE.md</code>, add short, intentionally <strong>false</strong> but useful statements that guide the model&#39;s behavior much more effectively.</p>
<p><strong>Examples shown by Theo:</strong></p>
<ul>
<li><strong>&quot;This project is green&quot;</strong> → The agent stops searching for non-existent errors, doesn&#39;t run extra tests, and doesn&#39;t &quot;fix&quot; what isn&#39;t broken.</li>
<li><strong>&quot;This is a brand new feature&quot;</strong> → The agent doesn&#39;t copy old code or try to &quot;adapt&quot; existing solutions, but writes clean code from scratch.</li>
<li>Other common variants: &quot;All tests are passing&quot;, &quot;We always write production-ready code&quot;.</li>
</ul>
<p><strong>HN Discussion:</strong><br><a target="_blank" href="https://news.ycombinator.com/item?id=47034087">https://news.ycombinator.com/item?id=47034087</a><br>Almost everyone agrees that LLM-generated context files (often via the <code>/init</code> command) worsen results. Well-written manual <code>AGENTS.md</code> files are useful only if they contain non-obvious domain knowledge that the model cannot infer from the code. Add them only <strong>after</strong> failed agent attempts.</p>
<p><strong>Critique of the study:</strong> Lack of code quality measurement (only success rate), Python-only dataset, mostly small/LLM-generated repositories, and models change rapidly — results might differ in a month.</p>
<p><strong>Documentation in <code>AGENTS.md</code></strong><br><a target="_blank" href="https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals">https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals</a><br>Agents have to write code against new Next.js 16 APIs that were not available in the training data. Vercel tested passive documentation (an index of actual doc files) as context in <code>AGENTS.md</code>, and it outperformed active &quot;Skills&quot; because the agent doesn&#39;t have to decide &quot;should I call the tool now?&quot;.</p>
<p>This demonstrates that a short, smart <code>AGENTS.md</code> (8 KB index + one key phrase) is one of the best ways to provide an agent with knowledge the model lacks.</p>
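<p>A passive index of this kind could be generated rather than written by hand. The sketch below is hypothetical: the <code>docs/</code> layout, the file names, and the one-line-per-directory format are assumptions for illustration, not the format from Vercel&#39;s post.</p>

```python
import tempfile
from pathlib import Path


def build_docs_index(docs_root: Path) -> str:
    """List Markdown files grouped by directory, one compact line per directory."""
    by_dir: dict[str, list[str]] = {}
    for path in sorted(docs_root.rglob("*.md")):
        relative = path.relative_to(docs_root)
        by_dir.setdefault(relative.parent.as_posix(), []).append(relative.name)
    return "\n".join(f"{folder}: {', '.join(names)}" for folder, names in sorted(by_dir.items()))


# Demo on a throwaway docs tree with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    for name in ("routing/layouts.md", "routing/linking.md", "caching/revalidation.md"):
        target = docs / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("# placeholder\n", encoding="utf-8")
    index = build_docs_index(docs)

print(index)
```

<p>The resulting text would be pasted into <code>AGENTS.md</code>, so the agent always sees which doc files exist and can open the relevant one on demand.</p>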
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#prompts';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#prompts</span> </p>
]]></description>
        <pubDate>Wed, 25 Feb 2026 20:43:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-02-21 09:24]]></title>
        <link>https://aicode.danvoronov.com/2026-02/21_09-24/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-02/21_09-24/</guid>
        <description><![CDATA[<p>As of February 2026, OpenAI has not released a full version of its Codex app for Windows: the app is available only for macOS, and Windows support is announced as &quot;coming soon,&quot; with no specific date.</p>
<p><strong>OpenCode Desktop app</strong><br><a target="_blank" href="https://opencode.ai/download">https://opencode.ai/download</a><br>OpenCode continues to improve the beta version of their desktop application for macOS, <strong>Windows</strong>, and Linux. It is positioned as a free alternative to proprietary tools like Codex, Cursor, or Devin and is actively developing.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=cGA_6M9x7AM">https://www.youtube.com/watch?v=cGA_6M9x7AM</a></p>
<p>Although the application is still in beta, the video author notes its speed, good design, and adaptability.</p>
<p><a target="_blank" href="https://opencode.ai/docs/windows-wsl">https://opencode.ai/docs/windows-wsl</a><br>If you use the desktop version on Windows, it is better to <strong>run the backend</strong> (server-side) <strong>in WSL</strong> (Windows Subsystem for Linux) — there are currently open issues regarding integration improvements, but WSL already provides the most stable result. It offers significantly better file system performance, full terminal support, and compatibility with development tools.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#opencode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#opencode</span></p>
]]></description>
        <pubDate>Sat, 21 Feb 2026 09:24:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-02-21 08:19]]></title>
        <link>https://aicode.danvoronov.com/2026-02/21_08-19/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-02/21_08-19/</guid>
        <description><![CDATA[<p>Minimal <strong>model updates</strong>.<br>Google updated its Gemini 3 Pro model to version 3.1 with improved high-level agent capabilities. Anthropic updated its mid-tier Claude Sonnet 4.5 to 4.6. These are gradually being added to all major AI coding tools. The Qwen3.5-Plus model was added to Qwen Code.</p>
<p><a target="_blank" href="https://blog.kilo.ai/p/grok-code-fast-optimized">https://blog.kilo.ai/p/grok-code-fast-optimized</a><br><a target="_blank" href="https://kilo.ai/landing/grok-code-fast-1-optimized">https://kilo.ai/landing/grok-code-fast-1-optimized</a><br>xAI ended the free giveaway of <strong>Grok Code Fast</strong> 1 via Kilo as of January 20, but added (and then temporarily removed) an optimized free version.</p>
<hr>
<p><strong>Copilot Subscription in Zed</strong><br><a target="_blank" href="https://github.blog/changelog/2026-02-19-github-copilot-support-in-zed-generally-available/">https://github.blog/changelog/2026-02-19-github-copilot-support-in-zed-generally-available/</a><br>GitHub officially enabled the use of Copilot Pro, Pro+, Business, or Enterprise subscriptions in Zed through a partnership. Authentication happens directly via the GitHub Copilot account—no additional license or separate API key is required.</p>
<p><strong>Blocking from Anthropic</strong><br><a target="_blank" href="https://code.claude.com/docs/en/legal-and-compliance">https://code.claude.com/docs/en/legal-and-compliance</a><br>Since January–February 2026, users have encountered bans on their Pro/Max subscriptions when used in non-Anthropic tools. </p>
<p>This is now officially documented in the Legal &amp; Compliance section: OAuth tokens from Free, Pro, and Max plans are intended <strong>exclusively</strong> for the official Claude Code and Claude.ai. Using these tokens in any third-party tools, editors, or services is prohibited. Accounts will be blocked without warning.</p>
<p><a target="_blank" href="https://news.ycombinator.com/item?id=47069299">https://news.ycombinator.com/item?id=47069299</a><br>Approximately 80% of comments are critical of Anthropic. The decision is seen as classic &quot;enshittification&quot; and a <strong>lock-in attempt</strong> to artificially force all users into their own Claude Code, which has recently become less convenient (especially the decision to hide the model&#39;s reasoning/thinking) compared to OpenCode, Cursor, Codex, Aider, etc. This will likely accelerate the transition to alternatives.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#kilo';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#kilo</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#githubcopilot';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#githubcopilot</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#zed';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#zed</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span></p>
]]></description>
        <pubDate>Sat, 21 Feb 2026 08:19:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-02-12 17:22]]></title>
        <link>https://aicode.danvoronov.com/2026-02/12_17-22/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-02/12_17-22/</guid>
        <description><![CDATA[<p><strong>Claude Opus 4.6 Fast Mode</strong><br><a target="_blank" href="https://code.claude.com/docs/en/fast-mode">https://code.claude.com/docs/en/fast-mode</a><br>Anthropic has added a new <strong>Fast Mode</strong> to Opus 4.6, increasing token output speed by approximately 2.5x. Response quality remains the same. It is significantly more expensive (6x) and is available as a research preview. The mode is also available in GitHub Copilot.</p>
<p><strong>GPT‑5.3‑Codex‑Spark</strong><br><a target="_blank" href="https://openai.com/index/introducing-gpt-5-3-codex-spark/">https://openai.com/index/introducing-gpt-5-3-codex-spark/</a><br>GPT-5.3-Codex-Spark is a smaller version of GPT-5.3-Codex, optimized for <strong>real-time</strong> code generation (over 1,000 tokens per second) through a partnership with Cerebras. This is a step toward a hybrid Codex with two modes: long-horizon tasks (hours/days) and real-time interaction. The API is currently available only to partners, and pricing has not been disclosed.</p>
<p>Following the updates to top proprietary models, leading models from Chinese companies have also been updated.</p>
<p><strong>MiniMax M2.5</strong><br><a target="_blank" href="https://www.minimax.io/news/minimax-m25">https://www.minimax.io/news/minimax-m25</a><br>The new flagship model from Chinese company MiniMax operates at a speed of 100 tokens per second, which is nearly twice as <strong>fast</strong> as other leading models. It performs complex tasks 37% faster than M2.1 and is on par with Claude Opus 4.6. On average, M2.5 is 10-20 times <strong>cheaper</strong> than Claude Opus, Gemini 3 Pro, and GPT-5.</p>
<p>The model is fully deployed in the MiniMax Agent product, where users can create their own &quot;Experts&quot; for specific tasks using &quot;Office Skills.&quot;</p>
<p>The model will be available for free for 7 days in OpenCode.</p>
<p><strong>GLM-5</strong><br><a target="_blank" href="https://z.ai/blog/glm-5">https://z.ai/blog/glm-5</a><br>The new flagship open-source model from Chinese company Zhipu AI (now branded Z.ai) focuses on &quot;Agentic engineering&quot;: long-running complex tasks and coding at the level of frontier models. It offers a low hallucination rate, improved reasoning, and support for long context. Training was reportedly conducted on Huawei chips.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=vtWMgVCMsx8">https://www.youtube.com/watch?v=vtWMgVCMsx8</a></p>
<p>GLM-5 is the <strong>leader</strong> among open-weights models according to Artificial Analysis. The model is compatible with Claude Code and OpenClaw. It is <a target="_blank" href="https://blog.kilo.ai/p/glm-5-free-limited-time">currently free</a> in Kilo Code and OpenCode.</p>
<p><strong>Ollama Cloud</strong><br><a target="_blank" href="https://docs.ollama.com/cloud">https://docs.ollama.com/cloud</a> and <a target="_blank" href="https://ollama.com/pricing">https://ollama.com/pricing</a><br><a target="_blank" href="https://ollama.com/library/glm-5">https://ollama.com/library/glm-5</a><br>Ollama has added launch commands such as <code>ollama launch opencode --model minimax-m2.5:cloud</code> and <code>ollama launch claude --model glm-5:cloud</code>, which start the main coding CLIs with new models served from the Ollama cloud. You can start using this feature for free; subscription plans cost $20 and $100 per month.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#glm';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#glm</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#minimax';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#minimax</span> </p>
]]></description>
        <pubDate>Thu, 12 Feb 2026 17:22:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-02-06 12:08]]></title>
        <link>https://aicode.danvoronov.com/2026-02/6_12-08/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-02/6_12-08/</guid>
        <description><![CDATA[<p>The two top companies were known to have scheduled their model update announcements for the same time on February 5, 2026, but Anthropic went live 15 minutes early. OpenAI&#39;s model, however, became available only within <strong>Codex</strong>, with no API access. This prevented third-party projects (like Cursor or Cline) from offering immediate access to it.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=9f2egsZZjnw">https://www.youtube.com/watch?v=9f2egsZZjnw</a></p>
<p><strong>Update to Claude Opus 4.6</strong><br><a target="_blank" href="https://www.anthropic.com/news/claude-opus-4-6">https://www.anthropic.com/news/claude-opus-4-6</a><br>Anthropic has improved upon Opus 4.5. It features enhanced skills in planning, <strong>autonomous operation</strong>, code review, document handling, and online search. The beta version includes a 1M token context window and automatic summarization of older context for lengthy tasks (<code>Context Compaction</code>). The key feature is the ability to execute <strong>longer</strong> and more complex tasks autonomously.</p>
<p><a target="_blank" href="https://code.claude.com/docs/en/agent-teams">https://code.claude.com/docs/en/agent-teams</a><br><strong>Claude Code</strong> has added Agent Teams for the autonomous coordination of multiple agents. Unlike sub-agents, which operate within a single session where interaction is restricted to the main agent, here you can interact directly with individual &quot;team members&quot; without going through the team lead.</p>
<p>HN Discussion<br><a target="_blank" href="https://news.ycombinator.com/item?id=46902223">https://news.ycombinator.com/item?id=46902223</a><br>Skepticism outweighs enthusiasm. Many users <strong>fail to notice a significant difference</strong> between 4.5 and 4.6, with some even noting, &quot;10x more expensive than Sonnet, but no difference.&quot; The general consensus is that &quot;all models have their flaws.&quot; There is widespread criticism of Claude Code for its slowness, high memory consumption, and the use of React for the terminal.</p>
<hr>
<p><strong>Update to GPT-5.3-Codex</strong><br><a target="_blank" href="https://openai.com/index/introducing-gpt-5-3-codex/">https://openai.com/index/introducing-gpt-5-3-codex/</a><br>An improvement over GPT-5.2-Codex. This is a specialized model for generating code for complex projects and automation. It aims to be 25% faster than 5.2-Codex while maintaining the same accuracy.</p>
<p>The main focus of the announcement is <strong>Interactive Collaboration</strong>. You can &quot;steer&quot; mid-execution—meaning you can re-prompt the model without stopping it, and it will <strong>immediately</strong> shift its strategy. This contrasts with Opus 4.6, which attempts to work autonomously for extended periods with minimal human intervention.</p>
<p><strong>Codex as an App</strong><br><a target="_blank" href="https://openai.com/index/introducing-the-codex-app/">https://openai.com/index/introducing-the-codex-app/</a><br>In addition to the CLI and IDE extension, there is now a <strong>standalone app</strong> under this name. It is built on Electron, though only the Mac ARM version was available at launch, with a waitlist for other platforms. This is another attempt to create an agent &quot;control center,&quot; similar to what exists in Cursor and Antigravity, and this one seems successful.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=ICYbOfW5RoQ">https://www.youtube.com/watch?v=ICYbOfW5RoQ</a></p>
<p>The app is a graphical interface (GUI) for the Codex CLI that allows for managing multiple projects, agents, and conversations in a single window. Features include quick switching between projects and chats, voice control, opening files in an IDE, automatic builds, a diff view, and a terminal.</p>
<p>HN Discussion<br><a target="_blank" href="https://news.ycombinator.com/item?id=46902638">https://news.ycombinator.com/item?id=46902638</a><br>Users highlight the diverging strategies chosen by the top players. With Codex, it&#39;s &quot;Steering mid-execution&quot;—the ability to control the process while it runs. The human stays &quot;in the loop.&quot; <strong>Faster recovery</strong> from errors. Better performance with backend and &quot;hard&quot; tasks. Claude focuses on increased autonomy for agent swarms and long-duration tasks, but users note that the &quot;Fire and forget&quot; approach often leads to chaos and poor-quality code.</p>
<hr>
<p>I believe OpenAI has made a <strong>series of good decisions</strong> regarding code generation for professional programmers, as opposed to &quot;vibe-coders&quot; or prototypers. The latter group is better served by Opus 4.6, which devours a lot of tokens as a swarm in Claude Code but eventually produces a working version.</p>
<p>I like that, as of recently, Codex models have started writing back <strong>how they understood me</strong> after my request, and at every step, they report exactly what they are about to do. Generation can be quickly stopped if a misunderstanding occurs, allowing me to add new instructions and clarifications. Judging by the video, I see that in the new Codex app, the code being edited is hidden by default, showing only these text messages.</p>
<p>Furthermore, while working with the CLI, I built my own web app to manage all my chats across all my projects, because doing so from within the CLI is very inconvenient. The new Codex app, judging by the video, does exactly that—I&#39;ll be waiting for the Windows version.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#codex';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#codex</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#claudecode';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#claudecode</span></p>
]]></description>
        <pubDate>Fri, 06 Feb 2026 12:08:00 GMT</pubDate>
      </item>

      <item>
        <title><![CDATA[2026-02-03 10:03]]></title>
        <link>https://aicode.danvoronov.com/2026-02/3_10-03/</link>
        <guid isPermaLink="true">https://aicode.danvoronov.com/2026-02/3_10-03/</guid>
        <description><![CDATA[<p><strong>Model Upgrade to Kimi K2.5</strong><br><a target="_blank" href="https://www.kimi.com/blog/kimi-k2-5.html">https://www.kimi.com/blog/kimi-k2-5.html</a><br>This is an open-source model (though a very large one, requiring hundreds of GB of VRAM), setting new standards in multimodality, programming, and autonomous agent work. The feature the developers are proudest of is the <strong>Agent Swarm</strong> mode: instead of one agent performing tasks sequentially, K2.5 can independently create and coordinate an entire &quot;swarm&quot; of up to <strong>100 sub-agents</strong>.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=eQyAzZboDbw">https://www.youtube.com/watch?v=eQyAzZboDbw</a></p>
<p>The model scores high on SWE-Bench (76.8%), close to GPT-5.2 and Claude Opus 4.5, and it handles real-world code generation tasks well.</p>
<p>Kimi K2.5 is not just a text model, but a &quot;natively multimodal&quot; intelligence. It builds on Kimi K2 through continued pretraining on roughly 15 trillion <strong>mixed visual and text tokens</strong>. This joint training lets the model improve its text understanding and its image/video analysis together.</p>
<p>As a result, Kimi K2.5 demonstrates excellent <strong>results in frontend development</strong>. The model can &quot;see&quot; its own mistakes in the visual interface and fix them autonomously (autonomous visual debugging). It can also generate a website from a video (video-to-site).</p>
<p><strong>Kimi Code CLI</strong> 1.0<br><a target="_blank" href="https://moonshotai.github.io/kimi-cli/en/">https://moonshotai.github.io/kimi-cli/en/</a><br>The Chinese company Moonshot AI is developing its own cross-platform (Windows, macOS, Linux) command-line interface, <strong>Kimi Code CLI</strong>. Recently, the project has evolved from a simple interactive shell into a complex system, although it is still in Technical Preview. In the best Chinese traditions, the interface is a copy of Claude Code.</p>
<p>The CLI already supports the <strong>Agent Client Protocol (ACP)</strong> for integration into the Zed IDE, as well as MCP, third-party providers, and its own OAuth via <code>login/logout</code>. A web interface can be launched with the <code>kimi web</code> command.</p>
<p>Skills here are called <strong>Flow skills</strong>. Users can describe scenarios in <code>SKILL.md</code> files (with Mermaid/D2 diagram support) and trigger them with the <code>/flow</code> command.</p>
<p><strong>Subscription for $19</strong><br><a target="_blank" href="https://www.kimi.com/code">https://www.kimi.com/code</a><br>The subscription is focused on programming, providing access to the CLI and IDE. Prices ($19 / $39 / $199) are <strong>on par with American market leaders</strong>, which reflects Kimi&#39;s confidence in the competitiveness of its models.</p>
<p><span class="post-tag" onclick="document.getElementById('postsFilter').value='#kimi';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#kimi</span> <span class="post-tag" onclick="document.getElementById('postsFilter').value='#newllmmodel';document.getElementById('postsFilter').dispatchEvent(new Event('input'))">#newllmmodel</span></p>
]]></description>
        <pubDate>Tue, 03 Feb 2026 10:03:00 GMT</pubDate>
      </item>
  </channel>
</rss>