CodeWithLLM-Updates
-
🤖 AI tools for smarter coding: practical examples, step-by-step instructions, and real-world LLM applications. Learn to work efficiently with modern code assistants.

ARM package manager
https://github.com/jomadu/ai-rules-manager
A rather simple idea; we'll see how it develops. A package manager for AI rules: keep identical rule sets across different projects and synchronize them automatically with a single source of truth. Git repositories can serve as sources, similar to awesome-cursorrules or other rule collections.

For some reason, I saw no mention of https://agents.md/, which offers a standard way to connect rules and context for various AI systems.

In 2023, Google seemed very confused about AI code generation, but over 2024 and 2025, they managed to deliver both good models and cover every niche of AI programming with their tools. Code can be generated from both the basic and advanced web interfaces of the Gemini chat. There are two proprietary IDEs (though browser-based), plugins for popular IDEs, a terminal agent, and a cloud agent.


Gemini web chat
https://gemini.google.com/
The web chat can generate code; the 2.5 Pro model is available for free with limits. If you select Canvas from Tools, the chat moves to the left and the rest of the screen becomes a code-and-preview area; you can also open the console. In preview mode, a panel offers options to (A) add AI functionality or (B) highlight an element or region of the screen with the mouse and write follow-up queries about it.

Suitable for: simple visualizations, interactive explanations, single-page website prototypes.

Build apps with Gemini
https://aistudio.google.com/apps
A significantly better web version, similar to Canvas mode. It uses the 2.5 Pro model with React and Tailwind, with no option to change them. You can open the application in full screen and deploy it to Google Cloud.

Suitable for: generating UI, small applications.


Stitch from Google Experiments
https://stitch.withgoogle.com/ from https://labs.google/
A tool in limited access. It generates mobile and desktop interfaces from a text description or a sketch image. After a request, the system analyzes the requirements and proposes the number and content of screens, with export as code (HTML with Tailwind classes) and images; it also appears to support export to Figma.

Opal from Google Experiments
https://opal.withgoogle.com/
If you draw an algorithm diagram, it becomes a mini-program that can be shared.


Gemini Code Assist
https://codeassist.google/
This page covers many projects. In addition to the free plan, Standard ($19) and Enterprise ($45) plans offer higher limits for Gemini CLI and Agent Mode. Note that in the free plan, Google collects and uses your data to train AI models by default.

https://marketplace.visualstudio.com/items?itemName=Google.geminicodeassist and https://plugins.jetbrains.com/plugin/24198-gemini-code-assist
The plugin for VS Code and JetBrains-compatible IDEs (as well as Android Studio) can, in Agent Mode, perform multi-step tasks in which the user reviews and approves an action plan. It now supports MCP.

Suitable for: assisting manual coding, answering questions, and generating needed fragments and tests.

Gemini CLI https://github.com/google-gemini/gemini-cli
Google's answer to Claude Code - an autonomous agent that runs in the terminal with the gemini command. The code is open source. Supports macOS, Linux, and Windows. Can process PDFs and images, use Google Search, and pull additional instructions from GEMINI.md. Creates a shadow git repository ~/.gemini/history/<project_hash> where it logs the final state of each development stage - disabled by default, can be added in settings.

Suitable for: step-by-step automatic project generation, with checks at each step and additional edits.
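A minimal sketch of enabling the checkpointing mentioned above. Assumptions not stated in the source: the settings file lives at ~/.gemini/settings.json, the key is named "checkpointing", and a --checkpointing flag exists for one-off runs; verify all of this against the gemini-cli docs for your version. The sketch writes to a demo directory so no real configuration is touched.

```shell
# Sketch: enable Gemini CLI checkpointing (the shadow git history of each stage).
# File location and key name are assumptions -- check the gemini-cli docs.
mkdir -p /tmp/demo-gemini/.gemini          # stand-in for $HOME/.gemini
cat > /tmp/demo-gemini/.gemini/settings.json <<'EOF'
{
  "checkpointing": {
    "enabled": true
  }
}
EOF
# Possible one-off alternative per the CLI help: gemini --checkpointing
cat /tmp/demo-gemini/.gemini/settings.json
```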

Jules
https://jules.google.com/
Launched as an experiment, it quickly moved to beta, with a subscription available from August 6, 2025. This is an autonomous agent (or rather two: an executor and a critic) that runs on a Google virtual machine with a 20 GB disk. The idea is that it creates its own plan, works step-by-step with the selected GitHub repository for an extended period, and then sends a report to your browser. 15 free tasks per day.

Suitable for: I have little experience with it; you need to gauge what level of task complexity won't confuse the autonomous system and set tasks accordingly.


Cloud Shell Editor
https://shell.cloud.google.com/
An IDE that does not require local setup and is available directly from the browser. Integrated with Google Cloud Platform. The VM has a 5GB disk. Where the Copilot button is in VSC, there is a button to activate the Gemini Code Assist panel. Gemini CLI can be called from the terminal.

Suitable for: manual and automatic code work, if Google Cloud Platform is needed.

Firebase Studio (formerly IDX)
https://studio.firebase.google.com
Project IDX emerged as an experimental browser-based IDE. Announced on August 8, 2023, for closed testing, opened for public testing on May 14, 2024, and rebranded to Firebase Studio on April 9, 2025. This is a full-fledged IDE where we have full access to the code and its modification.

Requests can be automatically improved; first, a plan and documentation will be developed, and then the code will be generated. Templates are available for Astro, Go, Python/Flask, Solid.js, Node.js, with support for Flutter and React Native. Integration with GitHub and Google Cloud services.

Suitable for: as an IDE for both manual and automatic work with complex projects.

GLM plan for programming
https://docs.z.ai/devpack/overview
A subscription package created specifically for programming with AI. It provides access to the powerful GLM-4.5 and GLM-4.5-Air models in popular tools such as Claude Code, Cline, Roo Code, Kilo Code, OpenCode, and others.

Pricing starts at $3 per month, with significantly higher usage limits than standard plans like Claude Pro. The quota renews every 5 hours. The company's services are hosted in Singapore, and Z.ai states that it does not store user queries or generated data.

DeepSeek in Claude Code
https://api-docs.deepseek.com/guides/anthropic_api
By setting ANTHROPIC_BASE_URL to https://api.deepseek.com/anthropic, you can use models from DeepSeek.
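A minimal sketch of that setup. The base URL comes from the guide above; the ANTHROPIC_AUTH_TOKEN and ANTHROPIC_MODEL variable names follow DeepSeek's guide as I understand it, and the key value is a placeholder:

```shell
# Point Claude Code at DeepSeek's Anthropic-compatible endpoint.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="sk-placeholder"   # your DeepSeek API key goes here
export ANTHROPIC_MODEL="deepseek-chat"         # assumption: model id per DeepSeek docs
# Then launch Claude Code as usual:
# claude
echo "$ANTHROPIC_BASE_URL"
```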

OpenAI Codex
On its YouTube channel, OpenAI released a video about Codex during a series of virtual "Build Hours" events.

https://www.youtube.com/watch?v=WvMqA4Xwx_k

OpenAI's goal is to make Codex the sole, universal agent for code generation. It is recommended to view Codex as a full-fledged team member or even delegate tasks to it, acting as an architect or manager.

Codex can operate in two main modes: locally and in the cloud. Both environments are integrated via a ChatGPT account and synchronized with GitHub repositories. Task delegation is available through the web interface and the ChatGPT mobile app (iOS), allowing users to "run tasks when inspiration strikes," even away from the workstation.

There is a new extension compatible with VS Code, Cursor, and other VS Code forks. Codex can also automatically review pull requests on GitHub. This is not limited to static analysis: the agent can run code, check logic, and validate changes.

It is recommended to organize repositories with smaller files. The presence of tests, linters, and formatters allows the agent to independently detect and correct errors. For complex tasks, it's better to generate detailed plans in Markdown (plan.md). Using an agents.md file to document architectural decisions helps Codex immediately understand what it's working with.

Specification-Driven Development
https://github.com/github/spec-kit
GitHub has proposed a new paradigm for software creation, along with a tool that implements it. The tool works on Linux/macOS (or WSL2 on Windows) with the Claude Code, GitHub Copilot, and Gemini CLI agents.

Before the AI era, we usually wrote code first (that was the "real work") and finished specifications and documentation "on the side". Spec-Driven Development (SDD) makes specifications primary: they directly generate working implementations. Specifications define the "what" before the "how."

Kiro from Amazon has a similar approach (Spec mode), which first writes project requirements and only then generates code. Qoder also has a Quest mode with similar logic.

https://www.youtube.com/watch?v=LA_HqmiGvsE

Detailed document
https://github.com/github/spec-kit/blob/main/spec-driven.md
The main idea of SDD is to bridge the gap between intent (specification) and execution (code). This is achieved by making specifications (such as Product Requirement Documents – PRDs – and implementation plans) so precise, complete, and unambiguous that they can be used to automatically generate working code. SDD ensures systematic alignment of all components of a complex system.

The quality of specifications is ensured by structured templates that prevent premature implementation details, require explicit markers for uncertainties, include checklists, and ensure compliance with the project's "constitution." The "constitution" contains immutable architectural principles (e.g., "Library-first," "CLI-interface mandatory," "Test-First Development," "Simplicity," "No excessive abstraction") that guarantee consistency, simplicity, and quality of the generated code.
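As an illustration, such a constitution might be written down as follows. This is a hypothetical file built from the principles listed above; the file name, location, and wording are illustrative, not spec-kit's actual layout:

```shell
# Hypothetical project "constitution" capturing the immutable principles
# described above; not spec-kit's exact file format.
cat > /tmp/demo-constitution.md <<'EOF'
# Project Constitution
1. Library-first: every feature begins life as a standalone library.
2. CLI interface mandatory: each library exposes a command-line entry point.
3. Test-First Development: tests are written, and fail, before implementation.
4. Simplicity: no excessive abstraction.
EOF
cat /tmp/demo-constitution.md
```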

Agent Client Protocol (ACP)
https://agentclientprotocol.com
https://github.com/zed-industries/agent-client-protocol
A new open standard developed by Zed Industries, aiming to standardize communication between code editors (IDEs, text editors) and AI agents. The approach is similar to how Language Server Protocol (LSP) standardized the integration of language servers.

Agents run as child processes of the code editor and exchange data using JSON-RPC over standard input/output (stdio) streams. The protocol is still under development.
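To make the wire format concrete, here is roughly what one message looks like. The "initialize" method name follows the ACP documentation as I recall it, but the exact fields are illustrative and may differ between protocol versions:

```shell
# Illustrative ACP message: JSON-RPC 2.0, which an editor would write to the
# stdin of its agent child process. Field names are an approximation.
ACP_INIT='{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": 1}}'
echo "$ACP_INIT"
# e.g.: echo "$ACP_INIT" | my-acp-agent   (hypothetical agent binary)
```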

https://zed.dev/blog/claude-code-via-acp
Zed editor now supports integration with Claude Code via ACP and runs in the sidebar. The adapter is released as open-source under the Apache license, allowing other editors to use it as well. https://github.com/Xuanwo helped with the implementation.

Practical Techniques for Claude Code and Codex CLI
https://coding-with-ai.dev/
https://github.com/inmve/coding-with-ai
The website contains practical techniques for effective work with AI assistants. Each stage has a checklist.

It covers the following topic sections:
1. Planning (preparation for working with AI),
2. UI & Prototype (rapid creation of interfaces and prototypes),
3. Coding (efficient code generation and manipulation),
4. Debugging (identifying and fixing errors with AI),
5. Testing & QA (ensuring code quality through tests),
6. Review (checking and improving AI-generated code)
and Cross-stage (general advice for all stages).

The main idea, "Brain First, AI Second": think independently first, then use AI for interface prototyping and for delegating tedious, systematic tasks.

Start working with AI with thorough planning: choose stable, "boring" libraries and provide extremely detailed specifications. At least sketch the interface. It's better to start from some existing code; you can write the core algorithm yourself. And write tests at the beginning.

It is important to create a context-memory file (for example, `AGENTS.md`). Actively manage context, interrupt the agent if it goes off course, and use it as a learning partner by asking open-ended questions when choosing among several options. Log as much as possible. Use screenshots to explain problems.
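A hypothetical example of such a context-memory file; the stack, commands, and module names are invented for illustration:

```shell
# Hypothetical AGENTS.md context-memory file; all project details are invented.
cat > /tmp/demo-AGENTS.md <<'EOF'
# AGENTS.md

## Stack
- Python 3.12, FastAPI, pytest

## Conventions
- Run `pytest -q` before every commit.
- Log liberally; attach screenshots when reporting UI problems.

## Gotchas
- The legacy reports/ module must not be refactored without approval.
EOF
cat /tmp/demo-AGENTS.md
```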

Run multiple agents in parallel on separate tasks without conflicts. Keep the code as simple as possible, ask for refactoring and simplification repeatedly. It's better to generate your own code instead of connecting more and more third-party libraries.

Always read generated code! Use a second agent without chat history to check the code. Never delegate testing entirely to AI, always check the code yourself and thoroughly review all changes (diff).

Start with cheaper models (Sonnet 4), moving to stronger ones (Opus 4.1) only when necessary. Claude models can adjust effort via the keywords think < think hard < think harder < ultrathink. For interfaces, asking for "more beautiful" or "more elegant" results works.

AI integration experience in production processes
https://www.sanity.io/blog/first-attempt-will-be-95-garbage
The author used Cursor for 18 months for code generation, and Claude Code for the last 6 weeks. The transition to Claude Code took only hours to understand how to work with it.

Author's mental model: AI is a junior developer who never learns. Costs for Claude Code can be a significant percentage of an engineer's monthly salary ($1000-1500 per month).

Conclusions:

  • Rule of three attempts: forget about perfect code on the first try; 95% will be garbage, but it helps the agent identify the real requirements and constraints. The second attempt might yield 50% good code, and by the third it will most likely produce something that can be iterated on and improved.
  • Retain knowledge between sessions: update Claude.md with architectural decisions, typical code patterns, "gotchas", and links to documentation. And configure MCP to retrieve data from Linear, Notion/Canvas, Git/GitHub history, and others.
  • Teams and AI agents: the author uses Linear. Never run multiple agents on the same problem space. Explicitly mark correct, human-edited code.
  • Code review: agents --> agent overseers --> team. Always check, especially for complex state management, performance-critical, and security-critical sections.

The main problem today is that AI does not learn from mistakes. Solution: better documentation, clearer instructions.

The author emphasizes that giving up "ownership" of the code (since AI wrote it) leads to more objective reviews, faster removal of unsuccessful solutions, and lack of ego during refactoring.

Discussion on HN
https://news.ycombinator.com/item?id=45107962
LLMs have already become a standard tool for engineers. It is confirmed that several iterations with LLMs (3 or more) to achieve an acceptable result are typical. LLMs do not write complex code very well on their own unless it is an extremely simple or templated task. Code written by LLMs always requires careful monitoring and editing.

There is concern that excessive use of LLMs may erode a developer's own thinking skills. If a junior developer simply vibe-codes with an LLM without deep understanding, it undermines the trust of senior colleagues.

Some users create additional "critical agents" (LLMs trained to find errors) to check code written by the main agent. Breaking down complex tasks into small, manageable parts is key to success. Using TDD with LLMs works very well.

A sample instruction file CLAUDE.md is available at https://www.dzombak.com/blog/2025/08/getting-good-results-from-claude-code/

However, there is a discussion about the effectiveness of this file for providing context: some consider it useful for AI's long-term memory, while others argue that the AI often ignores its content. From the HN Discussion, further conclusions can be drawn:

  • The best results are achieved when the developer spends significant time creating very clear, step-by-step specifications (documents describing exactly how a project should be implemented). This requires more initial effort but allows Claude Code to follow clear instructions and generate more accurate and organized code.
    • Some users employ other AIs (e.g., ChatGPT, Gemini) for brainstorming, creating specifications, critiquing, and refining them before submitting the final document to Claude Code.
    • Integration with code quality tools (husky, lint-staged, commitlint) helps maintain standards.
  • Claude Code, despite marketing, does not "think" in a human sense; at any step, it can make strange mistakes or "hallucinate."
    • Since we have a limited context window, it's better to work with Claude Code in small, sequential steps. Ask it to write one function or make one change, then check the result, fix errors, commit, and only then move to the next step.
    • Some successfully use Claude to write unit tests, and then ask it to write minimal code to pass those tests, similar to TDD (Test-Driven Development).
  • Some users have noticed that asking Claude to review its own work can be surprisingly fruitful, as it often points out shortcomings itself.

An analysis of how Claude Code works
https://minusx.ai/blog/decoding-claude-code/
The author believes that the CLAUDE.md file is key for conveying user context and preferences (e.g., which folders to ignore, which libraries to use). Its content is sent with each user request. Phrases like IMPORTANT, VERY IMPORTANT, NEVER, and ALWAYS are still effective for preventing undesirable behavior.
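A hypothetical fragment showing those emphasis markers in practice; the preferences, tools, and folder names are invented for illustration:

```shell
# Hypothetical CLAUDE.md fragment using the emphasis phrases described above.
cat > /tmp/demo-CLAUDE.md <<'EOF'
# CLAUDE.md

IMPORTANT: use pnpm, NEVER npm, for all package operations.
ALWAYS run pnpm test before committing.
Ignore the build/ and node_modules/ folders when reading the project.
EOF
grep -E 'IMPORTANT|NEVER|ALWAYS' /tmp/demo-CLAUDE.md
```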

Checkpoints for Claude Code
https://claude-checkpoints.com/
The project adds checkpoints to Claude Code, similar to Cursor. The main goal is to ensure that we do not lose correctly generated code by tracking changes and enabling the restoration of previous project states. It includes a visual diff viewer.

Checkpoints are created automatically after Claude completes tasks. It integrates with Claude Desktop via the MCP (Model Context Protocol) protocol.

Auggie
https://www.augmentcode.com/product/CLI
Augment Code CLI is an attempt to copy Claude Code, this time from the developers of Augment Code. As a standalone product it's hard to say whether it has much value; it's more of an addition to their main product and is tied to their pricing plans.

An interesting feature is the hotkey for improving a request. Otherwise, it has basic terminal agent functions, with MCP support like everywhere else now. The models are Claude Sonnet 4 and GPT-5.

Privacy mode in TRAE IDE v2.1.0
https://docs.trae.ai/ide/privacy-mode?_lang=en
One of the reasons I didn't use ByteDance's Chinese TRAE was their unclear policy of "we do what we want with your code." Now they have added the ability to activate privacy mode in the settings. I think this is a response to Alibaba's Qoder.

Privacy mode is active only while you are logged in; after logging out, it ceases to apply. It also does not cover SOLO mode, which is still in the testing phase.

SOLO mode
https://docs.trae.ai/ide/solo-mode
Appeared in TRAE v2.0.0. Available only for users with a Pro subscription and SOLO Code. This is a highly automated mode that independently plans and executes the entire development cycle: from requirements analysis, code and testing generation to results preview and deployment. Works with Figma / Vercel / Supabase / Stripe.

https://www.youtube.com/watch?v=4JObEIIK8Uo

SOLO Builder is an agent for creating web applications. Analyzes requirements → generates PRD → writes code → provides a preview. During development, AI independently orchestrates tools such as editor, browser, terminal, and doc viewer (project description with diagrams). Sometimes you need to click confirmation buttons.

Codex IDE
https://developers.openai.com/codex/ide
OpenAI has once again changed what "Codex" refers to. It is now not only the old model and the new agent in the CLI and ChatGPT, but also an IDE extension (VS Code, Cursor, and Windsurf).

From it, you can both modify code locally and send tasks to the cloud. It works on Mac and Linux. Windows support is still experimental.

https://www.youtube.com/watch?v=SgJaSmD3u3k

https://help.openai.com/en/articles/11369540-using-codex-with-your-chatgpt-plan
A response to Claude Code's fixed pricing plans. The main point is that Codex works with paid ChatGPT plans (except Enterprise and Edu, for now), which many people already have.
Plus offers 30-150 messages per 5 hours, Pro offers 300-1500.

ByteRover 2.0
https://www.byterover.dev/blog/byterover-2-0
A new version of ByteRover 2.0 has been released with an improved interface and two key features. Context Composer: A tool for collecting and managing precise context for AI agents from various sources (documents, images, web, etc.). Git for AI Memory: Allows version control of AI agent memory, tracking changes, and collaboration, similar to Git for code.


xAI: Grok Code Fast 1
https://openrouter.ai/x-ai/grok-code-fast-1
https://vercel.com/ai-gateway/models/grok-code-fast-1
xAI has released a new MoE model, Grok Code Fast 1, for code generation. Designed for "agentic coding," it is fast and relatively inexpensive. It has a 256k token context window.

https://github.blog/changelog/2025-08-26-grok-code-fast-1-is-rolling-out-in-public-preview-for-github-copilot/
The model is available in GitHub Copilot for users with Copilot Pro, Pro+, Business, and Enterprise plans. You can also use your own API key (BYOK). Free access is provided until September 2, 2025.

Also added to Cursor.

Zed and CLI agent support
https://zed.dev/blog/bring-your-own-agent-to-zed
Zed, in partnership with Google, has integrated Gemini CLI into the IDE via the new Agent Client Protocol (ACP).

This is an open, JSON-RPC-based protocol that allows connecting any agents to Zed (similar to how LSP works for languages). AI data is not sent to Zed servers. The protocol is Apache licensed — you can create your own agents or connect them to other editors (for example, Neovim already supports it).

When is the Windows version coming?
https://zed.dev/windows
Finally, Zed will get native support for Windows, which thousands of users have requested. The developers have faced a number of challenges and are actively working on it.

The company invites users to join the closed beta test.

Alibaba Qoder
https://qoder.com/ https://qoder.com/changelog
Not to be confused with Qodo. A new AI-powered IDE (another VSC clone), this time from Chinese giant Alibaba. They claim their system can deeply understand complex project architecture, design patterns, and its operational logic. Solution: in addition to code graphs and indexing, it automatically maintains a project description Wiki.

https://www.youtube.com/watch?v=xSWj_Pe3CWo

It offers two operating modes similar to those in Amazon Kiro: in Agent Mode it works as an interactive vibe-coder; in Quest Mode it takes the role of an engineer (Spec-Driven Development), independently planning work, breaking down tasks, implementing functions, and automating testing, delivering ready-to-use code according to the specified requirements. The second mode requires a Git repository. There is a history of completed tasks, and multiple tasks can be launched; unlike Kiro, they execute in parallel rather than queueing.

It can work with Anthropic Claude, Google Gemini, and OpenAI GPT, automatically selecting the most efficient and economical model, similar to Auto mode in Cursor. Manual model switching is (currently?) not available. The website does not have a Linux version.

Qoder is currently available for free as part of a public preview. The privacy mode in the settings cannot be changed – it is activated by default.

Amazon Kiro v0.2
https://kiro.dev/pricing/
Kiro IDE is still in preview development, but they have already launched paid subscriptions. Perhaps this is because in v0.1 they gave a lot of free tokens for Sonnet 4, and now registration is closed. For new users, there is only a waiting list.

Only 50 vibe requests per month will be free; currently, all new users are given 100 vibe and spec requests, which must be used within 2 weeks. This is very unfortunate; my simple test project consumed 40 spec requests in just 2 hours. And 125 spec requests now cost $20/month.

Furthermore, compared to Cursor, there is almost no functionality or variety of models here yet, and it feels 2-3 times slower. I don't know who would pay so much for this now.