CodeWithLLM-Updates
-

Claude Code 2.0.0
https://www.npmjs.com/package/@anthropic-ai/claude-code
Redesigned UI. Ctrl-R to search history. Tab to switch thinking modes (persists between sessions). /usage command to view plan limits. /rewind command to revert code changes - native checkpoints finally.

Added a native VS Code extension. Claude Code SDK is now called Claude Agent SDK and allows others to create their own intelligent "agents" based on this technology.

Claude Sonnet 4.5
https://www.anthropic.com/news/claude-sonnet-4-5
They claim it's the best model for programming today. Safety is specifically emphasized: the model has become "more obedient" and less prone to errors or harmful actions. Importantly, this new, more powerful version is already available to everyone at the same price as the previous one.

AI Programming University
https://cline.bot/learn
Cline also offers a resource for beginners in this field. The course consists of two main modules that sequentially introduce key concepts.

The first module, "Prompting", focuses on communicating with AI assistants. It covers writing clear, specific prompts that get results: drafting a prompt, refining it iteratively, and using system prompts to configure coding style, security, and workflows.

The second module, "LLM Fundamentals", delves into the technical aspects of AI. It explains what Large Language Models (LLMs) are, how they process and generate code. It teaches how to evaluate and select the best model for specific tasks, considering criteria such as speed, cost, accuracy, and privacy.

AI Foundations
https://www.youtube.com/playlist?list=PLuI2ZfvGpzwCEXrl_K6bW5OqNpZq3HkMa
After further simplifying their website design, Cursor has now published a series of "for beginners" videos on their YouTube channel called "AI Foundations" – seems they are looking for new clients.

Topics of each video:

  1. "How AI Models Work": This video explains the basic principles of how artificial intelligence models function, for example, how they process information and generate responses.
  2. "Hallucinations": This section discusses the phenomenon of "hallucinations" in AI — when a model generates confident but untrue or fabricated information that is not based on the input data.
  3. "Tokens & Pricing": The video is dedicated to tokens — units into which text is broken down for processing by the model. It also explains how the number of tokens affects the cost of using AI models via API.
  4. "Context": It talks about the importance of context for AI models. This likely refers to the "context window" — the amount of information (previous messages) that a model can remember during a conversation to provide relevant responses.
  5. "Tool Calling": This video explains how modern AI models can interact with external tools and APIs (e.g., search engines, calculators, databases) to perform more complex tasks.
  6. "Agents": The last video reveals the concept of AI agents — autonomous systems that can independently plan and execute a sequence of actions to achieve a set goal, using various tools.

GitHub Copilot CLI
https://github.com/github/copilot-cli
https://github.blog/changelog/2025-09-25-github-copilot-cli-is-now-in-public-preview/
GitHub Copilot can now also be used from the terminal. The CLI is currently in Public Preview. Supports Linux, macOS, and Windows experimentally (via PowerShell v6+).

By default, it uses the Claude Sonnet 4 model, but also supports GPT-5 via the environment variable COPILOT_MODEL=gpt-5; there is no slash command to switch the model. AI response blocks can be expanded using Ctrl+r.
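
A sketch of the model switch (the variable name is from the changelog; the copilot command itself comes from the repo's install instructions):

```shell
# Select GPT-5 instead of the default Claude Sonnet 4
export COPILOT_MODEL=gpt-5

# Launch the agent in the current repository
copilot
```

Since there is no slash command for this, the variable must be set before starting the session.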

It uses the GitHub MCP server by default. Provides access to repositories, issues, and pull requests using agent queries.

https://www.youtube.com/watch?v=7tjmA_0pl2c

GitHub Spec Kit and Copilot CLI can be used together. The video shows how to create custom slash commands using additional instruction files.

llmswap
https://github.com/sreenathmmenon/llmswap
A universal tool (SDK and CLI) for developers, allowing easy interaction with various AI providers (OpenAI/GPT-4o, Claude, Gemini, Groq, etc.) directly from the terminal.

This is a Python project; install it with pip install llmswap.

Directly from the terminal you can:

  • Generate code and commands: for example, llmswap generate "sort files by size" will output du -sh * | sort -hr.
  • Create functions and entire scripts: for example, llmswap generate "Python function to read JSON" will produce ready-to-use code with error handling.
  • Perform code reviews, debug errors, and analyze logs.

Use cases and examples
https://sreenathmenon.com/blog/2025-09-04-stopped-alt-tabbing-chatgpt-while-coding/
The author emphasizes the main advantage: llmswap brings AI directly into the terminal. Instead of constantly switching between a code editor, a browser with ChatGPT, a search engine, and other assistants, you get answers instantly without losing focus.

Killer feature: integration with Vim (and other editors). Using the :r ! command in Vim, you can insert the output of any console command directly into the file you are editing. This makes llmswap incredibly useful: it can drop generated code snippets right at the cursor.
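
For example, with the cursor where the snippet should go (a sketch; llmswap must be on your PATH, and the prompt is whatever you need):

```vim
" Insert llmswap's output at the cursor position
:r !llmswap generate "Python function to read JSON"
```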

The tool can be part of the workflow for:

  • DevOps: Generating docker-compose.yml, Kubernetes configurations, systemd services.
  • Database Administration: Creating complex SQL queries (find duplicate emails in table), commands for MongoDB.
  • Log Analysis: Creating commands for awk, grep, zgrep to analyze regular and archived logs.
  • and much more.

Codex and GPT-5-Codex Update
https://openai.com/index/introducing-upgrades-to-codex/
OpenAI has unified Codex into a single ecosystem. Access to the updated tool is included in ChatGPT Plus, Pro, Business, Edu, and Enterprise subscriptions. It cannot be used for free.

Codex is now available in the terminal (Codex CLI), as the Codex IDE extension, on the web (Codex cloud), on GitHub, and in the ChatGPT mobile app. There is also an option to switch to the GPT-5-Codex model.

https://www.youtube.com/watch?v=yqtujbev9zI

GPT-5-Codex, a new version of the GPT-5 model, is specially optimized for independently performing complex, long-running tasks (tested on runs of up to 7 hours), such as creating projects from scratch, refactoring, and bug fixing. Special attention is paid to code review.

https://news.ycombinator.com/item?id=45252301
https://news.ycombinator.com/item?id=45253458
People are debating whether Codex CLI is finally better than Claude Code. It seems GPT-5 still follows instructions better and writes clearer and simpler code.

ARM package manager
https://github.com/jomadu/ai-rules-manager
A rather simple idea; we'll see how it develops. A package manager for AI rules: identical rule sets across different projects, kept automatically synchronized with a source of truth. Git can serve as the source, similar to awesome-cursorrules or other rule collections.

For some reason, I didn't see a mention of https://agents.md/ which offers a standard option for connecting rules and context for various AI systems.

In 2023, Google seemed very confused about AI code generation, but over 2024 and 2025, they managed to deliver both good models and cover every niche of AI programming with their tools. Code can be generated from both the basic and advanced web interfaces of the Gemini chat. There are two proprietary IDEs (though browser-based), plugins for popular IDEs, a terminal agent, and a cloud agent.


Gemini web chat
https://gemini.google.com/
The web chat can generate code; model 2.5 Pro is available for free with limits. If you select Canvas from Tools, the chat moves to the left and the rest of the screen becomes a code-and-preview pane. You can also display the console. In preview mode, a panel offers options to (A) add AI functionality or (B) highlight an element or part of the screen with the mouse and write follow-up queries about it.

Suitable for: simple visualizations, interactive explanations, single-page website prototypes.

Build apps with Gemini
https://aistudio.google.com/apps
A significantly better web version, similar to Canvas mode. Uses model 2.5 Pro and React & Tailwind without the ability to change them. There is an option to open the application in full screen and deploy it to Google Cloud.

Suitable for: great for generating UI, small applications.


Stitch from Google Experiments
https://stitch.withgoogle.com/ from https://labs.google/
A tool in limited access. Generates mobile and desktop interfaces from a text description or sketch image. After a request, the system will analyze the requirements and propose the number and content of screens with the ability to export as code (HTML with tailwind classes) and images; it also seems to have export to Figma.

Opal from Google Experiments
https://opal.withgoogle.com/
If you draw an algorithm diagram, it becomes a mini-program that can be shared.


Gemini Code Assist
https://codeassist.google/
This page covers many projects. In addition to the free plan, Standard ($19) and Enterprise ($45) plans are available with higher limits for Gemini CLI and Agent Mode. In the free plan, Google by default collects and uses your data to train AI models.

https://marketplace.visualstudio.com/items?itemName=Google.geminicodeassist and https://plugins.jetbrains.com/plugin/24198-gemini-code-assist
The plugin for VS Code- and JetBrains-compatible IDEs (as well as Android Studio) can, in Agent Mode, perform multi-step tasks where the user reviews and approves an action plan. It now supports MCP.

Suitable for: improving manual code work, questions, and generating necessary fragments, tests.

Gemini CLI https://github.com/google-gemini/gemini-cli
Google's answer to Claude Code - an autonomous agent that runs in the terminal with the gemini command. The code is open source. Supports macOS, Linux, and Windows. Can process PDFs and images, use Google Search, and pull additional instructions from GEMINI.md. It creates a shadow git repository in ~/.gemini/history/<project_hash> where it logs the final state of each development stage; this is disabled by default and can be enabled in settings.
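
A sketch of enabling that checkpointing (flag and settings key per the Gemini CLI docs; verify against the current version):

```shell
# One-off: start the agent with checkpointing turned on
gemini --checkpointing

# Or enable it persistently in ~/.gemini/settings.json:
#   { "checkpointing": { "enabled": true } }
```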

Suitable for: step-by-step automatic project generation, with checks at each step and additional edits.

Jules
https://jules.google.com/
Launched as an experiment, it quickly moved to beta, with a subscription available from August 6, 2025. This is an autonomous agent (or rather two - an executor and a critic) that runs on a Google virtual machine with a 20GB disk. The idea is that it creates its own plan, works step-by-step with the selected GitHub repository for an extended period, and then sends a report to your browser. 15 free tasks per day.

Suitable for: I have little experience with it; you need to judge what level of task complexity won't confuse the autonomous system and set tasks accordingly.


Cloud Shell Editor
https://shell.cloud.google.com/
An IDE that does not require local setup and is available directly from the browser. Integrated with Google Cloud Platform. The VM has a 5GB disk. Where the Copilot button is in VSC, there is a button to activate the Gemini Code Assist panel. Gemini CLI can be called from the terminal.

Suitable for: manual and automatic code work, if Google Cloud Platform is needed.

Firebase Studio (formerly IDX)
https://studio.firebase.google.com
Project IDX emerged as an experimental browser-based IDE. Announced on August 8, 2023, for closed testing, opened for public testing on May 14, 2024, and rebranded to Firebase Studio on April 9, 2025. This is a full-fledged IDE where we have full access to the code and its modification.

Requests can be automatically improved; first, a plan and documentation will be developed, and then the code will be generated. Templates are available for Astro, Go, Python/Flask, Solid.js, Node.js, with support for Flutter and React Native. Integration with GitHub and Google Cloud services.

Suitable for: as an IDE for both manual and automatic work with complex projects.

GLM plan for programming
https://docs.z.ai/devpack/overview
A subscription package created specifically for programming with artificial intelligence. It provides access to the powerful GLM-4.5 and GLM-4.5-Air models in popular tools like Claude Code, Cline, Roo Code, Kilo Code, OpenCode, and others.

The cost starts from $3 per month. Significantly higher usage limits compared to standard plans like Claude Pro. The quota is renewed every 5 hours. The company's services are located in Singapore. Z.ai does not store user queries or generated data.

DeepSeek in Claude Code
https://api-docs.deepseek.com/guides/anthropic_api
By setting ANTHROPIC_BASE_URL to https://api.deepseek.com/anthropic, you can use models from DeepSeek.
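
A minimal sketch of the switch (the base URL is from the DeepSeek guide; the auth-token variable name and model mapping should be verified there, and the API key is a placeholder):

```shell
# Route Claude Code's Anthropic-style API calls to DeepSeek
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_AUTH_TOKEN="your-deepseek-api-key"  # placeholder

claude
```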

OpenAI Codex
On its YouTube channel, OpenAI released a video about Codex during a series of virtual "Build Hours" events.

https://www.youtube.com/watch?v=WvMqA4Xwx_k

OpenAI's goal is to make Codex the sole, universal agent for code generation. It is recommended to view Codex as a full-fledged team member or even delegate tasks to it, acting as an architect or manager.

Codex can operate in two main modes: locally and in the cloud. Both environments are integrated via a ChatGPT account and synchronized with GitHub repositories. Task delegation is available through the web interface and the ChatGPT mobile app (iOS), allowing users to "run tasks when inspiration strikes," even away from the workstation.

There is a new IDE extension compatible with VS Code, Cursor, and other VS Code forks. Codex can automatically check pull requests on GitHub. This functionality is not limited to static analysis: the agent can run code, check logic, and validate changes.

It is recommended to organize repositories with smaller files. The presence of tests, linters, and formatters allows the agent to independently detect and correct errors. For complex tasks, it's better to generate detailed plans in Markdown (plan.md). Using an agents.md file to document architectural decisions helps Codex immediately understand what it's working with.

Specification-Driven Development
https://github.com/github/spec-kit
GitHub has proposed a new paradigm for software creation, along with a tool to implement it, which works on Linux/macOS (or WSL2 under Windows) with the Claude Code, GitHub Copilot, and Gemini CLI agents.

Before the AI era, we usually wrote code first, and that was the "real work," and then "on the side" we finished specifications and documentation. Spec-Driven Development (SDD) says that specifications become primary, directly generating working implementations. Specifications define "what" before "how."

Kiro from Amazon has a similar approach (Spec mode), which first writes project requirements and only then generates code. Qoder also has a Quest mode with similar logic.

https://www.youtube.com/watch?v=LA_HqmiGvsE

Detailed document
https://github.com/github/spec-kit/blob/main/spec-driven.md
The main idea of SDD is to bridge the gap between intent (specification) and execution (code). This is achieved by making specifications (such as Product Requirement Documents – PRDs – and implementation plans) so precise, complete, and unambiguous that they can be used to automatically generate working code. SDD ensures systematic alignment of all components of a complex system.

The quality of specifications is ensured by structured templates that prevent premature implementation details, require explicit markers for uncertainties, include checklists, and ensure compliance with the project's "constitution." The "constitution" contains immutable architectural principles (e.g., "Library-first," "CLI-interface mandatory," "Test-First Development," "Simplicity," "No excessive abstraction") that guarantee consistency, simplicity, and quality of the generated code.

Agent Client Protocol (ACP)
https://agentclientprotocol.com
https://github.com/zed-industries/agent-client-protocol
A new open standard developed by Zed Industries, aiming to standardize communication between code editors (IDEs, text editors) and AI agents. The approach is similar to how Language Server Protocol (LSP) standardized the integration of language servers.

Agents run as child processes of the code editor and exchange data using JSON-RPC over standard input/output (stdio) streams. The protocol is still under development.
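
Since messages are JSON-RPC 2.0 frames over stdio, a request looks roughly like this (an illustrative sketch; method and field names should be checked against the ACP spec, which is still evolving):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": { "protocolVersion": 1 }
}
```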

https://zed.dev/blog/claude-code-via-acp
Zed editor now supports integration with Claude Code via ACP and runs in the sidebar. The adapter is released as open-source under the Apache license, allowing other editors to use it as well. https://github.com/Xuanwo helped with the implementation.

Practical Techniques for Claude Code and Codex CLI
https://coding-with-ai.dev/
https://github.com/inmve/coding-with-ai
The website contains practical techniques for effective work with AI assistants. Each stage has a checklist.

It covers the following topic sections:
1. Planning (preparation for working with AI),
2. UI & Prototype (rapid creation of interfaces and prototypes),
3. Coding (efficient code generation and manipulation),
4. Debugging (identifying and fixing errors with AI),
5. Testing & QA (ensuring code quality through tests),
6. Review (checking and improving AI-generated code)
and Cross-stage (general advice for all stages).

The main idea is "Brain First, AI Second": think independently first, then use AI for interface prototyping and for delegating tedious, systematic tasks.

Start working with AI with thorough planning: choose stable, "boring" libraries and provide extremely detailed specifications. At least sketch the interface. It's better to start from some existing code; you can write the main algorithm yourself. And write tests at the beginning.

It is important to create a context-memory file (for example, `AGENTS.md`). Actively manage context, interrupt the agent if it deviates from the course, and use it as a learning partner by asking open-ended questions to choose one of several options. Write everything to logs as much as possible. Use screenshots to explain problems.

Run multiple agents in parallel on separate tasks without conflicts. Keep the code as simple as possible, ask for refactoring and simplification repeatedly. It's better to generate your own code instead of connecting more and more third-party libraries.

Always read generated code! Use a second agent without chat history to check the code. Never delegate testing entirely to AI, always check the code yourself and thoroughly review all changes (diff).

Start with cheaper models (Sonnet 4), moving to more expensive ones (Opus 4.1) only when necessary. Claude models can adjust reasoning effort via the keywords think < think hard < think harder < ultrathink. For interfaces, asking for "more beautiful" or "more elegant" results works.

AI integration experience in production processes
https://www.sanity.io/blog/first-attempt-will-be-95-garbage
The author used Cursor for 18 months for code generation, and Claude Code for the last 6 weeks. The transition to Claude Code took only a few hours to figure out how to work with it.

Author's mental model: AI is a junior developer who never learns. Costs for Claude Code can be a significant percentage of an engineer's monthly salary ($1000-1500 per month).

Conclusions:

  • Rule of three attempts: forget about perfect code on the first try; 95% will be garbage, but that first pass helps the agent identify the real requirements and constraints. The second attempt might yield 50% good code, and by the third, it will most likely implement something that can be iterated on and improved.
  • Retain knowledge between sessions: update Claude.md with architectural decisions, typical code patterns, "gotchas", and links to documentation. And configure MCP to retrieve data from Linear, Notion/Canvas, Git/GitHub history, and others.
  • Teams and AI agents: the author uses Linear. It's important never to run multiple agents on the same problem space. Explicitly mark correct, human-edited code.
  • Code review: agents --> agent overseers --> team. Always check, especially for complex state management, performance-critical, and security-critical sections.

The main problem today is that AI does not learn from mistakes. Solution: better documentation, clearer instructions.

The author emphasizes that giving up "ownership" of the code (since AI wrote it) leads to more objective reviews, faster removal of unsuccessful solutions, and lack of ego during refactoring.

Discussion on HN
https://news.ycombinator.com/item?id=45107962
LLMs have already become a standard tool for engineers. It is confirmed that several iterations with an LLM (3 or more) are typically needed to achieve an acceptable result. LLMs do not handle complex code well on their own; they do best on extremely simple or templated tasks. Code written by LLMs always requires careful monitoring and editing.

There is concern that excessive use of LLMs may erode a developer's own thinking skills. If a junior developer simply "vibe-codes" with an LLM without deep understanding, it undermines the trust of senior colleagues.

Some users create additional "critical agents" (LLMs trained to find errors) to check code written by the main agent. Breaking down complex tasks into small, manageable parts is key to success. Using TDD with LLMs works very well.

A sample instruction file CLAUDE.md is available at https://www.dzombak.com/blog/2025/08/getting-good-results-from-claude-code/

However, there is a discussion about the effectiveness of this file for providing context: some consider it useful for AI's long-term memory, while others argue that the AI often ignores its content. From the HN Discussion, further conclusions can be drawn:

  • The best results are achieved when the developer spends significant time creating very clear, step-by-step specifications (documents describing exactly how a project should be implemented). This requires more initial effort but allows Claude Code to follow clear instructions and generate more accurate and organized code.
    • Some users employ other AIs (e.g., ChatGPT, Gemini) for brainstorming, creating specifications, critiquing, and refining them before submitting the final document to Claude Code.
    • Integration with code quality tools (husky, lint-staged, commitlint) helps maintain standards.
  • Claude Code, despite marketing, does not "think" in a human sense; at any step, it can make strange mistakes or "hallucinate."
    • Since we have a limited context window, it's better to work with Claude Code in small, sequential steps. Ask it to write one function or make one change, then check the result, fix errors, commit, and only then move to the next step.
    • Some successfully use Claude to write unit tests, and then ask it to write minimal code to pass those tests, similar to TDD (Test-Driven Development).
  • Some users have noticed that asking Claude to review its own work can be surprisingly fruitful, as it often points out shortcomings itself.

An analysis of how Claude Code works
https://minusx.ai/blog/decoding-claude-code/
The author believes that the CLAUDE.md file is key for conveying user context and preferences (e.g., which folders to ignore, which libraries to use). Its content is sent with each user request. Phrases like IMPORTANT, VERY IMPORTANT, NEVER, and ALWAYS are still effective for preventing undesirable behavior.
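
An illustrative CLAUDE.md sketch along these lines (the contents are hypothetical, not taken from the article):

```markdown
# Notes for Claude Code

- IMPORTANT: use pnpm, not npm, for all package operations.
- ALWAYS run the test suite before declaring a task done.
- NEVER edit files under vendor/ or dist/.
- Ignore the legacy/ folder; it is scheduled for removal.
```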

Checkpoints for Claude Code
https://claude-checkpoints.com/
The project adds checkpoints to Claude Code, similar to Cursor. The main goal is to ensure that we do not lose correctly generated code by tracking changes and enabling the restoration of previous project states. It includes a visual diff viewer.

Checkpoints are created automatically after Claude completes tasks. It integrates with Claude Desktop via the Model Context Protocol (MCP).

Auggie
https://www.augmentcode.com/product/CLI
Augment Code CLI is an attempt to copy Claude Code, this time from the developers of Augment Code. As a standalone product, it's hard to say whether it has value on its own; it's more likely an addition to their main product, tied to their pricing plans.

An interesting feature is the hotkey for improving a request. Otherwise, it has basic terminal agent functions, with MCP support like everywhere else now. The models are Claude Sonnet 4 and GPT-5.

Privacy mode in TRAE IDE v2.1.0
https://docs.trae.ai/ide/privacy-mode?_lang=en
One of the reasons I didn't use ByteDance's Chinese TRAE was their unclear policy of "we do what we want with your code." Now they have added the ability to activate privacy mode in the settings. I think this is a response to Alibaba's Qoder.

Privacy mode is only in effect while you are logged in to your account; after logging out, it no longer applies. It also does not apply to SOLO mode, which is still in the testing phase.

SOLO mode
https://docs.trae.ai/ide/solo-mode
Appeared in TRAE v2.0.0. Available only for users with a Pro subscription and SOLO Code. This is a highly automated mode that independently plans and executes the entire development cycle: from requirements analysis, code and testing generation to results preview and deployment. Works with Figma / Vercel / Supabase / Stripe.

https://www.youtube.com/watch?v=4JObEIIK8Uo

SOLO Builder is an agent for creating web applications. Analyzes requirements → generates PRD → writes code → provides a preview. During development, AI independently orchestrates tools such as editor, browser, terminal, and doc viewer (project description with diagrams). Sometimes you need to click confirmation buttons.