CodeWithLLM-Updates
-
🤖 AI tools for smarter coding: practical examples, step-by-step instructions, and real-world LLM applications. Learn to work efficiently with modern code assistants.

Claude Code on the Web
https://www.anthropic.com/news/claude-code-on-the-web
A response to the popularity of Google Jules. The online service allows delegating several tasks to Claude Code in parallel from the browser. A new interface is also available as an early version in the mobile app for iOS. Currently in beta testing and available for Pro and Max plans.

Users connect their GitHub repositories and describe tasks; the system then writes code autonomously, tracking progress in real time and automatically creating pull requests. Each task is executed in an isolated environment ("sandbox") to protect code and data.

https://www.youtube.com/watch?v=hmKRlgEdau4

Claude Haiku 4.5
https://www.anthropic.com/news/claude-haiku-4-5
The updated Haiku model, known for being fast and cheap, now matches the code-generation performance of the previous-generation Sonnet 4, while being twice as fast (160-220 tokens/sec) and roughly a third of the price.

The most common architectural approach will be to use a smarter model (e.g., Sonnet 4.5) as an "orchestrator" that breaks a complex problem into smaller subtasks, which are then executed in parallel by a "team" of several Haiku 4.5 instances.
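
To make this concrete, here is a minimal sketch of the orchestrator pattern using the Anthropic TypeScript SDK. The model IDs ("claude-sonnet-4-5", "claude-haiku-4-5") and the newline-separated plan format are assumptions for illustration, not an official recipe:

```typescript
// Sketch of an orchestrator/worker setup with the Anthropic TypeScript SDK.
// Model IDs and the one-subtask-per-line plan format are illustrative assumptions.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function ask(model: string, prompt: string): Promise<string> {
  const msg = await client.messages.create({
    model,
    max_tokens: 2048,
    messages: [{ role: "user", content: prompt }],
  });
  const first = msg.content[0];
  return first.type === "text" ? first.text : "";
}

async function orchestrate(task: string): Promise<string> {
  // 1. The "smart" model breaks the task into independent subtasks.
  const plan = await ask(
    "claude-sonnet-4-5",
    `Split this coding task into independent subtasks, one per line:\n${task}`
  );
  const subtasks = plan.split("\n").filter((line) => line.trim().length > 0);

  // 2. A "team" of Haiku workers handles the subtasks in parallel.
  const results = await Promise.all(
    subtasks.map((sub) => ask("claude-haiku-4-5", `Implement this subtask:\n${sub}`))
  );

  // 3. The orchestrator merges the partial results into one answer.
  return ask(
    "claude-sonnet-4-5",
    `Combine these partial results into one coherent change:\n${results.join("\n---\n")}`
  );
}

orchestrate("Add input validation to every API endpoint").then(console.log);
```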

Haiku 4.5 appears to make code changes significantly more accurately compared to GPT-5 models.


Skills for Claude Models
https://www.anthropic.com/news/skills
https://simonwillison.net/2025/Oct/16/claude-skills/
Essentially, an Agent Skill is a folder containing onboarding instructions, resources, and executable code. This lets Claude be specialized for particular tasks, such as working with internal APIs or adhering to coding standards. Skills are integrated into all Claude products, and a new /v1/skills API endpoint has appeared for managing them. In Claude Code, they can be installed as plugins from the marketplace or manually by adding them to the ~/.claude/skills folder.

Simon Willison believes the new feature is a huge breakthrough, potentially more important than the MCP protocol. Unlike MCP, which is a complex protocol, a Skill is just a folder with a Markdown file containing instructions and optional scripts. This approach doesn't invent new standards but relies on the existing ability of LLM agents to read files and execute code, making it incredibly flexible and intuitive. Since they are simple files, they are easy to create and share.

https://www.youtube.com/watch?v=kHg1TfSNSFI

Compared to MCP, Skills have a key advantage in token efficiency: instead of loading thousands of tokens to describe tools, the model reads only a brief description of the skill, and loads the full instructions only when needed.
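
To illustrate this progressive-disclosure idea (this is not Anthropic's actual implementation), here is a sketch of an agent harness that indexes only each skill's name and short description up front and reads the full SKILL.md body only when a task calls for it. The ~/.claude/skills layout and frontmatter fields follow the convention described above; everything else is assumed:

```typescript
// Illustrative sketch of progressive disclosure for skills; not Anthropic's code.
// Assumed layout: ~/.claude/skills/<skill-name>/SKILL.md with frontmatter between
// "---" markers containing "name:" and "description:" lines, then full instructions.
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

const SKILLS_DIR = join(homedir(), ".claude", "skills");

interface SkillSummary {
  name: string;
  description: string;
  path: string; // the full SKILL.md is read only on demand
}

// Cheap step: load only the short frontmatter descriptions (a few tokens each).
function indexSkills(): SkillSummary[] {
  return readdirSync(SKILLS_DIR, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => {
      const path = join(SKILLS_DIR, entry.name, "SKILL.md");
      const header = readFileSync(path, "utf8").split("---")[1] ?? "";
      const field = (key: string) =>
        header.match(new RegExp(`^${key}:\\s*(.+)$`, "m"))?.[1].trim() ?? "";
      return { name: field("name") || entry.name, description: field("description"), path };
    });
}

// Expensive step: pull the full instructions into context only when needed.
function loadSkill(skill: SkillSummary): string {
  return readFileSync(skill.path, "utf8");
}

const skills = indexSkills();
console.log("Known skills:", skills.map((s) => `${s.name}: ${s.description}`));
// If the current task mentions PDFs, load just that one skill's body:
const match = skills.find((s) => s.description.toLowerCase().includes("pdf"));
if (match) console.log(loadSkill(match).length, "characters loaded on demand");
```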

https://news.ycombinator.com/item?id=45607117
https://news.ycombinator.com/item?id=45619537
Many commentators note that Skills are essentially just a way to dynamically add instructions to the model's context when needed. Proponents argue that this simplicity is precisely the point: Skills represent a new paradigm for organizing and dynamically assembling context. There is broad agreement that this is a more successful, lightweight alternative to MCP that avoids context overload and burning thousands of tokens.

Users have noticed that Skills are essentially a formalization of an existing AGENTS.md (or CLAUDE.md) pattern, where instructions for an agent are collected in one file, telling it where to look when something is needed. But Skills make this process more standardized, organized, and scalable. The LLM knows the standard and can help in generating a Skill.

Evaluating AI Assistants
https://www.youtube.com/watch?v=tCGju2JB5Fw

Three developers (Wes, Scott, and CJ) discuss and rank various tools, sharing their own experiences, evaluating interface usability, the quality of generated code, and the unique capabilities of each tool.

Services such as Replit and Lovable received specific criticism for their aggressive and sometimes opaque marketing strategy involving influencers. For serious development, CLI tools or IDEs are more suitable, while browser-based solutions are ideal for quick experiments.

Ultimately, Claude Code, Open Code, and ChatGPT received S-tier. Claude Code was praised for strictly following instructions and planning its work; Open Code for its openness and support for custom API keys; and ChatGPT remains indispensable for quick queries that don't need the context of the whole project. Most other tools were rated as average: useful, but without unique advantages.


Vibe Coding Ranking
https://www.youtube.com/watch?v=ebacH8tdXug

This video is a humorous response to the previous one, and the author immediately warns that his ranking should not be taken seriously. Theo evaluates tools not by technical capability but by so-called "vibe coding": the main criterion is how well a tool lets you create something without looking at the code or understanding technical details.

The author jokes that "true vibe coders" avoid seeing code. Therefore, Cursor, VS Code Copilot, Open Code, and Codex receive the lowest rating because they are assistants for real developers who require active participation, writing, and reviewing code. They destroy the "vibe."

The highest rating went to the platform that abstracts away code the most: v0 from Vercel. It has a simple interface, replaces technical terms (e.g., "fork" with "duplicate"), and offers powerful integrations that can be configured with a few clicks without any knowledge of APIs.

Surprisingly, Claude Code received an A-tier for its ability to perform tasks autonomously, hiding the technical implementation from the user.

Almost all modern AI coding tools have added the Claude Sonnet 4.5 model.

Cursor 1.7
https://cursor.com/changelog/1-7
Responding to Kiro and GitHub Spec Kit, Cursor has redesigned its Planning mode; it now creates descriptions, plans, and task lists before starting work.

https://www.youtube.com/watch?v=WInPBmCK3l4

The terminal finally runs commands in a separate sandbox, and Windows PowerShell interaction has been fixed. The agent can also open a browser, take screenshots, and read images from disk. The OS taskbar now shows a list of agents and what they are doing.

Kiro v0.3.0
https://kiro.dev/changelog/spec-mvp-tasks-intelligent-diagnostics-and-ai-commit-messages/
Kiro has finally replaced the separate limits for its two modes with unified credits that count everywhere, so it now works like Windsurf. Sonnet 4.5 has been added, though oddly at the same cost multiplier as Sonnet 4; only Auto mode costs 1 credit. You still can't drag files or folders into the chat area to reference them as context; they can only be added via hashtag.

Codex Github Action
https://github.com/openai/codex-action
OpenAI announced at DevDay 2025 that Codex has exited beta and is now stable with enhanced capabilities. There is a Codex GitHub Action, a built-in widget gallery, and MCP support. A Codex SDK is available for integration.

OpenAI is also transforming ChatGPT into an "operating system" for AI agents. You can now write your own applications and agents inside ChatGPT, connect payments, authorization, and metrics.

Gemini CLI Extensions
https://blog.google/technology/developers/gemini-cli-extensions/
Google has launched a separate website for Gemini CLI https://geminicli.com/ with a documentation section. Extensions for Gemini CLI are a new feature that allows developers to customize and connect various programs, integrating services like Dynatrace, Figma, Stripe, Snyk, and others.

The system is open, allowing anyone to create their own extensions, and Google has already released a set for integration with its products (Google Cloud, Firebase, Flutter).

Jules CLI
https://jules.google/docs/changelog/
The Jules cloud agent has a rather clunky web interface that makes it hard to tell what it is currently doing, but it offers as many as 15 tasks a day without paying for tokens. You can now install it locally with npm install -g @google/jules; jules help lists all commands. Windows is not supported.

The CLI allows you to create tasks, view active sessions (jules remote list), and monitor them from the terminal in a convenient visual format. It supports scripting in combination with utilities such as gh, jq, or cat.

There is an option to take code from an active Jules session and apply it to a local machine for immediate testing of changes without waiting for a commit to GitHub.

ALSO

  • From September 30, 2025, Jules can learn from interaction: save settings, prompts, and corrections.
  • From September 29, 2025, you can precisely specify to Jules which files to work with for any task.
  • From September 23, 2025, Jules can read and respond to comments in pull requests.

Code Mode
https://blog.cloudflare.com/code-mode/
A new approach called "Code Mode" improves how AI interacts with external tools. Instead of forcing Large Language Models (LLMs) to directly "call tools" via the MCP protocol, which is unnatural for them, Cloudflare proposes having them write TypeScript code that accesses these tools through an API.

The system automatically converts tools available via the MCP protocol into a clear TypeScript API with documentation. The AI-generated code is executed in a secure isolated "sandbox." The MCP protocol itself remains important, as it provides a standardized way to connect to services, obtain their descriptions, and securely authorize, allowing the system to manage access without directly involving the AI.

This method is much more effective because LLMs are trained on vast amounts of real code and are better at writing it than at using specialized, artificially constructed tool-calling commands.

The technological basis for this "sandbox" is the Cloudflare Workers platform, which uses lightweight and extremely fast V8 isolates instead of slow containers. This ensures high efficiency and security: the code is completely isolated from the internet and can only interact with permitted tools.
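
As a purely hypothetical illustration, the code an LLM writes in Code Mode might look something like the sketch below. The codemode module and its github/slack methods are invented stand-ins for the TypeScript bindings generated from whichever MCP servers are connected; the point is that the model composes ordinary code instead of emitting one opaque tool call per step:

```typescript
// Hypothetical example of code an LLM might write in Code Mode.
// The "codemode" module and its github/slack methods are illustrative stand-ins
// for the TypeScript API generated from the connected MCP servers.
import { github, slack } from "codemode";

export default async function run() {
  // Ordinary TypeScript: loops, filtering, and intermediate variables
  // instead of a chain of individual tool calls routed through the model.
  const issues = await github.listIssues({ repo: "acme/website", state: "open" });

  const stale = issues.filter(
    (issue) => Date.now() - new Date(issue.updatedAt).getTime() > 30 * 24 * 3600 * 1000
  );

  for (const issue of stale) {
    await github.addLabel({ repo: "acme/website", number: issue.number, label: "stale" });
  }

  // Only the final summary needs to flow back through the model's context.
  await slack.postMessage({
    channel: "#maintenance",
    text: `Labelled ${stale.length} stale issues out of ${issues.length} open.`,
  });

  return { open: issues.length, stale: stale.length };
}
```

Only the final summary has to pass back through the model's context, which is exactly where the token savings come from.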

Claude Code 2.0.0
https://www.npmjs.com/package/@anthropic-ai/claude-code
Redesigned UI. Ctrl-R to search history. Tab to switch thinking modes (persists between sessions). /usage command to view plan limits. /rewind command to revert code changes (finally, native checkpoints).

Added a native VS Code extension. Claude Code SDK is now called Claude Agent SDK and allows others to create their own intelligent "agents" based on this technology.

Claude Sonnet 4.5
https://www.anthropic.com/news/claude-sonnet-4-5
They claim it's the best model for programming today. Safety is specifically emphasized: the model has become "more obedient" and less prone to errors or harmful actions. Importantly, this new, more powerful version is already available to everyone at the same price as the previous one.

AI Programming University
https://cline.bot/learn
Cline also offers a resource for beginners in this field. The course consists of two main modules that sequentially introduce key concepts.

The first module, "Prompting", focuses on communicating with AI assistants. It covers writing clear, specific prompts to get good results, refining them iteratively, and using system prompts to configure coding style, security, and workflows.

The second module, "LLM Fundamentals", delves into the technical aspects of AI. It explains what Large Language Models (LLMs) are, how they process and generate code. It teaches how to evaluate and select the best model for specific tasks, considering criteria such as speed, cost, accuracy, and privacy.

AI Foundations
https://www.youtube.com/playlist?list=PLuI2ZfvGpzwCEXrl_K6bW5OqNpZq3HkMa
After further simplifying its website design, Cursor has now published a series of beginner videos called "AI Foundations" on its YouTube channel; it seems they are looking for new customers.

Topics of each video:

  1. "How AI Models Work": This video explains the basic principles of how artificial intelligence models function, for example, how they process information and generate responses.
  2. "Hallucinations": This section discusses the phenomenon of "hallucinations" in AI — when a model generates confident but untrue or fabricated information that is not based on the input data.
  3. "Tokens & Pricing": The video is dedicated to tokens — units into which text is broken down for processing by the model. It also explains how the number of tokens affects the cost of using AI models via API.
  4. "Context": It talks about the importance of context for AI models. This likely refers to the "context window" — the amount of information (previous messages) that a model can remember during a conversation to provide relevant responses.
  5. "Tool Calling": This video explains how modern AI models can interact with external tools and APIs (e.g., search engines, calculators, databases) to perform more complex tasks.
  6. "Agents": The last video reveals the concept of AI agents — autonomous systems that can independently plan and execute a sequence of actions to achieve a set goal, using various tools.

GitHub Copilot CLI
https://github.com/github/copilot-cli
https://github.blog/changelog/2025-09-25-github-copilot-cli-is-now-in-public-preview/
GitHub Copilot can now also be used from the terminal. The CLI is currently in public preview. It supports Linux and macOS, with experimental Windows support (via PowerShell v6+).

By default, it uses the Claude Sonnet 4 model, but also supports GPT-5 via the environment variable COPILOT_MODEL=gpt-5; there is no slash command to switch the model. AI response blocks can be expanded using Ctrl+r.

It uses the GitHub MCP server by default, providing access to repositories, issues, and pull requests through agent queries.

https://www.youtube.com/watch?v=7tjmA_0pl2c

GitHub Spec Kit and Copilot CLI can be used together. The video shows how to create custom slash commands using additional instruction files.

llmswap
https://github.com/sreenathmmenon/llmswap
A universal tool (SDK and CLI) for developers, allowing easy interaction with various AI providers (OpenAI/GPT-4o, Claude, Gemini, Groq, etc.) directly from the terminal.

This is a Python project; install it with pip install llmswap.

Directly from the terminal you can:

  • Generate code and commands: for example, llmswap generate "sort files by size" outputs du -sh * | sort -hr.
  • Create functions and entire scripts: for example, llmswap generate "Python function to read JSON" produces ready-to-use code with error handling.
  • Perform code reviews, debug errors, and analyze logs.

Use cases and examples
https://sreenathmenon.com/blog/2025-09-04-stopped-alt-tabbing-chatgpt-while-coding/
The author emphasizes the main advantage: llmswap brings AI directly into the terminal. Instead of constantly switching between a code editor, a browser with ChatGPT, a search engine, and other assistants, you get answers instantly without losing focus.

Killer feature: integration with Vim (and other editors). Vim's :r ! command reads the output of any console command into the current buffer, so for example :r !llmswap generate "Python function to read JSON" inserts the generated snippet right below the cursor line.

The tool can be part of the workflow for:

  • DevOps: Generating docker-compose.yml, Kubernetes configurations, systemd services.
  • Database Administration: Creating complex SQL queries (find duplicate emails in table), commands for MongoDB.
  • Log Analysis: Creating commands for awk, grep, zgrep to analyze regular and archived logs.
  • and much more.

Codex and GPT-5-Codex Update
https://openai.com/index/introducing-upgrades-to-codex/
OpenAI has unified Codex into a single ecosystem. Access to the updated tool is included in ChatGPT Plus, Pro, Business, Edu, and Enterprise subscriptions. It cannot be used for free.

Codex is now available in the terminal (Codex CLI), as the Codex IDE extension, on the web (Codex cloud), on GitHub, and in the ChatGPT mobile app. There is also an option to switch to the GPT-5-Codex model.

https://www.youtube.com/watch?v=yqtujbev9zI

The new GPT-5 variant, GPT-5-Codex, is specially optimized for independently performing complex, long-running tasks (tested on runs of up to 7 hours), such as creating projects from scratch, refactoring, and fixing bugs. Special attention has been paid to code review.

https://news.ycombinator.com/item?id=45252301
https://news.ycombinator.com/item?id=45253458
People are debating whether Codex CLI is finally better than Claude Code. It seems GPT-5 still follows instructions better and writes clearer and simpler code.

ARM package manager
https://github.com/jomadu/ai-rules-manager
A rather simple idea; we'll see how it develops. It's a package manager for AI rules: keep identical rule sets across different projects and automatically synchronize them with a source of truth. Git can be used as that source, similar to awesome-cursorrules or other rule collections.

For some reason, I didn't see any mention of https://agents.md/, which offers a standard way to attach rules and context for various AI systems.

In 2023, Google seemed very confused about AI code generation, but over 2024 and 2025, they managed to deliver both good models and cover every niche of AI programming with their tools. Code can be generated from both the basic and advanced web interfaces of the Gemini chat. There are two proprietary IDEs (though browser-based), plugins for popular IDEs, a terminal agent, and a cloud agent.


Gemini web chat
https://gemini.google.com/
The web chat can generate code; the 2.5 Pro model is available for free with limited usage. If you select Canvas under Tools, the chat moves to the left and the rest of the screen becomes a code-and-preview area; a console can also be shown. In preview mode, the panel offers options to (A) add AI functionality or (B) highlight an element or part of the screen with the mouse and ask follow-up questions about it.

Suitable for: simple visualizations, interactive explanations, single-page website prototypes.

Build apps with Gemini
https://aistudio.google.com/apps
A significantly better web version, similar to Canvas mode. It uses the 2.5 Pro model with React and Tailwind, with no option to change them. The application can be opened in full screen and deployed to Google Cloud.

Suitable for: great for generating UI, small applications.


Stitch from Google Experiments
https://stitch.withgoogle.com/ from https://labs.google/
A tool in limited access. It generates mobile and desktop interfaces from a text description or a sketch image. After a request, the system analyzes the requirements and proposes a set of screens and their content, which can be exported as code (HTML with Tailwind classes) or images; it also appears to support export to Figma.

Opal from Google Experiments
https://opal.withgoogle.com/
If you draw an algorithm diagram, it becomes a mini-program that can be shared.


Gemini Code Assist
https://codeassist.google/
This page covers many projects. In addition to the free plan, Standard ($19) and Enterprise ($45) plans are available with higher limits for Gemini CLI and Agent Mode. On the free plan, Google by default collects and uses your data to train its AI models.

https://marketplace.visualstudio.com/items?itemName=Google.geminicodeassist and https://plugins.jetbrains.com/plugin/24198-gemini-code-assist
The plugin for VS Code and JetBrains-compatible IDEs (as well as Android Studio) can, in Agent Mode, perform multi-step tasks in which the user reviews and approves an action plan. It now supports MCP.

Suitable for: assisting manual coding, answering questions, and generating needed fragments and tests.

Gemini CLI https://github.com/google-gemini/gemini-cli
Google's answer to Claude Code: an autonomous agent that runs in the terminal via the gemini command. The code is open source, and macOS, Linux, and Windows are supported. It can process PDFs and images, use Google Search, and pull additional instructions from GEMINI.md. It can create a shadow git repository at ~/.gemini/history/<project_hash> where it logs the final state of each development stage; this is disabled by default and can be enabled in settings.

Suitable for: step-by-step automatic project generation, with checks at each step and additional edits.

Jules
https://jules.google.com/
Launched as an experiment, it quickly moved to beta, with a subscription available from August 6, 2025. This is an autonomous agent (or rather two: an executor and a critic) that runs on a Google virtual machine with a 20GB disk. The idea is that it creates its own plan, works step by step on the selected GitHub repository for an extended period, and then sends a report to your browser. There are 15 free tasks per day.

Suitable for: I have little experience with it; you need a feel for what level of task complexity won't confuse the autonomous system and set tasks on the repository accordingly.


Cloud Shell Editor
https://shell.cloud.google.com/
An IDE that does not require local setup and is available directly from the browser, integrated with Google Cloud Platform. The VM has a 5GB disk. Where the Copilot button sits in VS Code, there is a button to activate the Gemini Code Assist panel. Gemini CLI can be called from the terminal.

Suitable for: manual and automatic code work, if Google Cloud Platform is needed.

Firebase Studio (formerly IDX)
https://studio.firebase.google.com
Project IDX emerged as an experimental browser-based IDE. It was announced on August 8, 2023, for closed testing, opened for public testing on May 14, 2024, and rebranded to Firebase Studio on April 9, 2025. This is a full-fledged IDE with full access to the code and the ability to modify it.

Requests can be automatically improved; first, a plan and documentation will be developed, and then the code will be generated. Templates are available for Astro, Go, Python/Flask, Solid.js, Node.js, with support for Flutter and React Native. Integration with GitHub and Google Cloud services.

Suitable for: as an IDE for both manual and automatic work with complex projects.

GLM plan for programming
https://docs.z.ai/devpack/overview
A subscription package created specifically for programming with artificial intelligence. It provides access to the powerful GLM-4.5 and GLM-4.5-Air models in popular tools like Claude Code, Cline/Roo/Kilo, OpenCode, and others.

The cost starts from $3 per month. Significantly higher usage limits compared to standard plans like Claude Pro. The quota is renewed every 5 hours. The company's services are located in Singapore. Z.ai does not store user queries or generated data.

DeepSeek in Claude Code
https://api-docs.deepseek.com/guides/anthropic_api
By setting ANTHROPIC_BASE_URL to https://api.deepseek.com/anthropic, you can point Claude Code at DeepSeek's Anthropic-compatible API and use DeepSeek models.
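
The same Anthropic-compatible endpoint can also be exercised outside Claude Code, for example from the Anthropic TypeScript SDK by overriding the base URL. A minimal sketch, assuming the endpoint takes your DeepSeek API key and DeepSeek's usual model name "deepseek-chat" (check the linked guide for the exact details):

```typescript
// Minimal sketch: pointing the Anthropic TypeScript SDK at DeepSeek's
// Anthropic-compatible endpoint. The key source and the "deepseek-chat" model
// name are assumptions based on DeepSeek's own API naming.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "https://api.deepseek.com/anthropic", // endpoint from the DeepSeek guide above
  apiKey: process.env.DEEPSEEK_API_KEY,          // a DeepSeek key, not an Anthropic one
});

async function main() {
  const reply = await client.messages.create({
    model: "deepseek-chat", // assumed model name; see DeepSeek's guide for exact IDs
    max_tokens: 1024,
    messages: [{ role: "user", content: "Write a binary search in TypeScript." }],
  });
  const first = reply.content[0];
  console.log(first.type === "text" ? first.text : first);
}

main();
```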

OpenAI Codex
On its YouTube channel, OpenAI released a video about Codex during a series of virtual "Build Hours" events.

https://www.youtube.com/watch?v=WvMqA4Xwx_k

OpenAI's goal is to make Codex a single, universal agent for code generation. They recommend treating Codex as a full-fledged team member, or even delegating tasks to it while you act as the architect or manager.

Codex can operate in two main modes: locally and in the cloud. Both environments are integrated via a ChatGPT account and synchronized with GitHub repositories. Task delegation is available through the web interface and the ChatGPT mobile app (iOS), allowing users to "run tasks when inspiration strikes," even away from the workstation.

There is a new extension compatible with VS Code, Cursor, and other VS Code forks. Codex can automatically check pull requests on GitHub; this is not limited to static analysis, as the agent can run code, check logic, and validate changes.

It is recommended to organize repositories with smaller files. The presence of tests, linters, and formatters allows the agent to independently detect and correct errors. For complex tasks, it's better to generate detailed plans in Markdown (plan.md). Using an agents.md file to document architectural decisions helps Codex immediately understand what it's working with.