CodeWithLLM-Updates

Mintlify Autopilot
https://www.mintlify.com/blog/autopilot
An AI-powered system that monitors changes in your repository. On every push, it analyzes what needs to be updated in the documentation (both for humans and for AI agents). In the Autopilot dashboard, it shows which changes might require documentation updates. Then the Mintlify agent automatically generates a draft that you can review and refine. It takes into account the code context and the existing tone/style of your documentation.

Code Wiki
https://codewiki.google/
Google launched Code Wiki (currently in public preview) — a platform designed to solve the problem of AI (and humans) reading and understanding existing codebases. The system creates and continuously maintains a structured wiki page for the entire repository.

Key features: full automation, Gemini-powered chat, hyperlinked answers that point directly to code files. The system automatically generates and keeps up-to-date architectural diagrams, class diagrams, sequence diagrams, and detailed descriptions.

There is a waitlist for the upcoming version of Code Wiki that will allow teams to run the exact same system locally and securely on internal, private repositories.

Qoder Repo Wiki
https://docs.qoder.com/user-guide/repo-wiki
A feature inside the Qoder IDE that automatically generates structured documentation for a project (up to 10,000 files per project, in English and Chinese) and continuously tracks changes in both the code and the documentation itself.

It deeply analyzes project structure and implementation details, providing rich context that helps AI agents work more effectively. Wiki generation is fully dynamic.

Full Git synchronization is supported. Generated content is stored in language-specific directories (e.g., repowiki/zh/, repowiki/en/), which can be committed and pushed like regular code. The initial wiki is created with one click (generation takes up to ~120 minutes for 4,000 files). After that, the system watches for code changes and, when it detects modifications, can update only the affected sections.
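The idea of updating only the affected sections can be illustrated with a small sketch. This is not Qoder's actual algorithm: the index that maps wiki pages to source files, the file names, and the function name are all hypothetical, and only the repowiki/en/ directory layout comes from the description above.

```python
from pathlib import PurePosixPath

# Hypothetical index: wiki page -> source files it was generated from.
WIKI_SOURCES = {
    "repowiki/en/auth.md": {"src/auth/login.py", "src/auth/tokens.py"},
    "repowiki/en/billing.md": {"src/billing/invoice.py"},
    "repowiki/en/overview.md": {"src/auth/login.py", "src/billing/invoice.py"},
}


def affected_pages(changed_files, wiki_sources=WIKI_SOURCES):
    """Return the sorted wiki pages whose source files overlap the change set."""
    # Normalize paths so "./src/x.py" and "src/x.py" compare equal.
    changed = {PurePosixPath(f).as_posix() for f in changed_files}
    return sorted(
        page for page, sources in wiki_sources.items() if sources & changed
    )


if __name__ == "__main__":
    print(affected_pages(["src/auth/tokens.py"]))
```

Only pages that share at least one source file with the change set are returned; everything else stays untouched, which is what makes incremental updates much cheaper than a full regeneration.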

Originally the feature worked only with Git repositories, but as of December 2, 2025, it can also generate wikis from local projects without Git.

DeepWiki (by Cognition AI)
https://deepwiki.com/
A free AI tool that turns any GitHub repository (public or private) into a Wikipedia-style knowledge base. It analyzes code, READMEs, and configs, then creates structured pages with architectural and flow diagrams, interactive code hyperlinks, and a natural-language chat interface for asking questions.

Already supports >30,000 open repositories with automatic updates after new commits. An open-source version is available for local/self-hosted deployment.

Mistral Devstral 2 and Vibe
https://mistral.ai/news/devstral-2-vibe-cli
The European company Mistral AI is known for LLMs developed independently of the US and China. It has updated its coding model and finally released its own CLI. These announcements are extremely important for the open-source AI ecosystem in software development.

https://openrouter.ai/mistralai/devstral-2512:free
The new generation of models consists of Devstral 2 (123B) and Devstral Small 2 (24B), released under permissive licenses: modified MIT for Devstral 2 and Apache 2.0 for Devstral Small 2. Devstral 2 scores 72.2% on SWE-bench Verified, an impressive result for an open model.

The Small version can run locally on NVIDIA hardware, while the larger model, being dense rather than MoE, requires serious hardware such as a Mac Studio or several 3090/4090 GPUs.

Currently, Devstral 2 is offered for free via API. The model is already available in Kilo Code and Cline. According to feedback, it is quite mediocre at generating websites, frontend, and animation — it works better with small tasks involving local Python scripts.

https://help.mistral.ai/en/articles/496007-get-started-with-mistral-vibe
Mistral Vibe CLI is an open-source command-line tool similar to Claude Code. It runs on Windows, macOS, and Linux and is built on Devstral models. It can also be run in Zed. It features interface themes, Git integration, MCP support, and agents with custom settings, and it supports both interactive and autonomous operation.

https://news.ycombinator.com/item?id=46205437
Commenters noted that the name "Vibe" makes the product sound geared towards vibe-coding ("played around with an agent and let it churn something out") rather than controlled work by a professional programmer. Some call this message outright "the opposite" of what real work needs: augmenting humans, not replacing the process with "chat + tools, good luck."

Cursor Visual Editor
https://cursor.com/blog/browser-visual-editor
Cursor introduced a visual editor with a "Point and Prompt" feature: you can simply click on any interface element and describe in text what needs to be changed. It also allows manipulating the site structure using drag-and-drop elements in the DOM tree, changing button order or grid settings.

https://www.youtube.com/watch?v=1S8S89X-xbs

The editor's sidebar provides visual control over component properties (props) and styles: from typography sliders to a color palette. The update aims to blur the line between design and programming, allowing developers to focus on ideas rather than mechanical code work.

Claude Code Plugins
https://code.claude.com/docs/en/plugin-marketplaces
Anthropic launched a plugins marketplace, seemingly in response to a similar one in Gemini CLI. It is not a separate website with an App Store-like interface. It's a system within Claude Code itself, where marketplaces are plugin catalogs (often based on GitHub repositories) that are added and managed via slash commands.

https://www.youtube.com/watch?v=1uWJC2r6Sss

Also, there are now prompt suggestion variants and a hotkey for switching models during a prompt. Subagents can work in parallel. Improved usage statistics and a visual fill-indicator for the context window have been added.

You can now run Claude Code tasks directly from the Claude Android mobile app. This is not a full-fledged terminal on the phone, but an asynchronous integration where Claude runs in the cloud.

Kiro Powers
https://kiro.dev/docs/powers/
Kiro is testing Powers, a concept that addresses context-window clutter through dynamic tool activation: the system analyzes the user's query and enables only the necessary "knowledge pack." This is very similar to Anthropic's Skills.

When many tools (MCP servers) are connected to an agent, it is forced to load hundreds of function descriptions at once. This can "eat up" as much as 40% of the context window before work even begins, and it leads to irrelevant advice. A Power, by contrast, is a ready-made set containing instructions (how and when to use the tools), server configuration, and automated scenarios.

For example, if you mention "payment," the Power for Stripe is activated, providing specific knowledge about the API and security. As soon as you move to working with a database, Stripe tools are disabled, and instead, the Power for Supabase or Neon is loaded. This allows the agent to remain fast, focus on a specific topic, and produce higher quality code.
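The activation logic described above can be sketched roughly as follows. This is an illustration only, not Kiro's implementation: the `Power` class, the keyword sets, and the tool counts are hypothetical, and real activation may be driven by more than keyword matching.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Power:
    name: str
    keywords: frozenset  # hypothetical trigger words
    tool_count: int      # hypothetical number of tool descriptions it loads


POWERS = [
    Power("stripe", frozenset({"payment", "checkout", "invoice"}), 25),
    Power("supabase", frozenset({"database", "table", "sql"}), 30),
]


def active_powers(query, powers=POWERS):
    """Return the names of the powers whose keywords appear in the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return [p.name for p in powers if p.keywords & words]


def loaded_tools(query, powers=POWERS):
    """Count tool descriptions loaded for this query instead of all at once."""
    names = set(active_powers(query, powers))
    return sum(p.tool_count for p in powers if p.name in names)


if __name__ == "__main__":
    print(active_powers("Add a payment form"))
    print(loaded_tools("Add a payment form"))
```

A query that mentions no trigger word loads nothing, so the context window stays free until a relevant topic actually comes up.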

The system offers an open ecosystem with one-click installation for popular services (AWS, Figma, Stripe, etc.).

MCP as an independent standard
https://aaif.io/
https://openai.com/index/agentic-ai-foundation/
In December 2025, Anthropic transferred the Model Context Protocol (MCP) to the Agentic AI Foundation (AAIF) — a specialized foundation managed by the Linux Foundation. MCP became one of the founding projects of the newly created foundation. Along with MCP, the foundation included projects like goose by Block and AGENTS.md by OpenAI.

Agent Skills as an open standard
https://claude.com/blog/organization-skills-and-directory
https://agentskills.io and https://claude.com/connectors
Agent Skills was announced as an independent open standard on December 18, 2025, with a specification and SDK; it was not transferred to the Linux Foundation or AAIF. Microsoft has already adopted Agent Skills in VS Code and GitHub Copilot; it is also supported by Cursor, Goose, Amp, and OpenCode where Anthropic models are available.

Agent Skills Playground
https://skillsplayground.com/
On this site, by entering your API key, you can experiment with how different models utilize various skills.

Claude Code 2.0.74
Added the LSP tool (Language Server Protocol) for code intelligence features such as go-to-definition, find references, and hover tooltips. This significantly improves the development experience, making code navigation faster and more convenient. For now, the agent rarely uses LSP autonomously. The open-source project OpenCode has had LSP support for about 6 months, making the slow progress of proprietary software surprising.
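For context, a go-to-definition lookup in LSP is a JSON-RPC request framed with a Content-Length header. The sketch below builds such a message following the LSP specification; how Claude Code talks to language servers internally is not described here, and the file URI and cursor position are made-up values.

```python
import json


def lsp_request(request_id, method, params):
    """Frame a JSON-RPC request the way the LSP base protocol expects."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode("utf-8")
    # Content-Length counts the bytes of the body, not the characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Hypothetical file and position; LSP positions are zero-based.
message = lsp_request(
    1,
    "textDocument/definition",
    {
        "textDocument": {"uri": "file:///project/app.py"},
        "position": {"line": 41, "character": 8},
    },
)

if __name__ == "__main__":
    print(message.decode("utf-8"))
```

The server replies with one or more locations (a URI plus a range), which is what lets an agent jump to a definition instead of grepping for it.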

New Models

Gemini 3 Flash
https://blog.google/products/gemini/gemini-3-flash/
Google is gradually rolling out the smaller multimodal agentic model of its new series. Benchmarks suggest it is closer to Gemini 3 Pro than to Gemini 2.5 Flash. The model outperforms Gemini 2.5 Pro in many tests while being three times faster and significantly cheaper. In some benchmarks, it even surpasses flagship models from other companies.

Since its release, Gemini 3 Flash has become the default model in the Gemini mobile app (replacing 2.5 Flash) and in Google Search's AI Mode. In my Gemini CLI, neither 3 Flash nor 3 Pro has appeared yet—they can be accessed via Google AI Studio.

GLM-4.7
https://z.ai/blog/glm-4.7
Zhipu AI has updated its GLM model. Version 4.7 shows significant progress over GLM-4.6 in multilingual code generation scenarios. It supports "thinking before acting" in frameworks like Claude Code, Kilo Code, Cline, and Roo Code, ensuring stability in complex tasks. Interface generation quality has also been improved.

Model weights (MoE architecture, up to 200K token context) are publicly available on Hugging Face and ModelScope for local deployment. Access is available via Z.ai API, OpenRouter, the z.ai chat interface, and a special GLM Coding Plan ($3 for the first month, then $6).

MiniMax M2.1
https://www.minimax.io/news/minimax-m21
Release of the improved MiniMax M2 model by the Chinese company MiniMax, focused on practical development and agentic systems. It is reported that the model is significantly enhanced for working with non-Python programming languages (Rust, Java, Golang, C++, Kotlin, Objective-C, TypeScript, JavaScript, etc.), outperforming Claude Sonnet 4.5 and approaching Claude Opus 4.5 in multilingual scenarios.

The model is open-source. API costs are quite low, about 10% of the price of Claude Sonnet. It is compatible with popular agents like Claude Code, Droid (Factory AI), Cline, Kilo Code, Roo Code, and BlackBox, and supports context mechanisms (Skill.md, agent.md, etc.).

They also have a web platform at https://agent.minimax.io/ where you can test how the model builds applications.

https://www.youtube.com/watch?v=kEPLuEjVr_4

SWE-bench Verified comparison: Gemini 3 Flash 78%, MiniMax M2.1 74%, GLM-4.7 73.8%.

Superset - a multiterminal for Agents
https://superset.sh/
Currently Mac only, with Windows and Linux versions planned. It is an Electron-based app, a terminal with tabs specifically adapted to manage multiple agents like Claude Code, OpenCode, OpenAI Codex, and others simultaneously.

It automatically creates isolated git worktrees (best practice), sets up environments, isolates tasks to avoid conflicts, adds notification hooks, and includes a built-in diff-viewer for quick review of changes and PR creation. Future plans include cloud workspaces, context sharing between agents, and orchestration.
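To show what an isolated worktree per task looks like in plain Git, here is a minimal sketch that builds the `git worktree add` command for a given task. It is not Superset's code: the `agent/` branch prefix, the `../worktrees` directory, and the function name are assumptions chosen for illustration.

```python
import shlex


def worktree_command(task, base="main", root="../worktrees"):
    """Build a git command that creates a new branch in its own worktree.

    Meant to be run from the repository root; the branch prefix and
    directory layout are hypothetical conventions.
    """
    branch = f"agent/{task}"
    path = f"{root}/{task}"
    return ["git", "worktree", "add", "-b", branch, path, base]


if __name__ == "__main__":
    for task in ["fix-login", "add-tests"]:
        print(shlex.join(worktree_command(task)))
```

Each command gives one agent its own checkout and branch, so parallel agents never edit the same working directory. To actually create the worktrees, pass the list to `subprocess.run(cmd, check=True)` from inside the repository.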

Conceptually, it is similar to https://github.com/tmux/tmux, the well-known terminal multiplexer for Unix-like systems (Linux, macOS, BSD, etc.), which lets you create and manage multiple sessions in a single terminal using windows and panes.

Mysti as a team of agents
https://github.com/DeepMyst/Mysti
A VS Code extension that allows combining any two different models (shared context) in Brainstorm Mode to receive higher-quality advice. It solves the issue of switching between different paid AI subscriptions to get alternative opinions on complex architectural decisions. Currently supports models from Claude, Codex, Gemini, and GitHub Copilot CLI.

HN Discussion
https://news.ycombinator.com/item?id=46365105
The community shows significant interest in the idea of multi-agent collaboration, actively sharing personal workflows and alternative tools. Many participants experiment with similar approaches manually (e.g., via tmux panes with several CLI agents) and believe that debates between models help identify weak ideas and improve solutions, especially when one model gets "stuck."

Regarding Mysti, there is criticism of its dependency on VS Code, as many users prefer a pure CLI experience.