CodeWithLLM-Updates
-

Zed reaches version 1.0
https://zed.dev/blog/zed-1-0
Just as Cursor bumped its major version after an interface overhaul, the code editor from the creators of Atom officially reached 1.0 on April 29, 2026. They write: "we've reached a tipping point where most developers can quickly feel at home in Zed."

Built with Rust, it features GPU acceleration, collaborative mode, built-in Git, a debugger, and AI (native and via Agent Client Protocol). Available on macOS, Windows, and Linux. Along with the release, it gained the ability to run multiple agents simultaneously in one window.

Discussion:
https://news.ycombinator.com/item?id=47949027
Many praise the speed, collaboration, native feel, and overall progress. Criticism targets project-specific configuration, AI features (which can be disabled), accessibility, and a number of smaller issues. The thread contains plenty of practical feedback from people who switched or tried it.

Warp goes fully open source
https://www.warp.dev/blog/warp-is-now-open-source
On April 28, the AI terminal client Warp became open-source (AGPL for the core code + MIT for the UI framework). Now the community can contribute, including developing agent-first workflows through their cloud agent/orchestrator Oz.

Following the open-sourcing of the Warp client, a community fork called OpenWarp (https://openwarp.zerx.dev, zerx-lab) emerged and quickly gained popularity. It retains all the familiar Warp functionality (blocks, workflows, speed, UI) but, most importantly, fully opens the AI layer: you can connect any OpenAI-compatible provider (DeepSeek, Qwen, Ollama, OpenRouter, LM Studio, etc.), set custom system prompts via templates, and keep all keys locally without depending on a Warp cloud account or paid plans.

GitHub Copilot moves to usage-based billing
https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/
Starting June 1, 2026, all plans will transition to a usage-based model with GitHub AI Credits (1 credit = $0.01). Code completions remain unlimited, while chat, agents, CLI, and other heavy features consume credits based on tokens.
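To make the credit arithmetic concrete, here is a minimal sketch of converting between credits and dollars. It uses only the stated rate of 1 credit = $0.01; the function names and the sample amounts are hypothetical, and how many credits a given chat or agent request actually consumes is not modeled here.

```python
from decimal import Decimal

# From the announcement: 1 GitHub AI Credit = $0.01.
CREDIT_VALUE_USD = Decimal("0.01")


def credits_to_usd(credits: int) -> Decimal:
    """Dollar value of a number of credits."""
    return credits * CREDIT_VALUE_USD


def usd_to_credits(usd: str) -> int:
    """Whole credits that a dollar amount buys (amount passed as a string)."""
    return int(Decimal(usd) / CREDIT_VALUE_USD)


if __name__ == "__main__":
    # Hypothetical amounts, purely for illustration.
    print(credits_to_usd(1500))   # value of 1,500 credits in USD
    print(usd_to_credits("39"))   # credits covered by a $39 budget
```

Decimal is used instead of float so that cent-level amounts stay exact.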

GitHub explains the transition by stating that Copilot is no longer just the simple autocomplete tool it was a year ago—it now includes powerful agentic workflows, chats, code reviews, and complex agents that consume significantly more compute resources. Fixed subscriptions no longer cover the costs.

Discussion:
https://news.ycombinator.com/item?id=47923357
Many understand the reasons (expensive agents and inference) but complain heavily about the loss of predictability, rising costs for power users, and multipliers for powerful models. Tools are available to estimate future bills.

Vibe with a new model and cloud
https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5
Mistral introduced a new agentic model, Medium 3.5 (128B, 256k context), and made it the primary model in the Vibe CLI. They also added remote agents that run asynchronously in isolated cloud sandboxes (similar to Codex or Claude Code) for long-running tasks. These can be launched from the CLI or the Le Chat web interface with history synchronization.

AI Didn't Delete the Database
https://idiallo.com/blog/ai-didnt-delete-your-database-you-did
A tweet went viral: a startup founder claimed an AI agent deleted their production database in seconds. He was outraged, questioning the model and blaming "bad AI." The post's author, Ibrahim Diallo, argues that it's not the AI's fault: the real issue was a public API endpoint in production that could destroy the entire database with a single request.

It's like placing a self-destruct button in plain sight and being surprised when someone presses it. Diallo says the AI didn't delete the database; the developers did, through unsafe architecture, missing safeguards, and irresponsibility. The AI simply discovered what they carelessly left behind.

Discussion:
https://news.ycombinator.com/item?id=48022742
Most people fully agree: it's not the AI's fault, but rather the person who gave the agent unrestricted production access without limiting API token permissions or setting up safeguards. A tool can be dangerous, but responsibility always lies with the operator. Many criticize "AI-maximalism"—when developers enthusiastically grant agents full access instead of using sandboxes and reviews.
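The safeguards commenters have in mind can be sketched roughly as follows. This is an illustrative example, not the setup from the incident: the `Token` type, the scope names, and the `authorize` helper are all hypothetical. The idea is that an agent's token only carries the scopes it needs, and destructive actions additionally require explicit human confirmation.

```python
from dataclasses import dataclass, field

# Hypothetical set of actions that should never run unattended.
DESTRUCTIVE_ACTIONS = {"drop_database", "delete_volume"}


class PermissionDenied(Exception):
    """Raised when a token may not perform the requested action."""


@dataclass(frozen=True)
class Token:
    name: str
    scopes: frozenset = field(default_factory=frozenset)


def authorize(token: Token, action: str, confirmed_by_human: bool = False) -> None:
    """Raise PermissionDenied unless the token may perform the action."""
    # 1. Least privilege: the token must explicitly carry the scope.
    if action not in token.scopes:
        raise PermissionDenied(f"{token.name} lacks scope {action!r}")
    # 2. Destructive actions also need a human in the loop.
    if action in DESTRUCTIVE_ACTIONS and not confirmed_by_human:
        raise PermissionDenied(f"{action!r} needs explicit human confirmation")


if __name__ == "__main__":
    agent = Token("agent", frozenset({"read_rows"}))
    authorize(agent, "read_rows")  # allowed: scope is present
    try:
        authorize(agent, "drop_database")
    except PermissionDenied as exc:
        print("blocked:", exc)
```

With a check like this in front of the API, an agent holding a read-only token cannot reach the "self-destruct button" at all, and even a privileged token cannot press it without a person confirming.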

10 Lessons for Agentic Coding
https://www.dbreunig.com/2026/05/04/10-lessons-for-agentic-coding.html
Thanks to modern AI agents, code has become extremely cheap to create but expensive to maintain, secure, and support. This completely changes the development approach: the key is no longer saving on writing code, but wisely utilizing its low cost.

  1. Implement to Learn.
  2. Rebuild Often.
  3. Invest in End-to-End Tests.
  4. Document Intent.
  5. Keep Your Specs Synced.
  6. Find the Hard Things.
  7. Automate Everything Easy.
  8. Develop Your Taste.
  9. Agents Amplify Expertise.
  10. Code is Cheap, but Maintenance, Support, and Security are Not. Agentic code is "free as in puppies": build fast, but remember the maintenance and security burden you are taking on.

Discussion:
https://news.ycombinator.com/item?id=48019025
The discussion is active and mostly positive—many consider this one of the most practical and sober publications on working with AI agents. Most agree: code is extremely cheap, so focus must shift to architecture, security, E2E tests, maintenance, and "taste." Skeptics argue that coding is only a small part of the job; business and organizational bottlenecks remain, and in large companies, development speed isn't the primary constraint.

Among the major LLM players, xAI has been the only one not monetizing developers until now. That seems to be changing.

Cursor and xAI
https://techsifted.com/posts/spacex-cursor-acquisition-april-2026/
SpaceX/xAI has secured an option to acquire Cursor for $60 billion, that is, a right to buy the company later at a fixed price. If the acquisition does not go through, Cursor will still receive $10 billion for partnership and joint R&D work.

In March, several key Cursor engineers moved to xAI. In May, Cursor began a massive international expansion and hiring push. If xAI's infrastructure makes future versions even more powerful, most Cursor users will likely stay.

The developer reaction has been mixed. Part of Cursor's audience chose it specifically for its independence: it is not tied to OpenAI, Microsoft, or Google, and offers any model of choice. Now that the service is potentially joining Elon Musk's ecosystem, it remains unclear whether Grok will be prioritized over other models.

Fine-tuning Grok on Cursor Data
https://x.com/elonmusk/status/2055914584373141906
On May 17, xAI completed the primary training of the massive Grok V9 model (1.5 trillion parameters). The next stage is additional training (fine-tuning) on data from Cursor. This should allow Grok models to significantly improve their coding skills, as Cursor has collected a vast database of high-quality code from developers.

Grok Build CLI Launch
https://x.ai/news/grok-build-cli
https://x.ai/cli
On May 14, xAI released an early beta version of Grok Build, a code generation agent with task planning, sub-agents for parallel work, headless mode for scripts, support for AGENTS.md, diffs, plugins, etc. It's a fully featured tool and a direct competitor to Claude Code and similar tools.

However, it is currently only available with a SuperGrok Heavy subscription ($300/month with a three-day trial) and runs in the terminal only on Linux/macOS (Windows via WSL). Updates are released almost daily, and users are already praising its speed and quality. Elon Musk is personally asking for feedback.

https://www.youtube.com/watch?v=l_dAOKHLiYw

xAI is currently offering a promotional SuperGrok Heavy subscription: instead of $300 per month, the rate is temporarily around $99 for the first six months. However, users are complaining that even the Heavy plan doesn't feel "unlimited," as real limits can change depending on system load.

A few interesting updates for May. Amidst the news about xAI, Anthropic also surprised many by announcing a partnership with SpaceX on May 6 to expand their computing power.

Anthropic Discounts and Transition to New Pricing
https://www.anthropic.com/news/higher-limits-spacex
Anthropic announced a temporary "spring discount" on their API models. They also stopped blocking OpenClaw-style usage. However, this is likely an attempt to smooth things over before big changes: the company is increasingly hinting at a revision of the classic "fixed subscription, unlimited chat" model. Instead of per-token payment, compute-based pricing is being introduced: the cost of a request will depend on how many computing resources the model spent on "reasoning."

Claude Code Updates
https://code.claude.com/docs/en/whats-new#week-18
On Windows, Claude Code finally no longer requires Git Bash; if it's missing, the tool now natively uses PowerShell.

Cloud features. A new /ultrareview command is now publicly available as a research preview: it spins up several autonomous AI agents in the cloud to check the repository for vulnerabilities and bugs in parallel. Before this, Anthropic also launched the /ultraplan command, which pushes a large planning task to Anthropic servers. An isolated virtual machine is spun up for it (4 CPU cores, 16 GB RAM, with Node.js, Python, Rust, Docker, etc. pre-installed), and the user eventually gets a link to a web interface with the results.

Managing OpenAI Codex from Mobile
https://openai.com/news/codex-mobile-app/
Responding to a similar feature in Claude Code, OpenAI released an update for Codex that allows managing AI agents from a smartphone. Now developers don't have to be near a laptop: they can approve pull requests, run testing pipelines, resolve merge conflicts, or give prompts to fix small bugs on the go. The interface is fully optimized for voice and quick commands — essentially, it's a pocket remote for the agent on your computer.

Gemma Models in Gemini CLI
https://geminicli.com/docs/changelogs/
The Gemini CLI terminal client update (v0.40.0) added experimental integration for local Gemma models. Version v0.41.0 added support for Gemma 4 models (experimental). While intelligent Model Routing and full offline agent execution are not yet available, the team is already preparing full local task execution.

Memory handling was also improved. Tiered Memory allows the agent to store context directly in Markdown files across four levels, from global developer styles (in ~/.gemini/GEMINI.md) to rules for a specific project directory. A new Auto Memory feature analyzes old sessions in the background, finds successful solutions, and suggests saving them as reusable skills in SKILL.md. Auto Memory Inbox (from v0.42) automatically collects, classifies, and surfaces important pieces of information for the assistant's long-term memory.

Voice mode has also been improved.

At I/O 2026 in May, Google began "tightening the screws" and radically reshaping its infrastructure for developers.

Gemini 3.5 Flash
https://deepmind.google/models/gemini-3-5-flash/
The main "engine" of the announcement was the Gemini 3.5 Flash model, which precedes the upcoming 3.5 Pro. Google claims the model works significantly faster than previous generations and shows frontier-level results in agentic coding tasks: ~76.2% on Terminal Bench 2.1 and ~55.1% on SWE-Bench Pro.

The new Flash is significantly more expensive than the previous one, and the massive use of agents quickly burns through tokens and compute.

$100 Plan
https://blog.google/innovation-and-ai/technology/ai/google-io-2026-all-our-announcements/
Google is introducing a new pricing plan, Google AI Ultra for $100 per month, which provides higher limits for using agents in Antigravity. The more expensive enterprise tier is also being updated: instead of simple message limits, a "compute-used" model is increasingly being adopted, meaning payment for the agent resources and execution actually consumed.

Everything will be Antigravity
https://antigravity.google/blog/introducing-google-antigravity-2-0
Previously, Project IDX was based on Code OSS (open-source VS Code). Now the strategy has changed: Google is actively shifting focus from IDX and Firebase Studio toward Antigravity.

Instead of fragmented tools, Google is now promoting Antigravity 2.0, an "agent-first" development platform following the chat-centered approach popular in recent months. This is a direct response to the Codex app and Cursor 3, but with full control by Google over the execution environment, sandboxing, and agent orchestration. Google is not just moving away from "VS Code-like" editors: it has removed the text editor altogether.

https://www.youtube.com/watch?v=3arUEZlv9mc

Judging by the low activity on Hacker News and early feedback on Antigravity 2, it seems many developers haven't switched to active use of the tool after launch — it is perceived more as another experimental AI-IDE than a stable work tool.

From Gemini CLI to Antigravity CLI
https://developers.googleblog.com/an-important-update-transitioning-gemini-cli-to-antigravity-cli/
Google officially announced the sunsetting of old tools. Most notably, from June 18, 2026, Gemini CLI (open source, daily quotas) and the Gemini Code Assist extension will stop serving requests for free users and even for AI Pro/Ultra subscribers, remaining available only for Enterprise.

Google is effectively shifting focus from Gemini CLI and Gemini Code Assist to the new Antigravity CLI (closed source), which becomes the primary terminal tool for agentic workflows. Quotas now look less like "number of prompts" and more like a compute usage model: how many agents and resources you actually use. Currently, Antigravity CLI performs poorly and is gathering bug reports rather than serving as a working developer tool.

In addition to Google's models, two Claude models from Anthropic and, for some reason, GPT-OSS 120B from OpenAI are available. That's it.

Native Android in Google AI Studio
https://android-developers.googleblog.com/2026/05/build-android-apps-google-ai-studio.html
In Google AI Studio, you can now generate a native Android app (Kotlin/Jetpack Compose) from a prompt and run it in an emulator directly in the browser.

If a project becomes complex, Google offers "seamless" export to Android Studio for further agentic development.

More new models from May.

Cursor Composer 2.5
https://cursor.com/blog/composer-2-5
On May 18, 2026, the Cursor team released Composer 2.5. It is based on the open Kimi K2.5 model from Moonshot AI, but now approximately 85% is Cursor's own fine-tuning. The main change compared to Composer 2 is increased autonomy and cost optimization.

The model offers two tiers: Standard at $0.50 per 1M input / $2.50 per 1M output tokens, and Fast at $3/$15. In SWE-Bench Pro tests, it achieved a 49% success rate (compared to 12% in Composer 2), meaning coding skills and context understanding have grown significantly at a reasonable price.
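As a rough illustration of what the two tiers mean in practice, the sketch below estimates the cost of a single job from the per-million-token prices quoted above. The `estimate_cost` helper and the token counts are hypothetical; real bills may also depend on caching or other factors not mentioned here.

```python
# Prices in USD per 1M tokens, as quoted above: (input, output).
PRICES_PER_MILLION = {
    "standard": (0.50, 2.50),
    "fast": (3.00, 15.00),
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one job on the given tier."""
    input_price, output_price = PRICES_PER_MILLION[tier]
    return (input_tokens / 1_000_000) * input_price + (
        output_tokens / 1_000_000
    ) * output_price


if __name__ == "__main__":
    # Hypothetical agent run: 2M input tokens, 500k output tokens.
    for tier in PRICES_PER_MILLION:
        print(tier, estimate_cost(tier, 2_000_000, 500_000))
```

For this hypothetical run the Standard tier comes to $2.25 and the Fast tier to $13.50, a sixfold difference that follows directly from the price ratio.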

Qwen 3.7 Max
https://qwen.ai/blog?id=qwen3.7
On May 20, 2026, at the Alibaba Cloud Summit, Qwen 3.7-Max was announced. Unlike the previous Qwen 3.6 line, which focused on general tasks, the new version is positioned exclusively as an agentic model for ultra-long autonomous work cycles. The key change is stability during long-running tasks.

Alibaba demonstrated a case where the model fully autonomously optimized a GPU kernel over 35 hours without any human intervention, performing over 1,100 tool calls. The context window was expanded to 1 million tokens (up from 256k in its predecessor), and the "reasoning density" per token was increased.

Qwen 3.7-Max can generate complex interactive web applications from a single prompt—including 3D scenes on Three.js, Canvas animations, full-page layouts, and dynamic SVGs.

https://openrouter.ai/qwen/qwen3.7-max
There is currently a 50% discount on the model at OpenRouter ($1.25/$3.75), making Qwen 3.7 Max perhaps the best choice for price/performance in long-running tasks.

Claude Opus 4.8 — fewer hallucinations and more control
https://www.anthropic.com/news/claude-opus-4-8
On May 28, 2026, Anthropic introduced Claude Opus 4.8 (pricing remains the same as 4.7 at $5/$25 per million tokens) and once again topped the Artificial Analysis global rankings with a score of 61.4, overtaking GPT-5.5.

Instead of focusing on abstract benchmarks, Anthropic prioritized system "honesty": the model learned to directly state "I don't know" or ask for clarification, and it misses hidden bugs in its own code 4 times less often than Opus 4.7.

Dynamic workflows appeared in Claude Code. Now Opus 4.8 can independently plan large-scale tasks, launch parallel sub-agents, and verify results before submitting the work.