CodeWithLLM-Updates
-

https://docs.cursor.com/guides/advanced/large-codebases

Cursor developers shared tips and techniques for effectively working with large and complex codebases.

They highlighted key aspects that help in navigating unfamiliar code faster. Key recommendations include:

  • Using Chat for Code Understanding: Via the chat mode, you can quickly get explanations on how certain parts of the code work. It is also recommended to enable the "Include Project Structure" feature for better understanding of the project structure.
  • Writing Rules: Creating rules allows emphasizing important project information and ensures better understanding for the Cursor agent.
  • Detailed Planning of Changes: For large tasks, it's worth spending time creating an accurate and well-structured plan of action steps.
  • Choosing the Right Tool: Cursor offers various tools (Tab, Cmd K, Chat), each with its advantages for specific tasks – from quick fixes to large-scale changes across multiple files.

They emphasize the importance of breaking down large tasks into smaller parts, including relevant context, and frequently creating new chats to maintain focus.

https://memex.tech/blog/introducing-memex-the-everything-builder-for-your-computer

Memex has officially announced the launch of its platform, which allows you to create any software, from web applications to 3D designs. It is worth noting that the name is an unfortunate choice: firstly, "memex" is inventor Vannevar Bush's term for his hypothetical knowledge device, and secondly, many projects already use it.

Memex is positioned as "The Everything Builder" for the computer. The platform supports any technology stack and programming language. Memex runs on Windows/Mac/Linux (it is built on the Tauri framework) and allows everyone, regardless of technical experience, to explore, build, and deploy software by talking to AI.

The agent uses Claude models (a combination of Sonnet 3.7 and Haiku) and has internet access. It creates checkpoints via a built-in shadow Git repository. Support for Gemini 2.5 and MCP is planned.

https://www.byterover.dev/

ByteRover implements memory of the codebase and of previously implemented functionality as an MCP server. Using this or a similar project, you can switch between Cursor, Windsurf, Cline/Roo, and other MCP-capable coding agents, and each will know what has already been done. The free plan allows 1k records/month.

The downside: it uses their cloud, meaning the data is not stored locally but with a company that has to be trusted.

https://www.youtube.com/watch?v=9sPsraoe0_c

https://github.com/github/github-mcp-server

GitHub launched their official MCP server.

https://www.youtube.com/watch?v=d3QpQO6Paeg


https://modelcontextprotocol.io/

The Model Context Protocol (MCP) was introduced by Anthropic on November 24, 2024, as an open standard for connecting AI systems to data sources. The first connectors released were for GitHub, Google Drive, and Slack.

By February 2025, the developer community had created over 1000 open MCP connectors, demonstrating significant ecosystem growth and interest in the protocol. Support for MCP also gradually appeared in all major AI programming applications/extensions, including Cline/Roo, Cursor, Windsurf, and Continue.

Through MCP, you can work with Postgres, Upstash, and Slack directly in the code editor. Browsertools MCP provides access to the browser console for debugging. And https://context7.com/ provides up-to-date documentation for AI code editors.

A significant step was OpenAI's announcement on March 26, 2025, of support for MCP. Soon after, at Google Next 2025, Google announced MCP support in the SDK for their Gemini models (though they also introduced the A2A protocol). Thus, the protocol is gradually becoming universal.
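Under the hood, MCP messages follow JSON-RPC 2.0. A toy sketch of the tools/call request shape is below; the echo tool and the hand-rolled dispatch are illustrative only, and real servers should use the official SDKs from modelcontextprotocol.io:

```python
import json

# Toy registry of tools; a real MCP server advertises these via tools/list.
TOOLS = {
    "echo": lambda args: args.get("text", ""),
}

def handle(request_json: str) -> str:
    """Dispatch a single JSON-RPC 2.0 request (MCP's wire format)."""
    req = json.loads(request_json)
    assert req["jsonrpc"] == "2.0"
    if req["method"] == "tools/call":
        name = req["params"]["name"]
        result = TOOLS[name](req["params"].get("arguments", {}))
        return json.dumps({
            "jsonrpc": "2.0", "id": req["id"],
            "result": {"content": [{"type": "text", "text": result}]},
        })
    return json.dumps({
        "jsonrpc": "2.0", "id": req["id"],
        "error": {"code": -32601, "message": "method not found"},
    })

req = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/call",
                  "params": {"name": "echo", "arguments": {"text": "hi"}}})
print(handle(req))
```

The same request/response framing is carried over stdio or HTTP transports, which is why one server can plug into any MCP-capable editor.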


Organization and Ecosystem. Following the initial repository (https://github.com/modelcontextprotocol/servers), third-party online catalogs began to emerge (such as https://opentools.com/ https://mcp.so/ https://mcpserverdirectory.org/, etc.) where you can find the server you need. MCP manager projects that simplify installation are also appearing, for example https://mcp-get.com/ https://mcpm.sh/ https://mcpmanager.app/ https://mcpmcp.io/, etc.

There are also projects that help convert a standard REST API into an MCP server - for example https://rapid-mcp.com/ https://api200.co/mcp.

The problem with open catalogs is the unclear reliability of the hosted servers.

Security. Since an MCP server acts as an intermediary between the model and the data source, a malicious actor who sets up a server can log everything, including API access keys to the data. Authentication and authorization are not yet standardized within MCP.

Servers are divided into official and community ones. Official servers are obviously not third-party intermediaries; requests to them are analogous to requests to the provider's API endpoints. Community servers, set up by third parties, should be treated with caution, and it is worth checking who is behind them. You can also host your own server in the cloud (for example, a weather server on AWS Lambda) or in a container via mcp-containers.

The more the protocol spreads, the more official servers will appear, as was the case with REST API.

Claude Code, OpenAI Codex, and Aider are coding agents that run in the terminal.

https://github.com/coder/agentapi
The AgentAPI project lets you manage such agents via an HTTP API (GET and POST requests). You can, for example, launch several agents and "talk" to them through one chat, or create an MCP server so that one agent system can delegate tasks to another.
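A sketch of what driving a terminal agent over such an HTTP API could look like; the port, the /message endpoint, and the payload fields here are assumptions for illustration, so check the AgentAPI README for the real schema:

```python
import json
import urllib.request

def build_send(base_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST handing a task to a terminal agent.
    The "/message" path and payload keys are hypothetical placeholders."""
    body = json.dumps({"content": text, "type": "user"}).encode()
    return urllib.request.Request(
        f"{base_url}/message",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# One chat front-end could fan a task out to several agents this way.
req = build_send("http://localhost:3284", "run the test suite")
print(req.method, req.full_url)
```

Sending the request with urllib.request.urlopen(req) would then deliver the task to whichever agent (Claude Code, Codex, Aider) AgentAPI is wrapping.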

https://github.com/eyaltoledano/claude-task-master
For structured management of development steps, you can use this project and connect it as an MCP server.

https://www.anthropic.com/engineering/claude-code-best-practices
It turns out Claude Code has a trigger word, "ultrathink", which you can read about in a fairly detailed best-practices document posted on their site.

"We recommend using the word "think" to activate an extended reasoning mode that gives Claude additional compute time to more thoroughly evaluate alternatives. These specific phrasings map directly onto increasing levels of compute budget in the system:
"think" < "think hard" < "think harder" < "ultrathink".
Each level allocates more and more compute budget for Claude to use."

Other recommendations:

  • configure the context through system instructions (here, the CLAUDE.md file): code standards, frequently used commands, etc.
  • use .allowed-tools to permit frequently used tools, and configure only trusted MCP servers
  • plan and write tests first (TDD) before generating code
  • have the agent make regular commits
  • give the agent specific and thorough instructions: the more specific the request, the better the result
  • rely less on automatic (auto-accept) mode: monitor what the agent outputs and correct it as early as possible (press Escape to stop) if it takes a wrong path
  • Advanced level — run two agents: one writes code, the other reviews it.

Varun Mohan, co-founder and CEO of Codeium, now Windsurf, shares the company's story, discusses two key pivots, hiring philosophy, the impact of AI on the engineering profession, corporate market entry strategy, and demonstrates Windsurf capabilities.

https://www.youtube.com/watch?v=5Z0RCxDZdrE

First pivot (2022): With the emergence of ChatGPT, the team shifted its focus to AI-powered coding, creating a free plugin for code autocompletion (supporting VSCode, JetBrains, etc.). Second pivot → Windsurf: VSCode API limitations forced them to fork the IDE and create an AI-native environment with advanced features (e.g., visual editing).

New paradigm: AI writes >90% of the code → the developer focuses on review and architecture. For non-developers: creating simple applications without deep knowledge.

The AI model usage strategy is a hybrid approach: frontier models (e.g., Sonnet) for high-level tasks, and their own models for code retrieval and editing.

The conversation highlights how quickly the development landscape is changing thanks to AI. Windsurf is actively shaping this future, not afraid of radical pivots and betting on a deep understanding of code and "agentic" AI capabilities, not just autocompletion.

The possibility of OpenAI acquiring Windsurf is currently being actively discussed in the news.

https://github.com/openai/codex

OpenAI finally responded to Claude Code and released their version of an agent for programming that works through the terminal and can create and edit code files. The project is open source.

Like Claude Code, it officially supports only macOS and Linux; Windows support is available through WSL.

They named it Codex, which may now be confusing, as one of the first models for programming (from 2021), on which GitHub Copilot started working, had the same name.

It is installed simply as a global package: npm install -g @openai/codex. There are three approval modes: the default is Suggest (read-only), but it can also be set to Auto Edit (file editing) or Full Auto (including command execution in the terminal).

https://www.youtube.com/watch?v=FUq9qRwrDrI

Announced along with the thinking models o3 and o4-mini, which were finally given the ability to use tools. By default, Codex uses o4-mini, but you can specify any model available in the Responses API.

All file operations and command executions happen locally; only the request, context, and diff summaries are sent to the model on OpenAI's servers for generation.

https://openai.com/index/gpt-4-1/

The new model update from OpenAI is a response to Google's new Gemini models: the GPT-4.1 family all have a 1-million-token context window and more accurate instruction following.

We are particularly interested in the fact that, according to their own tests, the GPT 4.1 model has become better at code generation. That is, if 4o produced decent code on one out of three requests, then 4.1 will do it on every second one 😉.

https://aider.chat/docs/leaderboards/
In the article, the model is compared only against OpenAI's own models. More broadly, it can be evaluated on the Aider LLM Leaderboards, where it achieves 52.4% accuracy, while Gemini 2.5 Pro Preview 03-25 scores 72.9%.


In Cursor, gpt-4.1 is now available in the model settings.

This update is particularly important for GitHub Copilot (gpt-4.1 is already available there), because its agent and chat were initially tied to OpenAI's GPT-4 model, and on the free plan Claude Sonnet is still 3.5, not 3.7.

Tomorrow there will be VS Code Live: Agent Mode Day, where I think they will tell more details.

https://www.pillar.security/blog/new-vulnerability-in-github-copilot-and-cursor-how-hackers-can-weaponize-code-agents

How can you attack automatic code generators?
By poisoning the LLM's system instructions (the "Rules File Backdoor" attack).

Many AI coding tools can now load rules from a text file (for example, in Cursor it is .cursorrules or a rules folder in the project root) - just plain text file(s).

I think only inexperienced programmers, or those unfamiliar with how the new agent-based IDEs work, would run someone else's code without first reading its instruction file, if one exists.

Another scenario is when we create a project and copy such instructions ourselves from open directories such as cursor.directory - again, you need to understand what you are doing and read them first.


But Pillar Security researchers found that attackers can use hidden Unicode characters and other bypass techniques in rules files to trick agent assistants (such as Cursor or GitHub Copilot) into generating code with backdoors or vulnerabilities (for example, loading an attacker's external JavaScript on a site's main page).

How does it work?

  • Creating a malicious rules file: A hacker creates a rules file that looks harmless 👀, but contains hidden malicious instructions 😈 using Unicode characters.
  • Injection into the project: The rules file gets into a shared repository 🌐 or is distributed through communities 🧑‍🤝‍🧑.
  • Code generation: A developer, using an AI assistant, generates code 💻. AI, following malicious rules, creates code with vulnerabilities or backdoors 💥.
  • Malicious code spreads: Due to the fact that rule files are often shared and reused, infection can spread to many projects 🦠.

"Unlike traditional code injection attacks targeting specific vulnerabilities, “Rules File Backdoor” poses a significant risk because it turns AI itself into an attack vector."

The most vulnerable to such an attack are those who think little while generating code: they don't read rules files, don't review everything that was generated, and publish code or deploy projects without a prior security audit.

Theoretically, agent IDEs should be responsible at least for checking rule files and code comments for inserted invisible instructions, but, judging by the article, the developers of Cursor and GitHub Copilot said that users themselves (!) are responsible for the code they generate.
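Until editors do such checking themselves, a rules file can at least be scanned for invisible characters before use. A minimal sketch: it flags characters in Unicode's Cf ("format") category, which covers zero-width spaces, bidi overrides, and the BOM; this is a heuristic, not a complete defense:

```python
import unicodedata

def find_hidden(text: str):
    """Return (index, codepoint, name) for invisible "format" characters
    that could hide instructions in a rules file such as .cursorrules."""
    return [(i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for i, ch in enumerate(text)
            if unicodedata.category(ch) == "Cf"]

clean = "Always write tests first."
poisoned = "Always write tests first.\u200b\u202eeval(input())"
print(find_hidden(clean))     # []
print(find_hidden(poisoned))  # flags the zero-width space and RTL override
```

Running this over .cursorrules (or any rules file copied from a public directory) before trusting it would catch the specific trick the article describes, though not attacks written in plain visible text.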

https://windsurf.com/blog/windsurf-wave-7

"Windsurf Wave 7" Update

Cascade is now available in JetBrains IDEs (IntelliJ, WebStorm, PyCharm, GoLand, and many others).

Codeium is now Windsurf
"We decided to rename the company to Windsurf and the product extension to Windsurf Plugin". There will be no more Codeium.

The company was founded in 2021 by Varun Mohan and Douglas Chen with the goal of increasing developer productivity through AI-based coding solutions; in its first year it was called Exafunction and worked on GPU virtualization.

Later they moved into code autocompletion, creating a plugin for IDEs. In 2023, in-IDE chat and code generation were added, and the GPT-4 model was integrated.

On November 11, 2024, Windsurf Editor was launched, which they began to promote as the first AI agent-based IDE. Despite the fact that Cursor was first (spring 2023), their marketers tried to pretend it didn't exist.

Chats with different contexts (usually frameworks) are now available at https://windsurf.com/live/

https://console.x.ai/
xAI's Grok-3 model is finally available via API

In programming extensions where you can add your keys (Cline, Roo), you can now use it directly or through https://openrouter.ai/x-ai/grok-3-beta

In Windsurf, all top models are available today, including Gemini 2.5 Pro (which is ahead in many tests) and DeepSeek V3 (0324).

Similarly, in Cursor, you can now select deepseek-v3.1, grok-3-beta, gemini-2.5-pro-exp-03-25 and gemini-2.5-pro-max models in the settings.

In Trae, there are currently no models from Google or xAI.

https://block.github.io/goose/blog/2025/04/08/vibe-code-responsibly

The creators of the Codename Goose project (AI agent for computer control) described their pain points and possible solutions to the problem of vibe coding.

After Karpathy's tweet, which was picked up by the media, more and more people began to create "programs" simply by talking to AI and not looking at the code. But an LLM is not a programmer, it is a coder (code generator).

To put it mildly, this creates very low-quality, unprofessional code, the main problems of which are:

  • "spaghetti"-code that is difficult for a human to understand, where everything is mixed up with everything else. Usually also in one long file of thousands of lines.
  • constant mutation and drifting bugs: dead pieces of code that no longer do anything, and well-functioning pieces replaced with garbage.
  • huge number of vulnerabilities, code that is easy to hack.
  • leakage of closed information, such as access keys, into publicly available code.

Such code is almost impossible to maintain. It is better not to create it at all if it is not a "program just for yourself for one time use."

Goose developers suggest better control and configuration of agent systems so that they monitor what is being generated in the code:

  • 🧠 "Even if you're vibe coding, don't turn off your brain."
  • use different modes of control for agents, not just fully automatic.
  • use an ignore file (in Cursor it is .cursorignore), where you list what agents must never read or modify, and a system-instructions file (.goosehints in Goose, .cursorrules in Cursor) to set restrictions.
  • there are now many MCP servers, including vibe-coded ones; they need to be checked and an Allowlist (allow policy) created for the agent, including only high-quality ones.
  • first plan, then do — a plan breaks everything down well into understandable stages and different small code files. Steps can be checked (how to do this in Cursor — see this video).
  • commit every step and use git to revert to code that worked well.
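The ignore-file advice above can be sketched as a simple path filter; the patterns and the fnmatch-style matching are illustrative, not Cursor's or Goose's exact semantics:

```python
import fnmatch

# Paths the agent must never read or modify (à la .cursorignore).
IGNORE_PATTERNS = [".env", "secrets/*", "*.pem"]

def allowed(path: str) -> bool:
    """True if the agent may touch this path under the ignore patterns."""
    return not any(fnmatch.fnmatch(path, pat) for pat in IGNORE_PATTERNS)

print(allowed("src/app.py"))   # True
print(allowed(".env"))         # False
print(allowed("secrets/key"))  # False
```

Keeping credentials behind such a filter addresses the key-leakage problem listed earlier: the agent simply never sees the files it could otherwise paste into generated code.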

Exponent
https://x.com/exponent_run
As with many of these AI tools, it is not entirely clear what stage of development they are at or what they have actually released. They wrote that it was still early access, but that was four months ago, so perhaps they have finished something by now.

Augment Agent
https://www.augmentcode.com/
Augment presented their agent; there is a 14-day trial. The agent is designed to solve complex software development tasks, especially in large projects. A key feature is "Memories", which are automatically updated and persist between sessions, improving the quality of generated code and adapting to the programmer's style.

Other features include MCP (Model Context Protocol), "Checkpoints" for safe rollback of changes, multimodal support (screenshots, Figma), execution of terminal commands, and automatic mode.

https://codeium.com/blog/windsurf-wave-6

Windsurf Wave 6 Update

The main feature is "Deploys", which allows publishing websites or JavaScript applications to the internet with a single click (expect even more vibecoding slop as a result). Currently, this function is integrated with Netlify and aims to simplify the full application development cycle directly within the IDE.

Also, in dialogues with AI agent (Cascade), memory and navigation have been improved.

For paid users, one-click commit message generation has been added (Cursor has had this for a very long time, and in GitHub Copilot it is available for free).

It appears the developers behind the Zed editor – yes, the ones who've apparently spent the last year unable to procure a Windows machine to build a version for that OS – have noticed something: their unreleased Zed AI is already becoming outdated.

Consequently, they're now rolling out 'Agentic Editing' to their beta testers. Based on the description, it seems to offer the expected suite of modern features: automatic code editing, chat profiles, a rules file for system instructions, LLM switching (including non-Anthropic options), MCP, and checkpoints (currently handled via git in beta).

Importantly, this could genuinely position Zed as a strong alternative to the dominance of VS Code and its forks. Just as soon as they manage to, you know, finally ship that Windows version. In the meantime, Windows users can install Zed using Scoop.