Code With LLM

CodeWithLLM-Updates

January 2025

https://www.cursor.com/changelog

Cursor

DeepSeek models: DeepSeek R1 and DeepSeek v3 are supported in versions 0.45 and 0.44. You can enable them in Settings > Models. We host these models in the USA.

But the composer (automatic code writing system) still supports only outdated GPT-4o or Claude-3.5 for now. In the chat with r1, the <think> tag is not a hidden block.

#cursor #newllmmodel

https://www.all-hands.dev/

All Hands OpenHands
(previously OpenDevin)

Open Source Python tool inspired by Devin ', launching agents for programming: writes code, executes commands, goes online.

Runs through Docker, they advise using anthropic/claude-3-5-sonnet-20241022, but you can also use Gemini and DeepSeek - configured by entering an API key. The solution is automatic with a large number of requests, so it consumes a lot of tokens.

There is a waiting list to try their hosted version.

video on how to install and configure:
https://youtu.be/WDP2S4MOXPY

https://github.com/RooVetGit/Roo-Code
plugin is actively changing from just an automatic code editor to the orchestration of various agents with their prompts and limitations.

Roo Code Update (prev. Roo Cline) v 3.3

Code Actions

Roo Code now integrates directly with the native VS Code Code Actions system, providing quick fixes and refactoring options right in the editor. Look for a light bulb above the code 💡

Advanced mode capabilities:

Markdown Editing: implemented one of the most requested features - "Ask" and "Architect" modes can now create and edit Markdown files!
Custom File Restrictions: in general, custom modes can now be limited to certain file patterns (can only edit Markdown files).
Self-switching mode: modes can intelligently request switching between each other depending on the task. For example, the "Code" mode may request switching to the "Test Engineer" mode when it is ready to write tests.

#roo

https://www.technologyreview.com/2025/01/20/1110180/the-second-wave-of-ai-coding-is-here/

In the article, those who are now telling how their models and services for programmers will soon replace thousands of developers, but for now are happy to deceive at least one investor:

https://zencoder.ai/ (try for free with prices)
https://www.merly.ai/ (trial without price)
https://cosine.sh/ (waitlist)
https://www.tessl.io/ (waitlist)
https://www.poolside.ai/ (looks like B2B)

https://www.youtube.com/watch?v=itsGX3UioLk

Generating a draft site using bolt.new when the request for it is created by infranodus.com (there is a two-week trial) analyzing the gap between supply and demand, in this case, real estate in Berlin and the case of healthy food.

#bolt

https://www.youtube.com/watch?v=yHDvCGNjIqk

The video compares Bolt and Lovable, two AI tools for creating web applications.

The author gives both platforms the same task: to create a Trello-like application with draggable elements.

Bolt generated the initial version a little faster than Lovable.
Both applications allowed creating tasks and moving them between columns, but not editing.

Both tools quickly added this feature after the corresponding request.

When the author asked to change the design to Spotify style, both AIs successfully changed the color scheme to dark with green accents.

Bolt in this case offered a more interesting highlighting effect when hovering over the buttons.

When adding functionality for multiple boards,

Bolt implemented switching between them via a dropdown menu, and Lovable placed the board names in the top navigation bar, which the video author liked more.

In general, both tools are effective for rapid development, but the choice between them may depend on the user's priorities:

Bolt may be better for those who value speed, and Lovable – for those who prefer a more intuitive "out of the box" interface

PS from me: Lovable is a closed product (formerly GPT Engineer), where 5 requests/day are free and you cannot save the code directly. Bolt.new and fork bolt.diy are available on github - you can install it on your computer or use the website bolt.new

#bolt #lovable

OpenAI agent Operator (news on techcrunch ) via Google AI Google AI Studio and Repl can create a website =)

#newllmmodel

https://www.trae.ai/home currently there is only a version for MacOS, because who needs your Windows?

Trae
1.0.1 (January 23, 2025)

Another "genius" invention from China's ByteDance - now, in addition to them, together with Chinese intelligence, monitoring dances on TikTok, they decided to monitor everyone who writes code.

According to Trae Privacy Policy they can not only store "code, text, photographs, files, feedback, chat history, or any other content that you may upload to the Platform" (When you interact with the Platform, we may collect code, text, photographs, files, feedback, chat history, or any other content that you may upload to the Platform, and the associated metadata), but also take code for "training and improving their technologies"

They called their miracle clone of Cursor - Trae.
Has "built-in access to GPT-4o and Claude-3.5-Sonnet". And where is DeekSeek?

#trae

Doriandarko (Pietro Schirano) creates assistants for programmers on top of the most common top LLM models.

These are Python scripts that work from the command line (can be opened in the Terminal tab in VS Code). Similar to Aider, but simpler.

https://github.com/Doriandarko/deepseek-engineer

#autocomplete

https://github.com/yuaotian/go-cursor-help

If for some reason you are unable to use the Cursor trial (this is a common problem ), the Chinese have reached here too and made scripts to fix it.

#cursor

https://github.com/cline/cline/releases/tag/v3.2.0

Cline Update 3.2

Of course, they added the deepseek-reasoner model.

It's cool that the Roo Cline fork (now called Roo Code ) is doing something under test, as it was creating custom agent roles, and in Cline it is already added thoughtfully, as now it is the Plan/Act mode switch with changing the color of the query field.

#cline #roo #agentmode

https://github.com/PatrickJS/awesome-cursorrules

Catalog of examples of additional queries for projects.

Frontend Frameworks and Libraries
Backend and Full-Stack
Mobile Development
CSS and Styling
State Management
Database and API
Testing
Build Tools and Development

Project Nuxt 3:

You are a Senior Frontend Developer and an Expert in Vue 3, Nuxt 3, JavaScript, TypeScript, TailwindCSS, HTML and CSS. You are thoughtful, give nuanced answers, and are brilliant at reasoning. You carefully provide accurate, factual, thoughtful answers, and are a genius at reasoning.

Follow the user’s requirements carefully & to the letter. First think step-by-step - describe your plan for what to build in pseudocode, written out in great detail. Confirm, then write code!

Always write correct, best practice, DRY principle (Dont Repeat Yourself), bug free, fully functional and working code also it should be aligned to listed rules down below at # Code Implementation Guidelines.

Focus on easy and readability code, over being performant. Fully implement all requested functionality. Leave NO todo’s, placeholders or missing pieces. Ensure code is complete! Verify thoroughly finalised. Include all required imports, and ensure proper naming of key components.

Be concise Minimize any other prose. If you think there might not be a correct answer, you say so. If you do not know the answer, say so, instead of guessing

Coding Environment

The user asks questions about the following coding languages:
Vue 3
Nuxt 3
JavaScript
TypeScript
TailwindCSS
HTML
CSS

Code Implementation Guidelines

Follow these rules when you write code:
Use early returns whenever possible to make the code more readable.
Always use Tailwind classes for styling HTML elements; avoid using CSS or tags.
Always use composition api.
Use descriptive variable and function/const names. Also, event functions should be named with a “handle” prefix, like “handleClick” for onClick and “handleKeyDown” for onKeyDown.
Implement accessibility features on elements. For example, a tag should have a tabindex=“0”, aria-label, on:click, and on:keydown, and similar attributes.
Use consts instead of functions, for example, “const toggle = () =>”. Also, define a type if possible.

It will also work in Windsurf there is a .windsurfrules file
and in Cline there is a .clinerules file
and in Aider there is CONVENTIONS.md and config

#windsurf #cursor #aider #prompts

https://api-docs.deepseek.com/news/news250120

DeepSeek-R1

The Chinese startup DeepSeek continues to delight us with cheap clones. Here they got to openai o1.

Through the API, if you use it (directly or through openrouter ), you have to pay - an already generated key will do, you just need to change the model to deepseek-reasoner. It costs 4 times cheaper than o1.

Now it is not in the list in Cline and Aider - we are waiting for updates. But you can already throw your DeepSeek API key through OpenRouter, or pay them.

Through the web interface and their new phone app, you can use it for free. Canvas/Artifacts have not yet been copied.

#newllmmodel

https://codeium.com/changelog

Windsurf update 1.2.1

Cascade can now automatically perform web searches if the query requires up-to-date information from the internet. For an explicit search query, you can use the @web command, and for searching in popular documentation (including Windsurf's own help) - the command .

Can use URLs as context, which is useful when working with articles, documentation, and files from GitHub.

Automatic creation of Memories
Cascade now automatically creates "memories" to save context between conversations. Users can manually trigger the creation of "memories". They are displayed in a special panel and can be deleted.

All this together is called Windsurf Wave 2 https://codeium.com/blog/windsurf-wave-2

#windsurf

https://mistral.ai/news/codestral-2501/

Mistral AI introduced an updated model Codestral 25.01, which improves the speed and accuracy of code generation, especially in "fill-in-the-middle" (FIM) tasks.

It really generates quickly. For some reason, the model has become worse in Java, but added % HumanEval in Python / C++ / Javascript

You can check it through Continue.dev or via the openai compatible API (codestral-latest model) - key in the console

The model in the console is not yet displayed in the limits section, most likely now is the test period.

VS code - Cline [3.1.9] & Roo Cline [3.1.6]

Add Mistral API provider with codestral-latest model

#newllmmodel #cline #roo

https://thegroundtruth.substack.com/p/devin-first-impressions

Devin (v1.1.0, as of 15 January 2025) - a promising tool with the best UX among analogues, but still at an early stage of development.

Potentially, it can significantly change the software development process after addressing the current shortcomings. Currently, according to the author, it is worth its $500 per month.__(It is worth noting that on the tariff plan page, $500 is indicated as a "discounted price", while the initial crossed-out price is $1250)

Main advantages:
It has a user-friendly interface and simple setup. It demonstrates high speed in performing simple tasks. An interesting feature is the ability to analyze (“visually see“) web pages using screenshots. Devin also supports parallel work on multiple tasks and in some cases can make intelligent decisions, for example, automate repetitive actions.

Main disadvantages:
Slowly works with more complex tasks that require code refactoring or debugging. Sometimes it gets into an infinite editing loop. Its capabilities are limited: there is no access to sites with authentication (for example, it cannot visit the GitHub website and see the created PR). There is a decrease in performance during a long session (2.5 hours or 10 ACUs), as well as difficulties in using the knowledge base.

The video compares the work of Cursor (proprietary fork of VS Code) and Cline (open-source VS Code extension) - they edit a React project of 240k code tokens.

https://youtu.be/AtuB7p-JU8Y

Here both use claude-3.5-sonnet.

Cursor also uses some embedding model (either with OpenAI’s embedding API or by a custom embedding model ) and a cloud vector database to vectorize code chunks for semantic search. [Explanation with pictures ]

The first simple task was completed by both in 1 minute, but Cline returned broken code, on task 3 it "looped".

That is, Cursor Composer won 3 out of 3.

#cursor #cline #compare

A platform for automatic full-stack generation using the "Prompt to edit" approach.

Generated projects are added to the public catalog for free, and you can continue to do something of your own on top of other people's projects. There are also templates. They give 5 chat messages per day for free. There is no downloading to your disk, but there is a possibility to export to your github as a repository.

In beta now is the integration with supabase (authentication and db)

example:
https://www.youtube.com/watch?v=c6rd2iZ_A48

#lovable

Cline v3.1 update

Now saves DIFF changes at each step of the task.

Two new features:

Compare (see new changes) shows the difference between the snapshot and the current workspace for each of the files.

Restore allows you to return files or any parts of the project to this point in the task.

#cline

AiDE provides a structured approach to developing projects with the help of artificial intelligence. The framework offers a standardized way for artificial intelligence to understand the context of your project and maintain documentation.

https://github.com/FixingPixels/AiDE

there is a custom GPT

https://nmn.gl/blog/ai-senior-developer
and from the comments https://nmn.gl/blog/hn-rank-1-analysis

The code analyzer, analyzing linearly, often got stuck in details. To improve the analysis, we changed the approach, modeling the Mindset of experienced developers:

📝 File grouping: files are grouped by functionality (e.g., "authentication", "database").

ℹ️ Context: a description of the group's functionality within the overall architecture is added before code analysis. Impact analysis: consider changes in relation to the entire system

🕰 Historical understanding: track why the code evolved in a certain way

prompt to the group

Analyzing authentication system files:
- Core token validation logic
- Session management
- Related middleware

Focus on:
1. How these integrate with existing auth patterns
2. Security implications
3. Performance impact on other systems

Files to analyze:
${formatFiles(group.files)}

The result was an improvement in the quality of understanding, from simple observations to identifying potential problems, such as conflicts and relationships between components.

#prompts

Aider LLM Leaderboards
https://aider.chat/docs/leaderboards/

Polyglot test measures the ability of LLMs to program in popular languages.

Aider works best with LLMs that are good at editing code, not just generating code well. To evaluate the editing skills of LLMs, Aider uses tests that assess the model's ability to consistently follow system prompts to successfully edit code.

At the beginning of 2025, unexpectedly, the Chinese DeepSeek V3 (671B MoE) is showing very good results. Now they have discounts until February 8 on tokens, well, and the price is $0.14/M input $0.28/M output, but the context window is cut (you can buy on openrouter ) does not compare to o1 and claude-3.5-sonnet.

#newllmmodel #compare

https://open-vsx.org/extension/saoudrizwan/claude-dev

Two weeks ago version three was released of the VSC plugin Cline (prev. Claude Dev). It's like Composer in Cursor or Cascade in Windsurf but not locked to private subscriptions.

Two very nice features:

Auto-approve – now for programming, you don't even need to press the button every time ;), you can send the agent to the background and enable notifications when needed.

Tokens, of course, are consumed like crazy and sometimes it loops.

.clinerules – as I already wrote, this is also in Cursor and in Windsurf - it is a file in the root of the project with a custom instruction, where you can write out the tech stack, database structure, external APIs, and other things so that the agent does not get confused.

The problem, of course, is that now every app calls this file whatever it wants (cursorrules, windsurfrules) and they have not agreed on a standard.

fork Roo-Cline is still on v2

#cline #roo

January 2025