Autonomous Coding Experiment
https://cursor.com/blog/scaling-agents
Cursor launched hundreds of AI agents simultaneously to work on a single collaborative project for weeks without human intervention. The idea is to move from the "one chatbot solves one task" format to a "virtual IT company" model, where agents work in parallel without interfering with one another.
The main takeaway is that simply increasing the number of agents works for complex tasks, provided the prompts and models are configured properly (Opus 4.5 tends to "cut corners," while GPT-5.2 is better at long-term planning). The solution was a hierarchical "Planners and Workers" approach: Planners continuously explore the code and create tasks, while Workers implement them without being distracted by overall coordination.
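For illustration only, here is a minimal sketch of that Planner/Worker split, assuming a plain Rust channel as the task queue. The Task struct, the task descriptions, and the single worker are invented for the example; Cursor's actual orchestration is not public.

    use std::sync::mpsc;
    use std::thread;

    // Hypothetical unit of work that a Planner emits after exploring the codebase.
    struct Task {
        id: u32,
        description: String,
    }

    fn main() {
        let (tx, rx) = mpsc::channel::<Task>();

        // Planner: explores the code and enqueues tasks; it never implements them itself.
        let planner = thread::spawn(move || {
            for id in 0..5 {
                let task = Task { id, description: format!("implement subsystem {id}") };
                tx.send(task).unwrap();
            }
            // Dropping the sender closes the channel, letting the worker loop end.
        });

        // Worker: implements tasks one by one, with no view of overall coordination.
        let worker = thread::spawn(move || {
            for task in rx {
                println!("worker handling task {}: {}", task.id, task.description);
            }
        });

        planner.join().unwrap();
        worker.join().unwrap();
    }

A real deployment would replace the single consumer with a shared multi-consumer queue (e.g. a crossbeam channel) feeding many workers, but the division of labor stays the same: Planners write tasks, Workers execute them.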
Agents wrote over a million lines of code, building a web browser, a Windows 7 emulator, and an Excel clone from scratch.
https://www.youtube.com/watch?v=U7s_CaI93Mo
Agents created a browser, but it doesn't work
https://emsh.cat/cursor-implied-success-without-evidence/
A blog post by embedding-shapes debunks this "success." The author argues that Cursor's experiment is a marketing illusion and that the agents' output is non-working garbage: the project cannot even be built, and running cargo build returns dozens of errors. The agents spent weeks writing code but apparently never tested it for functionality and ignored compilation errors.
This is "AI slop"—generated text that looks like code but lacks real logic or a working structure. The agents simply "inflated" the volume (a million lines) but failed the basic minimum: creating a program that at least launches and opens a simple HTML file. In other words, they created code, not a program.
https://news.ycombinator.com/item?id=46646777
Users (specifically nindalf) looked into the dependency file (Cargo.toml) and discovered that the "browser" uses ready-made components from Servo (the browser engine developed by Mozilla/Igalia) for HTML and CSS parsing, as well as the QuickJS library for JavaScript. Cursor's claim that the agents wrote all of this "from scratch" was therefore deemed a lie: the code the agents generated is mostly "glue" connecting existing third-party libraries.
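For context, this is roughly what such glue looks like. The snippet below is a hypothetical illustration, not code from the Cursor repository, assuming the html5ever crate (the HTML parser from the Servo project): all the actual parsing is done by the library, and the "browser" code merely walks the resulting DOM.

    // Cargo.toml dependencies assumed: html5ever, markup5ever_rcdom
    use html5ever::parse_document;
    use html5ever::tendril::TendrilSink;
    use markup5ever_rcdom::{Handle, NodeData, RcDom};

    // Recursively print element names from the parsed tree.
    fn walk(node: &Handle, depth: usize) {
        if let NodeData::Element { ref name, .. } = node.data {
            println!("{}<{}>", "  ".repeat(depth), name.local);
        }
        for child in node.children.borrow().iter() {
            walk(child, depth + 1);
        }
    }

    fn main() {
        let html = "<html><body><p>hello</p></body></html>";
        // html5ever does the heavy lifting: tokenizing and tree construction.
        let dom = parse_document(RcDom::default(), Default::default())
            .from_utf8()
            .read_from(&mut html.as_bytes())
            .unwrap();
        walk(&dom.document, 0);
    }

The same applies to JavaScript: calling into QuickJS through bindings is integration work, not writing an engine from scratch.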
The community confirmed embedding-shapes' findings: the code does not compile, tests fail, and the commit history shows that the agents simply generated gigabytes of text without any functional verification. The claims about "millions of lines of code" and "autonomous agents" are aimed at managers and investors who won't check the repository; commenters compare the situation to fraud.
#cursor #autonomousagents