Anysphere, the company behind the wildly popular vibe coding platform Cursor, is officially launching a new tool designed to spot errors in code. The release comes as the rise of AI-assisted coding pushes software developers to ship code at a higher velocity than ever before.
The new tool, Bugbot, integrates with GitHub, the platform where engineers store and review their code. When a human or an agent introduces changes, Bugbot automatically flags errors. That’s useful for human coders, but it’s particularly valuable when using AI coding agents, which work incredibly fast and can introduce errors that are difficult for humans to spot and untangle.
Anysphere sees the tool’s release as an opportunity to lure more potential vibe-coders onto the Cursor platform. “Our core product is giving you software engineering superpowers, but software engineering goes beyond just writing code in your editor,” Jon Kaplan, an engineer at Anysphere, tells WIRED. “Bugbot is one of the ways we’re now stepping out of the editor.”
Last month, Anysphere invited a few thousand engineering teams to beta-test the new tool. Now the company is making it publicly available for $40 per month per person. (Annual customers will get a discount.) That means existing Cursor customers, who pay between $20 and $200 per month depending on the level of premium features, will now pay an additional $40 for access to Bugbot.

A screenshot of how Cursor displays Bugbot to its users.
Courtesy of Cursor

Anysphere, which was founded in 2022 and has around 60 employees, has raised $900 million from marquee firms like Andreessen Horowitz and Thrive Capital, alongside angel investors like Google chief scientist Jeff Dean, Stripe CEO Patrick Collison, and former GitHub CEO Nat Friedman (who now works at Meta Superintelligence Labs). The startup counts OpenAI, Shopify, Instacart, Midjourney, Discord, and Rippling among its thousands of customers. Even Alphabet CEO Sundar Pichai has copped to vibe coding with Cursor.
But the competitive landscape for AI-assisted coding platforms is crowded. Startups Windsurf, Replit, and Poolside also sell AI code-generation tools to developers. Cline is a popular open-source alternative. GitHub’s Copilot, which was developed in collaboration with OpenAI, is described as a “pair programmer” that auto-completes code and offers debugging assistance.
Most of these code editors rely on a combination of AI models built by major tech companies, including OpenAI, Google, and Anthropic. For example, Cursor is built on top of Visual Studio Code, an open-source editor from Microsoft, and Cursor users generate code by tapping into models like Google’s Gemini, DeepSeek, and Anthropic’s Claude Sonnet.
Several developers tell WIRED that they now run Anthropic’s coding assistant, Claude Code, alongside Cursor (or instead of it). Since May, Claude Code has offered various debugging options: it can analyze error messages, work through problems step-by-step, suggest specific changes, and run unit tests.
All of which raises the question: How buggy is AI-written code compared to code written by fallible humans? Earlier this week, the AI code-generation tool Replit reportedly went rogue and made changes to a user’s code despite the project being in a “code freeze,” or pause. It ended up deleting the user’s entire database. Replit’s founder and CEO said on X that the incident was “unacceptable and should never be possible.” And yet, it was. That’s an extreme case, but even small bugs can wreak havoc for coders.
Anysphere didn’t have a clear answer to the question of whether AI-generated code demands more AI-powered debugging. Kaplan argues the question is “orthogonal to the fact that people are vibe coding a lot.” Even if all of the code is written by a human, it’s still very likely that there will be bugs, he says.
Anysphere product engineer Rohan Varma estimates that on professional software teams, as much as 30 to 40 percent of code is now generated by AI. This is in line with estimates shared by other companies; Google, for example, has said that around 30 percent of its code is now suggested by AI and reviewed by human developers. Most organizations still make human engineers responsible for checking code before it’s deployed. Notably, one recent randomized controlled trial with 16 experienced coders suggested that they took 19 percent longer to complete tasks when using AI tools than when they weren’t allowed to use them.
Bugbot is meant to supercharge that review process. “The heads of AI at our larger customers are looking for the next step with Cursor,” Varma says. “The first step was, ‘Let’s increase the velocity of our teams, get everyone moving quicker.’ Now that they’re moving quicker, it’s, ‘How do we make sure we’re not introducing new problems, we’re not breaking things?’” He also emphasizes that Bugbot is designed to spot specific kinds of bugs: hard-to-catch logic bugs, security issues, and other edge cases.
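To make that concrete, here’s a minimal, hypothetical example (ours, not Anysphere’s) of the kind of boundary-condition logic bug that runs without complaint and sails through a casual review, yet contradicts the code’s stated intent:

```python
# A hypothetical illustration of a logic bug that survives casual review:
# the code runs fine and the diff looks harmless, but it quietly
# contradicts what the code claims to do.

def shipping_fee(order_total: float) -> float:
    """Orders of $50 or more ship free."""
    # Bug: '>' should be '>='. An order of exactly $50.00 is charged
    # shipping, contradicting the docstring. No exception is raised and
    # no build fails; only a reviewer comparing intent to code sees it.
    if order_total > 50.00:
        return 0.0
    return 5.99

print(shipping_fee(50.00))  # 5.99, though the docstring promises free shipping
```

Nothing crashes and no test fails unless someone thought to write one for that exact boundary, which is why this class of bug is a natural target for automated review.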
One incident that validated Bugbot for the Anysphere team: A couple of months ago, the (human) coders at Anysphere realized that they hadn’t gotten any comments from Bugbot on their code for a few hours. Bugbot had gone down. Anysphere engineers began investigating and found the pull request responsible for the outage.
There in the logs, they saw that Bugbot had commented on the pull request, warning a human engineer that the change would break the Bugbot service. The tool had correctly predicted its own demise. Ultimately, it was a human who broke it.