Anthropic eases developers' workload: code review by an AI agent team
Code Review for Claude Code uses a team of AI agents to check pull requests for errors in parallel. The goal is to relieve the human bottleneck in code review.
(Image: generated with AI by iX)
Anthropic has introduced a new multi-agent system for automated code reviews, designed to relieve developers, especially when checking AI-generated pull requests (PRs). It is available as a research preview for team and enterprise customers.
As Anthropic describes in its blog, the tool launches multiple agents when a pull request is opened, each working on a different task in parallel: some look specifically for logical errors, while others verify those findings to filter out false positives. A final agent aggregates the results, removes duplicates, and ranks the identified issues by severity. The tool uses models from the Claude 4 series, including Claude Sonnet 4 and Sonnet 4.6, and can be operated via the CLI.
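The described pipeline, reviewer agents fanning out in parallel, verifier agents filtering false positives, and a final aggregation step, can be sketched roughly as follows. All names, the canned findings, and the structure are illustrative assumptions, not Anthropic's actual implementation; a real system would call a Claude model inside each agent function.

```python
# Hypothetical sketch of a parallel review pipeline: fan out reviewer
# agents, verify their findings, then deduplicate and rank by severity.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen -> hashable, so findings can be deduplicated in a set
class Finding:
    file: str
    line: int
    severity: int  # higher means more severe
    message: str


def review_agent(diff: str, focus: str) -> list[Finding]:
    """Stand-in for an agent scanning the diff for one class of issue.

    A real agent would prompt a model here; we return a canned finding
    so the aggregation logic below is observable.
    """
    severity = 2 if focus == "logic" else 1
    return [Finding("app.py", 10, severity, f"{focus} issue")]


def verify_agent(finding: Finding, diff: str) -> bool:
    """Stand-in for a second agent that rechecks a finding (false-positive filter)."""
    return True


def run_review(diff: str, focuses: list[str]) -> list[Finding]:
    # Run one reviewer per focus area in parallel.
    with ThreadPoolExecutor() as pool:
        batches = pool.map(lambda f: review_agent(diff, f), focuses)
    candidates = [f for batch in batches for f in batch]
    # Keep only findings a verifier agent confirms.
    confirmed = [f for f in candidates if verify_agent(f, diff)]
    # Deduplicate, then rank most severe first.
    return sorted(set(confirmed), key=lambda f: -f.severity)
```

Calling `run_review("…diff…", ["logic", "style"])` in this sketch yields the logic finding first, since it carries the higher severity.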
The system scales with the size of the PR: for more than a thousand changed lines, Code Review deploys additional agents and analyzes the full codebase context; for smaller changes, it limits itself to a single pass. On average, an analysis takes about twenty minutes, according to Anthropic. The company has been using the system internally for months: before AI review was introduced, only 16 percent of PRs there received substantive review comments; with AI, the figure rose to 54 percent. For large PRs, the rate now reaches 84 percent, with an average of 7.5 issues found. Anthropic puts the false-positive rate at under one percent, measured by how often developers marked a finding as incorrect.
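The size-based scaling could look roughly like this. The thresholds, agent counts, and field names are assumptions for illustration; only the thousand-line boundary and the single-pass behavior for small PRs come from the article.

```python
# Hypothetical illustration of size-based review planning: large PRs get
# more agents and full-codebase context, small ones a single pass.
def plan_review(changed_lines: int) -> dict:
    if changed_lines > 1000:
        # Assumed agent count; the article only says "more agents".
        return {"agents": 8, "context": "full-codebase", "passes": "multi"}
    return {"agents": 2, "context": "diff-only", "passes": "single"}
```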
Actions available as open source
Billing is token-based; Anthropic estimates an average review at 15 to 25 US dollars. The previously available open-source GitHub Actions will remain on offer as an alternative. Administrators get a dashboard with an overview of reviewed PRs, acceptance rates, and costs, and can set monthly spending limits.
The amount of code produced with artificial intelligence has risen sharply, making code review a bottleneck in many companies as well as in open-source projects. Anthropic, for example, reports a 200 percent increase in output per developer over the past year.
(who)