Peer review for Google Jules: AI assistant keeps a close eye on AI assistants

Critic-Augmented Generation checks every suggestion from Google's AI assistant for software development in order to catch subtle errors and gaps in the code.

Robots on the blackboard

(Image: Phonlamai Photo/Shutterstock.com)


Google has introduced a new mode for its AI assistant for software development, Jules. The so-called Critic-Augmented Generation first hands all suggested changes to a critic integrated into Jules, which checks them thoroughly. The critic thus performs a kind of peer review aimed at good code quality.

Google presented Jules at this year's Google I/O in May. Unlike Cursor, the AI agent does not make code suggestions inside the editor but examines projects to track down bugs, create tests, or integrate new features.

A general problem with AI-assisted software development is that the models typically produce code that appears to work flawlessly at first glance. However, they often consider only the typical program flow, so the applications can fail in edge cases.


The AI agents also usually assume well-formed, expected input, so unexpected input can lead to errors or, in the worst case, vulnerabilities. In addition, the AI assistant does not always produce the most efficient code for a given task.
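To make the problem concrete, here is a minimal, hypothetical example (not taken from Google's blog post) of the kind of "happy path" code such tools tend to produce, alongside the defensive variant a reviewer would ask for:

```python
# Hypothetical example: works for typical input, fails on an edge case.
def average_rating(ratings: list[float]) -> float:
    # Assumes the list is never empty -- typical "happy path" code.
    return sum(ratings) / len(ratings)  # ZeroDivisionError for []

def average_rating_safe(ratings: list[float]) -> float:
    # Defensive variant that also handles the empty-input edge case.
    if not ratings:
        return 0.0
    return sum(ratings) / len(ratings)

print(average_rating_safe([4, 5]))  # 4.5
print(average_rating_safe([]))      # 0.0 instead of a crash
```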

Critic-Augmented Generation is designed to catch precisely these errors and weaknesses. The integrated critic does not correct them itself but flags them and returns them to Jules for revision.

As an example, the blog post entitled “Meet Jules' fiercest critic and most valuable ally” mentions code that passes all tests but introduces a subtle logic error. Jules receives the code back with the comment “Output matches expected cases but fails on unseen inputs.”

In the current variant, the critic receives Jules' complete output, checks it, and returns its comments. Jules then improves the code and submits it again for review. The process repeats until the critic is satisfied. Only then does Jules return the changes to the user.
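Conceptually, this amounts to a generate-critique-revise loop. The following sketch only illustrates that control flow; it is not Google's implementation, and the model calls are replaced by trivial stand-ins so the example runs on its own:

```python
# Illustrative sketch of a generate-critique-revise loop -- not Google's code.
# generate(), critique(), and revise() are hypothetical stand-ins for calls
# to the coding agent and its integrated critic.
from dataclasses import dataclass, field

@dataclass
class Review:
    issues: list[str] = field(default_factory=list)

def generate(task: str) -> str:
    return f"# first draft for: {task}"

def critique(code: str) -> Review:
    # Stand-in critic: flags the draft once, then is satisfied.
    return Review(issues=["fails on unseen inputs"]) if "revised" not in code else Review()

def revise(code: str, review: Review) -> str:
    return code + f"\n# revised to address: {', '.join(review.issues)}"

def critic_augmented_generation(task: str, max_rounds: int = 5) -> str:
    code = generate(task)               # agent drafts a change
    for _ in range(max_rounds):
        review = critique(code)         # critic flags issues, does not fix them
        if not review.issues:           # critic is satisfied
            break
        code = revise(code, review)     # agent reworks the flagged parts
    return code                         # only now is the change shown to the user

print(critic_augmented_generation("add input validation"))
```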

Google plans to expand the process in the future so that Jules can also consult the critic on subtasks and so that the critic gains access to external tools such as search engines or code interpreters to execute and verify the code.

(rme)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.