With Operator, OpenAI offers an AI agent for almost all activities on the web
Operator can search for and order things on the web independently. The AI agent is still in preview and is initially reserved for US subscribers to ChatGPT Pro.
(Image: Vasin Lee/Shutterstock.com)
OpenAI has released an early version of its first AI agent. Called "Operator", it can perform typical tasks on the internet, such as searching for something and then placing an order. However, critical actions, such as entering payment information, confirming the order or sending a message, require the user's intervention. So far, Operator is only available to ChatGPT Pro subscribers in the USA. The developers promise wider availability in the coming weeks, although according to OpenAI CEO Sam Altman, it may take some time to reach Europe.
Operator uses the web like a human. The AI agent has its own browser (in the cloud), in which it types, clicks and scrolls. The process is transparent for the user: the individual steps are displayed visually, and the human can take over at any time. At the moment, Operator is still a so-called "research preview", an early developer version, so to speak. To allow the AI agent to learn, OpenAI is now making it available to a limited user group. That group pays for the privilege: the company charges 200 dollars a month for a ChatGPT Pro subscription.
GPT-4o and the new AI model CUA as the basis
Operator is based on a new AI model called "Computer-Using Agent" (CUA). CUA combines the image processing capabilities of GPT-4o with reasoning abilities acquired through reinforcement learning. The AI agent is specially trained to operate graphical user interfaces, but errors cannot be ruled out at this stage. If Operator gets stuck, it hands over control to the user. It also refuses to access unsafe websites and stops when it encounters one. In addition, Operator needs the user to log in and to provide payment information.
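The behavior described above, in which the agent looks at the screen, picks the next step and hands over to the human for sensitive actions, can be illustrated with a minimal sketch. It is not OpenAI's implementation and does not use any OpenAI interface: the names `Action`, `SENSITIVE` and `run_agent`, as well as the action labels, are assumptions made purely for illustration, and the model and the browser are replaced by simple stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical labels for steps that, according to the article, need the user.
SENSITIVE = {"login", "enter_payment", "confirm_order", "send_message"}


@dataclass
class Action:
    kind: str         # e.g. "click", "type", "scroll", "login", "done"
    target: str = ""  # free-text description of what the action applies to


def run_agent(policy: Callable[[str], Action],
              observe: Callable[[], str],
              max_steps: int = 20) -> List[str]:
    """Perceive-decide-act loop that stops and hands over on sensitive steps."""
    log: List[str] = []
    for _ in range(max_steps):
        screen = observe()       # stand-in for a screenshot of the cloud browser
        action = policy(screen)  # stand-in for the model choosing the next step
        if action.kind == "done":
            log.append("done")
            break
        if action.kind in SENSITIVE:
            # The agent does not perform the step itself; the user takes over.
            log.append(f"handover:{action.kind}")
            break
        log.append(f"{action.kind}:{action.target}")
    return log


# Scripted stand-in for the model: two routine steps, then a payment step.
script = iter([
    Action("click", "search box"),
    Action("type", "pizza"),
    Action("enter_payment"),
])
print(run_agent(lambda screen: next(script), lambda: "screenshot"))
```

In this sketch the loop ends as soon as a sensitive step comes up, which mirrors the handover described in the article; a real agent would presumably resume once the user has completed that step.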
OpenAI already cooperates with internet companies such as DoorDash, Instacart, OpenTable, StubHub and Uber, so that the user can select these directly in Operator, for example to place food orders, reserve a table in a restaurant, buy tickets or order a ride. The user can also leave it up to the AI agent to decide which provider it considers suitable for the desired task.
Data protection and security
OpenAI promises that any sensitive information entered when using the AI agent, such as login or credit card details, will not be logged. According to OpenAI, Operator is also trained to decline banking transactions and high-stakes decisions such as job applications. However, OpenAI says only that the AI "should" leave the confirmation of an order to the human. This is the intended behavior, but it is not guaranteed.
Operator is supposed to learn from interactions with humans, but users can prevent their own orders from being used to train the AI by switching this off in the ChatGPT settings. Users can also delete all conversations with the AI agent, as well as Operator's browser history. OpenAI has also trained Operator to detect malicious websites. If it detects a possible fraud attempt, for example, the AI agent stops its work and informs the user.
Although Operator is currently only available to ChatGPT Pro subscribers in the US, OpenAI plans to extend the AI agent to Plus, Team and Enterprise users soon. The capabilities will also be integrated into ChatGPT in the future, once OpenAI deems Operator sufficiently secure and able to serve more users at the same time. There are also plans to make the CUA model available via API so that developers can create their own AI agents.
(fds)