OpenAI presents deep research: Another new AI agent

Another new AI agent from OpenAI specializes in large-scale research. According to OpenAI, deep research scores 26.6 percent on the benchmark Humanity's Last Exam.

(Image: Shutterstock/ioda)

Feb 3, 2025 at 1:31 pm CET

By Eva-Maria Weiß

The new AI agent from OpenAI is called deep research. It is intended to be particularly good at large-scale research. To do this, deep research uses the recently announced o3 model, a mini version of which has been available for two days. OpenAI's portfolio now includes so many models, agents and subscriptions that it is easy to lose track of which is good for what.

At OpenAI, essentially everything revolves around ChatGPT, and deep research is an agent that is part of the chatbot. According to the blog post, it can handle "multi-layered research on the internet and perform complex tasks". The agent needs only tens of minutes to do what would take a human many hours. This does not include the time a human needs to check the results afterwards. After all, as all AI providers repeatedly emphasize, a human in the loop is needed to approve everything. OpenAI also explains that deep research can hallucinate, but says that this happens less often than with other models.

deep research is based on a version of the announced o3 model that has been optimized for web browsing and data analysis. OpenAI wouldn't be OpenAI if it didn't address its omnipresent goal in the blog post: "deep research represents a significant step towards the development of an AGI that we believe will advance scientific research."

deep research passes the Last Exam

deep research is available via the regular input field in the web version of ChatGPT, but only to people with a paid account. As an example, OpenAI writes that ChatGPT's new AI agent can be used to create a comparison of streaming services. This is not necessarily the kind of task you would first think of as scientific research. Answering such a question is said to take between 5 and 30 minutes. The costs for OpenAI are likely to be correspondingly high. The output is still limited to text, but images and graphics are to follow.

OpenAI writes that deep research is ideal for long searches where accuracy and citations are particularly important. GPT-4o, on the other hand, is the model of choice for multimodal conversations in real time. The new AI agent is also able to correctly answer 26.6 percent of the questions in the benchmark Humanity's Last Exam. The benchmark focuses on scientific topics; according to its developers, previous models were only able to achieve a maximum of ten percent. GPT-4o scores 3.3 percent, o3-mini-medium achieves 10.5 percent and o3-mini-high 13 percent, writes OpenAI.

Another recently introduced AI agent from OpenAI is called Operator. It is also designed to find things on the internet and take on additional tasks. If you give it your credit card details, it can also place orders.