ChatGPT gets new image AI that plans ahead and searches the web

OpenAI's new image model "ChatGPT Images 2.0" plans and structures visual tasks, uses web search, and creates coherent image series on demand.

Open photobook with two black and white images: street scene and woman knitting in the subway.

OpenAI's new image AI imitates a photobook with spontaneous street shots from 1970s New York in the style of 35mm photography.

(Image: OpenAI)

OpenAI has introduced a new image model. The central innovation of “ChatGPT Images 2.0” is the thinking mode. When it is selected, the model structures a visual task and, as needed, draws on current information from the web in addition to its existing knowledge before generating an image. The model's knowledge now extends to December 2025, which is intended to ensure better results, especially for infographics, educational materials, and explanatory illustrations. In thinking mode, the model can also create a coherent series of up to eight images in one pass. The images build on one another, with consistent characters and objects, a continuous visual design, and thematic continuity.

The model has also improved in realism. According to OpenAI, it renders light, textures, and fine details more consistently and can also incorporate minor irregularities that make images appear more natural. This brings the output closer to photographs or cinematic scenes.

Another focus is text rendering. While previous models often had problems with longer or more complex texts, ChatGPT Images 2.0 is said to work much more reliably here. This also applies to non-Latin writing systems. According to OpenAI, the rendering of languages such as Japanese, Korean, and Chinese has been improved so that text not only appears correct but also fits coherently into the overall image.

At the same time, the model is said to implement prompts more precisely overall, place objects more reliably, and represent complex layouts with text, symbols, or UI elements more consistently. According to OpenAI, these improvements also facilitate the creation of user interfaces and screenshots.

The improved model supports aspect ratios from 3:1 to 1:3, making it suitable for formats ranging from banners and presentation slides to posters, smartphone views, and social media graphics, according to OpenAI.
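The stated range can be expressed as a simple check. The following is a minimal sketch, assuming the limits of 3:1 and 1:3 are inclusive; the function name `is_supported_aspect_ratio` is hypothetical and not part of any OpenAI interface.

```python
from fractions import Fraction

# Limits taken from the article: aspect ratios from 3:1 to 1:3.
# Assumption: both limits are inclusive.
MAX_RATIO = Fraction(3, 1)
MIN_RATIO = Fraction(1, 3)


def is_supported_aspect_ratio(width: int, height: int) -> bool:
    """Return True if width:height lies between 1:3 and 3:1 (hypothetical helper)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Fraction avoids floating-point rounding at the boundaries.
    ratio = Fraction(width, height)
    return MIN_RATIO <= ratio <= MAX_RATIO


if __name__ == "__main__":
    for w, h in [(1920, 1080), (3000, 1000), (4000, 1000), (1080, 1920)]:
        print(f"{w}x{h}: {is_supported_aspect_ratio(w, h)}")
```

Under these assumptions, a 16:9 slide or a 9:16 smartphone view falls within the range, while a 4:1 banner does not.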

OpenAI's announcement contains numerous examples of these and other new capabilities of the image generator.

This is not a real screenshot.

(Image: heise medien)

ChatGPT Images 2.0 is now available in ChatGPT, with advanced features such as thinking mode reserved for users of ChatGPT Plus, Pro, and Business. The image model is also available in the Codex programming tool, enabling visual workflows directly in the development environment. Developers can also integrate the functions into their applications via the “gpt-image-2” interface. Costs are primarily based on quality and resolution. Compared with the predecessor, larger formats are sometimes cheaper, while standard resolutions cost more. API outputs above 2K are still in beta because they are considered error-prone.

ChatGPT Images 2.0 follows the image generation launched in March 2025 with GPT-4o, which was further improved in December 2025 with GPT-Image-1.5. With the new model, OpenAI is catching up with competing image generators such as Google's Nano Banana Pro, which combined image generation with analysis capabilities and brought advances in text rendering and multilingualism.

(wpl)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.