New OpenAI model GPT-4o mini sends GPT-3.5 into retirement

With GPT-4o, OpenAI presents a small, fast and, above all, very cost-effective language model. It will replace GPT-3.5 in ChatGPT.

(Image: Tada Images/Shutterstock.com)

Jul 19, 2024 at 9:13 pm CEST

3 min. read

iX Magazin

By

Ulrich Wolf

This article was originally published in German and has been automatically translated.

As the basis for ChatGPT, GPT-3.5 is probably the most influential major language model of all. Now it has to make way for a successor, which OpenAI has now presented. GPT-4o mini is a smaller version of GPT-4o, which OpenAI released in May. Like its big brother, it is designed as a multimodal model, but is still limited in this respect at the moment. It can already process graphical input via the API. The generation of image, video and audio output is to follow in the future.

Multimodal model with great performance for the money

GPT-4o mini is trained on data that will last until October 2023. Although the context window of 128k tokens is around eight times larger than that of GPT-3.5 Turbo, it is still smaller than that of Claude 3 Haiku, Anthropic's smallest model. However, it can generate 16,000 output tokens, which is far more than most comparable models. Even Claude Sonnet, which competes in the next size class up, only manages half that. In terms of output speed, GPT-4o mini is also at the forefront of the most common LLMs, with 166 tokens per second.

As expected, the benchmarks published by OpenAI also put it ahead of smaller models such as Haiku or Gemini Flash from Google, but the gaps are not particularly large. OpenAI really scores with its price-performance ratio, at least for now. This is because Anthropic and Google have both aligned themselves with the price expectations of GPT-3.5 and are therefore currently significantly more expensive.

In the common AI benchmarks, GPT-4o is currently at the top of the "small" large language models. But the gaps are not particularly large.

(Image: OpenAI)

GPT-4o mini costs 15 cents for one million input tokens and 60 cents for one million output tokens. This is around 60 percent less than its predecessor. By comparison, OpenAI charges 5 US dollars per million input tokens and 15 dollars per million output tokens for the large GPT-4.o. In other words, more than thirty times the price. In other words, more than thirty times as much.

According to OpenAI, GTP-4o mini is the company's first AI model to use Instruction Hierarchy. This technique allows the model to prioritize certain instructions over others. This is intended to make it more difficult for users to carry out prompt injection attacks, jailbreaks or system prompt extractions that bypass built-in fine-tuning or instructions specified by a system prompt.