xAI releases Grok 3 – Musk-style model family

Content can "contradict what is politically correct". xAI releases the Grok 3 family, including a reasoning model.

A team from xAI and Elon Musk in the livestream.

(Image: Screenshot xAI)

Feb 18, 2025 at 9:17 am CET

By Eva-Maria Weiß

Elon Musk's AI company xAI is releasing a new family of AI models. Here, family means a small model carrying the suffix "mini" and a large version, each with an associated reasoning model. Both beat competing models in some benchmarks, according to xAI and its boss Elon Musk. Grok 3 is to be made available in the coming days; there is no more precise schedule. Paying subscribers will get access first.

Grok 3 was presented in a livestream on X. Musk had announced the release beforehand, writing that xAI would show “the best model ever”. Anyone who tuned in had to watch a good 20 minutes of black screen before the developers got started. They had prepared a presentation, all in Musk's signature black. The boss did not miss the opportunity to sit in and correct his employees. He said, for example, that they were continuously working on Grok 3 and that the model was possibly getting better every day. In fact, once the training of an AI model is finished, it is finished. After that, there is only so-called fine-tuning, i.e., changes to the guardrails or the addition of specialized knowledge.

Right at the start, the team emphasizes how much faster xAI is at developing its AI models. They only started in 2023, while other companies got started before 2019. What they don't say is that the basic technology and architecture had long ceased to be new, even in 2019. What changed was that enough data and the necessary computing power became available for the first time, which led to the results seen since ChatGPT. It is therefore relatively easy for any company with sufficient financial resources to launch its own development at short notice.

According to the developers, xAI used a data center with 200,000 GPUs in Memphis for training. Grok 3 was developed with ten times the computing power of Grok 2. Musk said in the livestream that it was the ultimate truth-seeking AI, even where that is sometimes “at odds with what is politically correct”.

Grok 3 superior in benchmarks

Grok 3 is said to beat GPT-4o in the AIME math benchmark and in GPQA, a test with questions from physics, biology, and chemistry. The model also performed particularly well during its test phase in the Chatbot Arena, where it ran under the code name "chocolate". There, models compete against each other and users choose their preferred answer, which is said to have generally been Grok 3's. The reasoning version of Grok 3 is also said to be better than OpenAI's o3-mini in some benchmarks.
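How pairwise user votes can turn into a ranking may be easier to see in a small sketch. The following uses an Elo-style update, a common way to rank from head-to-head comparisons. xAI's presentation did not describe how the arena computes its ranking, so the method, the model names, the starting ratings, and the factor `k` are all assumptions made for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A's answer is preferred over B's under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one user vote.

    k is a hypothetical step size; larger values react faster to each vote.
    """
    expected_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    # What one model gains, the other loses, so the rating sum stays constant.
    return rating_a + delta, rating_b - delta


# Hypothetical models and votes, not real arena data.
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [
    ("model_a", "model_b", True),   # user preferred model_a's answer
    ("model_a", "model_b", True),
    ("model_a", "model_b", False),  # user preferred model_b's answer
]
for a, b, a_won in votes:
    ratings[a], ratings[b] = update_ratings(ratings[a], ratings[b], a_won)

print(ratings)
```

A model that wins more votes than its rating predicts moves up, and a win against a higher-rated opponent counts for more than one against a lower-rated opponent.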

xAI showed these benchmark results for Grok 3 at the presentation.

(Image: Screenshot xAI)

Reasoning means that models go through a kind of thought process and try different paths before deciding which one is best. This process is usually visible to users. That is not the case with Grok 3, where at least part of the thought process is to remain hidden in order to prevent model distillation, in which the knowledge of a large model is transferred to a smaller one. The Chinese company DeepSeek is accused of having used this method illegally.
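The idea behind distillation can be illustrated with a minimal sketch of the classic soft-label formulation, in which a small "student" model is trained to match the output distribution of a large "teacher". This is only one textbook variant and an assumption here: distilling from a chatbot's visible answers would instead train on generated text, and nothing in the presentation says which form xAI has in mind. The logits and the temperature are invented for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's distribution to the student's.

    The loss is zero when the student reproduces the teacher exactly and
    grows the more the two distributions differ.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores for three possible next tokens.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 2.0, 1.0]
print(distillation_loss(teacher, student))
```

Training the student means adjusting its parameters so that this loss shrinks across many examples. The less of the teacher's output is visible, the less there is for a student to imitate.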

Grok also comes with a form of AI agent: DeepSearch is supposed to help with more extensive searches and provide a summary. Musk has also announced a voice mode. Grok 2 is to be made available as open source in the coming months.