Meta's LLM conference LlamaCon: Llama goes after the top dog
The conference did not feature any new models, but did include some interesting discussions on the future of LLMs and multimodal models.
(Image: Created with AI (Midjourney) by the iX editorial team)
- Dr. Christian Winkler
On April 29, the first conference on Meta's large Llama language models took place. Those interested can watch the recordings of the keynotes and the closing event.
Anyone interested only in the results did not need to watch the conference in full, as a recap was already available on the Meta blog before the conference. It also makes clear that Meta did not announce the widely expected reasoning model. Anyone looking for something along those lines can turn to Qwen3, for example.
The publicly available content is limited to the keynote, Mark Zuckerberg's fireside chat with Ali Ghodsi, the CEO and founder of Databricks, and the discussion between Mark Zuckerberg and Microsoft CEO Satya Nadella.
Keynote on Llama 4
The conference began with Chris Cox, Meta's Chief Product Officer. His presentation focused mainly on the new Llama 4 models, which Meta trained to be multimodal from the outset. That is indeed a differentiating feature compared with other new models such as Qwen3 or GLM, which focus mainly on text. It is understandable that Cox said little about small models or reasoning models: Meta does not yet offer any.
Cox (and later his colleagues from the development team) did have a few announcements to make, however. For example, an API is now available for Llama that can easily be accessed from different programming languages. Because the API is OpenAI-compatible, existing tools continue to work and only the base URL needs to be swapped. This is old hat, though: almost all freely available serving tools such as llama.cpp, vLLM, or SGLang offer this interface as standard, and they can host Llama models as well. Meta has now merely bundled the whole thing as a package.
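Swapping only the base URL can be sketched as follows. This is a minimal illustration of the OpenAI-compatible chat-completions convention that the tools above share; the URL and model name are placeholders, not confirmed values from Meta's service:

```python
import json
from urllib import request

# Hypothetical base URL: swap in whichever OpenAI-compatible server
# you use (llama.cpp, vLLM, SGLang, or Meta's hosted API).
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

payload = build_chat_request("llama-4-maverick", "Hello!")

# Sending it is a single POST to <BASE_URL>/chat/completions; only
# BASE_URL differs between providers (uncomment against a live server):
# req = request.Request(
#     f"{BASE_URL}/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(request.urlopen(req).read().decode())

print(payload["model"])
```

Because the payload shape is identical everywhere, client libraries and tools built for OpenAI work against any of these servers unchanged.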
The API does offer more than that, though: it lets you upload your own training data, have the models fine-tuned at Meta, and download the modified model weights afterwards. That is extremely practical and much more open than OpenAI's offering. However, other services already provide something similar without being limited to the Llama models.
Fireside Chat
The discussion between Zuckerberg and Ghodsi was interesting. Ghodsi said that language models are already being used in many customer projects. Whether one should agree with his assertion that RAG and information retrieval will become superfluous once generative models offer sufficiently long context windows (Llama supports up to ten million tokens) is questionable, however. Efficiency plays a decisive role here, and embedding models combined with vector databases are more efficient than generative models by many orders of magnitude. That, however, would not have fitted the tone of the conference.
Ghodsi himself did talk about efficiency, though, and would like to see smaller models. Zuckerberg mentioned the internal project "Little Llama", which is supposed to deliver exactly that. Reasoning and tighter agent integration, which Ghodsi also asked for, are things Meta cannot offer at the moment. All of this is covered much better by Alibaba's Qwen3 models, released on the eve of the conference. At least the word Qwen was briefly mentioned in passing.
While 30,000 participants watched the keynote online (it was streamed live on Facebook and YouTube), the number quickly dropped to fewer than 10,000 for the next item on the program. That is surprising, as one would expect Zuckerberg to draw a large audience.
Mark Zuckerberg meets Satya Nadella
After a five-hour (!) break, the event finally continued. Presumably other sessions were taking place in the meantime, but thanks to Meta's poor communication, remote viewers were left in the dark.
It is interesting that Mark Zuckerberg chose Satya Nadella of all people as his conversation partner. Given the recent estrangement between Microsoft and OpenAI, Nadella was presumably happy to accept the invitation. The discussion remained superficial, however; Nadella came across as far more technically versed and pointed to the progress IT as a whole has made in recent years, even if Moore's Law no longer applies without qualification.
Things got interesting when Zuckerberg asked what proportion of code at Microsoft is generated. Nadella put it at 20 to 30 percent and explained that it depends on the type of code: code for test cases, in particular, can be generated well. When asked in return what the ratio is at Meta, Zuckerberg did not know. Nadella also spoke at length about agents, which play a major role in software development. Whether the hype will actually lead to usable software in the foreseeable future remains to be seen.
Finally, Zuckerberg praised his Llama models and claimed that Maverick is just as good as DeepSeek, only much smaller. The latter is indisputably true, but in the (recently controversial) LM Arena benchmark, Maverick ranks 38th while DeepSeek ranks seventh. Zuckerberg's claim sounds like wishful thinking.
The discussion then turned to infrastructure and ever smaller models. Zuckerberg explained that the Llama 4 models are built to run well on H100 GPUs because Meta uses that setup internally. Since few people have access to such hardware, much smaller models are still needed for everyday work. Anyone familiar with the far smaller Qwen3 models, which are absolutely competitive with Llama 4 in terms of performance, could only smile at that statement.
Even though Meta organized LlamaCon, it became clear during the discussion that Satya Nadella has a much more concrete vision of the future for large language models than Mark Zuckerberg. It will be interesting to see whether the two companies will further strengthen their collaboration.
Without questions from the audience
Open source was emphasized again and again, as the Llama models are freely available. In the EU, however, this is only partially true: the license there prohibits the use of multimodal models, and all Llama 4 models are multimodal. Neither this nor the competition (some of whom offer far more liberal licenses) was mentioned. It is a pity that no questions from the audience were possible.
This leaves the impression that Meta could have made more of the event. After the controversial Llama 4 release, Meta, once the open-source market leader in language models, seems to have become just one of many followers, and currently with rather modest success. That could change quickly: a year ago, no one would have expected Google of all companies to be at the forefront of LLMs today.
(olb)