Llama 3.3: Meta releases new AI model

The new version of Meta's Llama LLM is now available, along with a benchmark comparison against other AI models. Here is how Llama 3.3 performed.

(Image: Meta sign at a property entrance. Michael Vi/Shutterstock.com)

Meta has released a new version of its large language model (LLM) Llama. Llama 3.3 70B is designed to be easier and more cost-efficient to operate.

Ahmad Al-Dahle, Vice President for Generative AI at Meta, announced Llama 3.3 on X and published a comparison overview pitting Llama 3.3 against Amazon's Nova Pro, Google's Gemini Pro 1.5 and OpenAI's GPT-4o.

According to the overview, the comparison used various established AI benchmarks, each of which defines a fixed data set for testing a model's performance in a particular area. Although Meta discloses a great deal of information about its Llama model family, the exact training data remains under wraps. Llama achieved its best ranking in the "Instruction Following" category, i.e. following instructions precisely.

Meta used the IFEval benchmark for this, which comprises around 500 prompts with verifiable tasks, for example: "Write more than 400 words and mention the keyword AI at least three times" (in English, like all IFEval prompts). Llama 3.3 answered 92.1 percent of the prompts correctly, which, together with Amazon's Nova Pro, makes it the joint frontrunner in the comparison compiled by Meta itself.
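What makes IFEval attractive is that such constraints can be checked by a program rather than a human rater. A minimal sketch in Python of what a verifier for the example prompt above could look like (the function and its thresholds are illustrative, not IFEval's actual code):

    def check_response(text: str) -> bool:
        """Check the example constraints: more than 400 words and
        the keyword "AI" mentioned at least three times."""
        words = text.split()
        mentions = sum(1 for w in words if w.strip('.,;:!?()"').upper() == "AI")
        return len(words) > 400 and mentions >= 3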

In the "Long Context" category, Llama 3.3 achieved a hit rate of 97.5 percent; only the older Llama 3.1 scored even higher, at 98.1 percent. The underlying "NIH/Multi-needle" test (NIH = needle in haystack) requires the model to find a specific string of characters within a long context.
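The principle behind such needle-in-a-haystack tests is straightforward: a short, unique string is planted at a random position in a long filler text, and the model is asked to retrieve it. A rough sketch of the idea (the helper and its names are hypothetical, not Meta's actual test harness):

    import random
    import string

    def build_needle_test(filler: str, depth: float = 0.5) -> tuple[str, str]:
        """Plant a random 'needle' string at a relative depth in a long
        filler text and build a retrieval prompt around it."""
        needle = "".join(random.choices(string.ascii_lowercase + string.digits, k=12))
        pos = int(len(filler) * depth)
        context = filler[:pos] + f" The magic token is {needle}. " + filler[pos:]
        prompt = context + "\nWhat is the magic token mentioned in the text above?"
        return prompt, needle  # pass if the model's answer contains `needle`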

For Gemini Pro 1.5, however, Meta uses a comparative value that also appears in a Google research paper on the model. Llama 3.3 also achieved particularly good results on the multilingual MGSM dataset, which involves solving 250 school-level math problems in ten different languages: Llama 3.3 reached 91.1 percent, with only Llama 3.1 scoring slightly higher at 91.6 percent. In some cases, the new model performs slightly worse than its predecessors, presumably a trade-off for the operational and cost-efficiency gains touted by Al-Dahle.
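Scoring on MGSM-style math benchmarks usually comes down to exact-match comparison of the final numeric answer, repeated per language. A simplified sketch, assuming a generic generate callable rather than the real evaluation harness:

    import re

    def exact_match_accuracy(samples, generate):
        """samples: (question, gold_numeric_answer) pairs in one language;
        generate: any callable that returns the model's text reply."""
        correct = 0
        for question, gold in samples:
            reply = generate(question)
            numbers = re.findall(r"-?\d+(?:\.\d+)?", reply)
            # Convention: treat the last number in the reply as the final answer.
            if numbers and float(numbers[-1]) == float(gold):
                correct += 1
        return correct / len(samples)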

Llama 3.3 is now available for download from Hugging Face and the Meta website. According to Meta, the Llama AI models have been downloaded a total of 650 million times to date, and Meta CEO Mark Zuckerberg said in an Instagram reel that 600 million people now use Llama every month. The Llama models are free to use for research and commercial purposes under certain conditions; only platforms with more than 700 million monthly active users require a special license from Meta.
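Anyone who wants to try the model locally can load it with the usual Hugging Face tooling, as sketched below; the repository id and the hardware requirements should be verified against the model card, since the 70B checkpoint needs substantial GPU memory or a quantized variant:

    from transformers import pipeline

    # Requires access to the gated repository on Hugging Face (license
    # acceptance); device_map="auto" also needs the accelerate package.
    generator = pipeline(
        "text-generation",
        model="meta-llama/Llama-3.3-70B-Instruct",
        device_map="auto",
    )
    result = generator("Summarize Llama 3.3 in one sentence.", max_new_tokens=64)
    print(result[0]["generated_text"])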

In November, it became known that the Chinese military was using Meta's Llama AIs for its own purposes. As a result, Meta also allowed the US government to use its AI for national security purposes. Recently, Meta decided not to launch a Llama version in the EU due to fears of non-compliance with EU regulations.

For the upcoming Llama generation 4, Mark Zuckerberg expects a tenfold increase in the computing power required to train the models. Llama 4 is expected to be completed in 2025.

(nen)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.