20 new AI models published: Why Apple is showing its hand

In the run-up to WWDC, Apple's AI releases were read as a hint that the company was planning something. Why Cupertino continues to rely on open source.

Apple Intelligence is likely to include some of the AI models Apple has now released.

(Image: Apple)

This article was originally published in German and has been automatically translated.

Since the WWDC developer conference at the beginning of June, it has been clear that Apple has not only developed its own AI models and pointedly published some of them online ahead of the keynote: the iPhone maker also has concrete plans to put them to use with Apple Intelligence as soon as possible. Users with a US English language setting should be able to try this as early as the fall. One week after WWDC, Apple has now uploaded 20 new Core ML models and four new data sets to an open-source platform.

Apple chose the AI open-source platform Hugging Face for the upload. The California-based company had already published four OpenELMs (Open Efficient Language Models) there in April. Before that, in October 2023, Apple uploaded the AI model Ferret to GitHub, which can understand user interfaces, anticipating a new capability that has since been announced for the voice assistant Siri.

The new models include Depth Anything, which can compute depth information for photos after the fact in order to distinguish objects in the foreground from the background. According to Apple, the model was trained on around 600,000 labeled images plus a further 62 million unlabeled training images. Also in the image domain are the ML model FastViT for image classification and DETR for semantic segmentation, i.e. labeling or categorizing an image at the pixel level. The data sets include FLAIR, which comprises around 430,000 photos from 51,000 Flickr users.
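For developers, the published packages behave like any other Core ML model. The following Swift sketch shows the general pattern, assuming one of the depth estimation packages has been downloaded from Hugging Face and compiled; the file name, and whether the model returns a pixel buffer or a multi-array, are assumptions rather than details taken from Apple's documentation.

import CoreML
import CoreVideo
import Foundation
import Vision

// Minimal sketch (not Apple sample code): run a depth estimation model that
// was downloaded as a .mlpackage and compiled, e.g. in Xcode or via
// MLModel.compileModel(at:). The concrete package name is an assumption.
func estimateDepth(imageURL: URL, compiledModelURL: URL) throws {
    let config = MLModelConfiguration()
    config.computeUnits = .all  // let Core ML choose CPU, GPU or Neural Engine
    let model = try MLModel(contentsOf: compiledModelURL, configuration: config)
    let visionModel = try VNCoreMLModel(for: model)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Depth models typically return a single-channel map; depending on the
        // export it may arrive as a pixel buffer or as an MLMultiArray.
        if let depth = request.results?.first as? VNPixelBufferObservation {
            print("Depth map:",
                  CVPixelBufferGetWidth(depth.pixelBuffer), "x",
                  CVPixelBufferGetHeight(depth.pixelBuffer))
        }
    }
    request.imageCropAndScaleOption = .scaleFill

    try VNImageRequestHandler(url: imageURL).perform([request])
}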

The newly released AI models can also be related to the first Apple Intelligence features the company has presented: Genmoji for creating emoji from text input, Image Wand for turning notes or rough sketches into images, Image Playground for generating images from keywords, and Clean Up, a Photos app function for removing unwanted objects from pictures. What they all have in common is that they are designed to process information on-device.

That Apple is being so open may surprise those who know the manufacturer primarily through its products, but it is not unusual in light of the company's other open-source activities. Apple has also published its Swift programming language as open source on GitHub and collaborates with the international community there. Unlike in product marketing, Apple is also open about its plans in this area.

In the case of the AI models, Apple promised transparency at WWDC in order to build trust in its AI efforts. The publication on Hugging Face may also be intended to encourage developers to work with AI models in their own apps and to shift data processing away from large cloud services onto the device, which is friendlier to privacy. Apple itself would benefit from that development as well, since it could boost sales of current devices equipped with the latest, more powerful Neural Engine.

(mki)