Questionable: Google's PaliGemma 2 can recognize emotions

Google's freely available AI model PaliGemma 2 can recognize emotions in images. This is actually prohibited in the EU.

Emotion recognition

Good mood, bad mood – AI can recognize this.

(Image: Gino Crescoli, public domain)


Emotion and facial recognition are tricky areas. Both are banned in the EU under the AI Act, albeit with some exceptions. Google's new, freely accessible vision language model PaliGemma 2 can be taught precisely this ability to recognize emotions. The model can recognize emotions out of the box, and this ability can apparently be extended easily through fine-tuning: the AI is provided with images together with labels describing the emotions shown. The same should, of course, also be possible with open AI models from other providers.
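The fine-tuning setup described above boils down to pairing each image with a short prompt and the labelled emotion as the target answer. The following is a minimal, hypothetical sketch of such data preparation; the prompt wording, label set and file names are illustrative assumptions, not PaliGemma 2's actual training format.

```python
# Hypothetical sketch: build supervised fine-tuning records for a
# vision language model. Each record pairs an image path with a prompt
# and the labelled emotion as the target text the model should produce.
# Prompt template and labels are assumptions for illustration only.

def build_finetuning_records(labelled_images):
    """labelled_images: list of (image_path, emotion_label) tuples."""
    records = []
    for image_path, emotion in labelled_images:
        records.append({
            "image": image_path,
            "prompt": "answer en what emotion does this person show?",
            "target": emotion,
        })
    return records

# Example: two labelled images become two training records.
dataset = build_finetuning_records([
    ("faces/001.jpg", "happiness"),
    ("faces/002.jpg", "sadness"),
])
```

Records in this shape would then be fed to a standard fine-tuning loop; the point is simply that nothing more than labelled images is needed to teach the capability.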


Version 2 of PaliGemma has just been launched. It processes text as well as images, so it can, for example, answer questions about an image. According to Google's blog post, the model extracts "detailed and contextually relevant information" from an image and goes beyond "pure object recognition": PaliGemma 2 can "describe actions, emotions and the narrative of a scene".

Because PaliGemma 2 is an open model, i.e. freely accessible, experts are concerned about how easy access to emotion recognition has become. Its use is largely prohibited under the European AI Act: employers, schools and private individuals, for example, may not use it. The situation is different for border control authorities, and systems that monitor whether airplane pilots are tired or alert are also permitted.

However, facial and emotion recognition is not as simple as one might initially think. A smile is easy to detect, but its context also matters for interpreting it correctly, which leads to many mistakes. Facial recognition software is also suspected of having a particularly strong bias: among other things, people with darker skin tend to be wrongly assigned negative characteristics and emotions.

In addition to visual emotion recognition, it has long been possible to detect emotions from the voice, likewise using artificial intelligence. The voice can reveal a great deal about emotions and even illnesses. These systems are also not permitted without restrictions: they are used in call centers, for example, but information about their use must be provided. Under the AI Act, they are subject to certain transparency and documentation obligations.

It is questionable to what extent the ability to recognize emotions in a model like PaliGemma 2 will lead to stricter review and classification under the regulation. So far, vision language models have fallen under a relatively loose special regime as general-purpose AI, which does, however, allow for reclassification if conspicuous risks emerge.

(emw)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.