Why AI image generators won't put all creatives out of work

AI software can create sophisticated illustrations from text. But rumours of the "death of art" are greatly exaggerated.


Ducks in the style of Picasso, generated by Craiyon (formerly Dall-E mini).

(Image: Craiyon)



When I wrote my first article on "artificial creativity" in 2013, I still had a great deal of explaining to do, because most people did not see why it was an issue at all. After all, the so-called Lovelace argument was on everyone's mind - named after Lady Ada Lovelace, the woman who wrote the first algorithms for Charles Babbage and his Analytical Engine: a computer cannot create anything new, only what its programming tells it to. So why write about "creative machines" if they cannot exist?


With Dall-E 2, Midjourney or Stable Diffusion, this argument should be a thing of the past: these AIs turn text prompts into the most adventurous graphics. Not all, but many of the images are technically good - so good that one of them has already won an art competition. The text-to-image generators also prove to be quite imaginative when interpreting their "orders". This goes so far that some people even fear the "death of art" - or at least mass unemployment among illustrators, graphic designers and concept artists.

Sure, text-to-image generators will replace the odd stock photo in cost-optimised media productions. Simon Colton of Queen Mary University of London, who is still working on machine creativity, told me back in 2013 that software would soon "produce unique, authentic works of art for everyone individually and for every taste" - and at "affordable prices". But is this really the end of all art? To answer that question, and to understand the limitations of today's image generators, it helps to look at "computational creativity". Although creative ideas usually come to us suddenly and unexpectedly, they do not fall from the sky. They are not "divine sparks" but the result of cognitive processes that can, at least in principle, also be modelled and reproduced in a computer.

An analysis by Wolfgang Stieler

After studying physics, Wolfgang Stieler switched to journalism in 1998. He worked at c't until 2005, then became an editor at Technology Review, where he covers a broad range of topics from artificial intelligence and robotics to net policy and questions of future energy supply.

The British cognitive researcher Margaret Boden of the University of Sussex distinguishes three types of creativity: combinatorial, exploratory and transformational. Combinatorial creativity is about putting existing parts together in a new way - combining unusual flavours such as mustard and candied fruit in a new recipe, for example.

Explorative creativity" works a little more abstractly: the aim here is to understand the implicit or explicit rules in the creation of an artefact - a text, a picture, a sculpture, a recipe - and to spin it further. A haiku, for example, a short Japanese poem, in German translation consists of - usually - seventeen syllables divided into three lines: The first line has five syllables, the second line seven syllables and the third line five syllables. The poem is written in the present tense, and thematises concrete circumstances and moments - often nature or the seasons - while feelings are not directly described, but only indirectly conveyed. An infinite number of poems can be generated according to this scheme, like possible moves in a chess game.

Transformative creativity" takes place on an even more abstract level: Here the creative person - human or machine - "plays" with the rules of the creative recipe, omits individual rules, invents new ones, or modifies them. The "conceptual space", as Boden has called it, is thus transformed "and it becomes possible to think something new that could not be thought before". The expressionist painters, for example, recognised that the overall impression of a painting is strongly dependent on the distribution of light and dark. Since completely different colours - blue and green, for example - have similar tonal values, colours can be exchanged in a painting without disturbing the structure of light and dark. The pictures then appear alienated, but still not simply colourful, but coherent.

Why am I telling you this? Because what the text-to-image generators do is at best exploratory creativity - and mostly merely combinatorial. What we still marvel at after centuries, the radical, new stuff - often so new that audiences do not understand it at all at first - is transformational. But in order to play with rules, you have to have access to them; ideally, they are available in explicit form. That is not the case with text-to-image generators. They work with learned, implicit representations, which they then recombine and modify. Nothing groundbreakingly new emerges this way. On the contrary, I would even say that most of what is currently being celebrated on social media is a specific kind of nerd material that the public will soon have had enough of.

This is not to say that machines will never master transformational creativity. There are, for example, promising ideas about how to couple learning systems with rule-based machines, and such a combination could prove very exciting. And there are Jeff Hawkins' theories about how our brains manage to build an abstracted model of the world from individual sensory impressions - one key to this being the combination of sensory impressions from different perspectives. But that is another story for another time.

(jle)