AI designs optimal prompts for text-image generators

The increasingly popular AI algorithms for generating images need suitable inputs. Now a dedicated AI is meant to help find them.

An image generated with the image generator Stable Diffusion.

26 September 2022, 11:00

Reading time: 3 min.

By Ben Schwan

(A German version of this article is also available.)

They are called DALL-E 2, Midjourney, Craiyon or Stable Diffusion - and they are fascinating examples of what is possible today with artificial intelligence (AI) or machine learning (ML). You enter a short text in English, called a prompt, into these software systems - and after a few seconds or minutes, they generate matching images.

It all depends on the input

The results are often astonishing: photorealistic graphics of non-existent landscapes, oil portraits that would not have been created in this way even with a lot of imagination, or simply crazy combinations of motifs that should not actually go together. Just how good the systems are can be seen from the fact that some observers are already speculating about the end of art.

But as entertaining - and almost addictive - as the AI-based text-image generators are, they are not easy to use. The prompt has to be phrased so that the AI "understands" it and actually generates the appropriate images. This has since given rise to user-maintained databases that can serve as inspiration. They list the prompts that were entered along with various other configuration settings. For example, is "dog that looks like a giraffe; oil painting" better than "giraffe dog as an oil painting"? What does the generator understand, and how?

AI helps AI paint pictures

It would be good if you could have an AI help you find the right prompts, which would save a lot of computing and waiting time. And in fact, such systems already exist. The start-up Phraser has developed software that is accessible via the web and already includes adaptations for various text-image generators - currently DALL-E 2, Midjourney, Stable Diffusion, Disco Diffusion and Craiyon. To create a prompt, you click through a simple menu system.

You can choose from different types of art, such as painting, photo or 3D rendering. Then you enter a first descriptive sentence, for which Phraser also gives examples. Conveniently, examples of already generated images always appear on the right, so that you can adapt your prompt yourself. Finally, you can select the style, colouring, textures, resolution, the emotions the image should evoke and even the era of the image. How well Phraser actually works, however, can so far only be checked by users with so-called API access. Without it, the system does not reveal the generated prompt, because it can generate images itself via Stable Diffusion.
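
The menu choices described above can be pictured as a handful of fields that are joined into a single prompt string. The following Python sketch is purely illustrative: it is not Phraser's code, and the `PromptChoices` class, its field names, the `build_prompt` function and the comma-separated output format are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromptChoices:
    """Hypothetical container for the menu selections (names are assumptions)."""
    art_type: str                     # e.g. "oil painting", "photo", "3D rendering"
    description: str                  # the first descriptive sentence
    style: Optional[str] = None
    colouring: Optional[str] = None
    texture: Optional[str] = None
    resolution: Optional[str] = None
    emotion: Optional[str] = None
    era: Optional[str] = None


def build_prompt(choices: PromptChoices) -> str:
    """Join all selected options into one comma-separated prompt string.

    Options that were left unset (None or blank) are skipped.
    """
    parts = [
        choices.description,
        choices.art_type,
        choices.style,
        choices.colouring,
        choices.texture,
        choices.resolution,
        choices.emotion,
        choices.era,
    ]
    return ", ".join(p.strip() for p in parts if p and p.strip())


example = PromptChoices(
    art_type="oil painting",
    description="dog that looks like a giraffe",
    colouring="warm colours",
    era="19th century",
)
print(build_prompt(example))
# prints: dog that looks like a giraffe, oil painting, warm colours, 19th century
```

A real tool would presumably go further and tailor the wording and order to each image generator, since the article notes that Phraser contains adaptations for several of them. The sketch only shows the basic idea of turning structured choices into text.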