AI Image Generators: This Is How Much Electricity Generating Images Requires

Researchers have developed a method that can predict the electricity consumption of AI image generators. This is said to work even with proprietary systems.

Power poles against a cloudy sky (Image: heise online / anw)

By Boris Ruf

An international team of researchers from Stanford University and the insurance group AXA has investigated how the energy consumption of diffusion models, the architecture underlying image-generating AI systems, can be systematically predicted. Popular examples include DALL-E, Midjourney, and Google's Nano Banana. While the high energy consumption of language models such as ChatGPT and other transformer architectures is already widely known and scientifically studied, the equally compute-intensive diffusion models are only now coming into focus for sustainability research.

Boris Ruf is a data scientist at AXA and an expert in sustainable AI.

In their research paper "Energy Scaling Laws for Diffusion Models", which the scientists presented at a workshop of the EurIPS conference in early December, they show how the complexity of these algorithms can be modeled theoretically. From the number of computational operations (FLOPs) required to generate an image, the electricity consumption can then be derived.
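
In outline, the derivation is a unit conversion from compute to energy. The following is a minimal sketch, not the paper's actual model: it assumes a single, made-up hardware efficiency figure (FLOPs per joule), whereas the study fits this relationship empirically per GPU and configuration.

```python
# Minimal sketch of deriving energy from FLOPs, NOT the paper's model:
# it assumes one fixed hardware efficiency figure (FLOPs per joule),
# whereas the study fits this relationship empirically per configuration.

def estimate_energy_wh(flops: float, flops_per_joule: float) -> float:
    """Convert a FLOP count into watt-hours via a hardware efficiency figure."""
    joules = flops / flops_per_joule
    return joules / 3600.0  # 1 Wh = 3600 J

# Both numbers below are invented placeholders for illustration.
print(estimate_energy_wh(5e13, 5e11))  # -> ~0.028 Wh
```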

For the prediction, the researchers adapted OpenAI's Kaplan scaling laws, which were originally developed to predict the performance of language models as a function of model size, data volume, and computational effort. The adapted laws estimate the energy consumption of diffusion models from the required FLOPs. Open-source image generators such as Stable Diffusion, Flux, and Qwen were used for the experiments. The study considers various combinations of hardware, the number of steps in the generation process, image resolution, and computational precision. Nvidia GPUs of the A100, RTX A4000, and RTX A6000 Ada series were used.
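
Schematically, such an adapted scaling law takes the form of a power law relating energy to compute. The sketch below fits invented measurements, not the paper's data; the functional form E = a * C**b is assumed here purely for illustration.

```python
import numpy as np

# Illustrative sketch: fit a power law E = a * C**b relating energy E (Wh)
# to compute C (FLOPs). The measurements below are invented placeholders,
# not data from the study.
flops = np.array([1e13, 5e13, 1e14, 5e14])     # compute per image (assumed)
energy_wh = np.array([0.05, 0.22, 0.45, 2.1])  # measured energy (assumed)

# A power law is a straight line in log-log space: log E = log a + b * log C.
b, log_a = np.polyfit(np.log(flops), np.log(energy_wh), 1)
a = np.exp(log_a)

# Predict the energy of an unseen configuration from its FLOP count alone.
print(f"{a * 2e14 ** b:.2f} Wh")
```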


The result: depending on the configuration, a single image can consume up to ten times as much energy as an average ChatGPT request, which according to OpenAI CEO Sam Altman requires about 0.34 watt-hours. The energy demand varies considerably with resolution in particular – from 0.051 watt-hours per image at 512 × 512 pixels to 3.58 watt-hours at 1024 × 1024 pixels.
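
The factor of ten follows directly from the two reported figures:

```python
# Quick check of the "up to ten times" claim, using the article's numbers.
chatgpt_wh = 0.34        # per request, per Sam Altman
image_1024_wh = 3.58     # per 1024 x 1024 image, per the study
print(image_1024_wh / chatgpt_wh)  # -> ~10.5
```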

The researchers' method is said to work across models: fitted to measurements from one model, it can predict the energy consumption of other architectures, even on different hardware. This makes estimates possible for proprietary, closed systems such as DALL-E or Midjourney, whose operators have not yet published consumption data.

The study offers a comprehensive, scientifically grounded approach to energy planning for AI image generators. Developers can use it to compare different diffusion models in terms of their energy consumption, and providers can estimate the expected energy demand even before a system goes into operation. The researchers hope that these findings will promote the efficient development and deployment of AI-powered image and video generators.

The preprint of the study can be found on arXiv.

Transparency notice: Boris Ruf is a co-author of the presented study.

(dmk)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.