Luma: AI agents for creative projects from audio to video to print

The web app Luma creates audio, image, and video projects using AI agents – from brainstorming and planning to delivery.


With Luma's AI agent, you communicate in natural language, just like with other language models.

(Image: Luma AI)


According to the manufacturer Luma AI, the AI platform Luma generates creative projects such as videos or print campaigns – from concept through intermediate stages like storyboards to the final design using various AI models.

Creatives who, for a video project, would otherwise generate a script in ChatGPT, then a starting image in Midjourney, then individual video clips in Runway ML, search for background music on the web or in a library, and finally combine everything into a video in Adobe Premiere Pro should now be able to do all of this with the help of AI.

According to the manufacturer, AI agents organize the workflow for text, image, video, and audio across AI models from various providers. In addition to the video model Ray3.14 developed by Luma AI itself, the agent also supports the video generators Google Veo 3, OpenAI Sora 2, and Kling AI 2.6. It also supports the image generators Nano Banana Pro, Seedream, and GPT Image 1.5, as well as the music, audio effect, and voice generators from ElevenLabs.

Luma is aimed at creative teams in agencies and marketing departments who want or need to produce content quickly without having to deal with the complex processes behind it.


Luma is available as a web app. After creating a new project, users formulate a prompt and upload sources such as images, text files, PDF documents, audio, or video files (MOV, MP4). The agent analyzes the material, creates a plan, and asks for further feedback. Behind this is the Uni-1 model developed by Luma AI.

Luma generates a kind of storyboard and asks follow-up questions in the chat. Finally, Luma outputs the result in PNG, JPEG, MP4, or MP3 format, depending on the project. Video projects consist of individual clips lasting 4 to 12 seconds, depending on the model used. As a rule, the models support a maximum resolution of 1080p; the Luma agent upscales the clips to 4K.

After entering the prompt, the AI agent formulates a plan in various phases.

(Image: Luma AI)

Behind Luma is the Uni-1 model, a decoder-based, autoregressive transformer that processes language and image tokens in a common token space. This transformer variant is used in many large language models (LLMs).

The model is capable of reasoning in natural language and rendering visual content within the same computation run. Instead of controlling separate systems step by step, Uni-1 plans, visualizes, and generates results in one process – an approach that, according to the provider, is closer to human intelligence than independently operating models.
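To make the idea of a shared token space more concrete, here is a minimal sketch of a decoder-only autoregressive transformer that predicts the next token over a joint text-and-image vocabulary. It is a conceptual illustration in PyTorch, not Luma's actual Uni-1 architecture; all sizes, the tokenizers, and the class name SharedTokenDecoder are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; Luma has not published Uni-1's vocabulary or dimensions.
TEXT_VOCAB = 32_000      # text tokens (e.g. from a BPE tokenizer)
IMAGE_VOCAB = 8_192      # discrete image tokens (e.g. from a VQ-style image tokenizer)
SHARED_VOCAB = TEXT_VOCAB + IMAGE_VOCAB  # one shared token space for both modalities
D_MODEL, N_HEADS, N_LAYERS, MAX_LEN = 512, 8, 6, 1024


class SharedTokenDecoder(nn.Module):
    """Decoder-only autoregressive transformer over a joint text+image vocabulary."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SHARED_VOCAB, D_MODEL)
        self.pos = nn.Embedding(MAX_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=N_HEADS, batch_first=True, norm_first=True
        )
        # An encoder stack plus a causal mask behaves as a decoder-only transformer.
        self.blocks = nn.TransformerEncoder(layer, num_layers=N_LAYERS)
        self.lm_head = nn.Linear(D_MODEL, SHARED_VOCAB)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len); indices below TEXT_VOCAB are text tokens,
        # indices at or above TEXT_VOCAB are image tokens -- both in one sequence.
        seq_len = tokens.size(1)
        x = self.embed(tokens) + self.pos(torch.arange(seq_len, device=tokens.device))
        causal = nn.Transformer.generate_square_subsequent_mask(seq_len).to(tokens.device)
        x = self.blocks(x, mask=causal)
        return self.lm_head(x)  # next-token logits over the shared vocabulary


# A prompt can mix modalities: the model predicts the next token, which may be a
# word piece (continuing a plan in natural language) or an image token (rendering
# visual content), all within the same autoregressive pass.
model = SharedTokenDecoder()
mixed_prompt = torch.randint(0, SHARED_VOCAB, (1, 16))  # placeholder token ids
next_token_logits = model(mixed_prompt)[:, -1]
print(next_token_logits.shape)  # torch.Size([1, 40192])
```

In such a setup, planning text and generating visuals are just different regions of one vocabulary, which is the sense in which a single model can reason and render "in one process".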

According to the manufacturer, the product is available immediately. A subscription costs 30, 90, or 300 US dollars per month, with 10,000, 40,000, or 150,000 credits, respectively. The cost of an individual video clip depends on the video model used. When estimating costs, users should bear in mind that Luma also generates larger projects with multiple clips in one go.

(akr)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.