Quality assurance measures are essential for GenAI applications

Lars Röwekamp talks about the opportunities and challenges of AI and the critical questions that need to be asked when integrating generative AI.

A bot that sits between people who critically scrutinize it

(Image: created with AI (Dall-E) by the iX editorial team)


Lars Röwekamp is the founder of open knowledge GmbH and, as "CIO New Technologies", is involved in the analysis and evaluation of new software and technology trends. His current focus is on enterprise and cloud computing, big data, and AI, where he concentrates on real-life aspects as well as on design and architecture issues.

iX: If you ask where generative AI can be used, you will find many answers. But asked the other way round: in which processes do you consider its use to be inappropriate? Where are the limits of your enthusiasm?

Lars Röwekamp: (Generative) AI brings with it a great opportunity, but also a great responsibility. Caution is therefore always advised when the well-being of the individual is jeopardized, for example in lending or through social scoring, or when the well-being of society is at risk, for example through filter bubbles.

Is there a fundamental difference in the design of the architecture of a GenAI-based system compared to "normal" software systems?

Normal applications usually act deterministically. GenAI-based applications, on the other hand, work with a certain degree of creativity. This must be taken into account in the application design and architecture.

Suitable quality assurance measures such as the integration of guardrails that match the AI model or the checking of reference data sets are essential. In addition, the results of a GenAI application should be subjected to an additional plausibility check in order to filter out hallucinations.

What are the main risks associated with the development of AI-supported applications?

The biggest risks of an AI-based application are the lack of causality – we are only dealing with correlation – and the insufficient transparency of the basis for an AI prediction.

Depending on the application and criticality, it may therefore make sense to switch to a weaker but more transparent model in order to achieve better user acceptance.

How relevant is the actual understanding of (large) neural networks and machine learning in order to develop applications based on them?

In a naive view, one might assume that an AI model can be offered as a black box inference service and therefore no detailed knowledge of the model itself is required. In reality, however, it is important to understand that an AI model is always based on correlation and not causality.

An AI-based application must take this fact into account and, depending on how critical the application is, provide the end user with appropriate transparency for the AI's decision and, in case of doubt, show alternative paths – without AI.


Is training custom models a viable option for many companies?

Training an AI model is time-consuming and expensive. It also requires a high level of expertise. Training a large GenAI model from scratch is not realistic for most companies. Whenever possible and practical, you should therefore start with a foundation model and fine-tune it to suit your needs.

Even then, sufficient expertise must be available to train a model. Expertise is also required to determine during operation if and when the behavior of the model no longer meets expectations (drift) and when the time has come to retrain it. This type of monitoring and follow-up – including an MLOps pipeline for automatic training – should be an integral part of your AI landscape.
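The drift check mentioned above can be illustrated with a deliberately naive sketch: compare the mean confidence of recent predictions against a baseline window and flag a shift beyond a threshold. The function and threshold are assumptions for illustration; production MLOps pipelines would use proper statistical tests (e.g. Kolmogorov-Smirnov) and feature-level monitoring.

```python
from statistics import mean

def drift_detected(baseline_scores: list[float],
                   recent_scores: list[float],
                   threshold: float = 0.1) -> bool:
    """Flag drift when the mean prediction confidence of a recent window
    deviates from the baseline window by more than `threshold`.
    A toy heuristic, not a substitute for statistical drift tests."""
    return abs(mean(baseline_scores) - mean(recent_scores)) > threshold
```

In an MLOps pipeline, such a check would run on a schedule over logged predictions, and a positive result would trigger an alert or an automatic retraining job.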

Energy efficiency is seen as a standard goal of software architecture. Is this compatible with the use of GenAI models?

AI applications in general and GenAI applications in particular are extremely resource-intensive. A GenAI architecture should therefore always take this aspect into account from the outset. Can a smaller model be used as an alternative? To what extent does multi-level caching help to avoid unnecessary inference? Can cloud-based resources be used in countries that rely heavily on renewable energy? These are just some of the questions that need to be asked.
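The first caching level mentioned above can be sketched as follows. This is a hypothetical example, assuming an exact-match cache keyed on a normalized prompt hash; a second level (not shown) could additionally serve near-duplicate prompts via embedding similarity. The class and its interface are invented for this sketch.

```python
import hashlib

class CachedModel:
    """Wraps an expensive inference function with an exact-match cache,
    so identical prompts skip inference entirely. Illustrative only."""

    def __init__(self, model_fn):
        self._model_fn = model_fn
        self._cache: dict[str, str] = {}
        self.hits = 0  # cache hits = inference calls (and energy) avoided

    def generate(self, prompt: str) -> str:
        # Normalize before hashing so trivially different prompts
        # ("Hello" vs. " hello ") map to the same cache entry.
        key = hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
        if key in self._cache:
            self.hits += 1
        else:
            self._cache[key] = self._model_fn(prompt)  # expensive inference
        return self._cache[key]
```

Every cache hit is an inference call, and thus GPU time and energy, that is never spent, which is why caching appears in the list of architectural levers above.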

The first and most important step in the right direction is to create awareness of this issue in your own project environment and not simply ignore it.

The interview was conducted by Lukas Zühl.

(rme)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.