38C3: AI tools must be evaluated before being used in schools

Teachers use AI tools to design work and assess students. Researchers show why the systems should be evaluated before they are rolled out.



Generative artificial intelligence has long since arrived in schools. In some cases, AI teachers are already being tested; in the UK, teacher-free AI classes have been launched as a pilot project. In Germany, teachers are using AI tools to help with assignments and corrections, among other things, with the aim of easing their workload. At the 38th Chaos Communication Congress, researchers took a closer look at the most widespread offering for schools in Germany, which comes from the provider Fobizz. Fobizz now even holds state licenses for Rhineland-Palatinate, Mecklenburg-Western Pomerania and Saxony.

According to the company's own information, over 500,000 teachers at over 7,500 schools use the service. However, the effectiveness and reliability of Fobizz do not stand up to closer analysis, which revealed shortcomings in the design of the prompts used to assess pupils' texts. According to the study by Prof. Rainer Mühlhoff and Marte Henningsen, the tool's evaluations of the texts varied widely. When the same text was processed several times, the grades fluctuated greatly in some cases. For one text, the score varied between one and 14 points across several runs. "It's basically like rolling the dice [...] if I don't like the score, I just press 'redo' and then a new score comes out," explains Mühlhoff. There are further concerns about data protection: "We are dealing with highly sensitive data of minors that is disclosed there and handed over to the companies," warns Henningsen.
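The repeated-run check described above can be illustrated with a short Python sketch. It is not the researchers' method or code. The function names, the consistency threshold and the sample scores are hypothetical; only the extremes of one and 14 points come from the study as reported here.

```python
from statistics import mean, pstdev


def score_spread(scores):
    """Summarize how much repeated gradings of the same text disagree."""
    if not scores:
        raise ValueError("need at least one score")
    return {
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
        "mean": mean(scores),
        "stdev": pstdev(scores),
    }


def is_consistent(scores, max_range=1):
    """Treat a grader as consistent if its scores differ by at most max_range.

    The threshold of one point is an assumption chosen for illustration.
    """
    return score_spread(scores)["range"] <= max_range


# Hypothetical scores for one text graded five times. Only the extremes
# (1 and 14) are reported in the article; the other values are made up.
example_runs = [1, 14, 7, 9, 11]

summary = score_spread(example_runs)
print(summary["range"])             # 13
print(is_consistent(example_runs))  # False
```

A check of this kind only measures whether a grader agrees with itself. It says nothing about whether any single score is correct.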

In addition, the tool often failed to recognize false information in texts and could not reliably distinguish between relevant and irrelevant content. Even when the improvements suggested by Fobizz were incorporated, the tool still criticized the revised texts, unless ChatGPT had been asked for help. Even when used correctly, the tool does not keep its promises. According to the researchers, it does not save as much time as claimed and does not sufficiently relieve the workload, because the results that teachers receive have to be checked carefully.

The study highlights that the current use of such tools may not deliver the promised time savings and relief, and it emphasizes the need to evaluate such technologies carefully before widespread use. Inadequately evaluated AI tools therefore cannot be unleashed on schools with a clear conscience. In response to inquiries to the education authorities of the various federal states, Rhineland-Palatinate stated that an annual license for Fobizz costs 1.75 million euros, according to Henningsen. At the same time, trainee teachers there were not paid over the summer vacation.

Instead of technological solutions such as AI tools, more should be invested in improving the working conditions of teachers and in education as a whole. "We actually need more teachers and therefore [...] more investment in the education system and, if you like, AI where it makes sense. [...] It doesn't seem to make sense here," says Mühlhoff. Instead, political measures are needed to stop the education crisis.

Immediately after the study was published, Fobizz responded to some of the points of criticism and made adjustments to the tool. The prompt was adapted to the study results. Fobizz did not want to disclose the original prompt to the researchers.

After a recommended task was published in the study, Fobizz adapted its prompt accordingly, according to the researchers.

(Image: Chatbots im Schulunterricht!?)

(mack)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.