Manipulated system prompt: Grok fabricated claims about "white genocide"

The AI Grok spread false information about an alleged genocide in South Africa for several hours on Wednesday. xAI responded with a statement.

On Wednesday, the Grok AI model from xAI spent several hours spreading false information about an alleged "white genocide" in South Africa. As Elon Musk's company has now announced, the cause was an "unauthorized change" to the system prompt. xAI pledged improvements, including stronger safeguards and greater transparency by publishing the system prompt on GitHub.

Users noticed the malfunction in a variety of requests to the chatbot: the false statements appeared with no connection to the question asked. After a few hours, the problem was resolved and Grok went back to answering questions normally.

In a statement, xAI calls the change a violation of its "internal guidelines and core values". The company did not explain how an unauthorized person could gain such extensive access to the system prompt, or why the change initially went unnoticed.

However, a 24/7 monitoring team is being put in place to ensure that this does not happen again. It is meant to complement existing automated mechanisms for detecting changes. Additional checks are also planned so that employees cannot change prompts without a review.

The system prompt, which xAI has published on GitHub in several versions, essentially contains technical instructions telling the AI how to handle user requests. The published versions show no sign of attempts to steer the content of its answers. According to xAI, the publication is intended to increase trust in Grok.

There was already a similar incident in February 2025, when an xAI employee who had previously worked at OpenAI made changes to the system prompt. This led Grok to ignore sources accusing Elon Musk and Donald Trump of spreading false information.

(mki)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.