Stroop effect: AI models fail classic attention test

When color words are printed in a mismatched color, people take longer to name the color, but they manage it. AI models do not.

Jun 3, 2026 at 4:03 pm CEST

By Martin Holland

AI text generators, understandably, fail a classic psychology test: they cannot correctly name the color in which a color word is displayed when the word and the color do not match. This was discovered by a US research team that had GPT-4o, Claude 3.5 Sonnet, GPT-5, Claude Opus 4.1, and Gemini 2.5 complete the so-called Stroop test.

When word and color did not match, the AI models could correctly identify the visible color only for a few words. When more words were presented, the error rate rose sharply. When matching and mismatched pairs of color and word alternated, the AI models were consistently wrong. The team believes the finding must be taken into account in the development of artificial general intelligence.

More words, significantly more errors

The Stroop test is used clinically to assess how well people can suppress an automatic reaction. Words are printed in colored type, and the test subjects are asked to say the color of the type while ignoring the meaning of the word. People do take slightly longer on average when the color and the meaning (“red”, “blue”) do not match, explains the research team led by Suketu Chandrakant Patel from the City University of New York. Nevertheless, they can “achieve stable and highly precise performance even with long word lists.” The same cannot be said for the AI models examined; quite the opposite.

As the research group explains, the results are best, as expected, when the AI models have to name font colors that match the respective word. But even then, performance drops at 40 words. If word and color do not match, naming only works for a few words; with ten words, the hit rate drops to 60%, and with 40, it is below 20%. If the color matches in some cases but not in others, the models completely lose their footing; with 40 words, the hit rate drops to 0%. The models were completely correct under only one condition: when neutral strings of “X” characters were presented instead of color words, each with as many characters as the respective color word has letters.
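The four test conditions described above can be illustrated with a minimal Python sketch. It is not the research team's code: the four-color palette, the function names `make_trials` and `hit_rate`, the representation of each item as a (displayed text, font color) pair, and the strict alternation in the mixed condition are all assumptions made for illustration.

```python
import random

# Assumed palette; the article does not say which colors the study used.
COLORS = ["red", "blue", "green", "yellow"]


def make_trials(condition, n, rng):
    """Build n hypothetical Stroop items as (displayed_text, font_color) pairs."""
    trials = []
    for i in range(n):
        ink = rng.choice(COLORS)
        # A color word that differs from the font color.
        other = rng.choice([c for c in COLORS if c != ink])
        if condition == "congruent":
            text = ink                       # word matches the font color
        elif condition == "incongruent":
            text = other                     # word contradicts the font color
        elif condition == "mixed":
            # Assumption: matching and mismatched items strictly alternate.
            text = ink if i % 2 == 0 else other
        elif condition == "neutral":
            # As many X characters as the color word has letters.
            text = "X" * len(other)
        else:
            raise ValueError(f"unknown condition: {condition}")
        trials.append((text, ink))
    return trials


def hit_rate(trials, answers):
    """Share of answers that name the font color, not the displayed word."""
    if not trials or len(trials) != len(answers):
        raise ValueError("need one answer per trial and at least one trial")
    hits = sum(
        answer.strip().lower() == ink
        for (_, ink), answer in zip(trials, answers)
    )
    return hits / len(trials)


if __name__ == "__main__":
    rng = random.Random(0)
    trials = make_trials("incongruent", 10, rng)
    # A respondent that simply reads each word aloud never names the font color.
    print(hit_rate(trials, [text for text, _ in trials]))
```

In this sketch, a respondent that reads the word instead of naming the color scores zero in the mismatched condition by construction, which is the failure mode the test is designed to expose.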

It is not new that AI models cannot reliably determine which parts of a text require attention and which do not. The study, published in PNAS Nexus, has now confirmed that large language models (LLMs), like humans, are better trained to read words than to name colors. Humans, however, can suppress reading and concentrate on naming the color, even with long word lists. The complete collapse in the AI models' performance points to “fundamental limitations compared to biological attention.” Yet such control mechanisms are fundamental to achieving artificial general intelligence. Furthermore, computing power could be saved if AI models could more reliably ignore irrelevant information.