Researchers: AI makes phishing more effective

How effective is LLM-generated phishing? It is on a par with human-generated spear phishing, say researchers.

(Image: created with AI in Bing Designer by heise online / dmk)

Jan 8, 2025 at 9:05 pm CET

By Dirk Knop

In a study of phishing attacks generated automatically with large language models (LLMs), researchers have concluded that artificial intelligence is just as effective as phishing personalized by humans. Compared to random, non-targeted phishing, the click rate was 4.5 times as high (54 versus 12 percent), an increase of 350 percent.

The study can be found on arxiv.org. The research group, mainly from an institute at Harvard University, tested the phishing attacks with 101 participants. The aim was to find out how well LLMs are able to carry out personalized phishing attacks – so-called spear phishing.

AI very efficient at phishing

The results were surprising. The 101 participants were divided into four groups. The control group received random, non-personalized phishing emails. A second group received phishing emails generated entirely by LLMs. A third group received phishing emails written by humans. The fourth group received spear phishing emails that were generated by AI and then fine-tuned by humans.

The researchers measured how often links in these emails were clicked. The random phishing achieved a click rate of 12 percent. Links in fully automated AI-generated phishing emails were clicked 54 percent of the time, just as often as in targeted, human-generated phishing emails. When humans fine-tuned the AI-generated emails, the click rate rose slightly to 56 percent.
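The relationship between these click rates can be checked with a few lines of arithmetic. The following Python sketch uses only the percentages reported above; the function name `relative_change` and the group labels are made up for illustration and do not come from the study.

```python
def relative_change(baseline: float, observed: float) -> tuple[float, float]:
    """Return (ratio, percent_increase) of observed relative to baseline."""
    ratio = observed / baseline
    percent_increase = (ratio - 1) * 100
    return ratio, percent_increase


# Click rates in percent, as reported in the article.
# The dictionary keys are illustrative labels, not the study's own group names.
click_rates = {
    "control (random phishing)": 12,
    "AI-generated": 54,
    "human-generated": 54,
    "AI fine-tuned by humans": 56,
}

baseline = click_rates["control (random phishing)"]
for group, rate in click_rates.items():
    ratio, increase = relative_change(baseline, rate)
    print(f"{group}: {rate}% clicked, {ratio:.2f}x control, +{increase:.0f}%")
```

For the fully automated AI emails this gives 4.5 times the control group's click rate, which corresponds to an increase of 350 percent.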

To generate targeted phishing, the group created an AI-supported tool that evaluates the digital footprint of the people targeted and creates personalized emails based on this – and subsequently evaluates the success of the fraud strategy. In 88 percent of cases, the AI was able to find useful information about the target individuals. Their tool supports several LLMs, but for their study, the group mainly focused on the use of Claude 3.5 Sonnet and GPT-4o.

The researchers also checked how well the five LLMs Claude 3.5 Sonnet, GPT-4o, Mistral, Llama 3.1 and Gemini can detect phishing emails. In the end, the most promising candidates were Claude 3.5 Sonnet and GPT-4o. On a larger data set of 363 phishing emails and 18 legitimate messages, Claude 3.5 Sonnet achieved a detection rate of 97.25 percent with no false positives. The researchers assume that they could improve these results even further with prompt tuning. AI can therefore be used not only to generate phishing, but also to filter it.
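To make the two metrics concrete, here is a minimal Python sketch of how a detection rate and a false positive rate are computed from raw counts. The count of 353 detected phishing emails is an assumption: it is the whole number that matches the reported 97.25 percent of 363, not a figure stated in the article. The function names are likewise illustrative.

```python
def detection_rate(detected: int, total_phishing: int) -> float:
    """Percentage of phishing emails that were correctly flagged."""
    return 100 * detected / total_phishing


def false_positive_rate(false_positives: int, total_legitimate: int) -> float:
    """Percentage of legitimate emails that were wrongly flagged."""
    return 100 * false_positives / total_legitimate


TOTAL_PHISHING = 363    # from the article
TOTAL_LEGITIMATE = 18   # from the article
DETECTED = 353          # assumption: inferred from the reported 97.25 percent
FALSE_POSITIVES = 0     # from the article: no false positives

print(f"Detection rate: {detection_rate(DETECTED, TOTAL_PHISHING):.2f} percent")
print(f"False positive rate: {false_positive_rate(FALSE_POSITIVES, TOTAL_LEGITIMATE):.2f} percent")
```

With only 18 legitimate messages in the data set, the false positive rate is a coarse measure: a single wrongly flagged message would already amount to about 5.6 percent.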

AI-supported phishing is not just a theoretical research case. Last week, the British insurance company Beazley and other companies warned of an increasing number of "hyper-personalized" phishing emails that are written with the help of AI generators.