Infostealer on AI platform Hugging Face disguised as OpenAI repository

The Open-OSS/privacy-filter repository contained an infostealer and was downloaded over 240,000 times before Hugging Face removed it.

In early May, a repository appeared on Hugging Face that posed as an OpenAI model and installed an infostealer on Windows systems. The attackers relied on typosquatting: they published the repository as Open-OSS/privacy-filter, imitating the name of the genuine OpenAI model openai/privacy-filter.
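The naming trick can be illustrated with a small check that flags a repository reusing a trusted model name under a different namespace. This is only a sketch: the allowlist TRUSTED_REPOS and the function names are hypothetical, and a real check would need a maintained list of trusted publishers.

```python
import re

# Hypothetical allowlist of repositories the reader actually trusts.
TRUSTED_REPOS = {"openai/privacy-filter"}


def _normalize(text: str) -> str:
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def looks_like_impersonation(repo_id: str) -> bool:
    """Return True if repo_id reuses a trusted model name under another namespace."""
    if repo_id in TRUSTED_REPOS:
        return False
    _, _, name = repo_id.partition("/")
    for trusted in TRUSTED_REPOS:
        _, _, trusted_name = trusted.partition("/")
        if _normalize(name) == _normalize(trusted_name):
            return True
    return False
```

With this allowlist, looks_like_impersonation("Open-OSS/privacy-filter") returns True, while the genuine openai/privacy-filter passes. The comparison only catches reused model names, not look-alike namespaces in general.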

During the attack, the repository reached #1 in Hugging Face's trending repositories within 18 hours, with over 240,000 downloads and 667 likes. Most of the likes came from automated accounts used to boost the repository.

Hugging Face has since removed the repository. Anyone who cloned it on a Windows machine before the removal and ran either start.bat or loader.py should consider the system infected and treat credentials stored in browsers and browser extensions as potentially compromised.
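To find out whether a clone is still lying around, one option is to search for Git working directories whose configuration mentions the repository. The following sketch assumes the repository was cloned with Git and that the remote URL contains the repository ID; the function name and the search root are hypothetical. An empty result does not prove that a machine is clean.

```python
from pathlib import Path

# Repository ID named in the article.
MALICIOUS_REPO_ID = "Open-OSS/privacy-filter"


def find_suspect_clones(root: Path) -> list[Path]:
    """Return working directories under root whose .git/config mentions the repo."""
    hits = []
    for config in root.rglob(".git/config"):
        try:
            text = config.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        if MALICIOUS_REPO_ID.lower() in text.lower():
            # .git/config -> .git -> working directory
            hits.append(config.parent.parent)
    return hits
```

A call such as find_suspect_clones(Path.home()) would list candidate directories for closer inspection.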

The analysis by the AI security company HiddenLayer lists the files that may be affected.

The attackers apparently copied the model card almost verbatim from OpenAI's privacy-filter, including a link to a PDF from OpenAI.

The instructions in the README were also largely the same, but additionally asked users to clone the repository locally and run start.bat on Windows or the Python loader loader.py on macOS or Linux.

As a distraction, the loader first runs seemingly legitimate code: a class named DummyModel, fake model-training output, and a synthetic dataset.

The installation of the malicious code begins with the function _verify_checksum_integrity(), which the loader calls at the end. It launches a PowerShell command that works only on Windows systems and runs hidden in the background:

powershell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -Command <cmd>

Because the process is started with the creation flag CREATE_NO_WINDOW, it runs without a console window.
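Before running an unfamiliar loader script, its source can be searched for the indicators described here. The following is a minimal sketch of such a static check; the indicator list and the function name are illustrative assumptions, not a complete detection rule set, and obfuscated or encoded strings would evade it.

```python
import re

# Indicators derived from the behaviour described in the article.
# \W+ also matches arguments that are split across a Python list,
# for example "-ExecutionPolicy", "Bypass".
INDICATORS = {
    "powershell": re.compile(r"powershell(\.exe)?", re.IGNORECASE),
    "execution_policy_bypass": re.compile(r"-ExecutionPolicy\W+Bypass", re.IGNORECASE),
    "hidden_window": re.compile(r"-WindowStyle\W+Hidden", re.IGNORECASE),
    "create_no_window": re.compile(r"CREATE_NO_WINDOW"),
}


def scan_source(source: str) -> list[str]:
    """Return the names of all indicators found in the given source text."""
    return [name for name, pattern in INDICATORS.items() if pattern.search(source)]
```

Each indicator on its own also occurs in legitimate software, so a hit is a reason to read the script carefully, not proof of malware.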

The script downloads a file named update.bat, which prepares the actual malware infection, and executes it. The batch file first checks for administrator rights and requests them if necessary, which at least triggers a UAC prompt. It then downloads the malware and attempts to register it as an exclusion in Microsoft Defender.

The actual infostealer is a program written in Rust that uses numerous obfuscation techniques to avoid detection. Among other things, it obfuscates its use of Windows APIs and checks whether an anti-malware program is executing it in a virtual machine.

Finally, the infostealer collects information from browsers, Discord, wallets (including those in browser extensions), and various configuration files, as well as geolocation data. It also takes screenshots using the Windows Graphics Device Interface (gdi32.dll).

The infostealer packs the collected data into a JSON file, which it uploads to a remote server.

Most of the likes were probably generated automatically to boost the repository. According to HiddenLayer's analysis, 504 of the account names follow the pattern “firstname-lastname###” and another 153 follow the pattern “adjectivenoun####”, together 657 of the 667 likes.
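Such naming patterns can be recognized with simple regular expressions. The sketch below assumes that # stands for a single digit and that names consist of ASCII letters; the exact rules used in HiddenLayer's analysis are not given here, and the function name is hypothetical. Without a word list, the second pattern can only test for "letters followed by four digits", not for an actual adjective and noun.

```python
import re

# Assumed regular expressions for the two naming patterns.
NAME_NUMBER = re.compile(r"^[a-z]+-[a-z]+\d{3}$", re.IGNORECASE)
WORDS_NUMBER = re.compile(r"^[a-z]+\d{4}$", re.IGNORECASE)


def classify_account(username: str) -> str:
    """Assign a username to one of the two patterns, or to 'other'."""
    if NAME_NUMBER.match(username):
        return "firstname-lastname###"
    if WORDS_NUMBER.match(username):
        return "adjectivenoun####"
    return "other"
```

Applied to the list of accounts that liked a repository, a high share of matches for either pattern would be a hint of automated boosting, though ordinary users can pick such names as well.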

Some of the 244,000 downloads probably did not come from victims of the infostealer attack either, but were automated by the attackers themselves to push the repository up the Hugging Face ranking.

(rme)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.