Vulnerability in PyTorch allows command injection via RPC on the master node

A vulnerability in the machine learning framework allows arbitrary code to be executed on the master node during distributed training.



This article was originally published in German and has been automatically translated.

CERT-Bund, part of Germany's Federal Office for Information Security (BSI), warns of a vulnerability in PyTorch that affects the distributed training of models. The open-source ML framework initiated by Meta apparently executes Python code sent by worker nodes on the master node without checking it.

The CERT-Bund warning and information service lists the vulnerability as WID-SEC-2024-1323 with the maximum CVSS score of 10 and rates the risk as critical. NIST's National Vulnerability Database (NVD) lists it as CVE-2024-5480.

It is not clear from the reports exactly which versions are affected. NIST refers specifically to versions before 2.2.2, while the BSI refers to versions below 2.2.3. The official release notes for versions 2.2.2 and 2.3 do not explicitly mention a corresponding bugfix.
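Because the two sources disagree, a cautious check would use the stricter bound. The following is a minimal sketch in plain Python, assuming the BSI figure (below 2.2.3) as the threshold; the names parse_version, FIRST_UNAFFECTED and possibly_affected are hypothetical and chosen for illustration only.

```python
import re


def parse_version(version):
    """Extract the leading numeric parts, e.g. '2.2.1+cu121' -> (2, 2, 1)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '2.3') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


# Assumption: take the stricter of the two reported bounds (BSI: below 2.2.3).
FIRST_UNAFFECTED = (2, 2, 3)


def possibly_affected(version):
    """Return True if the version falls below the assumed threshold."""
    return parse_version(version) < FIRST_UNAFFECTED


for candidate in ("2.1.0", "2.2.2", "2.2.3", "2.3.0+cu121"):
    print(candidate, possibly_affected(candidate))
```

On a machine with PyTorch installed, the string to check would typically come from torch.__version__.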

The vulnerability is located in torch.distributed.rpc, PyTorch's distributed RPC framework. Among other things, the framework enables remote procedure calls (RPC) for the distributed training of models.

The worker nodes can serialize special function calls as user-defined functions (UDFs) and send them to the master node, which deserializes and executes them. Because the framework apparently does not check the functions before execution, it is vulnerable to command injection (CWE-77).
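The pattern of serializing a call on one side and executing it unchecked on the other can be illustrated without PyTorch. The following minimal sketch uses Python's standard pickle module as a stand-in for the RPC transport. The helpers serialize_call and execute_unchecked are hypothetical and are not part of torch.distributed.rpc, whose actual wire format differs.

```python
import operator
import pickle


def serialize_call(func, args):
    """Worker side (hypothetical helper): pack a callable and its arguments."""
    return pickle.dumps((func, args))


def execute_unchecked(payload):
    """Master side (hypothetical helper): unpack and call without validation."""
    func, args = pickle.loads(payload)
    # Nothing restricts which callable arrives here; that is the weak point.
    return func(*args)


# A legitimate request: the worker asks the master to add two numbers.
payload = serialize_call(operator.add, (2, 3))
result = execute_unchecked(payload)
print(result)
```

The master in this sketch runs whatever callable the payload names, so its safety depends entirely on every sender being trustworthy.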

If an attacker has access to a worker node, they can pass an arbitrary Python function such as eval to the master node, which then calls it. eval, in turn, interprets the string passed to it as a Python expression and evaluates it.
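The sketch below shows, with a harmless expression, why accepting eval as a callable amounts to accepting arbitrary code, and what a check before execution could look like. It is plain Python under stated assumptions: ALLOWED_FUNCTIONS and execute_checked are hypothetical names, and this is not a description of how PyTorch handles or has fixed the issue.

```python
import operator

# eval executes whatever expression the string contains. This one is
# harmless, but it could just as well read files or environment variables.
print(eval("6 * 7"))

# Hypothetical allowlist: only these callables may be invoked remotely.
ALLOWED_FUNCTIONS = {operator.add, operator.mul}


def execute_checked(func, args):
    """Call func only if it is on the allowlist (hypothetical helper)."""
    if func not in ALLOWED_FUNCTIONS:
        raise PermissionError(f"callable {func!r} is not permitted")
    return func(*args)


# An allowed function is executed as usual.
print(execute_checked(operator.mul, (6, 7)))

# eval is not on the allowlist, so the request is rejected before any call.
try:
    execute_checked(eval, ("6 * 7",))
except PermissionError as exc:
    print("rejected:", exc)
```

An allowlist is only one possible design. Its main advantage is that the receiver, not the sender, decides which functions can run.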

A proof-of-concept exploit can be found on the bug bounty platform huntr. Although the code only queries the IP address of the master node, the report points out that sensitive data can be extracted from the master in the same way.

(rme)