Last week, PyTorch identified a supply chain attack that may have caused developers to download a compromised PyTorch dependency. The PyTorch team published an advisory warning developers that the package contains malicious code that steals system data.
PyTorch is an open-source framework that allows Python developers to build machine-learning applications. It was mainly developed by the Facebook AI Research team and is widely used in industry and academia. PyTorch helps organizations and researchers perform compute-intensive tasks such as reinforcement learning, computer vision, and natural language processing.
On 26 December 2022, an unknown malicious actor uploaded a compromised package to the Python Package Index (PyPI) repository under the name torchtriton, the same name as an internal PyTorch dependency. It was designed to harvest system data and sensitive files from the victim’s machine and then send them to a specific domain via encrypted DNS queries.
“Since the PyPI index takes precedence, this malicious package was being installed instead of the version from our official repository. This design enables somebody to register a package by the same name as one that exists in a third party index, and pip will install their version by default. This malicious package has the same name torchtriton but added in code that uploads sensitive data from the machine,” the PyTorch team wrote.
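The mechanism the team describes is usually called dependency confusion. Here is a minimal sketch of the idea, not pip's actual code: it assumes a resolver that pools candidates from every configured index and keeps the highest version, which is how pip is generally understood to behave when an extra index is configured. The `Candidate` class, the index names, and the version numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple[int, ...]  # simplified version, e.g. (3, 0, 0)
    index: str                # hypothetical label for the index that served it


def pick_candidate(name: str, candidates: list[Candidate]) -> Candidate:
    """Toy resolver: pool every index, then take the highest version."""
    matching = [c for c in candidates if c.name == name]
    if not matching:
        raise LookupError(f"no candidate named {name!r}")
    # No index has priority here; only the version number decides.
    return max(matching, key=lambda c: c.version)


# Made-up data: the private index holds the real build, while someone has
# published a higher-numbered package with the same name on the public index.
candidates = [
    Candidate("torchtriton", (2, 0, 0), "private-nightly-index"),
    Candidate("torchtriton", (3, 0, 0), "public-index"),
]

chosen = pick_candidate("torchtriton", candidates)
print(chosen.index)  # prints "public-index"
```

Because the name alone identifies the package, nothing in this model distinguishes the trusted copy from the look-alike, which is why the same-name upload was enough.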
Interestingly, the person who claims to be behind this incident stated that his actions were a part of ethical research. He has since acknowledged his mistake and apologized to all affected developers.
PyTorch advises developers who installed PyTorch nightly on Linux via pip last week to uninstall it, along with the torchtriton package. The team strongly recommends installing nightly binaries released after December 30, 2022. The issue doesn’t affect developers using the PyTorch stable packages.
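As a first check before uninstalling anything, it can help to see which of the relevant packages are present in the current environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is our own. It reports whether a distribution is installed, and cannot by itself tell a malicious copy from a legitimate one.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the article; adjust for your environment.
    for name in ("torch", "torchtriton"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

Run it with the same Python interpreter that PyTorch was installed into, since each virtual environment has its own set of packages.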
The maintainers of PyTorch detailed several measures that have been taken to address the problem. The first step was to remove the torchtriton dependency from the nightly packages and replace it with pytorch-triton. The maintainers also registered a dummy package on PyPI to prevent dependency confusion attacks in the future.
The PyTorch team provided instructions for finding the malicious binary in the torchtriton package. If you’re interested, you can check out the PyTorch blog for more details.
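For readers who want to script such a search, the following is a rough sketch rather than the official check. It assumes the suspicious file lives at `triton/runtime/triton` inside the environment's site-packages directory; that path is our assumption and should be confirmed against the PyTorch advisory. The function names are hypothetical.

```python
import sysconfig
from pathlib import Path
from typing import Optional

# Assumption: relative location of the suspicious file inside site-packages.
# Confirm the real path against the PyTorch advisory before relying on this.
SUSPECT_RELATIVE_PATH = Path("triton") / "runtime" / "triton"


def find_suspect_file(
    site_packages: Path, relative: Path = SUSPECT_RELATIVE_PATH
) -> Optional[Path]:
    """Return the path to the suspect file if it exists as a regular file."""
    candidate = site_packages / relative
    return candidate if candidate.is_file() else None


def scan_environment() -> list[Path]:
    """Check the current interpreter's site-packages directories."""
    paths = sysconfig.get_paths()
    # purelib and platlib are often the same directory; a set removes duplicates.
    roots = {Path(paths["purelib"]), Path(paths["platlib"])}
    hits = []
    for root in roots:
        hit = find_suspect_file(root)
        if hit is not None:
            hits.append(hit)
    return hits


if __name__ == "__main__":
    found = scan_environment()
    if found:
        for path in found:
            print(f"suspect file present: {path}")
    else:
        print("no suspect file found at the assumed location")
```

A clean result only means nothing was found at the assumed location in this one environment, so the advisory's own instructions remain the authoritative check.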