How hackers take advantage of pip to steal users' data

Introduction to pip

Most programmers already know what pip is: a package manager for Python. It is used to install, update and uninstall packages. To install a package, the following command is used:

pip install <package-name>

The above command looks for the package in PyPI, resolves its dependencies, and installs everything in your current Python environment.

Installing a package from PyPI

There are two ways to install a package from PyPI:

  1. Source Distribution - A source distribution is an archive that contains the package's source code. When pip install <package-name> is executed and pip picks a source distribution, it downloads the archive and builds it on your machine by executing a build script, which creates a wheel file (.whl). The package is then installed from this wheel file.
  2. Wheel (.whl) - A wheel file is a pre-built version of the Python package, built on the developer's end. Installation is faster because the package is already built, so setup.py does not need to run on the user's end.
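
To make the distinction concrete, the two formats can usually be told apart by file extension. The sketch below is only an illustration: the function name distribution_kind and the example file names are made up for this article.

```python
def distribution_kind(filename: str) -> str:
    """Classify a distribution file by its extension (a rough heuristic)."""
    name = filename.lower()
    if name.endswith(".whl"):
        return "wheel"   # pre-built, no build script runs on install
    if name.endswith((".tar.gz", ".zip")):
        return "sdist"   # source archive, has to be built on your machine
    return "unknown"


# Hypothetical file names, used only for illustration.
for example in ("example_pkg-1.0-py3-none-any.whl", "example_pkg-1.0.tar.gz"):
    print(example, "->", distribution_kind(example))
```

The file list on a package's PyPI page shows which of the two formats a release offers.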

What is setup.py?

Python packages traditionally ship a setup.py file. This file contains the metadata of the package, such as the package name, dependencies, license and description, which is required to build a wheel file (.whl). Because setup.py is an ordinary Python script, it can also contain arbitrary code, and that code is executed on the user's machine when pip builds the package from a source distribution.
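
The sketch below shows what such a file can look like and why its code runs at build time. The package name example_pkg and all of its metadata are hypothetical. To keep the example harmless and independent of setuptools, the file is executed against a stand-in setup() function; this is not how pip invokes a build, only a way to show that top-level statements in the file run.

```python
import sys
import types

# Hypothetical setup.py contents for an imaginary package "example_pkg".
SETUP_PY = '''
from setuptools import setup

# Top-level statements run whenever this file is executed, i.e. at build time.
print("code inside setup.py is running")

setup(
    name="example_pkg",
    version="0.1.0",
    description="Illustrative package",
    license="MIT",
    install_requires=["requests"],
)
'''


def run_with_stub(source: str) -> dict:
    """Execute setup.py source with a stand-in for setuptools.setup.

    Returns the keyword arguments (the package metadata) passed to setup().
    """
    captured = {}
    stub = types.ModuleType("setuptools")
    stub.setup = lambda **kwargs: captured.update(kwargs)

    original = sys.modules.get("setuptools")
    sys.modules["setuptools"] = stub          # temporarily shadow setuptools
    try:
        exec(compile(source, "setup.py", "exec"), {"__name__": "__main__"})
    finally:
        if original is None:
            del sys.modules["setuptools"]
        else:
            sys.modules["setuptools"] = original
    return captured


metadata = run_with_stub(SETUP_PY)
print(metadata["name"], metadata["install_requires"])
```

The print call stands in for whatever the author of the file chooses to put there. Nothing restricts it to harmless statements.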

Problems with pip

Python packages can be installed through pip, and attackers misuse this to steal confidential information such as API keys, passwords, tokens and even SSH keys. When a package is installed from a source distribution, pip executes a build script that creates a wheel file (.whl), and the package is then installed from that wheel file. The setup.py file can contain malicious code, so installing a malicious package from a source distribution lets the attacker execute arbitrary code on your machine. With this they can spawn a new process, a reverse shell or another Python program, or drop ransomware or other malware. Installing a random Python package that could be malicious is therefore very dangerous.
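
Process spawning and network access of the kind described above tend to leave visible traces in the source of a setup.py file. The sketch below parses a file with Python's ast module, without executing it, and lists imports and calls that deserve a closer look. The lists of suspicious names are assumptions chosen for illustration. They are neither complete nor proof of malice, since a clean result can be evaded by obfuscation and many legitimate build scripts use subprocess.

```python
import ast

# Illustrative, deliberately short lists; not an exhaustive detection rule set.
SUSPICIOUS_MODULES = {"socket", "subprocess", "urllib", "requests", "base64"}
SUSPICIOUS_CALLS = {"eval", "exec", "system", "popen", "Popen", "urlopen", "b64decode"}


def scan_setup_source(source: str) -> list[str]:
    """Return human-readable findings for a setup.py source string."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, f"imports {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in SUSPICIOUS_MODULES:
                findings.append((node.lineno, f"imports from {node.module}"))
        elif isinstance(node, ast.Call):
            func = node.func
            # Plain names (eval) have .id, attribute access (os.system) has .attr.
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name in SUSPICIOUS_CALLS:
                findings.append((node.lineno, f"calls {name}()"))
    return [f"line {lineno}: {message}" for lineno, message in sorted(findings)]


# Hypothetical file contents; the source is parsed only, never executed.
sample = '''
import os
import subprocess
from setuptools import setup
subprocess.Popen(["sh", "-c", "echo hi"])
os.system("echo hi")
setup(name="example_pkg")
'''

for finding in scan_setup_source(sample):
    print(finding)
```

A scan like this is a prompt to read the flagged lines yourself, not a replacement for doing so.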

Some Python packages appear trustworthy because of the GitHub statistics shown on their PyPI page, but they are not. Attackers can link an unrelated GitHub repository so that the number of stars, forks and open issues/PRs looks genuine. This tricks users and developers into installing a Python package that could be malicious. Once they install the package from a source distribution, its setup.py file executes. This gives the attackers remote code execution, through which they can gain full control of the system. PyPI doesn't check that the linked GitHub repository actually belongs to the package, so a link to any GitHub repository can be added. PyPI simply displays the stats of whichever repository has been linked.
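
A rough first check on the linked repository can be scripted. The sketch below assumes the metadata comes from PyPI's JSON endpoint (https://pypi.org/pypi/<package-name>/json) and reads the home_page and project_urls fields of its info object. The function names and the sample metadata are made up for illustration, and comparing the repository name with the package name is only a heuristic, because many genuine projects use different names for the two.

```python
import json
import re
from urllib.request import urlopen


def fetch_info(package: str) -> dict:
    """Download the "info" section of a package's metadata from PyPI."""
    with urlopen(f"https://pypi.org/pypi/{package}/json", timeout=10) as response:
        return json.load(response)["info"]


def normalize(name: str) -> str:
    """Compare names case-insensitively, treating -, _ and . as the same."""
    return re.sub(r"[-_.]+", "-", name).lower()


def github_repos(info: dict) -> set[str]:
    """Collect the owner/repository pairs of all GitHub links in the metadata."""
    urls = list((info.get("project_urls") or {}).values())
    urls.append(info.get("home_page") or "")
    repos = set()
    for url in urls:
        match = re.match(r"https?://github\.com/([^/]+)/([^/#?]+)", url or "")
        if match:
            repo = match.group(2).removesuffix(".git")
            repos.add(f"{match.group(1)}/{repo}".lower())
    return repos


def repo_name_matches(package: str, info: dict) -> bool:
    """Heuristic: does any linked repository share the package's name?"""
    wanted = normalize(package)
    return any(normalize(repo.split("/")[1]) == wanted for repo in github_repos(info))


# Hypothetical metadata, shaped like the "info" object described above.
sample_info = {
    "home_page": "https://github.com/example-org/example-pkg",
    "project_urls": {"Source": "https://github.com/example-org/example-pkg.git"},
}
print(github_repos(sample_info))
print(repo_name_matches("example_pkg", sample_info))
print(repo_name_matches("pygrata-utils", sample_info))
```

A mismatch is a reason to look more closely. A match proves nothing on its own, because the attacker controls the link.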

Recently Sonatype discovered some Python packages that were stealing AWS credentials, environment variables and network information from the machines they were installed on and sending them to the attackers. These malicious packages have since been removed from PyPI and are listed below:

  • loglib-modules - targets developers familiar with the loglib library
  • pyg-modules - targets developers familiar with the pyg library
  • pygrata
  • pygrata-utils
  • hkg-sol-utils
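
A list like this can be checked against a project's requirements automatically. The sketch below is a minimal illustration: the function names and the example requirement lines are made up, and the blocklist contains only the five names above, so it says nothing about other malicious packages.

```python
import re

# The package names reported above.
KNOWN_MALICIOUS = {
    "loglib-modules",
    "pyg-modules",
    "pygrata",
    "pygrata-utils",
    "hkg-sol-utils",
}


def normalize(name: str) -> str:
    """Treat -, _ and . as equivalent and ignore case, as PyPI does for names."""
    return re.sub(r"[-_.]+", "-", name).lower()


def flagged_requirements(lines: list[str]) -> list[str]:
    """Return the requirement names that appear on the blocklist."""
    flagged = []
    for line in lines:
        line = line.split("#", 1)[0].strip()      # drop comments
        if not line or line.startswith("-"):      # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match and normalize(match.group(0)) in KNOWN_MALICIOUS:
            flagged.append(match.group(0))
    return flagged


# Hypothetical requirements.txt contents.
example_lines = [
    "requests==2.31.0",
    "pygrata_utils>=0.1  # spelled with an underscore",
    "# a comment line",
    "hkg-sol-utils",
]
print(flagged_requirements(example_lines))
```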

Anyone can create an account on PyPI and upload their own packages. This is why installing packages from PyPI with pip can be very dangerous.

How can we stay safe when using pip?

  1. Visit PyPI (pypi.org) and search for the package that you want to install
  2. Before copying and executing the pip install <package-name> command, visit the GitHub repository displayed and verify that the repository actually belongs to the package
  3. After verifying that the GitHub repository belongs to the Python package, go through setup.py to check whether it contains anything malicious
  4. Prefer installing from a wheel file (.whl) rather than from a source distribution. A wheel file (.whl) is a pre-built version of a Python package
  5. This eliminates the need to execute setup.py, as the wheel file is already generated
  6. The command to install a Python package only from a wheel file is pip install <package-name> --only-binary=:all:
  7. A wheel file (.whl) is the safer way to install Python packages, because no build script runs during installation. The code inside the package can still be malicious once it is imported, so the checks in steps 2 and 3 still apply
  8. Some major open-source Python packages are only offered as a source distribution. If you know that the package is genuine and well known, you can install it using pip install <package-name>
  9. Use pip install <package-name> --only-binary=:all: to install unfamiliar Python packages, so that a malicious setup.py is never executed
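
The wheel-only install from the steps above can also be scripted. This is a minimal sketch: the function names are made up, example-pkg is a placeholder, and has_wheel assumes a list of file entries shaped like the urls array of PyPI's JSON endpoint, where each entry has a packagetype of "sdist" or "bdist_wheel".

```python
import subprocess
import sys


def wheel_only_command(package: str) -> list[str]:
    """Build the pip command that refuses to fall back to a source build."""
    return [sys.executable, "-m", "pip", "install", "--only-binary=:all:", package]


def has_wheel(release_files: list[dict]) -> bool:
    """True if at least one file of the release is a wheel."""
    return any(f.get("packagetype") == "bdist_wheel" for f in release_files)


def install_wheel_only(package: str) -> int:
    """Run the install and return pip's exit code (non-zero on failure)."""
    return subprocess.run(wheel_only_command(package), check=False).returncode


# Only the command is printed here; nothing is installed by this example.
print(" ".join(wheel_only_command("example-pkg")))
print(has_wheel([{"packagetype": "sdist"}, {"packagetype": "bdist_wheel"}]))
```

If a package offers no wheel for your platform, pip stops with an error instead of silently building from source. That is the point at which to decide whether the package is trusted enough for step 8.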