Python dependency management difficulty is an unhelpful meme

The meme

Python package management / installation is famously difficult… or so the story goes. This keeps getting reinforced by forum comments, quoting that one xkcd page, and people who aren’t actually running into the issues repeating the meme. In practice, it takes just a few minutes to understand how the pieces fit together and avoid ending up in a mess.

That’s not to say the Python ecosystem doesn’t have its issues. But let’s have a look at what’s actually simple, how things work in practice and what the real issues are. Warning: things will be simplified - if you think “yes but actually …”, you’re likely right and likely not the target for this post :)

Let’s start with tooling. There are 3 categories of tools for helping with Python development:

  • Python version managers
  • dependency managers
  • project environment managers

The boundaries may get a bit fuzzy since some tools have multiple roles. Lots of the tools involved are also very similar to tools in other languages or even have a direct equivalent. For example on the Ruby side: setuptools - gem, pip - bundler (kinda), pyenv - rbenv.

Python version managers

The more complex the project is, the more likely you are to run into functionality which either requires a newer version of Python than you have, or uses something that’s deprecated in the latest and greatest. That means you’ll want to run it with a specific version, often mentioned either in the setup file or .python-version or .tool-versions. By installing pyenv and running pyenv install 3.7.2 for example, you get that specific version downloaded/compiled/installed. For more details see https://realpython.com/intro-to-pyenv/

This means you likely have multiple versions installed now. Usually at least a system one in /usr/bin/python3 and the private ones somewhere else. They will all have their separate standard libraries. That means they also look for default packages in different places. If you install your distribution’s package using apt or something similar, it will be installed for your system Python environment, not any other custom one.
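If you’re not sure which of those installations you’re actually talking to, you can ask the interpreter itself. This is just a minimal sketch using the standard library; the function name describe_interpreter is made up for this post.

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect basic facts about the Python that is running this code."""
    paths = sysconfig.get_paths()
    return {
        "executable": sys.executable,         # path of the running interpreter
        "version": sys.version.split()[0],    # version string without build details
        "stdlib": paths["stdlib"],            # where this Python's standard library lives
        "site_packages": paths["purelib"],    # where pure-Python packages get installed
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Run it once with your system Python and once with a pyenv-installed one, and you should see different paths for each.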

Dependency managers

There are a few of these. Their role is to make sure that when you work on a project, you install the specific, known-good versions of dependencies. This may get a bit fuzzy because Python packages themselves give you the ability to specify the required versions (in setup.py / pyproject.toml), however they only affect the first level. This is good enough when you’re distributing a library. But when you’re dealing with the final app, you want to make sure the whole thing works correctly for everyone.

There are a few dependency managers which are similar, but differ in the details: pipenv, poetry, pdm, and likely a few others. They all share a similar design though: adding a dependency writes its version (and those of indirect dependencies) to a “lock file” which is used for all future installations until it’s explicitly updated.
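To get a feel for what “pinning” means, here’s a rough sketch that lists every package installed in the current environment as name==version. It is not how poetry or pdm build their lock files (those also record hashes and the dependency graph); it only illustrates the idea of writing down exact versions. The function name pinned_requirements is my own.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name, version = meta.get("Name"), meta.get("Version")
        if name and version:  # skip broken installs with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(pinned_requirements()))
```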

Environment managers

Once we have a dependency manager and a Python version ready, the dependencies need to be installed somewhere. Most tools here provide some kind of abstraction over the virtual environment / venv. It’s really important to understand what it is. In practice a typical venv is a directory with:

  • a link to a specific Python runtime
  • a directory for local Python packages
  • a bin directory for script entry points
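You can look at that layout yourself. The sketch below uses the standard venv module to create a throwaway environment in a temporary directory and lists what ends up inside; the helper name make_and_inspect_venv is made up, and the exact directory names depend on your platform.

```python
import os
import tempfile
import venv
from pathlib import Path


def make_and_inspect_venv(target):
    """Create a venv at `target` and return the names of its top-level entries."""
    # with_pip=False keeps this quick; a real project would normally want pip.
    # Symlinks are used everywhere except Windows, like the command line tool does.
    venv.create(target, with_pip=False, symlinks=(os.name != "nt"))
    return sorted(entry.name for entry in Path(target).iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / ".venv"
        print(make_and_inspect_venv(env_dir))
        # pyvenv.cfg records which base Python this environment was created from
        print((env_dir / "pyvenv.cfg").read_text())
```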

This is the part that differs the most from other languages. Ruby/bundler installs various versions of gems to either a common location or a project-specific path. Node uses a local folder independent of the runtime version. (Yes, for Python there’s also https://peps.python.org/pep-0582/)

All together now

What does that all mean in practice? Let’s say you installed Python 3.7.1 with pyenv and activated it as the current one. Now you run “python -mvenv .venv” to create a local virtual environment in a directory called .venv. The “.venv/bin/python” link will point at your installation of 3.7.1.

If you install a package through “.venv/bin/python -mpip install requests” you’ll install it in the “.venv” environment and nowhere else. Your system Python package for example will not be able to see it. On the other hand, if you install a package, for example through “apt install python-requests”, it will not be visible in your local environment and has to be installed there separately.
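A quick way to check from inside Python whether you’re in a virtual environment, and where it will look for installed packages, is sketched below. The function names are mine; the sys attributes are standard.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


def package_search_paths():
    """Return the sys.path entries that hold installed third-party packages."""
    return [p for p in sys.path if "site-packages" in p or "dist-packages" in p]


if __name__ == "__main__":
    print("in a venv:", in_virtual_environment())
    for path in package_search_paths():
        print("packages from:", path)
```

If a package you installed doesn’t live under one of the printed paths, the Python you’re running won’t see it.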

Tools like poetry and pdm will have a bit of overlap with the environment managers and will create their own as needed. This means that instead of referencing a specific tool in a directory you created (like .venv), you’ll have to run them through the manager with (for example) “poetry run the-tool”.

What do I use?

The most common complaint I see is that there are so many options and someone doesn’t know what to use. This one’s actually relatively easy - there are two main scenarios: you’re installing someone’s app or working on your own. In the first case - read their docs and use the same build tools they use - just make sure you don’t use your default/global Python environment. (If you use “sudo pip install”, you’re getting a lump of coal for Xmas.)

If you want to install some Python app, but don’t have any instructions in the project, you can rely on this being the most likely working solution:

# make a separate environment for it
python -mvenv place/to/install
# install the app
place/to/install/bin/pip install app_name
# and run it
place/to/install/bin/app_name

In the second case, use something modern that manages dependencies for you with minimal effort. I like poetry, but you do you.

Where are the problems

All of this seems fine - so what are the actual problems people encounter? There are two main ones:

1. The default package manager “pip” ignores conflicts. You may run into a situation where it reports “ERROR: pip’s dependency resolver does not currently take into account all the packages that are installed.” That means installing package A, then B independently may result in a situation where A’s dependencies are replaced with incompatible versions. That’s why you want to use a more clever dependency manager which can deal with this problem. This doesn’t mean pip is useless though - it works fine for single app cases and can be used with the frozen package lists which other dependency managers create. These are the requirements.txt files.

2. a) People get confused about which environment they install the packages into. Almost every stackoverflow question about why the author can’t import the installed package boils down to - you installed it to a different environment than you’re running from. Apt and pip don’t mix. venv and global installation don’t mix. Different Python versions don’t mix. (simplification!) Figure out which “python” you’re running and the issue will become clear.

2. b) People confuse tools from different environments. Many manuals will tell you about installing by running “pip install” rather than “python -mpip install” or “your/venv/bin/pip install”. While all do basically the same thing, the second and third ones guarantee that you’re installing packages into the environment of the Python you named. Why would you ever end up with Python and pip coming from different paths? Honestly I don’t know and if you ever end up in that situation, you should fix whatever caused it. But that doesn’t change the fact that people do end up in that mess and end up installing a package using default pip into an environment that their default Python doesn’t check.
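When an import fails and you suspect one of these mix-ups, a couple of checks like the ones below can narrow it down. This is only a sketch: where_is and pip_matches_python are names I made up, and the pip check is a rough heuristic that just compares directories.

```python
import importlib.util
import shutil
import sys
from pathlib import Path


def where_is(module_name):
    """Return where a top-level module would be imported from, or None if it can't be found."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec else None


def pip_matches_python():
    """Check whether the `pip` found on PATH sits next to the running interpreter."""
    pip_path = shutil.which("pip")
    if pip_path is None:
        return None  # no pip on PATH at all
    return Path(pip_path).parent == Path(sys.executable).parent


if __name__ == "__main__":
    print("running:", sys.executable)
    print("requests comes from:", where_is("requests"))
    print("pip on PATH belongs to this Python:", pip_matches_python())
```

If where_is returns None for something you’re sure you installed, it went into a different environment than the one you’re running.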

About that meme…

Both issues do result in some people actually creating a jumbled web of partially broken dependencies which may appear to work, or fail, or not load at all. But what happens is not black magic - verifying which Python you’re running (“which python” or a specific path) and where/how you installed the dependencies is enough to find almost all of those issues. There’s nothing wrong with multiple Python runtimes on one machine and lots of virtual environments which use them, as long as you keep them independent. This is also not specific to Python and you’ll find an equivalent local-environment solution in many other languages.

The big problem only starts when a dev runs into this issue, does not try to actually understand it, and keeps repeating the meme of “Python package installation is bad/confusing/broken”. The default tools have some limitations - keep those in mind and use a better dependency manager for your own Python app.

If this is something you’ve struggled with, I invite you to find out and understand:

  • what Python versions are installed where on your system
  • where do the packages you install end up
  • what processes are you using for that and why.

A few minutes here will save you lots more down the track.

