Published: Mon 10 October 2022
By viraptor
tags: python packaging dependencies
The meme
Python package management / installation is famously difficult… or so the
story goes. This keeps getting reinforced by forum comments, quoting that one
xkcd page, and people who aren’t actually running into the issues repeating the
meme. In practice, it takes just a few minutes to understand how things work
and avoid ending up in a mess.
That’s not to say the Python ecosystem doesn’t have its issues. But let’s have a
look at what’s actually simple, how things work in practice and what the real
issues are. Warning: things will be simplified - if you think “yes but actually …”,
you’re likely right and likely not the target for this post :)
Let’s start with tooling. There are 3 categories of tools for helping with
Python development:
Python version manager
dependency manager
project environment manager
The boundaries may get a bit fuzzy since some tools have multiple roles. Lots of
the tools involved are also very similar to those in other languages or even
have a direct equivalent. For example, on the Ruby side: setuptools - gem,
pip - bundler (kinda), pyenv - rbenv.
Python version managers
The more complex the project is, the more likely you are to run into
functionality which either requires a newer version of Python than you have, or
uses something that’s deprecated in the latest and greatest. That means you’ll
want to run it with a specific version, often mentioned in the setup file,
.python-version or .tool-versions. By installing pyenv and running “pyenv
install 3.7.2”, for example, you get that specific version
downloaded/compiled/installed. For more details see
https://realpython.com/intro-to-pyenv/
This means you likely have multiple versions installed now. Usually at least
a system one in /usr/bin/python3 and the private ones somewhere. They will all
have their separate standard libraries. That means they also look for default
packages in different places. If you install your distribution’s package using
apt or something similar, it will be installed for your system Python
environment, not any other custom one.
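To see this for yourself, here is a minimal sketch that asks the running interpreter where it lives and where it installs packages by default. The function name describe_interpreter is made up for this post; sys and sysconfig are standard library. Run it with each of your Pythons and compare the answers.

```python
import sys
import sysconfig


def describe_interpreter():
    """Return where the running Python lives and where it installs packages."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # "purelib" is the default install location for pure-Python packages
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Different interpreters should report different executables and different package directories, which is exactly why a package installed for one is invisible to another.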
Dependency managers
There are a few of these. Their role is to make sure that when you work on a
project, you install the specific, known-good versions of dependencies. This may
get a bit fuzzy because Python packages themselves give you the ability to
specify the required versions (in setup.py / pyproject.toml), however they only
affect the first level. This is good enough when you’re distributing a library.
But when you’re dealing with the final app, you want to make sure the whole
thing works correctly for everyone.
There are a few dependency managers that are similar but differ in the details:
pipenv, poetry, pdm, and likely a few others. They all share a similar design though:
adding a dependency writes its version (and those of indirect dependencies) to a
“lock file” which is used for all future installations until it’s explicitly updated.
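To get a feel for what "pinning everything" means, here is a rough sketch that lists every package visible to the current interpreter as an exact name==version line. This is not how poetry, pipenv or pdm build their lock files (they resolve dependencies first and record more detail); it is only an illustration, similar in spirit to “pip freeze”. The function name frozen_requirements is made up for this post.

```python
from importlib import metadata


def frozen_requirements():
    """List every installed distribution as an exact `name==version` pin."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that lack a name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(frozen_requirements()))
```

Note that the list includes indirect dependencies too, not only the ones you asked for by name.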
Environment managers
Once we have a dependency manager and a Python version ready, the dependencies
need to be installed somewhere. Most tools here provide some kind of abstraction
over the virtual environment / venv. It’s really important to understand what
that is. In practice, a typical venv is a directory with:
a link to a specific Python runtime
a directory for local Python packages
a bin directory for script entry points
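If you want to check that layout yourself, the sketch below creates a throwaway venv with the standard library venv module and lists what ends up in it. The helper name make_and_inspect_venv is made up for this post, and pip is left out of the environment only to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def make_and_inspect_venv(target):
    """Create a venv at `target` (without pip) and list its top-level entries."""
    venv.EnvBuilder(with_pip=False).create(target)
    return sorted(p.name for p in Path(target).iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / ".venv"
        print(make_and_inspect_venv(env_dir))
        # pyvenv.cfg records which base interpreter the environment points at
        print((env_dir / "pyvenv.cfg").read_text())
```

On Linux and macOS you should find the bin and lib directories plus pyvenv.cfg; on Windows the scripts directory is called Scripts instead of bin.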
This is the part that differs the most from other languages. Ruby/bundler
installs various versions of gems to either a common location or a
project-specific path. Node uses a local folder independent of the runtime
version. (Yes, for Python there’s also https://peps.python.org/pep-0582/)
All together now
What does that all mean in practice? Let’s say you installed Python 3.7.1 with
pyenv and activated it as the current one. Now you run “python -mvenv .venv” to
create a local virtual environment in a directory called .venv. The
“.venv/bin/python” link will point at your installation of 3.7.1.
If you install a package through “.venv/bin/python -mpip install requests”,
you’ll install it in the “.venv” environment and nowhere else. Your system
Python, for example, will not be able to see it. On the other hand, if you
install a package system-wide, for example through “apt install
python3-requests”, it will not be visible in your local environment and has to
be installed there separately.
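A quick way to tell which side of that wall you are on is to compare two values the interpreter already knows. This is a small sketch that assumes an environment created by the standard venv module (or a recent virtualenv); the function name in_virtual_environment is made up for this post.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("in a venv:  ", in_virtual_environment())
```

Running it with “.venv/bin/python” and with the system Python should give you two different answers.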
Tools like poetry and pdm overlap a bit with the environment managers and will
create their own environments as needed. This means that instead of
referencing a specific tool in a directory you created (like .venv), you’ll have
to run it through them with (for example) “poetry run the-tool”.
What do I use?
The most common complaint I see is that there are so many options and people
don’t know what to use. This one’s actually relatively easy - there are two
main scenarios: you’re installing someone’s app or working on your own. In the
first case - read their docs and use the same build tools they use - just make
sure you don’t use your default/global Python environment. (If you use “sudo
pip install”, you’re getting a lump of coal for Xmas.)
If you want to install some Python app, but don’t have any instructions in the
project, you can rely on this being the most likely working solution:
# make a separate environment for it
python -mvenv place/to/install
# install the app
place/to/install/bin/pip install app_name
# and run it
place/to/install/bin/app_name
In the second case, use something modern that manages dependencies for you with
minimal effort. I like poetry, but you do you.
Where are the problems
All of this seems fine - so what are the actual problems people encounter? There
are two main ones:
1. The default package manager “pip” ignores conflicts with packages that are
already installed. You may run into a situation where it reports “ERROR: pip’s
dependency resolver does not currently take into account all the packages that
are installed.” That means installing package A, then B independently may
result in a situation where A’s dependencies are replaced with incompatible
versions. That’s why you want to use a more clever dependency manager which can
deal with this problem. This doesn’t mean pip is useless though - it works fine
for single app cases and can be used with frozen package lists which other
dependency managers create. These are the requirements.txt files.
2. a) People get confused about which environment they install the packages
into. Almost every Stack Overflow question about why the author can’t import an
installed package boils down to - you installed it to a different environment
than the one you’re running from. Apt and pip don’t mix. venv and global
installation don’t mix. Different Python versions don’t mix. (simplification!)
Figure out which “python” you’re running and the issue will become clear.
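When an import fails, you can ask the interpreter directly where it would load a module from. The sketch below uses importlib.util.find_spec from the standard library; the function name where_is is made up for this post, and requests is only an example package.

```python
import importlib.util
import sys


def where_is(module_name):
    """Return the file this interpreter would import `module_name` from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    # origin is a file path for ordinary modules; it can be "built-in"
    # for builtins or None for namespace packages
    return spec.origin


if __name__ == "__main__":
    print("running:", sys.executable)
    print("requests comes from:", where_is("requests"))
```

If the answer is None for the Python you are running but the package is definitely installed, it was installed for a different interpreter.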
2. b) People confuse tools from different environments. Many manuals will tell
you about installing by running “pip install” rather than “python -mpip
install” or “your/venv/bin/pip install”. While all do basically the same thing,
the second and third ones guarantee that you’re installing packages into the
environment you expect: the one your current Python uses, or the venv you
named. Why would you ever end up with Python and pip coming from different
paths? Honestly I don’t know and if you ever end up in that situation, you
should fix whatever caused it. But that doesn’t change the fact that people do
end up in that mess and end up installing a package using default pip into an
environment that their default Python doesn’t check.
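If you script installations from Python, the same idea applies: build the pip command from the interpreter you are actually running instead of trusting whatever “pip” is first on PATH. This is a minimal sketch; the helper names pip_install_command and pip_on_path are made up for this post, and nothing is installed unless you run the command yourself.

```python
import shutil
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_on_path():
    """Return the bare `pip` executable found on PATH, or None if there is none."""
    return shutil.which("pip")


if __name__ == "__main__":
    print("bound to this Python:", " ".join(pip_install_command("requests")))
    print("bare pip on PATH:    ", pip_on_path())
    # To actually install, pass the list to subprocess.run(cmd, check=True).
```

If the bare pip lives in a different directory than your Python executable, that is the mismatch described above.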
About that meme…
Both issues do result in some people actually creating a jumbled web of
partially broken dependencies which may appear to work, or fail, or not load at
all. But what happens is not black magic - verifying which Python you’re running
(“which python” or a specific path) and where/how you installed the dependencies
is enough to find almost all of those issues. There’s nothing wrong with
multiple Python runtimes on one machine and lots of virtual environments which
use them, as long as you keep them independent. This is also not specific to
Python and you’ll find an equivalent local-environment solution in many other languages.
The big problem only starts when a dev runs into this issue and does not try to
actually understand it, so the meme of “Python package installation is
bad/confusing/broken” continues. The default tools have some limitations - keep
those in mind and use a better dependency manager for your own Python app.
If this is something you’ve struggled with, I invite you to find out and understand:
what Python versions are installed where on your system
where do the packages you install end up
what processes are you using for that and why.
A few minutes here will save you lots more down the track.