In a previous post, we showed how to use shiv to bundle a Django project into a single file for distribution and deployment. Running a large Python project as a single file feels like magic – which is great until you need to debug a problem. At that point, you need to understand how things work and what is happening under the hood. With that in mind, let’s demystify shiv’s magic.
Zipapps
Shiv uses a little-known feature of Python called a ZIP application or “zipapp”. Zipapps provide a way to bundle multiple Python files into a single ZIP archive which the Python interpreter can then execute. Here’s a trivial example:
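A minimal sketch of the idea, using the standard library's zipapp module. The myapp directory name and the greeting are illustrative assumptions; the only real requirement is a __main__.py at the top level of the source directory.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp


def build_and_run_demo():
    """Build a tiny zipapp in a temp directory, run it, and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        # The source directory needs a __main__.py; that is what Python executes.
        src = pathlib.Path(tmp) / "myapp"
        src.mkdir()
        (src / "__main__.py").write_text('print("Hello from a zipapp!")\n')

        # Bundle the directory into a single archive with a shebang line.
        target = pathlib.Path(tmp) / "myapp.pyz"
        zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")

        # Run the archive with the current interpreter.
        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


if __name__ == "__main__":
    print(build_and_run_demo(), end="")
```

The same archive can be produced from the command line with python -m zipapp.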
Pretty cool, huh? Copy myapp.pyz anywhere you have a compatible version of Python and it just works.
But real-world projects aren’t so simple. They have dependencies, non-Python files that need to be included, and complex build steps.
Enter Shiv
The folks at LinkedIn built shiv to work around the shortcomings of traditional zipapps. It includes dependencies in the zipapp and ensures they are on the PYTHONPATH at runtime. Shiv is only needed for creation; the resulting zipapp is executable without any additional tooling. You can think of shiv as a wrapper around pip, which it uses behind the scenes to download and install dependencies. Here is an example of creating a zipapp for awscli:
This downloads awscli and all its dependencies from PyPI and creates a zipapp (aws.pyz) which will run a Python function specified by --entry-point on execution. That file can be copied to any system with a compatible Python version and executed as if it were a binary executable (./aws.pyz or python aws.pyz).
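For scripted builds, the invocation described above can be wrapped in a few lines of Python. This is a sketch under assumptions: the entry point awscli.clidriver:main is our guess at the function awscli exposes for its CLI, and shiv must already be installed for build_zipapp to do anything.

```python
import shutil
import subprocess


def shiv_command(package, output_file, entry_point):
    """Return the argument list for a shiv build of a single package."""
    return [
        "shiv",
        "--output-file", output_file,
        "--entry-point", entry_point,
        package,
    ]


def build_zipapp(package, output_file, entry_point):
    """Run shiv; needs shiv on PATH and network access to PyPI."""
    if shutil.which("shiv") is None:
        raise RuntimeError("shiv is not installed; try `pip install shiv`")
    subprocess.run(shiv_command(package, output_file, entry_point), check=True)


if __name__ == "__main__":
    # Assumption: awscli.clidriver:main is the CLI entry point for awscli.
    # Only the command is printed here; call build_zipapp() to actually build.
    print(" ".join(shiv_command("awscli", "aws.pyz", "awscli.clidriver:main")))
```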
Under the Hood
The good news is that shiv is surprisingly simple. When we build a new archive, shiv does the following:
- Use pip to download all the dependencies to a temporary directory.
- Write some metadata used during the bootstrap process to environment.json.
- Create a bootstrap script that is used when the zipapp is executed.
- Bundle those files up into a zip archive.
- Insert a shebang at the top of the zip archive so it can be used as an executable.
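To make the build steps concrete, here is a toy builder that mimics them with the standard library only. It is not shiv's code: the fake dependency stands in for the pip download, the bootstrap is a placeholder, and the metadata keys build_id and entry_point are illustrative assumptions.

```python
import json
import pathlib
import tempfile
import uuid
import zipapp

# Placeholder bootstrap; a real one unpacks dependencies and calls the entry point.
BOOTSTRAP = 'print("bootstrap placeholder")\n'


def build_toy_archive(workdir, name="toy.pyz"):
    """Assemble a zipapp with a shiv-like layout and return its path."""
    workdir = pathlib.Path(workdir)
    staging = workdir / "staging"
    site_packages = staging / "site-packages"
    site_packages.mkdir(parents=True)

    # Step 1 (stand-in): shiv uses pip here; we write a fake dependency instead.
    (site_packages / "dep.py").write_text("def main():\n    return 42\n")

    # Step 2: metadata for the bootstrap process (key names are assumptions).
    metadata = {"build_id": str(uuid.uuid4()), "entry_point": "dep:main"}
    (staging / "environment.json").write_text(json.dumps(metadata))

    # Step 3: the script Python runs when the archive is executed.
    (staging / "__main__.py").write_text(BOOTSTRAP)

    # Steps 4 and 5: zip everything up and prepend a shebang line.
    target = workdir / name
    zipapp.create_archive(staging, target, interpreter="/usr/bin/env python3")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = build_toy_archive(tmp)
        with open(archive, "rb") as fh:
            print(fh.readline().decode().strip())
```

Reading the first line of the result shows the shebang sitting in front of otherwise ordinary zip data.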
The bootstrap script’s responsibility is to:
- Unpack the dependencies to a unique path. This is skipped if it already exists from a previous run.
- Insert that path into the PYTHONPATH.
- Execute the --entry-point we defined during creation.
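The three responsibilities above fit in a short function. What follows is a simplified illustration rather than shiv's actual bootstrap: the function name, the module:function entry point format, and prepending to sys.path are all assumptions made for the sketch.

```python
import importlib
import pathlib
import sys
import zipfile


def bootstrap(archive_path, unpack_dir, entry_point):
    """Unpack site-packages once, put it on sys.path, and call the entry point."""
    unpack_dir = pathlib.Path(unpack_dir)
    site_packages = unpack_dir / "site-packages"

    # 1. Unpack the dependencies unless a previous run already did.
    if not site_packages.exists():
        with zipfile.ZipFile(archive_path) as archive:
            members = [
                name for name in archive.namelist()
                if name.startswith("site-packages/")
            ]
            archive.extractall(unpack_dir, members=members)

    # 2. Make the unpacked packages importable.
    #    Prepending is a simplification; the real position may differ.
    sys.path.insert(0, str(site_packages))

    # 3. Resolve "module:function" and call it.
    module_name, _, func_name = entry_point.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)()
```

Because the unpack step is skipped when the directory exists, a second run pays only the cost of the import.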
The unique path is determined by combining the filename of the zipapp and a UUID generated at build time (stored in environment.json). The unpacked files live under ~/.shiv by default, but the location can be changed by setting the SHIV_ROOT environment variable. After execution, we can see the following:
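As a rough sketch of how such a path could be derived: the helper below honors SHIV_ROOT and falls back to ~/.shiv. The helper name and the exact name_uuid directory format are assumptions for illustration; check your own ~/.shiv for the real layout.

```python
import os
import pathlib


def unpack_path(archive_name, build_id, environ=None):
    """Combine the archive filename and build UUID under the shiv root."""
    environ = os.environ if environ is None else environ
    # SHIV_ROOT overrides the default root of ~/.shiv.
    root = pathlib.Path(environ.get("SHIV_ROOT", "~/.shiv")).expanduser()
    # Assumption: the directory is named "<filename>_<uuid>".
    return root / f"{archive_name}_{build_id}"


if __name__ == "__main__":
    print(unpack_path("aws.pyz", "3e32a16c-6652-44cc-a561-3784814d736e"))
```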
The site-packages directory looks just like the site-packages directory you’d find for a standard Python installation or a virtualenv. As we can see, this was assigned a UUID of 3e32a16c-6652-44cc-a561-3784814d736e at build time, which can be confirmed by inspecting the included metadata:
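Since a zipapp is still a valid ZIP file, the metadata can be read with the standard zipfile module, shebang and all. A small sketch follows; the helper name is ours, and the build_id key mentioned in the comment is an assumption about the metadata's layout.

```python
import json
import zipfile


def read_environment(archive_path):
    """Return the parsed environment.json from a zipapp."""
    # zipfile tolerates the shebang line prepended to the archive.
    with zipfile.ZipFile(archive_path) as archive:
        return json.loads(archive.read("environment.json"))


# Example (assumes aws.pyz exists and stores the UUID under "build_id"):
#     print(read_environment("aws.pyz")["build_id"])
```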
This directory is added to the PYTHONPATH like so:
Note: The environment variable SHIV_INTERPRETER allows us to drop down into a Python shell using the zipapp’s environment.
The fifth item, /home/user/.shiv/..., is the one that was injected by the bootstrap script. The others are Python defaults.
If you prefer to run in an isolated environment without the global site packages, Python’s -S flag can be used:
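To see what -S changes without needing a zipapp at hand, the sketch below launches the current interpreter with and without the flag and compares sys.path. The helper name and the probe script are our own; with a real archive the equivalent is simply python -S aws.pyz.

```python
import json
import subprocess
import sys

# Small script run in a child interpreter to report its flags and path.
PROBE = (
    "import json, sys; "
    "print(json.dumps({'no_site': sys.flags.no_site, 'path': sys.path}))"
)


def probe_interpreter(*flags):
    """Run the current Python with extra flags and return its reported state."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", PROBE],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    default_path = probe_interpreter()["path"]
    isolated_path = probe_interpreter("-S")["path"]
    # Entries that disappear under -S come from the site module.
    for entry in default_path:
        if entry not in isolated_path:
            print("only with site enabled:", entry)
```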
Once unpacked, you can inspect the files on disk and even edit them if you’re trying to do some tricky debugging. Shiv will not overwrite the files of a previously unpacked zipapp unless the environment variable SHIV_FORCE_EXTRACT is set.
That’s It
Turns out shiv is pretty straightforward. One of the things I like most about it is its simplicity. When there’s a problem with my code, it’s easy to poke around and rule out shiv as a cause. The code is well commented and easy to follow (here’s the bootstrap script). It is also stable and appears to be feature complete, which means I’m not working against a moving target.