In a previous post, we showed how to use shiv to bundle a Django project into a single file for distribution and deployment. Running a large Python project as a single file feels like magic -- which is great until you need to debug a problem. At that point, you need to understand how things work and what is happening under the hood. With that in mind, let's demystify shiv's magic.
Shiv uses a little-known feature of Python called a ZIP application or "zipapp". Zipapps provide a way to bundle multiple Python files into a single ZIP archive which the Python interpreter can then execute. Here's a trivial example:
$ mkdir myapp
$ echo 'print("hello world")' > myapp/__main__.py
$ python -m zipapp myapp
$ python myapp.pyz
hello world
Pretty cool, huh? Copy myapp.pyz anywhere you have a compatible version of Python and it just works.
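The same archive can also be built from Python with the standard library's zipapp module. Here is a minimal sketch that recreates the example above in a temporary directory and runs the result. The helper name build_and_run and the directory layout are only illustrative.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run(message: str = "hello world") -> str:
    """Build a one-file zipapp in a temp directory and return what it prints."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "myapp"
        source.mkdir()
        # __main__.py is the file Python executes when it is handed the archive.
        (source / "__main__.py").write_text(f"print({message!r})\n")

        target = Path(tmp) / "myapp.pyz"
        zipapp.create_archive(source, target)

        # Run the archive with the same interpreter that is running this script.
        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run())
```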
But real-world projects aren't so simple. They have dependencies, non-Python files that need to be included, and complex build steps.
The folks at LinkedIn built shiv to work around the shortcomings of traditional zipapps. It includes dependencies in the zipapp and ensures they are on the PYTHONPATH at runtime. Shiv is only needed for creation. The resulting zipapp is executable without any additional tooling. You can think of shiv as a wrapper around pip, which it uses behind the scenes to download and install dependencies. Here is an example of creating a zipapp for awscli:
$ shiv --output-file=aws.pyz --entry-point=awscli.clidriver.main awscli
This downloads awscli and all its dependencies from PyPI and creates a zipapp (aws.pyz) which will run the Python function specified by --entry-point on execution. That file can be copied to any system with a compatible Python version and executed as if it were a binary executable.
Under the Hood
The good news is that shiv is surprisingly simple. When we build a new archive, shiv does the following:
- Use pip to download all the dependencies to a temporary directory
- Write some metadata used during the bootstrap process to environment.json
- Create a bootstrap script that is used when the zipapp is executed.
- Bundle those files up into a zip archive
- Insert a shebang at the top of the zip archive so it can be used as an executable.
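The last two steps are what turn an ordinary zip file into something the shell can run. As a rough illustration, the sketch below uses the standard library's zipapp module (not shiv's own build code) to create an archive with a shebang, then shows that the file still opens as a normal zip. The interpreter string, file names, and the helper inspect_archive are assumptions chosen for the example.

```python
import tempfile
import zipapp
import zipfile
from pathlib import Path


def inspect_archive(interpreter: str = "/usr/bin/env python3"):
    """Build a tiny zipapp with a shebang; return (first_line, shebang, names)."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "app"
        source.mkdir()
        (source / "__main__.py").write_text("print('hi')\n")

        target = Path(tmp) / "app.pyz"
        # Passing `interpreter` makes zipapp write "#!<interpreter>" before the zip data.
        zipapp.create_archive(source, target, interpreter=interpreter)

        with open(target, "rb") as fh:
            first_line = fh.readline()

        # Zip readers locate the archive index from the end of the file,
        # so the extra line at the top does not break them.
        with zipfile.ZipFile(target) as zf:
            names = zf.namelist()

        return first_line, zipapp.get_interpreter(str(target)), names


if __name__ == "__main__":
    print(inspect_archive())
```

Because the zip index lives at the end of the file, tools like unzip can read the archive even though it starts with a line of text.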
The bootstrap script's responsibility is to:
- Unpack the dependencies to a unique path. This is skipped if it already exists from a previous run.
- Insert that path into sys.path
- Execute the --entry-point we defined during creation.
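To make those three responsibilities concrete, here is a simplified sketch of what such a bootstrap could look like. This is not shiv's actual bootstrap code: the function name bootstrap, its parameters, and the dotted entry-point parsing are assumptions made for illustration, and the build ID is passed in directly rather than read from the metadata.

```python
import importlib
import sys
import zipfile
from pathlib import Path


def bootstrap(archive: Path, root: Path, build_id: str, entry_point: str):
    """Unpack once, extend sys.path, then call the entry point (illustrative only)."""
    # 1. A unique directory per build: <root>/<archive name>_<build id>
    target = root / f"{archive.stem}_{build_id}"
    site_packages = target / "site-packages"

    # 2. Unpack only if a previous run has not already done so.
    if not site_packages.exists():
        with zipfile.ZipFile(archive) as zf:
            members = [n for n in zf.namelist() if n.startswith("site-packages/")]
            zf.extractall(target, members=members)

    # 3. Make the unpacked dependencies importable.
    sys.path.insert(0, str(site_packages))
    importlib.invalidate_caches()

    # 4. Resolve "package.module.function" and call it.
    module_name, _, func_name = entry_point.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)()
```

The real bootstrap handles more cases than this, but the shape is the same: extract if needed, adjust the import path, then hand control to our code.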
The unique path is determined by combining the filename of the zipapp (without its extension) and a UUID generated at build time (stored in environment.json). It is stored in ~/.shiv by default but can be changed by setting the SHIV_ROOT environment variable. After execution, we can see the following:
$ cd ~/.shiv && find . -maxdepth 2
.
./aws_3e32a16c-6652-44cc-a561-3784814d736e
./aws_3e32a16c-6652-44cc-a561-3784814d736e/site-packages
The site-packages directory looks just like the site-packages directory you'd find for a standard Python installation or a virtualenv. As we can see, this was assigned a UUID of 3e32a16c-6652-44cc-a561-3784814d736e at build time, which can be confirmed by inspecting the included metadata:
$ unzip -p aws.pyz environment.json | jq .build_id
"3e32a16c-6652-44cc-a561-3784814d736e"
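If jq isn't handy, the same lookup is easy to do from Python. The sketch below reads build_id out of environment.json and combines it with the archive name to guess where the dependencies were unpacked. The helper names read_build_id and cache_dir are hypothetical, and the path layout simply mirrors the listing above, so treat it as an assumption rather than a guarantee across shiv versions.

```python
import json
import os
import zipfile
from pathlib import Path


def read_build_id(archive) -> str:
    """Return the build_id recorded in the archive's environment.json."""
    with zipfile.ZipFile(archive) as zf:
        return json.loads(zf.read("environment.json"))["build_id"]


def cache_dir(archive, environ=os.environ) -> Path:
    """Guess where the archive's dependencies are unpacked."""
    # SHIV_ROOT overrides the default of ~/.shiv.
    root = Path(environ.get("SHIV_ROOT", "~/.shiv")).expanduser()
    return root / f"{Path(archive).stem}_{read_build_id(archive)}"
```

For the archive above, cache_dir("aws.pyz") would point at the aws_3e32a16c-6652-44cc-a561-3784814d736e directory under ~/.shiv.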
The unpacked site-packages directory is added to the PYTHONPATH like so:
$ echo "import sys, pprint; pprint.pprint(sys.path)" | \
    SHIV_INTERPRETER=1 ./aws.pyz
Python 3.7.3 (default, Jun 11 2019, 01:05:09)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> ['./aws.pyz',
 '/usr/local/lib/python37.zip',
 '/usr/local/lib/python3.7',
 '/usr/local/lib/python3.7/lib-dynload',
 '/home/user/.shiv/aws_3e32a16c-6652-44cc-a561-3784814d736e/site-packages',
 '/usr/local/lib/python3.7/site-packages']
Note: The environment variable SHIV_INTERPRETER allows us to drop into a Python shell using the zipapp's environment.
The fifth item, /home/user/.shiv/..., is the one that was injected by the bootstrap script. The others are Python defaults.
If you prefer to run in an isolated environment without the global site packages, Python's -S flag can be used:
$ echo "import sys, pprint; pprint.pprint(sys.path)" | \
    SHIV_INTERPRETER=1 python -S aws.pyz
Python 3.7.3 (default, Jun 11 2019, 01:05:09)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> ['aws.pyz',
 '/usr/local/lib/python37.zip',
 '/usr/local/lib/python3.7',
 '/usr/local/lib/python3.7/lib-dynload',
 '/home/user/.shiv/aws_3e32a16c-6652-44cc-a561-3784814d736e/site-packages']
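The effect of -S is not specific to shiv, so it can be checked against any plain interpreter. This small sketch starts a fresh Python with and without the flag and prints the entries that only show up when the site module runs. The helper name sys_path is made up for the example, and the exact entries will differ from system to system.

```python
import json
import subprocess
import sys


def sys_path(*flags: str) -> list:
    """Return sys.path as seen by a fresh interpreter started with `flags`."""
    code = "import sys, json; print(json.dumps(sys.path))"
    out = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return json.loads(out)


if __name__ == "__main__":
    default = sys_path()
    isolated = sys_path("-S")
    # Entries that only appear when the site module is imported at startup.
    for entry in default:
        if entry not in isolated:
            print(entry)
```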
Once unpacked, you can inspect the files on disk and even edit them if you're trying to do some tricky debugging. Shiv will not overwrite the files of a previously unpacked zipapp unless the environment variable SHIV_FORCE_EXTRACT is set.
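In other words, the decision to unpack boils down to a check along these lines. This is a sketch of the behavior described above, not shiv's code: the function needs_extract is hypothetical, and treating any value of the variable (even an empty one) as "set" is an assumption.

```python
import os
from pathlib import Path


def needs_extract(site_packages: Path, environ=os.environ) -> bool:
    """Decide whether the dependencies should be (re)unpacked."""
    # Assumption: any value, even an empty string, counts as "set".
    forced = "SHIV_FORCE_EXTRACT" in environ
    return forced or not site_packages.exists()
```

So if you have edited the unpacked files and want a clean copy back, running the zipapp once with SHIV_FORCE_EXTRACT set should restore them.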
Turns out shiv is pretty straightforward. One of the things I like most about it is its simplicity. When there's a problem with my code, it's easy to poke around and rule out shiv as a cause. The code is well commented and easy to follow (here's the bootstrap script). It is also stable and appears to be feature-complete, which means not having to work against a moving target.