This is part of a series of posts on ideas for an Ansible-like provisioning system, implemented in Transilience.
Mitogen is a great library, but scarily complicated, and I've been wondering how hard it would be to make alternative connection methods for Transilience.
Here's a wild idea: can I package a whole Transilience playbook, plus dependencies, in a zipapp, then send the zipapp to the machine to be provisioned, and run it locally?
It turns out I can.
Creating the zipapp
This is somewhat hackish, but until I can rely on Python 3.9's improved importlib.resources module, I cannot think of a better way:
def zipapp(self, target: str, interpreter=None):
    """
    Bundle this playbook into a self-contained zipapp
    """
    import zipapp
    import jinja2
    import transilience
    if interpreter is None:
        interpreter = sys.executable

    if getattr(transilience.__loader__, "archive", None):
        # Recursively iterating module directories requires Python 3.9+
        raise NotImplementedError("Cannot currently create a zipapp from a zipapp")

    with tempfile.TemporaryDirectory() as workdir:
        # Copy transilience
        shutil.copytree(os.path.dirname(__file__), os.path.join(workdir, "transilience"))
        # Copy jinja2
        shutil.copytree(os.path.dirname(jinja2.__file__), os.path.join(workdir, "jinja2"))
        # Copy argv[0] as __main__.py
        shutil.copy(sys.argv[0], os.path.join(workdir, "__main__.py"))
        # Copy argv[0]/roles
        role_dir = os.path.join(os.path.dirname(sys.argv[0]), "roles")
        if os.path.isdir(role_dir):
            shutil.copytree(role_dir, os.path.join(workdir, "roles"))
        # Turn everything into a zipapp
        zipapp.create_archive(workdir, target, interpreter=interpreter, compressed=True)
Since the zipapp contains not just the playbook, the roles, and the roles' assets, but also Transilience and Jinja2, it can run on any system that has a Python 3.7+ interpreter, and nothing else!
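To see the underlying mechanism in isolation, here is a minimal sketch that uses only the standard library. It is not Transilience code: the helper names `build_demo_zipapp` and `run_zipapp` are made up for illustration, and the bundled `__main__.py` is a one-line placeholder standing in for a playbook.

```python
import os
import subprocess
import sys
import tempfile
import zipapp


def build_demo_zipapp(target: str) -> None:
    """Bundle a placeholder __main__.py into a zipapp (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as workdir:
        # A zipapp needs a __main__.py at its root: that is what runs on startup
        with open(os.path.join(workdir, "__main__.py"), "w") as fd:
            fd.write("print('hello from the zipapp')\n")
        # Same call as in the playbook method: the interpreter ends up in the
        # shebang line, and compressed=True deflates the archive members
        zipapp.create_archive(workdir, target, interpreter=sys.executable, compressed=True)


def run_zipapp(path: str) -> str:
    """Run the archive with the current interpreter and return its stdout."""
    res = subprocess.run([sys.executable, path], capture_output=True, text=True, check=True)
    return res.stdout
```

Anything copied into the work directory next to `__main__.py` becomes importable from inside the archive, which is why copying the transilience and jinja2 package directories is enough to make the playbook self-contained.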
I added it to the standard set of playbook command line options, so any Transilience playbook can turn itself into a self-contained zipapp:
$ ./provision --help
usage: provision [-h] [-v] [--debug] [-C] [--local LOCAL]
                 [--ansible-to-python role | --ansible-to-ast role | --zipapp file.pyz]
[...]
  --zipapp file.pyz     bundle this playbook in a self-contained executable
                        python zipapp
Loading assets from the zipapp
I had to create ZipFile varieties of some bits of infrastructure in Transilience, to load templates, files, and Ansible yaml files from zip files.
You can see above a way to detect if a module is loaded from a zipfile: check if the module's __loader__ attribute has an archive attribute.
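As a small sketch of that check, the function below wraps it in a reusable helper. The name `zip_archive_of` is an assumption made for this example, not part of Transilience; it relies on the documented `archive` attribute of `zipimport.zipimporter`, which is the loader used for modules imported from a zip file.

```python
from types import ModuleType
from typing import Optional


def zip_archive_of(module: ModuleType) -> Optional[str]:
    """
    Return the path of the zip archive a module was imported from, or None
    if the module does not appear to come from a zip file (hypothetical helper)
    """
    # Modules imported from a zip have a zipimport.zipimporter as loader,
    # and zipimporter exposes the archive file name as .archive
    loader = getattr(module, "__loader__", None)
    return getattr(loader, "archive", None)
```

With this, a call like `zip_archive_of(transilience)` would give the archive to pass to `zipfile.ZipFile` when looking for templates and role assets.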
Here's a Jinja2 template loader that looks into a zip:
class ZipLoader(jinja2.BaseLoader):
    def __init__(self, archive: zipfile.ZipFile, root: str):
        self.zipfile = archive
        self.root = root

    def get_source(self, environment: jinja2.Environment, template: str):
        path = os.path.join(self.root, template)
        with self.zipfile.open(path, "r") as fd:
            source = fd.read().decode()
        return source, None, lambda: True
I also created a FileAsset abstract interface to represent a local file, and had Role.lookup_file return an appropriate instance:
def lookup_file(self, path: str) -> "FileAsset":
    """
    Resolve a pathname inside the place where the role assets are stored.
    Returns a FileAsset giving access to the file
    """
    if self.role_assets_zipfile is not None:
        return ZipFileAsset(self.role_assets_zipfile, os.path.join(self.role_assets_root, path))
    else:
        return LocalFileAsset(os.path.join(self.role_assets_root, path))
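The classes themselves are not shown here, so the following is only a sketch of what such an interface could look like. The constructor arguments match the calls in lookup_file, but the `open` and `read` methods are assumptions made for illustration, and the real Transilience classes may well differ.

```python
import abc
import zipfile
from typing import BinaryIO


class FileAsset(abc.ABC):
    """Hypothetical minimal interface: a file that can be opened for reading"""

    @abc.abstractmethod
    def open(self) -> BinaryIO:
        """Return a binary file object, usable as a context manager"""

    def read(self) -> bytes:
        """Read the whole contents of the asset"""
        with self.open() as fd:
            return fd.read()


class LocalFileAsset(FileAsset):
    """Asset stored as a regular file on the local file system"""

    def __init__(self, path: str):
        self.path = path

    def open(self) -> BinaryIO:
        return open(self.path, "rb")


class ZipFileAsset(FileAsset):
    """Asset stored as a member of an already opened zip archive"""

    def __init__(self, archive: zipfile.ZipFile, path: str):
        self.archive = archive
        self.path = path

    def open(self) -> BinaryIO:
        # ZipFile.open returns a read-only, file-like object for the member
        return self.archive.open(self.path, "r")
```

Code that consumes an asset then only calls `open()` or `read()`, and does not need to know whether the playbook is running from a checkout or from a zipapp.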
An interesting side effect of having smarter local file accessors is that I can cache the contents of small files and transmit them to the remote host together with the other action parameters, saving a potential network round trip for each builtin.copy action that has a small source.
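The idea can be sketched roughly as follows. This is not how Transilience serializes its actions: the function name, the parameter dictionary layout and the 16 KiB threshold are all assumptions chosen for the example.

```python
import os
from typing import Any, Dict

# Hypothetical cutoff: files up to this size travel inline with the action
SMALL_FILE_THRESHOLD = 16384


def copy_action_params(src: str, dest: str, threshold: int = SMALL_FILE_THRESHOLD) -> Dict[str, Any]:
    """
    Build the parameters for a copy-like action, embedding the contents of
    small source files so that no separate file transfer is needed
    """
    params: Dict[str, Any] = {"dest": dest}
    if os.path.getsize(src) <= threshold:
        # Small file: ship the bytes together with the other parameters
        with open(src, "rb") as fd:
            params["content"] = fd.read()
    else:
        # Large file: only send the path, and transfer the data separately
        params["src"] = src
    return params
```

The trade-off is between a larger action message and an extra round trip, so the threshold would need tuning against real playbooks.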
The result
The result is kind of fun:
$ time ./provision --zipapp test.pyz
real 0m0.203s
user 0m0.174s
sys 0m0.029s
$ time scp test.pyz root@test:
test.pyz 100% 528KB 388.9KB/s 00:01
real 0m1.576s
user 0m0.010s
sys 0m0.007s
And on the remote:
# time ./test.pyz --local=test
2021-06-29 18:05:41,546 test: [connected 0.000s]
[...]
2021-06-29 18:12:31,555 test: 88 total actions in 0.00ms: 87 unchanged, 0 changed, 1 skipped, 0 failed, 0 not executed.
real 0m0.979s
user 0m0.783s
sys 0m0.172s
Compare with a Mitogen run:
$ time PYTHONPATH=../transilience/ ./provision
2021-06-29 18:13:44 test: [connected 0.427s]
[...]
2021-06-29 18:13:46 test: 88 total actions in 2.50s: 87 unchanged, 0 changed, 1 skipped, 0 failed, 0 not executed.
real 0m2.697s
user 0m0.856s
sys 0m0.042s
From a single test run, not a good benchmark, it's 0.203 + 1.576 + 0.979 = 2.758s with the zipapp and 2.697s with Mitogen. Even if I've been lucky, it's a similar order of magnitude.
What can I use this for?
This was mostly a fun hack.
It could, however, be the basis for a Fabric-based connector or a clusterssh-based connector, or a way to bundle a Transilience playbook into an installation image, or to add a provisioning script to the boot partition of a Raspberry Pi. It looks like an interesting trick to have up one's sleeve.
One could even build an Ansible-based connector(!) in which a simple Ansible playbook, with no facts gathering, is used to build the zipapp, push it to remote systems and run it. That would be the wackiest way of speeding up Ansible, ever!
Next: using Systemd containers with unittest, for Transilience's test suite.