This is part of a series of posts on ideas for an ansible-like provisioning system, implemented in Transilience.
Unit testing some parts of Transilience, like the apt and systemd actions or remote Mitogen connections, really benefits from a containerized system to test against.
To get one, I reused my work on nspawn-runner to build a simple and very fast system of ephemeral containers, with minimal dependencies, based on systemd-nspawn and btrfs snapshots.
Setup
To be able to use systemd-nspawn --ephemeral, the chroots need to be btrfs subvolumes. If you are not running on a btrfs filesystem, you can create one to run the tests, even on a file:
fallocate -l 1.5G testfile
/usr/sbin/mkfs.btrfs testfile
# the mount point needs to exist before mounting
mkdir -p test_chroots
sudo mount -o loop testfile test_chroots/
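To double check that test_chroots/ really ended up on btrfs, one can look the path up in /proc/mounts. What follows is only a minimal sketch: filesystem_type is a hypothetical helper, not part of Transilience, and it ignores the escaping that /proc/mounts applies to mount points containing spaces. It also only checks the filesystem type, not whether a directory is a subvolume.

```python
import os


def filesystem_type(path: str, mounts_text: str) -> str:
    """
    Return the filesystem type of the mount that contains path.

    mounts_text is the content of /proc/mounts: one mount per line, with
    device, mount point and filesystem type as the first three fields.
    (Hypothetical helper, written for illustration only.)
    """
    path = os.path.abspath(path)
    best_mountpoint, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mountpoint, fstype = fields[1], fields[2]
        # Compare whole path components, so that /a/b does not match /a/bc
        prefix = mountpoint.rstrip("/") + "/"
        # Prefer the longest matching mount point; on ties, the later mount wins
        if (path + "/").startswith(prefix) and len(mountpoint) >= len(best_mountpoint):
            best_mountpoint, best_type = mountpoint, fstype
    return best_type
```

On a real system it could be called as filesystem_type("test_chroots", open("/proc/mounts").read()), expecting "btrfs" once the loop mount above is in place.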
I created a script to set up the test environment; here is an extract:
mkdir -p test_chroots
cat << EOF > "test_chroots/CACHEDIR.TAG"
Signature: 8a477f597d28d172789f06886806bc55
# chroots used for testing transilience, can be regenerated with make-test-chroot
EOF
btrfs subvolume create test_chroots/buster
eatmydata debootstrap --variant=minbase --include=python3,dbus,systemd buster test_chroots/buster
CACHEDIR.TAG is a nice trick to tell backup software not to bother backing up the contents of this directory, since it can be easily regenerated.
eatmydata is optional, and it speeds up debootstrap quite a bit.
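The same tag can be written and checked from Python, for example if the setup script were ever folded into the test suite. This is a sketch under that assumption: write_cachedir_tag and is_cache_dir are hypothetical names. The only fixed part is the signature line, the same one used in the shell extract above, which has to be the very first thing in the file.

```python
import os

# The fixed signature that must open every CACHEDIR.TAG file
CACHEDIR_SIGNATURE = "Signature: 8a477f597d28d172789f06886806bc55"


def write_cachedir_tag(directory: str, comment: str) -> str:
    """
    Mark directory as a cache directory and return the path of the tag file
    (hypothetical helper, for illustration)
    """
    path = os.path.join(directory, "CACHEDIR.TAG")
    with open(path, "w") as fd:
        print(CACHEDIR_SIGNATURE, file=fd)
        print(f"# {comment}", file=fd)
    return path


def is_cache_dir(directory: str) -> bool:
    """
    Check if directory contains a CACHEDIR.TAG starting with the signature
    """
    try:
        with open(os.path.join(directory, "CACHEDIR.TAG"), "rb") as fd:
            return fd.read(len(CACHEDIR_SIGNATURE)) == CACHEDIR_SIGNATURE.encode()
    except OSError:
        # Missing or unreadable tag file: treat the directory as untagged
        return False
```

Checking the signature rather than just the file name matters, because backup tools that honour the tag are expected to ignore a CACHEDIR.TAG file that does not start with it.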
Running unittest with sudo
Here's a simple helper to drop root as soon as possible, and regain it only when needed. Note that it needs $SUDO_UID and $SUDO_GID, which are set by sudo, to know which user to drop into:
import contextlib
import os


class ProcessPrivs:
    """
    Drop root privileges and regain them only when needed
    """
    def __init__(self):
        self.orig_uid, self.orig_euid, self.orig_suid = os.getresuid()
        self.orig_gid, self.orig_egid, self.orig_sgid = os.getresgid()

        if "SUDO_UID" not in os.environ:
            raise RuntimeError("Tests need to be run under sudo")

        self.user_uid = int(os.environ["SUDO_UID"])
        self.user_gid = int(os.environ["SUDO_GID"])

        self.dropped = False

    def drop(self):
        """
        Drop root privileges
        """
        if self.dropped:
            return
        # Change the gid first, while we are still root.
        # Keeping 0 as the saved uid/gid is what allows regaining root later.
        os.setresgid(self.user_gid, self.user_gid, 0)
        os.setresuid(self.user_uid, self.user_uid, 0)
        self.dropped = True

    def regain(self):
        """
        Regain root privileges
        """
        if not self.dropped:
            return
        os.setresuid(self.orig_suid, self.orig_suid, self.user_uid)
        os.setresgid(self.orig_sgid, self.orig_sgid, self.user_gid)
        self.dropped = False

    @contextlib.contextmanager
    def root(self):
        """
        Regain root privileges for the duration of this context manager
        """
        if not self.dropped:
            yield
        else:
            self.regain()
            try:
                yield
            finally:
                self.drop()

    @contextlib.contextmanager
    def user(self):
        """
        Drop root privileges for the duration of this context manager
        """
        if self.dropped:
            yield
        else:
            self.drop()
            try:
                yield
            finally:
                self.regain()


privs = ProcessPrivs()
privs.drop()
As soon as this module is loaded, root privileges are dropped. They can then be regained for as short a time as possible using a handy context manager:
with privs.root():
    subprocess.run(["systemd-run", ...], check=True, capture_output=True)
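What makes regaining root possible is that drop() leaves 0 in the saved uid: a process may switch its effective uid to its real or saved uid without being privileged. The sketch below classifies the triple returned by os.getresuid() accordingly. privilege_state is a hypothetical helper written for illustration, and the labels it returns are made up.

```python
import os
from typing import Tuple


def privilege_state(resuid: Tuple[int, int, int]) -> str:
    """
    Describe a (real, effective, saved) uid triple, as returned by
    os.getresuid() (hypothetical helper, for illustration)
    """
    ruid, euid, suid = resuid
    if euid == 0:
        # Effective uid 0: the process currently acts as root
        return "root"
    if 0 in (ruid, suid):
        # Root is kept aside in the real or saved uid and can be switched back to
        return "dropped, can regain root"
    # No trace of uid 0 left: root is gone for good
    return "unprivileged"


# Show the state of the current process
print(privilege_state(os.getresuid()))
```

With the helper above, the process would be in the second state after privs.drop(), and back in the first inside a with privs.root() block.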
Using the chroot from test cases
The infrastructure to set up and tear down ephemeral machines is relatively simple, once one has worked out the nspawn incantations:
import atexit
import logging
import os
import shlex
import subprocess
import uuid
from typing import Dict, Optional

# Assumption: a module-level logger; privs is the ProcessPrivs instance above
log = logging.getLogger(__name__)


class Chroot:
    """
    Manage an ephemeral chroot
    """
    running_chroots: Dict[str, "Chroot"] = {}

    def __init__(self, name: str, chroot_dir: Optional[str] = None):
        self.name = name
        if chroot_dir is None:
            self.chroot_dir = self.get_chroot_dir(name)
        else:
            self.chroot_dir = chroot_dir
        self.machine_name = f"transilience-{uuid.uuid4()}"

    def start(self):
        """
        Start nspawn on this given chroot.

        The systemd-nspawn command is run contained into its own unit using
        systemd-run
        """
        unit_config = [
            'KillMode=mixed',
            'Type=notify',
            'RestartForceExitStatus=133',
            'SuccessExitStatus=133',
            'Slice=machine.slice',
            'Delegate=yes',
            'TasksMax=16384',
            'WatchdogSec=3min',
        ]

        cmd = ["systemd-run"]
        for c in unit_config:
            cmd.append(f"--property={c}")

        cmd.extend((
            "systemd-nspawn",
            "--quiet",
            "--ephemeral",
            f"--directory={self.chroot_dir}",
            f"--machine={self.machine_name}",
            "--boot",
            "--notify-ready=yes"))

        log.info("%s: starting machine using image %s", self.machine_name, self.chroot_dir)

        log.debug("%s: running %s", self.machine_name, " ".join(shlex.quote(c) for c in cmd))
        with privs.root():
            subprocess.run(cmd, check=True, capture_output=True)
        log.debug("%s: started", self.machine_name)
        self.running_chroots[self.machine_name] = self

    def stop(self):
        """
        Stop the running ephemeral container
        """
        cmd = ["machinectl", "terminate", self.machine_name]
        log.debug("%s: running %s", self.machine_name, " ".join(shlex.quote(c) for c in cmd))
        with privs.root():
            subprocess.run(cmd, check=True, capture_output=True)
        log.debug("%s: stopped", self.machine_name)
        del self.running_chroots[self.machine_name]

    @classmethod
    def create(cls, chroot_name: str) -> "Chroot":
        """
        Start an ephemeral machine from the given master chroot
        """
        res = cls(chroot_name)
        res.start()
        return res

    @classmethod
    def get_chroot_dir(cls, chroot_name: str):
        """
        Locate a master chroot under test_chroots/
        """
        chroot_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "test_chroots", chroot_name))
        if not os.path.isdir(chroot_dir):
            raise RuntimeError(f"{chroot_dir} does not exist or is not a chroot directory")
        return chroot_dir


# We need to use atexit, because unittest won't run
# tearDown/tearDownClass/tearDownModule methods in case of KeyboardInterrupt
# and we need to make sure to terminate the nspawn containers at exit
@atexit.register
def cleanup():
    # Use a list to prevent changing running_chroots during iteration
    for chroot in list(Chroot.running_chroots.values()):
        chroot.stop()
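Outside of unittest, where setUpClass and tearDownClass are not available, a small context manager is one way to make sure stop() follows every successful start(). This is a sketch and not part of the code above: running is a hypothetical helper that works with any object having start() and stop() methods, and FakeMachine is a stand-in used here only so the example runs without root or containers.

```python
import contextlib


@contextlib.contextmanager
def running(machine):
    """
    Start machine, and stop it when the with block ends, even on exceptions
    (hypothetical helper: machine is any object with start() and stop())
    """
    # If start() raises, nothing is running and stop() is not called
    machine.start()
    try:
        yield machine
    finally:
        machine.stop()


class FakeMachine:
    """
    Stand-in for Chroot that records calls instead of running systemd-nspawn
    """
    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


machine = FakeMachine()
try:
    with running(machine):
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass
print(machine.events)
```

With the real class this would read with running(Chroot("buster")) as chroot, and the atexit hook would remain as a safety net for interrupted runs.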
And here's a TestCase mixin that starts a containerized system and opens a Mitogen connection to it:
class ChrootTestMixin:
    """
    Mixin to run tests over a setns connection to an ephemeral systemd-nspawn
    container running one of the test chroots
    """
    chroot_name = "buster"

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # Import the submodule explicitly, since mitogen.master is used below
        import mitogen.master
        from transilience.system import Mitogen
        cls.broker = mitogen.master.Broker()
        cls.router = mitogen.master.Router(cls.broker)
        cls.chroot = Chroot.create(cls.chroot_name)
        with privs.root():
            cls.system = Mitogen(
                    cls.chroot.name, "setns", kind="machinectl",
                    python_path="/usr/bin/python3",
                    container=cls.chroot.machine_name, router=cls.router)

    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()
        cls.system.close()
        cls.broker.shutdown()
        cls.chroot.stop()
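To use a mixin like this, list it before unittest.TestCase in the base classes, so that its setUpClass runs and chains to TestCase via super(). The sketch below shows the resulting call order with a hypothetical FakeContainerMixin that only records events, so it can run without root, containers or Mitogen. TestSomething and run_demo are likewise made up for the example.

```python
import io
import unittest

events = []


class FakeContainerMixin:
    """
    Stand-in for ChrootTestMixin: records events instead of booting a container
    """
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        events.append("container started")

    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()
        events.append("container stopped")


# The mixin comes first, so that its class-level fixtures are the ones called
class TestSomething(FakeContainerMixin, unittest.TestCase):
    def test_first(self):
        events.append("test_first")

    def test_second(self):
        events.append("test_second")


def run_demo() -> bool:
    """
    Run the test case quietly and report whether all tests passed
    """
    events.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSomething)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()
```

The container is started once per test class and shared by all its test methods, which keeps the cost of booting it out of the individual tests.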
Running tests
Once the tests are set up, everything goes on as normal, except one needs to run nose2 with sudo:
sudo nose2-3
Spin-up time for containers is pretty fast, and the tests drop root as soon as possible, regaining it only for as long as needed.
Also, the dependencies for all this are minimal and available on most systems, and the setup instructions seem pretty straightforward.