Latest posts for tag devel
Things I learnt in March 2023
- str.endswith() can take a tuple of possible endings instead of a single string
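A quick illustration of the tuple form. The file names are made up for the example:

```python
# str.endswith() accepts a tuple of suffixes, and matches if any of them matches
names = ["photo.jpg", "notes.txt", "scan.jpeg", "archive.tar.gz"]
images = [name for name in names if name.endswith((".jpg", ".jpeg"))]
print(images)  # ['photo.jpg', 'scan.jpeg']
```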
About JACK and Debian
- There are 3 JACK implementations: jackd1, jackd2, pipewire-jack.
- jackd1 is mostly superseded in favour of jackd2, and as far as I understand, can be ignored
- pipewire-jack integrates well with pipewire and the rest of the Linux audio world
- jackd2 is the native JACK server. When started it handles the sound card directly, and will steal it from pipewire. Non-JACK audio applications will likely cease to see the sound card until JACK is stopped and wireplumber is restarted. Pipewire should be able to keep working as a JACK client but I haven't gone down that route yet
- pipewire-jack mostly works. At some point I experienced glitches in complex JACK apps like giada or ardour that went away after switching to jackd2. I have not investigated further into the glitches
- So: try things with pw-jack. If you see odd glitches, try without pw-jack to use the native jackd2. Keep in mind, if you do so, that you will lose standard pipewire until you stop jackd2 and restart wireplumber.
Heart-driven drum loop
I have Python code for reading a heart rate monitor.
I have Python code to generate MIDI events.
Could I resist putting them together? Clearly not.
Here's Jack Of Hearts, a JACK MIDI drum loop generator that uses the heart rate for BPM, and an improvised way to compute heart rate increase/decrease to add variations in the drum pattern.
It's very simple minded and silly. To me it was a fun way of putting unrelated things together, and Python worked very well for it.
Generating MIDI events with JACK and Python
I had a go at trying to figure out how to generate arbitrary MIDI events and send them out over a JACK MIDI channel.
Setting up JACK and Pipewire
Pipewire has a JACK interface, which in theory means one could use JACK clients out of the box without extra setup.
In practice, one needs to tell JACK clients which set of libraries to use to communicate with servers, and Pipewire's JACK server is not the default choice.
To tell JACK clients to use Pipewire's server, you can either:
- on a client-by-client basis, wrap the commands with pw-jack
- to change the system default:
cp /usr/share/doc/pipewire/examples/ld.so.conf.d/pipewire-jack-*.conf /etc/ld.so.conf.d/
and run ldconfig
(see the Debian wiki for details)
Programming with JACK
Python has a JACK client library that worked flawlessly for me so far.
Everything with JACK is designed around minimizing latency. Everything happens around a callback that gets called from a separate thread, and which gets a buffer to fill with events.
All the heavy processing needs to happen outside the callback, and the callback is only there to do the minimal amount of work needed to shovel the data your application produced into JACK channels.
Generating MIDI messages
The Mido library can be used to parse and create MIDI messages and it also worked flawlessly for me so far.
One needs to study a bit what kind of MIDI message one needs to generate (like "note on", "note off", "program change") and what arguments they get.
It also helps to read about the General MIDI standard which defines mappings between well-known instruments and channels and instrument numbers in MIDI messages.
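To get a feel for what those messages look like on the wire, here is a hand-rolled sketch that builds the raw bytes of three common channel messages. It does not use Mido, and the function names are mine, made up for illustration. A library like Mido takes care of this encoding for you.

```python
def note_on(note: int, velocity: int = 64, channel: int = 0) -> bytes:
    """Raw "note on": status byte 0x90 plus the channel, then note and velocity."""
    return bytes([0x90 | channel, note, velocity])


def note_off(note: int, velocity: int = 0, channel: int = 0) -> bytes:
    """Raw "note off": status byte 0x80 plus the channel, then note and velocity."""
    return bytes([0x80 | channel, note, velocity])


def program_change(program: int, channel: int = 0) -> bytes:
    """Raw "program change": status byte 0xC0 plus the channel, then the program."""
    return bytes([0xC0 | channel, program])


# General MIDI reserves channel 10 for percussion, which is 9 when counting from 0
DRUM_CHANNEL = 9

msg = note_on(note=35, velocity=64, channel=DRUM_CHANNEL)
print(msg.hex())  # 992340
```

Channels go from 0 to 15 and fit in the low 4 bits of the status byte, while notes, velocities and programs go from 0 to 127.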
A timed message queue
To keep a queue of events that happen over time, I implemented a Delta List that indexes events by their future frame number.
I called the humble container for my audio experiments pyeep and here's my delta list implementation.
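For readers who have not met the data structure before, this is a minimal sketch of a classic delta list, where each entry stores its delay relative to the entry before it. It is not pyeep's implementation: the class and method names are assumptions made up for this example.

```python
from typing import Any, List, Tuple


class DeltaList:
    """Events in time order, each stored with its delay relative to the previous one."""

    def __init__(self) -> None:
        # Each entry is [frames after the previous entry, event]
        self.entries: List[list] = []

    def add(self, frames: int, event: Any) -> None:
        """Schedule event to happen the given number of frames from now."""
        for pos, entry in enumerate(self.entries):
            if frames < entry[0]:
                # The new event goes before this entry, whose delta shrinks
                entry[0] -= frames
                self.entries.insert(pos, [frames, event])
                return
            frames -= entry[0]
        self.entries.append([frames, event])

    def advance(self, frames: int) -> List[Tuple[int, Any]]:
        """Move time forward, returning (offset, event) for the events now due."""
        due = []
        now = 0  # position, inside this window, of the last event popped
        while self.entries and now + self.entries[0][0] < frames:
            delta, event = self.entries.pop(0)
            now += delta
            due.append((now, event))
        if self.entries:
            # Only the first remaining entry needs adjusting
            self.entries[0][0] -= frames - now
        return due


dl = DeltaList()
dl.add(100, "C")
dl.add(10, "A")
dl.add(30, "B")
print(dl.advance(64))  # [(10, 'A'), (30, 'B')]
```

The nice property is that advancing time only touches the head of the list, no matter how many events are queued.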
A JACK player
The simple JACK MIDI player backend is also in pyeep.
It needs to protect the delta list with a mutex since we are working across thread boundaries, but it tries to do as little work under lock as possible, to minimize the risk of locking the realtime thread for too long.
The play method converts delays in seconds to frame counts, and the on_process callback moves events from the queue to the JACK output.
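The general shape of such a player can be sketched with the standard library alone. This is not pyeep's code: the class name, the heap-based queue and the default sample rate are assumptions for illustration, and where a real on_process would write to a JACK MIDI port, this one just returns the events that are due.

```python
import heapq
import threading
from typing import Any, List, Tuple


class SketchPlayer:
    """Hand events from application threads over to a process callback."""

    def __init__(self, samplerate: int = 48000) -> None:
        self.samplerate = samplerate
        self.lock = threading.Lock()
        self.current_frame = 0  # frame at the start of the next process cycle
        self.queue: List[Tuple[int, int, Any]] = []  # (frame, sequence, event)
        self.sequence = 0  # tie-breaker that keeps insertion order

    def play(self, event: Any, delay_sec: float = 0.0) -> None:
        """Called from application threads."""
        # Do the conversion outside the lock, to hold it as briefly as possible
        delay_frames = round(delay_sec * self.samplerate)
        with self.lock:
            frame = self.current_frame + delay_frames
            heapq.heappush(self.queue, (frame, self.sequence, event))
            self.sequence += 1

    def on_process(self, frames: int) -> List[Tuple[int, Any]]:
        """Called once per cycle: return (offset, event) for events due in it."""
        due = []
        with self.lock:
            end = self.current_frame + frames
            while self.queue and self.queue[0][0] < end:
                frame, _, event = heapq.heappop(self.queue)
                due.append((frame - self.current_frame, event))
            self.current_frame = end
        return due


player = SketchPlayer(samplerate=48000)
player.play("note_on")
player.play("note_off", delay_sec=0.5)
print(player.on_process(1024))  # [(0, 'note_on')]
```

At 48000 frames per second, the half second delay becomes 24000 frames, so the "note_off" only comes out of on_process a couple of dozen cycles later.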
Here's an example script that plays a simple drum pattern:
#!/usr/bin/python3
# Example JACK midi event generator
#
# Play a drum pattern over JACK

import time

from pyeep.jackmidi import MidiPlayer

# See:
# https://soundprogramming.net/file-formats/general-midi-instrument-list/
# https://www.pgmusic.com/tutorial_gm.htm
DRUM_CHANNEL = 9

with MidiPlayer("pyeep drums") as player:
    beat: int = 0
    while True:
        # Bass drum (note 35) on every beat
        player.play("note_on", velocity=64, note=35, channel=DRUM_CHANNEL)
        player.play("note_off", note=35, channel=DRUM_CHANNEL, delay_sec=0.5)
        if beat == 0:
            # Snare (note 38) on the first beat
            player.play("note_on", velocity=100, note=38, channel=DRUM_CHANNEL)
            player.play("note_off", note=38, channel=DRUM_CHANNEL, delay_sec=0.3)
        if beat + 1 == 2:
            # Closed hi-hat (note 42) on the second beat
            player.play("note_on", velocity=100, note=42, channel=DRUM_CHANNEL)
            player.play("note_off", note=42, channel=DRUM_CHANNEL, delay_sec=0.3)
        beat = (beat + 1) % 4
        time.sleep(0.3)
Running the example
I ran the jack_drums script, and of course not much happened.
First I needed a MIDI synthesizer. I installed fluidsynth, and ran it on the command line with no arguments. It registered with JACK, ready to do its thing.
Then I connected things together. I used qjackctl, opened the graph view, and connected the MIDI output of "pyeep drums" to the "FLUID Synth input port".
fluidsynth's output was already automatically connected to the audio card and I started hearing the drums playing! 🥁️🎉️
Monitoring a heart rate monitor
I bought myself a cheap wearable Bluetooth LE heart rate monitor in order to play with it, and this is a simple Python script to monitor it and plot data.
Bluetooth LE
I was surprised that these things seem decently interoperable.
You can use hcitool to scan for devices:
hcitool lescan
You can then use gatttool to connect to devices and poke at them interactively from a command line.
Bluetooth LE from Python
There is a nice library called Bleak which is also packaged in Debian. It's modern Python with asyncio and works beautifully!
Heart rate monitors
Things I learnt:
- The UUID for the heart rate interface starts with 00002a37.
- The UUID for checking battery status starts with 00002a19.
- A longer list of UUIDs is here.
- The layout of heart rate data packets and some Python code to parse them
- What are RR values
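As an illustration of the packet layout, here is a minimal sketch of a parser for Heart Rate Measurement notifications, written from the Bluetooth characteristic definition and not taken from my monitoring script. The names are made up for the example. RR values are the intervals between consecutive heartbeats, which the device sends in units of 1/1024 of a second.

```python
import struct
from typing import List, NamedTuple, Optional

# 16-bit UUID 0x2a37 expanded with the Bluetooth base UUID
HEART_RATE_MEASUREMENT_UUID = "00002a37-0000-1000-8000-00805f9b34fb"


class HeartRateSample(NamedTuple):
    bpm: int
    energy_expended: Optional[int]  # kilojoules, if the device reports it
    rr_intervals: List[float]  # seconds between consecutive beats


def parse_heart_rate(data: bytes) -> HeartRateSample:
    """Decode one Heart Rate Measurement notification."""
    flags = data[0]
    pos = 1
    if flags & 0x01:
        # Bit 0 set: heart rate is a little-endian 16-bit value
        (bpm,) = struct.unpack_from("<H", data, pos)
        pos += 2
    else:
        # Bit 0 clear: heart rate is a single byte
        bpm = data[pos]
        pos += 1
    energy = None
    if flags & 0x08:
        # Bit 3 set: a 16-bit energy expended field follows
        (energy,) = struct.unpack_from("<H", data, pos)
        pos += 2
    rr: List[float] = []
    if flags & 0x10:
        # Bit 4 set: the rest of the packet is a sequence of 16-bit RR values
        while pos + 2 <= len(data):
            (raw,) = struct.unpack_from("<H", data, pos)
            rr.append(raw / 1024)
            pos += 2
    return HeartRateSample(bpm, energy, rr)


# Flags 0x10: 8-bit heart rate of 72, then one RR interval of 1024/1024 seconds
print(parse_heart_rate(bytes([0x10, 72, 0x00, 0x04])))
# HeartRateSample(bpm=72, energy_expended=None, rr_intervals=[1.0])
```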
How about a proper fitness tracker?
I found OpenTracks, also on F-Droid, which seems nice.
Why script it from a desktop computer?
The question is: why not?
A fitness tracker on a phone is useful, but there are lots of silly things one can do from one's computer that one can't do from a phone. A heart rate monitor is, after all, one more input device, and there are never enough input devices!
There are so many extremely important use cases that seem entirely unexplored:
- Log your heart rate with your git commits!
- Add your heart rate as a header in your emails!
- Correlate heart rate information with your work activity tracker to find out what tasks stress you the most!
- Sync ping intervals with your own heartbeat, so you get faster replies when you're more anxious!
- Configure workrave to block your keyboard if you get too excited, to improve the quality of your mailing list contributions!
- You can monitor the monitor script of the heart rate monitor that monitors you! Forget buffalo, be your monitor monitor monitor monitor monitor monitor monitor monitor...
Released staticsite 2.x
In theory I wanted to announce the release of staticsite 2.0, but then I found bugs that prevented me from writing this post, so I'm also releasing 2.1, 2.2, 2.3 😁
staticsite is the static site generator that I ended up writing after giving other generators a try.
I did a big round of cleanup of the code, which among other things allowed me to implement incremental builds.
It turned out that staticsite is fast enough that incremental builds are not really needed. However, a bug in caching rendered markdown made me forget about that. Now I have fixed that bug, too, and I can choose between running staticsite fast and ridiculously fast.
My favourite bit of this work is the internal cleanup: I found a way to simplify the core design massively, and now the core and plugin system is simple enough that I can explain it, and I'll probably write a blog post or two about it in the next days.
On top of that, staticsite is basically clean with mypy running in strict mode! Getting there was a great ride which prompted a lot of thinking about designing code properly, as mypy is pretty good at flagging clumsy hacks.
If you want to give it a try, check out the small tutorial A new blog in under one minute.
Really lossy compression of JPEG
Suppose you have a tool that archives images, or scientific data, and it has a test suite. It would be good to collect sample files for the test suite, but they are often so big one can't really bloat the repository with them.
But does the test suite need everything that is in those files? Not necessarily. For example, if one's testing code that reads EXIF metadata, one doesn't care about what is in the image.
That technique works extremely well. I can take GRIB files that are several megabytes in size, zero out their data payload, and get nice 1Kb samples for the test suite.
I've started to collect and organise the little hacks I use for this into a tool I called mktestsample:
$ mktestsample -v samples1/*
2021-11-23 20:16:32 INFO common samples1/cosmo_2d+0.grib: size went from 335168b to 120b
2021-11-23 20:16:32 INFO common samples1/grib2_ifs.arkimet: size went from 4993448b to 39393b
2021-11-23 20:16:32 INFO common samples1/polenta.jpg: size went from 3191475b to 94517b
2021-11-23 20:16:32 INFO common samples1/test-ifs.grib: size went from 1986469b to 4860b
Those are massive savings, but I'm not satisfied with those almost 94Kb of JPEG:
$ ls -la samples1/polenta.jpg
-rw-r--r-- 1 enrico enrico 94517 Nov 23 20:16 samples1/polenta.jpg
$ gzip samples1/polenta.jpg
$ ls -la samples1/polenta.jpg.gz
-rw-r--r-- 1 enrico enrico 745 Nov 23 20:16 samples1/polenta.jpg.gz
I believe I did all I could: completely blank out image data, set quality to zero, maximize subsampling, and tweak quantization to throw everything away.
Still, the result is a 94Kb file that can be gzipped down to 745 bytes. Is there something I'm missing?
I suppose JPEG is better at storing an image than at storing the lack of an image. I cannot really complain :)
I can still commit compressed samples of large images to a git repository, taking very little data indeed. That's really nice!
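On the test suite side, expanding a compressed sample takes one call. This is only a sketch: the header bytes and sizes below are invented stand-ins for a blanked-out sample, not the real polenta.jpg.

```python
import gzip

# Stand-in for a blanked-out sample: a small header, then a payload of zeros
sample = b"HEADER-WITH-METADATA" + bytes(90_000)

# This is what would get committed to the repository
compressed = gzip.compress(sample)
print(len(sample), "->", len(compressed))

# A test can expand the sample in memory right before using it
restored = gzip.decompress(compressed)
print(restored == sample)  # True
```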
Mock syscalls with C++
I wrote and maintain some C++ code to stream high quantities of data as fast as possible, and I try to use splice and sendfile when available.
The availability of those system calls varies at runtime according to a number of factors, and the code needs to be written to fall back to read/write loops depending on what the splice and sendfile syscalls say.
The tricky issue is unit testing: since the code path chosen depends on the kernel, the test suite will test one path or the other depending on the machine and filesystems where the tests are run.
It would be nice to be able to mock the syscalls, and replace them during tests, and it looks like I managed.
First I made catalogues of the syscalls I want to be able to mock. One with function pointers, for performance, and one with std::function, for flexibility:
/**
 * Linux versions of syscalls to use for concrete implementations.
 */
struct ConcreteLinuxBackend
{
    static ssize_t (*read)(int fd, void *buf, size_t count);
    static ssize_t (*write)(int fd, const void *buf, size_t count);
    static ssize_t (*writev)(int fd, const struct iovec *iov, int iovcnt);
    static ssize_t (*sendfile)(int out_fd, int in_fd, off_t *offset, size_t count);
    static ssize_t (*splice)(int fd_in, loff_t *off_in, int fd_out,
                             loff_t *off_out, size_t len, unsigned int flags);
    static int (*poll)(struct pollfd *fds, nfds_t nfds, int timeout);
    static ssize_t (*pread)(int fd, void *buf, size_t count, off_t offset);
};

/**
 * Mockable versions of syscalls to use for testing concrete implementations.
 */
struct ConcreteTestingBackend
{
    static std::function<ssize_t(int fd, void *buf, size_t count)> read;
    static std::function<ssize_t(int fd, const void *buf, size_t count)> write;
    static std::function<ssize_t(int fd, const struct iovec *iov, int iovcnt)> writev;
    static std::function<ssize_t(int out_fd, int in_fd, off_t *offset, size_t count)> sendfile;
    static std::function<ssize_t(int fd_in, loff_t *off_in, int fd_out,
                                 loff_t *off_out, size_t len, unsigned int flags)> splice;
    static std::function<int(struct pollfd *fds, nfds_t nfds, int timeout)> poll;
    static std::function<ssize_t(int fd, void *buf, size_t count, off_t offset)> pread;

    static void reset();
};
Then I converted the code to templates, parameterized on the catalogue class.
Explicit template instantiation helps in making sure that one doesn't need to include template code in all sorts of places.
Finally, I can have a RAII class for mocking:
/**
 * RAII mocking of syscalls for concrete stream implementations
 */
struct MockConcreteSyscalls
{
    std::function<ssize_t(int fd, void *buf, size_t count)> orig_read;
    std::function<ssize_t(int fd, const void *buf, size_t count)> orig_write;
    std::function<ssize_t(int fd, const struct iovec *iov, int iovcnt)> orig_writev;
    std::function<ssize_t(int out_fd, int in_fd, off_t *offset, size_t count)> orig_sendfile;
    std::function<ssize_t(int fd_in, loff_t *off_in, int fd_out,
                          loff_t *off_out, size_t len, unsigned int flags)> orig_splice;
    std::function<int(struct pollfd *fds, nfds_t nfds, int timeout)> orig_poll;
    std::function<ssize_t(int fd, void *buf, size_t count, off_t offset)> orig_pread;

    MockConcreteSyscalls();
    ~MockConcreteSyscalls();
};

MockConcreteSyscalls::MockConcreteSyscalls()
    : orig_read(ConcreteTestingBackend::read),
      orig_write(ConcreteTestingBackend::write),
      orig_writev(ConcreteTestingBackend::writev),
      orig_sendfile(ConcreteTestingBackend::sendfile),
      orig_splice(ConcreteTestingBackend::splice),
      orig_poll(ConcreteTestingBackend::poll),
      orig_pread(ConcreteTestingBackend::pread)
{
}

MockConcreteSyscalls::~MockConcreteSyscalls()
{
    ConcreteTestingBackend::read = orig_read;
    ConcreteTestingBackend::write = orig_write;
    ConcreteTestingBackend::writev = orig_writev;
    ConcreteTestingBackend::sendfile = orig_sendfile;
    ConcreteTestingBackend::splice = orig_splice;
    ConcreteTestingBackend::poll = orig_poll;
    ConcreteTestingBackend::pread = orig_pread;
}
And here's the specialization to pretend sendfile and splice aren't available:
/**
 * Mock sendfile and splice as if they weren't available on this system
 */
struct DisableSendfileSplice : public MockConcreteSyscalls
{
    DisableSendfileSplice();
};

DisableSendfileSplice::DisableSendfileSplice()
{
    ConcreteTestingBackend::sendfile = [](int out_fd, int in_fd, off_t *offset, size_t count) -> ssize_t {
        errno = EINVAL;
        return -1;
    };
    ConcreteTestingBackend::splice = [](int fd_in, loff_t *off_in, int fd_out,
                                        loff_t *off_out, size_t len, unsigned int flags) -> ssize_t {
        errno = EINVAL;
        return -1;
    };
}
It's now also possible to reproduce in the test suite all sorts of system-related issues we might observe in production over time.
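For comparison, the same save, replace and restore idea can be sketched in Python, where unittest.mock plays the role of the RAII class. This is an analogue and not a translation of the C++ above: copy_bytes and run_copy are hypothetical helpers made up for the example, and the sketch assumes a Unix system where os.pread is available.

```python
import errno
import os
import tempfile
from unittest import mock


def copy_bytes(out_fd: int, in_fd: int, count: int) -> str:
    """Copy count bytes from the start of in_fd, returning the strategy used."""
    offset = 0
    use_sendfile = True
    while offset < count:
        if use_sendfile:
            try:
                copied = os.sendfile(out_fd, in_fd, offset, count - offset)
            except OSError as e:
                if e.errno not in (errno.EINVAL, errno.ENOSYS):
                    raise
                # sendfile cannot be used here: fall back to a read/write loop
                use_sendfile = False
                continue
        else:
            copied = os.write(out_fd, os.pread(in_fd, count - offset, offset))
        if copied == 0:
            break
        offset += copied
    return "sendfile" if use_sendfile else "read/write"


def run_copy() -> tuple:
    """Copy a small temporary file, returning (strategy, data copied)."""
    with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
        src.write(b"some test data")
        src.flush()
        strategy = copy_bytes(dst.fileno(), src.fileno(), 14)
        dst.seek(0)
        return strategy, dst.read()


# Pretend sendfile is not usable, like DisableSendfileSplice does
unavailable = OSError(errno.EINVAL, os.strerror(errno.EINVAL))
with mock.patch.object(os, "sendfile", side_effect=unavailable, create=True):
    print(run_copy())  # ('read/write', b'some test data')
```

The original os.sendfile is put back when the with block ends, just like the destructor of the RAII class restores the saved function objects.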
Software development links
For the next time we iterate on Himblick design and development: Raspberry Pi 4 can now run plain standard Debian, which should make a lot of things easier and cleaner when developing products based on it.
Somewhat related to nspawn-runner, random links somehow related to my feeling that nspawn comes from an ecosystem which gives me a bigger sense of focus on security and solidity than Docker:
- Half of 4 Million Public Docker Hub Images Found to Have Critical Vulnerabilities
- systemd service sandboxing and security hardening 101
I did a lot of work on A38, a Python library to deal with FatturaPA electronic invoicing, and it was a wonderful surprise to see a positive review spontaneously appear! ♥: Fattura elettronica, come visualizzarla con python | TuttoLogico
A beautiful, hands-on explanation of git internals, as a step by step guide to reimplementing your own git: Git Internals - Learn by Building Your Own Git
I recently tried meson and liked it a lot. I then gave unity builds a try, since it supports them out of the box, and found myself with doubts. I found I wasn't alone, and I liked The Evils of Unity Builds as a summary of the situation.
A point of view I liked on technological debt: Technical debt as a lack of understanding
Finally, a classic, and a masterful explanation for a question that keeps popping up: RegEx match open tags except XHTML self-contained tags
Boring restaurants
While traveling around Germany, one notices that most towns have a Greek or Italian restaurant, and they all kind of have the same names. How bad is that lack of fantasy?
Let's play with https://overpass-turbo.eu/. Select a bounding box and run this query:
node
[cuisine=greek]
({{bbox}});
out;
Export the results as gpx and have some fun on the command line:
sed -nre 's/^name=([^<]+).*/\1/p' /tmp/greek.gpx \
| sed -re 's/ *(Grill|Restaurant|Tavern[ae]) *//g' \
| sort | uniq -c | sort -nr > /tmp/greek.txt
Likewise, with Italian restaurants, you can use cuisine=italian
and something like:
sed -nre 's/^name=([^<]+).*/\1/p' /tmp/italian.gpx \
| sed -re 's/ *(Restaurant|Ristorante|Pizzeria) *//g' \
| sort | uniq -c | sort -nr > /tmp/italian.txt
Here are the top 20 that came out for Greek:
162 Akropolis
91 Delphi
86 Poseidon
78 Olympia
78 Mykonos
78 Athen
76 Hellas
74 El Greco
71 Rhodos
57 Dionysos
53 Kreta
50 Syrtaki
49 Korfu
43 Santorini
43 Athos
40 Mythos
39 Zorbas
35 Artemis
33 Meteora
29 Der Grieche
Here are the top 20 that came out for Italian, with a sadly ubiquitous franchise as an outlier:
66 Vapiano
64 Bella Italia
59 L'Osteria
54 Roma
43 La Piazza
38 La Dolce Vita
38 Dolce Vita
35 Italia
32 Pinocchio
31 Toscana
30 Venezia
28 Milano
28 Mamma Mia
27 Bella Napoli
25 San Marco
24 Portofino
22 La Piazzetta
22 La Gondola
21 Da Vinci
21 Da Pino
One can play a game while traveling: being the first to spot a Greek or Italian restaurant earns more points the more unusual its name is. But beware of being too quick! If you try to claim points for one of the restaurants with the top-5 most common names, you will actually lose points!
Have fun playing with other combinations of areas and cuisine: the Overpass API is pretty cool!
Update:
Rather than running XML through sed, one can export GeoJSON, then parse it with the excellent jq:
jq -r '.features[].properties.name' italian.json \
| sed -re 's/ *(Restaurant|Ristorante|Pizzeria) *//g' \
| sort | uniq -c | sort -nr > /tmp/italian.txt
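If you would rather stay in Python, the same counting can be sketched with the standard library. The sample data below is invented to keep the example self-contained, and count_names is a name made up for it. Unlike the jq pipeline, it skips features that have no name.

```python
import json
import re
from collections import Counter

# Generic words to strip from names, as in the sed expression
GENERIC = re.compile(r" *(Restaurant|Ristorante|Pizzeria) *")


def count_names(geojson_text: str) -> Counter:
    """Count restaurant names in a GeoJSON export, ignoring generic words."""
    data = json.loads(geojson_text)
    names = (feature["properties"].get("name") for feature in data["features"])
    return Counter(GENERIC.sub("", name) for name in names if name)


# Invented sample, with the same structure as the export
sample = json.dumps({"features": [
    {"properties": {"name": "Ristorante Roma"}},
    {"properties": {"name": "Pizzeria Roma"}},
    {"properties": {"name": "Bella Italia"}},
    {"properties": {"cuisine": "italian"}},
]})

for name, count in count_names(sample).most_common():
    print(count, name)
# 2 Roma
# 1 Bella Italia
```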