Solving XKCD 1683

xkcd 1683 - “Digital Data”

Let’s say you are building an archive of the Internet, including all of the cat pictures and funny memes. You only have a limited amount of space, so you really want to use that space in the best way you can.

As illustrated, one way that digital data can “degrade” is through reposts and re-uploads: sharing images and videos often involves a lossy transformation, such as re-uploading to media hosting platforms that re-compress all uploaded files, or saving the image by taking a screenshot or recording a video on a device. (This phenomenon is known as generation loss.) So, if you have many copies of the same image in varying quality, you want to save the one closest to the original, i.e. the one with the least amount of noise. Looking at millions of cat pictures would be a nice way to spend a lifetime, but would take too long to actually archive them. Can we build a program to do that for us instead? Continue reading →

btdu - sampling disk usage profiler for btrfs


Here is a tool I’ve been dreaming about ever since I got really into btrfs snapshots for home and server backups: a sampling disk usage profiler! Unlike classical disk usage profilers, btdu doesn’t attempt to scan the directory tree starting from the top, but just picks random points on the disk and sees what’s on them.

One nice thing about btdu is that it starts showing results instantly. You only need 100 random samples to have a resolution of 1%, which is generally enough to know what ate your disk space. Even on very slow drives, resolving 100 samples takes very little time. It also works correctly with deduplication (snapshots / cloning) and compression.
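To illustrate why a small number of samples goes a long way, here is a minimal Python sketch of the sampling idea. It is not btdu's actual code: the "disk" is a made-up list of (path, size) extents laid out back to back, and the names `sample_usage` and `extents` are hypothetical. Each sample picks a uniformly random offset and credits whichever path owns that offset, so each of N samples is worth 1/N of the total.

```python
import bisect
import random
from collections import Counter


def sample_usage(extents, num_samples, seed=0):
    """Estimate each path's share of a simulated disk by random sampling.

    extents: list of (path, size) pairs, assumed to be laid out back to back.
    Returns {path: estimated fraction of total size}.
    """
    rng = random.Random(seed)

    # Cumulative (exclusive) end offset of every extent.
    ends = []
    total = 0
    for _, size in extents:
        total += size
        ends.append(total)

    hits = Counter()
    for _ in range(num_samples):
        offset = rng.randrange(total)              # random point on the "disk"
        index = bisect.bisect_right(ends, offset)  # extent containing that point
        hits[extents[index][0]] += 1

    # Each sample represents 1/num_samples of the total space.
    return {path: count / num_samples for path, count in hits.items()}


if __name__ == "__main__":
    disk = [("/snapshots", 700), ("/home", 200), ("/var", 100)]
    for path, share in sorted(sample_usage(disk, 100).items()):
        print(f"{path}: ~{share:.0%}")
```

With 100 samples every estimate is a multiple of 1%, which is the resolution mentioned above; taking more samples only refines the picture.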

Datamining Bandersnatch

You may have heard about Bandersnatch, an interactive film released on Netflix as part of the Black Mirror series. I heard about it when it was released, but didn’t get around to watching it until recently, and I was surprised at how deep and thorough the implementation is. The film consists of:

  • 250 segments
  • 62 variables (mostly boolean, some having 3 or 4 possible values)
  • 174 choices
  • 111 segment groups (points at which variables are used to decide the outcome)
  • and 241 preconditions (boolean expressions used in segment groups and elsewhere).
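To make the relationship between variables, preconditions and segment groups a bit more concrete, here is a rough Python sketch. Everything in it is an assumption made for illustration: the nested-list expression encoding, the operator names, the variable and segment names, and the "first matching entry wins" rule are all hypothetical and are not taken from the film's real data files.

```python
def evaluate(expr, state):
    """Evaluate a precondition (a nested boolean expression) against the state.

    Hypothetical encoding: ["var", name], ["eq", expr, literal],
    ["not", expr], ["and", expr, ...], ["or", expr, ...].
    """
    op, *args = expr
    if op == "var":
        return state[args[0]]
    if op == "eq":
        return evaluate(args[0], state) == args[1]
    if op == "not":
        return not evaluate(args[0], state)
    if op == "and":
        return all(evaluate(arg, state) for arg in args)
    if op == "or":
        return any(evaluate(arg, state) for arg in args)
    raise ValueError(f"unknown operator: {op!r}")


def pick_segment(group, state):
    """Return the first segment in the group whose precondition holds.

    group: list of (precondition or None, segment id); None means "always".
    """
    for precondition, segment in group:
        if precondition is None or evaluate(precondition, state):
            return segment
    return None


if __name__ == "__main__":
    # Made-up variables: one boolean, one with several possible values.
    state = {"flag_a": True, "choice_b": "x"}
    group = [
        (["and", ["var", "flag_a"], ["eq", ["var", "choice_b"], "y"]], "segment_1"),
        (["not", ["var", "flag_a"]], "segment_2"),
        (None, "segment_default"),
    ]
    print(pick_segment(group, state))
```

In this sketch a segment group is just an ordered list of guarded alternatives, which is one plausible way for a player to turn the viewer's accumulated state into the next segment to play.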

That’s more data than you can shake a stick at, so I spent a few days down the rabbit hole and wrote some code to pick it apart. The fruits of this endeavor can be summarized as:

  1. An understanding of the data format used by the film.
  2. A working stand-alone player for the game’s data files.
  3. An actually complete yet mostly-readable flowchart containing all of the movie’s scenes and logic.
  4. An assortment of interesting observations, such as bugs in the script.

Let’s go over these in order: Continue reading →

D compilation is too slow and I am forking the compiler

While working on my current project, the constant creep of increasing compilation times was becoming more and more noticeable. Even after throwing my usual tools at the problem, the total time was still over 7 seconds.

Seven. Seconds!

Seven. Seconds. Unacceptable‼

Compile time profiling showed that the blame lay with my liberal use of metaprogramming and std.regex, which I wasn’t willing to give up on. The usual approach to reducing D build times is to split the program into packages, compile one package at a time (to a static library or object file), use D “header” files (.di) to avoid parsing implementations more often than necessary, then link everything together. However, this was too much work, didn’t fit neatly into my existing toolchain, and I wanted to try something else. Continue reading →

New Blog

In case you’ve been wondering why I haven’t posted anything here for the past 3½ years - well, it’s certainly not due to there being nothing to post. The true reason is that, simply put, my dislike for WordPress has surpassed the desire to document what I’ve been doing.

And so, as my backlog of things to document grew, the priority of having a way to do it that doesn’t force me to suffer a mediocre WYSIWYG editor and the limits of browser textarea / HTML editing also grew to the point where it could not be delayed further.

After some messing around, I’m glad to announce that I’ve salvaged my back catalog of posts from WordPress’s clutches, and converted everything to a nice Hugo website, where content is Markdown, versioning is Git, and the editing UI is the text editor of your (my) choice. (If nothing looks different, that would be because I may have been a bit overly meticulous in preserving the current style, with all of its flaws.) Continue reading →


I was playing a bit with targeting JavaScript / WebAssembly from D. Sebastien Alaiwan already did a great job of putting together a working toolchain; however, it was aimed at a very narrow purpose (simple SDL games).

To improve on the situation, I wrote some wrappers around the toolchain which allow using existing D build tools and workflows for targeting emscripten / WASM. These come in the form of programs with the same general command-line syntax as dmd and rdmd, so build tools such as Dub should be usable. I’ve also included a copy of Phobos / Druntime hacked enough to get things like writeln working. Garbage collection is currently stubbed - allocations work but the memory is never freed (until the page is reloaded, of course). Continue reading →

Profiling DMD's Compilation Time with dmdprof

Compilation time is frequently linked to productivity, and is of particular interest to D programmers. Walter Bright designed the language to compile quickly (especially compared to C++), and DMD, the reference D compiler implementation, is itself quite optimized.

Direct quote from a colleague (after a long day of bisecting a regression in Clang, each step taking close to an hour):

Whaaaaaaaaaaat? How can a compiler compile in 3 seconds?!?

Indeed, a clean build of DMD itself (about 170’000 lines of D and 120’000 lines of C/C++) takes no longer than 4 seconds on a rather average developer machine. Code which takes advantage of more advanced language features, like string mixins and CTFE, compiles more slowly; on this subject, Dmitry Olshansky is working on CTFECache for caching CTFE execution across compiler invocations, and there’s of course Stefan Koch’s new CTFE implementation. Continue reading →


term-keys is a package which allows configuring Emacs and a supported terminal emulator to handle keyboard input involving any combination of keys and modifiers.

I created it out of frustration with trying to get work done remotely, while I was abroad. Even with the help of aconfmgr, I still had projects that would be too cumbersome to have their development environment replicated locally. I was also experimenting with working on my phone + Bluetooth keyboard at the time.

Emacs allows remote interaction through two methods: X11 forwarding and TTY. Unfortunately, even though X11 forwarding works great on LAN and Wi-Fi, it doesn’t fare well when the latency is high, as the protocol requires many round-trips for most meaningful operations. That leaves TTY. Continue reading →


rtf2any is a D library for parsing, converting and emitting RTF (Rich Text Format) documents.

We use the library mostly to work with Worms Armageddon’s update documentation and changelog (ReadMe). One user-visible application can be seen at the previous link, which contains the same RTF document converted to MediaWiki syntax and split up into multiple pages.

The library received a major update in 2018, as part of moving to XML as the primary format for the changelog.

Note: this article is back-dated, and was originally written on 2018-11-05.

Twilight Princess quick item selection

If you plan on playing Legend of Zelda: Twilight Princess on a PC (with the Dolphin emulator), you may find this mod useful. Unlike e.g. Ocarina of Time / Majora’s Mask, this game allocates only two slots for items, which means lots of time wasted going through menus to switch them.

With this mod, you can select any item directly by pressing a single key. Continue reading →