company-llama: llama.cpp backend for company-mode

I need to get back into the habit of announcing my projects, otherwise they just go into the void.

Anyway, company-llama is some glue between company-mode and llama.cpp.

I created it mainly because I wanted to replace the proprietary TabNine, which modern models running on a GPU outperform anyway. Continue reading →

Modding for Godot

I spent some time working on modding support for ΔV: Rings of Saturn. This hard-science-fiction physics-based asteroid mining sim is made in Godot, an open-source game engine not too unlike the proprietary but more popular Unity.

Although the game had some rudimentary modding support, as well as two mods published, the previous approach was fragile and did not allow mod interoperability. There is currently little public information about modding Godot games, which made this an interesting exploratory project. I present my findings in this article. Continue reading →

Solving XKCD 1683

*xkcd 1683 - “Digital Data”*

Let’s say you are building an archive of the Internet, including all of the cat pictures and funny memes. You only have a limited amount of space, so you really want to use that space in the best way you can.

As illustrated, one way that digital data can “degrade” is through reposts and re-uploads: sharing images and videos often involves a lossy transformation, such as re-uploading to media hosting platforms that re-compress all uploaded files, or saving the image by taking a screenshot or recording a video on one’s own device. (This phenomenon is known as generation loss.) So, if you have many copies of the same image of varying quality, you want to save the one closest to the original, i.e. the one with the least amount of noise. Looking at millions of cat pictures would be a nice way to spend a lifetime, but it would take too long to actually archive them. Can we build a program to do that for us instead? Continue reading →

btdu - sampling disk usage profiler for btrfs

Here is a tool I’ve been dreaming about ever since I got really into btrfs snapshots for home and server backups: a sampling disk usage profiler! Unlike classical disk usage profilers, btdu doesn’t attempt to scan the directory tree starting from the top, but just picks random points on the disk and sees what’s on them.

One nice thing about btdu is that it starts showing results instantly. You only need 100 random samples to have a resolution of 1%, which is generally enough to know what ate your disk space. Even on very slow drives, resolving 100 samples takes very little time. It also works correctly with deduplication (snapshots / cloning) and compression.
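To illustrate why so few samples are enough, here is a small Python simulation. It is only a sketch and not how btdu itself is implemented: the disk layout, the paths, and the `estimate_usage` helper are all made up for illustration. It picks random bytes on an imaginary disk and counts who owns each one, so with 100 samples every hit moves an estimate by exactly 1%.

```python
import random
from collections import Counter


def estimate_usage(extents, num_samples, rng):
    """Estimate each owner's share of the disk from random samples.

    extents: list of (owner, size_in_bytes) pairs (hypothetical layout).
    Returns a dict mapping owner -> estimated fraction of used space.
    """
    owners = [owner for owner, _ in extents]
    sizes = [size for _, size in extents]
    # Choosing an owner with probability proportional to its size is
    # equivalent to picking a uniformly random byte and asking who owns it.
    hits = Counter(rng.choices(owners, weights=sizes, k=num_samples))
    return {owner: hits[owner] / num_samples for owner in owners}


# Made-up example: three directories with sizes in arbitrary units.
disk = [("/home", 600), ("/var/log", 250), ("/snapshots", 150)]
estimate = estimate_usage(disk, num_samples=100, rng=random.Random(0))

for owner, fraction in estimate.items():
    print(f"{owner}: ~{fraction:.0%}")
```

Taking more samples narrows the estimates further, which is why a sampling profiler can show a rough picture right away and keep refining it for as long as it runs.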

More info and discussions:

GitHub (homepage and downloads)
Reddit, Hacker News, linux-btrfs, D Forum

Datamining Bandersnatch

You may have heard about Bandersnatch, an interactive film released on Netflix as part of the Black Mirror series. I heard about it when it was released, but didn’t get around to watching it until recently, and I was surprised at how deep and thorough the implementation is. The film consists of:

250 segments
62 variables (mostly boolean, some having 3 or 4 possible values)
174 choices
111 segment groups (points at which variables are used to decide the outcome)
and 241 preconditions (boolean expressions used in segment groups and elsewhere).

That’s more data than you can shake a stick at, so I spent a few days down the rabbit hole and wrote some code to pick it apart. The fruits of this endeavor can be summarized as:

An understanding of the data format used by the film.
A working stand-alone player for the game’s data files.
An actually complete yet mostly-readable flowchart containing all of the movie’s scenes and logic.
An assortment of interesting observations, such as bugs in the script.

Let’s go over these in order: Continue reading →

D compilation is too slow and I am forking the compiler

While working on my current project, the constant creep of increasing compilation times was becoming more and more noticeable. Even after throwing my usual tools at the problem, the total time was still over 7 seconds.

*Seven*. **Seconds**. *Unacceptable‼*

Compile time profiling showed that the blame lay with my liberal use of metaprogramming and std.regex, which I wasn’t willing to give up on. The usual approach to reducing D build times is to split the program into packages, compile one package at a time (to a static library or object file), use D “header” files (.di) to avoid parsing implementations more often than necessary, then link everything together. However, this was too much work, didn’t fit neatly into my existing toolchain, and I wanted to try something else. Continue reading →

New Blog

In case you’ve been wondering why I haven’t posted anything here for the past 3½ years - well, it’s certainly not due to there being nothing to post. The true reason is that, simply put, my dislike for WordPress has surpassed the desire to document what I’ve been doing.

And so, as my backlog of things to document grew, the priority of having a way to do it that doesn’t force me to suffer a mediocre WYSIWYG editor and the limits of browser textarea / HTML editing also grew to the point where it could not be delayed further.

After some messing around, I’m glad to announce that I’ve salvaged my back catalog of posts from WordPress’s clutches, and converted everything to a nice Hugo website, where content is Markdown, versioning is Git, and the editing UI is the text editor of your (my) choice. (If nothing looks different, that would be because I may have been a bit overly meticulous in preserving the current style, with all of its flaws.) Continue reading →

dscripten-tools

I was playing a bit with targeting JavaScript / WebAssembly from D. Sebastien Alaiwan already did a great job of putting together a working toolchain; however, it was aimed at a very narrow purpose (simple SDL games).

To improve on the situation, I wrote some wrappers around the toolchain which allow using existing D build tools and workflows for targeting emscripten / WASM. These come in the form of programs with the same general command-line syntax as dmd and rdmd, so build tools such as Dub should be usable. I’ve also included a copy of Phobos / Druntime hacked enough to get things like writeln working. Garbage collection is currently stubbed - allocations work but the memory is never freed (until the page is reloaded, of course). Continue reading →

Profiling DMD's Compilation Time with dmdprof

Compilation time is frequently linked to productivity, and is of particular interest to D programmers. Walter Bright designed the language to compile quickly (especially compared to C++), and DMD, the reference D compiler implementation, is itself quite optimized.

Direct quote from a colleague (after a long day of bisecting a regression in Clang, each step taking close to an hour):

Whaaaaaaaaaaat? How can a compiler compile in 3 seconds?!?

Indeed, a clean build of DMD itself (about 170’000 lines of D and 120’000 lines of C/C++) takes no longer than 4 seconds on a rather average developer machine. Code which takes advantage of more advanced language features, like string mixins and CTFE, compiles slower; on this subject, Dmitry Olshansky is working on CTFECache for caching CTFE execution across compiler invocations, and there’s of course Stefan Koch’s new CTFE implementation. Continue reading →

term-keys

term-keys is a package which allows configuring Emacs and a supported terminal emulator to handle keyboard input involving any combination of keys and modifiers.

I created it out of frustration with trying to get work done remotely, while I was abroad. Even with the help of aconfmgr, I still had projects that would be too cumbersome to have their development environment replicated locally. I was also experimenting with working on my phone + Bluetooth keyboard at the time.

Emacs allows remote interaction through two methods: X11 forwarding and TTY. Unfortunately, even though X11 forwarding works great on LAN and Wi-Fi, it doesn’t fare well when the latency is high, as the protocol requires many round-trips for most meaningful operations. That leaves TTY. Continue reading →