Splicing git repositories

Some projects eventually get split up into multiple source repositories, for whatever reason. Sometimes, however, it is useful to present the project as a single repository: it’s more difficult to examine a project’s development history when it is scattered among several repositories. Specifically, git bisect will not be very helpful if there is strong inter-dependency between the repositories, as is the case with the reference D programming language implementation. Or you may just not like the layout, which may push you into a complicated build process you would rather not use; that was the case for me with the new DerelictOrg repositories.

I created d-dot-git a while ago to solve the first problem. This program creates a new repository which contains the D component repositories as submodules, referencing mainline commits in each repository through a history which chronologically follows all D repositories. The result is a repository with a linear history, which you can then easily use with git bisect. You can see the resulting repository here. Digger takes away the chore of setting up git bisect run and building D, and allows you to specify a D source code test case directly. I’ve also covered this during part 2 of my DConf 2014 presentation (“Reducing D Bugs”).
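The chronological interleaving described above can be sketched roughly as follows. This is an illustrative Python sketch, not d-dot-git's actual code: the `linearize` function, the `(timestamp, hash)` commit representation, and the repository names in the usage note are all assumptions made for the example. It assumes each repository's mainline history is already listed oldest-first with non-decreasing timestamps.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical representation: (unix timestamp, commit hash), oldest first.
Commit = Tuple[int, str]


def linearize(histories: Dict[str, List[Commit]]) -> List[Dict[str, str]]:
    """Merge per-repository mainline histories into one linear list of states.

    Each returned state maps a repository name to the commit it points at
    after one more commit (from any repository) has been applied.
    """
    # Tag every commit with its repository so the merged stream stays traceable.
    streams = [
        [(timestamp, name, sha) for timestamp, sha in commits]
        for name, commits in sorted(histories.items())
    ]
    state: Dict[str, str] = {}
    result: List[Dict[str, str]] = []
    # heapq.merge lazily interleaves the already-sorted streams by timestamp.
    for _timestamp, name, sha in heapq.merge(*streams):
        state[name] = sha
        result.append(dict(state))  # snapshot: one "super-repository" commit
    return result
```

For example, `linearize({"dmd": [(1, "a1"), (4, "a2")], "phobos": [(2, "b1"), (3, "b2")]})` yields four states, the last of which is `{"dmd": "a2", "phobos": "b2"}`. Each state corresponds to one commit in the combined repository, which is what makes a plain git bisect over it meaningful.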

I wanted to do something different for Derelict: submodules wouldn’t cut it. The current structure of the Derelict project (it has a history of moves and refactorings) consists of a DerelictOrg GitHub organization, with a repository for each Derelict component. The repositories have a partially-overlapping directory structure (each repository has a source/derelict directory, which then contains the sub-package with the respective component source code). I wanted a single repository whose root is the contents of the derelict package, so that the component sub-packages sit directly at the root level. This would allow using the repository as a git submodule: one would only need to clone a project with --recursive, with no further setup or dependencies needed (aside from rdmd, which is included with D).

There are existing approaches and tools that could’ve worked to achieve this:

  • git filter-branch followed by a subtree merge would be OK for a one-off conversion. However, I wanted a live mirror, which would keep itself in sync with the latest DerelictOrg changes.
  • David Fraser's splice-repos and Philippe Bruhat's git-stitch-repo solve the same problem.
    However, they use git-fast-import/export to move all the data around, which makes them less efficient than they could be, as they need to shuffle all the data from the target directories directly.

The program I wrote, git-splice-subtree, works by reading and writing raw Git objects from the Git database. Since the files and the subtrees are not affected, the program only concerns itself with creating git commit and tree objects that point to the existing ones in the right pattern. It does this by fetching all repositories' objects into a single one, then spawning batch git database object readers (git cat-file) and writers (git hash-object) and piping data to them directly. By not touching any of the repositories' actual files, and spawning as few processes overall as possible, it is quite speedy (almost instant on the DerelictOrg repositories, which is great for a cronjob). I've also managed to avoid having to choose between redundant disk writes for temporary files and spawning a process for every commit, by tricking git hash-object into reading from the same named pipe over and over.
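To illustrate what "creating tree objects that point to the existing ones" involves, here is a minimal Python sketch of Git's raw object format. It is not taken from git-splice-subtree: the helper names `git_object_id` and `tree_body` are made up for this example, and the sketch only computes object IDs in memory instead of piping data to git hash-object as the real program does.

```python
import hashlib
from typing import Iterable, Tuple

# Hypothetical entry type: (mode, name, hex object id), e.g. ("40000", "sdl2", "4b82...").
TreeEntry = Tuple[str, str, str]


def git_object_id(kind: str, body: bytes) -> str:
    """Return the SHA-1 ID Git assigns to an object of the given kind."""
    # Every object is hashed as "<kind> <size>\0<body>".
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: Iterable[TreeEntry]) -> bytes:
    """Serialize tree entries that reference already-existing objects."""

    def sort_key(entry: TreeEntry) -> bytes:
        mode, name, _ = entry
        # Git orders entries by name, comparing directories as if they ended in "/".
        return name.encode() + (b"/" if mode == "40000" else b"")

    body = b""
    for mode, name, object_id in sorted(entries, key=sort_key):
        # Each entry is "<mode> <name>\0" followed by the 20-byte binary ID.
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(object_id)
    return body
```

A new root tree that places two existing component subtrees side by side would then be `git_object_id("tree", tree_body([("40000", "sdl2", id_a), ("40000", "opengl3", id_b)]))`, where `id_a` and `id_b` stand for subtree IDs already present in the object database. No file contents are read or rewritten, which is why this kind of approach stays cheap.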

DerelictMerge generates the spec file for git-splice-subtree. The resulting repository is on BitBucket (to avoid unintentionally pinging any of the authors on GitHub).

Functional image processing in D

I’ve recently completed an overhaul of the graphics package of my D library. The goals for the overhaul were inspired by D’s std.algorithm and std.range modules:

  • Present everything as small, composable components
  • Avoid implicit copying and prefer lazy evaluation
  • Use templates for efficient code

From its first iteration, all components of the image processing package were templated by the color type. This is not a conventional way to implement graphics libraries – most libraries abstract away the exact color type the image uses behind an OOP interface, or they simply convert all images to a single in-memory pixel format. However, for most cases, this is wasteful and inefficient – usually, the programmer already knows the exact format that an image will be in, with notable exceptions being applications in which the image data comes from user input (e.g. image editors). Instead, this library declares all image types as templates, with the type indicating the image’s color being a template parameter.

I’m rather pleased with the result of this overhaul, so I’d like to share some highlights in this article.

Low-overhead components

My personal dream of an ideal programming language is one that allows defining flexible, configurable components that can be coupled together with very little overhead, producing in the end code that, if reverse-engineered, would appear to be hand-written and optimized for the specific task at hand. Preconfigured component configurations / presets would be available for common use, favoring safety and feature-completeness, but for performance-critical cases, the programmer could break them down and strip out unneeded features to reduce overhead, or customize by injecting their own components into the mix.

Installing PHP and Apache module under /home

Let’s say you have your own Apache 2 setup in your home directory, and you want to build and install PHP as well, and set it up as an Apache module without root privileges (e.g. if you want to use a different PHP version than the one installed globally).

You may run into problems such as PHP’s configure script not detecting apxs2 (and thus not building an Apache module).

Very Sleepy fork

I got fed up with waiting for (and pestering) Richard Mitton (aka Kayamon / @grumpygiant) to integrate the Very Sleepy patches I sent him last year, or to put the code on a software forge, so I’m publishing my patches on GitHub myself.

Very Sleepy is a polling Windows profiler with a wxWidgets-based GUI. This is a fork of the latest released version at the time of writing (0.82).

There have been a few more forks of Very Sleepy (e.g. here and here), but these are based on older versions. It’s possible that their changes have already been merged into the official version.

Update: I’ve continued development of my fork, at the above-mentioned location. Check the GitHub project for the changelog, downloads, and more information.

DHCP test client

While trying to set up my home network, I was dismayed that there was no simple way to test the DHCP server. Snooping packets is limited to examining existing traffic.

DHCP test tools exist (DHCPing and dhquery); however, both are outdated and don’t work with the latest versions of their requirements, and neither works on Windows.

I’ve written a simple DHCP “client” which can receive and decode broadcast DHCP replies, as well as send out DHCP “discover” packets.
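For readers unfamiliar with the wire format, a DHCP “discover” packet can be assembled roughly like this. The Python sketch below is not the tool's source: the `build_discover` function is hypothetical, and setting the broadcast flag is an assumption made for the example. The field layout follows the BOOTP/DHCP message format (a 236-byte fixed header, the magic cookie, then options).

```python
import struct

DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the options field
OPTION_MESSAGE_TYPE = 53
OPTION_END = 255
DHCPDISCOVER = 1


def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER message for the given hardware address."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte Ethernet address")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,            # op: BOOTREQUEST
        1,            # htype: Ethernet
        6,            # hlen: hardware address length
        0,            # hops
        xid,          # transaction ID chosen by the client
        0,            # secs
        0x8000,       # flags: broadcast bit set (assumption for this sketch)
        b"\0" * 4,    # ciaddr
        b"\0" * 4,    # yiaddr
        b"\0" * 4,    # siaddr
        b"\0" * 4,    # giaddr
        mac,          # chaddr: struct pads it to 16 bytes with zeros
        b"",          # sname: 64 zero bytes
        b"",          # file: 128 zero bytes
    )
    options = bytes([OPTION_MESSAGE_TYPE, 1, DHCPDISCOVER, OPTION_END])
    return header + DHCP_MAGIC_COOKIE + options
```

Calling `build_discover(bytes.fromhex("001122334455"), 0x12345678)` returns a 244-byte message. Actually sending it means broadcasting it over UDP to port 67 from port 68, which usually requires elevated privileges, so that part is left out of the sketch.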

Source, download.