Tag Archives: D

Import Wikipedia page history to git

I’ve written a small tool which downloads the history of a Wikipedia article, converts it and imports it into a new git repository. The main motivation behind writing it is being able to perform a per-line blame of the article’s history. I had tried levitation, but that tool seemed to be oriented towards large imports (or it might just be buggy), as it attempted to create huge binary files and ran longer than my patience would allow when I gave it the history of just one article. Also, I wanted the tool to take care of the downloading and importing part – so I could be one command away from a git repository of any WP article.
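To make the "convert and import" step concrete, here is a minimal Python sketch of one way to turn a list of revisions into a stream for `git fast-import`, with one commit per revision. It is not the tool's actual code, and nothing above says the tool works this way. The function name `fast_import_stream`, the revision dict keys, the file name `article.txt` and the placeholder e-mail address are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def data_block(payload):
    """Format a fast-import 'data' command carrying an exact byte count."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def fast_import_stream(revisions, filename="article.txt", ref="refs/heads/master"):
    """Build a git fast-import stream (bytes) with one commit per revision.

    `revisions` is an iterable of dicts, oldest first, with the hypothetical
    keys 'user', 'timestamp' (UTC, like '2013-08-01T12:00:00Z'), 'comment'
    and 'text'.
    """
    out = []
    for rev in revisions:
        when = datetime.strptime(rev["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        epoch = int(when.replace(tzinfo=timezone.utc).timestamp())

        # Names in a committer line must not contain '<', '>' or newlines.
        user = rev["user"] or "unknown"
        for bad in "<>\n":
            user = user.replace(bad, " ")

        message = rev["comment"] or "(no edit summary)"

        out.append(b"commit " + ref.encode("utf-8") + b"\n")
        out.append(
            b"committer " + user.encode("utf-8")
            + b" <nobody@example.invalid> %d +0000\n" % epoch
        )
        out.append(data_block(message.encode("utf-8")))
        # Each commit replaces the whole article file with the revision text.
        out.append(b"M 100644 inline " + filename.encode("utf-8") + b"\n")
        out.append(data_block(rev["text"].encode("utf-8")))
        out.append(b"\n")
    return b"".join(out)
```

Writing the returned bytes to the standard input of `git fast-import` inside a freshly initialized repository should create the commits. After checking out the imported branch, `git blame article.txt` gives the per-line attribution described above.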

The tool could be made faster (all the XML and string handling adds overhead), but right now it’s fast enough for me. One possible optimization is to avoid loading the entire input XML into memory: the conversion can be done by “streaming” the XML. Another limitation is that it’s currently hard-wired to the English Wikipedia.
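As an illustration of the streaming idea, the following Python sketch reads revisions one at a time with `xml.etree.ElementTree.iterparse` instead of building the whole tree first. It is a sketch under assumptions, not the tool's implementation: the element names (`revision`, `timestamp`, `contributor`/`username` or `ip`, `comment`, `text`) follow the usual MediaWiki export layout, and the names `iter_revisions` and `local_name` are made up for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def iter_revisions(source):
    """Yield one dict per <revision> element, discarding each after use.

    `source` is a filename or a binary file object holding XML shaped like
    a MediaWiki export (assumed layout, see the paragraph above).
    """
    wanted = ("timestamp", "comment", "text", "username", "ip")
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        fields = {}
        for child in elem.iter():
            name = local_name(child.tag)
            if name in wanted:
                fields[name] = child.text or ""
        yield {
            "user": fields.get("username") or fields.get("ip", ""),
            "timestamp": fields.get("timestamp", ""),
            "comment": fields.get("comment", ""),
            "text": fields.get("text", ""),
        }
        # Drop the revision's text and children so memory use stays roughly
        # flat; an empty <revision> shell remains attached to its parent.
        elem.clear()
```

Because the function is a generator, the caller only ever holds one revision's text at a time, which is the main saving for articles with long histories.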

Requires curl and (obviously) git. You’ll need a D2 compiler to compile the code.

August 2013 update: Updated to D2. Now creates the directory automatically. Added --keep-history switch.

Source, Windows binary.

Announcing: RABCDAsm

RABCDAsm (Robust ABC (ActionScript Bytecode) [Dis-]Assembler) is a collection of utilities including an ActionScript 3 assembler/disassembler, and a few tools to manipulate SWF files.

This package was created due to a lack of similar software out there.
Particularly, I needed a utility which would allow me to edit ActionScript 3 bytecode (used in Flash 9 and newer) with the following properties:

  • Speed. Less waiting means more productivity. rabcasm can assemble large projects (>200000 LOC) in under a second on modern machines.
  • Comfortably-editable output. Each class is decompiled to its own file, with files arranged in subdirectories representing the package hierarchy. Class files are #included from the main file.
  • Most importantly – robustness! If the Adobe AVM can load and run the file, then it must be editable – no matter if the file is obfuscated or otherwise mutilated to prevent reverse-engineering. RABCDAsm achieves this by using a textual representation closer to the ABC file format, rather than to what an ActionScript compiler would generate.

Read more on the project’s homepage on GitHub.