Reverse-engineering and deobfuscation of Flash apps

on Feb. 12, 2010, under Code

I probably should have known better when I started down this path of “cracking” a certain Flash game. It has taken far more of my time than I had initially planned. On the other hand, I now know how to use Adobe Flash Builder, the Adobe Flex SDK, XML schemas and JAXB, and I have brushed up on my Java as well.

My previous versions of the cheat consisted of a Mozilla Firefox extension which redirected requests for the game SWF (and data.xml, a configuration file) to my server. The server (configured as an HTTP proxy) sent back an older version of the game SWF, which still contained some leftover debugging code that allowed you to edit the game board using hotkeys. The game developer once changed the MD5 salt used to calculate validation checksums (after removing the debugging code), but I got away with it that time by decompressing the SWF file (compressed SWFs use zlib), hex-editing the salt, and compressing it back. However, the latest build of the game was obfuscated, so no such tricks would work this time.
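The decompress, patch and recompress trick can be sketched in Java roughly as follows. This is a minimal illustration rather than the tooling I actually used: the class and method names (SwfPatcher, replaceBytes) are made up for the example. It assumes a zlib-compressed SWF (signature “CWS”), where the first 8 bytes are stored uncompressed, and a replacement salt of exactly the same length, so that no length fields need fixing.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, named for this example only.
final class SwfPatcher {
    /** Inflates a "CWS" SWF into its uncompressed "FWS" form. */
    static byte[] decompress(byte[] swf) {
        if (swf[0] != 'C' || swf[1] != 'W' || swf[2] != 'S')
            throw new IllegalArgumentException("not a zlib-compressed SWF");
        Inflater inflater = new Inflater();
        inflater.setInput(swf, 8, swf.length - 8); // body starts after the 8-byte header
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(swf, 0, 8);
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("truncated SWF body");
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt SWF body", e);
        } finally {
            inflater.end();
        }
        byte[] result = out.toByteArray();
        result[0] = 'F'; // mark as uncompressed
        return result;
    }

    /** Deflates an "FWS" SWF back into "CWS" form. */
    static byte[] compress(byte[] swf) {
        Deflater deflater = new Deflater();
        deflater.setInput(swf, 8, swf.length - 8);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(swf, 0, 8); // header keeps the uncompressed length field as-is
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        byte[] result = out.toByteArray();
        result[0] = 'C'; // mark as compressed
        return result;
    }

    /** Overwrites the first occurrence of oldBytes with newBytes (same length) in place. */
    static boolean replaceBytes(byte[] data, byte[] oldBytes, byte[] newBytes) {
        if (oldBytes.length != newBytes.length)
            throw new IllegalArgumentException("replacement must keep the length");
        for (int i = 8; i + oldBytes.length <= data.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(data, i, i + oldBytes.length), oldBytes)) {
                System.arraycopy(newBytes, 0, data, i, newBytes.length);
                return true;
            }
        }
        return false;
    }
}
```

Keeping the salt the same length is what makes a plain byte replacement safe here; a longer or shorter string would also require adjusting the string's length prefix and the file length in the header.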

In retrospect (as I wrote on the cheat page), I could have used simpler techniques such as memory editing and replay attacks, but all these could eventually be “patched up”. It would probably even have been simpler if I had written a bot (or just updated the old one) – there really isn’t much to do against screen-scraping bots, other than heuristics and changing the UI every once in a while. Still, I have written an ActionScript 3 deobfuscator.

I began my research with the Adobe Flex SDK. Considering that it includes a full open-source ActionScript/SWF compiler and utilities, I thought it would have some code which I could adapt to my needs. I was right; in fact, some classes even overlap or duplicate each other’s functionality. I created the following “map” while spelunking through the SDK source:

SWF

flash.swf.TagHandler: interface to handle SWF tags
flash.swf.TagDecoder: InputStream -> TagHandler
flash.swf.TagEncoder: TagHandler -> OutputStream/byte[]
flash.swf.MovieDecoder: TagHandler -> Movie
flash.swf.MovieEncoder: Movie -> TagHandler
flash.swf.tools.SwfxPrinter: TagHandler -> PrintWriter (Swfx)
flash.swf.tools.SwfxParser: XML (Swfx) -> TagHandler

Actions

flash.swf.ActionHandler: interface to handle actions(?)
flash.swf.tools.Disassembler: ActionHandler -> PrintWriter
flash.swf.ActionEncoder: ActionHandler -> SwfEncoder
flash.swf.ActionDecoder: SwfDecoder -> ActionList

ABC

macromedia.abc.Visitor: interface to handle ABC instructions
macromedia.abc.Decoder: BytecodeBuffer -> Visitor
macromedia.abc.Encoder: Visitor -> byte[]
macromedia.abc.DefaultVisitor: Visitor -> higher-level interface
Visitors (DefaultVisitor at least) still need reference to the original decoder
macromedia.abc.Printer.ABCVisitor: Visitor -> System.out
flash.swf.tools.AbcPrinter: byte[] -> PrintWriter
Used by flash.swf.tools.SwfxPrinter

Nodes

macromedia.asc.parser.Evaluator: interface to handle macromedia.asc.parser.*Node
flash.swf.tools.SyntaxTreeDumper: Evaluator -> XML
flash.swf.tools.as3.PrettyPrinter: Evaluator -> PrintWriter
macromedia.asc.semantics.Emitter: interface for handling ABC instructions?
macromedia.asc.semantics.CodeGenerator: Evaluator -> Emitter
macromedia.asc.embedding.avmplus.ActionBlockEmitter: Emitter -> ByteList

Optimizer

adobe.abc.GlobalOptimizer: reads/writes .abc files, uses adobe.abc.Expr/Node/Edge/etc.
It looks like the adobe.abc namespace is only used for optimization of already-compiled code.

Unfortunately, I haven’t found any code capable of robust deserialization of ABC. (macromedia.abc.Encoder/Decoder come close, but they choke on some obfuscated code and produce broken output.) Rather than write a deserializer, I wrote a class (based on flash.swf.tools.AbcPrinter) that allows its subclasses to edit ABC structures “on the fly”.

The deobfuscator classes (the main class extends TagEncoder, so that it can also edit SWF tags “on the fly”) perform two functions: correcting identifier names in SymbolClass tags and ABC string constant pools, and rearranging code blocks to eliminate dead “junk” code inserted by the obfuscator.

The first function is fairly self-explanatory: it searches for strings starting with “_-” (the obfuscator’s prefix for all obfuscated names) or containing ActionScript keywords, and replaces them with strings that an ActionScript compiler would be happier with. This step wasn’t actually necessary, as the decompiled code wasn’t directly reusable anyway. It also allows replacing strings based on a dictionary read from a text file, which let me “rename” identifiers (similar to IDA).
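As a rough illustration of the renaming pass, here is a small Java sketch. It is not the code from my repository: the class name, the abbreviated keyword list and the generated “nameN” scheme are all assumptions made for the example, and it treats each dot-separated segment of a string as one identifier.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical renamer, for illustration only.
final class IdentifierRenamer {
    // Abbreviated keyword list; a real pass would cover all reserved words.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "break", "case", "class", "continue", "default", "do", "else", "for",
        "function", "if", "in", "new", "return", "switch", "this", "var",
        "while", "with"));

    private final Map<String, String> names = new HashMap<>();
    private int counter = 0;

    /** The dictionary holds manual renames, e.g. loaded from a text file. */
    IdentifierRenamer(Map<String, String> dictionary) {
        names.putAll(dictionary);
    }

    private static boolean needsRename(String part) {
        return part.startsWith("_-") || KEYWORDS.contains(part);
    }

    /** Rewrites each dot-separated segment; the same input always maps to the same output. */
    String fix(String s) {
        String[] parts = s.split("\\.", -1);
        for (int i = 0; i < parts.length; i++) {
            String mapped = names.get(parts[i]);
            if (mapped == null && needsRename(parts[i])) {
                mapped = "name" + counter++;   // made-up but compiler-friendly name
                names.put(parts[i], mapped);
            }
            if (mapped != null) parts[i] = mapped;
        }
        return String.join(".", parts);
    }
}
```

Remembering every generated name in the same map as the dictionary keeps the renaming consistent across SymbolClass tags and all constant pools in the file.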

The second function is more interesting. The gist is that it splits each function into blocks (delimited at jumps and jump targets), constructs a flow graph, and then rewrites the code by traversing the graph starting from the first node and writing out the blocks in the order visited. In practice, this is done by patching relative jump offsets in place and then writing the same bytecode in a different order (inserting labels and jumps where appropriate). Thus, the code doesn’t actually create in-memory copies of the bytecode. This approach has a side effect, though: all unconditional jumps are rewritten to be strictly backwards, and this seems to break the AVM, which spits out a cryptic error such as “VerifyError: Error #1068: package.class and package.class cannot be reconciled” (yes, the very same class). However, it makes decompilers much happier with the new code, and that’s what mattered most to me.
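The block-splitting and traversal idea can be sketched on a toy instruction model. The Java below is a simplified illustration, not the deobfuscator’s actual code: the Insn and Kind types are invented for the example, jump targets are instruction indices rather than byte offsets, and it ignores things real ABC has, such as lookupswitch and exception handler entry points. The choice to visit the fall-through block before the jump target is also an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model, for illustration only.
final class BlockReorder {
    enum Kind { PLAIN, JUMP, BRANCH, RETURN }

    static final class Insn {
        final Kind kind;
        final int target; // index of the destination instruction, or -1 if none
        Insn(Kind kind, int target) { this.kind = kind; this.target = target; }
    }

    /** Returns the start indices of all reachable blocks, in the order they are visited. */
    static List<Integer> reachableBlockOrder(List<Insn> code) {
        List<Integer> order = new ArrayList<>();
        if (code.isEmpty()) return order;

        // Block leaders: the first instruction, every jump target,
        // and every instruction that follows a jump, branch or return.
        TreeSet<Integer> leaders = new TreeSet<>();
        leaders.add(0);
        for (int i = 0; i < code.size(); i++) {
            Insn in = code.get(i);
            if (in.kind != Kind.PLAIN) {
                if (in.target >= 0) leaders.add(in.target);
                if (i + 1 < code.size()) leaders.add(i + 1);
            }
        }

        // Depth-first walk of the flow graph from the entry block.
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);
        while (!stack.isEmpty()) {
            int start = stack.pop();
            if (!seen.add(start)) continue;
            order.add(start);
            Integer next = leaders.higher(start);
            int end = (next == null ? code.size() : next) - 1; // last instruction of the block
            Insn last = code.get(end);
            // Push the jump target first so that the fall-through block is visited first.
            if (last.kind == Kind.JUMP || last.kind == Kind.BRANCH) stack.push(last.target);
            if ((last.kind == Kind.PLAIN || last.kind == Kind.BRANCH) && end + 1 < code.size())
                stack.push(end + 1);
        }
        return order;
    }
}
```

Blocks that never appear in the returned order are unreachable from the entry point, which is exactly the dead “junk” code that gets dropped when the function is written back out.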

The deobfuscator let me study the game’s code much more easily, but that wasn’t enough to accomplish my goal. It looks like the obfuscator used for this game also had a “string encryption” feature, and the MD5 salt used for validating sent and received data was encrypted using this feature (namely, a class which contained an encrypted version of the string and allowed decrypting it at runtime). Code generated by a decompiler was unusable, and the encryption algorithm (which I later recognized as the Tiny Encryption Algorithm) was too long to decipher from bytecode directly. I needed some way to execute that code without decompiling it. After initially planning to do some string replacement tricks and merge ABC code, I noticed that the class in question was in a separate DoABC tag (the obfuscator must have written that block manually). Then, running the code was as simple as:

  1. extracting the contents of the DoABC tag to an .abc file (AbcExport)
  2. creating an ActionScript 3 application with the code:
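    import flash.utils.getDefinitionByName;
    // Look up the obfuscated class by its qualified name, then its static method
    var c:Class = getDefinitionByName("_-GF._-FA") as Class;
    var method:Function = c["_-NR"];
    // Throwing makes Flash Player show the decrypted string in its error dialog
    throw new Error(method(117, -75));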
  3. injecting the .abc file from step 1 into the resulting SWF (AbcReplace)

and you get the magic string in a Flash Player exception box :D
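For the curious, step 1 boils down to walking the SWF tag list and copying out the payload of each DoABC tag. The Java sketch below is a simplified stand-in for AbcExport, not its actual source: the class name is invented, it expects an already-decompressed (“FWS”) file, and it only handles tag code 82, whose body is a 32-bit flags field and a null-terminated name followed by the ABC data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical scanner, for illustration only.
final class DoAbcScanner {
    private static final int TAG_END = 0;
    private static final int TAG_DO_ABC = 82;

    /** Returns the raw ABC payload of every DoABC tag in an uncompressed (FWS) SWF. */
    static List<byte[]> extractAbc(byte[] swf) {
        if (swf[0] != 'F' || swf[1] != 'W' || swf[2] != 'S')
            throw new IllegalArgumentException("expected an uncompressed SWF");

        // Skip the 8-byte header, the bit-packed frame RECT (5 bits of field
        // width plus four fields of that width), then frame rate and frame count.
        int nbits = (swf[8] & 0xFF) >> 3;
        int pos = 8 + (5 + 4 * nbits + 7) / 8 + 4;

        List<byte[]> result = new ArrayList<>();
        while (pos + 2 <= swf.length) {
            // Tag header: upper 10 bits are the tag code, lower 6 bits the length.
            int header = (swf[pos] & 0xFF) | (swf[pos + 1] & 0xFF) << 8;
            int code = header >> 6;
            int length = header & 0x3F;
            pos += 2;
            if (length == 0x3F) { // long form: a 32-bit little-endian length follows
                length = (swf[pos] & 0xFF) | (swf[pos + 1] & 0xFF) << 8
                       | (swf[pos + 2] & 0xFF) << 16 | (swf[pos + 3] & 0xFF) << 24;
                pos += 4;
            }
            if (code == TAG_END) break;
            if (code == TAG_DO_ABC) {
                int p = pos + 4;          // skip the 32-bit flags field
                while (swf[p] != 0) p++;  // skip the null-terminated tag name
                result.add(Arrays.copyOfRange(swf, p + 1, pos + length));
            }
            pos += length;
        }
        return result;
    }
}
```

Each returned array could then be written to its own .abc file; injecting a modified block (step 3) is the same walk in reverse, rewriting the tag header with the new length.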

You can find the source code for my deobfuscator and other utilities on GitHub. Please don’t ask for help with using the source code; you’re on your own.

Continued in: Announcing: RABCDAsm


50 Comments for this entry

  • mj

    awesome job man….. thanks for the hard work. just be responsible everyone.

  • Kim

    You are the best!!!!!!!!!!!!!!!!!!!!!!

  • Lou

    excellent job fella u r one clever mofo

  • mith

    amazing work mate.. hats off. :D

  • Joanna

    just wanted 2 say thanks for all the hard work that you do :) thank you

  • rback15

    Im guessing this tricking is working by the above replies. Can someone help me understand what im suppose to do here.. Im lost, thank you

  • TheSeeker

    Great post, and thanks for making the source available. I found it quite useful.

  • Bernard

    I’m not able to compile this for some reason? Using javac tools\*.java I’m getting 87 errors

  • azndrifter

    this is by far lots of work. and very much appreciated. thank you so much for your hard work. as much as it takes hard work and takes your time. on the plus side learning and recapping is the best to keep knowledge alive.. your skills will only get sharper. :)

  • locotomico

    very cool, good job

  • cvx

    Thank you for sharing this tool!
    Found so far two swf files that abcexport can’t process.
    http://www.filesonic.com/file/18485071/DC182DF8d01.swf
    (originally taken form xblaster.wp.pl)
    Second one is taken from same game (arena part).

    Keep up great work! :)

    • CyberShadow

      These Java tools are mostly obsoleted by RABCDAsm… you should look at that instead. It doesn’t have any deobfuscation, but it would be a lot easier to write a deobfuscator based on RABCDAsm than on the Flex framework, as I’ve done here.

    • CyberShadow

      By the way, the file you posted doesn’t seem to contain any DoABC tags… are you sure this is a file containing ActionScript 3 code?

  • cvx

    I’m not even newbie in this subject so I can’t be sure yet what I’m looking at when messing with swf files.

    This is code of one of obfuscated action scripts from inside of posted file.
    http://pastebin.com/mnDf5KDq

    Obviously it’s AS, but since there were no DoABC tags in posted file a logical explanation would be that this script is different from pure AS 3. Question remains, is this AS is pre 3 or file was modified somehow to harden any RE attempts?

    My wild guess is it’s just AS 2 and your tool works perfectly fine :)

  • CyberShadow

    It’s obviously not AS3 – AS3 has no eval function :)

  • zzz

    Hi, what is preferred flash compiler?

    thx

  • CyberShadow

    I can’t recommend a particular one, but if you happen to look for a disassembler, have a look at RABCDAsm: http://blog.thecybershadow.net/2010/05/05/announcing-rabcdasm/

  • zzz

    Hi, I’m not able to understand assembler yet :( but I got what I needed to do by trial and error in hex editor.

    crude but it work for simple tasks.

  • George Morbedadze

    Hi,
    what you have done is really very interesting and it’s great job, but I can not run RABCDAsm on my pc.
    I’m using Windows7, 64bit.
    I have dmd, git and svn installed.
    Maybe you know where can I find any information about using RABCDAsm, reverse-engining and deobfuscation of SWF files.
    Thenks in advance
    George

  • George Morbedadze

    Hi,
    After I begin hacking on a SWF file(abcexport file.swf),
    I can not find created a file0 directory.
    …or maybe I did something wrong?

  • George Morbedadze

    Hi CyberShadow,
    I have the question: can I use it for AS2, or adjast for AS2?

  • Jesse Nicholson

    Hi There,

    Just wanted to say, big, big thanks for this. I want to know if you’re making this code just plain public domain or under any type of license, or, if I need permission to use it in an application? I’ve literally been at my computer for 13 hours straight digging through websites, tutorials etc on different ways to properly extract ABC from current swf versions (There are many solutions that modify the ABC to be human readable, etc). The most promising route was downloading tamarin source, building it out and using it to compile ABCDump, but that for some ridiculous reason just would not work out. I had it working in the past but, they’ve done a lot of restructuring to the tamarin branches and I think it’s ongoing, and has broken some things. Anyway let me know, huge thanks for your work, once I got the source I was up and running with a packed jar working in a couple minutes. Great stuff!

    • CyberShadow

      Hi Jesse,

      Since the code is partially a derivative work of the Flex SDK, it’s subject to the SDK’s redistribution terms.

      You may also want to have a look at RABCDAsm.

  • Cyber clone

    Hello Cybershadow.
    Thanks for all your hard work here.

    I’ve a swf file after abcexport, you get like 600+.abc file.

    rabcasm swfname600/swfname600.main.asasm
    abcreplace CandyCrush.swf 600 swfname600/swfname600.main.abc

    result into a broking swf.
    any Idee what can be wrong here ?

  • Cyber clone

    Hello and again many Thanks for you’re hard work add into this project, much appreciated.

    I can’t download the swf above.

    But Good news is, That the new version of RABCDAsm did the job alright version 1.5 Rocks.

    regards,

    Cyberclone

  • Mick the Wig

    You, Sir, are a gentleman. These tools have made my life much easier and have almost resolved My SWF Hell. .

    Bestios etc.

  • George

    Hello Cybershadow,
    RABCDAsm is really great job and very interesting for me!
    I want to make FLA file from SWF file, using Sothink SWF Decompiler.
    To do this, before decompiling I am using RABCDAsm and after decompiler program.
    After all of this, when I Testing Movie in FLASH Professional CS5, I get 70 compiler errors:

    Warning: 3596: Duplicate variable definition.

    I can not understand what I did wrong?
    I will thankfull if you can help me to find the reason.

    Here are the SWF files, before and after using RABCDAsm:
    http://tg.cdn.ge/0/web.swf
    http://tg.cdn.ge/0/web(RABCDAsm_modified).swf

    Thenks in advance
    With kind Regards
    George

    • CyberShadow

      RABCDAsm has nothing to do with FLA files, and little with ActionScript the language or Flash Professional. I think you have severely misunderstood the purpose or the mode of use of RABCDAsm.

  • Alex

    Hi!

    First of all, thanks for your great stuff!

    I’m using your RABCDasm a lot, nowadays.

    Anyway I have the necessity to get the names of some function/variable names from a swf I have (they are replaced with ‘_-06′ etc)..

    Given that RABCDAsm does not include any sort of deobfuscation-related stuff, I started reading this post.

    But I think I’ve not completely understood the last part: could you explain me how did you manage to get the real string (using that ActionScript application etc)?

    And if I would let my application to do this automatically for each string in the file (writing a deobfuscator), have you some hints about how I could accomplish this?

    Thanks in advance! :D

    • CyberShadow

      The original strings cannot be found anywhere. This post is about control flow obfuscation and string literal encryption, not name obfuscation.

      • Alex

        So, there is no chance to get the name of a local variable in a class?

        • CyberShadow

          No, the original names are permanently removed during obfuscation. You could make up your own names and use search and replace to apply them.

          • Alex

            Is then possible to recognize the obfuscator (maybe the name is placed somewhere in the file) used to crypt the variables etc?

            If the obfuscator (and his algorithm) are known, then I suppose it would be possible to reverse-engineering it to get the original names…

  • Amitesh

    Hi Dear,

    My requirement is to edit the images used in SWF to do so. Used swfmill, I first convert to xml. do the required changes in xml(related to images DefinebitJpeg2 and lossless) and publish the swf again using swfmill.

    when i publish the swf I am loosing the actions like click on Replay button does not work in convered swf file.

    Step 1 – swfmill swf2xml input.swf output.xml
    Step 2 – swfmill xml2swf output.xml swfmill.swf

    After conversion swfmill.swf is broken since button clicks does not work.

    Could you please let me know if I need to do any code change or any alternate way to do swf -> xml and xml -> swf.

    I also used swfdump input.swf > output.xml, but I dont know how to convert the output.xml to SWF.

    Please let me know is there any safe way(without hurting any feature) to get XML and convert is back to SWF. thanks in advance.

  • Amitesh

    Thanks a lot for quick response.

    I need it on Linux either as Java API call or command line tool(so that I can integrate it in my app). SWix seems having only windows support.

    • CyberShadow

      SWiX has a command-line program. You may have some luck running it in wine / wineconsole.

      I don’t have any ideas other than swix, or using some library. Since you mentioned Java, you may want to look at the Flex SDK (discussed in this blog article), since it’s open-source and in Java.

  • Amitesh

    I can use flxsdk but is there any document that can provide me APIs to publish swf from (swfdump xml)?

    • CyberShadow

      I don’t know. Probably not. I had to figure things out by reading the source code when I wrote this article.

  • Mick the Wig

    Again, great tools. Only problem is that they take a while to run if the SWF file is big.

    abcexport
    abcreplace
    rabcdasm

    Have you any interest in compiling the applications above in CUDA, so that they would run quicker? I’m calling them from the command line in C#.Net.

    The (trial-and-error) process of (a) dissassemble (c) edit (d) reassemble (e) decompile and check changed code is a bit slow on the CPU.

    Is there anything I could do to make this run faster?

    Thanks.
