Reverse-engineering and deobfuscation of Flash apps

I probably should have known better when I started down this path of “cracking” a certain Flash game. It has taken far more of my time than I had initially planned. On the other hand, I now know how to use Adobe Flash Builder, the Adobe Flex SDK, XML schemas and JAXB, and I have brushed up on my Java as well.

My previous versions of the cheat consisted of a Mozilla Firefox extension which redirected requests for the game SWF (and data.xml, a configuration file) to my server. The server (configured as an HTTP proxy) sent back an older version of the game SWF, which still contained debugging code that allowed you to edit the game board using hotkeys. The game developer had once changed the MD5 salt used to calculate validation checksums (after removing the debugging code), but I got around that by decompressing (zlib) the SWF file, hex-editing the salt, and compressing it back. However, the latest build of the game was obfuscated, so no such tricks would work this time.
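The decompress, patch and recompress trick can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the function name `patch_swf` and the salt values in the usage note are made up, and it assumes the salt occurs exactly once and that the replacement has the same length (as with hex-editing, so that length-prefixed strings stay valid). Compressed SWF files start with the signature `CWS` and hold a zlib stream after an 8-byte header.

```python
import struct
import zlib


def patch_swf(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace `old` with `new` in a plain (FWS) or zlib-compressed (CWS) SWF."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    signature, version = data[:3], data[3]
    if signature == b"CWS":
        body = zlib.decompress(data[8:])   # everything after the 8-byte header
    elif signature == b"FWS":
        body = data[8:]
    else:
        raise ValueError("not an FWS/CWS file")
    if body.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the old value")
    body = body.replace(old, new)
    # Header bytes 4..7 hold the uncompressed file length (header included).
    header = signature + bytes([version]) + struct.pack("<I", len(body) + 8)
    return header + (zlib.compress(body) if signature == b"CWS" else body)
```

Hypothetical usage: `open("patched.swf", "wb").write(patch_swf(open("game.swf", "rb").read(), b"OLDSALT1", b"NEWSALT1"))`.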

In retrospect (as I wrote on the cheat page), I could have used simpler techniques such as memory editing and replay attacks, but all these could eventually be “patched up”. It would probably even have been simpler if I had written a bot (or just updated the old one) – there really isn’t much to do against screen-scraping bots, other than heuristics and changing the UI every once in a while. Still, I have written an ActionScript 3 deobfuscator.

I began my research with the Adobe Flex SDK. Considering that it includes a full open-source ActionScript/SWF compiler and utilities, I thought it would have some code which I could adapt to my needs. I was right; in fact, some classes even overlap or duplicate each other’s functionality. I created the following “map” while spelunking through the SDK source:

SWF

flash.swf.TagHandler: interface to handle SWF tags
flash.swf.TagDecoder: InputStream -> TagHandler
flash.swf.TagEncoder: TagHandler -> OutputStream/byte[]
flash.swf.MovieDecoder: TagHandler -> Movie
flash.swf.MovieEncoder: Movie -> TagHandler
flash.swf.tools.SwfxPrinter: TagHandler -> PrintWriter (Swfx)
flash.swf.tools.SwfxParser: XML (Swfx) -> TagHandler

Actions

flash.swf.ActionHandler: interface to handle actions(?)
flash.swf.tools.Disassembler: ActionHandler -> PrintWriter
flash.swf.ActionEncoder: ActionHandler -> SwfEncoder
flash.swf.ActionDecoder: SwfDecoder -> ActionList

ABC

macromedia.abc.Visitor: interface to handle ABC instructions
macromedia.abc.Decoder: BytecodeBuffer -> Visitor
macromedia.abc.Encoder: Visitor -> byte[]
macromedia.abc.DefaultVisitor: Visitor -> higher-level interface
  (Visitors, DefaultVisitor at least, still need a reference to the original decoder.)
macromedia.abc.Printer.ABCVisitor: Visitor -> System.out
flash.swf.tools.AbcPrinter: byte[] -> PrintWriter
  (Used by flash.swf.tools.SwfxPrinter.)

Nodes

macromedia.asc.parser.Evaluator: interface to handle macromedia.asc.parser.*Node
flash.swf.tools.SyntaxTreeDumper: Evaluator -> XML
flash.swf.tools.as3.PrettyPrinter: Evaluator -> PrintWriter
macromedia.asc.semantics.Emitter: interface for handling ABC instructions?
macromedia.asc.semantics.CodeGenerator: Evaluator -> Emitter
macromedia.asc.embedding.avmplus.ActionBlockEmitter: Emitter -> ByteList

Optimizer

adobe.abc.GlobalOptimizer: reads/writes .abc files, uses adobe.abc.Expr/Node/Edge/etc.
It looks like the adobe.abc namespace is only used for optimization of already-compiled code.
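To illustrate the decoder/handler split that the SWF part of this map describes (a decoder reads the stream and pushes each tag to a handler), here is a minimal Python sketch. It is not the Flex SDK API: `iter_tags`, `decode` and `TAG_NAMES` are invented names, and the input is assumed to be the already uncompressed SWF body, i.e. everything after the 8-byte file header.

```python
import struct
from typing import Callable, Iterator, Tuple

# A few well-known tag codes from the SWF format.
TAG_NAMES = {0: "End", 9: "SetBackgroundColor", 76: "SymbolClass", 82: "DoABC"}


def iter_tags(body: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (tag_code, payload) pairs from an uncompressed SWF body."""
    nbits = body[0] >> 3                    # bits per field of the frame-size RECT
    rect_len = (5 + 4 * nbits + 7) // 8     # RECT is padded to a whole byte
    pos = rect_len + 4                      # skip frame rate and frame count
    while pos < len(body):
        (code_and_len,) = struct.unpack_from("<H", body, pos)
        pos += 2
        code, length = code_and_len >> 6, code_and_len & 0x3F
        if length == 0x3F:                  # long tag: a 32-bit length follows
            (length,) = struct.unpack_from("<I", body, pos)
            pos += 4
        yield code, body[pos:pos + length]
        pos += length
        if code == 0:                       # End tag
            break


def decode(body: bytes, handler: Callable[[int, bytes], None]) -> None:
    """Push every tag to `handler`, loosely mirroring TagDecoder -> TagHandler."""
    for code, payload in iter_tags(body):
        handler(code, payload)
```

A handler that only cares about ActionScript 3 bytecode would check for code 82 (DoABC) and ignore everything else.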

Unfortunately, I haven’t found any code capable of robust deserialization of ABC. (macromedia.abc.Encoder/Decoder come close, but they choke on some obfuscated code and produce broken output.) Instead of writing a deserializer, I wrote a class (based on flash.swf.tools.AbcPrinter) that allows its subclasses to edit ABC structures “on the fly”.

The deobfuscator classes (the main class extends TagEncoder, so that it can also edit SWF tags “on the fly”) perform two functions: correcting identifier names in SymbolClass tags and ABC string constant pools, and rearranging code blocks to eliminate dead “junk” code inserted by the obfuscator.

The first function is fairly self-explanatory: it searches for strings starting with “_-” (the obfuscator’s prefix for all obfuscated names) or containing ActionScript keywords, and replaces them with strings that an ActionScript compiler would be happier with. This step wasn’t actually necessary, as the decompiled code wasn’t directly reusable anyway. It also supports replacing strings based on a dictionary read from a text file, which allowed me to “rename” identifiers (similar to IDA).
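A rough Python sketch of this renaming pass follows. It is a simplification under stated assumptions, not the deobfuscator's actual code: the `Renamer` class, the generated `name0001` scheme and the “old new” dictionary file format are all invented for illustration, the keyword set is partial, and a string is only treated as a problem if it starts with “_-” or is exactly a keyword.

```python
from typing import Dict, Iterable, Optional

# Partial list of ActionScript 3 keywords, enough for the illustration.
AS3_KEYWORDS = {
    "break", "case", "class", "continue", "default", "do", "else", "for",
    "function", "if", "in", "new", "return", "switch", "var", "while", "with",
}


def load_dictionary(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'old new' pairs, one per line (assumed file format)."""
    pairs = (line.split() for line in lines if line.strip())
    return {old: new for old, new in pairs}


class Renamer:
    def __init__(self, dictionary: Optional[Dict[str, str]] = None) -> None:
        self.mapping = dict(dictionary or {})   # user-chosen names win
        self._counter = 0

    def rename(self, name: str) -> str:
        """Return a compiler-friendly replacement for one constant-pool string."""
        if name in self.mapping:
            return self.mapping[name]
        if not (name.startswith("_-") or name in AS3_KEYWORDS):
            return name                          # already a usable identifier
        self._counter += 1
        self.mapping[name] = f"name{self._counter:04d}"
        return self.mapping[name]
```

Keeping the mapping in one object means every occurrence of the same obfuscated string gets the same replacement, whether it appears in a SymbolClass tag or in an ABC string pool.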

The second function is more interesting. The gist is that it splits each function into blocks (delimited at jumps and jump targets), constructs a flow graph, and then rewrites the code by traversing the graph from the first node and writing out the blocks in the order visited. In practice, this is done by patching relative jump offsets in place and then writing the same bytecode in a different order (inserting labels and jumps where appropriate). Thus, the code doesn’t actually create in-memory copies of the bytecode. This approach has some side effects, though: all unconditional jumps are rewritten to be strictly backwards, which seems to break the AVM, as it spits out a cryptic error such as “VerifyError: Error #1068: package.class and package.class cannot be reconciled” (yes, the very same class). However, it makes decompilers much happier with the new code, and that’s what mattered to me most.
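The block-reordering idea can be shown on a toy instruction set. The sketch below is an assumption-laden model, not the deobfuscator itself: instructions are `(opcode, target)` pairs with absolute instruction indices instead of AVM2 byte offsets, only the made-up opcodes `jump`, `iftrue` and `return` affect control flow, and it builds a new list rather than patching in place. Unlike the real tool, it does not force unconditional jumps to point backwards.

```python
from typing import List, Optional, Tuple

Instr = Tuple[str, Optional[int]]   # (opcode, branch target index or None)
BRANCHES = ("jump", "iftrue")       # toy opcodes, not real AVM2 ones


def rewrite(code: List[Instr]) -> List[Instr]:
    """Drop unreachable blocks and emit the rest in depth-first visiting order."""
    # 1. Block boundaries: the entry, every branch target, whatever follows a branch.
    leaders = {0}
    for i, (op, target) in enumerate(code):
        if op in BRANCHES:
            leaders.update((target, i + 1))
    starts = sorted(s for s in leaders if s < len(code))
    end_of = dict(zip(starts, starts[1:] + [len(code)]))

    def falls_through(start: int) -> bool:
        end = end_of[start]
        return code[end - 1][0] not in ("jump", "return") and end < len(code)

    # 2. Walk the flow graph from the entry block, fall-through edge first.
    order, seen, stack = [], set(), [0]
    while stack:
        start = stack.pop()
        if start in seen:
            continue
        seen.add(start)
        order.append(start)
        op, target = code[end_of[start] - 1]
        successors = [end_of[start]] if falls_through(start) else []
        if op in BRANCHES:
            successors.append(target)
        stack.extend(reversed(successors))

    # 3. A block whose fall-through successor is no longer adjacent needs a jump.
    extra = {}
    for k, start in enumerate(order):
        following = order[k + 1] if k + 1 < len(order) else None
        if falls_through(start) and end_of[start] != following:
            extra[start] = end_of[start]

    # 4. Compute the new block positions, then emit with patched targets.
    new_pos, pos = {}, 0
    for start in order:
        new_pos[start] = pos
        pos += end_of[start] - start + (start in extra)
    out: List[Instr] = []
    for start in order:
        for op, target in code[start:end_of[start]]:
            out.append((op, new_pos[target] if op in BRANCHES else target))
        if start in extra:
            out.append(("jump", new_pos[extra[start]]))
    return out
```

Junk blocks that nothing jumps to are never visited, so they simply disappear from the output; blocks that lose their natural successor get an explicit jump appended.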

The deobfuscator made studying the game’s code much easier, but that wasn’t enough to accomplish my goal. It looks like the obfuscator used for this game also had a “string encryption” feature, and the MD5 salt used for validating sent and received data was encrypted using this feature (namely, a class which contained an encrypted version of the script and allowed decrypting it at runtime). Code generated by a decompiler was unusable, and the encryption algorithm (which I later recognized as the Tiny Encryption Algorithm) was too long to decipher from bytecode directly. I needed some way to execute that code without decompiling it. After initially planning to do some string replacement tricks and merging ABC code, I noticed that the respective class was in a separate DoABC tag (the obfuscator must have written that block manually). Then, running the code was as simple as:

  1. extracting the contents of the DoABC tag to an .abc file (AbcExport)
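  2. creating an ActionScript 3 application with the code:
    import flash.utils.getDefinitionByName;
    // Look up the obfuscated decryptor class by its mangled name
    var c:Class = getDefinitionByName("_-GF._-FA") as Class;
    var method:Function = c["_-NR"];
    // Surface the decrypted string through an uncaught error
    throw new Error(method(117, -75));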
  3. injecting the .abc file from step 1 into the resulting SWF (AbcReplace)

and you get the magic string in a Flash Player exception box :D
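For reference, the Tiny Encryption Algorithm mentioned above is short when written in a high-level language. The following Python sketch is textbook TEA on a single 64-bit block (two 32-bit words, 128-bit key, 32 rounds). How the obfuscator derived its key, packed strings into blocks, or whether it used a variant is not something the post states, so none of that is modelled here.

```python
from typing import Tuple

DELTA = 0x9E3779B9   # TEA's round constant
MASK = 0xFFFFFFFF    # keep arithmetic within 32 bits

Key = Tuple[int, int, int, int]


def tea_encrypt_block(v0: int, v1: int, key: Key) -> Tuple[int, int]:
    """Encrypt one block of two 32-bit words with standard 32-round TEA."""
    k0, k1, k2, k3 = key
    total = 0
    for _ in range(32):
        total = (total + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + k0) ^ (v1 + total) ^ ((v1 >> 5) + k1))) & MASK
        v1 = (v1 + (((v0 << 4) + k2) ^ (v0 + total) ^ ((v0 >> 5) + k3))) & MASK
    return v0, v1


def tea_decrypt_block(v0: int, v1: int, key: Key) -> Tuple[int, int]:
    """Invert tea_encrypt_block by running the rounds backwards."""
    k0, k1, k2, k3 = key
    total = (DELTA * 32) & MASK
    for _ in range(32):
        v1 = (v1 - (((v0 << 4) + k2) ^ (v0 + total) ^ ((v0 >> 5) + k3))) & MASK
        v0 = (v0 - (((v1 << 4) + k0) ^ (v1 + total) ^ ((v1 >> 5) + k1))) & MASK
        total = (total - DELTA) & MASK
    return v0, v1
```

Recognizing the constant 0x9E3779B9 in a disassembly is usually the quickest hint that a routine is TEA or one of its relatives.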

You can find the source code for my deobfuscator and other utilities on GitHub. Please don’t ask for help using the source code; you’re on your own.

Continued in: Announcing: RABCDAsm

51 thoughts on “Reverse-engineering and deobfuscation of Flash apps”

  1. rback15

    I’m guessing this trick is working, judging by the above replies. Can someone help me understand what I’m supposed to do here? I’m lost, thank you.

  2. azndrifter

    this is by far lots of work. and very much appreciated. thank you so much for your hard work. as much as it takes hard work and takes your time. on the plus side learning and recapping is the best to keep knowledge alive.. your skills will only get sharper. :)

    1. CyberShadow Post author

      These Java tools are mostly obsoleted by RABCDAsm… you should look at that instead. It doesn’t have any deobfuscation, but it would be a lot easier to write a deobfuscator based on RABCDAsm than on the Flex framework, as I’ve done here.

    2. CyberShadow Post author

      By the way, the file you posted doesn’t seem to contain any DoABC tags… are you sure this is a file containing ActionScript 3 code?

  3. cvx

    I’m not even newbie in this subject so I can’t be sure yet what I’m looking at when messing with swf files.

    This is code of one of obfuscated action scripts from inside of posted file.
    http://pastebin.com/mnDf5KDq

    Obviously it’s AS, but since there were no DoABC tags in posted file a logical explanation would be that this script is different from pure AS 3. Question remains, is this AS is pre 3 or file was modified somehow to harden any RE attempts?

    My wild guess is it’s just AS 2 and your tool works perfectly fine :)

  4. zzz

    Hi, I’m not able to understand assembler yet :( but I got what I needed to do by trial and error in hex editor.

    crude but it work for simple tasks.

  5. George Morbedadze

    Hi,
    what you have done is really very interesting and it’s great job, but I can not run RABCDAsm on my pc.
    I’m using Windows7, 64bit.
    I have dmd, git and svn installed.
    Maybe you know where can I find any information about using RABCDAsm, reverse-engining and deobfuscation of SWF files.
    Thenks in advance
    George

    1. CyberShadow Post author

      George, you don’t need to install any of that to run RABCDAsm. You can find a compiled executable on GitHub’s download page.

    1. CyberShadow Post author

      No, this is specifically for the ActionScript 3 Virtual Machine. You can use Flasm to disassemble AS2.

  6. Jesse Nicholson

    Hi There,

    Just wanted to say, big, big thanks for this. I want to know if you’re making this code just plain public domain or under any type of license, or, if I need permission to use it in an application? I’ve literally been at my computer for 13 hours straight digging through websites, tutorials etc on different ways to properly extract ABC from current swf versions (There are many solutions that modify the ABC to be human readable, etc). The most promising route was downloading tamarin source, building it out and using it to compile ABCDump, but that for some ridiculous reason just would not work out. I had it working in the past but, they’ve done a lot of restructuring to the tamarin branches and I think it’s ongoing, and has broken some things. Anyway let me know, huge thanks for your work, once I got the source I was up and running with a packed jar working in a couple minutes. Great stuff!

    1. CyberShadow Post author

      Hi Jesse,

      Since the code is partially a derivative work of the Flex SDK, it’s subject to the SDK’s redistribution terms.

      You may also want to have a look at RABCDAsm.

  7. Cyber clone

    Hello Cybershadow.
    Thanks for all your hard work here.

    I’ve a swf file after abcexport, you get like 600+.abc file.

    rabcasm swfname600/swfname600.main.asasm
    abcreplace CandyCrush.swf 600 swfname600/swfname600.main.abc

    results in a broken SWF.
    Any idea what can be wrong here?

    1. CyberShadow Post author

      Wow, I’ve never encountered a SWF file with more than 2-3 DoABC tags. I’m not sure where the problem could be. Can you send me the SWF file?

  8. Cyber clone

    Hello, and again many thanks for the hard work you put into this project, much appreciated.

    I can’t download the swf above.

    But Good news is, That the new version of RABCDAsm did the job alright version 1.5 Rocks.

    regards,

    Cyberclone

  9. George

    Hello Cybershadow,
    RABCDAsm is really great job and very interesting for me!
    I want to make FLA file from SWF file, using Sothink SWF Decompiler.
    To do this, before decompiling I am using RABCDAsm and after decompiler program.
    After all of this, when I Testing Movie in FLASH Professional CS5, I get 70 compiler errors:

    Warning: 3596: Duplicate variable definition.

    I can not understand what I did wrong?
    I will thankfull if you can help me to find the reason.

    Here are the SWF files, before and after using RABCDAsm:
    http://tg.cdn.ge/0/web.swf
    http://tg.cdn.ge/0/web(RABCDAsm_modified).swf

    Thenks in advance
    With kind Regards
    George

    1. CyberShadow Post author

      RABCDAsm has nothing to do with FLA files, and little with ActionScript the language or Flash Professional. I think you have severely misunderstood the purpose or the mode of use of RABCDAsm.

  10. Alex

    Hi!

    First of all, thanks for your great stuff!

    I’m using your RABCDasm a lot, nowadays.

    Anyway I have the necessity to get the names of some function/variable names from a swf I have (they are replaced with ‘_-06’ etc.).

    Given that RABCDAsm does not include any sort of deobfuscation-related stuff, I started reading this post.

    But I think I’ve not completely understood the last part: could you explain me how did you manage to get the real string (using that ActionScript application etc)?

    And if I would let my application to do this automatically for each string in the file (writing a deobfuscator), have you some hints about how I could accomplish this?

    Thanks in advance! :D

    1. CyberShadow Post author

      The original strings cannot be found anywhere. This post is about control flow obfuscation and string literal encryption, not name obfuscation.

        1. CyberShadow Post author

          No, the original names are permanently removed during obfuscation. You could make up your own names and use search and replace to apply them.

          1. Alex

            Is then possible to recognize the obfuscator (maybe the name is placed somewhere in the file) used to crypt the variables etc?

            If the obfuscator (and his algorithm) are known, then I suppose it would be possible to reverse-engineering it to get the original names…

  11. Amitesh

    Hi Dear,

    My requirement is to edit the images used in SWF to do so. Used swfmill, I first convert to xml. do the required changes in xml(related to images DefinebitJpeg2 and lossless) and publish the swf again using swfmill.

    When I publish the SWF I am losing the actions; for example, clicking the Replay button does not work in the converted SWF file.

    Step 1 – swfmill swf2xml input.swf output.xml
    Step 2 – swfmill xml2swf output.xml swfmill.swf

    After conversion swfmill.swf is broken since button clicks does not work.

    Could you please let me know if I need to do any code change or any alternate way to do swf -> xml and xml -> swf.

    I also used swfdump input.swf > output.xml, but I dont know how to convert the output.xml to SWF.

    Please let me know is there any safe way(without hurting any feature) to get XML and convert is back to SWF. thanks in advance.

  12. Amitesh

    Thanks a lot for quick response.

    I need it on Linux either as Java API call or command line tool(so that I can integrate it in my app). SWix seems having only windows support.

    1. CyberShadow Post author

      SWiX has a command-line program. You may have some luck running it in wine / wineconsole.

      I don’t have any ideas other than swix, or using some library. Since you mentioned Java, you may want to look at the Flex SDK (discussed in this blog article), since it’s open-source and in Java.

  13. Amitesh

    I can use flxsdk but is there any document that can provide me APIs to publish swf from (swfdump xml)?

    1. CyberShadow Post author

      I don’t know. Probably not. I had to figure things out by reading the source code when I wrote this article.

  14. Mick the Wig

    Again, great tools. Only problem is that they take a while to run if the SWF file is big.

    abcexport
    abcreplace
    rabcdasm

    Have you any interest in compiling the applications above in CUDA, so that they would run quicker? I’m calling them from the command line in C#.Net.

    The (trial-and-error) process of (a) disassemble (b) edit (c) reassemble (d) decompile and check changed code is a bit slow on the CPU.

    Is there anything I could do to make this run faster?

    Thanks.

    1. CyberShadow Post author

      Don’t disassemble it every time?

      The assembler is reasonably optimized, but not the disassembler.

