ABC of cracking ABC and unknown protocols

I started playing some flash-based MMORPG for fun lately. The limited options available to the characters in RPGs are not as entertaining as programming, so this didn’t last long; but it definitely gave me an idea… Can I get the event stream and decode it without knowing anything about the game’s design and of course without any source available? Now that’s an interesting quest!

Warning: if you see anything silly about ABC or Flash, that might be because I learned everything I know about it during this project. Any corrections welcome.

Overview of communication

Before going into the details of communication I had to figure out what’s happening in general - which connections are started, how much data is sent, what the encapsulation looks like, etc. That task is pretty trivial with the help of Wireshark. I set it to capture everything, connected, entered the game for a moment and left. So what could I learn from this short record?

There were three communication channels:

An unencrypted HTTP GET request which returned one word (number) - probably something related to a version check or cache invalidation
An encrypted HTTPS request which contained (readable thanks to an SSL MITM with my own injected CA) the first login and some details about the available game characters
Then some time later, a single TCP connection constantly streaming loads of small portions of data

Everything looks pretty obvious, but… the long TCP connection seems to contain just garbage. That usually means that the data is either compressed or encrypted. The second is more likely, since the game would aim for minimum delays and small update packets. After checking the usual options - can ‘file’ identify the stream, does it start with any standard header, can standard tools unpack it, does it contain any readable strings - I found that the answer to all of them was “no”. Encryption it is then…
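
The checks above can be roughly automated. The following is a minimal sketch, not the tooling used for this post: the helper names and the short list of magic numbers are my own assumptions, and it only covers a few common formats.

```python
import math
import zlib
from collections import Counter

# Magic numbers of a few common compressed formats (illustrative, not exhaustive).
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"PK\x03\x04": "zip",
    b"\xfd7zXZ\x00": "xz",
}


def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in Counter(data).values())


def known_header(data: bytes):
    """Return the name of a recognised format, or None."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return None


def inflates(data: bytes) -> bool:
    """Try a zlib-wrapped stream first, then raw deflate."""
    for wbits in (zlib.MAX_WBITS, -zlib.MAX_WBITS):
        try:
            if zlib.decompressobj(wbits).decompress(data):
                return True
        except zlib.error:
            pass
    return False
```

High entropy with no recognisable header and no successful unpacking points towards encryption, but it is only a hint. The entropy estimate also needs a reasonable amount of data (a few kilobytes), because a sample of N bytes can never measure more than log2(N) bits per byte.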

Going deeper

Finding out more details about encryption was pretty easy. I disassembled the code in the .swf and noticed that even though the main code was obfuscated, the libraries weren’t. This only required a quick grep for “cipher” and I knew the library was from crypto.hurlant.com. After a moment it was obvious which function to look for (Crypto.getCipher). This was very useful, because at that point I knew both what functions were used and what arguments they take. Calls to that function were present in a couple of places as expected, and what was left was to figure out where the key comes from. That part was a bit harder.

Flash decompilers are very poor quality

I tried to find some decompiler for Flash to find the source of the key generation. Browsing through the ABC was a bit too hard (even though the code itself is quite simple) and I could not know how far the answer was. Maybe the key was generated in a trivial function, maybe it was split over many stages…

I thought: why not decompile everything to AS source code, surely it can’t be that tricky. Unfortunately it was - I got 3 different results from the software I could get my hands on: a complete crash, a crash only on some modules, and a refusal to decompile. To be honest, I did not expect Hex-Rays quality, but something that does the minimum would be great. The partial results I got were completely silly and quite surprising. It seems that the easily available Flash decompilers don’t even try to recreate what the code does - they just translate instructions one by one into AS statements. If your ABC says “inclocal_i …; declocal_i …” that’s exactly what you’ll see done on a temporary variable… which turns out to be unused afterwards.

At this point I did what any other insane person would do - started writing my own decompiler. It’s not that hard really. Actually you can grab the last compiler you wrote (everyone wrote one, right?), and reuse large parts of it. Operations get split into blocks linked to other blocks, you give every stack position and local variable a unique name, convert the code into SSA form, remove dead blocks, do peephole optimisations to strip silly obfuscation code, detect which loops can be converted into while-s/for-s and spit out the code… At stage 3 or 4, I noticed that the whole idea was silly: although the project was working nicely (at this point it probably gave better results than some commercial solutions I had tried before, even if not all opcodes were implemented), it was just wasting time. I had one simple task and this tool would take too long to complete even if it was limited to just ABC cleanup and propagating argument names into variables.
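
To give a flavour of the peephole stage, here is a small sketch (not the decompiler described above) that strips the “inclocal_i …; declocal_i …” pairs mentioned earlier. The tuple representation of instructions and the function name are hypothetical, and the pass assumes it is run inside a single basic block on a register that already holds an int.

```python
# Opcodes that undo each other when applied to the same register.
INVERSE = {"inclocal_i": "declocal_i", "declocal_i": "inclocal_i"}


def strip_noop_pairs(ops):
    """Remove adjacent increment/decrement pairs on the same local register.

    `ops` is a list of tuples such as ("inclocal_i", 4). Using the output
    list as a stack also catches nested pairs (inc 4, inc 5, dec 5, dec 4).
    Only valid within one basic block: a jump target between the two
    instructions would make the pair observable.
    """
    out = []
    for op in ops:
        prev = out[-1] if out else None
        if prev and INVERSE.get(prev[0]) == op[0] and prev[1:] == op[1:]:
            out.pop()  # the new op cancels the previous one
        else:
            out.append(op)
    return out


block = [
    ("getlocal", 1),
    ("inclocal_i", 4),
    ("declocal_i", 4),
    ("pushbyte", 2),
]
cleaned = strip_noop_pairs(block)
```

A real pass would run on the block graph after dead-block removal, so that junk which only becomes adjacent after other cleanups is caught as well.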

A better way

Since I didn’t really care about where the password comes from, but only what it is, I decided to do something else… print out the key itself. Apparently there’s this thing called the Flash debug player and you can use it to see the output of all “trace(…)” calls - awesome! Ah… and you need a 32-bit Windows system for it, otherwise it’s not going to work - that was a bit painful, but VirtualBox solved this problem quite well.

What I needed to do was to inject a bit of my own code into the existing .swf, run it under the debug version of Flash and collect the result. Injecting the code seems pretty hard if you want to do it yourself. There are loads of projects which will disassemble a .swf file, but almost none which can reassemble it again. Fortunately Apparat does this in quite a simple way - by providing an API written in Scala. It provides a complete, magical framework for modifying the code and the only things you need to provide are a filter to choose where the modification should be applied and a new template to expand in that place.

Locating the needed part was pretty easy. It looked something like this:

getlocal 1
getlocal 2
getlocal 3
call_property getCipher, 3

This means just “load 3 local variables on the stack and call getCipher”. Of course the parameters were known from the library source:

getCipher(name:String, key:ByteArray, pad:IPad=null):ICipher

I was interested in the key and the cipher. The pad was not needed after all - because the cipher turned out to be “rc4”. The way it works (simplified) is similar to a combination of a known seed (the key), a pseudo-random number generator and the plaintext xor’ed with the PRNG’s output. Very simple design and there are lots of libraries available to verify the result.
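
For reference, the whole of RC4 fits in a few lines. This is a generic sketch of the standard algorithm in Python, not code taken from the game or from the as3crypto library: the key seeds a 256-byte permutation, the permutation is then stepped to produce a keystream, and each data byte is xor’ed with it.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt `data` with RC4 (the operation is its own inverse)."""
    # Key scheduling: shuffle the identity permutation using the key bytes.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Keystream generation: keep shuffling and xor one keystream byte per data byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Widely published test vector: key "Key", plaintext "Plaintext".
ciphertext = rc4(b"Key", b"Plaintext")
```

Note that this function restarts the keystream on every call. A long-lived connection would normally keep one cipher state per direction and continue the keystream across packets, so decoding a capture packet by packet needs a stateful variant.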

Because only two arguments were interesting, inserting a call to “trace” somewhere before the call to the library would give me the needed data. The only thing to remember is that the stack needs to be returned to exactly the same state as before, otherwise the rest of the code would fail. None of the local variables can be overwritten either. This is a job for a spy: get the key, convert to string, print out, put everything back in its place. Here’s the full filter for Apparat, printing out the second parameter:

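// Match "getlocal 1; getlocal 2; getlocal 3; call_property getCipher, 3"
// and expand it into the same sequence preceded by a trace() of the key.
private lazy val traceCipherCall = {
  (GetLocal(1) ~ GetLocal(2) ~ GetLocal(3) ~ BytecodeChains.partial {
    case originalCall @ CallProperty(name, 3) if name == getCipherQName => originalCall
  }) ^^ {
    case (GetLocal(1) ~ GetLocal(2) ~ GetLocal(3) ~ x) => List[AbstractOp](
      // trace(String(key)): pushes two values, then consumes both
      FindPropStrict(traceQName),
      GetLocal(2),
      ConvertString(),
      CallPropVoid(traceQName, 1),
      // the original, untouched call sequence
      GetLocal(1),
      GetLocal(2),
      GetLocal(3),
      x)
    case _ => error("internal fail")
  }
}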

This is fairly simple - match 3 getlocal-s followed by a call to “getCipher”. This sequence is pretty specific, so it matched only in 2 places - exactly where it was needed. Now I needed to get the output. One installation of windows + debug flash + tons of crap later, I discovered that… the application detects whether it’s running in a debugging environment and changes its configuration to use the test server instead.
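
For readers who don’t speak Scala, the same match-and-expand idea can be modelled on plain lists. This is only a toy Python model under my own assumptions (instructions as tuples, made-up helper names, stack effects listed just for the four injected opcodes); it is not how Apparat works internally.

```python
PATTERN = [
    ("getlocal", 1),
    ("getlocal", 2),
    ("getlocal", 3),
    ("call_property", "getCipher", 3),
]

# trace(String(local 2)), injected in front of every match.
TRACE = [
    ("find_prop_strict", "trace"),
    ("getlocal", 2),
    ("convert_string",),
    ("call_prop_void", "trace", 1),
]


def stack_delta(ops):
    """Net change in stack depth, for the opcodes used in TRACE only."""
    delta = 0
    for op in ops:
        if op[0] in ("find_prop_strict", "getlocal"):
            delta += 1                # pushes one value
        elif op[0] == "call_prop_void":
            delta -= op[2] + 1        # pops the arguments and the receiver
        # convert_string replaces the top of the stack: no change
    return delta


def inject_trace(ops):
    """Copy `ops`, inserting TRACE before every occurrence of PATTERN."""
    out, i, n = [], 0, len(PATTERN)
    while i < len(ops):
        if ops[i:i + n] == PATTERN:
            out.extend(TRACE)
            out.extend(PATTERN)
            i += n
        else:
            out.append(ops[i])
            i += 1
    return out


method = [("getlocal", 0), ("pushscope",)] + PATTERN + [("return_value",)]
patched = inject_trace(method)
```

The important property is that `stack_delta(TRACE)` is zero: the injected code leaves the stack exactly as it found it, so the original call still sees its three arguments.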

Removing debugger detection

Two Google searches later I found that the most likely way of debugger detection is checking the “isDebugger” variable. Unfortunately the check wasn’t done in a simple if/else way. The code passed the “isDebugger” string to some other function which then saved the results of a number of checks. I really didn’t want to get into the details of how that happens. The easiest alternative was to use some other flag which was guaranteed to be false. Luckily, “avHardwareDisable” turned out to be a good candidate. The modification was quickly applied… and the app went into testing mode again. Something else was missing - the most trivial fix was not enough.

The second simplest thing to try was to look for the string describing the testing server and browse the code around it for some condition checking. And it was there! Some function was doing what looked like a dns lookup (still not sure if that was the case) and comparing the result to a known value - apparently running in a debugger influenced the result somehow. Since adding the “testing” string depended on the result of this function, it was a good enough candidate for patching. Fortunately the following sequence:

  convert_string
  equals
  getlocal_2
  if_true "L2"
  not
"L2:"
  return_value

is not that popular. What happens here is: two strings get compared first, then the result gets flipped if local_2 is not set, then the result is returned. It’s basically returning whether the string comparison result equals local_2 (an XNOR, i.e. the comparison xor the negation of local_2) - probably it went through bytecode obfuscation. A good way to hardcode the result without messing up the stack was changing the beginning to:
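
To double-check that reading of the bytecode, the tail of the function can be replayed on a tiny stack model. This is a Python sketch with a hypothetical function name; it starts from the state right after “equals” and only models the instructions shown above.

```python
def run_check(strings_equal: bool, local_2: bool) -> bool:
    """Replay: getlocal_2; if_true L2; not; L2: return_value."""
    stack = [strings_equal]            # result left by convert_string + equals
    stack.append(local_2)              # getlocal_2
    if not stack.pop():                # if_true "L2" pops the flag, jumps when true
        stack.append(not stack.pop())  # not (only reached when local_2 is false)
    return stack.pop()                 # L2: return_value


# Enumerate all four input combinations.
table = {
    (eq, flag): run_check(eq, flag)
    for eq in (False, True)
    for flag in (False, True)
}
```

The table shows that the function returns true exactly when the comparison result and local_2 agree.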

  pop
  push_false
  push_true

And… success! On the next try the application connected to a production server and printed out the key it used. Not only that, but the key was successfully used later on to read and write any of the events sent between the client and the server. But that’s a topic for another post in the future.

Lessons if you want to protect your communication:

obfuscate the object files after including the libraries, not before - especially if the libraries are open-source
do a random key exchange instead of hardcoding your passphrase
add CA verification (is it possible in flash? not sure)
disassemble your own binaries to find issues
… all of the above will be worked around and your app will be hacked anyway, get used to the idea :)

