What if Adobe generated code?

May 27, 2010 Zac White apple software

Wil Shipley had an idea for Adobe:

“If Adobe were smart, they’d modify their Flash ▶ iPhone code to just output Obj-C source code. Not much Apple could do.”

I’m sure Adobe is smart. And that’s why they aren’t getting into a battle they can’t possibly win. Let’s explore a what-if scenario of Adobe going down this path. How would Apple react in this cat-and-mouse game? Could they create a tool that automatically filters out Flash-created apps without having the source code to your program?

Step 1: Class dump executable.

For those of you who don’t know, you can get all the Objective-C symbols out of a Mach-O binary with a wonderful tool called class-dump. If the symbols aren’t there, you’ve already detected a possibly obfuscated build.
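As a rough sketch of that first check, assume the class-dump output has already been saved as text and that classes appear as `@interface Name : Superclass` declarations. The sample dump below is made up, and `CS5FlashView` is only the illustrative name used later in this post, not a real Adobe class.

```python
import re

# Matches declarations such as "@interface Foo : NSObject <Bar>" in class-dump output.
# Categories ("@interface Foo (Extras)") have no superclass and are skipped on purpose.
INTERFACE_RE = re.compile(r"^@interface\s+(\w+)\s*:\s*(\w+)", re.MULTILINE)

def declared_classes(dump_text):
    """Return {class_name: superclass_name} for every class declared in the dump."""
    return dict(INTERFACE_RE.findall(dump_text))

def looks_stripped(dump_text):
    """Flag a dump that contains no Objective-C class declarations at all."""
    return not declared_classes(dump_text)

# Hypothetical class-dump output, for illustration only.
sample_dump = """
@interface CS5FlashView : UIView
{
    id _stage;
}
- (void)renderFrame:(id)arg1;
@end

@interface AppDelegate : NSObject <UIApplicationDelegate>
- (void)applicationDidFinishLaunching:(id)arg1;
@end
"""

print(declared_classes(sample_dump))
print(looks_stripped(sample_dump))
```

An empty result doesn’t prove anything by itself, but it is a cheap first signal that a binary deserves a closer look.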

Step 2: Search class-dump output.

Look for suspicious symbols. Because anything output by a source-generating engine is, um, source, you can see exactly what pops out of a typical compile. It isn’t as though Apple can’t afford a copy of CS5. You can search for “CS5FlashView”[1] or any of the myriad classes and symbols that would be put into the boilerplate output. Getting past this check requires something far more devious, as suggested by rentzsch: polymorphic code.
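The search itself could be about this simple. This is a minimal sketch, not anything Apple is known to run: the blacklist is hypothetical (both names are invented for illustration), and it assumes the dump is available as plain text.

```python
import re

# Hypothetical blacklist, harvested from dumps of apps known to be generated.
KNOWN_BOILERPLATE = {"CS5FlashView", "CS5FlashStage"}

def identifiers(dump_text):
    """Collect every identifier-like token that appears in the dump."""
    return set(re.findall(r"[A-Za-z_]\w*", dump_text))

def flagged_symbols(dump_text, blacklist=KNOWN_BOILERPLATE):
    """Return the blacklisted names found in the dump, sorted for stable output."""
    return sorted(identifiers(dump_text) & blacklist)

generated = "@interface CS5FlashView : UIView\n- (void)renderFrame:(id)arg1;\n@end\n"
handmade = "@interface MyView : UIView\n- (void)draw;\n@end\n"

print(flagged_symbols(generated))
print(flagged_symbols(handmade))
```

Building the blacklist is the only real work, and it amounts to generating one app and dumping it.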

Step 3: Search for patterns.

Suppose Adobe spent a good chunk of time creating an Objective-C polymorphic coder. It could then distribute a code generation engine that produces binaries that look like they are built on a home-made animation and application framework. But wait: if all the binaries contain the same obfuscated code, that is trivial to detect. Just look for “ER0GlashView.” Getting past that requires a nondeterministic Objective-C polymorphic coder. A perfect Ph.D. topic, but nothing that’s been done to my knowledge. If it existed, this contest would be utterly pointless.

And you’re not just dealing with hiding a 200 line C function. Imagine randomizing the following:

  • Class Names

    If you produce class names like “QTi3uigejr” dictionary attacks can trivially red flag programs.

  • Symbols

    It isn’t enough to generate random symbol names. You have to randomize the structure of your classes so the signatures of symbols don’t resemble the original versions. It would be fairly easy to create a JavaScript-like obfuscator that converts your class names and symbols to meaningless letter combinations. Now imagine running that on two trees of headers output by class-dump: one from a known Flash-generated app and the other from an unknown app. A significant chunk would turn out to be exactly the same, because the headers would be distilled into their underlying structure.

  • Class hierarchy

    OK, so now you use perfectly hidden class names, but every class hierarchy produces a recognizable fingerprint for a library. Without knowing anything about the classes, you can see the relative relationships between them. The bigger the code base, the easier this is. I’d assume the Flash support structure is fairly large and its interdependencies very complicated.
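To make the last two points concrete, here is a small sketch of a name-blind fingerprint. It throws away every class and method name and keeps only structure: how deep each class sits below the framework classes, how many direct subclasses it has, and the shape of its methods (instance or class method, return type, argument count). Everything here is an assumption for illustration: the dump format, the made-up sample headers, and the choice of a multiset overlap as the similarity score. The regexes handle only simple declarations.

```python
import re
from collections import Counter

INTERFACE_RE = re.compile(r"@interface\s+(\w+)\s*:\s*(\w+)(.*?)@end", re.DOTALL)
METHOD_RE = re.compile(r"^([+-])\s*\(([^)]*)\)([^;]*);", re.MULTILINE)

def parse(dump_text):
    """Return {class_name: (superclass_name, [method_shape, ...])}."""
    classes = {}
    for name, parent, body in INTERFACE_RE.findall(dump_text):
        # A method shape ignores the selector's spelling and keeps only its arity.
        shapes = sorted(
            (kind, ret.strip(), selector.count(":"))
            for kind, ret, selector in METHOD_RE.findall(body)
        )
        classes[name] = (parent, shapes)
    return classes

def fingerprint(dump_text):
    """Multiset of (depth, direct subclass count, method shapes), with no names."""
    classes = parse(dump_text)
    children = Counter(parent for parent, _ in classes.values())

    def depth(name):
        # Count app-defined ancestors until we leave the dump (e.g. reach UIView).
        d = 0
        while name in classes:
            name = classes[name][0]
            d += 1
        return d

    return Counter(
        (depth(name), children[name], tuple(shapes))
        for name, (_, shapes) in classes.items()
    )

def similarity(dump_a, dump_b):
    """Multiset Jaccard overlap between two fingerprints, from 0.0 to 1.0."""
    a, b = fingerprint(dump_a), fingerprint(dump_b)
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0

# Hypothetical headers: a known generated app, and the same code with names scrambled.
known = """
@interface CS5FlashView : UIView
- (void)renderFrame:(id)arg1;
- (id)stage;
@end
@interface CS5FlashButton : CS5FlashView
- (void)press;
@end
"""
unknown = """
@interface QTi3uigejr : UIView
- (void)zzq:(id)arg1;
- (id)kfw;
@end
@interface Xb81 : QTi3uigejr
- (void)mmp;
@end
"""

print(similarity(known, unknown))
```

Renaming alone leaves the two fingerprints identical, which is the whole problem: the obfuscator has to change structure, not just spelling.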

Prohibitively hard.

Let’s say you find an awesome little UIView subclass, but Apple has banned anyone from using it on the store. It wouldn’t be too hard to get around. Copy and paste it, rename the class, change a few method names around, etc. You would then have plausible deniability and could say it wasn’t the banned UIView.

Imagine me asking you to incorporate three20 into your app and obfuscate it in a way where I couldn’t tell you were using it. And assume that, because the code generator produces deterministic output, I have access to all your source code.[2] How hard would that be? You aren’t just dealing with changing the names of things. You are tasked with changing the fundamental structure of internal APIs while preserving assumptions that were undoubtedly made when making the code work with itself.

Now imagine making a program that could produce unique obfuscations of three20, each in a way that made it look from the outside like it was redesigned and rewritten from scratch by a human.[3] Brain. Explode.

Consequences.

What if Adobe can successfully hide its entire Flash subsystem, and so it debuts CS5.1 with the feature turned back on? Hundreds of apps get onto the store by tricking Apple’s gatekeepers. What do you think will happen when Apple figures out a way to detect with 99% accuracy that an app was generated with Flash?[4] Yep, they’ll pull those apps faster than Brian J. Hogan picked up that phone.

The point is, there is no reason for Adobe to take that chance with their customers.

  1. This class name is just an example. It’s anyone’s guess as to how Adobe would need to implement this.
  2. One could just generate a ‘Hello World’ program and see, for example, that all Flash views inherit from some base class.
  3. Not to mention you couldn’t use any shared resources (images, plists, etc.) without obfuscating those.
  4. And for all the examples given, the detection takes a fraction of the engineering effort of the obfuscation. It’s only reasonable to assume that detection will take less effort than generation.