Sckuz

Recompilation help, please


Hi!
 
   Several years ago, my best friend (AT THE TIME!) kept only the executables when I left for college (the University of Wyoming). The executables are attached to this post. If they could be decompiled, if that's even possible, that would be a godsend!
 
    Basically I'm willing to start over with a really old game named "Force Disruptor". Its current title is "Disrupted Force". Ha-ha, "Force Disruptor" was the first thing that rolled off my tongue when asked to name it. I liked it and stuck with it.
 
   Attached is my game LIBRARY! Not the code to the games, just the games themselves! 
 
   GrrrRrrRrRR(grinds teeth out of frustration).
 
    I've recently re-ignited my desire for a de-compiler. Basically, I'm unsure where to post. Can you help me out by suggesting a place to post or a link to a decompiler? Also attached is the executable I've mentioned before.
 
   Thank  you!


Decompile it to what language? Unfortunately, executables don't contain messages like "oh btw i was written in Java fyi kthxbai" in their header.

 

If the executable is compiled to machine code you're basically screwed.

 

If it's compiled to byte-code (Java, C#, Basic?) then there may be a tool that will allow you to decompile it again.
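
A rough way to start telling the two cases apart is to look at the first few bytes of the file. The sketch below is only a first guess, not a language detector: the helper names are made up for illustration, and a Windows .NET assembly starts with the same "MZ" bytes as a native executable, so that case needs a closer look with other tools.

```python
# Hypothetical helper: guess a file's container format from its first bytes.
# Longer signatures are listed first so they are checked before shorter ones.
MAGIC_SIGNATURES = [
    # Note: these four bytes are also used by Mach-O universal binaries.
    (b"\xca\xfe\xba\xbe", "Java class file (bytecode)"),
    (b"PK\x03\x04", "ZIP archive (possibly a Java .jar)"),
    (b"\x7fELF", "ELF executable (usually native code)"),
    (b"MZ", "DOS/Windows executable (native code or a .NET assembly)"),
]


def guess_container(header: bytes) -> str:
    """Return a best-guess description for the leading bytes of a file."""
    for magic, description in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return description
    return "unknown"


def guess_file(path: str) -> str:
    """Read the first few bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return guess_container(handle.read(8))
```

Calling `guess_file` on one of the attached executables would at least say whether a bytecode decompiler is worth trying.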

Edited by TheComet


If it's compiled to byte-code (Java, C#, Basic?) then there may be a tool that will allow you to decompile it again.

 

Even then, you're very unlikely to get the original code.

 

Decompiling is a useful way to reverse engineer a program to get an idea of what it's doing, but you would have to rewrite what it gives you anyway, so it's probably easier to restart from scratch. Especially if you have progressed as a programmer since you first wrote the code.


I too recommend starting over. Since it was several years ago, you're probably a better programmer now. Not to mention the advances in the industry with SDKs and game libraries.

 

I'm a little bit confused. I'm guessing you wrote the code and your friend had the executables? And your friend kept only the executables? Or your friend has the code and he's no longer your friend? If it was several years ago, why not get back in touch with him and ask him for it?

 

If you're dead set on decompiling the code (again, probably not very useful) hit Google. Type decompile <language> exe. 

 

- Eck


You can make hamburger from a cow, but it is a one-way process. You cannot recover the cow from a hamburger.

 

Compilation is also a one-way process.  Much information is discarded or otherwise lost in the process.

 

Code gets optimized away, duplicates get merged, dead code is removed, names are lost, loops are unrolled or fused or distributed, functions inlined, code is reordered, processes are parallelized and vectorized, data is prefetched, and many other transformations happen.  Comments are lost forever. File names are gone forever. Anything that was conditionally compiled out is gone forever. Preprocessor directives are likely gone forever. The names of most constants are likely gone forever. Nearly always your template classes and template functions are gone forever. 
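
A small taste of this can be seen without a C++ toolchain. The sketch below uses CPython's built-in `compile` as a stand-in for a real optimizing compiler (an assumption made purely for illustration; it does far less than a C++ optimizer): the comment never reaches the compiled object, and the spelled-out arithmetic is replaced by its folded result.

```python
source = '''
# Number of seconds in a day, spelled out so a reader can check it.
seconds_per_day = 60 * 60 * 24
'''

# Compile the source text to a CPython code object (bytecode plus constants).
code = compile(source, "<example>", "exec")

# The folded result is stored as a constant of the compiled code ...
folded_value_present = 86400 in code.co_consts

# ... while the comment text is dropped before compilation even starts.
comment_present = any(
    isinstance(const, str) and "spelled out" in const
    for const in code.co_consts
)

print("folded constant present:", folded_value_present)
print("comment text present:", comment_present)
```

Nothing in the compiled object says the value was once written as 60 * 60 * 24, which is the same kind of loss on a very small scale.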

 

Tools like C4Decompiler and Boomerang exist that can help recover some C++ code, but the results are not directly usable. Most function names and class names will be gone, almost all variable names will be gone, file names will be gone, and --- most significantly --- the results will generally not compile by themselves; too much information is lost in the process. 

 

Languages that compile to an intermediate byte code like Java or C# have more information available to them and can generally recover more of the structure, but even those generally have horrible luck at recovering anything beyond the most trivial toy programs. Depending on compiler settings and optimization levels you might get sections of the code back, but much code and many names will be missing.
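
To illustrate what "more information available" means, here is a sketch using CPython bytecode as a stand-in for Java or C# (an assumption for illustration only; the function is hypothetical and those formats differ in detail). The compiled function still carries its own name and the names of its parameters and locals, which is exactly what a native executable normally lacks.

```python
def apply_damage(hit_points, damage):
    # Clamp at zero so a unit never ends up with negative health.
    remaining = hit_points - damage
    return max(remaining, 0)


# The compiled form of the function, as stored by the interpreter.
code = apply_damage.__code__

print(code.co_name)      # name of the function itself
print(code.co_varnames)  # parameter names followed by local variable names
print(code.co_names)     # global names the function refers to
```

The comment inside the function is not stored anywhere in `code`, so even in this friendly case a decompiler has nothing to recover it from.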

Even then, you're very unlikely to get the original code.

 

You'd be surprised what you can get back from bytecode, actually (I'm talking Java). Have you ever decompiled Minecraft? The entire source code was readable, with comments.

 

But yes, compiled code is basically irreversible.

Edited by TheComet


You can make hamburger from a cow, but it is a one-way process. You cannot recover the cow from a hamburger.


This is an invalid analogy. Compilation and decompilation are both similar to translating from English to Japanese and back to English - the semantics are never lost when the translator functions properly but the words will change and may become extremely awkward to read depending on the quality of the translation program.

Native decompilation loses function/variable names, but *those are only meaningful to humans*. The code itself - no matter what representation it has - is the full, lossless description of what the program does. Function/variable names are only shortcuts that allow humans to understand what's going on more quickly. Humans can read raw decompiled code and use their IDE to iteratively refactor variable/method names bottom-up to restore the codebase to human-readable form. It's extremely time consuming but is *always* possible.
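
As a minimal sketch of that claim (both functions are invented for illustration, and the placeholder naming style is an assumption about typical decompiler output), here is the same logic once with human names and once with the kind of names a tool would make up:

```python
def average_speed(distance_km, hours):
    """Original, human-friendly version."""
    if hours == 0:
        return 0.0
    return distance_km / hours


def sub_401000(a1, a2):
    """The same logic with placeholder names only, as a tool might emit it."""
    if a2 == 0:
        return 0.0
    v1 = a1 / a2
    return v1


# Both versions agree on every sample input: behaviour survives, names do not.
samples = [(120, 2), (0, 5), (42, 0), (7.5, 0.5)]
assert all(average_speed(d, h) == sub_401000(d, h) for d, h in samples)
```

Renaming `sub_401000`, `a1`, `a2` and `v1` back to something meaningful is the slow, manual part.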


I've had some very good luck using Reflector to decompile some .NET assemblies from third-party vendors that don't believe in documentation.

If it is native code, however, you're probably hosed.  I haven't tried using a C/C++ decompiler in a few years, but the last time I did, the code it produced came out looking like minified JavaScript...

Edited by ericrrichards22


You can make hamburger from a cow, but it is a one-way process. You cannot recover the cow from a hamburger.

This is an invalid analogy. Compilation and decompilation are both similar to translating from English to Japanese and back to English - the semantics are never lost when the translator functions properly but the words will change and may become extremely awkward to read depending on the quality of the translation program.

Native decompilation loses function/variable names, but *those are only meaningful to humans*. The code itself - no matter what representation it has - is the full, lossless description of what the program does. Function/variable names are only shortcuts that allow humans to understand what's going on more quickly. Humans can read raw decompiled code and use their IDE to iteratively refactor variable/method names bottom-up to restore the codebase to human-readable form. It's extremely time consuming but is *always* possible.

Being possible is an academic exercise. While it is true that there is a high level construct that generated the compiled code, that does not mean that you can recover the high level structure by looking at the low level output. Perhaps if we were in a computer theory class where we allow infinite time and infinite storage it is possible to recover code that would generate the same binary. But right now today no such tool exists.

Compiled code can be turned into equivalent C or C++ code... except for code that can't. Many times when code is vectorized, or parallelized, or restructured, or has had other heavy optimizations done on it, there is no directly equivalent high level code. Sure you can look at the disassembled code and try to work it out, but even the best decompilers out there today struggle with even the simple optimizations. Link-time code generation is another fun optimization trick where the end result is that some LTCG instructions cannot be traced back to any high level construct. Then there is the fun of instructions and assembly-level constructs that don't even exist in the higher level languages: there are many pieces of assembly code that have no C or C++ analogue. There is no C++ code equivalent if the compiler gets creative with 128-bit, 256-bit, or even 512-bit registers. There is no C++ code for prefetching data or manipulating FPU controls and FPU exceptions, or switching to SSE3 or AVX or other CPU states. There isn't even C++ code for a breakpoint instruction that has existed since long before C was even around. These perfectly valid instructions get used in compiled and optimized code but just do not exist in C or C++. Yes it is theoretically possible to find a high level construct that can be optimized to match the compiled code, but there is no constructive way of doing it; so good luck on that.

Unfortunately, as I mentioned in my post above, there is also much code that gets eliminated and removed, never making it into the binary. If it isn't in the binary there is no way it can be recovered through decompilation. When the pre-processor excludes a block of code that bit of code cannot be recovered through decompilation because it doesn't exist in the binary. When the optimizer removes something as dead code -- such as an inline function that has a branch that won't be used -- the decompiler cannot identify the code because it doesn't exist in the binaries. You cannot recover code ex nihilo; nothing exists to be recovered.

So yes, while parts of it may be recovered, and in theory a decompiler could attempt infinitely many possible code combinations in an attempt to reverse the code that made it into the binary, there is no such tool today that can do it.

The tools that do exist today can isolate bits and pieces of code, and can sometimes recover interesting portions of code, but in practice today's tools when run on optimized code often just extract small, difficult to use code fragments that will not properly run by themselves.


Decompilation is one of my obsessions, which is why I'm participating in the thread.  It's my opinion that challenges on the scale of native decompilation should be approached with a serious intent to implement a working solution.  I've been writing my own x86-64 decompiler over the past several years and it irks me to see people abruptly declaring the problem impossible.

 

Code which cannot be turned into equivalent C code such as FPU/Debug register/Control register manipulation can be left as inline assembly, intrinsics, or identified as the specific library functions they came from.  These instructions are extremely rare in actual programs and not worth doing anything special about.

 

SIMD instructions can be replaced with multiple-step non-SIMD equivalents (replacing one instruction with several simpler intermediate steps is something most decompilers I'm aware of do before they begin dataflow analysis - it's what I use in mine anyway).
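
A toy model of that lowering step, assuming a 4-lane packed add in the style of SSE's `addps`. Python lists stand in for 128-bit registers and the function names are invented for this sketch; it is not how any particular decompiler is written.

```python
def packed_add(dst, src):
    """Model of one 4-lane packed add: all lanes are added in a single step."""
    return [d + s for d, s in zip(dst, src)]


def packed_add_scalarized(dst, src):
    """The same operation expanded into four independent scalar steps."""
    lane0 = dst[0] + src[0]
    lane1 = dst[1] + src[1]
    lane2 = dst[2] + src[2]
    lane3 = dst[3] + src[3]
    return [lane0, lane1, lane2, lane3]


xmm0 = [1.0, 2.0, 3.0, 4.0]
xmm1 = [10.0, 20.0, 30.0, 40.0]

# The one-instruction form and the expanded form produce the same lanes.
assert packed_add(xmm0, xmm1) == packed_add_scalarized(xmm0, xmm1)
```

Once the packed operation is split into per-lane steps, ordinary dataflow analysis can treat each lane like any other scalar value.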

 

Code which does not exist in the EXE obviously cannot be recovered, so I'm not sure why it's even an issue.  Surely you wouldn't #ifdef out enough code that the program doesn't work anymore.

 

Decompilers would never attempt "infinitely many possible code combinations".  Round-tripping (machine code -> high level code -> identical machine code) was never the goal (hence my English->Japanese->English mangling analogy).

 

Native decompilers which produce ready-to-compile code generally don't exist because there isn't enough return on investment.  Companies use source control and backups.  Hobbyists learn from their mistake the first time they lose their source.  People debugging without source code, analyzing code for malware, or hacking are typically skilled enough to do so at the disassembly level.

 

Bytecode decompilers are very common because they are almost trivial to write - you skip the entire disassembly step (IMO the hardest part to get right in the whole automated decompilation process) and start with IR which often includes type information, function and variable names.  Even if the market still doesn't really exist for them, they are so trivial to write that people do it because it's interesting and doesn't scare them off like native decompilation does.
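
To give a feel for why, here is a minimal sketch of a decompiler for a made-up stack-machine IR. The opcode names and the tuple layout are assumptions invented for this example, not any real bytecode format. Because the IR already carries variable names, rebuilding expressions is just a matter of simulating the stack with strings.

```python
# Toy stack-machine IR; the opcode names are invented for this sketch.
BINARY_OPERATORS = {"add": "+", "sub": "-", "mul": "*"}


def decompile(instructions):
    """Turn a list of (opcode, operand...) tuples into source-like statements."""
    stack = []       # holds expression text instead of runtime values
    statements = []
    for opcode, *operands in instructions:
        if opcode == "load":
            stack.append(operands[0])        # push a variable name
        elif opcode == "const":
            stack.append(repr(operands[0]))  # push a literal
        elif opcode in BINARY_OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {BINARY_OPERATORS[opcode]} {right})")
        elif opcode == "store":
            statements.append(f"{operands[0]} = {stack.pop()}")
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return statements


program = [
    ("load", "base_damage"),
    ("load", "bonus"),
    ("add",),
    ("const", 2),
    ("mul",),
    ("store", "total_damage"),
]

print("\n".join(decompile(program)))
```

A native decompiler has to do all of the work of getting to something like `program` first (disassembly, finding function boundaries, inventing names), which is where most of the difficulty lives.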

Edited by Nypyren
