Embed a .exe inside of a .exe then execute it.

This topic is 3577 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

My chess program depends on several utility programs to handle opening books, tablebase compression, and other tasks. These utility programs are separate .exe files. I want to distribute my complete program as a single .exe file (call me oldschool. or crazy.) My chess engine runs on Windows and is written in MSVC++. I've embedded other file types before. The thing I don't know how to do is embed an executable, load it into memory, then run it. I've seen C# examples that just write the embedded executable to disk and then ShellExecute it, but I'm thinking that in C++ I can do better. I might be looking for a piece of inline asm that just points the program counter at an in-memory buffer. Googling has only brought up crap - I guess I don't know the right keywords. Any help greatly appreciated.

Just use ShellExecute. Running an executable is a lot more complicated than just dumping it into memory and jumping to the entry point. In particular, you need to apply base relocations, load the DLLs it depends on, and set up the import address table. All of these things are handled by the Windows loader. Furthermore, you would probably want to run your other executable in its own address space, which I doubt you can even set up from user mode without going through the normal process-creation APIs.

There isn't a simple ShellExecuteFromMemory function.

Dump to disk and ShellExecute or CreateProcess it. As mentioned, you cannot start a separate process from an in-memory image in user mode; at best you could load the code and run it as a thread inside your own process.

If you don't want that, you'll have to change the separate exes into libraries or DLLs and load them instead.
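
A minimal sketch of the "dump to disk" half of this suggestion, in portable C++17. The function name extract_to_temp and the file_name parameter are made up for illustration, and the bytes are assumed to be in memory already (from a resource or otherwise). Launching the resulting file is left to CreateProcess or ShellExecute and is not shown here.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Writes an embedded executable image to a file in the temp directory and
// returns its path. `file_name` is a hypothetical name chosen by the caller.
std::filesystem::path extract_to_temp(const std::vector<std::uint8_t>& image,
                                      const std::string& file_name) {
    const std::filesystem::path path =
        std::filesystem::temp_directory_path() / file_name;

    // Binary mode matters on Windows: text mode would rewrite newline bytes.
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        throw std::runtime_error("cannot create " + path.string());
    }
    out.write(reinterpret_cast<const char*>(image.data()),
              static_cast<std::streamsize>(image.size()));
    out.close();  // flush and release the handle before anything launches it
    if (!out) {
        throw std::runtime_error("failed to write " + path.string());
    }
    return path;
}
```

The file has to be closed before it is launched, and deleting it once the child process has finished is the caller's job.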

Quote:
Original post by Telamon
My chess program depends on several utility programs to handle opening books, tablebase compression, and other tasks.

These utility programs are separate .exe files. I want to distribute my complete program as a single .exe file (call me oldschool. or crazy.)


You're old school and crazy :) This is just wrong!

Create libraries to do these tasks. Your utility programs and your main application should link against these. You may not even need the utility programs in the end as the functionality will be available in the main application via the libraries.

Quote:
Googling has only brought up crap - I guess I don't know the right keywords.


Or, it *may* be because what you're trying to do isn't particularly sensible and you've created a problem for yourself that everyone else has avoided by doing things normally.

Quote:
Original post by Telamon
I want to distribute my complete program as a single .exe file (call me oldschool. or crazy.)
That's not oldschool, or crazy.
Quote:
My chess program depends on several utility programs to handle opening books, tablebase compression, and other tasks.
That's not oldschool or crazy either. But it is a bad idea.

If you insist on embedding these programs, you might want to have a look at this site: http://www.joachim-bauch.de/tutorials/load_dll_memory.html/en

The article and the library are about loading DLLs, not programs, but it should work anyway (no big difference).
After having loaded the data into memory (from a resource or whatever) and calling MemoryLoadLibrary successfully, MemoryGetProcAddress on main() should do the trick (or so I hope).

Quote:
Original post by samoth
The article and the library is about loading DLLs, not programs, but it should work anyway (no big difference).

You have no idea what you're talking about.
Quote:

After having loaded the data into memory (from a resource or whatever) and calling MemoryLoadLibrary successfully, MemoryGetProcAddress on main() should do the trick (or so I hope).

1) main() is not the entry point of an executable. Jumping to main() would skip the runtime startup code, including the static initialization that the C and C++ libraries need in order to function properly.
2) main() isn't even guaranteed to exist.
3) If you did jump to the actual entry point rather than main(), the static initialization it performs could clobber the static initialization already performed by the host program, putting the host program into an unstable state.

Quote:
Original post by samoth
After having loaded the data into memory (from a resource or whatever) and calling MemoryLoadLibrary successfully, ...


This would almost certainly not be successful, as the preferred load address is usually the same for all EXEs (0x400000 is the default) and EXEs are usually linked without relocation information (since they don't normally need it).
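
To see whether a given EXE falls into this category, one can inspect its headers. Below is a rough sketch in portable C++17 that reads the preferred image base and the "relocations stripped" flag from a PE file that has already been read into a byte buffer. The names PeLoadInfo, read_le and inspect_pe are invented for this example; the offsets follow the published PE/COFF layout, and the code only bounds-checks what it reads rather than validating the whole file.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// The two header facts that decide whether an image can load elsewhere.
struct PeLoadInfo {
    std::uint64_t image_base;  // preferred load address
    bool relocs_stripped;      // true if relocation info was removed
};

// Reads an n-byte little-endian unsigned integer at `off`, bounds-checked.
static std::optional<std::uint64_t> read_le(const std::vector<std::uint8_t>& b,
                                            std::size_t off, std::size_t n) {
    if (off > b.size() || n > b.size() - off) return std::nullopt;
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v |= static_cast<std::uint64_t>(b[off + i]) << (8 * i);
    return v;
}

std::optional<PeLoadInfo> inspect_pe(const std::vector<std::uint8_t>& file) {
    // DOS header: "MZ" magic; the 4 bytes at 0x3C give the PE header offset.
    if (read_le(file, 0, 2) != 0x5A4Du) return std::nullopt;
    const auto pe = read_le(file, 0x3C, 4);
    if (!pe) return std::nullopt;
    const std::size_t nt = static_cast<std::size_t>(*pe);
    if (read_le(file, nt, 4) != 0x00004550u) return std::nullopt;  // "PE\0\0"

    // The 20-byte COFF header follows the signature; Characteristics is
    // its last field, at offset 18.
    const auto characteristics = read_le(file, nt + 4 + 18, 2);
    // The optional header starts right after the COFF header.
    const std::size_t opt = nt + 4 + 20;
    const auto magic = read_le(file, opt, 2);
    if (!characteristics || !magic) return std::nullopt;

    std::optional<std::uint64_t> base;
    if (*magic == 0x10B)      base = read_le(file, opt + 28, 4);  // PE32
    else if (*magic == 0x20B) base = read_le(file, opt + 24, 8);  // PE32+
    if (!base) return std::nullopt;

    constexpr std::uint64_t kRelocsStripped = 0x0001;  // IMAGE_FILE_RELOCS_STRIPPED
    return PeLoadInfo{*base, (*characteristics & kRelocsStripped) != 0};
}
```

If relocs_stripped comes back true, the image can only run at image_base, so two such EXEs with the same base cannot share one address space.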

If it really needs to be a suite of separate applications, isn't this exactly where COM is applicable? Or, if you wanted to be more modern, you could use .NET Remoting or WCF (for C# or managed C++). Or you could use web services, although personally I think that's an ugly solution.

COM is not a simple technology to use though, I'd probably prefer to use one of the newer .NET technologies if I could.

Quote:
Original post by SiCrane
You have no idea what you're talking about.
Thank you very much for such a kind answer. This kind of comment always cheers me up, especially if it comes from a staff member.

Having said that, you can have a look at http://www.savefile.com/files/1373408 where you will find two mini programs that I've hacked together in a few minutes (source code and binary). One program loads the other using said library (with the 3-4 lines of code that won't work on an EXE stripped out); the other program initialises and accesses some static global variables, and calls functions from two different implementations of singletons, all of which works as expected.
I'm sure you can construct an example where this will fail, but hey, don't bother.

I was trying to give the OP a pointer to something that might just work for him and might just do what he wants. Seeing that other solutions, like using dynamic libraries or including the functionality in the main program, are apparently not viable for him, this looked like constructive advice. I apologize for being so ignorant.

Quote:
Original post by samoth
Thank you very much for such a kind answer. This kind of comment always cheers me up, especially if it comes from a staff member.

Well, he wasn't entirely incorrect, though. You don't seem to fully understand the implications of the technique you're describing.

Of course you can load an EXE as a DLL; they have almost the same file format. However, this doesn't mean that it will work in the general case. Manually 'simulating' LoadLibrary on an EXE, as you described, is very dangerous and can (will) lead to a lot of extremely obscure anomalies and bugs if applied to any non-trivial C++ executable. Relocation is one big problem if you try to load two or more non-relocatable EXEs with the same base address at the same time. Then, the runtime init routines might simply not be suited for it, because they weren't designed with loading several EXEs simultaneously in mind. What about non-trivial static constructors? Or exceptions? Or thread-local storage?

Not to mention the fact that said library code heavily relies on undocumented behaviour, gained by trial and error ("XXX: is it correct to commit the complete memory region at once? calling DllEntry raises an exception if we don't...").

I wouldn't touch such things with a ten-foot pole when writing any kind of real-world or production-level code.

Quote:
Original post by samoth
Quote:
Original post by Jerax
preferred address for all EXEs is usually the same (0x400000 is the default), and EXEs are not relocatable (since they don't need to be).
That's a valid argument, but not necessarily a problem. The linker can set the image base.

Which means that you need to specifically link every one of your little independent utilities with a different, non-conflicting base address... Ugh.

If you want to distribute a multi-file program as a single file, the proper thing to do is use an archive format such as zip. If you absolutely must allow people to use it without having any idea what archive files are, then provide a self-extracting file as created by any number of archive programs.

Quote:
Original post by samoth
Having said that, you can have a look at http://www.savefile.com/files/1373408 where you will find two mini programs that I've hacked together in a few mins (source code and binary). One program loads the other using said library (stripped out 3-4 lines of code that won't work on exe), the other program initialises and accesses some static global variables, and calls functions from two different implementations of singletons, all of which works as expected.
I'm sure you can construct an example where this will fail, but hey, don't bother.


I can also construct an example where a function that returns an address of a local buffer appears to work. That doesn't mean returning addresses of local variables is going to work in the general case. Demonstrating this technique with a trivial application loading another trivial application is worthless.

Consider what happens when main() returns control to mainCRTStartup() in MSVC-compiled applications. One of the things that happens is that the function doexit() is called. Among other things, this function calls every function registered with _onexit() and atexit(). It doesn't do so conditionally. It doesn't check whether a different program loaded the one that is exiting. It goes ahead and calls the functions, because there's no reason to do any checks when the program is ending. Boom. A bunch of cleanup functions in the host program have now been called, leaving the host process in an unstable state.

Of course, that may not matter, since unless the IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR field of the PE header is filled, one of the functions that gets called by doexit() in the client application happens to be ExitProcess(), which will terminate the host process. This, of course, renders the question of the host process's data corruption entirely academic.

This actually depends on whether or not the host process has the IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR field filled, since the loaded program checks this through a call to GetModuleHandle(NULL), assuming that GetModuleHandle(NULL) will return a handle to itself and not some other foreign module.

This isn't the only place where MSVC's CRT uses GetModuleHandle(). Another is the HINSTANCE that gets passed as the first argument to the WinMain() function. Which, of course, means that a loaded program is going to have some identity issues. I sure hope it doesn't do something silly like ... I don't know ... have a menu. Or contain dialogs.

Just because exe and DLL files share a format doesn't mean there's "no big difference". One is intended to be loaded into the address space of a running process. The other is intended to be run in a virgin virtual address space. If you don't think that's a big difference then you have no idea what you're talking about. It's really that simple.

Ok, ok. Some good info here. I'm looking at Molebox.

I don't have the source for the other utilities.

Since I'm curious now, for the sake of argument let's say I'm nuts and have lots of time on my hands. How would I write a ShellExecuteFromMemory function? My assumption is I'll need to set up a user-mode in-memory image of the executable and then use kernel-mode code to allocate a new process space and thunk the whole thing in.

Is that the general gist?

Quote:
Original post by Telamon
Since I'm curious now, for the sake of argument let's say I'm nuts and have lots of time on my hands. How would I write a ShellExecuteFromMemory function? My assumption is I'll need to set up a user-mode in-memory image of the executable and then use kernel-mode code to allocate a new process space and thunk the whole thing in.

Is that the general gist?

The general gist is that you shouldn't ever even think about trying to do that. And if you attempt it on Windows you will almost certainly be screwed.

On *nix this is much more feasible, since we have nice syscalls like fork() and exec().
(Can one of the exec variants be used to load from memory? Should it be used like that? I don't know.)

It's not really a case of whether it's practical, or whether anyone should do it. I want to know if it is possible, and my feeling is that it is.

I've been thinking about it for a day or two and I had an idea.

Can you create memory-mapped files in Windows? I could just copy the embedded exe to a memory-mapped file and call ShellExecute from there. That wouldn't be too nasty. That's probably how MoleBox does it.

Quote:
Original post by d000hg
If it really needs to be a suite of separate applications, isn't this exactly where COM is applicable? Or if you wanted to be more modern you could use .Net Remoting, or WCF (for C# or managed C++). Or you could use web-services although personally I think that's an ugly solution.

COM is not a simple technology to use though, I'd probably prefer to use one of the newer .NET technologies if I could.


Ugh, the mention of COM is utter sacrilege. Yes, you can use COM to launch another process that you can communicate with. I think, in general, you shouldn't, unless you're prepared to accept all of the other restrictions that come with it. COM just complicates the picture for little benefit. If you need to talk between processes, there are plenty of IPC and RPC mechanisms: take your pick. I think the other suggestion, that there's no reason to have things in separate processes, is probably the right approach.

Just assume COM is really a deprecated technology (there's a reason why Microsoft has moved away from it) and leave it at that.

Quote:
Original post by Telamon
Can you create memory-mapped files in Windows? I could just copy the embedded exe to a memory-mapped file and call ShellExecute from there. That wouldn't be too nasty. That's probably how MoleBox does it.


That is identical to writing the executable out to a temporary file and then ShellExec'ing it. Maybe the file never hits the disk surface, but a memory mapped file is just as much a part of the file system as if it were on the disk, so I fail to see the benefit of complicating things...
