
[c++ | asm x64] I think I call a function in the wrong way


Hi,

I must be making a stupid mistake in asm for x64, but my comprehension of the subject is limited -- and I need help.

I'm writing a proxy for a 64-bit dll. My Visual Studio won't let me use inline assembly instructions, and so I've switched to MASM, or ML64 as it's now called (if I'm correct). My dll must initialize before it can act as a proper proxy, but the initialization can't occur in DllMain() or I risk running into Access Violation exceptions (depending on the scenario).

So I moved my initialization code to a safe place, inside the stubs for the functions exported by the proxy. When the host process calls any of the functions of the dll, my proxy first initializes itself, and then serves the request made by the host.
Please look at my code (it's short, I promise), see if you spot the error.

What follows is a bare bones version:

// file: DllMain.cpp

extern "C"
{
    // These are made visible for the .asm translation unit 'Stubs64.asm' (shown below)
    // The linker will match the symbols' names appropriately.
    FARPROC g_fp = NULL;
    void __cdecl InitLibrary (void);
    DWORD g_OneTimeInit = 0;
}


BOOL WINAPI DllMain (HINSTANCE hInst, DWORD dwReason, LPVOID lpReserved)
{
    return TRUE;
}


void __cdecl InitLibrary (void)
{
    /* initialization code - very long - omitted */

    // Make sure we don't come here again.
    g_OneTimeInit = 1;
}

Suppose that the x64 dll exports a function called DoStuff().
Then my proxy dll must export a DoStuff() as well. Here's the stub for it:
; file: Stubs64.asm

.data
extern InitLibrary : proc
extern g_OneTimeInit : dword
extern g_fp : qword

.code
public _DoStuff


_DoStuff proc
    cmp g_OneTimeInit, 0    ; Compare g_OneTimeInit with 0
    jne IsInitAlready       ; Jump to IsInitAlready if g_OneTimeInit is nonzero

    ;-----------------------------------------------------------------------------
    ; 'Microsoft x64 calling convention' specific:
    ; --------------------------------------------
    ; Each function call requires 32 bytes of 'shadow space' saved onto the stack.
    ; This holds true regardless of the signature of the function actually called.
    ;-----------------------------------------------------------------------------
    push rcx
    push rdx
    push r8
    push r9

    call InitLibrary        ; <-- void InitLibrary (void) is inside DllMain.cpp
                            ; InitLibrary() internally sets g_OneTimeInit to nonzero.

    ; Reclaim the 'shadow space' previously reserved for the call.
    pop r9
    pop r8
    pop rdx
    pop rcx

    IsInitAlready:
    jmp qword ptr [g_fp]    ; call the real DoStuff() function.
                            ; g_fp holds the address obtained with GetProcAddress() inside InitLibrary().
_DoStuff endp

end

And finally I have the .def file to tell which functions my proxy must export:
LIBRARY "MyProxy"
EXPORTS
    DoStuff = _DoStuff @1

    ; other functions omitted

The part I'm not sure about is where I use push and pop to wrap the 'call InitLibrary' inside the stub.
I understand that those 4 push instructions amount to 8 bytes each (64-bit registers), and so we have saved 32 bytes on the stack.
And of course each push must be paired with a pop, and in reverse order.

However, this code works in x64 Debug builds only.
It crashes in the x64 Release build. During the library initialization (a long phase, that's why I'm not showing it) I call a function that uses variadic arguments (you know: va_list, va_start, va_arg, va_end). This function call happens almost immediately after entering the InitLibrary() function, hence only a few instructions 'away' from the asm in the _DoStuff stub.
It's an .Append() class method exposed by a custom String class.

Here's the code, just to show what I'm talking about:
    String sChannelName;
           sChannelName.Append (3, String ().Unsigned (::GetCurrentProcessId ()),
                                   String (':'),
                                   sExe); // <-- sExe is a String type as well.

The signature of the .Append() method is: String& String::Append (UINT NumArgs, ...);
The Access Violation is occurring on the Append call, claiming that I'm reading from a 0x0000000000000000 location. This is misleading, I'm sure!

I believe the real problem is elsewhere, because I know the above method is fine. It works correctly in x86 Debug and Release builds, and also on x64 Debug builds. It's crashing now in this x64 Release build because the code optimizer is rearranging the instructions in a way that reveals the issue (issue probably sparked from the _DoStuff asm stub).

Before posting here I have changed my code to force the library initialization to occur inside the very DllMain() entry point.
And the stub for DoStuff() was reduced to a single jmp instruction. Like this:
// file: DllMain.cpp

extern "C"
{
    // These are made visible for the .asm translation unit 'Stubs64.asm' (shown below)
    // The linker will match the symbols' names appropriately.
    FARPROC g_fp = NULL;
}


void __cdecl InitLibrary (void);   // forward declaration; defined below

BOOL WINAPI DllMain (HINSTANCE hInst, DWORD dwReason, LPVOID lpReserved)
{
    if (dwReason == DLL_PROCESS_ATTACH) InitLibrary ();

    return TRUE;
}


void __cdecl InitLibrary (void)
{
    /* initialization code - omitted */
}
; file: Stubs64.asm

.data
extern g_fp : qword

.code
public _DoStuff


_DoStuff proc
    jmp qword ptr [g_fp]    ; call the real DoStuff() function.
                            ; g_fp holds the address obtained with GetProcAddress() inside InitLibrary().
_DoStuff endp

end

With the above code there's no more Access violation on the call to the .Append() method, even in the x64 Release build. And everything works fine. As the only significant difference appears to be the lack of the function call to InitLibrary inside the _DoStuff stub, I believe that my asm code for that is wrong.

Please help me, I'm no expert in assembly.


Feels like I don't understand anything anymore.

I have removed the call to the Append method. Now using an alternative that doesn't involve variadic arguments.

 

The crash has moved elsewhere.

But again the crash only occurs on x64 Release builds -- the x64 Debug builds work perfectly.

How am I supposed to debug a problem that doesn't exist the moment I can look at it???

 

Somebody help - Please!

What's the proper way to call a 'void Func (void)' function in x64 assembly??


Somebody help - Please!
What's the proper way to call a 'void Func (void)' function in x64 assembly??

 

Have you tried using the disassembler and debugger to see what a call looks like?

 

[edit]

Drop a breakpoint on a function call. Run the program. After the debugger stops at the breakpoint, go to Debug > Windows > Disassembly (or press Ctrl+Alt+D).

Edited by Rattrap


Have you tried using the disassembler and debugger to see what a call looks like?

 

[edit]

Drop a breakpoint on a function call. Run the program. After the debugger stops at the breakpoint, go to Debug > Windows > Disassembly (or press Ctrl+Alt+D).

 

No offense, but I don't understand what you say.

 

It's either that, or you haven't read...

 


But again the crash only occurs on x64 Release builds -- the x64 Debug builds work perfectly.

How am I supposed to debug a problem that doesn't exist the moment I can look at it???

 

... that there is no problem whatsoever the very moment I make a Debug build.

I attach a debugger, look at things going, and everything is as is supposed to be: perfect.

 

But then I make a Release build -> and the crash happens.

 

In the x86 builds I could use .IF / .ENDIF macros, and I'd get away with this:

_DoStuff PROC
    .IF (g_OneTimeInit == 0)
        call InitLibrary
    .ENDIF

    jmp dword ptr [g_fp]
_DoStuff ENDP

But in x64 those macros don't exist. And I'm left to write my own assembly to call a stupid 'void InitLibrary (void)' function.

And I don't understand why things are working in x64 Debug (refer to 1st post's code) but NOT in x64 Release. For all I'm reading about x64 Assembly nothing is helping me understand the problem. They talk of registers to preserve, of stack space to reserve, of frame pointers (what the hell is a frame pointer???), and of minding the way the arguments are passed in (but I have no damn arguments it's a void!!). It's the simplest function call ever to exist, and nobody cared to write an actual example for it!

 

All I want is to translate the above snippet to its x64 counterpart. But I can't.

Do you know how it's done?


They talk of registers to preserve, of stack space to reserve, of frame pointers (what the hell is a frame pointer???), and of minding the way the arguments are passed in (but I have no damn arguments it's a void!!). It's the simplest function call ever to exist, and nobody cared to write an actual example for it!

 

 


The part I'm not sure about is where I use push and pop to wrap the 'call InitLibrary' inside the stub.

 

The frame pointer is used to chain stack frames. If you don't understand what that means, you really need to go back to the books on how functions and stacks and stack frames work. That is fairly fundamental to the lowest levels of assembly code.

 

There are several differences between the 32-bit and 64-bit function calls. The ABI is different. It is more than just switching from EBP (a 32-bit register) to RBP (a 64-bit register), but also that a different set of registers must be preserved or may be destroyed, that different registers may have different uses, that certain flags must be set to specific values at specific times.

 

 

The ABI for both the 32-bit and 64-bit code is well documented.  There are several large PDFs about it, you need to read them and have a solid understanding of what they are and how they work for all the different calling conventions and parameter types, an understanding of what must be saved and what is invalidated, and know how to convert from one to the other. 

 

 

This is one of those "If you have to ask on forums you are not ready to do it" items.  The specs are readily available.  

 

Either you can read the specs and understand them, or you cannot.  If you can, then you will have no problems and wouldn't be asking here.  If you cannot, then you need more experience.

 

 

 


It's the simplest function call ever to exist, and nobody cared to write an actual example for it!

That is probably why nobody cared to write an example for it.

 

Go through the calling conventions for your function in x86 and x64 land, it is pretty clear there are going to be problems.  It is instantly obvious that you will break if the 64-bit program uses any memory living beyond the 32-bit limit.

 

Just at a glance comparing it to the notes I can see some other issues.  You are preserving four registers, R8, R9, RDX, and RCX. I note that according to MSDN's notes on the x64 calling conventions, "The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, and R15 are considered nonvolatile and must be saved and restored by a function that uses them."  But looking in your code they are not explicitly saved or restored.  On the other hand, the 32-bit cdecl calling convention will preserve EBX, EBP, EDI, ESI, ESP, CS, and DS.  While the 32-bit code shouldn't be touching the Rxx registers you need to know that they might, so you need to preserve them. Since the memory addresses in 64-bit code may not exist in the 32-bit address space you need to ensure proper virtualization. You don't do either.

 

 

 

So just right there you're not preserving the mandatory sets of variables and you are breaking memory address space requirements.  You've violated the ABI. So of course your program crashes.

 

 

 

That is why I am repeating:  

 

Either you can read the specs and understand them, or you cannot.  If you can, then you will have no problems and wouldn't be asking here.  If you cannot, then you need more experience.

Edited by frob
More clear wording.



No offense, but I don't understand what you say.

 

My sample C++ code.  InitLibrary is set up to prevent inlining.

 
#include <iostream>

__declspec(noinline) void __cdecl InitLibrary(void)
{
    std::cout << "Hi There\n";
}

int main()
{
    InitLibrary();
    return 0;
}

 

x64 Debug Disassembly (using the technique I mentioned above).

 
int main()
{
00007FF6F5C62390 push rbp 
00007FF6F5C62392 push rdi 
00007FF6F5C62393 sub rsp,0E8h 
00007FF6F5C6239A lea rbp,[rsp+20h] 
00007FF6F5C6239F mov rdi,rsp 
00007FF6F5C623A2 mov ecx,3Ah 
00007FF6F5C623A7 mov eax,0CCCCCCCCh 
00007FF6F5C623AC rep stos dword ptr [rdi] 
InitLibrary();
00007FF6F5C623AE call InitLibrary (07FF6F5C6116Dh) 
return 0;
00007FF6F5C623B3 xor eax,eax 
}

 

x64 Release Disassembly

 
int main()
{
00007FF68A401010 sub rsp,28h 
InitLibrary();
00007FF68A401014 call InitLibrary (07FF68A401000h) 
return 0;
00007FF68A401019 xor eax,eax 
}


The ABI for both the 32-bit and 64-bit code is well documented. There are several large PDFs about it, you need to read them and have a solid understanding of what they are and how they work for all the different calling conventions and parameter types, an understanding of what must be saved and what is invalidated, and know how to convert from one to the other.

So you're saying that I should go read a couple tomes, learn how to do proper assembly, implying that I shouldn't be asking for help in a forum.

Honestly, that wasn't nice. At all.

 

[...] On the other hand, the 32-bit cdecl calling convention will preserve EBX, EBP, EDI, ESI, ESP, CS, and DS. While the 32-bit code shouldn't be touching the Rxx registers you need to know that they might, so you need to preserve them. Since the memory addresses in 64-bit code may not exist in the 32-bit address space you need to ensure proper virtualization. You don't do either.

You talk to me as if I know this stuff already. Don't you realize that this is the same mistake made by every one of those that write the tutorials on assembly? Those very same tutorials that I don't understand -- because they assume I'm familiar with the subject _already_? If that was the case why would I be reading a tutorial?
And the same goes for the notes on the x64 ABI -- and by the way, by mere luck I know what A.B.I. stands for. But you just threw the acronym into the discussion, assuming that I knew it _already_. See? The same mistake.
 

 

@ Rattrap:

I understand now. Makes sense.

But still: whatever I write, doesn't work. It crashes invariably. I'm noticing a misalignment with the arguments passed to the function that's crashing. Instead of a pointer to a pointer I'm seeing the address of a local function. And where should be a small integer value I'm seeing what might be a portion of an address. It's as if things were offset to the left or right altogether.


 


 

You are trying to do something in asm that is a fairly advanced affair. If you don't understand asm well enough, I would not even start with DLL loading; I would stick with getting the hang of asm first. Visual Studio can show you what asm is generated for C/C++ code when you are on a breakpoint and tell it to show you the disassembly.

 

Assembly is not a topic you should jump into lightly, and x86 and x64 are not the easiest instruction sets to get to know. You should start learning asm with a simple board like an Arduino or a Raspberry Pi; these chips are simple enough that you can easily understand what is going on in them. From there, moving to x86 and x64 is easier, because you understand the basics of asm by that point.

 

There are some serious tricks going on in some asm code, like moving to only the ah or al register and then doing a cmp on the next line with eax or rax, which require that you know things about the architecture of the chips involved. For example, al/ah, ax, eax and rax all refer to the same register but with different bit sizes, and this is all legacy stuff coming from the shifts from 8 bits -> 16 bits -> 32 bits -> 64 bits. In x64 you will hardly see any x87 FPU instructions any more; these are almost always replaced by SSE2 instructions. I am surprised that the link Nypyren provided didn't mention the LEA instruction, because it is among the most common ones next to those it lists.

Edited by NightCreature83


Eeh, I must be sounding like a spoiled asshole. I won't argue if you think that of me.
But I'm not that bad a person. Really.

And I appreciate the help.

I'm just bitter because I'm being halted by yet another problem that's only the latest in a long streak you haven't seen me deal with. Only, this time it involves assembly, a subject I never delved into.
If this could be solved it'd be the very last brick to complete the wall. Which makes it even more frustrating a problem, 'cause I'm spending several hours on it instead of moving on to the other modules in this project.

I'll read the link posted by Nypyren (<-- thanks for it, man), and then, dunno, we'll see...
