the_edd

[x86 asm] help understanding JMP opcodes

Hi folks, I'm trying to understand a function I've found that writes the x86 opcodes for an absolute jump:
void write_abs_jump(unsigned char *opcodes, const void *jmpdest)
{
    opcodes[0] = 0xFF; 
    opcodes[1] = 0x25;

    *reinterpret_cast<DWORD *>(opcodes + 2) = reinterpret_cast<DWORD>(opcodes + 6);
    *reinterpret_cast<DWORD *>(opcodes + 6) = reinterpret_cast<DWORD>(jmpdest);
}
Here's what I've gleaned so far. Please correct me if any of this is wrong.

I believe 0x25 is a ModR/M byte whose constituent parts are, in binary: mod=00, reg/opcode=100, r/m=101. The reg/opcode part is 4 in decimal, so the instruction is found under "FF /4" in the Intel reference manual. The instruction mnemonic listed for this is "JMP r/m32" and it has a single operand: "ModRM:r/m (r)". The "(r)" in there means that the content of the operand will be read by the processor.

At this point I'm a bit stuck. Specifically, I can't figure out what that first DWORD is for (opcodes + 6). It's a pointer to the memory location that contains the value of jmpdest, but why is it needed? What part of the Intel manual do I have to understand to appreciate its role in the instruction?

There's a table in Intel's manual ("Table 2.2: 32-Bit Addressing Forms with the ModR/M Byte") where I can look up the "effective address" corresponding to the value of the ModR/M byte, which in this case is "disp32". The manual tells me "The disp32 nomenclature denotes a 32-bit displacement that follows the ModR/M byte (or the SIB byte if one is present) and that is added to the index". I haven't got a clue what this part means, but I'm pretty sure I don't have a SIB byte here.

So any help in getting further with this would be very much appreciated!
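
For anyone following along, the field split of the ModR/M byte can be checked mechanically. This is only a minimal sketch: the struct and function names are made up for illustration and do not come from Intel's manual. It assumes 32-bit addressing, where mod=00 with r/m=101 selects the "disp32 only" form (no base register and no index, so there is nothing for the displacement to be added to) and r/m=100 would announce a SIB byte.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; the names are not from Intel's manual.
struct ModRM {
    std::uint8_t mod; // bits 7..6
    std::uint8_t reg; // bits 5..3: register number or opcode extension (the "/digit")
    std::uint8_t rm;  // bits 2..0
};

// Split a ModR/M byte into its three fields.
constexpr ModRM decode_modrm(std::uint8_t byte)
{
    return ModRM{
        static_cast<std::uint8_t>((byte >> 6) & 0x3),
        static_cast<std::uint8_t>((byte >> 3) & 0x7),
        static_cast<std::uint8_t>(byte & 0x7)
    };
}

// 32-bit addressing: mod=00 with r/m=101 means the operand's address is just disp32.
constexpr bool is_disp32_only(ModRM m)
{
    return m.mod == 0 && m.rm == 5;
}

// 32-bit addressing: r/m=100 with a memory form (mod != 11) means a SIB byte follows.
constexpr bool has_sib(ModRM m)
{
    return m.mod != 3 && m.rm == 4;
}
```

For 0x25 this gives mod=0, reg=4, r/m=5, which matches the "FF /4" reading above and confirms that no SIB byte is involved.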

The manual says that you can only do an absolute jump if it's indirect, that is, if the instruction has the form
JMP [eax]
JMP [00405748], etc

So your code sets up the 0xFF opcode, the 0x25 Mod R/M byte, then points to the 4 bytes just past the end of the command as the place to get the absolute location from. Then it writes the absolute location there.

I guess the point is that these 10 bytes can be relocated and be unaffected, whereas the relative form changes depending on where it's executed.
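
To make the 10-byte layout concrete, here is a rough sketch that builds the same bytes into a buffer. It is an illustration under assumptions, not the original function: addresses are passed as plain 32-bit values (so the layout can be inspected on any host and nothing is executed), and make_abs_jump and put_le32 are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Store a 32-bit value in little-endian byte order, as x86 expects.
void put_le32(std::uint8_t *out, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Hypothetical helper: lay out "FF 25 disp32" followed by the pointer slot.
// code_address is where the first byte (0xFF) will live in the target process.
std::array<std::uint8_t, 10> make_abs_jump(std::uint32_t code_address, std::uint32_t jmpdest)
{
    std::array<std::uint8_t, 10> bytes{};
    bytes[0] = 0xFF; // opcode
    bytes[1] = 0x25; // ModR/M: mod=00, reg=/4, r/m=101 (disp32)

    // disp32 is the absolute address of the slot, which here sits right after the instruction.
    put_le32(&bytes[2], code_address + 6);

    // The slot itself holds the absolute address to jump to.
    put_le32(&bytes[6], jmpdest);
    return bytes;
}
```

One thing the layout makes visible: the jump target in the slot does not depend on where the instruction sits, but disp32 is itself an absolute address. If the 10 bytes are moved, disp32 would need rewriting to point at the slot's new location.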

Quote:
Original post by DaBookshah
So your code sets up the 0xFF opcode, the 0x25 Mod R/M byte, then points to the 4 bytes just past the end of the command as the place to get the absolute location from. Then it writes the absolute location there.


Yep, I get that part. But I can't find where in the manual this follow-the-pointer-to-get-the-jump-location behaviour is described. Maybe that stuff I quoted effectively says this, but I can't see how if it does.

EDIT: ... but I am a total asm noob.

EDIT 2: could you tell me where in the manual it says that you can only do an absolute jump if it's indirect? That's probably a good bit for me to look at.

The m32 refers to reading from memory. The alternative you might be thinking about is that m32 refers to a memory location, and to jump TO that location, but that's not the case here. You need to read Pages 51 - 54 of http://www.intel.com/Assets/PDF/manual/253666.pdf very carefully.

EDIT: Just look at all the possible permutations of the JMP command. You only want the ones marked "near", and there are only relative versions or "absolute indirect".

I'm a bit of a beginner too though, so someone else might correct me.

I'm not sure I can help, but I did run-time jump generation once. I did it to get first class functions with closures in C.

I'm afraid I can't figure out what your code there does, so all I can say is how I did it, which might help.

As far as I can tell, there is no x86 instruction for absolute jump. Well, you can do an indirect jump off a register, but all direct addressed opcodes are relative. I decided to use the direct addressed form because it is smaller. Unfortunately, this also means calculating offsets. Here is my code, kind of butchered as I've only included the relevant parts:



#include <string.h> /*memcpy*/
#include <stdint.h> /*int32_t*/

/*the structure of the run-time generated environment loader trampoline*/
#define TRAMP_SIZE 10
typedef struct {
    unsigned char code[TRAMP_SIZE]; /*unsigned so that 0xb8 and 0xe9 fit*/
} closure;

/*the platform specific code stuff*/
/*the code template: load environment into %eax (0xb8), then jmp relative(0xe9)*/
static const closure trampcode = {{0xb8,0,0,0,0,0xe9,0,0,0,0}};
/*the important offsets*/
static const int ENVPTR = 1;
static const int CODEPTR = 6;
static const int JMP_DATUM = 10;

void *buildclosure(closure *tramp, void *env, void *f) {
    int32_t rel;
    /*load the code template*/
    *tramp = trampcode;
    /*calculate the jump offset (f - &nextinstr), assuming 32-bit x86*/
    rel = (int32_t)((char *)f - ((char *)tramp->code + JMP_DATUM));
    /*put the two values in the correct locations in the code*/
    /*memcpy instead of just = because unaligned byte pattern*/
    memcpy(tramp->code + CODEPTR, &rel, sizeof rel);
    memcpy(tramp->code + ENVPTR, &env, sizeof(void *));
    /*tramp is now callable as a function*/
    return tramp;
}


Ok, so I suppose you could ignore the environment stuff, just focus on the function pointer (f). tramp stands for "trampoline". My "tramp" pointer seems analogous to your "opcodes" pointer. So I just load up the template code, calculate the offset, and memcpy it into place. I used memcpy because the field is not aligned.

I hope this helps. I know it's not exactly what you're looking for, but I think it could be useful.

EDIT: added definition of the closure structure to top for clarity on what's going on.
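
The offset calculation in the trampoline is the part that tends to trip people up, so here it is in isolation. This is a sketch only: addresses are plain 32-bit values rather than real pointers, and make_rel_jump is a hypothetical name. The key detail is that rel32 is measured from the end of the 5-byte E9 instruction, which is what JMP_DATUM (offset 10, just past the jump at offset 5) stands for in the code above.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: encode "E9 rel32" for a jump located at jmp_address.
std::array<std::uint8_t, 5> make_rel_jump(std::uint32_t jmp_address, std::uint32_t target)
{
    // The displacement is relative to the address of the next instruction.
    // Unsigned arithmetic wraps modulo 2^32, which yields the two's-complement
    // bit pattern for backward jumps.
    const std::uint32_t rel32 = target - (jmp_address + 5);

    std::array<std::uint8_t, 5> bytes{};
    bytes[0] = 0xE9;
    for (int i = 0; i < 4; ++i)
        bytes[1 + i] = static_cast<std::uint8_t>(rel32 >> (8 * i)); // little-endian
    return bytes;
}
```

A jump at 0x2000 back to 0x1000 has rel32 = 0x1000 - 0x2005 = 0xFFFFEFFB, so the bytes come out as E9 FB EF FF FF.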

Quote:

Near and Short Jumps. When executing a near jump, the processor jumps to the address (within the current code segment) that is specified with the target operand. The target operand specifies either an absolute offset (that is an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register). A near jump to a relative offset of 8-bits (rel8) is referred to as a short jump. The CS register is not changed on near and short jumps. An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32). The operand-size attribute determines the size of the target operand (16 or 32 bits). Absolute offsets are loaded directly into the EIP register. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared to 0s, resulting in a maximum instruction pointer size of 16 bits.


From 3-333 of the ISR.

Quote:
Original post by DaBookshah
The m32 refers to reading from memory. The alternative you might be thinking about is that m32 refers to a memory location, and to jump TO that location, but that's not the case here. You need to read Pages 51 - 54 of http://www.intel.com/Assets/PDF/manual/253666.pdf very carefully.


Right, so here's what's said about r/m32: "r/m32 - a doubleword general-purpose register or memory operand used for instructions whose operand size is 32 bits. The doubleword general purpose registers are: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI. The contents of the memory are found at the address provided by the effective address computation [...]".

I'm guessing it's that last sentence that is relevant. According to table 2-2 on page 39, the effective address computation is "disp32". An explanation is given for this in the manual (see my original post). My problem is that I don't understand the explanation. Specifically, what is "the index"?


Quote:
Original post by outRider
An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32).


Great, thanks. I can see that in my copy, now.

So, just so I can be sure that I've got this down straight: that second DWORD written by the function (jmpdest) doesn't necessarily have to follow the JMP's operand in memory. It could in fact be anywhere in the current segment, so long as the first DWORD contains its address. Is that correct?

Quote:
Original post by the_edd
Quote:
Original post by outRider
An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32).


Great, thanks. I can see that in my copy, now.

So, just so I can be sure that I've got this down straight: that second DWORD written by the function (jmpdest) doesn't necessarily have to follow the JMP's operand in memory. It could in fact be anywhere in the current segment, so long as the first DWORD contains its address. Is that correct?


Yes. Think accessing vtables and function pointers in global/static areas.
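
The "slot can live anywhere" point can be shown with a toy model. This is purely illustrative: memory is modelled as a map from 32-bit addresses to the DWORD stored there, and the addresses are invented. In effect, JMP r/m32 with a disp32 memory operand reads the DWORD at that address and loads it into EIP.

```cpp
#include <cstdint>
#include <map>

// Toy model of 32-bit memory: address -> DWORD stored at that address.
using Memory = std::map<std::uint32_t, std::uint32_t>;

// Where "FF 25 disp32" ends up: the value read from the slot, not disp32 itself.
std::uint32_t jmp_indirect_target(const Memory &mem, std::uint32_t disp32)
{
    return mem.at(disp32); // throws if the model has nothing at that address
}

// Two slots holding the same destination: one right after an instruction
// at 0x00401000, one far away in a data section (addresses are made up).
const Memory kExampleMemory = {
    {0x00401006u, 0x7C801234u},
    {0x00403094u, 0x7C801234u},
};
```

Either slot address used as disp32 leads to the same destination, which is exactly the situation with function pointers stored in global or static data.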

Here are some related items from the vaults.


// ---------------------------------------------------------------------------
// literal function name points to these bytes FF25########
// where ######## is the address of a pointer that points to the function entry point
#define JUMPFROMLITERAL(f) ((DWORD*)(((unsigned char*)(void*)(f))+2))[0]

unsigned char *temp = (unsigned char*)(void*)Sleep;
DWORD *ptr = (DWORD*)&temp[2];

// note the zero in the macro name
DWORD iat = JUMPFROMLITERAL0(Sleep);

// sample disassemblies
// [0000422] ff2594304000 jmp *0x403094 modrm: 0/5/4 GetModuleHandleA
// [0049296] ff256c924100 jmp *0x41926c modrm: 0/5/4

// *** don't try this - potential bsod ***/
// 001B:804653F7 2EFF25FE534680 JMP CS:[804653FE]



LessBread's code is extracting the actual function's address using the IAT jump thunk table. Read here for more info on how the tables are laid out:

Understanding the Import address table


Short explanation:
- The address of a function in a DLL is stored in the IAT.
- The IAT is a placeholder until the DLL actually loads, then it's rewritten by the loader.
- The IAT jump thunk table is a hardcoded array of [FF 25 address] instructions to the table that the DLL loader rewrites.
- Your EXE code is hardcoded with calls to the thunk table so that the loader doesn't need to rewrite your entire EXE with DLL function addresses.


The jump thunk table only really exists so that you can use function pointers in your code (like 'Sleep' in LessBread's example). If the compiler could guarantee that function pointers to imported functions were never used, it could get rid of the thunk table and write indirect call instructions directly in the EXE.
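
As a rough illustration of what reading one thunk entry involves, the sketch below checks for the FF 25 pattern and pulls out the four address bytes. ThunkInfo and parse_thunk are hypothetical names, not part of any Windows API, and the sample bytes are the "ff2594304000" disassembly quoted earlier in the thread. It assumes 32-bit code; in 64-bit mode the same encoding is RIP-relative, so disp32 would not be an absolute address there.

```cpp
#include <cassert>
#include <cstdint>

// Result of inspecting six bytes of code (hypothetical type for illustration).
struct ThunkInfo {
    bool is_thunk;              // true if the bytes start with FF 25
    std::uint32_t slot_address; // disp32: address of the slot holding the real entry point
};

ThunkInfo parse_thunk(const std::uint8_t *code)
{
    assert(code != nullptr);
    if (code[0] != 0xFF || code[1] != 0x25)
        return ThunkInfo{false, 0};

    // Assemble the little-endian disp32 that follows the ModR/M byte.
    std::uint32_t disp32 = 0;
    for (int i = 0; i < 4; ++i)
        disp32 |= static_cast<std::uint32_t>(code[2 + i]) << (8 * i);
    return ThunkInfo{true, disp32};
}

// "ff2594304000 jmp *0x403094", taken from the sample disassembly in the thread.
const std::uint8_t kSampleThunk[6] = {0xFF, 0x25, 0x94, 0x30, 0x40, 0x00};
// A relative jump, which is not a thunk of this form.
const std::uint8_t kNotAThunk[6] = {0xE9, 0x00, 0x00, 0x00, 0x00, 0x90};
```

Reading the DWORD stored at slot_address (inside the process that owns the code) would then give the function's actual entry point.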

[Edited by - Nypyren on March 26, 2010 3:50:36 PM]

Nypyren's explanation is very good.

Those "vault" snippets depend on the compiler translating the function pointer to a Win32 API function (Sleep or GetModuleHandleA, for example), supplied by a dll as Nypyren wrote (typically an operating system dll like kernel32.dll or user32.dll), into an [FF 25 address] instruction. If the compiler doesn't use a thunk table (as Nypyren wrote) that macro won't work. That macro extracts the four bytes following FF25, which happen to be the address of the slot where the actual entry point address of the function is stored (think of pointers to pointers). My asm is a bit rusty, but "disp32" is that slot's address, which is also the "effective address" of the memory operand; the DWORD stored there is the actual entry point address of the function, and these items combined offer an alternative way to call a function.

The reason a compiler would not use a thunk table is because it knows the dll always loads at the same address every time. This is the case with kernel32.dll and ntdll.dll and a few other operating system dlls. Instead of redirecting through a thunk table, some compilers use the actual addresses of functions exported by those dlls.


At any rate,

// 001B:804653F7 2EFF25FE534680 JMP CS:[804653FE]

This is a disassembly from within the kernel (note the addresses above 0x80000000). Jumping there directly from user mode is not recommended. I supplied it as another example of the use of FF25. Here it's preceded by the 2E prefix byte, which stipulates which segment register to use (CS in this case). This isn't something you will ever likely need to use.

Thanks Nypyren/LessBread! Those explanations do indeed illuminate the utility of this particular instruction.

The point of this code (as I suspect many of you may have already guessed) is for API hooking. I guess if the function to be hooked is exported from a dll, it might be easier to simply make an adjustment in the IAT so that a function call jumps to a different address?

Yeah, if the function you want to hook is only called via the IAT, you can just replace the IAT's pointer to hook the function.

The EXE can call DLL functions in other ways as well. For example:
- Instead of using the IAT, it could call GetProcAddress and store the function pointer anywhere it feels like.

- Call a function in a custom DLL which *then* calls the function you're trying to hook. Each DLL has its own IAT which you will have to search through to hook these. You can enumerate a process's loaded EXE/DLLs (aka "modules") by using "EnumProcessModules".

- Trust the target DLL will be loaded at a constant address and hardcode the pointer. This is extremely infrequent since Vista/Win7 like to randomize module load addresses more often (for security purposes).
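
The core of the IAT approach is just swapping one pointer and keeping the old value so the hook can chain to the original. The sketch below is a toy stand-in, not real PE handling: iat_slot is an ordinary global rather than an entry in a module's import table, and every name in it is hypothetical. In a real process the slot would have to be located by walking the module's import directory, and the page usually has to be made writable first (for example with VirtualProtect).

```cpp
#include <cassert>

// Toy stand-in for one IAT entry: a writable slot holding a function pointer.
using HookedFn = int (*)(int);

int real_function(int value) { return value; }  // stands in for the imported function
int hook_function(int value) { return -value; } // stands in for the replacement

HookedFn iat_slot = real_function; // hypothetical; a real IAT lives in the PE image

// Swap the slot's contents and hand back the previous pointer.
HookedFn replace_slot(HookedFn *slot, HookedFn replacement)
{
    assert(slot != nullptr);
    HookedFn previous = *slot;
    *slot = replacement;
    return previous;
}
```

Calls that go through the slot pick up the replacement immediately, while anything that cached the pointer earlier (the GetProcAddress case above) keeps calling the original.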

Makes sense. Thanks.

So, I'd like to sneak in one last question. The code I'm deciphering also chooses whether to write an absolute indirect jump (FF 25) or a relative jump (E9) by testing whether the distance between the start and the end of the jump is greater than 0x7FFF0000 i.e.


bool abs_jump_required(const char *from, const char *to)
{
return std::abs(from - to) > 0x7FFF0000;
}


I'm struggling to understand where 0x7FFF0000 comes from.

EDIT: I note it's very close to 0x80000000, mentioned previously by LessBread, but I fail to see a connection.

This might explain some of that: 0x7ffe0000 - What is in it?

On second thought, it doesn't. With an absolute jump the "from" address can be taken as zero and the "to" address as the "offset from zero". With a relative jump, the "from" address is the address of the instruction (typically stored in the eip register). If the difference between "from" and "to" is greater than 0x7FFF0000, executing the instruction will result in a jump into the kernel which will throw an exception, likely crash the program and maybe cause a bsod.

I think the check is rather crude. Iirc there are size limitations to the displacements allowed by a relative jump - but as I wrote above, my asm skills are rusty. I would have to crack open the intel manual to be certain.

Quote:
Original post by the_edd

bool abs_jump_required(const char *from, const char *to)
{
return std::abs(from - to) > 0x7FFF0000;
}


I'm struggling to understand where 0x7FFF0000 comes from.


Beats me. I would have said it could relate to the size of the jump (you can specify 8, 16? and 32 bit displacements), but 0x7FFF0000 doesn't make sense in that context. Is this code meant to deal with just x86, or x64 too?

Quote:
Original post by Nypyren
Yeah, if the function you want to hook is only called via the IAT, you can just replace the IAT's pointer to hook the function.


Another option is to overwrite the first 5 / 10 or so bytes of the target function at run-time with a call to your code. On the plus side you avoid having to think of all the different ways the target function could be referred to, and I find it easier to visualize as you don't have to deal with the IAT. On the other hand, it's much more intrusive.

A few more thoughts...

Above I described the absolute jump as an alternative way to invoke a function, but iirc, the return address must be pushed right before the jump. When the 'call' instruction is used to invoke a function it pushes the return address before entering the function so that execution can resume at its proper place when the function returns. I'm not completely certain but if you're going to use an absolute jump to invoke a function it's worth checking out if eip must be pushed prior to the jump. Most Win32 functions use the standard calling convention which pops the arguments to the function off of the stack with the 'ret' instruction. If the expectation is that the return address is found at the top of the stack, then eip should be pushed before the jump, otherwise execution might resume at whatever address could be interpreted from the contents of the top of the stack before the return pop. Function arguments are usually pushed right to left as they are written in C. If my suspicions about the return address are correct, without pushing eip, execution would jump to whatever address could be interpreted from the first argument to the function. At any rate, it's worth checking out and confirming either way.


Quote:
Original post by LessBread
A few more thoughts...

Above I described the absolute jump as an alternative way to invoke a function, but iirc, the return address must be pushed right before the jump. When the 'call' instruction is used to invoke a function it pushes the return address before entering the function so that execution can resume at its proper place when the function returns. I'm not completely certain but if you're going to use an absolute jump to invoke a function it's worth checking out if eip must be pushed prior to the jump. Most Win32 functions use the standard calling convention which pops the arguments to the function off of the stack with the 'ret' instruction. If the expectation is that the return address is found at the top of the stack, then eip should be pushed before the jump, otherwise execution might resume at whatever address could be interpreted from the contents of the top of the stack before the return pop. Function arguments are usually pushed right to left as they are written in C. If my suspicions about the return address are correct, without pushing eip, execution would jump to whatever address could be interpreted from the first argument to the function. At any rate, it's worth checking out and confirming either way.


Not quite sure what you're referring to here, but if you're suggesting this is a consideration when altering the jmp commands in the thunk table, it isn't. And yes, unless you push a return address one of the arguments will get treated as the return address.

It's safe to use JMP instead of CALL for certain tail calls. In other words, say you've got some functions like this:

int FunctionA()
{
// do something and return a value
}

int FunctionB()
{
// do something

return FunctionA();
}


Normally you'd expect FunctionB to end like this...


// do something
CALL FunctionA
// do local stack cleanup
RET


...but if either FunctionA has a '_(void)' signature and FunctionB uses the cdecl calling convention, OR FunctionA has the same signature as FunctionB, it can be simplified to this instead:


// do something
// do local stack cleanup, excluding stdcall arguments if stdcall is used
JMP FunctionA


At the RET in FunctionA, since the stack is set up exactly how it was when FunctionB ended, it would then return to whoever called FunctionB instead of going back to FunctionB first. This is how the JMPs in the thunk table work (they have the same effective function signature as the real function in the DLL).
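
The return-address bookkeeping can be traced with a toy model. This is a sketch with invented addresses, and ToyCpu is a hypothetical type that tracks only EIP and a stack of return addresses, ignoring arguments and everything else. It follows a caller that CALLs a thunk, the thunk's JMP to the real function, and the real function's RET.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model: only EIP and a stack of return addresses.
struct ToyCpu {
    std::uint32_t eip = 0;
    std::vector<std::uint32_t> stack;

    // CALL pushes the address of the next instruction, then transfers control.
    void call(std::uint32_t target, std::uint32_t next_instruction)
    {
        stack.push_back(next_instruction);
        eip = target;
    }

    // JMP transfers control and leaves the stack alone.
    void jmp(std::uint32_t target) { eip = target; }

    // RET pops whatever is on top of the stack into EIP.
    void ret()
    {
        assert(!stack.empty());
        eip = stack.back();
        stack.pop_back();
    }
};

// Returns where execution resumes after the real function's RET.
std::uint32_t run_through_thunk()
{
    const std::uint32_t after_call = 0x00401005u;    // made-up addresses
    const std::uint32_t thunk = 0x00402000u;
    const std::uint32_t real_function = 0x7C801234u;

    ToyCpu cpu;
    cpu.call(thunk, after_call); // in the caller
    cpu.jmp(real_function);      // in the thunk: FF 25 ...
    cpu.ret();                   // at the end of the real function
    return cpu.eip;
}
```

Execution resumes at the instruction after the original CALL, because the thunk's JMP never touched the stack. A JMP with no preceding CALL (or manual push) would instead leave RET to pop whatever happened to be on top.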

Quote:
Original post by DaBookshah
Is this code meant to deal with just x86, or x64 too?


Yeah, there are some x86/x64 #ifdefs in the code, but this abs_jump_required function is used in both 'branches'.

Quote:

Another option is to overwrite the first 5 / 10 or so bytes of the target function at run-time with a call to your code. On the plus side you avoid having to think of all the different ways the target function could be referred to, and I find it easier to visualize as you don't have to deal with the IAT. On the other hand, it's much more intrusive.


This is what I'm doing currently (it works, but this thread has been about understanding the lower level snippets and samples I've borrowed).

Quote:
Original post by DaBookshah
Quote:
Original post by LessBread
A few more thoughts...

Above I described the absolute jump as an alternative way to invoke a function, but iirc, the return address must be pushed right before the jump. When the 'call' instruction is used to invoke a function it pushes the return address before entering the function so that execution can resume at its proper place when the function returns. I'm not completely certain but if you're going to use an absolute jump to invoke a function it's worth checking out if eip must be pushed prior to the jump. Most Win32 functions use the standard calling convention which pops the arguments to the function off of the stack with the 'ret' instruction. If the expectation is that the return address is found at the top of the stack, then eip should be pushed before the jump, otherwise execution might resume at whatever address could be interpreted from the contents of the top of the stack before the return pop. Function arguments are usually pushed right to left as they are written in C. If my suspicions about the return address are correct, without pushing eip, execution would jump to whatever address could be interpreted from the first argument to the function. At any rate, it's worth checking out and confirming either way.


Not quite sure what you're referring to here, but if you're suggesting this is a consideration when altering the jmp commands in the thunk table, it isn't. And yes, unless you push a return address one of the arguments will get treated as the return address.


That was a reminder to push the return address before using an absolute jump to invoke a function.

Quote:
Original post by the_edd
I'm struggling to understand where 0x7FFF0000 comes from.


I got a response from the original author on this. He can't remember where this constant comes from. It's possibly a mistake and should be 0x7FFFFFFF.

It also doesn't help of course, that I mis-translated the original function :/

Here it is:


virtual bool requiresAbsJump(uintptr_t from, uintptr_t to)
{
uintptr_t jmpDistance = from > to ? from - to : to - from;
return jmpDistance <= 0x7FFF0000 ? false : true;
};


So, here's what I think my version should look like:


#include <algorithm> // std::max, std::min
#include <climits>   // CHAR_BIT
#include <cstdint>   // uintptr_t

// A relative jump (opcode 0xE9) treats its operand as a signed offset. If the unsigned
// distance between from and to is of sufficient magnitude that it cannot be represented
// as a signed integer, then we'll have to use an absolute jump instead (0xFF 0x25).
bool abs_jump_required(const char *from, const char *to)
{
    // Convert before comparing: ordering pointers into unrelated objects with < is unspecified.
    const uintptr_t a = reinterpret_cast<uintptr_t>(from);
    const uintptr_t b = reinterpret_cast<uintptr_t>(to);
    const uintptr_t upper = std::max(a, b);
    const uintptr_t lower = std::min(a, b);

    const uintptr_t biggest_signed_magnitude =
        (uintptr_t(1) << (sizeof(uintptr_t) * CHAR_BIT - 1)) - 1;

    return upper - lower > biggest_signed_magnitude;
}


I hope to have this working on x64 in future, so I didn't hard-code 0x7FFFFFFF.
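
One caveat worth checking against the manual before relying on this for x64: in 64-bit mode E9 still takes a 32-bit displacement (sign-extended to 64 bits), so the limit is the range of a signed 32-bit integer, not the width of uintptr_t. The displacement is also measured from the end of the 5-byte instruction rather than from its start. The sketch below tests exactly that; rel32_jump_reaches is a hypothetical name and addresses are passed as 64-bit values so the same check covers both cases.

```cpp
#include <cstdint>
#include <limits>

// Sketch: can a 5-byte "E9 rel32" placed at `from` reach `to`?
bool rel32_jump_reaches(std::uint64_t from, std::uint64_t to)
{
    // rel32 is relative to the address of the next instruction (from + 5).
    // The unsigned subtraction wraps; the cast reinterprets the result as a
    // signed distance (two's complement, which is what x86/x64 compilers use).
    const std::int64_t distance = static_cast<std::int64_t>(to - (from + 5));
    return distance >= std::numeric_limits<std::int32_t>::min()
        && distance <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical x64 counterpart of abs_jump_required.
bool abs_jump_required_x64(std::uint64_t from, std::uint64_t to)
{
    return !rel32_jump_reaches(from, to);
}
```

Two further points, as far as I can tell. In 32-bit mode EIP arithmetic wraps modulo 2^32, so a rel32 jump can reach any address in a flat 4 GB space and the range check may never be needed there. And in 64-bit mode the FF 25 disp32 form is RIP-relative, so the 32-bit write_abs_jump layout would not carry over unchanged.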
