[x86 asm] help understanding JMP opcodes

22 comments, last by the_edd 14 years ago
Hi folks, I'm trying to understand a function I've found that writes the x86 opcodes for an absolute jump:

void write_abs_jump(unsigned char *opcodes, const void *jmpdest)
{
    opcodes[0] = 0xFF; 
    opcodes[1] = 0x25;

    *reinterpret_cast<DWORD *>(opcodes + 2) = reinterpret_cast<DWORD>(opcodes + 6);
    *reinterpret_cast<DWORD *>(opcodes + 6) = reinterpret_cast<DWORD>(jmpdest);
}
Here's what I've gleaned so far. Please correct me if any of this is wrong.

I believe 0x25 is a ModR/M byte whose constituent parts are, in binary: mod=00, reg/opcode=100, r/m=101. The reg/opcode part is 4 in decimal, so the instruction is found under "FF /4" in the Intel reference manual. The instruction mnemonic listed for this is "JMP r/m32" and it has a single operand: "ModRM:r/m (r)". The "(r)" in there means that the content of the operand will be read by the processor.

At this point I'm a bit stuck. Specifically, I can't figure out what that first DWORD is for (opcodes + 6). It's a pointer to the memory location that contains the value of jmpdest, but why is it needed? What part of the Intel manual do I have to understand to appreciate its role in the instruction?

There's a table in Intel's manual ("Table 2.2: 32-Bit Addressing Forms with the ModR/M Byte") where I can look up the "effective address" corresponding to the value of the ModR/M byte, which in this case is "disp32". The manual tells me "The disp32 nomenclature denotes a 32-bit displacement that follows the ModR/M byte (or the SIB byte if one is present) and that is added to the index". I haven't got a clue what this part means, but I'm pretty sure I don't have a SIB byte here.

So any help in getting further with this would be very much appreciated!
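For reference, the ModR/M split described in the post can be checked mechanically. This is only a minimal sketch; the `ModRM` struct and the `decode_modrm` name are made up for illustration and do not come from the thread or the Intel manual.

```cpp
#include <cstdint>

// Hypothetical helper type: the three fields of a ModR/M byte.
struct ModRM {
    std::uint8_t mod;  // bits 7..6
    std::uint8_t reg;  // bits 5..3, doubles as the opcode extension (the "/4" in "FF /4")
    std::uint8_t rm;   // bits 2..0
};

// Split a ModR/M byte into its fields.
inline ModRM decode_modrm(std::uint8_t byte) {
    ModRM f;
    f.mod = static_cast<std::uint8_t>((byte >> 6) & 0x3);
    f.reg = static_cast<std::uint8_t>((byte >> 3) & 0x7);
    f.rm  = static_cast<std::uint8_t>(byte & 0x7);
    return f;
}
```

Calling `decode_modrm(0x25)` gives mod=0, reg=4, rm=5, which matches the reading of the byte as "FF /4" with a disp32 operand.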
The manual says that you can only do an absolute jump if it's indirect, that is, if the instruction has the form
JMP [eax]
JMP [00405748], etc

So your code sets up the 0xFF opcode, the 0x25 Mod R/M byte, then points to the 4 bytes just past the end of the command as the place to get the absolute location from. Then it writes the absolute location there.

I guess the point is that these 10 bytes always jump to the same absolute destination no matter where they are executed from, whereas the relative form encodes an offset that depends on where it's executed.
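To make the 10-byte layout concrete, here is a rough sketch of the same encoding, written so that it does not depend on the buffer's real address: the caller passes in the 32-bit address at which the bytes will live. `encode_abs_jump`, `put_le32` and the parameter names are invented for this illustration; the original `write_abs_jump` uses the buffer's own address instead, which only works as written in a 32-bit process.

```cpp
#include <cstdint>

// Store a 32-bit value in little-endian byte order, as x86 expects.
inline void put_le32(std::uint8_t *p, std::uint32_t v) {
    p[0] = static_cast<std::uint8_t>(v);
    p[1] = static_cast<std::uint8_t>(v >> 8);
    p[2] = static_cast<std::uint8_t>(v >> 16);
    p[3] = static_cast<std::uint8_t>(v >> 24);
}

// Hypothetical variant of write_abs_jump for 32-bit code.
// code_addr: address where out[0] will reside when executed (assumed known).
// target:    absolute address to jump to.
inline void encode_abs_jump(std::uint8_t out[10],
                            std::uint32_t code_addr,
                            std::uint32_t target) {
    out[0] = 0xFF;                     // opcode byte shared by the "FF /n" group
    out[1] = 0x25;                     // ModR/M: mod=00, reg=/4 (JMP), r/m=101 -> disp32
    put_le32(out + 2, code_addr + 6);  // disp32: address of the pointer slot
    put_le32(out + 6, target);         // pointer slot: the actual jump target
}
```

The processor reads the 4 bytes at `code_addr + 6` and loads them into EIP, so bytes 6..9 are data that happens to sit right after the 6-byte instruction.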
Quote:Original post by DaBookshah
So your code sets up the 0xFF opcode, the 0x25 Mod R/M byte, then points to the 4 bytes just past the end of the command as the place to get the absolute location from. Then it writes the absolute location there.


Yep, I get that part. But I can't find where in the manual this follow-the-pointer-to-get-the-jump-location behaviour is described. Maybe that stuff I quoted effectively says this, but I can't see how if it does.

EDIT: ... but I am a total asm noob.

EDIT 2: could you tell me where in the manual it says that you can only do an absolute jump if it's indirect? That's probably a good bit for me to look at.
The m32 refers to reading from memory. The alternative you might be thinking about is that m32 refers to a memory location, and to jump TO that location, but that's not the case here. You need to read Pages 51 - 54 of http://www.intel.com/Assets/PDF/manual/253666.pdf very carefully.

EDIT: Just look at all the possible permutations of the JMP command. You only want the ones marked "near", and there are only relative versions or "absolute indirect".

I'm a bit of a beginner too though, so someone else might correct me.
I'm not sure I can help, but I did run-time jump generation once. I did it to get first class functions with closures in C.

I'm afraid I can't figure out what your code there does, so all I can say is how I did it, which might help.

As far as I can tell, there is no x86 instruction for an absolute jump. Well, you can do an indirect jump off a register, but all direct addressed opcodes are relative. I decided to use the direct addressed form because it is smaller. Unfortunately, this also means calculating offsets. Here is my code, kind of butchered as I've only included the relevant parts:

/*the structure of the run-time generated environment loader trampoline*/
#define TRAMP_SIZE 10
typedef struct {
    char code[TRAMP_SIZE];
} closure;

/*the platform specific code stuff*/
/*the code template: load environment into %eax (0xb8), then jmp relative(0xe9)*/
static const closure trampcode = {{0xb8,0,0,0,0,0xe9,0,0,0,0}};

/*the important offsets*/
static const int ENVPTR = 1;
static const int CODEPTR = 6;
static const int JMP_DATUM = 10;

void *buildclosure(closure *tramp, void *env, void *f) {
    /*load the code template*/
    *tramp = trampcode;
    /*calculate the jump offset (f - &nextinstr)*/
    f = (char *)f - (tramp->code + JMP_DATUM);
    /*put the two pointers in the correct locations in the code*/
    /*memcpy instead of just = because unaligned byte pattern*/
    memcpy(tramp->code + CODEPTR, &f, sizeof(void*));
    memcpy(tramp->code + ENVPTR, &env, sizeof(void *));
    /*tramp is now callable as a function*/
    return tramp;
}


Ok, so I suppose you could ignore the environment stuff and just focus on the function pointer (f). "tramp" stands for "trampoline". My "tramp" pointer seems analogous to your "opcodes" pointer. So I just load up the template code, calculate the offset, and memcpy it into place. I used memcpy because the field is not aligned.
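The offset calculation in the trampoline above boils down to "target minus the address of the instruction after the JMP". Below is a small sketch of just that step, using explicit 32-bit addresses so the arithmetic is easy to follow. The name `encode_rel_jump` and its parameters are made up for this example and are not part of the posted code.

```cpp
#include <cstdint>

// Hypothetical helper: encode "JMP rel32" (opcode E9) for 32-bit code.
// jmp_addr: address of the E9 byte itself; target: destination address.
inline void encode_rel_jump(std::uint8_t out[5],
                            std::uint32_t jmp_addr,
                            std::uint32_t target) {
    // The displacement is measured from the end of the 5-byte instruction.
    // Unsigned subtraction wraps modulo 2^32, which yields the two's-complement
    // encoding for backward jumps.
    const std::uint32_t rel = target - (jmp_addr + 5);
    out[0] = 0xE9;
    out[1] = static_cast<std::uint8_t>(rel);
    out[2] = static_cast<std::uint8_t>(rel >> 8);
    out[3] = static_cast<std::uint8_t>(rel >> 16);
    out[4] = static_cast<std::uint8_t>(rel >> 24);
}
```

In the trampoline, the E9 byte sits at offset 5 and the instruction ends at offset 10, which is why the posted code subtracts `tramp->code + JMP_DATUM`.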

I hope this helps. I know it's not exactly what you're looking for, but I think it could be useful.

EDIT: added definition of the closure structure to top for clarity on what's going on.
Quote:
Near and Short Jumps. When executing a near jump, the processor jumps to the address (within the current code segment) that is specified with the target operand. The target operand specifies either an absolute offset (that is an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register). A near jump to a relative offset of 8-bits (rel8) is referred to as a short jump. The CS register is not changed on near and short jumps. An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32). The operand-size attribute determines the size of the target operand (16 or 32 bits). Absolute offsets are loaded directly into the EIP register. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared to 0s, resulting in a maximum instruction pointer size of 16 bits.


From 3-333 of the ISR.
Quote:Original post by DaBookshah
The m32 refers to reading from memory. The alternative you might be thinking about is that m32 refers to a memory location, and to jump TO that location, but that's not the case here. You need to read Pages 51 - 54 of http://www.intel.com/Assets/PDF/manual/253666.pdf very carefully.


Right, so here's what's said about r/m32: "r/m32 - a doubleword general-purpose register or memory operand used for instructions whose operand size is 32 bits. The doubleword general purpose registers are: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI. The contents of the memory are found at the address provided by the effective address computation [...]".

I'm guessing it's that last sentence that is relevant. According to table 2-2 on page 39, the effective address computation is "disp32". An explanation is given for this in the manual (see my original post). My problem is that I don't understand the explanation. Specifically, what is "the index"?
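One way to get a feel for the addressing table is to write down what the ModR/M byte alone tells you in 32-bit addressing mode. The sketch below is a simplified reading of that table, not a full decoder: `AddrForm` and `classify_modrm32` are hypothetical names, and displacement rules that depend on the SIB byte's own fields are deliberately ignored. For mod=00, r/m=101 there is no base or index register at all, so the disp32 by itself is the effective address.

```cpp
#include <cstdint>

// Hypothetical summary of what a ModR/M byte implies in 32-bit addressing mode.
struct AddrForm {
    bool register_operand;  // mod == 11: operand is a register, no memory access
    bool has_sib;           // a SIB byte follows the ModR/M byte
    int  disp_bytes;        // displacement size implied by ModR/M alone (0, 1 or 4)
};

inline AddrForm classify_modrm32(std::uint8_t modrm) {
    const unsigned mod = (modrm >> 6) & 0x3u;
    const unsigned rm  = modrm & 0x7u;
    AddrForm f = {false, false, 0};
    if (mod == 3) {            // register-direct form, e.g. JMP EAX
        f.register_operand = true;
        return f;
    }
    f.has_sib = (rm == 4);     // r/m == 100 selects the SIB forms
    if (mod == 1) {
        f.disp_bytes = 1;      // [reg + disp8]
    } else if (mod == 2) {
        f.disp_bytes = 4;      // [reg + disp32]
    } else if (rm == 5) {
        f.disp_bytes = 4;      // mod == 00, r/m == 101: bare disp32, no register involved
    }
    return f;
}
```

For 0x25 this reports no SIB byte and a 4-byte displacement, i.e. the `JMP [disp32]` form used by `write_abs_jump`.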


Quote:Original post by outRider
An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32).


Great, thanks. I can see that in my copy, now.

So, just so I can be sure that I've got this down straight: that second DWORD written by the function (jmpdest) doesn't necessarily have to follow the JMP's operand in memory. It could in fact be anywhere in the current segment, so long as the first DWORD contains its address. Is that correct?
Quote:Original post by the_edd
Quote:Original post by outRider
An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32).


Great, thanks. I can see that in my copy, now.

So, just so I can be sure that I've got this down straight: that second DWORD written by the function (jmpdest) doesn't necessarily have to follow the JMP's operand in memory. It could in fact be anywhere in the current segment, so long as the first DWORD contains its address. Is that correct?


Yes. Think accessing vtables and function pointers in global/static areas.
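As a rough illustration of that point, here is a function pointer kept in static storage and called through. All names are made up for the example, and the instruction mentioned in the comment is only what a 32-bit x86 compiler might plausibly emit, not something guaranteed.

```cpp
// Hypothetical example functions.
static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

// A function pointer in static storage plays the role of the second DWORD:
// the call site only needs to know where this slot lives, not what it holds.
static int (*g_handler)(int) = add_one;

// On 32-bit x86 a compiler may emit "call [g_handler]" here (FF /2 with a
// disp32 operand), the CALL counterpart of the FF /4 JMP discussed above.
static int dispatch(int x) { return g_handler(x); }
```

Reassigning `g_handler` changes where `dispatch` ends up without touching any code bytes, which is the same property the pointer slot gives the generated jump.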
Here are some related items from the vaults.

// ---------------------------------------------------------------------------
// literal function name points to these bytes FF25########
// where ######## is the address of a pointer that points to the function entry point
#define JUMPFROMLITERAL(f) ((DWORD*)(((unsigned char*)(void*)(f))+2))[0]

    unsigned char *temp = (unsigned char*)(void*)Sleep;
    DWORD *ptr = (DWORD*)&temp[2];

// note the zero in the macro name
    DWORD iat = JUMPFROMLITERAL0(Sleep);

// sample disassemblies
// [0000422] ff2594304000     jmp       *0x403094  modrm: 0/5/4	GetModuleHandleA
// [0049296] ff256c924100     jmp       *0x41926c  modrm: 0/5/4

// *** don't try this - potential bsod ***/
// 001B:804653F7  2EFF25FE534680      JMP       CS:[804653FE]
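The macro above pulls the 4 bytes that follow FF 25 out of an import stub. The same idea can be sketched against a plain byte buffer, which is easier to experiment with than live code. `read_jmp_slot` is a hypothetical name, and the sketch assumes 32-bit code, where the disp32 is an absolute address (in 64-bit mode the same bytes are interpreted RIP-relative).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: if `code` starts with FF 25 (JMP [disp32] in 32-bit code),
// store the disp32 (the address of the pointer slot) in *slot_addr and return true.
inline bool read_jmp_slot(const std::uint8_t *code, std::size_t len,
                          std::uint32_t *slot_addr) {
    if (len < 6 || code[0] != 0xFF || code[1] != 0x25) {
        return false;  // too short, or not the FF /4 disp32 form
    }
    // Assemble the little-endian displacement byte by byte.
    *slot_addr = static_cast<std::uint32_t>(code[2])
               | (static_cast<std::uint32_t>(code[3]) << 8)
               | (static_cast<std::uint32_t>(code[4]) << 16)
               | (static_cast<std::uint32_t>(code[5]) << 24);
    return true;
}
```

Fed the bytes from the first sample disassembly (ff 25 94 30 40 00), it yields 0x403094, the slot address shown in the listing.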
"I thought what I'd do was, I'd pretend I was one of those deaf-mutes." - the Laughing Man
Quote:Original post by LessBread
Here are some related items from the vaults.

*** Source Snippet Removed ***


n00b requests explanation of l337 wizardry!
