I'm only familiar with how this works in Windows PE (exe and dll) files:
- The assembler and linker will organize machine code however they see fit. Typically a compiler emits functions in the order they appear in the source file, and the linker lays out object files in the order they are passed to it, though optimizations can change either.
- The output object/library/executable files are partitioned into 'sections'. Each section can be relocated anywhere in RAM when the loader loads it.
- Once the exe or dll is generated, the code within each section of that file assumes that the section will be loaded as one immutable chunk. Function calls within a section rely on this, since they are typically encoded as relative offsets.
- A relocation table can be used to remap pointers within the code sections (typically addresses to other sections within the same file), but this never changes the relative offsets of things within the same section.
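To make the relocation-table point concrete, here is a rough toy model in C. It is not the real PE `.reloc` format (which groups typed entries into per-page blocks); `toy_image`, `toy_rebase` and the slot helpers are made-up names for illustration. It only shows the idea: every listed slot gets the same delta added, so the distances between the addresses stored in those slots do not change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: the "relocation table" is just an array of byte offsets
   at which the image stores a 64-bit absolute address. */
typedef struct {
    uint8_t      *bytes;       /* image contents as mapped in memory */
    uint64_t      base;        /* base address the stored pointers assume */
    const size_t *fixups;      /* offsets of the absolute-address slots */
    size_t        fixup_count;
} toy_image;

uint64_t read_slot(const toy_image *img, size_t off) {
    uint64_t v;
    memcpy(&v, img->bytes + off, sizeof v);   /* avoids alignment issues */
    return v;
}

void write_slot(toy_image *img, size_t off, uint64_t v) {
    memcpy(img->bytes + off, &v, sizeof v);
}

/* Add (actual_base - assumed base) to every slot named in the table. */
void toy_rebase(toy_image *img, uint64_t actual_base) {
    assert(img != NULL && img->bytes != NULL);
    uint64_t delta = actual_base - img->base;
    for (size_t i = 0; i < img->fixup_count; i++) {
        size_t off = img->fixups[i];
        write_slot(img, off, read_slot(img, off) + delta);
    }
    img->base = actual_base;
}
```

Since every slot moves by the same delta, two pointers into the same image stay the same distance apart after rebasing; only the absolute values change.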
You should expect code within one section to keep the same offsets relative to other code in that section, but only until you recompile or re-link your exe/dll. Those offsets will not change between runs of the program. You should never rely on cross-section offsets being the same, OR on absolute addresses being the same between runs of the same app, even if you did not recompile.
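If you want to check this yourself, a small sketch along these lines should do. `first_fn`, `second_fn` and `code_distance` are hypothetical names, and converting a function pointer to an integer is not strictly portable C, though it works on the mainstream Windows and Linux toolchains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two hypothetical functions with different bodies, so that identical-code
   folding in the linker cannot merge them into a single address. */
int first_fn(int x)  { return x + 1; }
int second_fn(int x) { return x * 3; }

typedef int (*fn_ptr)(int);

/* Signed byte distance from a to b. The pointer-to-integer conversion is
   an assumption about the platform, not something ISO C guarantees. */
ptrdiff_t code_distance(fn_ptr a, fn_ptr b) {
    assert(a != NULL && b != NULL);
    return (ptrdiff_t)((uintptr_t)b - (uintptr_t)a);
}
```

Printing `code_distance(first_fn, second_fn)` next to the two absolute addresses on several runs of the same binary should show the addresses moving (when the loader randomizes the base) while the distance stays put. After a rebuild, the distance itself may change.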
I've never inspected Linux binaries closely to see if they use this same pattern or not.
it is the same basic pattern, yes.
there are possible differences and things that could be clarified, but I decided against going too much into the specifics of PE/COFF and ELF loaders.
the main difference is mostly that, as applicable, the compiler (GCC) may shuffle things around and link objects in a pseudo-random order.
this doesn't usually change much between runs of a program, but may affect things between builds.
the assumption in Linux land is generally that people will be regularly recompiling pretty much everything, rather than keeping the same binary around for a decade or more.
the rest is mostly load-address randomization for a given image, so it may be loaded at one address on one run and at a different address on the next.
if the functions are not in the same image, then the offsets between them will vary from run to run.
also it may happen that if you try to fetch a function pointer to a function in a different library, you will not get its true address.
partly it has to do with lazy linking:
the GOT entry for the function initially points to an initialization stub, which when called will replace that entry with the appropriate function address.
so, it isn't a good idea to return the address directly from the GOT, since if the function hasn't yet been called, it may be the wrong address (pointing to the stub, rather than the target function).
so, instead, what will be returned is a function pointer to a trampoline, which will jump to the address in the GOT.
this way, when the stub is called it can do its thing, and the function pointer will go to the right place.
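as a rough illustration, the stub/trampoline dance can be modeled in plain C. this is only a sketch of the idea described above, not what the linker actually emits (the real PLT/GOT entries are generated machine code), and all of the names here (`got_slot`, `resolver_stub`, `trampoline`, `address_of_import`) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_ptr)(int);

/* Stands in for the function imported from another library. */
int real_target(int x) { return x * 2; }

int resolver_stub(int x);

/* One "GOT" slot. It starts out pointing at the stub, not the target. */
fn_ptr got_slot = resolver_stub;

/* The first call lands here: patch the slot, then forward the call.
   A real loader would look the symbol up at this point. */
int resolver_stub(int x) {
    got_slot = real_target;
    return got_slot(x);
}

/* The trampoline: its own address never changes, and it always jumps
   through whatever the slot currently holds. */
int trampoline(int x) {
    assert(got_slot != NULL);
    return got_slot(x);
}

/* What taking the function's address hands back in this model. */
fn_ptr address_of_import(void) { return trampoline; }
```

a pointer obtained from `address_of_import()` before the first call still works afterwards, since it points at the trampoline rather than at whatever the slot held at that moment.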
PE/COFF will often also use trampoline stubs, mostly as a means to allow the compiler to more easily use cheap local calls when possible (at a slight added cost for the case where the function turns out to be a DLL import).
also, ELF will tend to use the GOT as a means of avoiding the need for explicit relocation tables, allowing libraries to be mapped at arbitrary addresses without needing to be rebased, and allowing more pages to be shared. but, again, this comes at a slight cost to performance (pretty much everything is done indirectly), and it typically ties up a register to keep track of the GOT.
in contrast, with PE/COFF, the solution is basically "try when possible to always map the DLL to the same base address in every process".
I personally like the PE/COFF strategy a little more, FWIW.
also, the addition of RIP-relative addressing on x64 greatly reduces the need for explicit relocations, while still allowing fast/cheap access to local variables and functions.
though, ELF did a few things well too, FWIW...
Edited by cr88192, 16 July 2013 - 01:12 PM.