# [asm]Memory addressing and segments in exe


##### Share on other sites
Whitespaces is the shit, believe me.

##### Share on other sites
Well, I'll give it a shot.

First of all:
With 32-bit processors in protected mode (386+), the segment registers (cs, ds, es, ss, etc.) are no longer used in the same way (left-shift by 4 bits and add the offset). Instead, each one holds a selector: the lowest two bits are the requested privilege level (which determines whether you're in kernel or user space), the next bit picks the global or local descriptor table, and the remaining 13 bits are an index into that table, which isn't visible to your program. That means you should never change your segment registers (and you'd probably get a general protection fault if you loaded an invalid value).
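
To make the two schemes concrete, here is a minimal sketch in C. The function and struct names are made up for illustration, and the bit layout assumed is the standard x86 selector format (RPL in bits 0-1, table indicator in bit 2, index in bits 3-15).

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode (8086-style) address: segment shifted left 4 bits, plus offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Fields packed into a protected-mode segment selector. */
typedef struct {
    unsigned rpl;     /* bits 0-1: requested privilege level (0 = kernel, 3 = user) */
    unsigned use_ldt; /* bit 2: 0 = global descriptor table, 1 = local one */
    unsigned index;   /* bits 3-15: entry number in that descriptor table */
} selector_fields;

/* Split a 16-bit selector into its three fields. */
selector_fields decode_selector(uint16_t sel)
{
    selector_fields f;
    f.rpl = sel & 0x3u;
    f.use_ldt = (sel >> 2) & 0x1u;
    f.index = sel >> 3;
    return f;
}

/* Inverse of decode_selector: pack the three fields back into 16 bits. */
uint16_t encode_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192u && use_ldt < 2u && rpl < 4u);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}
```

For example, `real_mode_linear(0x1234, 0x0010)` gives 0x12350, while the selector value 0x001B decodes to index 3 in the GDT with privilege level 3 (user mode).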

Furthermore, memory is isolated between applications. This means that even if you use all of your memory (or really, all of your address space), you don't actually use all of the computer's memory. The OS maps your contiguous address space onto physical memory page by page (and, as you said, onto the HDD).

The environment the OS provides you with is basically a flat address space (virtual memory) that is (I think) up to 3 GB in Windows (the OS keeps the rest for itself). There is no need to worry about the segment registers. You basically have all that memory to play with yourself, and the OS will take care of fiddling with the low-level parts. Plus, cs, ds and the like all map the same flat memory, so a write through ds to the address in eip would land on your own code (if the page were writable).

Second:
The first two bytes are the magic 'MZ'. Everything else you asked about is in the specification of the executable format, which can be found on the internet (I don't have a link, sorry). The executable format used by Windows is called PE (Portable Executable). If you want more information on Intel-type processors and their memory addressing, you can get the "Intel System Programming Guide" from Intel's website.

And a last word on the loading of the exe: the program isn't really loaded into three different free segments in memory (in the old 8086 sense of a segment). Rather, it is loaded into a flat memory space where only the 32-bit offset matters (to you).

The entire x86 instruction set can be found on Intel's website, in a document called the Instruction Set Reference. It is probably listed under the documentation for the processors.

##### Share on other sites
On the x86, physical addresses are mapped to 64 bits. Here are the chip manuals: IA-32 Intel® Architecture Software Developer's Manual, Volumes 1, 2A, 2B, 3A and 3B, downloadable as PDF files. I recommend getting them all. They should also be available in languages other than English.

Under Windows, the flat address space is 4 GB: the lower 2 GB are for user-mode addresses and the upper 2 GB are for kernel-mode addresses. In some configurations this can be tweaked to 3 GB and 1 GB. The address space of each process is separate; however, the kernel address space is mapped into the upper 2 GB of every process, so by and large this memory is shared. If you really want to get into the low-level details of Windows memory management, check out Memory Management: What Every Driver Writer Needs to Know.
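
As a rough illustration of that split, the sketch below classifies a 32-bit virtual address by comparing it against the boundary. The names are hypothetical, and it ignores details such as small reserved regions near the boundary, so treat it as a simplified model rather than what the kernel actually checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USER_LIMIT_2GB 0x80000000u /* default split: 2 GB user, 2 GB kernel */
#define USER_LIMIT_3GB 0xC0000000u /* tweaked split: 3 GB user, 1 GB kernel */

/* Returns true if vaddr lies in the user-mode part of the address space. */
bool is_user_address(uint32_t vaddr, uint32_t user_limit)
{
    /* Only the two splits described in the post are modeled here. */
    assert(user_limit == USER_LIMIT_2GB || user_limit == USER_LIMIT_3GB);
    return vaddr < user_limit;
}
```

With the default split, 0x00400000 counts as a user-mode address and 0x80000000 does not; an address such as 0xB0000000 is user-mode only in the 3 GB configuration.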

The operating system handles loading exe files. The Wikipedia entry for Portable Executable files is a good place to begin. The external links at the bottom will fill in the details, especially the articles by Matt Pietrek.

##### Share on other sites
Thanks for the links; it was 2 AM here and I was a tired panda.

Just a small nitpick about the 64-bit part (because, you know, I can =))

From manual 3A (Section 3.3):
In protected mode, the IA-32 architecture provides a normal physical address space of 4 GBytes (2^32 bytes). This is the address space that the processor can address on its address bus.
...
Starting with the Pentium Pro processor, the IA-32 architecture also supports an extension of the physical address space to 2^36 bytes (64 GBytes).

There's also IA-32e, but I don't know anything about that.

Cheers...

##### Share on other sites
Well, creating a PE exe is not so easy, because the information behind these links is shit. I found two or three copies of the same table describing the bytes of the executable, but none of them actually shows a real executable header.
According to the table in pecoff.doc from microsoft.com, the first two bytes are the magic (I don't know what that is), and it is described as follows:
Quote:
 The unsigned integer that identifies the state of the image file. The most common number is 0x10B, which identifies it as a normal executable file. 0x107 identifies it as a ROM image, and 0x20B identifies it as a PE32+ executable.

But the magic is actually 0x4D5E. It is followed by the minor and major linker version, and then by an address of I don't know what; whatever it is, it is not the size of the code.
I just remember my old computer. It had no HDD, only a 5.25'' FDD, but when I wanted to write a low-level program I just wrote it into memory and started it by giving the address with a 'G' suffix. I think everything in 'modern' computers is protected along the lines of "you cannot edit where you have no business".
I'm tired of reading theory that is very difficult (almost impossible) to try out in practice.
To make this post more meaningful: I'm asking for an example of a working PE header, not just something that works only in theory.

##### Share on other sites
There is more than one header. 0x4d5a (assuming you mean that instead of 0x4d5e) is the magic number for the original DOS executable header that is still present in modern formats. The 0x010b stuff is the type indicator of one of the optional headers.

##### Share on other sites
Quote:
 Original post by Prak: Thanks for the links; it was 2 AM here and I was a tired panda. Just a small nitpick about the 64-bit part (because, you know, I can =)) From manual 3A (Section 3.3): In protected mode, the IA-32 architecture provides a normal physical address space of 4 GBytes (2^32 bytes). This is the address space that the processor can address on its address bus. ... Starting with the Pentium Pro processor, the IA-32 architecture also supports an extension of the physical address space to 2^36 bytes (64 GBytes). There's also IA-32e, but I don't know anything about that. Cheers...

I meant to respond to this before. I don't know why I didn't. Anyway.

As far as a user-mode program running on IA-32 is concerned, virtual memory addresses will always be 32 bits.

Regarding my remark about 64 bit mappings, I should have qualified my comment as Windows-centric. The PHYSICAL_ADDRESS type is 64 bits long. Consult the sections titled "Virtual Address Space" and "Physical Address Space" in the Memory Management link I dropped.

Quote:
 As hardware has evolved, the number of address bits has increased, leading to larger physical address spaces and potentially greater amounts of RAM. Current x86 CPUs use 32, 36, or 40 bits for physical addresses in the modes that Windows supports, although the chipsets that are attached to some 40-bit processors limit the sizes to fewer bits. Current releases of 32-bit Windows support a maximum of 37 bits of physical address for use as general-purpose RAM (more may be used for I/O space RAM), for a maximum physical address space of 128 GB. (These values may increase in the future.) Windows also continues to support older processors that decode only 32 bits of physical address (and thus can address a maximum of 4 GB).
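
The sizes quoted in this thread follow directly from the number of address bits: n bits address 2^n bytes. A small sketch of that arithmetic (the function names are mine, not from any manual):

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes addressable with the given number of address bits. */
uint64_t address_space_bytes(unsigned bits)
{
    assert(bits < 64u); /* 1 << 64 would overflow a 64-bit integer */
    return (uint64_t)1 << bits;
}

/* The same size expressed in gigabytes (2^30 bytes each). */
uint64_t address_space_gb(unsigned bits)
{
    assert(bits >= 30u && bits < 64u);
    return address_space_bytes(bits) >> 30;
}
```

So 32 bits give 4 GB, 36 bits give 64 GB, 37 bits give 128 GB and 40 bits give 1024 GB, which matches the figures in both quotes.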

##### Share on other sites
Quote:
 Original post by Anon Mike: There is more than one header. 0x4d5a (assuming you mean that instead of 0x4d5e) is the magic number for the original DOS executable header that is still present in modern formats. The 0x010b stuff is the type indicator of one of the optional headers.

4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF...

0x4D5A is already known. Following the PE32 description table, the 0x9000 would be the major and minor linker version, and the next 4 bytes the size of code (word and dword values are in descending order)... but how could the code be 3 bytes long? Next is the size of initialized data, which is 4 bytes long, and the table fails to say what the FF FF is for. Not every PE is exactly the same, but these FF FF values give an error if they are different. I'm asking for documentation that describes exactly what must be written in this part of the exe. I need an example to follow, because writing low-level programs in a high-level programming environment is really shit.
Less theory, more practice :)

##### Share on other sites
    typedef struct _IMAGE_DOS_HEADER {
        WORD e_magic;
        WORD e_cblp;
        WORD e_cp;
        WORD e_crlc;
        WORD e_cparhdr;
        WORD e_minalloc;
        WORD e_maxalloc;
        WORD e_ss;
        WORD e_sp;
        WORD e_csum;
        WORD e_ip;
        WORD e_cs;
        WORD e_lfarlc;
        WORD e_ovno;
        WORD e_res[4];
        WORD e_oemid;
        WORD e_oeminfo;
        WORD e_res2[10];
        LONG e_lfanew;
    } IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

IIRC, the only parts that matter are the MZ at the beginning and the e_lfanew at the end.
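
To tie this to the bytes quoted earlier in the thread (4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF): read as little-endian WORDs against the struct above, they are plain DOS header fields, not PE32 linker versions or code sizes. A minimal sketch, where `read_le16`, `dos_header_start` and the `OFF_*` names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit little-endian value (low byte first) from a byte buffer. */
uint16_t read_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* The 14 bytes from the hex dump posted earlier in the thread. */
const uint8_t dos_header_start[14] = {
    0x4D, 0x5A, 0x90, 0x00, 0x03, 0x00, 0x00,
    0x00, 0x04, 0x00, 0x00, 0x00, 0xFF, 0xFF
};

/* Byte offsets of the first seven fields: consecutive WORDs, 2 bytes each. */
enum {
    OFF_E_MAGIC    = 0,  /* 4D 5A -> 0x5A4D, the "MZ" signature */
    OFF_E_CBLP     = 2,  /* 90 00 -> 0x0090 */
    OFF_E_CP       = 4,  /* 03 00 -> 0x0003 */
    OFF_E_CRLC     = 6,  /* 00 00 -> 0x0000 */
    OFF_E_CPARHDR  = 8,  /* 04 00 -> 0x0004 */
    OFF_E_MINALLOC = 10, /* 00 00 -> 0x0000 */
    OFF_E_MAXALLOC = 12  /* FF FF -> 0xFFFF */
};
```

For instance, `read_le16(dos_header_start + OFF_E_MAGIC)` yields 0x5A4D, which is why the signature constant is written 0x5A4D even though the file starts with the bytes 4D 5A. The FF FF is simply `e_maxalloc`.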

Descriptions of several binary file types can be found here: Wotsits: Binary formats

Other related constants:

#define IMAGE_DOS_SIGNATURE 0x5A4D
#define IMAGE_OS2_SIGNATURE 0x454E
#define IMAGE_OS2_SIGNATURE_LE 0x454C
#define IMAGE_VXD_SIGNATURE 0x454C
#define IMAGE_NT_SIGNATURE 0x4550

These values can be found in the header files of many freely available compilers: MinGW, Lcc-win32, Pelles C, ...
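
Putting the pieces together, a loader-style check might look roughly like the sketch below: verify "MZ", follow `e_lfanew` (at byte offset 0x3C, after 30 WORDs) to the "PE\0\0" signature, then skip the 20-byte COFF file header to reach the optional header's magic (0x10B for PE32). All function names and the `stub_image` buffer are hypothetical; the buffer is a synthetic test pattern with only those fields filled in, not a loadable executable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOS_SIGNATURE    0x5A4Du     /* IMAGE_DOS_SIGNATURE: bytes 'M','Z' */
#define NT_SIGNATURE     0x00004550u /* IMAGE_NT_SIGNATURE: bytes 'P','E',0,0 */
#define E_LFANEW_OFFSET  0x3Cu       /* e_lfanew comes after 30 WORDs (60 bytes) */
#define COFF_HEADER_SIZE 20u         /* file header between signature and optional header */

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the file offset of the "PE\0\0" signature, or -1 if it is not found. */
long find_pe_signature(const uint8_t *image, size_t size)
{
    assert(image != NULL);
    if (size < E_LFANEW_OFFSET + 4u) return -1;       /* too short for a DOS header */
    if (le16(image) != DOS_SIGNATURE) return -1;      /* no "MZ" */
    uint32_t off = le32(image + E_LFANEW_OFFSET);
    if (off > size || size - off < 4u) return -1;     /* e_lfanew points outside the buffer */
    if (le32(image + off) != NT_SIGNATURE) return -1; /* no "PE\0\0" */
    return (long)off;
}

/* Returns the optional header magic (e.g. 0x10B), or -1 on any failure. */
long optional_header_magic(const uint8_t *image, size_t size)
{
    long pe = find_pe_signature(image, size);
    if (pe < 0) return -1;
    size_t magic_off = (size_t)pe + 4u + COFF_HEADER_SIZE;
    if (magic_off > size || size - magic_off < 2u) return -1;
    return le16(image + magic_off);
}

/* Synthetic buffer: only the fields checked above are set, the rest is zero. */
const uint8_t stub_image[0x80] = {
    [0x00] = 0x4D, [0x01] = 0x5A, /* e_magic = "MZ" */
    [0x3C] = 0x40,                /* e_lfanew = 0x40 */
    [0x40] = 'P',  [0x41] = 'E',  /* "PE\0\0" */
    [0x58] = 0x0B, [0x59] = 0x01  /* optional header magic 0x010B */
};
```

On `stub_image`, `find_pe_signature` returns 0x40 and `optional_header_magic` returns 0x10B. A real loader validates far more than this (section table, alignment, sizes), so this only shows how the DOS header and the PE headers are chained.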
