VanillaSnake21

Assembly word ptr, byte ptr etc. Why?


Hi, why do I need to put these modifiers when I move something? For example:

val_one dw 0
mov byte ptr [val_one], 'e'

Can you please give me a detailed answer (don't simplify it)? All the explanations I have read so far say something like the assembler is "expecting" a byte/word, but that is too general. The reason I'm asking this is because I'm trying to understand why a pointer has to be declared that it's pointing to a certain type. I mean an int pointer is 4 bytes large and it successfully points to an int, a LargeClass* pLC is also 4 bytes long and it points successfully to a 128 byte class. So why exactly does the compiler "needs" to know what the pointer is pointing to? Thanks

Quote:
Original post by VanillaSnake21
val_one dw 0
mov byte ptr [val_one], 'e'

Why do you reserve a word (dw), but then only move a byte (byte ptr)?

Quote:
Original post by VanillaSnake21
I'm trying to understand why a pointer has to be declared that it's pointing to a certain type.

Because the type systems of assemblers are much weaker than those of high level languages. For example, the difference between signed and unsigned doesn't exist at all at the data level - you have to choose the right instructions to reflect that (imul vs. mul on x86, for example).
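To make the signed/unsigned point concrete, here is a small C++ sketch. It is only an analogy to picking imul or mul: the helper names (`widen_signed`, `mul_unsigned` and so on) are made up for illustration, and the byte-to-signed conversion is assumed to wrap the way two's-complement hardware does.

```cpp
#include <cstdint>

// The same 8-bit pattern, widened two ways. Which one happens is decided by
// the operation chosen (sign-extend vs. zero-extend), not by the stored bits.
// Assumption: two's-complement conversion (guaranteed since C++20, and what
// mainstream compilers did before that).
std::int32_t widen_signed(std::uint8_t bits) {
    return static_cast<std::int8_t>(bits);   // bit pattern 0xFF becomes -1
}

std::int32_t widen_unsigned(std::uint8_t bits) {
    return bits;                             // bit pattern 0xFF becomes 255
}

// Multiply two bytes, treating both as signed operands (think imul).
std::int32_t mul_signed(std::uint8_t a, std::uint8_t b) {
    return widen_signed(a) * widen_signed(b);
}

// Multiply two bytes, treating both as unsigned operands (think mul).
std::int32_t mul_unsigned(std::uint8_t a, std::uint8_t b) {
    return widen_unsigned(a) * widen_unsigned(b);
}
```

Passing the byte 0xFF to both multiply helpers gives different products even though the stored bits are identical, which is the sense in which the instruction, not the data, carries the signedness.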

The same goes for pointers. In assembler languages, a pointer is just an address without any more information. You don't know what's at that address. Every pointer in assembler is a char* if you will ;-)

Quote:
Original post by VanillaSnake21
So why exactly does the compiler "needs" to know what the pointer is pointing to?

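Because the type system doesn't provide enough information. In your example, val_one is just a symbolic name for an address. In NASM-style syntax the fact that you define a word there is irrelevant: val_one is NOT a word, just an address! (MASM and TASM do remember the declared size, which is why byte ptr is needed there to override it.)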

When you say you want to move 'e' there, the compiler has no idea if you want to store an 8-bit, a 16-bit or a 32-bit value at that address, because 'e' also doesn't carry enough type information, it's just a number.

On the other hand, if you move eax somewhere, the compiler knows that you want to move 32 bits, because eax is 32 bits wide. Same goes for al or ah (both 8 bits) and ax (16 bits).
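The same idea can be sketched in C++, assuming a hypothetical `store` helper that is not part of any library. The destination is a bare address, and the number of bytes written comes only from the type of the value being stored, much like the register operand fixes the width of a mov.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the destination is only an address. How many bytes
// get written is decided by the type of the value, not by the address.
template <typename T>
void store(void* address, T value) {
    std::memcpy(address, &value, sizeof(T));
}

// Fill four bytes with 0xFF, then store 'e' (0x65) using the width of T.
// Bytes beyond sizeof(T) keep their old 0xFF contents.
template <typename T>
std::array<std::uint8_t, 4> store_e_as() {
    std::array<std::uint8_t, 4> memory{0xFF, 0xFF, 0xFF, 0xFF};
    store(memory.data(), static_cast<T>('e'));
    return memory;
}
```

Calling `store_e_as<std::uint8_t>()`, `store_e_as<std::uint16_t>()` and `store_e_as<std::uint32_t>()` overwrites one, two and four bytes respectively. With a bare literal and a bare address there is nothing to pick `T` from, which is the gap that byte ptr and word ptr fill.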

Quote:
Original post by DevFred
Quote:
Original post by VanillaSnake21
val_one dw 0
mov byte ptr [val_one], 'e'

Why do you reserve a word (dw), but then only move a byte (byte ptr)?

Actually I got this example from somewhere, but I think if val_one was declared as a byte then I wouldn't need the cast (byte ptr)



Quote:
Original post by VanillaSnake21
So why exactly does the compiler "needs" to know what the pointer is pointing to?

Because the type system doesn't provide enough information. In your example, val_one is just a symbolic name for an address. The fact that you define a word there is irrelevant. val_one is NOT a word, just an address!

When you say you want to move 'e' there, the compiler has no idea if you want to store an 8-bit, a 16-bit or a 32-bit value at that address, because 'e' also doesn't carry enough type information, it's just a number.

On the other hand, if you move eax somewhere, the compiler knows that you want to move 32 bits, because eax is 32 bits wide. Same goes for al or ah (both 8 bits) and ax (16 bits).


Well, actually, when I said compiler I meant compiler. My actual example that I'm having trouble with is in C++; I just thought that if I understood it in assembly then I'd get it in C++. But still, a few questions about your comment. How does the assembler not know whether I want to store an 8-, 16- or 32-bit value if 'e' is 101 in decimal? How can 101 be anything but a byte?

Quote:
Original post by VanillaSnake21
How does the assembler not know whether I want to store an 8-, 16- or 32-bit value if 'e' is 101 in decimal? How can 101 be anything but a byte?


On 32-bit systems, numbers are generally 32 bits. On 16-bit systems, they are 16 bits. On 64-bit systems, they are 64 bits. If I give you an arbitrary number, there is no way for you to tell me what type of data it is or what platform it originated on. The value of 'e' (101 in ASCII) can be represented in a BYTE, WORD, DWORD, QWORD, float, double, or long double.
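As a rough illustration in C++ (the constant names and the `store_width` helper are invented for this sketch), the same number can sit in storage of several widths, and nothing about the number itself says which one is meant:

```cpp
#include <cstddef>
#include <cstdint>

// The same number, 101 (ASCII 'e'), held in types of different widths.
// The value is identical in every case; only the storage size changes.
constexpr std::uint8_t  e_as_byte   = 101;
constexpr std::uint16_t e_as_word   = 101;
constexpr std::uint32_t e_as_dword  = 101;
constexpr std::uint64_t e_as_qword  = 101;
constexpr double        e_as_double = 101.0;

// Hypothetical helper: how many bytes a store of this value would touch.
template <typename T>
constexpr std::size_t store_width(const T&) {
    return sizeof(T);
}
```

All five constants compare equal as numbers, yet `store_width` reports a different size for most of them. That size is the piece of information the assembler cannot get from the literal alone.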

What the data means to you does not matter to the assembler, just how you want to use it. Your specific example above, is NOT equivalent to:

DWORD val_one = 'e';

It is equivalent to (barring any mistakes on my part):

DWORD val_one;
val_one = MAKELONG( MAKEWORD('e', HIBYTE(LOWORD(val_one))), MAKEWORD(LOBYTE(HIWORD(val_one)), HIBYTE(HIWORD(val_one))));

Now, imagine that val_one was:

FF FF FF FF // -1

When you execute mov byte ptr [val_one], 'e', the result is (x86 is little-endian, so the lowest byte comes first, and 'e' is 0x65):

65 FF FF FF

Whereas a full 32-bit store, mov dword ptr [val_one], 'e', would give:

65 00 00 00
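The same comparison can be written at the value level without the Win32 macros. This is a minimal sketch; the function names are made up here, and it assumes the 32-bit view of val_one used in this post.

```cpp
#include <cstdint>

// Replace only the low byte of a 32-bit value and keep the upper 24 bits.
// On little-endian x86 the low byte is the one at the lowest address, which
// is the single byte that a byte-sized store to that address overwrites.
constexpr std::uint32_t replace_low_byte(std::uint32_t value, std::uint8_t byte) {
    return (value & 0xFFFFFF00u) | byte;
}

// A full-width store discards the old value entirely; the small value is
// zero-extended to 32 bits.
constexpr std::uint32_t replace_whole(std::uint32_t /*old_value*/, std::uint8_t byte) {
    return byte;
}
```

Starting from 0xFFFFFFFF, `replace_low_byte(0xFFFFFFFFu, 'e')` keeps the three upper FF bytes, while `replace_whole(0xFFFFFFFFu, 'e')` clears them.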

So those modifiers are there to tell the assembler how large the destination is, since an immediate source operand would otherwise be taken at the default operand size for the platform. That is why you read that the assembler is "expecting" a specific data size to write: when you want a different size, you have to specify it, because the single mov mnemonic covers many encodings that differ only in operand size.

Hope I didn't botch anything up in that explanation [wink]

Quote:
The reason I'm asking this is because I'm trying to understand why a pointer has to be declared that it's pointing to a certain type. I mean an int pointer is 4 bytes large and it successfully points to an int, a LargeClass* pLC is also 4 bytes long and it points successfully to a 128 byte class. So why exactly does the compiler "needs" to know what the pointer is pointing to? Thanks


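The simple answer is that the pointee type tells the compiler what to do when you use the pointer: how many bytes to read or write when you dereference it, how far p + 1 moves, at which offsets the member variables live, and which functions to call. The pointer itself is only an address, exactly like val_one above. There are longer explanations, but they get into compiler design, which I don't know well enough to answer properly.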
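A short C++ sketch of what the compiler gets from the pointee type. `LargeClass` here is a stand-in with an assumed layout (a 4-byte id plus padding to 128 bytes), and `stride_in_bytes` is a hypothetical helper, not a standard function.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the 128-byte class from the question (layout is an assumption).
struct LargeClass {
    std::int32_t id;
    char payload[124];
};
static_assert(sizeof(LargeClass) == 128, "layout assumption for this sketch");

// Distance in bytes between p and p + 1. The compiler scales pointer
// arithmetic by sizeof(T), which it can only know from the pointee type.
template <typename T>
std::ptrdiff_t stride_in_bytes(T* p) {
    return reinterpret_cast<const char*>(p + 1) -
           reinterpret_cast<const char*>(p);
}

// Member access is "address + offset"; the offset also comes from the type.
constexpr std::size_t payload_offset = offsetof(LargeClass, payload);
```

Both pointers have the same size as pointers, but `stride_in_bytes` gives `sizeof(int)` for an `int*` and 128 for a `LargeClass*`. In assembly you would write that scaling and those offsets yourself; in C++ the declared type lets the compiler do it.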

Technically, the compiler doesn't have to know in C, and in C++ it is really hard to get right without that knowledge (but people still do vtable hacking). That is also what makes possible the void* liberties you can get away with when passing pointers and casting types. If you know the address of a function, or how the compiler sets up a class member function call, you can "do it yourself", but that's just not a good idea in your own program. If you are working with the target via an injected DLL, though, it is possible, and it is commonly used in game hacking.
