[C++] "Casting object addresses to char* ... almost always yields undefined behavior"

9 comments, last by DerekSaw 15 years, 9 months ago
I was reading an Item in Effective C++ and I came across a quote:
    "... casting object addresses to char* pointers and then performing pointer arithmetic on them almost always yields undefined behavior"
It would seem to me that, more often than not, that sort of code would yield perfectly defined behavior as long as the code is sensible. Consider this:
    class Foo; ... Foo* foo_ptr = new Foo; char* char_ptr = reinterpret_cast<char*>(foo_ptr);
We know by definition that sizeof(char) is always 1 byte. We also know that Foo consists of sizeof(Foo) bytes. So we may advance char_ptr as many as sizeof(Foo) times. What caveats am I failing to see? Is there some funny business regarding virtual functions and vtables, memory layout, etc.?
What he is getting at is that starting from something like:

class foo
{
public:
    virtual void stuff() {}
    int x;
};
class bar : public foo
{
public:
    int y;
};

this code is incorrect

bar b[20];
foo *f = &b[0];
f[1].x = 10;
cout << b[1].x << endl;

likewise, had I done

bar b[20];
char *c = (char *)&b[0];
c += sizeof(foo);
((foo *)c)->x = 10;
cout << b[1].x << endl;

I still didn't edit the right memory location, and I get garbage. (And in VS2008, I can look in the debugger and see that both times I just stomped the vtable pointer for b[1].)

The point of that statement is that given a pointer, you can't make assumptions about it. Unless you know that you have a POD type pointer, you could randomly trash data by manipulating it. That stands for any cast; the only point of saying "to char *" specifically is to emphasize that when you decide to throw out all information regarding your pointers, you break things.
Quote:The point of that statement is that given a pointer, you can't make assumptions about it. Unless you know that you have a POD type pointer, you could randomly trash data by manipulating it. That stands for any cast; the only point of saying "to char *" specifically is to emphasize that when you decide to throw out all information regarding your pointers, you break things.
I agree with that statement, although I don't think you've given good examples. Of course stepping through an array of type X in increments of sizeof(Y) bytes, where sizeof(X) != sizeof(Y), would result in an error. The idea behind the pointer arithmetic after converting to char* is that you generally can't do things like
struct A
{
    short sh;
    char byte;
    int n;
};
A a;
char* p = (char*)&a;
p += sizeof( short ) + sizeof( char );
*((int*)p) = 5; // We're probably not assigning to n
_______________The essence of balance is detachment. To embrace a cause, to grow fond or spiteful, is to lose one's balance after which, no action can be trusted. Our burden is not for the dependent of spirit. - Mayar, Third Keeper
I can't find the reference in Effective C++ Vol 3, could you look it up for me?
Quote:Original post by thedustbustr
I can't find the reference in Effective C++ Vol 3, could you look it up for me?


Yes, I don't remember that one being in the latest version (the one I own). Maybe it was one of the ones he removed, which would show that the item may not be important / valid.
Quote:Original post by Mike.Popoloski
Quote:Original post by thedustbustr
I can't find the reference in Effective C++ Vol 3, could you look it up for me?


Yes, I don't remember that one being in the latest version (the one I own). Maybe it was one of the ones he removed, which would show that the item may not be important / valid.


I was not citing the title of an item; rather, I was citing a line within one of the items, namely "Item 27: Minimize casting" (page 116).

That quote caused me confusion because for example, casting to char is pretty much the only way to write objects to disk isn't it?
Quote:Original post by fpsgamer

That quote caused me confusion because for example, casting to char is pretty much the only way to write objects to disk isn't it?


No, and it never has been for non-POD types. In that item he has just explained that a pointer to a derived object and a pointer to its base class subobject do not necessarily hold the same address; you have vtables (as pointed out in this thread as well), then there is structure padding, etc. The only thing the standard guarantees for a POD class in this regard is that the object's bytes can be copied into a block of memory of the same size (or bigger) and copied back into the object, leaving it in the same state as before the operation.

If you know which compiler you are using and its characteristics, then you can do some non-standard shenanigans.
Quote:Original post by fpsgamer
casting to char is pretty much the only way to write objects to disk isn't it?


You frighten me when you say things like that.
Any use of 'reinterpret_cast' involves at least implementation-defined behavior. The only specified behavior is that casting to a different type and then casting back to the first type must, in many cases, give back the original value. Other than that, the behavior is unspecified.

In other words, any useful code that uses reinterpret_cast is making assumptions about the implementation it is being compiled under.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
@Zalhman, sorry to cause you alarm :) If it makes you feel any better I have always been knowingly ignorant on this topic, so no actual software was at any risk [grin]

Quote:Original post by Extrarius
Any use of 'reinterpret_cast' involves at least implementation-defined behavior. The only specified behavior is that casting to a different type and then casting back to the first type must, in many cases, give back the original value. Other than that, the behavior is unspecified.

In other words, any useful code that uses reinterpret_cast is making assumptions about the implementation it is being compiled under.


I think I get what you mean.

So could it be said that (in practical cases) it can be moral to reinterpret_cast POD types, but never moral to reinterpret_cast non-POD types?

[edit]

On a somewhat related note. In C++, can basic types like int have arbitrary alignment requirements? If so, then that would mean taking an array of type int and walking through it via char* could yield some unsavoury behavior.

This topic is closed to new replies.
