Back to General and Gameplay Programming

Strict aliasing rule

General and Gameplay Programming Programming

Started by Hodgman January 01, 2013 01:51 PM

7 comments, last by Hodgman 11 years, 3 months ago

Hodgman

52,717

Author

January 01, 2013 01:51 PM

This is a spin-off from another thread here.

I was advocating the use of a "polymorphic" design, where two structures that have the same initial member were aliased:

namespace Commands
{
	enum Type { Foo };
}
struct Command
{
	Commands::Type id;
};
struct Foo
{
	Commands::Type id;
	int value;
};
//e.g.
Foo foo = { Commands::Foo, 1337 };
Command* cmd = (Command*)&foo;
switch( cmd->id )
{
	case Commands::Foo:
		Foo* fooCmd = (Foo*)cmd;
		printf( "id = %d, value = %d\n", (int)fooCmd->id, fooCmd->value );
		break;
}

At the time, I thought this violated the strict-aliasing rule, but that the code would produce the intended behaviour anyway. The worst thing that I thought would happen, is that the compiler would generate code that redundantly reads "fooCmd->id", even though it already read "cmd->id" just above.

However, the C++03 wording of the rule is:

If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:

• the dynamic type of the object,
• a cv-qualified version of the dynamic type of the object,
• a type that is the signed or unsigned type corresponding to the dynamic type of the object,
• a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
• an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
• a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
• a char or unsigned char type.

Does the bold statement mean that I'm not actually breaking the strict aliasing rule here, because the aliased value (i.e. id) is actually the correct type in both structures?

. 22 Racing Series .

Bregma

9,461

January 01, 2013 02:38 PM

I would interpret those rules as saying you're invoking undefined behaviour because a Command does not contain a Foo (and vice-versa) or any of the 'aforementioned' variants, so clause 5 does not apply.

The point of clause 5 is that you can access the Commands::Type member of either class through an lvalue expression and the behaviour remains defined.

Stephen M. Webb
Professional Free Software Developer

Khatharr

8,814

January 01, 2013 02:40 PM

Does the bold statement mean that I'm not actually breaking the strict aliasing rule here, because the aliased value (i.e. id) is actually the correct type in both structures?

That's my understanding of it.

Uwaah. Bregma ninja'd me.

I'll elaborate a little on what I mean. I see libs all the time that use structs that start with a member like cbSize indicating the length of the struct and then having more than one version or allow for the future addition of new versions.

Oh. Now that I'm looking at the code again I see that Bregma is correct. For some reason I thought you put a Foo in there.

struct Foo {
  int a;
  int b;
};

struct Bar {
  Foo derp;
  int c;
};

Bar thing;
((Foo)thing).a;

I think that's what he's saying. You confused me there for a minute by having the 'Command' type as well as the 'Commands' type, lol.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

Matias Goldberg

9,637

January 01, 2013 07:16 PM

I'm with Bregma's interpretation.

• an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union)

My understanding, it is saying that if Bar is derived from Foo, you can cast Foo to Bar and still access it correctly. Also the clause seems to allow for unions & typedefs.

• the dynamic type of the object

My interpretation here it is saying that it could be defined behavior because they're both of the same type (Commands::Type). However like Bregma said, this clause does not apply because you went into undefined behavior land when casting Foo into Command.

Implementation wise (so far all systems I know) will work as expected because the compiler can see you casted Foo into Command. However, compile some code with -fstrict-aliasing where the compiler can't see in it's scope you did that, and you'll run into trouble, for example this:

void test( Command *cmd, Foo *foo )
{
    cmd->id = 1;
    print( foo->id );
}

//Meanwhile in another compilation unit...
int main( /* ...args.. */ )
{
    Foo foo;
    foo->id = 3;
    foo->value = 0;
    test( reinterpret_cast<Command*>(foo), foo );
    return 0;
}

Note that MSVC is too relaxed on aliasing, so you're gonna need to test it into some other compiler which follows aliasing rules more strictly (like GCC with -fstrict-aliasing)

Twitter: @matiasgoldberg

Distant Souls ? Alliance AirWar ? My Free Royalty-Free Music Library

tanzanite7

1,409

January 02, 2013 12:36 AM

A bit of a tangent: Why on earth would one do such a thing? Why not just declaring the alignment explicitly through inheritance?

struct Id { Commands::Type id; };
struct Foo : Id { int value; };

edit: Good to see that gamedevs forum software still excels at exploding from even basic usage. Should be readable now.
edit: Oh, editing posts does not work either ... at least the changes seem to show up when reloading the page.

Hodgman

52,717

Author

January 02, 2013 07:04 AM

I guess I'm taking this part of the rule wording:
"If a program attempts to access the stored value of an object through ...
an aggregate type that includes the dynamic type of the object among its members ... [then the behaviour is well defined]"

In my interpretation, the stored value that I'm accessing is the value of id. In every case, it's accessed through an aggregate that includes the actual type of id.

So basically, I'm confused as to how the wording applies exactly when it comes to aggregates such as structures -- my above interpretation is focussed on the actual primitive type being accessed, whereas Bregma's interpretation focuses on the containing structures being different.

The point of strict-aliasing (and the restrict keyword) is that when I write to an int, the compiler knows that only other int variables might have been invalidated by that write (and may need to be re-read). So I figure that when I write to foo.id, which is an Commands::Type, the compiler knows that cmd->id might have been changed, because it's also a Commands::Type, and the aggregate types that contains these members is irrelevant -- only the types being read/written matter.

Implementation wise (so far all systems I know) will work as expected because the compiler can see you casted Foo into Command. However, compile some code with -fstrict-aliasing where the compiler can't see in it's scope you did that, and you'll run into trouble, for example this:

That's a good bit of code to clarify my question -- switching over to C99, which has almost the same rule -- given the code:

struct A { int value; };
struct B { int value; };
void test( A* a, B* b )
{
	a->value = 42;
	print( b->value );
}
int main()
{
	A obj = { 0 };
	test( &obj, (B*)&obj );
}

Is the above code equivalent to test1 or test2 below?

void test1( int* a, int* b )
{
	*a = 42;
	print( *b );
}
void test2( int* restrict a, int* restrict b )
{
	*a = 42;
	print( *b );
}
int main()
{
	int obj1 = 0, obj2 = 0;
	test1( &obj1, &obj1 );//should print 42
	test2( &obj2, &obj2 );//might print 0 or 42 (or do something else)
}

And if the latter, does this mean that we can emulate the restrict keyword simply by creating these kinds of wrapper structs?

And given Matias' example of soem broken code, it seems that this work around would make it well-defined, right?

void test( Command *cmd, Foo *foo )
{
    //cmd->id = 1; //instead of this, which the compiler apparently knows can't change foo->id
    int* id = &cmd->id;//int* is allowed to alias any int
    *id = 1;//now foo->id might possibly be 1, legitimately!?
    print( foo->id );//will print 1 (if (void*)cmd == (void*)foo)
}

Why on earth would one do such a thing? Why not just declaring the alignment explicitly through inheritance?

Yeah, in my actual implementation of this command system, I do use inheritance (and in C you could use composition, where the Foo struct begins with an Id member).
However, aliasing structures is a common pattern, so I'd like to understand the rule in it's edge cases!

Also, I've seen some compilers where #1 and #2 would fail, but #3/#4/#5 would pass this test (due to padding Base up to 4 bytes), which matters when structures have to match up with externally generated data formats:

struct Base { u8 id; };
struct Derived : public Base { u8 extra[3]; u32 extra2; };
struct DerivedHack { u8 id; u8 extra[3]; u32 extra2; };
#pragma pack(push)
#pragma pack(1)
struct BasePacked { u8 id; };
struct DerivedPacked : public BasePacked { u8 extra[3]; u32 extra2; };
#pragma pack(pop)
static_assert( sizeof(Base) == 1, "#1" );
static_assert( sizeof(Derived) == 8, "#2" );
static_assert( sizeof(DerivedHack) == 8, "#3" );
static_assert( sizeof(BasePacked) == 1, "#4" );
static_assert( sizeof(DerivedPacked) == 8, "#5" );

. 22 Racing Series .

gekko

479

January 03, 2013 07:03 AM

[quote name='Hodgman' timestamp='1357110260' post='5016592']
Is the above code equivalent to test1 or test2 below?
[/quote]

Pointers to two different types are presumed to not alias.

-- gekko

Matias Goldberg

9,637

January 04, 2013 04:35 AM

It is equivalent to test2. However, that's talking about "strict aliasing". Like I said, if you compile that code under MSVC (or GCC without strict aliasing) it will be equivalent to test1.

And if the latter, does this mean that we can emulate the restrict keyword simply by creating these kinds of wrapper structs?

That's compiler implementation dependent. For example in MSVC, simply no. The "emulation code" will compile into test1, while if you write the restrict keyword, MSVC will actually assume a & b aren't pointing to the same memory, even if they're of the same type.

Edit: Think 4x4 matrix multiplication with 2 input ptrs and one output ptrs for the result (3 arguments total).


void mul4x4( Matrix4x4 &r, Matrix4x4 &a, Matrix4x4 &b )
{
/* r = a * b; code here*/
}

With restrict keyword in all args, it results in massive optimization because to do r = a * b; you can't do a = a * b without overwritting critical matrix elements in a that are still needed. Since &r == &a == &b could be possible, the compiler has to do a copy of both a & b, hence the compiler ends up doing: r = a' * b'; where a' & b' are clones of a & b (actually not full copies because there ARE a few memory places that can be guaranteed to be written after all math has been calculated, particularly if there are many registers available in the target architecture)
With the restrict keyword in all three args, you're guaranteeing the compiler that &r != &a != &b; even if they're all of the same type (Matrix4x4 in this case).

it seems that this work around would make it well-defined, right?

Strictly speaking, no; because Foo & Command weren't supposed to point to the same memory location in the first place. We would need to do some testing and assembly inspection, but I doubt even GCC compiling with strict aliasing will produce broken code in that scenario. Edit: But not because it's standard compliant.
The only standard compliant workaround is to cast everything to char* and memcpy the data.

Also, I've seen some compilers where #1 and #2 would fail, but #3/#4/#5 would pass this test (due to padding Base up to 4 bytes), which matters when structures have to match up with externally generated data formats:

You also should read this site, which matters if there are virtual tables.
Also note how 'virtual' is implemented is implementation defined (all compilers happen to use vtables, but it's not a requirement afaik)

Twitter: @matiasgoldberg

Distant Souls ? Alliance AirWar ? My Free Royalty-Free Music Library

Hodgman

52,717

Author

January 04, 2013 05:44 AM

You also should read this site, which matters if there are virtual tables.

Yeah, I'm only interested in C++ POD types, and C types. The details of C++ non-POD types are obviously a lot more unpredictable across platforms.

I've read that before and the MS explanation for their extra padding is interesting though ;)

Strictly speaking, no; because Foo & Command weren't supposed to point to the same memory location in the first place.

Switching over to C99 lingo for a moment, it's possible to have an allocation that doesn't have a type:


void* data = malloc(sizeof(int));

and it only takes on the effective type of int when you store a value in it, and it maintains this effective type until it's next written to:


(*(int*)data)  = 42;

after the above line, it's undefined to access it via another type:


float f = *(float*)data;//undefined

however, as the (C99 version of the) rules in my first post say, it's valid to access the stored value via an lvalue expression that has the type of an aggregate type that includes an integer member...

So even though the allocation never had the effective type of this aggregate, the effective type is compatible with the member of the aggregate (and the object-type, aka byte-representation of both are equal too), so it sounds like this should be valid:


struct A { int value; };
A a = *(A*)data;

In which case, I could have an A and a B legitimately reading from the same memory address?

Bregma already said this wasn't true, but the C99 wording seems to say that a particular group of bytes (an object) takes on an effective type when it's written to, and then that written value can be fetched by basically any expression with a compatible type, which includes aggregates that are made up of compatible types. They key is that it says that the type of the lvalue expression is an aggregate, not that the type is a compatible member of an aggregate.

It does make sense that the point of this statement is so that I can write to a.value, and then make a copy of 'a' as a whole (which needs to read a.value)... but it seems ambiguous enough that it also allows for reading ((B*)&a)->value too

. 22 Racing Series .

Strict aliasing rule

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Strict aliasing rule

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines