Oh the union woes

Started by
4 comments, last by samoth 6 years, 10 months ago

What is the difference between...?


    struct foo
    {
            uint16_t a;
            uint16_t b;
            uint16_t c;
            uint16_t not_used;
    };

and


    union foo
    {
        uint64_t dont_care;
        struct
        {
                uint16_t a;
                uint16_t b;
                uint16_t c;
                uint16_t not_used;
        };
    };


Well yes, 47 characters. But apart from that, what if you e.g. make a std::atomic<foo>?

For a compare-exchange, the difference is around 200 instructions! And no, the union is not the one with more instructions. :)

It is of course well-known that std::atomic doesn't guarantee that just about everything be atomic, in particular it gives no guarantees whatsoever except on atomic_flag. How could one guarantee every possible thing being atomic, anyway -- the hardware doesn't necessarily even support that. OK. Accepted.

But you can somewhat expect that something that easily fits within the instruction set's capabilities (an integer, a struct of two integers, four smaller integers) is "just atomic" anyway. Well, that's not the case.

But what's really stunning is not so much that the compiler fails to emit e.g. a single CMPXCHG8B when it could just do that in the first case. What truly does me is that if you give the compiler a "hint" by aliasing with an uint64_t as in the second case, then all of a sudden, as if by magic, it works...

How did I find out? You don't want to know. :lol:

Advertisement

So the second struct is good and the first one bad?

What truly does me is that if you give the compiler a "hint" by aliasing with an uint64_t as in the second case, then all of a sudden, as if by magic, it works...

The first struct has 2 byte alignment so is not compatible with CMPXCHG8B. The second struct is 8 byte aligned. That seems like a sensible, non-magic answer :D

Sure, std::atomic<T> could be careful when allocating its internal T, making sure that anything with 4<size<=8 gets allocated with 8 byte alignment/padding.... but I'm guessing you've discovered that it doesn't?

The first struct has 2 byte alignment so is not compatible with CMPXCHG8B. The second struct is 8 byte aligned. That seems like a sensible, non-magic answer :D

Yes, obviously... except, no. :)

Trying struct alignas(8) foo {...} which should align the struct compatibly to CMPXCHG8B has exactly the same effect. In a union, it works (well it works either way...) but as a sole struct, it calls into library functions.

Which compiler are you using to test this?

I just tried it in VS2017 and interestingly got this compile error when testing the struct version:


error C2338: You've instantiated std::atomic<T> with sizeof(T) equal to 2/4/8 and alignof(T) < sizeof(T). Before VS 2015 Update 2, this would have misbehaved at runtime. VS 2015 Update 2 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility. Please define _ENABLE_ATOMIC_ALIGNMENT_FIX to acknowledge that you understand this, and that everything you're linking has been compiled with VS 2015 Update 2 (or later).

^^ That sounds like originally in VS2015, they would emit 64-bit atomic instructions but would fail to correctly align/pad the data, resulting in bugs at runtime :lol:

I tried compiling a simple test with:

  • struct with /D_ENABLE_ATOMIC_ALIGNMENT_FIX
  • struct with alignas(8)
  • union

And they all produced the same code, performing assignment to the atomic with:

xchg QWORD PTR [rcx], rax

So your bug is not universal across compilers... Yay for the standard library nuts and bolts being completely implementation defined :(

On 15.6.2017 at 0:55 AM, Hodgman said:

error

That must be the funniest ever error message I've seen in my life. And maybe the longest (standard library template errors with 5,000 lines of "instantiated from", doesn't really count). But you can't say it doesn't tell you exactly what's wrong in the compiler's opinion or what you have to do. :D

The compiler I used to get the above was GCC 7.1, I didn't bother trying clang... was too frustrating with a single compiler already.

This topic is closed to new replies.

Advertisement