• entries
743
1924
• views
584355

1603 views

Was just thinking about using some PIMPL in a project I'm working on and had a thought about avoiding the (probably irrelevant) cost of having to new the implementation for a value class.

Would this work?// thing.hclass Thing{public: Thing();private: classs Imp; Imp &imp(); char imp_data[1024]; // or whatever};// thing.cppclass Thing::Imp{ // the private stuff};Thing::Thing(){ new(imp_data) Imp(); // placement new?}Thing::Imp &imp(){ return *(static_cast(imp_data));}
Might have placement new syntax wrong there, but would this be an implementation of PIMPL that would mean the actual allocation for the private implementation data would just be part of the normal class memory rather than a pointer to data that could be (potentially) somewhere else leading to cache misses?

I'm really thinking out loud here. Would appreciate anyone's thoughts on this, never seen it done.

You would have to make sure that the size and alignment of imp_data are correct and you are missing a placement delete, but aside from that, yes, this is possible

You would have to make sure that the size and alignment of imp_data are correct and you are missing a placement delete, but aside from that, yes, this is possible

Thanks, yes, placement delete required.

What's the alignment issue? I assumed if the size was sufficient, it would just be like newing normal memory, no?

If your imp_data is or contains an __m256 for example it needs 32-byte alignment, and new won't be able to give you that because it doesn't know its an __m256 and usually only provides 8- or 16-byte aligned addresses for most types. That will cause your code to crash horribly upon casting your unsigned char array to an __m256 and trying to use it. For "most" types I guess it will work, though.

Also note that if you ever need to change the imp_data's size, which will likely happen every time you need to modify its internals, that is a breaking change binary-compatibility wise, so your PIMPL is still somewhat leaky (though I don't know how relevant that is in your case).

If your imp_data is or contains an __m256 for example it needs 32-byte alignment, and new won't be able to give you that because it doesn't know its an __m256 and usually only provides 8- or 16-byte aligned addresses for most types. That will cause your code to crash horribly upon casting your unsigned char array to an __m256 and trying to use it. For "most" types I guess it will work, though.

Also note that if you ever need to change the imp_data's size, which will likely happen every time you need to modify its internals, that is a breaking change binary-compatibility wise, so your PIMPL is still somewhat leaky (though I don't know how relevant that is in your case).

How would you modify the code example to fix this problem?  I haven't worked with SIMD types (which is what that looks like), so I'm curious.

As for ensuring the char array is large enough, a simple static assert will notify the programmers of any potential problems at compile time.

Thanks for the info guys. I'm just looking for a way to avoid the overhead of an extra new and the cache unfriendly nature of having the implementation possibly miles away from the normal class data, but it's probably a non issue.

If your imp_data is or contains an __m256 for example it needs 32-byte alignment, and new won't be able to give you that because it doesn't know its an __m256 and usually only provides 8- or 16-byte aligned addresses for most types. That will cause your code to crash horribly upon casting your unsigned char array to an __m256 and trying to use it. For "most" types I guess it will work, though.

Also note that if you ever need to change the imp_data's size, which will likely happen every time you need to modify its internals, that is a breaking change binary-compatibility wise, so your PIMPL is still somewhat leaky (though I don't know how relevant that is in your case).

How would you modify the code example to fix this problem?  I haven't worked with SIMD types (which is what that looks like), so I'm curious.

As for ensuring the char array is large enough, a simple static assert will notify the programmers of any potential problems at compile time.

Static assert is a good call. The obvious fix in the spirit of the journal entry is to do something like this (untested):

class Thing
{
public:
Thing();

private:
classs Imp;

Imp &imp();

char imp_data[1024] __attribute__ ((aligned (32 /* or whatever */))); // or whatever
};


And then also do a static assert on that alignment to make sure it matches the natural alignment of Imp (you can gather that natural alignment using alignof() in C++11 I believe, however I don't think you can just extract the "32" from the declaration above like you can the size with sizeof(), so you might need a named constant for it). Of course you still need to make sure the declared alignment agrees with Thing::Imp, but you also have that problem with the size anyway.

Of course, another solution that doesn't expose the alignment to the user would be to make sure imp_data has at least ALIGNMENT - 1 extra padding bytes, and before doing your placement new doing some math to construct Imp at the right offset in the array to get the correct alignment (and the extra bytes ensure you always have enough space for the contents of Imp themselves). This is quite the hack and even though it - I believe - technically "works" in the sense that alignment issues are confined to the implementation, it kind of defeats Aardvajk's purpose of improving cache-friendliness since it gives you extra padding bytes that are going to get in the way of you packing your Things together efficiently. In fact, the real underlying issue is that you are compensating for the caller's inability to properly align the object it is constructing (because you are, in effect, lying to the compiler about the alignment the Thing class actually needs). So I wouldn't go with that approach, honestly.

And I was thinking more about actual binary compatibility, i.e. suppose you have the Thing class implementation in a library, call it ThingLib, and that you build CoolProgram using ThingLib (either statically or dynamically, doesn't matter). Now suppose you improve your Thing, which eventually lead to a change in the size of its imp_data member. You then rebuild ThingLib. At that point with the usual PIMPL techniques you:

- do absolutely nothing, if CoolProgram links to ThingLib dynamically (you can just swap in the new library)

- relink CoolProgram to the new ThingLib (without needing to recompile any of CoolProgram's object files) if linking statically

However, with this idiom we not only have to relink CoolProgram, but we also need to recompile every single object file that uses the Thing class and depends on imp_data's size. Which, if you are using this idiom everywhere, basically translates to a full rebuild of CoolProgram every time the Thing class' internal layout changes. As I mentioned this may or may not matter, if this is for internal use you probably do not care much for binary compatibility and the build time savings are (I guess) negligible, but the tradeoff is real and should be kept in mind.

I'd be interested to see alternative solutions though.

Good points re binary compatibility. This is actually for something I intend to put into a static library.

Sounds like there are a raft of reasons I've not seen this done and I'd be better off sticking to the normal PIMPL approach.

Great responses, thanks all.

Create an account

Register a new account