Should I use reinterpret_cast when casting between int and class pointers?

11 comments, last by the_edd 15 years, 1 month ago
Hi guys, I've tried to find the answer to this on my own via Google, but no luck so far. I'm using the Havok Physics engine, and it stores user data as a long int. I reinterpret_cast my SceneObject pointer into this, and then back out when I want to retrieve information from a collision. As far as I understand, reinterpret_cast isn't done at compile time? I've also heard that one should avoid it, as the results aren't defined the same way across compilers. However, the Havok examples use it in a lot of places. What gives?
In the link you've posted there is the answer:

Quote:reinterpret_cast only guarantees that if you cast a pointer to a different type, and then reinterpret_cast it back to the original type, you get the original value.


You can think of reinterpret_cast as a C-style cast. You are telling the compiler "trust me, I know what I am doing". So if you use reinterpret_cast to cast the pointer to long int and then back to the same kind of pointer, you will be fine (provided long int is wide enough to hold the pointer). But only in this case. That is why casting is dangerous if you don't know what you are doing :-)
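A minimal sketch of that round trip, for illustration only. SceneObject here is a hypothetical stand-in for the poster's class, and the helper names are made up. The sketch uses std::uintptr_t rather than long int, because where that type exists it is wide enough to hold a pointer, while long int may not be; whether Havok's user data field can take it is not something this thread confirms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the poster's class.
struct SceneObject {
    int id;
};

// Pointer -> integer. The integer carries the address and nothing else.
inline std::uintptr_t to_user_data(SceneObject* obj) {
    return reinterpret_cast<std::uintptr_t>(obj);
}

// Integer -> pointer. Only valid if the integer came from a SceneObject*.
inline SceneObject* from_user_data(std::uintptr_t data) {
    return reinterpret_cast<SceneObject*>(data);
}

// Usage: the pointer that comes back is the pointer that went in.
inline void demo_round_trip() {
    SceneObject obj{42};
    std::uintptr_t data = to_user_data(&obj);
    assert(from_user_data(data) == &obj);
}
```

The guarantee only covers casting back to the same pointer type you started from.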
It's only that or using a simple (long int) cast that works, so there's no alternative. =)
The simple answer is that you should NEVER cast an int to a class object pointer. Why do you think this is necessary?

EDIT: Never mind, I re-read your post and now understand why you need to do it. It makes my alarm bells go off anyway. :-\
Thanks guys!

Yeah, I'm not too keen on using a long integer either. I would have preferred a void pointer because that's what I've seen before, but whatever. And as far as I understand, I can't use a static_cast here: the compiler barks when I do.

So finally, does reinterpret_cast incur any performance penalty? I've had nightmares about dynamic_cast (it seems some implementations do a string comparison?), but my profiling runs show not much time spent in these reinterpret_cast calls.
Quote:Original post by SymLinked
So finally, does reinterpret_cast incur any performance penalty?


None whatsoever. It just tells the compiler to treat the value as if it had been of the different type all along. No data conversion is done, unlike, say a static_cast of a float into an int.
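To make the "no data conversion" point concrete, here is a small sketch contrasting a value conversion with a look at the unchanged bytes. The function names are made up. It uses memcpy instead of reinterpret_cast to read the float's bytes, because reading a float through an integer reference would break the aliasing rules; the point about the bits staying the same is unchanged. The bit pattern shown assumes a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// static_cast converts the value: the fractional part is dropped and the
// result has a completely different bit pattern from the float.
inline int converted(float f) {
    return static_cast<int>(f);
}

// Copying the bytes keeps the representation exactly as it was.
inline std::uint32_t raw_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

inline void demo_conversion() {
    assert(converted(3.75f) == 3);          // value was converted
    assert(raw_bits(1.0f) == 0x3F800000u);  // IEEE 754 bits of 1.0f, untouched
}
```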

Quote:I've had nightmares about dynamic_cast (it seems some implementations do a string comparison?), but my profiling runs show not much time spent in these reinterpret_cast calls.

It shouldn't be doing string comparisons (if it is, it would be based on the typeinfo data you can get through typeid). At worst, it should be walking down inheritance trees until it finds a match for the class (often, it uses the vtbl pointer as a type identifier).
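For comparison, a minimal sketch of what dynamic_cast does from the caller's side: a run-time check that yields a null pointer when the object is not of the requested type. Shape, Circle and Square are hypothetical classes, not anything from Havok.

```cpp
#include <cassert>

// The base needs at least one virtual function for dynamic_cast to work.
struct Shape  { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// Returns true only if the object really is a Circle (or derived from it).
inline bool is_circle(Shape* s) {
    return dynamic_cast<Circle*>(s) != nullptr;
}

inline void demo_dynamic_cast() {
    Circle c;
    Square q;
    assert(is_circle(&c));
    assert(!is_circle(&q));  // wrong type: the cast yields a null pointer
}
```

This run-time check is the cost being discussed; reinterpret_cast does no such check.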

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
reinterpret_cast is supposed to take the input data and reinterpret it as the output data.

How many instructions does it take to reinterpret a pointer as an integer? Under most architectures, 0 instructions. (I could come up with a pathological machine language architecture in which you would have to do work -- imagine one in which data is typed at the machine language level, and you cannot act on an integer as a pointer without setting the right flags. In that case, you would have to set the flag. As noted, this is pathological -- I am not aware of any architecture that does anything like this.)

The C and C++ languages are designed to be implementable on extremely pathological machine language architectures. To that end, they leave a LOT of freedom in the spec for the underlying architecture to be pathological.

Practically, your largest problems with a reinterpret cast between a pointer and a long int are:
1> In a 64 bit system with 32 bit long ints, you just threw away data. And it is not possible to round-trip.

2> If you have class foo : bar {...};, and an object referred to through a foo*, and you reinterpret_cast that foo* to a long int, and then cast it _back_ to a bar* instead of a foo*, then you have engaged in undefined behavior if there is sufficient virtual class behavior going on.

A foo* and a bar* pointing to the same object need not have the same binary value. And reinterpret_cast does binary value conversion (basically). This is really easy to miss, and can bite you in the ass. (What is worse is that it will work with some classes, and not with others.) So be careful: if you convert an A* to an int, convert it back to an A* and not to some related pointer type.
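Point 2 is easiest to see with multiple inheritance. The sketch below uses hypothetical classes (Physics, Visual, Entity) with two non-empty bases: the two base subobjects occupy different bytes, so pointers to them cannot both equal the address of the whole object. static_cast knows how to adjust the address; a reinterpret_cast through an integer does not, which is why the round trip must end at the type it started from.

```cpp
#include <cassert>
#include <cstdint>

struct Physics { int mass; };
struct Visual  { int color; };
struct Entity : Physics, Visual { int id; };

// Correct way to get at a base: static_cast adjusts the address if needed.
inline Visual* as_visual(Entity* e) {
    return static_cast<Visual*>(e);
}

// The two base subobjects hold data, so they live at different addresses.
inline bool bases_share_address(Entity* e) {
    void* p = static_cast<Physics*>(e);
    void* v = static_cast<Visual*>(e);
    return p == v;
}

// Safe round trip: Entity* -> integer -> Entity*, same type at both ends.
inline Entity* round_trip(Entity* e) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(e);
    return reinterpret_cast<Entity*>(bits);
}

inline void demo_base_offsets() {
    Entity e;
    e.mass = 1; e.color = 2; e.id = 3;
    assert(!bases_share_address(&e));
    assert(round_trip(&e) == &e);
    assert(as_visual(&e)->color == 2);
}
```

Casting the integer back to whichever base sits at a non-zero offset would compile fine and point at the wrong bytes.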

The largest class of objects for which this will not be a problem (or almost certainly won't be a problem) is POD types, which are defined in the standard. (It stands for Plain Old Data: no virtual behavior, no constructors, and containing nothing with such behavior.) The C++0x standard extends these guarantees to a wider set of classes/structs (which, in practice, work that way today).
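If you have a C++11 compiler, you can ask it whether a type falls into this category. A minimal sketch, assuming the <type_traits> header is available; the helper is_pod_like and the three example structs are made up for illustration.

```cpp
#include <type_traits>

struct PlainData   { int x; float y; };                   // plain old data
struct WithCtor    { WithCtor() : x(0) {} int x; };       // has a constructor
struct WithVirtual { virtual ~WithVirtual() {} int x; };  // has virtual behavior

// True for types that are both standard-layout and trivial, which is
// how C++11 spells out the old POD requirements.
template <typename T>
constexpr bool is_pod_like() {
    return std::is_standard_layout<T>::value && std::is_trivial<T>::value;
}

static_assert(is_pod_like<PlainData>(), "PlainData should qualify");
static_assert(!is_pod_like<WithCtor>(), "user-provided constructor disqualifies");
static_assert(!is_pod_like<WithVirtual>(), "virtual functions disqualify");
```

A static_assert like this next to the cast would at least fail the build if the class later grows a virtual function.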
Would it be feasible for you to have a global vector of SceneObjects or of pointers to SceneObject and give Havok an index into this vector? It does require using some form of global data, but it's probably much cleaner and more portable than casting between pointers and integers. I doubt the performance impact would be significant. How often will this code be run?
Thanks for all your comments, especially NotAYakk and Fruny.

Quote:Original post by alvaro
Would it be feasible for you to have a global vector of SceneObjects or of pointers to SceneObject and give Havok an index into this vector? It does require using some form of global data, but it's probably much cleaner and more portable than casting between pointers and integers. I doubt the performance impact would be significant. How often will this code be run?


Very often, unfortunately. So keeping a list of objects, and letting the UserData integer be the index number... It sounds simple enough, I'm just afraid I'll run into trouble when I delete anything from the list and the indexes change.

I guess I'll have to rethink it all. I'm just confused about why the guys at Havok used this as an (apparently erroneous) example...

Edit: This is what the docs say:

Quote:
When overriding a pointer as an integer, you'll need to be careful with 32/64 bit issues. Overriding a type as a type of different size can cause the binary writer to fail. In the case of hkpWorldObject, the alignment requirements of adjacent variables ensures that the object layout is still computed correctly for 64 bit machines.


Does it mean I'm safe?

[Edited by - SymLinked on March 14, 2009 5:32:35 PM]
Each Resource has a 32 bit integer value.

Your Resource cache is a std::deque (as you don't know how big it will get). Each entry is a pointer to the Resource.

You also maintain a std::set of free resource indexes (by integer).

  class ResourceManager;
  class Resource;

  // global function:
  ResourceManager* GetResourceManager();

  class ResourceManager {
  public:
    virtual int Register( Resource* ) = 0;
    virtual void UnRegister( int ) = 0;
    virtual Resource* GetResource( int ) = 0;
  };

  class Resource {
    int ResourceId;
    Resource( const Resource& other ); // not allowed
    Resource& operator=( const Resource& other ); // not allowed
  public:
    int GetResourceId() const { return ResourceId; }
    Resource() {
      // register with the global manager and remember our id
      ResourceId = GetResourceManager()->Register(this);
    }
    virtual ~Resource() {
      // hand our id back so the slot can be reused
      GetResourceManager()->UnRegister(ResourceId);
    }
  };


Inherit from Resource, and magically you register yourself with the Resource Manager. And it automatically cleans itself up on destruction.
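The cache described above (a std::deque of pointers plus a std::set of free indexes) could be sketched roughly like this. SimpleResourceManager is a hypothetical name, the empty Resource class is only a stand-in so the sketch compiles on its own, and the class does not derive from the ResourceManager interface in the post, to keep it self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>

class Resource {};  // stand-in; the real class is the one in the post

class SimpleResourceManager {
public:
    // Reuse a free slot if there is one, otherwise grow the deque.
    int Register(Resource* r) {
        if (!free_ids_.empty()) {
            int id = *free_ids_.begin();
            free_ids_.erase(free_ids_.begin());
            slots_[id] = r;
            return id;
        }
        slots_.push_back(r);
        return static_cast<int>(slots_.size()) - 1;
    }

    // Clear the slot and remember it as free. Other ids are untouched.
    void UnRegister(int id) {
        if (!valid(id) || slots_[id] == nullptr) return;
        slots_[id] = nullptr;
        free_ids_.insert(id);
    }

    // Null for out-of-range or unregistered ids.
    Resource* GetResource(int id) const {
        return valid(id) ? slots_[id] : nullptr;
    }

private:
    bool valid(int id) const {
        return id >= 0 && static_cast<std::size_t>(id) < slots_.size();
    }
    std::deque<Resource*> slots_;
    std::set<int> free_ids_;
};

inline void demo_manager() {
    SimpleResourceManager rm;
    Resource a, b;
    int ia = rm.Register(&a);
    int ib = rm.Register(&b);
    rm.UnRegister(ia);
    assert(rm.GetResource(ib) == &b);  // b keeps its id after a is removed
}
```

Because slots are blanked rather than erased, deleting one entry never shifts the indexes of the others. The catch is that a freed id can be handed out again, so a stale id held by someone else may later refer to a different resource.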

With this model, copies can be problematic -- so I just blocked them.

This does require that you do the casting from-resource manually. If you have a limited set of classes, implementing a per-type interface isn't that tricky (and can be automated using template-fu).

An easy option would be to create a set of virtual functions in Resource that downcast this, as follows:

  class ResourceSubtypeA;
  class ResourceSubtypeB;
  ...
  class ResourceSubtypeZ; // use snappier names

  class Resource {
    ...
  public:
    // syntactic sugar:
    template<typename Subtype>
    Subtype* downcast() {
      Subtype* tmp = 0;
      downcast(&tmp);
      return tmp;
    }
    template<typename Subtype>
    Subtype const* downcast() const {
      Subtype const* tmp = 0; // must be pointer-to-const to pick the const overloads
      downcast(&tmp);
      return tmp;
    }
  private:
    // each subtype overrides its own pair to store 'this' and return true
    virtual bool downcast( ResourceSubtypeA** ) { return false; }
    virtual bool downcast( ResourceSubtypeA const** ) const { return false; }
    ...
    virtual bool downcast( ResourceSubtypeZ** ) { return false; }
    virtual bool downcast( ResourceSubtypeZ const** ) const { return false; }
  public:
    template<typename Subtype>
    static Subtype* FromId( int id ) {
      ResourceManager* rm = GetResourceManager();
      assert(rm);
      if (!rm) return NULL;
      Resource* resource = rm->GetResource( id );
      if (!resource) return NULL;
      return resource->downcast<Subtype>();
    }
  };

This gives you the nice syntax:

ImageResource* foo = Resource::FromId<ImageResource>( id );

and uses an auto-managed resource manager.

It does require you to manually deal with downcasting, and extend the Resource interface for each child type. You could instead rely on dynamic_cast to do that, but it would be slightly slower.

The FromId function is still worth writing, but it would look like:
  template<typename Subtype>
  static Subtype* FromId( int id ) {
    ResourceManager* rm = GetResourceManager();
    assert(rm);
    if (!rm) return NULL;
    Resource* resource = rm->GetResource( id );
    if (!resource) return NULL;
    return dynamic_cast<Subtype*>(resource);
  }


Of course, this is without using a shared pointer infrastructure. If you like shared pointers, you have to do some work to get a resource manager of this kind to work well.

The problem is that you might only have an int outstanding that refers to a given Resource, but you don't want it to go away. And at other times, you do want it to automatically go away.

Solving this particular problem is a serious headache when you are also dealing with APIs that don't want to let you change the type of data you bounce around. :/

Honestly, I'd recommend going with the dynamic_cast route. Less code. If performance becomes a problem, do the extra legwork.

Edit: This post originally started with "Steal a page from malloc." Then I decided that the 'encode the empty list in the deleted nodes' approach was overly complex for such a simple problem.


[Edited by - NotAYakk on March 14, 2009 9:07:46 PM]

This topic is closed to new replies.
