Ryan_001

Fast dynamic_cast

I'm working on a project that requires dynamic_cast. I understand it's quite slow and should generally be avoided; that said, I see no other way. To that end, I thought I could perhaps speed up dynamic_cast by caching the results. So I put together a small class that seems to work for the limited test scenarios I threw at it. Nonetheless, it's on the borderline of kosher, and I was wondering what other people think.

 

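#include <cstddef>
#include <limits>
#include <map>
#include <tuple>
#include <type_traits>
#include <typeindex>
#include <typeinfo>

class FastDynamicCast {

	private:
		typedef std::type_index Type;		// type_info addresses are not guaranteed unique per type; type_index compares properly
		typedef std::tuple<Type,Type,Type> InterfaceId;		// actual object type, src interface, dest interface
		typedef std::map<InterfaceId,std::ptrdiff_t> Table;		// cached byte offset from src pointer to dest pointer
		Table table;

	public:
		// Note: assumes one offset per InterfaceId, which does not hold if T is a base that occurs more than once in the object.
		template <typename TR, typename T> TR Cast (T* ptr) {
			static_assert(std::is_pointer<TR>::value,"TR must be a pointer type.");
			static const std::ptrdiff_t flag = std::numeric_limits<std::ptrdiff_t>::max();		// marks a cached failed cast

			if (!ptr) return nullptr;		// typeid(*ptr) throws std::bad_typeid on a null pointer

			InterfaceId id(typeid(*ptr),typeid(T*),typeid(TR));
			auto i = table.find(id);
			if (i == table.end()) {
				// first time this combination is seen: do the real cast and remember the result
				TR r = dynamic_cast<TR>(ptr);
				if (r) {
					unsigned char* t0 = reinterpret_cast<unsigned char*>(ptr);
					unsigned char* t1 = reinterpret_cast<unsigned char*>(r);
					std::ptrdiff_t diff = t1 - t0;
					table.insert(Table::value_type(id,diff));
					}
				else {
					table.insert(Table::value_type(id,flag));
					}
				return r;
				}
			else if (i->second == flag) return nullptr;
			else {
				// cached hit: apply the stored byte offset
				unsigned char* t = reinterpret_cast<unsigned char*>(ptr);
				t += i->second;
				return reinterpret_cast<TR>(t);
				}
			}
	};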

 

Despite the fact that the underlying implementation isn't specified in the standard, for a given object type, src interface, and dest interface the pointer transformation needs to be the same... right? I mean, you can't have objects magically changing their structure around at runtime. Anyway, I just thought I'd throw it out here to see what people think and to get some critique.
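One case that seems worth testing against that assumption is a base class that occurs twice in the same object (multiple inheritance without virtual bases). Both base subobjects produce the same (object type, src, dest) key, yet dynamic_cast has to apply a different pointer adjustment for each. A minimal sketch, where A, B, C, D and the two helper functions are made-up names for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hierarchy: D contains two distinct A subobjects (no virtual inheritance).
struct A { virtual ~A() {} };
struct B : A { int b = 0; };
struct C : A { int c = 0; };
struct D : B, C {};

// Byte offset that dynamic_cast<B*> applies to a given A*.
std::ptrdiff_t OffsetToB (A* a) {
	B* b = dynamic_cast<B*>(a);
	assert(b != nullptr);
	return reinterpret_cast<unsigned char*>(b) - reinterpret_cast<unsigned char*>(a);
}

// True if the two A subobjects of one D need different offsets to reach B,
// even though both would share the key (D, A*, B*) in a cache.
bool OffsetsDiffer () {
	D d;
	A* viaB = static_cast<B*>(&d);	// the A inside B
	A* viaC = static_cast<C*>(&d);	// the A inside C
	return OffsetToB(viaB) != OffsetToB(viaC);
}
```

If OffsetsDiffer() returns true on your compiler, a cache keyed only on the three types would hand back a wrong pointer for one of the two A subobjects. Hierarchies without repeated bases are not affected by this particular case.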

When people say that dynamic_cast is slow, they mean that it is slower than doing no work at all, which is the case for a static_cast.
By that comparison it's infinitely slower, but that doesn't mean the dynamic_cast will matter in your project.

If you do any reasonable amount of work with the pointer afterwards (it doesn't even have to be that much), the cost of the dynamic_cast will quickly disappear.
It's not that slow, it's just much slower than doing no work :)

That said, dynamic casts are often avoidable, and the need to use them usually points to something fishy in your design.

Edited by Olof Hedman

Have you profiled this? I highly doubt that this would outperform the compiler's implementation. In particular, what makes you think that typeid() will be significantly faster than dynamic_cast?
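For anyone who wants to measure rather than guess, here is a rough timing sketch using std::chrono. The types Base, Derived, Other and both function names are invented for the example, and a serious benchmark would need warm-up runs, repetitions, and an optimized build; treat this only as a starting point.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

struct Base { virtual ~Base() {} };
struct Derived : Base { int value = 1; };
struct Other : Base {};

// Builds a mixed list of objects; every third one is an Other.
std::vector<std::unique_ptr<Base>> MakeObjects (std::size_t n) {
	std::vector<std::unique_ptr<Base>> v;
	for (std::size_t i = 0; i < n; ++i) {
		if (i % 3 == 0) v.emplace_back(new Other());
		else v.emplace_back(new Derived());
	}
	return v;
}

// Sums Derived::value via dynamic_cast and reports the elapsed nanoseconds through outNs.
long long SumWithDynamicCast (const std::vector<std::unique_ptr<Base>>& v, long long& outNs) {
	auto start = std::chrono::steady_clock::now();
	long long sum = 0;
	for (const auto& p : v) {
		if (Derived* d = dynamic_cast<Derived*>(p.get())) sum += d->value;
	}
	auto stop = std::chrono::steady_clock::now();
	outNs = std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
	return sum;
}
```

Timing the same loop with the cached cast (or any other replacement) and comparing the two numbers on the real target platform is the only way to know which one wins.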

Not just typeid(), but the cost in terms of cache performance of using a map to hold the lookups is going to bite you pretty quick.

dynamic_cast<> on "slow" compilers is really just a strcmp. If you're really hurting for performance, rolling your own RTTI is the only way to really win.


But 99% of the time, you're not hurting that bad :-)

I always wondered about something related. Assume that at some point you need to know what class an object actually is. Then, as far as I understand, I could try dynamic_cast<> to all possible classes. That's O(n), where n is the number of possible classes. So having a map or hash table to just look it up should theoretically be faster for large enough n. Is this correct, or am I missing a possible way to use dynamic_cast<> efficiently for this?

I always wondered about something related. Assume that at some point you need to know what class an object actually is. Then, as far as I understand, I could try dynamic_cast<> to all possible classes. That's O(n), where n is the number of possible classes. So having a map or hash table to just look it up should theoretically be faster for large enough n. Is this correct, or am I missing a possible way to use dynamic_cast<> efficiently for this?

Or have the object tell you its type directly.
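A minimal sketch of what that could look like, assuming a small closed set of classes. EntityType, Entity, Player, Enemy and AsEnemy are invented names for illustration:

```cpp
#include <cassert>

// Hypothetical type tag: one enumerator per concrete class.
enum class EntityType { Player, Enemy };

struct Entity {
	virtual ~Entity() {}
	virtual EntityType GetType () const = 0;	// the object reports its own type
};

struct Player : Entity {
	EntityType GetType () const override { return EntityType::Player; }
};

struct Enemy : Entity {
	int health = 100;
	EntityType GetType () const override { return EntityType::Enemy; }
};

// Once the tag has been checked, a static_cast is enough; no dynamic_cast involved.
const Enemy& AsEnemy (const Entity& e) {
	assert(e.GetType() == EntityType::Enemy);
	return static_cast<const Enemy&>(e);
}
```

Finding out the class is then a single virtual call (or a plain member read if the tag is stored in the base), regardless of how many classes exist.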

I always wondered about something related. Assume that at some point you need to know what class an object actually is. Then, as far as I understand, I could try dynamic_cast<> to all possible classes. That's O(n), where n is the number of possible classes. So having a map or hash table to just look it up should theoretically be faster for large enough n. Is this correct, or am I missing a possible way to use dynamic_cast<> efficiently for this?

If you are really hurting from this, roll your own; type checking then boils down to a pointer address compare, as described in Game Programming Gems 2. This relies on statically declared pointers in your RTTI-capable classes, which you have to register yourself. This is what all the DECLARE_CLASS(CEnemy, CMoveableObject) macros that you see in the Half-Life codebase are for (https://developer.valvesoftware.com/wiki/Authoring_a_Logical_Entity).
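A rough sketch of the general idea, assuming single inheritance only. The TypeInfo struct, the RTTI_ROOT and RTTI_CLASS macros, FastCast, and the CBaseObject and CProp classes are all made up for this example; they are not the Half-Life macros or the Game Programming Gems code, just something in the same spirit.

```cpp
#include <cassert>

// Hypothetical hand-rolled RTTI: one static descriptor per class, linked to its parent.
struct TypeInfo {
	const char* name;
	const TypeInfo* parent;	// nullptr for the root class
};

// Walks the parent chain comparing addresses only; no string compares.
inline bool IsKindOf (const TypeInfo* type, const TypeInfo* target) {
	for (; type; type = type->parent)
		if (type == target) return true;
	return false;
}

// Registration macros (illustrative only).
#define RTTI_ROOT(Class) \
	static const TypeInfo* StaticType () { static const TypeInfo t = { #Class, nullptr }; return &t; } \
	virtual const TypeInfo* GetTypeInfo () const { return StaticType(); }

#define RTTI_CLASS(Class, Parent) \
	static const TypeInfo* StaticType () { static const TypeInfo t = { #Class, Parent::StaticType() }; return &t; } \
	const TypeInfo* GetTypeInfo () const override { return StaticType(); }

struct CBaseObject { RTTI_ROOT(CBaseObject) virtual ~CBaseObject() {} };
struct CMoveableObject : CBaseObject { RTTI_CLASS(CMoveableObject, CBaseObject) };
struct CEnemy : CMoveableObject { RTTI_CLASS(CEnemy, CMoveableObject) };
struct CProp : CBaseObject { RTTI_CLASS(CProp, CBaseObject) };

// Hierarchy-aware cast; valid here because the hierarchy is single inheritance.
template <typename T> T* FastCast (CBaseObject* obj) {
	if (obj && IsKindOf(obj->GetTypeInfo(), T::StaticType())) return static_cast<T*>(obj);
	return nullptr;
}
```

The cost of a cast is one virtual call plus a walk up the parent chain, which is usually only a few pointer compares deep.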

dynamic_cast<> on "slow" compilers is really just a strcmp.

Well, it's more like a bunch of strcmps in a loop. The worst case that I know of is a linear search for the target type in a list of all the legal class names of the dynamic type.
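To make that cost concrete, here is a toy model of such a lookup. It is not taken from any real compiler's runtime; ToyTypeRecord and ToyIsCastLegal are invented purely to show the shape of the work being described.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy model only: each dynamic type carries a list of the class names it may be cast to.
struct ToyTypeRecord {
	const char* const* names;
	std::size_t count;
};

// Linear search with one strcmp per candidate; the cost grows with the depth and
// width of the hierarchy and with the length of the class names.
bool ToyIsCastLegal (const ToyTypeRecord& dynamicType, const char* targetName) {
	for (std::size_t i = 0; i < dynamicType.count; ++i) {
		if (std::strcmp(dynamicType.names[i], targetName) == 0) return true;
	}
	return false;
}
```

A failed cast is the worst case in this model, since every name in the list gets compared before the search gives up.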

I'm working on a project that requires dynamic_cast. I understand it's quite slow and should generally be avoided; that said, I see no other way.

There is almost certainly another way. In my experience, 'unavoidable' use of dynamic_cast is due to design flaws elsewhere in the system.

 

In your usage, do you only need exact type matches, or do you need to take inheritance hierarchies into account?

e.g.

Given the inheritance hierarchy:

C is a B
B is an A

And the code:

C derived;
A* basePtr = &derived;
B* cast = dynamic_cast<B*>( basePtr );
Using regular dynamic_cast (or a custom implementation that also takes inheritance hierarchies into account), this cast will succeed, because the object is an A, a B and a C.

 

With a custom implementation based on exact matches only, this cast would fail, because the object only identifies as a C. The advantage of these kinds of RTTI systems is that they are a lot faster.
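As a small sketch of the difference, here is an exact-match cast built on typeid next to the regular dynamic_cast. ExactCast is a made-up helper name, and it stands in for whatever custom ID comparison a home-made system would use:

```cpp
#include <cassert>
#include <typeinfo>

struct A { virtual ~A() {} };
struct B : A {};
struct C : B {};

// Exact-match cast: succeeds only if the object's dynamic type is exactly T.
template <typename T, typename S> T* ExactCast (S* ptr) {
	if (ptr && typeid(*ptr) == typeid(T)) return static_cast<T*>(ptr);
	return nullptr;
}
```

With `C derived; A* basePtr = &derived;`, `dynamic_cast<B*>(basePtr)` succeeds, `ExactCast<B>(basePtr)` returns nullptr, and `ExactCast<C>(basePtr)` succeeds.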

 

Well, it's more like a bunch of strcmps in a loop. The worst case that I know of is a linear search for the target type in a list of all the legal class names of the dynamic type.

And this is why I ban dynamic_cast in my coding guidelines...

Imagine the language didn't have RTTI at all -- instead there was a free, cross-platform, but closed-source library that gave you easy RTTI. Despite it being closed source, people had stepped through the Asm to see how it worked, and found that on Windows it often resulted in a long loop of string comparisons.
How many users would that library have, compared to a simpler library, or compared to people just using their own extremely simple home-made RTTI systems based on enums, etc?
The only reason this horrible library does have any users at all, is because it's shipped with and integrated into the language. I believe in Keep-It-Simple-Stupid as a strong guideline, but in this case, keeping it simple for me would be to reinvent the wheel, to replace an over-complicated and bloated system with a simple one.

Edited by Hodgman

I think it is worth pointing out an additional item regarding some of the suggestions that using dynamic_cast is not so bad because it won't be a notable performance problem later. This is unfortunately a very bad way to look at things, especially with this particular bit of functionality. You need to approach this from the usage side of things to see why accepting dynamic_cast performance is a bad idea. How do you usually use dynamic_cast? Well, usually you have a list of objects and you are processing them in order, using dynamic_cast to perform various functionality on each object. So, although your code might only use dynamic_cast in a few places, the number of calls to it becomes huge and the performance issues add up. This is so common a pattern in games that I simply don't bother with standard RTTI anymore, at least not in any core processing systems.

 

This is a case of massive gains in the long run if you correct the problem early on. Mike's rule: "Early optimization is the root of all evil." Well, ignoring the obvious is beyond evil in this case, especially if you consider that in games the most likely usage is in entities, which will be called thousands of times per game loop.

 

.02

But why do you "need" dynamic_cast<> or its homemade equivalent, as opposed to virtual functions to avoid leaking unneeded type information, segregation of objects of different classes to avoid identifying each one, members and methods providing the information you actually need, and other leaner and cleaner approaches?
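Two of those alternatives, sketched with invented names (Entity, Coin, Spike, UpdateAll, World): virtual dispatch so the loop never needs to know the concrete type, and segregated containers so the type is known statically.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Entity {
	virtual ~Entity() {}
	virtual int Update () = 0;	// returns a score delta, purely for illustration
};
struct Coin : Entity { int Update () override { return 10; } };
struct Spike : Entity { int Update () override { return -5; } };

// Virtual dispatch: the loop never asks what the concrete type is.
int UpdateAll (const std::vector<std::unique_ptr<Entity>>& entities) {
	int total = 0;
	for (const auto& e : entities) total += e->Update();
	return total;
}

// Segregation: objects of one class live in their own container,
// so there is nothing to identify at runtime.
struct World {
	std::vector<Coin> coins;
	std::vector<Spike> spikes;
	int Update () {
		int total = 0;
		for (auto& c : coins) total += c.Update();
		for (auto& s : spikes) total += s.Update();
		return total;
	}
};
```

Neither version contains a cast of any kind, which is the point of the question.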

I always wondered about something related. Assume that at some point you need to know what class an object actually is. Then, as far as I understand, I could try dynamic_cast<> to all possible classes. That's O(n), where n is the number of possible classes. So having a map or hash table to just look it up should theoretically be faster for large enough n. Is this correct, or am I missing a possible way to use dynamic_cast<> efficiently for this?

If you are really hurting from this, roll your own; type checking then boils down to a pointer address compare, as described in Game Programming Gems 2. This relies on statically declared pointers in your RTTI-capable classes, which you have to register yourself. This is what all the DECLARE_CLASS(CEnemy, CMoveableObject) macros that you see in the Half-Life codebase are for (https://developer.valvesoftware.com/wiki/Authoring_a_Logical_Entity).

If std::type_info::hash_code wasn't made deliberately stupid and useless, you wouldn't even need that. You could just use the language's built-in feature. It would even work perfectly for serialization.

 

Shame that usability was apparently not intended with N2530. A hash code that explicitly gives no guarantee about being the same between different invocations of the same program is entirely useless for the majority of things.
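What the standard facility does support is in-process lookup: std::type_index can be used as a key in an associative container during a single run. Anything written to disk still needs an ID you assign yourself. A sketch under those assumptions, where TypeRegistry, Shape, Circle and Square are invented names:

```cpp
#include <cassert>
#include <cstdint>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// Hypothetical registry: std::type_index is the key within one run of the program,
// while the ids that would be serialized are assigned by hand so they stay stable.
class TypeRegistry {
	std::unordered_map<std::type_index, std::uint32_t> ids;
public:
	template <typename T> void Register (std::uint32_t stableId) {
		ids.emplace(std::type_index(typeid(T)), stableId);
	}
	// Returns the stable id of the object's dynamic type, or 0 if it was never registered.
	std::uint32_t IdOf (const Shape& s) const {
		auto it = ids.find(std::type_index(typeid(s)));
		return it == ids.end() ? 0 : it->second;
	}
};
```

The hash values used inside the unordered_map may differ from run to run, which is harmless here because only the hand-assigned ids ever leave the process.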

To be fair to the poorly performing dynamic_cast implementations, they are generally designed to work in corner cases that game developers rarely run into, such as a dynamic_cast on an object created in one DLL performed by code in another DLL that may have accidentally declared a type of the same name. In general, C++'s dynamic_cast is meant for systems that look very different from games; systems that are long-lived and require binary compatibility between revisions are places where dynamic_cast becomes a more reasonable option. Fortunately for everyone involved, these kinds of systems are increasingly being implemented in managed languages.
