8 hours ago, Gnollrunner said:
I've tested it versus some simple cases where I could either use dynamic_cast or use a virtual function call to get the same functionality. I found that even in fairly simple cases dynamic_cast is in fact slower than the function call (which somewhat surprised me).
When you need dynamic_cast in a design, it is a code smell. It isn't necessarily wrong, but there's a high probability a better solution exists. In the general sense, every derived object should be interchangeable for the intended purpose. Also in the general sense, your code may never know about the actual derived class, because code completely outside your control can create new derived classes. Code should always assume an object is a completely unknown concrete type rather than one of the types that happen to exist in the code base today.
As an example, let's say you're working on a D3D graphics system. Its interfaces follow these principles rather well. Let's say you've created your device with D3D12CreateDevice(). You get back an ID3D12Device pointer. From that point on, anything the code can do with an ID3D12Device pointer it can do with your object. Any device is interchangeable with any other device. Your code should never need a dynamic cast to see if the underlying type is actually a GeForce 970, or an AMD R9 250, or some other specific card. If you attempted to individually handle all the cards on the market today, your code would break when the next generation of graphics cards is released. If you have a handle to a base class then that should be all you need; the derived objects should be interchangeable.
As for speed, virtual calls are very fast because of how the indirection is designed. Virtual calls were implemented following best practice for high-performance indirection, and CPU designers then improved the hardware to handle them nearly for free. There is one indirection from the object to its class's vtable, which in turn points to the function. Because virtual functions are called so frequently, on most modern chips the vtable entry will usually already be in the CPU's cache and the call target will be predicted. On modern out-of-order cores the indirection costs close to nothing once everything is warm in the cache, and only about the cost of a single cache lookup (perhaps about 7 ns) the first time it's encountered.
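To make that indirection concrete, here is a rough hand-rolled sketch of what a compiler typically generates for a virtual call. Every name in it (ShapeVTable, Shape, Square, call_area) is made up for illustration, and real compilers are free to lay things out differently; the point is only that the call is "load the table pointer from the object, load the function pointer from the table, call it".

```cpp
#include <cassert>

struct Shape;  // forward declaration for the function pointer signature

// Hypothetical hand-written "vtable": one function pointer per virtual function.
struct ShapeVTable {
    int (*area)(const Shape* self);
};

// Hypothetical base: the object carries a pointer to its class's table.
struct Shape {
    const ShapeVTable* vptr;
};

// A "derived" type, with the base as its first member so the two
// pointers are interconvertible (both are standard-layout types).
struct Square {
    Shape base;
    int side;
};

int square_area(const Shape* self) {
    const Square* sq = reinterpret_cast<const Square*>(self);
    return sq->side * sq->side;
}

// One table per class, shared by every Square object.
const ShapeVTable kSquareVTable = { &square_area };

// The "virtual call": object -> vtable -> function.
int call_area(const Shape* shape) {
    return shape->vptr->area(shape);
}

// Small usage helper: builds a Square and calls through the table.
int demo_area(int side) {
    Square sq{ { &kSquareVTable }, side };
    int result = call_area(&sq.base);
    assert(result == side * side);
    return result;
}
```

Calling demo_area(4) goes through exactly two dependent loads before the call, which is why it is so cheap once those loads hit the cache.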
On the other hand, looking up the type for conversion with dynamic_cast is a much more involved operation. Functions to look up the type must be called, and the type information must be loaded. The exact concrete type is tested first, followed by a test for the primary base type; both tests are relatively fast. If neither of those works, a series of operations and lookups takes place along the entire inheritance tree until the result is found or the search fails, and most of that information is unlikely to be warm in the CPU's cache.
Design the base classes so you don't need to know what concrete type they are. Issue commands or queries on the interface and use the virtual functions to drive behavior.
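As a minimal sketch of that advice, compare a helper that asks "what are you?" with dynamic_cast against one that just issues the query on the interface. The Entity, Player, Monster and Npc types and both helper functions are hypothetical names invented for this example.

```cpp
#include <string>

// Hypothetical interface: callers only ever see Entity.
struct Entity {
    virtual ~Entity() = default;
    virtual std::string describe() const = 0;
};

struct Player : Entity {
    std::string describe() const override { return "player"; }
};

struct Monster : Entity {
    std::string describe() const override { return "monster"; }
};

// The smell: the caller enumerates the concrete types it knows about today.
std::string describe_with_cast(const Entity& e) {
    if (dynamic_cast<const Player*>(&e))  return "player";
    if (dynamic_cast<const Monster*>(&e)) return "monster";
    return "unknown";  // any type added later silently lands here
}

// The alternative: ask the interface and let the virtual function decide.
std::string describe_virtual(const Entity& e) {
    return e.describe();
}

// A type added later, possibly by code outside your control.
struct Npc : Entity {
    std::string describe() const override { return "npc"; }
};
```

With an Npc, describe_with_cast falls through to "unknown" while describe_virtual returns "npc" without the caller changing at all. That is the interchangeability argument in miniature.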
Trying to move back on topic, Hodgman's got the standard solution. Include both the object and the function to call. The appropriate source can then call the function in the correct context.
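I don't want to put words in Hodgman's mouth, so treat this as my own rough sketch of the "object plus function" idea rather than his exact code. The Callback struct, the Counter type and the thunk function are all assumptions made up for illustration.

```cpp
#include <cassert>

// Hypothetical callback record: the object and the function to call on it.
struct Callback {
    void* object;                                // who to call
    void (*function)(void* object, int value);   // what to call

    void invoke(int value) const { function(object, value); }
};

// Hypothetical receiver type.
struct Counter {
    int total = 0;
    void add(int value) { total += value; }
};

// Thunk: restores the real type and forwards to the member function.
// Only the code that registered the callback needs to know the type.
void counter_add_thunk(void* object, int value) {
    static_cast<Counter*>(object)->add(value);
}

// Small usage helper: registers a Counter and fires the callback twice.
int demo_callback() {
    Counter counter;
    Callback cb{ &counter, &counter_add_thunk };
    cb.invoke(3);
    cb.invoke(4);
    assert(counter.total == 7);
    return counter.total;
}
```

The code that fires the callback never needs to know the concrete type, so there is nothing to dynamic_cast. The static_cast in the thunk is safe only because the same code that stored the pointer also chose the matching thunk.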