dynamicman

What do virtual functions really do in the background?

I've always wondered how inheritance is implemented in the background... For all I know, it looks something up in a table.. dereferences it (whatever that means) for your dynamic type... What really happens? Can someone please clarify? And also, when you do a dynamic cast, what happens in the background? For example...
      

class base
{
};

class child : public base
{
};

//code somewhere

base* obj = new base();
child* objChild = (child*)obj; //What happens here?


    
What happens at the last line, memory-wise? I know that if you cast from floats to ints (data members) a conversion is needed and it's a bit of overhead (not related to this case, since float is not a child of int or vice versa)... But what happens when a class inherits from another and tries to do some dynamic casting? The clearest, plainest English would be nice... Assume I don't know technical jargon (the truth is, I probably don't...)..

Edited by - dynamicman on October 16, 2001 11:53:47 AM

Edited by - dynamicman on October 16, 2001 11:54:56 AM

I don't believe the C++ spec says how virtual functions have to work, so in theory nothing is guaranteed.

In practice you're right - it looks it up in a table. If your class has virtual functions then the compiler makes a table (called a vtable) that holds pointers to the virtual functions. In your example, class base and class child would each get their own table (once they declare a virtual function). In one table the first entry points to (for example) base::Foo and in the other it points to child::Foo. The calling code doesn't care which - it just grabs the first entry and jumps to it.
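
To make the table idea concrete, here is a rough sketch that builds the same lookup by hand with plain function pointers. It is only an illustration: real compilers are free to lay things out differently, and every name in it (VTable, Object, baseTable, callFoo and so on) is made up for this example.

```cpp
#include <string>

// Hand-rolled imitation of a vtable. All names here are hypothetical;
// a real compiler generates something similar but keeps it hidden.
struct Object;

struct VTable {
    std::string (*foo)(const Object*);  // slot 0: the "virtual" Foo
};

struct Object {
    const VTable* vptr;  // the hidden pointer a compiler would add for you
};

std::string baseFoo(const Object*)  { return "base::Foo"; }
std::string childFoo(const Object*) { return "child::Foo"; }

// One table per class, shared by every object of that class.
const VTable baseTable  = { &baseFoo };
const VTable childTable = { &childFoo };

// The caller doesn't care which class it has: it grabs slot 0 and jumps.
std::string callFoo(const Object* obj) {
    return obj->vptr->foo(obj);
}
```

An object set up with `&baseTable` ends up in baseFoo and one set up with `&childTable` ends up in childFoo, even though callFoo is identical for both.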

ty, getting clearer..

What about the casting? How is the conversion handled?

I understand that objChild will have a vtable that points to its own methods (ex. foo)... but what happens when a cast is made (to obj)? How does it "switch" vtables?

In other words, how does this..

(child*)obj

get converted to this...

child* objChild

in relation to switching vtables and conversion?

Your question has nothing to do with virtual functions because you have no virtual functions.

Casting, aside from dynamic_cast, is a compile-time thing. An object is simply a chunk of memory. By casting, you are telling the compiler to treat that object as another type. This can be dangerous, for instance if the objects are not related:

class Base
{
public:
    int i;
};

class Child // note: NOT derived from Base
{
public:
    float v;
};

...

Base* b = new Base();

Child* c = (Child*)b;

// undefined behaviour: this reads the memory of Base::i as if it were a float
c->v;

// you can also imagine worse:

class A
{
public:
    int i;
};

class B
{
public:
    char array[100];
};

A* a = new A;

B* b = (B*)a;

char c = b->array[50]; // ouch!!! reads well past the end of the A object

By using the C++-style casts (static_cast, dynamic_cast, const_cast, reinterpret_cast) you can avoid many of those errors: static_cast refuses to convert between pointers to unrelated classes, and reinterpret_cast at least makes the danger visible.
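
As a small sketch of the point about C++-style casts: static_cast between pointers to unrelated classes doesn't compile, while the C-style cast goes through silently. The can_static_cast trait below is a made-up helper for this example (not a standard facility) that asks the compiler whether a given static_cast would be accepted, and Derived is an extra class added only for contrast.

```cpp
#include <type_traits>
#include <utility>

struct A { int i; };
struct B { char array[100]; };    // unrelated to A
struct Derived : A { float v; };  // related to A

// Hypothetical helper: value is true if static_cast<To>(From) compiles.
template <class To, class From>
struct can_static_cast {
private:
    template <class T, class F>
    static auto check(int)
        -> decltype(void(static_cast<T>(std::declval<F>())), std::true_type());
    template <class, class>
    static std::false_type check(...);
public:
    static constexpr bool value = decltype(check<To, From>(0))::value;
};

static_assert(can_static_cast<Derived*, A*>::value,
              "related classes: static_cast is accepted");
static_assert(!can_static_cast<B*, A*>::value,
              "unrelated classes: static_cast is rejected");

// The C-style cast compiles regardless of whether the types are related.
bool cStyleCastCompilesAnyway() {
    A a{};
    B* b = (B*)&a;        // accepted by the compiler, never safe to dereference
    return b != nullptr;
}
```

Note that static_cast still lets you downcast from A* to Derived* without any run-time check, so it only catches the "completely unrelated" kind of mistake.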


What if they are related? If the child inherits from base, what happens then? Is it just a compiler thing? Does it only notify the compiler that I want to access the "child" class's methods because of the cast? Or when it compiles, does it actually do some memory rearranging? I know from float to int (data types) it will do some memory rearranging... but my MAIN question is what happens if there is a base-child relationship?

I'm sorry if I'm not getting this.... Learning as I go...

Assume that I have the method foo() within both the base and the child...

Edited by - dynamicman on October 16, 2001 6:12:46 PM

OK, the first example is tweaked because you're downcasting a base class object into a derived class pointer, which is illegal, immoral and fattening. I'm sure you've heard that public inheritance = "Is-A"; it's quite a bit more complicated than that, but it's a good rule of thumb. In the example, child Is-A base, but base is NOT a child, so casting a base object to a child does pretty much undefined things.

If you have a virtual function in a class, that class gets a virtual function table, or vtable. There is only one vtable object per class; in other words, the vtable is static. Each class object has a hidden member variable that points to the static vtable for that class. This is why adding a single virtual function adds 4 bytes (one pointer on a 32-bit system) to the size of your class object, but no more than that.

When you call this function, call it base::foo, what actually happens is that the program "looks up" the vtable for that base object (dereferences the hidden pointer), and then calls the function that the vtable entry points to. Since you only have 1 virtual function, there is only 1 entry in this table.

Now let's say you derive a class from base called child, and you implement foo in the child class (child::foo). There is now a single static vtable for all child class objects in addition to the base vtable. Furthermore, the child's vtable entry points to child::foo, whereas the base class's entry points to base::foo. When you create a child, its vtable pointer points to the child vtable, not the base vtable. So executing a function call looks something like this:
  
base *pBase = new base;   // creates base object, vtable* points
                          // to base vtable (static)

base *pChild = new child; // creates child object, vtable* points to child
                          // vtable (also static). Note that this is a base *,
                          // not a child *. The compiler thinks these two
                          // pointers point to the same type, but the vtable
                          // is used so that virtual functions go to the
                          // right place

pBase->foo ();  // - loads vtable pointer from pBase object (base's vtable)
                // - jumps to foo function in that vtable (base::foo)

pChild->foo (); // - loads vtable pointer from pChild object (child's vtable)
                // - jumps to foo function in that vtable (child::foo)
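
Filling in the class definitions that the snippet above assumes gives something you can compile and poke at. This is only a sketch: having foo return a string, the virtual destructor, and the callFoo helper are choices made for this example so the result is easy to inspect.

```cpp
#include <memory>
#include <string>

class base {
public:
    virtual ~base() {}
    virtual std::string foo() const { return "base::foo"; }
};

class child : public base {
public:
    std::string foo() const override { return "child::foo"; }
};

// Takes a base pointer and has no idea which concrete class is behind it.
std::string callFoo(const base* p) {
    return p->foo();  // resolved at run time via the object's vtable pointer
}
```

Calling callFoo with a pointer to a base object yields "base::foo", and calling it with a base pointer that really refers to a child yields "child::foo".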


That's virtual functions in a nutshell. The costs of virtual functions are:
- 1 extra pointer dereference for virtual function calls (speed overhead)
- 4 bytes (one pointer on a 32-bit system) for the vtable pointer per object (per-object memory cost)
- 4*N bytes for the vtable itself per _class_, where N is the number of virtual functions (per-class memory cost).
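
A quick way to see the per-object cost on your own compiler is to compare sizeof for classes with and without virtual functions. The class and function names below are made up for the sketch, and the exact numbers are up to your compiler and platform (the 4 bytes quoted above assume 32-bit pointers; 64-bit builds typically use 8, plus padding).

```cpp
#include <cstddef>

struct Plain        { int x; };
struct OneVirtual   { int x; virtual void a() {} };
struct ThreeVirtual { int x; virtual void a() {} virtual void b() {} virtual void c() {} };

// Extra bytes per object that come from having any virtual functions at all.
// On mainstream compilers this is the hidden vtable pointer plus any padding.
std::size_t perObjectOverhead() {
    return sizeof(OneVirtual) - sizeof(Plain);
}

// More virtual functions grow the per-class table, not the objects themselves.
bool objectSizeIndependentOfVirtualCount() {
    return sizeof(OneVirtual) == sizeof(ThreeVirtual);
}
```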

The benefit of using virtual functions is that you should never have to switch on a type variable, and you can leave a single framework in place and extend it to handle new kinds of objects without recompiling the framework.

I hope I haven't mis-spoken at all. I'm about 90% sure this is how it all works, though I'm a little unsure about how multiple virtual functions resolve through a single pointer. Maybe somebody else can chime in.

quote:

...which is illegal, immoral and fattening...


but unfortunately it's sometimes needed. C++ does have RTTI (run-time type information) in the standard, but it got implemented on each compiler before it was standardized, so it's not always exactly the same.

COM's QueryInterface deals with this issue by forcing each object to match a cookie to each of its interfaces (more or less its base classes). This works because QI is implemented on the most-derived class, and it always knows who all its parents are.
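
To give a feel for the idea, here is a stripped-down imitation in plain C++. It is NOT the real COM API (no IUnknown, no GUIDs, no reference counting); the InterfaceId enum, the two interfaces, the Filter class and queryInterface are all hypothetical names invented for this sketch.

```cpp
#include <string>

// NOT real COM: a minimal imitation of the QueryInterface idea.
// The enum values play the role of the "cookie" for each interface.
enum InterfaceId { ID_Nameable, ID_Runnable, ID_Unknown };

struct INameable { virtual std::string name() const = 0; virtual ~INameable() {} };
struct IRunnable { virtual int run() = 0; virtual ~IRunnable() {} };

class Filter : public INameable, public IRunnable {
public:
    std::string name() const override { return "filter"; }
    int run() override { return 42; }

    // Implemented on the most-derived class, which knows all of its parents,
    // so it can hand back a correctly adjusted pointer for each cookie.
    void* queryInterface(InterfaceId id) {
        switch (id) {
            case ID_Nameable: return static_cast<INameable*>(this);
            case ID_Runnable: return static_cast<IRunnable*>(this);
            default:          return nullptr;  // interface not supported
        }
    }
};
```

The caller has to cast the returned void* back to exactly the interface type it asked for, which is the same contract real QueryInterface callers follow.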

C++ allows for branches in the inheritance tree, which enables MI (multiple inheritance). The few times I've looked at the disassembly it was just a bigger vtbl, with each branch concatenated on the end. That's why you need additional information at run time to do a "horse-shoe" cast. The vtbl is linear, so there's no way to know how to go back up and come back down another path. If the same interface (class with virtuals) is inherited twice, it's in the vtbl twice. Often those two sets of vtbl pointers end up pointing to the same methods - but they don't have to. This is the problem RTTI solves and QueryInterface tactfully avoids. RTTI is likely a hell of a lot faster than a QI call, but it's "esoteric" and may be compiler sensitive - an RTTI lib from MSVC5 doesn't work so well on BCB1. Perhaps in the current compiler versions there are more options for compatibility.
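
For what it's worth, the "horse-shoe" cast can be sketched with dynamic_cast, assuming RTTI is enabled in your compiler. Left, Right, Both and crossCast are names invented for the example.

```cpp
struct Left  { virtual ~Left() {}  int l = 1; };
struct Right { virtual ~Right() {} int r = 2; };
struct Both : Left, Right {};

// Go from one branch of the tree across to the other.
// static_cast<Right*>(left) would not compile because Left and Right are
// unrelated; dynamic_cast can do it by consulting run-time type information.
Right* crossCast(Left* left) {
    return dynamic_cast<Right*>(left);
}
```

If the object behind the Left pointer is a Both, crossCast returns its Right part; if it is only a plain Left, there is no Right part to find and the result is a null pointer.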

I don't remember if virtually inherited interfaces are in the vtbl twice and guaranteed to point at the same methods, or if they're only in the vtbl once (that's one of those TBD-by-the-compiler things).

For these reasons people _hate_ MI, and that's why everything else (except Smalltalk and Python) is SI only.

If someone else has the time, make a test case and do an asm listing and let's see exactly what happens.

Edited by - Magmai Kai Holmlor on October 17, 2001 1:13:32 AM

quote:
Original post by Magmai Kai Holmlor
but unfortunately it's sometimes needed.


Your points are valid and correct as far as I can tell, but I was commenting on the fact that his object actually IS a base, and he's trying to cast it to a derived. If he'd used dynamic_cast and RTTI, it would have returned a NULL pointer.

In other words, in his original code, he's trying to use polymorphism backwards from base to derived instead of derived to base.

(Personally, I avoid RTTI like the plague, but others on my team seem to think it's the hottest thing since sliced bread; I thought conditional if chains were something that inheritance was created to AVOID!) =)

Some information about casting:

1) All casting is done on pointers. It never modifies the objects these pointers point to.

2) In the case of upcasting (child to base) with single inheritance, the compiler doesn't generate any code; it only checks (at compile time, not at runtime) that child really inherits from base. Upcasting is always safe unless you use an explicit C-style cast, which is completely wrong here because it tells the compiler not to check the types.

Child* pChild = new Child;
Base* pBase = pChild; // always safe

Base* pBase2 = (Base*)pChild; // absolutely unsafe and wrong

3) In the case of downcasting (base to child) without dynamic_cast, the compiler doesn't generate any code, but it doesn't perform any type checking either.

4) In the case of downcasting with dynamic_cast, the compiler does insert some code that makes sure (at runtime) that the downcast is safe (and if it is not, dynamic_cast returns null).
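
Point 4 can be sketched like this. Base needs at least one virtual function (here the destructor) for dynamic_cast to work on it; the asChild helper and the extra member are just names picked for the example.

```cpp
struct Base  { virtual ~Base() {} };   // dynamic_cast needs a polymorphic type
struct Child : Base { int extra = 7; };

// Checked downcast: yields a null pointer when the object is not really a Child.
Child* asChild(Base* p) {
    return dynamic_cast<Child*>(p);
}
```

Passing a Base pointer that refers to a real Child gives back that Child; passing a pointer to a plain Base gives back null, so the caller can test the result before using it.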

quote:

...I was commenting on the fact that his object actually IS a base, and he's trying to cast it to a derived.


Oh, NEVER do that - it's the devil!

I have to agree, I avoid RTTI like the plague as well. Since I'm into the COM thing I rely on QI for that functionality. I resisted this philosophy for a while, but changed my mind when I ran into IMoniker. I wanted to create a derivation of an MFC CComboBox to work with some DirectShow filters, so the user could have a look and pick one. I ended up making the ComboBox list IMonikers, which means it can be used with a number of other COM-based components besides DirectShow filters. But in order to get useful information out of the ComboBox, you would need to perform an RTTI-type dynamic cast on the IMoniker pointer if it weren't for QI. Since all the filters expose an IMoniker interface you can use them with anything that can manipulate monikers. Once the user picks a moniker from the list, you can query it for an interface that actually does some work (IBaseFilter in my case). This means anything that exposes IMoniker works with my ComboBox. Since the application must know what the original interface is, it knows which interface to ask for once an IMoniker comes its way. Once it settled on me how useful that was, I started liking run-time casting (be it RTTI or QI) a lot more.

quote:

I thought conditional if chains were something that inheritance was created to AVOID.


Exactly how I felt - what's the point in fracturing everything into interfaces and classes and then dumping into a switch to decide how to cast and what to do? I thought that you ought to have a virtual method that everyone shares to do that work, to make good use of polymorphism. What really ought to be done is that everyone who wants to use a certain service ought to implement the interface required to use it (e.g. implement IMoniker to use services that require names, like a MonikerComboBox). So within a service you should use polymorphism via the virtuals. To hook up to and detach from services, you should use RTTI/QI to navigate the available interfaces.

- Magmai Kai Holmlor
Hark, I have found the Holy Grail!


...
  
class base {};
class child : public base {};
//code somewhere

base* obj = new base();
child* objChild = static_cast<child*>(obj); //What happens here?


The value of obj is copied to objChild; their values will be the same. The pointer conversion will even work since it's an SI (single inheritance) branch, though as noted earlier in the thread the object is still really a base, so using it as a child is asking for trouble. The compiler will complain about the conversion if you leave the cast out, but you can force it using static_cast (reinterpret_cast will also work, but static_cast is safer). Normally you use dynamic_cast to move up the inheritance tree; dynamic_cast will also move down and around the tree (like your code example above) if you enable RTTI. Using RTTI is safer and consequently the preferred method. However, it is not _required_ for the above example (i.e. obj==objChild).


If child inherited from more than one class (e.g. base1 and base2), _then_ it would be impossible to do that cast without some sort of RTTI. You could go from child to base1 or from child to base2 without RTTI, and it would offset the child pointer to where the base1 and base2 parts begin (i.e. base1!=base2).
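
The pointer offsetting mentioned in that last sentence can be observed directly. A minimal sketch, assuming two non-empty bases; the member names and the gapBetweenBases helper are invented for the example, and the actual size of the gap is up to the compiler.

```cpp
#include <cstddef>

struct base1 { int a = 1; };
struct base2 { int b = 2; };
struct child : base1, base2 { int c = 3; };

// Byte distance between where the base1 and base2 parts of a child live.
// Both upcasts are plain compile-time pointer adjustments; no RTTI involved.
std::ptrdiff_t gapBetweenBases(child* p) {
    base1* p1 = p;  // implicit upcast
    base2* p2 = p;  // implicit upcast, typically shifted past the base1 part
    return reinterpret_cast<char*>(p2) - reinterpret_cast<char*>(p1);
}
```

The gap is non-zero because the two base parts cannot occupy the same address, which is exactly why the two upcast pointers compare unequal as raw addresses.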

Edited by - Magmai Kai Holmlor on October 17, 2001 6:24:29 PM

quote:
Original post by dynamicman
what about this...

base* mybase = new child();
child* mychild = (child*)mybase;

does this generate any code? I'm guessing it just follows the same principles so no....




Right, this doesn't generate any extra code.

However, before using it, I suggest making sure that the problem cannot be solved using virtual functions. One of the main principles of object oriented programming says that child classes must be usable through their base class pointer without knowing their exact type.

A very detailed explanation of inheritance, virtual functions and casting (as well as many other features of C++) can be found here:

http://www.camtp.uni-mb.si/books/Thinking-in-C++/TIC2Vone-distribution/html/Frames.html

Look in chapters 14 and 15.


Edited by - Advanced Bug on October 18, 2001 3:24:41 AM

Edited by - Advanced Bug on October 18, 2001 3:25:31 AM

Hehe, thanks bug! Finally got the answer I have been searching for!

One more thing. Are there any downsides to adding more functionality to child classes and casting to them for their "specific" functionality? This kinda breaks some OO rules but I'm wondering if it affects speed or efficiency. Of course it is more prone to errors, but is there anything beyond that which I should worry about?

quote:
Original post by dynamicman

Are there any downsides to adding more functionality to child classes and casting to them for their "specific" functionality? This kinda breaks some OO rules but I'm wondering if it affects speed or efficiency.


No, it doesn't affect speed if you use (child*)mybase type of casts. At least with the GCC compiler it generates exactly the same code as a simple pointer assignment.
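
If you want to check this on your own compiler, a small sketch along these lines shows that with single inheritance the downcast leaves the address untouched. The doubled member and the downcast and sameAddress helpers are hypothetical names for the example, and the cast is only valid because the object really is a child.

```cpp
struct base  { int x = 1; };
struct child : base { int doubled() const { return 2 * x; } };

// C-style downcast as in the thread. Only valid because the caller promises
// that p really points at the base part of a child.
child* downcast(base* p) {
    return (child*)p;
}

// True if the cast left the address unchanged, i.e. it was a plain copy.
bool sameAddress(const void* a, const void* b) {
    return a == b;
}
```

Comparing the base pointer with the result of downcast using sameAddress should report that they are the same address, in line with the "simple pointer assignment" observation.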
