
Why can't a derived class resolve members of its base class at compile time?

General and Gameplay Programming

Started by Rickert February 17, 2012 08:26 AM

8 comments, last by SiCrane 12 years, 2 months ago

Rickert

109

Author

February 17, 2012 08:26 AM

Consider this code:

class X { public: int i; };
class A : public virtual X   { public: int j; };
class B : public virtual X   { public: double d; };
class C : public A, public B { public: int k; };

// cannot resolve location of pa->X::i at compile-time
void foo( const A* pa ) { pa->i = 1024; }

int main() {
   foo( new A );
   foo( new C );
   // ...
}

The book "Inside the C++ Object Model" says that the compiler cannot fix the physical offset of X::i accessed through pa within foo(), since the actual type of pa can vary with each of foo()'s invocations.
So, the compiler has to create something like this:
// possible compiler transformation
void foo( const A* pa ) { pa->__vbcX->i = 1024; }

If the object holds a pointer to the virtual base class, why can't the compiler resolve the memory address of that member at compile time? As far as I know, when a derived class object is created, its memory layout consists of:

all members in the base class
a virtual pointer (of a virtual destructor)
a pointer to the virtual base class of the derived object
all of the members of the derived class object.

For example, suppose I have an object C c_object and A a_object

This is what I think about object c_object layout (suppose c_object starts at address 1000):

1000: int i; //(subobject X)
1004: int j; //(subobject A)
1008: double d; //(subobject B)
1012: __vbcX; // which is at address 1000
1016: __vbcA; // which is at address 1004
1020: __vbcB; // which is at address 1008
1024: int k;

This is what I thought from what I read anyway, please verify and correct it for me.

So, finding the base class member should simply be finding the right offset from the starting address of the derived class object. But why can't it be resolved?

Washu

7,836

February 17, 2012 09:34 AM

First things first...

C++ does not have a "well defined" object layout, so the insides of it are implementation specific.

Anyway, foo won't compile as written: a pointer to a const A will not allow you to modify the integer.
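A minimal sketch of that fix, assuming the intent was simply for foo() to write through the pointer: drop the const from the parameter. The default member initializers and the helpers read_i() and demo_foo() are illustrative additions, not part of the book's example.

```cpp
#include <cassert>

class X { public: int i = 0; };
class A : public virtual X   { public: int j = 0; };
class B : public virtual X   { public: double d = 0.0; };
class C : public A, public B { public: int k = 0; };

// Pointer to non-const A: writing through it is allowed.
void foo(A* pa) { pa->i = 1024; }

// For contrast, a pointer to const A only permits reads.
int read_i(const A* pa) { return pa->i; }

// Hypothetical helper: runs foo() on a complete A and on the A subobject of a C.
bool demo_foo() {
    A a;
    C c;
    foo(&a);
    foo(&c);   // C* converts implicitly to A*
    return read_i(&a) == 1024 && read_i(&c) == 1024;
}
```

Calling demo_foo() from main() should return true: the same foo() reaches X::i in both objects, even though the two objects need not place X at the same distance from the A part.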

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.

Washu

7,836

February 17, 2012 09:56 AM

If you happened to compile it yourself and list the assembly you might get something like...



movq -16(%rbp), %rdi
callq __Z3fooP1A
//...
movq %rax, %rdi
callq __Z3fooP1A

For your calls to foo (after fixing the const issue), foo then looks like:





__Z3fooP1A:                             ## @_Z3fooP1A
Ltmp2:
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp3:
.cfi_def_cfa_offset 16
Ltmp4:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp5:
.cfi_def_cfa_register %rbp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rdi
movq (%rdi), %rax
movq -24(%rax), %rax
movl $1024, (%rdi,%rax)   ## imm = 0x400
popq %rbp
ret
Ltmp6:
.cfi_endproc
Leh_func_end0:

Feel free to figure it out; it's pretty straightforward. Note that this is NOT optimized code. Turning on optimization eliminates most of it.


Rickert

109

Author

February 17, 2012 10:17 AM

Thanks for your answer.

However, I learned assembly on the Motorola 68K, not x86, and that was a long time ago, so I will work through it later once I have learned the x86 instruction set properly. Can you explain the answer from a higher-level point of view?

Washu

7,836

February 17, 2012 10:30 AM

That's AT&T syntax, used by clang/gcc. Intel syntax is the other popular variant, and is much easier to read. I would pull it up for you but I'm not on my windows laptop atm.

In essence, it's a table lookup based on the type to find the offset into the class for the appropriate types.
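A hand-rolled sketch of that lookup may help. It only mimics what a compiler might generate: every name here (XPart, APart, Table, CompleteA, CompleteC, demo_lookup) is hypothetical, and the pointer arithmetic relies on the usual offsetof idiom rather than on anything the standard promises about virtual bases.

```cpp
#include <cassert>
#include <cstddef>

struct XPart { int i; };                        // stands in for the virtual base X

struct Table { std::ptrdiff_t offset_to_x; };   // one table per complete type

struct APart {                // what foo() sees through its "A*"
    const Table* table;       // plays the role of the vptr
    int j;
};

struct CompleteA { APart a; XPart x; };                    // a complete A: A part, then X
struct CompleteC { APart a; double d; int k; XPart x; };   // a complete C: X pushed further back

const Table table_for_A = { static_cast<std::ptrdiff_t>(offsetof(CompleteA, x)) };
const Table table_for_C = { static_cast<std::ptrdiff_t>(offsetof(CompleteC, x)) };

// foo() cannot hard-code the distance to X; it reads it from the table.
void foo(APart* pa) {
    char* base = reinterpret_cast<char*>(pa);
    XPart* px = reinterpret_cast<XPart*>(base + pa->table->offset_to_x);
    px->i = 1024;
}

// Hypothetical helper: the same foo() reaches X in both layouts.
bool demo_lookup() {
    CompleteA a = { { &table_for_A, 0 }, { 0 } };
    CompleteC c = { { &table_for_C, 0 }, 0.0, 0, { 0 } };
    foo(&a.a);
    foo(&c.a);
    return a.x.i == 1024 && c.x.i == 1024
        && table_for_A.offset_to_x != table_for_C.offset_to_x;
}
```

The three loads in foo() correspond roughly to the three instructions in the listing: fetch the table pointer from the object, fetch the offset from the table, then store at base plus offset.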


m3mb3rsh1p

440

February 17, 2012 01:28 PM

With all due respect, Washu, turning the question into an analysis of assembly language doesn't help make it easier to understand. If the concepts are straightforward to you, please simplify them for us.

Antheus

2,410

February 17, 2012 01:44 PM

"This is what I thought from what I read anyway, please verify and correct it for me."

Not for virtual or multiple inheritance. There are multiple vtables, which are resolved during run-time, depending on how the object is constructed.

An A* is not sufficiently defined: it does not tell you the complete type of the object it points into. While in this particular case it might be obvious, it's the exception, not the rule.

IIRC, multiple inheritance should be viewed as each class having its own complete vtable, rather than sharing it across the hierarchy. There's also a ton of rules on how such classes are constructed and destructed. Consider a diamond:

  X
 / \
A   B
 \ /
  C

Given an instance of C, one can cast it to either A or B, but A and B are completely distinct types. So even though we have C which has both, a function that operates on A or B cannot rely on a fixed layout.

Last time I tried to comprehend it I gave up and decided that virtual multiple inheritance is one of those parts of C++ one doesn't use. It's also the reason why essentially no other language supports it; there are just too many complications.

See here.

Rickert

109

Author

February 17, 2012 05:42 PM

I see. It seems that the example in the book is so obvious that it hardly makes sense. From what is written in this section of the C++ FAQ: http://www.parashift...e.html#faq-25.9, it seems there is no ambiguity in the example, since the virtual keyword eliminates the duplication caused by multiple inheritance, so accessing a data member is straightforward.

Since you refer to vtables, I will modify the example to make it a bit clearer:

class X { public: int i; virtual void func(){} };
class A : public virtual X { public: int j; virtual void func(){} };
class B : public virtual X { public: double d; virtual void func(){} };
class C : public A, public B { public: int k; virtual void func(){} };

// cannot resolve which func() is called through pa at compile-time
void foo( A* pa ) { pa->func(); }

int main() {
   foo( new A );
   foo( new C );
   // ...
}

In this new code, each base subobject of a C object, as well as the C object itself, will have a virtual pointer for func(). Which func() will be invoked is not known until run time.

Based on the answer to this question: http://stackoverflow...tion-is-invoked, at run time the virtual pointer of the base class subobject is replaced by that of the derived class. In this case, which func() to call cannot be determined until an actual object is passed into foo() at run time, so the compiler cannot hard-code the function address for the call to func() in foo().

So, I think the memory layout of typical object C is:
1000: int i;          //start of subobject X
1004: __vptr_func_X;  //virtual pointer of func() in X, however, it is pointing to address of __vptr_func_C
1008: int j;          //start of subobject A
1012: __vptr_func_A;  //virtual pointer of func() in A, however, it is pointing to address of __vptr_func_C
1016: double d;       //start of subobject B
1020: __vptr_func_B;  //virtual pointer of func() in B, however, it is pointing to address of __vptr_func_C
1024: __vbcX;         // which points to the start subobject X at address 1000
1028: __vbcA;         // which points to the start subobject A at address 1008
1032: __vbcB;         // which points to the start subobject B at address 1016
1036: int k;
1040: __vptr_func_C;  //virtual pointer of func() in C

About the virtual base class pointer __vbc things, I'm not sure if they are laid out as I wrote them. Or maybe this is compiler specific, and assuming I'm a compiler maker, I can actually do it that way or otherwise, place it at the end of the object C as long as I satisfy the condition of having a virtual base class pointer in the derived class object, is it right?

Antheus

2,410

February 17, 2012 07:10 PM

"Or maybe this is compiler specific"

It is.

"I can actually do it that way or otherwise, place it at the end of the object C as long as I satisfy the condition of having a virtual base class pointer in the derived class object, is it right?"

I don't know what the standard requires, but it does not specify layout, which is one of many things that lead to incompatible ABIs and a lack of compatibility between different compilers and even compiler settings.

SiCrane

11,840

February 17, 2012 09:06 PM

Ok, to understand this, you need to first understand class layouts in simpler cases. This is compiler specific but generally goes as follows. A class with no inheritance and no virtual members is just its data members. For example:



struct A {

  int i;

  int j;

};



0000 i

0004 j

For a class with virtual members you also add a pointer to the vtable. Usually this is at the beginning of the class layout, but nothing requires it to be so.



struct B {

  int i;

  int j;

  virtual ~B() {}

};



0000 vptr to B::vtable

0004 i

0008 j

A class with non-virtual single inheritance places the complete layout of the base class at the front of its layout. If it has any virtual members, these members are made part of an extended vtable whose pointer replaces the pointer in the base class, if the base class has one. If not, a new vptr member is created in the derived part of the layout.



struct C : A {

  int k;

  virtual ~C() {}

};



// beginning of A subobject of C

0000 i

0004 j

0008 vptr to C::vtable

000C k



struct D : B {

  int k;

  virtual ~D() {}

};



// beginning of B subobject of D

0000 vptr to D::vtable

0004 i

0008 j

000C k

With non-virtual multiple inheritance the base classes are generally laid out one after another and the derived class places its members after the base classes. If there are virtual functions, the derived class appends its added virtual functions to one of the base classes' vtables or creates a new vtable. Note that with multiple inheritance there can be multiple virtual function tables associated with the derived class.



struct E {

  int i;

  virtual ~E() {}

};



struct F {

  int j;

  virtual ~F() {}

};



struct G : E, F {

  int k;

  virtual ~G() {}

};



// beginning of E subobject of G

0000 vptr to G::vtable1

0004 i

// beginning of F subobject of G

0008 vptr to G::vtable2

000C j

0010 k
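One visible consequence of this layout is that converting a G* to an F* changes the address. The sketch below is only an illustration under the assumption that G derives from E and F as the layout above describes; offset_within() and demo_adjustment() are hypothetical helpers, and the actual byte offsets depend on the compiler.

```cpp
#include <cassert>
#include <cstddef>

struct E { int i = 0; virtual ~E() {} };
struct F { int j = 0; virtual ~F() {} };
struct G : E, F { int k = 0; virtual ~G() {} };

// Byte distance from the start of a G object to one of its base subobjects.
template <typename Base>
std::ptrdiff_t offset_within(const G& g) {
    const char* whole = reinterpret_cast<const char*>(&g);
    const char* part  = reinterpret_cast<const char*>(static_cast<const Base*>(&g));
    return part - whole;
}

// Hypothetical helper: checks that the casts adjust the pointer and undo it again.
bool demo_adjustment() {
    G g;
    F* pf = &g;                      // implicit conversion moves to the F subobject
    G* back = static_cast<G*>(pf);   // the downcast undoes the adjustment
    return back == &g
        && offset_within<E>(g) != offset_within<F>(g);
}
```

Because nothing here is virtual inheritance, the adjustment is a constant the compiler knows at compile time. That is the part that changes once virtual bases are involved.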

Next there's virtual single inheritance. When virtually inheriting directly from a base class, the derived class places at the front of its layout a pointer to a vtable and its members. The virtual base's members go at the end of the layout, and the offset to the virtual base is placed as a member of the vtable. Non-virtually deriving from one of these classes inserts the new members between the virtual and non-virtual base.



struct H {

  int i;

  virtual ~H() {}

};



0000 vptr to H::vtable

0004 i



struct I : virtual H {

  int j;

};



0000 vptr to I::vtable1 // contains offset to 0008 as beginning of H

0004 j

// beginning of the H subobject of I

0008 vptr to I::vtable2

000C i



struct J : I {

  int k;

};



// beginning of the I subobject of J

0000 vptr to J::vtable1 // contains offset to 000C as beginning of H

0004 j

0008 k

// beginning of the H subobject of J

000C vptr to J::vtable2

0010 i

Finally we can get to the diamond virtual inheritance situation. Here the inheritance is done by aggregating the non-virtual base parts of the base classes like in the non-virtual inheritance situation and then placing the virtual base at the end.



struct K {

  int i;

  virtual ~K() {}

};



0000 vptr to K::vtable

0004 i



struct L : virtual K {

  int j;

  virtual ~L() {}

};



0000 vptr to L::vtable1 // includes offset to 0008 as beginning of K

0004 j

// beginning of the K subobject of L

0008 vptr to L::vtable2

000C i



struct M : virtual K {

  int k;

  virtual ~M() {}

};





0000 vptr to M::vtable1 // includes offset to 0008 as beginning of K

0004 k

// beginning of the K subobject of M

0008 vptr to M::vtable2

000C i



struct N : L, M {

  int l;

};



// beginning of the L subobject of N

0000 vptr to N::vtable1 // includes offset to 0014 as beginning of K

0004 j

// beginning of the M subobject of N

0008 vptr to N::vtable2 // includes offset to 0014 as beginning of K

000C k

0010 l

// beginning of the K subobject of N

0014 vptr to N::vtable3

0018 i

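Here you can see that K::i is at location 0018 in N but 000C in L. A function that is handed an L* cannot tell which of the two layouts it is looking at, so it has to fetch the offset to K from the vtable at run time.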
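If you want to check this on your own compiler, a small sketch along these lines should do. The structs are the ones above with default initializers added; distance_to_i() and print_distances() are hypothetical helpers, and the numbers printed are implementation specific, so they will probably not match the 32-bit offsets listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct K { int i = 0; virtual ~K() {} };
struct L : virtual K { int j = 0; virtual ~L() {} };
struct M : virtual K { int k = 0; virtual ~M() {} };
struct N : L, M { int l = 0; };

// Byte distance from an L subobject to the K::i reached through it.
// Evaluating &pl->i is where the run-time lookup of the K subobject happens.
std::ptrdiff_t distance_to_i(const L* pl) {
    const char* from = reinterpret_cast<const char*>(pl);
    const char* to   = reinterpret_cast<const char*>(&pl->i);
    return to - from;
}

// Prints the distance for a complete L and for the L subobject of an N.
void print_distances() {
    L l;
    N n;
    std::cout << "L alone: " << distance_to_i(&l) << "\n"
              << "L in N:  " << distance_to_i(&n) << "\n";
}
```

On the common ABIs the two distances come out different, which is exactly why a function taking an L* (or the A* in the original question) cannot bake a fixed offset into its code.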
