Polymorphic sizeof() Operator?

7 comments, last by Zahlman 17 years, 6 months ago
Hello. I'm having a bit of trouble with the sizeof() operator in C++. I'm working on a garbage collection system built on smart pointer objects, and so far they've worked beautifully. The smart pointer that does the garbage collection is Handle<typename T, unsigned Size>. Handle supports arrays, which is why it has a Size template parameter. This enables some neat bounds checking features, such as catching assignments between Handles that would produce an array whose maximum index refers to unallocated/unreserved memory.

However, to get this feature fully functional, I need a way to use the sizeof() operator polymorphically. The problem is polymorphic classes. Let's say we have two simple classes called A and B like this:
class A
{
public:
    A(): _myInt(0) {}
    virtual ~A() {}
private:
    int _myInt;
};
class B : public A
{
public:
    B(): _myFloat(0.0f) {}
    virtual ~B() {}
private:
    float _myFloat;
};




Simple enough. As I said before, with my Handle class, I can do something like this:
...
Handle<A, 10> hva1 = new A[10];
Handle<A, 3> hva2 = &hva1[2]; // Safe.
Handle<A, 6> hva3 = &hva1[8]; // Error! Throws exception! Bounds overlap!
...




The last line there will throw an exception because hva3 is declared an array of size 6, but it refers to an array of As that is of size 2 (hva1[8] and hva1[9]). Removing that line will get rid of the exception and run just fine, letting the collector do the cleanup work. However, this will not work properly:
...
Handle<A, 100> hvb1 = new B[10]; // No complaint; Bs are As.
Handle<A, 20> hvb2 = &hvb1[79]; // This should work, but depending on the nature of A and B, it may throw an exception!
...




I haven't tested this specific example, but I know that depending on how A and B are defined, the second line may or may not throw an exception, which means that the behavior is essentially undefined and dangerous. I know this sort of assignment into arrays is strange, but I'm trying to make Handle as flexible as possible. So, after all of that rambling, my question is... is there a way to use sizeof() polymorphically? If I have a pointer of type A that points to a B, can I somehow get the size of B when I pass that (dereferenced) pointer to sizeof()? Thanks! I appreciate any help.
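
The sub-array bounds check described above for hva2 and hva3 might look roughly like the sketch below. It counts in elements, not bytes, and the names check_subarray and subarray_fits are hypothetical helpers invented for illustration; they are not part of the Handle class discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not from the thread): verifies that a sub-array of
// `sub_count` elements starting at `start_index` lies entirely inside a
// parent array of `parent_count` elements. Throws if it does not.
inline void check_subarray(std::size_t parent_count,
                           std::size_t start_index,
                           std::size_t sub_count)
{
    // Compare against the remaining room instead of adding the two values,
    // so the check itself cannot overflow.
    if (start_index > parent_count || sub_count > parent_count - start_index)
        throw std::out_of_range("sub-array exceeds parent array bounds");
}

// Convenience wrapper: true when the check passes, false when it throws.
inline bool subarray_fits(std::size_t parent_count,
                          std::size_t start_index,
                          std::size_t sub_count)
{
    try {
        check_subarray(parent_count, start_index, sub_count);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With these helpers, subarray_fits(10, 2, 3) is true (the hva2 case) and subarray_fits(10, 8, 6) is false (the hva3 case). The check is only meaningful when every element has the size the Handle assumes, which is exactly what breaks in the B-through-A example.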
sizeof alone will not work polymorphically; it's not actually a function (i.e. it is determined at compile-time, not run-time).
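
A small sketch of that point, using stand-in classes rather than the original poster's (B is made deliberately larger than A so the difference is visible). It contrasts sizeof, which only sees the static type, with typeid, which does look at the dynamic type of a polymorphic object. The helper names static_size and is_really_b are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>

// Stand-in classes for illustration only.
struct A { virtual ~A() {} int myInt = 0; };
struct B : A { double extra[4] = {}; };  // deliberately larger than A

// sizeof works on the static type of its operand (here A), and the operand
// is never evaluated, so the result is fixed at compile time.
inline std::size_t static_size(const A& a) { return sizeof(a); }

// typeid, by contrast, inspects the dynamic type of a polymorphic object.
inline bool is_really_b(const A& a) { return typeid(a) == typeid(B); }
```

Passing a B to static_size still yields sizeof(A), even though is_really_b reports that the object is a B.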

I can think of a couple ways of getting the size, but without knowing exactly how/when you need your size, I don't know which is appropriate. One way:

class A
{
public:
    virtual size_t ClassSize() const { return sizeof(A); }
    // rest of A
};
class B : public A
{
public:
    virtual size_t ClassSize() const { return sizeof(B); }
    // rest of B
};
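
For what it's worth, here is a compilable sketch of that approach in use. The data members, the extra class C, and the dynamic_size helper are assumptions added for illustration. It also shows one caveat: a derived class that forgets to override ClassSize() silently reports its parent's size.

```cpp
#include <cassert>
#include <cstddef>

struct A {
    virtual ~A() {}
    virtual std::size_t ClassSize() const { return sizeof(A); }
    int myInt = 0;
};

struct B : A {
    std::size_t ClassSize() const override { return sizeof(B); }
    double extra[4] = {};  // illustrative payload so B is larger than A
};

// C does not override ClassSize(), so it inherits B's answer.
struct C : B {
    char more[64] = {};
};

// Asks the object for its own size through the virtual call.
inline std::size_t dynamic_size(const A& obj) { return obj.ClassSize(); }
```

Here dynamic_size returns sizeof(B) for a B viewed through an A reference, but it also returns sizeof(B) for a C, which is too small.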
Thanks for the reply. :-)

I need to know the size of the pointed-to object within Handle, which only knows the object's type through its template parameter. I thought about your approach, but that means that all objects would have to be derived from some common base that defines a pure virtual function returning the object's size. I'd like my collection system to work with any type, both user-defined and built-in, without the need for deriving from such a base.

Is this possible, or am I just out of luck? Supporting this "array safety" feature isn't absolutely necessary, but I would really like to implement it. If I can't, clients of Handle will have to know that assignments between Handles that point to arrays of polymorphic types may have some unexpected results. :-( (A rare situation, I know, but it's the principle of the thing. ;-) )

To be a bit more specific, Handle uses the sizeof() operator to figure out the size of the block of memory that must be allocated for the object or array it points to. In the case of non-array Handles, this is just the size of the type. However, polymorphism makes this difficult because a Handle that points to objects of type A but is assigned an object of type B will report a referenced block of memory equal to sizeof(A) in size.

I can't think of any circumstances yet where this becomes a problem when not allocating arrays, which is why I say it's not imperative that I can implement this properly. In fact, I originally didn't care at all about the size of the block of memory but only the memory location of the variable. The addition of array support is what introduced tracking the block size.

This all comes into play when a new reference is made. If the address of the new reference falls within the memory block of a previously added reference, the references are considered to be the same. This is because the previous reference must act as a unit (like in the case of an array) and any "subreferences" (references within that memory block) refer to some subset of that previous reference.
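
The "falls within the memory block" test described above could be sketched like this. The Block record and the make_block and contains functions are hypothetical; the actual bookkeeping inside the collector isn't shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record of one tracked allocation.
struct Block {
    std::uintptr_t start;  // address of the first byte
    std::size_t    size;   // length in bytes
};

inline Block make_block(const void* p, std::size_t size)
{
    return Block{reinterpret_cast<std::uintptr_t>(p), size};
}

// True if `p` points into the half-open range [start, start + size).
inline bool contains(const Block& b, const void* p)
{
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    return addr >= b.start && addr - b.start < b.size;
}
```

If the recorded size is too small (say, computed from sizeof(A) when the elements are really Bs), contains returns false for addresses that do belong to the block, so a sub-reference would be treated as a separate allocation.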

Thanks again! I really appreciate it. Any more insight/ideas would be great.
Something like this, perhaps?

(WARNING: This will fail horribly in case of a base pointer pointing to a derived object and sizeof(base) != sizeof(derived). Use with care.)
template <typename T, size_t size>
class Handle
{
public:
    template <typename U>
    Handle(U* uptr) : element_size(sizeof(U))
    {
        // ...
    }
    // ...
private:
    size_t element_size;
    // ...
};
Thanks for the reply, Sharlin.

I would like to avoid requiring the user to provide any extra information by invoking a parameterized constructor, but it seems to be the best solution. I need the information from somewhere, right? Using the parameterized constructor could also be optional, depending on the circumstances.

I also realized that there is a very important reason to accurately know the size of a memory block even when not allocating an array! I need to know in order to prevent circular references! If the proper size of the memory block is not known, it is possible to create circular references that cause serious problems, especially with arrays. (I just tested a few cases.)

For instance, I can define the class A this way without problems:

class A
{
public:
    A(): _myInt(0), _myHandle(this) {}
    virtual ~A() {}
    Handle<A>& GetHandle()
    {
        return _myHandle;
    }
private:
    int _myInt; // declared here because the constructor initializes it
    Handle<A> _myHandle;
};

I can also do something like this safely without worrying about dangling objects, or in this case, an entire array that stays on the heap:

...
Handle<A, 10> hva1 = new A[10];
hva1[0].GetHandle() = &hva1[4];
/* hva1[0]'s Handle refers to hva1[4], so even when hva1 goes out of scope
there is still a reference to hva1[4] and thus that block of memory remains.
However, Handle is smart enough to detect this and delete the array as soon as
hva1 goes out of scope.
*/
...

This isn't always the case though when Handle is used polymorphically. If Handle points to objects of type A but is assigned to some far descended object of type E of a very different size, then the above could easily result in a circular reference.

Thanks again, Sharlin. I'll try adding a special set of constructors that allow an appropriate size to be passed in. This doesn't completely solve the problem, but it's very helpful.
Hm, the user doesn't need to provide any explicit information; the template parameters of a function template are deduced from the parameter types if possible (actually, in case of constructors, there *isn't* any way to explicitly specify them!).
class Base { int i; };
class Derived : public Base { int j; };

Handle<Base, 1> h1 = new Base; // element_size == sizeof(Base)
Handle<Base, 1> h2 = new Derived; // element_size == sizeof(Derived)
// BUT:
Base* p1 = new Derived;
Handle<Base, 1> h3 = p1; // BOOM: element_size == sizeof(Base)
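
A stripped-down, compilable version of the idea, as a sketch only: this Handle does no garbage collection or ownership at all, and the accessor element_size() plus the payload in Derived are assumptions added so the effect can be observed.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for Handle: it only records the pointer and the size of
// the static type the constructor was called with.
template <typename T, std::size_t Size>
class Handle {
public:
    // U is deduced from the argument's static pointer type.
    template <typename U>
    Handle(U* uptr) : ptr_(uptr), element_size_(sizeof(U)) {}

    std::size_t element_size() const { return element_size_; }
    T* get() const { return ptr_; }

private:
    T*          ptr_;
    std::size_t element_size_;
};

struct Base { virtual ~Base() {} int i = 0; };
struct Derived : Base { double j[4] = {}; };  // deliberately larger than Base
```

Constructing a Handle<Base, 1> from a Derived* records sizeof(Derived), but constructing it from a Base* that happens to point at a Derived records sizeof(Base), because deduction only ever sees the static type.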
Yep, you're right. I totally read your code wrong and added the template in the wrong place (so that it became Handle<typename T, typename U, size_t Size>)!

My fault. >_<

Thanks again.
I'm wondering: why do you need this? You already know the type of objects when creating an array of them, so you know their size.

The point of converting "Array of B" to "Array of A" is moot, since you would need to restrict array writes and store the internal stride (i.e. both sizeof(B) and the maximum number of elements) for iterating anyway.
Uh, yeah, like he said. As soon as you have 'Handle<A, 100> hvb1 = new B[100];' (I assume '10' was a typo), you're screwed anyway. If sizeof(B) != sizeof(A), then you've allocated more than 100 A's worth of space. But there's no way for Handle<A, n> (for any n) to "know" that it's pointing at storage for Bs; it decided *at compile time* that it's pointing at A storage, and its operator[] will return an A, and it will index into that storage assuming sizeof(A), etc. At runtime it is too late. The pointed-at array allocation is not an object that you can query for the element-type.

Frankly, that line shouldn't compile as it is, and if it somehow does, you need to rethink your design.

(And yes, this is related to the phenomenon of "object slicing".)
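
To make the stride problem concrete without actually indexing a B array through an A pointer (which would be undefined behaviour), here is a sketch that only compares the byte offsets involved. The classes and the helpers element_offset and stride_mismatch are illustrative assumptions, with B made deliberately larger than A.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in classes for illustration only.
struct A { virtual ~A() {} int myInt = 0; };
struct B : A { double extra[4] = {}; };  // deliberately larger than A

// Byte offset of element `i` in an array whose elements are of type Elem.
template <typename Elem>
constexpr std::size_t element_offset(std::size_t i)
{
    return i * sizeof(Elem);
}

// True when indexing as View lands on a different byte than the one where
// element `i` of the Stored array really begins.
template <typename Stored, typename View>
constexpr bool stride_mismatch(std::size_t i)
{
    return element_offset<Stored>(i) != element_offset<View>(i);
}
```

Element 0 lines up either way, but for every later index the A-sized stride lands somewhere inside an earlier B whenever sizeof(A) != sizeof(B).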

This topic is closed to new replies.
