Better method than smart-pointers for dealing with circular dependency?

Started by
10 comments, last by _the_phantom_ 10 years, 8 months ago

Just turned in an assignment where I ran into this. It's not really a big issue, but I'm wondering if there's a better way to deal with it than what I did.

I have a class 'Game', which has a few members that need to refer back to the owning instance of Game.

In order to do this, the classes for the member objects each have a member: 'Game& game'. Obviously I need to assign this in the initializer list for those classes, which I do. The problem is that this creates a circular dependency between these classes and Game.

Example:


class Game {
public:
  Game();
private:
  Thingy thing;
};

class Thingy {
public:
  Thingy(Game& game);
private:
  Game& game;
};

Game::Game() : thingy(*this) { // <--- Not allowed, since 'this' is not defined yet.
}

Thingy::Thingy(Game& game) : game(game) {
}

So I ended up using std::unique_ptr to store those objects and I just allocate them in the ctor body of 'Game'.

Is there a cleaner way to do this? I suppose I could set up some kind of an event system, but that would be overkill for the small assignments I'm doing, and it means doing an awful lot of legwork for something so simple.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.
Advertisement

Is there a cleaner way to do this?

First of all: If you absolutely must do this then no, not really but I can pretty much guarantee you there is a better way of structuring your code. You should consider avoiding the cyclic dependency altogether. Such a simple loop might seem innocuous at first but over time, in a larger project this can get you into serious trouble and is a nightmare to untangle.

Second of all: using std::unique_ptr is going to get you intro trouble. Game contains an instance of Thingy and Thingy contains a std::unique_ptr<Game> to the Game instance that owns it. When Game's destructor goes off the destructor for Thingy will go off as well triggering the destructor for the std::unique_ptr<Game> instance which will in turn cause the destructor for Game to go off. Stack overflow. If you are new-ing an instance of std::unique_ptr<Game> instide of Thingy as your phrasing suggest the stack won't overflow but there is little point in having a pointer to a smart pointer...all the advantages are lost, you might as well use a regular pointer.

[edit] Forgot to mention, if you do new std::unique_ptr<Game> in Thingy you need to delete it or you will leak memory but doing so will cause the Game and Thingy destructors to call each other recursively until the stack overflows. You just can't win.


//Thingy.h

//Predeclared Game, since Thingy's interface only requires a reference or pointer.
class Game; 

class Thingy
{
	public:
	  Thingy(Game& game);
	private:
	  Game& game;
};

//Thingy.cpp

#include "Thingy.h"
#include "Game.h" //Include Game.h so we can get the full definition.

Thingy::Thingy(Game& game) : game(game)
{
	
}

//Game.h

//We must include the full definition of 'Thingy' here, because Game has a non-reference/non-pointer Thingy as a member.
//If you used the pImple idiom, this can be avoided, but in this case that's not necessary.
#include "Thingy.h" 

class Game
{
public:
  Game();
private:
  Thingy thing;
};

//Game.cpp

#include "Game.h"

Game::Game() : thingy(*this) { // <--- Yes this is allowed
}

Yes this is allowed, since 'this' IS defined AND initialized, but 'Game' isn't fully constructed yet, so you have to make sure Thingy's constructor doesn't actually use it yet - but the address is valid, so Thingy's constructor can save it for later use. In this simple example, Thingy isn't actually using the pointer/reference in the constructor, so it's cool to do.

If you *know* a variable is already constructed, you can even access that variable in the initializer list.

Example:


Game::Game() : a(0), b(this) //'b' can read and write to 'a' using 'this', because 'a' was initialized before 'b'.

Yes this is allowed, since 'this' IS defined AND initialized, but 'Game' isn't fully constructed yet, so you have to make sure Thingy's constructor doesn't actually use it yet - but the address is valid, so Thingy's constructor can save it for later use. In this simple example, Thingy isn't actually using the pointer/reference in the constructor, so it's cool to do.

You can get it to work, you can even do so without breaking the implementation out into multiple files; however, from the MSDN:

The this pointer is valid only within nonstatic member functions. It cannot be used in the initializer list for a base class.

The base-class constructors and class member constructors are called before this constructor. In effect, you've passed a pointer to an unconstructed object to another constructor. If those other constructors access any members or call member functions on this, the result will be undefined. You should not use the this pointer until all construction has completed.

It triggers a warning that most developers elevate to an error. My understanding is that the C++ language specification doesn't insist that 'this' be defined until the class's user defined constructor enters scope but it just so happens that the Visual Studio implementation defines it before churning through the initializer list so you can get off with a warning.

I recommend a solution of lazy or two-step initialization. I also recommend that internally it follows a pimpl idiom.


Why does it absolutely need a reference? Is there a reason that a pointer will not work? And why does it absolutely need it during construction? My guess is that it doesn't NEED either. Make it a pointer that is initialized to null. Later, after the other parts are set up, set the pointer on every object that will need it.

Extending that further you could also allow both. As an obvious example within C++ think of the file stream library. You can use the constructors that automatically open a file. You can also use the constructors that don't, then open a file later. The same with your code, you can provide a version that accepts the parameter, and also provide a constructor that doesn't require the parameter and lets you set it later.

In our current engine at work there is actually a 4-step game object initialization process. The game object sees a constructor which is nearly always empty. Then an OnCreate() method is called where you know all the base member variables are created and initialized; here you set up your own values but do not touch other components, the function is skipped if the object is being loaded through serialization. Next an OnStartup() method where you register with other components and recreate any non-persisted values, this is run both after deserialization and on new object creation. Finally, when the object is placed in the world, OnPlacedInWorld(), only a few objects need this additional step, but sometimes they need to hook up with nearby objects and this provides that opportunity.

Is there a cleaner way to do this?


First of all: If you absolutely must do this then no, not really but I can pretty much guarantee you there is a better way of structuring your code. You should consider avoiding the cyclic dependency altogether. Such a simple loop might seem innocuous at first but over time, in a larger project this can get you into serious trouble and is a nightmare to untangle.

Second of all: using std::unique_ptr is going to get you intro trouble. Game contains an instance of Thingy and Thingy contains a std::unique_ptr<Game> to the Game instance that owns it. When Game's destructor goes off the destructor for Thingy will go off as well triggering the destructor for the std::unique_ptr<Game> instance which will in turn cause the destructor for Game to go off. Stack overflow. If you are new-ing an instance of std::unique_ptr<Game> instide of Thingy as your phrasing suggest the stack won't overflow but there is little point in having a pointer to a smart pointer...all the advantages are lost, you might as well use a regular pointer.

[edit] Forgot to mention, if you do new std::unique_ptr<Game> in Thingy you need to delete it or you will leak memory but doing so will cause the Game and Thingy destructors to call each other recursively until the stack overflows. You just can't win.


huh.png


Game::Game() : thingy(*this) { // <--- Yes this is allowed
Yes this is allowed, since 'this' IS defined AND initialized, but 'Game' isn't fully constructed yet, so you have to make sure Thingy's constructor doesn't actually use it yet - but the address is valid, so Thingy's constructor can save it for later use. In this simple example, Thingy isn't actually using the pointer/reference in the constructor, so it's cool to do.


Is this standard or VC? This was my first instinct, but the warning suggested that it was unsafe. I'm not hot on the idea of abusing implementation behavior, since it breaks portability. It'd be perfect if it's standard, though.


I recommend a solution of lazy or two-step initialization. I also recommend that internally it follows a pimpl idiom.


Why does it absolutely need a reference? Is there a reason that a pointer will not work? And why does it absolutely need it during construction? My guess is that it doesn't NEED either. Make it a pointer that is initialized to null. Later, after the other parts are set up, set the pointer on every object that will need it.

Extending that further you could also allow both. As an obvious example within C++ think of the file stream library. You can use the constructors that automatically open a file. You can also use the constructors that don't, then open a file later. The same with your code, you can provide a version that accepts the parameter, and also provide a constructor that doesn't require the parameter and lets you set it later.


Just trying to avoid the mess and the dynamic allocation, really. I suppose pimpl could reduce it to a single allocation rather than one per object, but then the internals have that extra layer of complexity.

I've been warned against using multiple construction, so I usually try to avoid it. In this case it seems overkill compared to the smart pointers.

In our current engine at work there is actually a 4-step game object initialization process. The game object sees a constructor which is nearly always empty. Then an OnCreate() method is called where you know all the base member variables are created and initialized; here you set up your own values but do not touch other components, the function is skipped if the object is being loaded through serialization. Next an OnStartup() method where you register with other components and recreate any non-persisted values, this is run both after deserialization and on new object creation. Finally, when the object is placed in the world, OnPlacedInWorld(), only a few objects need this additional step, but sometimes they need to hook up with nearby objects and this provides that opportunity.


Holy crap, man!

Couldn't you just create separate ctors for normal construction and deserialization construction? I mean, I get the OnPlacedInWorld() part, but it sounds like OnCreate() and OnStartup() could just be normal ctors that call a common member function after the serializable values are set. Why do you guys avoid using the ctor? Or, rather, if it needs to be plugged in this way, why not a pair of factory functions?



Thanks, all for your responses so far.
void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

You can get it to work, you can even do so without breaking the implementation out into multiple files; however, from the MSDN:

The this pointer is valid only within nonstatic member functions. It cannot be used in the initializer list for a base class.

Good point! We're not using it in a base class in this example, though.

The base-class constructors and class member constructors are called before this constructor.

That is saying is that the base class constructor and member-variable constructors get called before your class's constructor's code gets executed.

In effect, you've passed a pointer to an unconstructed object to another constructor. If those other constructors access any members or call member functions on this, the result will be undefined.

Yep, that's what I mentioned, poorly worded, in my previous post: "[your class] isn't fully constructed yet, so you have to make sure Thingy's constructor doesn't actually use it yet - but the address is valid"

You should not use the this pointer until all construction has completed.

Indeed. So we're not using it. Just passing it. wink.png

It triggers a warning that most developers elevate to an error. My understanding is that the C++ language specification doesn't insist that 'this' be defined until the class's user defined constructor enters scope but it just so happens that the Visual Studio implementation defines it before churning through the initializer list so you can get off with a warning.


As long as it has the memory address of the class, we can safely pass it around, afaik. However, just because it has the memory address doesn't mean the memory has been initialized (it hasn't been, in this case). Since the memory isn't initialized, then we can't safely read or write to the pointer when we dereference it.

I might be mistaken about this! I was trying to look it up in the C++ standard, but I'm not familiar with the standard, so I didn't come up with a clear answer... but it seemed to reinforce what I was saying.

The closest I found was this quote highlighted in red below, and a few examples of valid code snippets in the standard that uses the 'this' pointer: (The code comments are part of the standard's example also)

To form a pointer to (or
access the value of) a direct non-static member of an object obj, the construction of obj shall have started
and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing
the member value) results in undefined behavior.
{ Example:


struct A { };
struct B : virtual A { };
struct C : B { };
struct D : virtual A { D(A*); };
struct X { X(A*); };

struct E : C, D, X
{
E() : D(this), // undefined: upcast from E* to A*
               // might use path E* ? D* ? A*
               // but D is not constructed
               // D((C*)this), // defined:
               // E* ? C* defined because E() has started
               // and C* ? A* defined because
               // C fully constructed

     X(this) { // defined: upon construction of X,
               // C/B/D/A sublattice is fully constructed
     }
};
—end example }

§ 12.6.2 paragraph 12 (...I think. I'm still learning how to read this thing)
Names in the expression-list or braced-init-list of a mem-initializer are evaluated in the scope of the constructor for which the mem-initializer is specified. { Example:


class X {
    int a;
    int b;
    int i;
    int j;

public:
    const int& r;
    X(int i): r(a), b(i), i(i), j(this->i) { }
};
initializes X::r to refer to X::a, initializes X::b with the value of the constructor parameter i, initializes
X::i with the value of the constructor parameter i, and initializes X::j with the value of X::i; this takes
place each time an object of class X is created. —end example } { Note: Because the mem-initializer are
evaluated in the scope of the constructor, the this pointer can be used in the expression-list of a meminitializer to refer to the object being initialized. —end note
}


(This is from N3376, which is a draft version a dozen or so days after the C++11 standard. i.e. it is the C++11 standard with one or two minor corrections)

It's dangerous because you don't know whether the constructor of the class you are passing the 'this' pointer to tries to access it or not. But it's is valid and standard, as far as I can tell. Though I still can't find a guaranteed firm statement in the standard, the examples of valid code and the highlighted note hint at it.

I use it in the MinGW compiler, and I haven't noticed a warning, but I might have it accidentally disabled or something.

The C++FAQ says:

Some people feel you should not use the this pointer in a constructor because the object is not fully formed yet. However you can use this in the constructor (in the {body} and even in the initialization list) if you are careful.

[...snip...]

Here is something that sometimes works: if you pass any of the data members in this object to another data member's initializer, you must make sure that the other data member has already been initialized. The good news is that you can determine whether the other data member has (or has not) been initialized using some straightforward language rules that are independent of the particular compiler you're using. The bad news is that you have to know those language rules (e.g., base class sub-objects are initialized first (look up the order if you have multiple and/or virtual inheritance!), then data members defined in the class are initialized in the order in which they appear in the class declaration). If you don't know these rules, then don't pass any data member from the this object (regardless of whether or not you explicitly use the this keyword) to any other data member's initializer! And if you do know the rules, please be careful.


So again he's saying it's dangerous if you try to access a member-variable that hasn't been constructed yet.
But you can still safely store the 'this' pointer for later use. Leastwise, that's what I get out of it.

Ah, cool. Okay.

Yeah, it is dangerous, but it can be used in this case.

However, I went outside to smoke just now and realized that this whole situation is resulting from inappropriate intimacy between the objects. They should be communicating their state to Game through getters and letting Game handle the interactions rather than interacting with one another. I'm going to try to adjust it and see if I can get rid of the references to game.

Edit:

Oooooooh yeah. That's so much cleaner. I'm gonna resubmit this one.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

frob, on 24 Aug 2013 - 01:34 AM, said:snapback.png

In our current engine at work there is actually a 4-step game object initialization process. The game object sees a constructor which is nearly always empty. Then an OnCreate() method is called where you know all the base member variables are created and initialized; here you set up your own values but do not touch other components, the function is skipped if the object is being loaded through serialization. Next an OnStartup() method where you register with other components and recreate any non-persisted values, this is run both after deserialization and on new object creation. Finally, when the object is placed in the world, OnPlacedInWorld(), only a few objects need this additional step, but sometimes they need to hook up with nearby objects and this provides that opportunity.

Holy crap, man!

Couldn't you just create separate ctors for normal construction and deserialization construction? I mean, I get the OnPlacedInWorld() part, but it sounds like OnCreate() and OnStartup() could just be normal ctors that call a common member function after the serializable values are set. Why do you guys avoid using the ctor? Or, rather, if it needs to be plugged in this way, why not a pair of factory functions?

This is pretty standard. It's not that you can't achieve the same with multiple constructors but breaking it up into multiple function calls or passes allows a few different things:

1) External systems can run code in-between the initialization passes. This is almost a necessity, especially for games.

2) Code reuse. Often times the default constructor does stuff that all the other initialization functions want done. If everything was broken out into multiple, overloaded constructors then a lot of code would get copied across them.

3) Clear intent. OnPlacedInWorld() OnCreate() and OnStartup() tells would be modifiers of the class where to put various bits of initialization code while the systems using the class know when to call what.

4) Performance. As mentioned, the minimally filled out default constructor allows faster serialization as you can avoid doing things that will simply be undone by the serializer.

All those and more.

5) Faster loading by lazy initialization. You can create lightweight proxy objects during the busy times, and then during idle times (or on demand) fill in the holes.

6) Avoid the code smell of "the whole game gets loaded during the splash screen". One title I worked with did put most everything in the constructor. Between constructors and static objects, about half of the entire game code was processed all before entering main().

7) Interface Segregation Principle antipattern. These are necessarily part of the base GameObject class which tends to be monolithic as an intentional design choice. By keeping the functions virtual the individual game objects can be more easily extendable. You can provide an empty constructor (relying on the GameObject base constructor), and frequently provide no overrides for any of the other initialization function if the basic GameObject functionality is enough.

8) Simplifies the use of resources, improves performance, and decouples responsibilities. The object is fully constructed and useable at construction, but it is not interconnected with the rest of the system; this is much like a file stream that is fully constructed but not open or using system resources. The second says "You were just created and need to have your values set", which is the opposite of "you were just loaded from disk, let me reset your values for you." The third says "open up resources you need and register with the rest of the components". The fourth says "You are being placed on stage in front of the player, start doing your thing." Each phase is a different task.

I'm sure I could expand the list more, but hopefully you get the point.

It isn't a case of non-construction. The game object is fully constructed after the constructor. Instead it is a case of only doing the minimal work required at that point. Don't request resources until you need them.

One of the worst things a game can do is to naively construct everything individually simultaneously. Bonus stupid for saying "Creating all these objects are slow, I'll do it multithreaded", but that is exactly what many inexperienced programmers do when loading is slow or when there are cycles during initialization. When the system is already waiting for 20 requests to create your objects, you don't speed things up by adding an additional 20 requests on top of it for extra resources. Anything that is resource-intensive, such as object creation, should prefer to defer work either by picking up a proxy from the engine that will be handled later (when the system is less busy) in some form of priority queue, or through a massive pre-processed load-in-place where nothing needs to be constructed, or through some other lazy / late streaming system.

It is fairly obvious when games make that mistake. Just open up your resource monitor and watch how many simultaneous disk reads are pending. I've seen games that approach a thousand queued simultaneous reads. Avoid the bottlenecks of heavy construction.

This topic is closed to new replies.

Advertisement