Two-Phase construction...

8 comments, last by reaperrar 11 years, 3 months ago

I'm starting my first big project and I want to stick to a strict coding standard. It is probable in the future other people will be working with my code.

I'm a fan of two-phase construction (and destruction), yet most online sources I've found advise against it. The general pros and cons I've discovered are:

Two-Phase:

Pros:

  • Can control initialization and cleanup of object outside allocating space for the object. So if I wanted an array of objects that initialize with certain variables it doesn't get annoying if objects don't have default constructors.
  • Can have a return result for success or failure. Little simpler than throwing an exception & catching it outside object construction.
  • Can call virtual functions within initialization.

Cons:

  • Base class initialize functions are not hidden. They can be protected though; it just appears a little less elegant, I guess.
  • Less intuitive for other programmers. I'm not sure I 100% agree with this one: if you have a strict coding standard where all your objects need to be initialized manually after construction, it shouldn't be too difficult to get used to.
  • Compilers already enforce the order of construction/destruction when using constructors/destructors, so single-phase is less error-prone, and re-inventing the wheel, so to speak, may be unnecessary.

Single-Phase:

Pros:

  • The opposite of all two-phase cons.
  • Fewer states in which the class invariants don't hold? Not 100% clear on this.

Cons:

  • The opposite of all two-phase pros.

So given these points, I'm still leaning toward two-phase, yet I can't bring myself to do it since so many people oppose it. Curious to see what other people on this forum think.


The answer depends on whether you're thinking of using it EVERYWHERE in your code/engine as a rule, or just in certain places or objects. Generally, trying to use it everywhere sounds like a bad idea. But if, for example, you're thinking of using it just for game objects (stuff that needs to be initialized with data loaded from a file, etc.), then that would make more sense. Not all game objects might need it, but if they generally do, and you prefer that pattern of usage, then that makes sense. It also keeps things simple for your future work, or for other devs working with the code later. The rule would be that the norm is single-phase construct/destruct, except for game objects, which all go through a two-stage construct + init process.

My philosophy for architectural things like this is that I try to think of all the consequences of the decision, in all cases, in relation to my initial objectives. If it leads me to any odd or awkward situations, then that's a big red flag that there's something wrong with the design. If it all works together smoothly and accomplishes my objectives, then that's probably the way to go. This also applies for when I'm implementing code later, after the architecting phase is over. If I'm writing code where I'm "working around" stuff, or hacking to allow some code to work with other code... it usually means I'm violating some of my own initial design, or the design was flawed to begin with. But, usually it's that I'm violating the design.

I think I disagree with some of your pros:
Two-Phase:
Pros:
  • Can control initialization and cleanup of object outside allocating space for the object. So if I wanted an array of objects that initialize with certain variables it doesn't get annoying if objects don't have default constructors.
  • Can have a return result for success or failure. Little simpler than throwing an exception & catching it outside object construction.
  • Can call virtual functions within initialization.


I'll tell you how I would achieve those things without the two-phase construction. In order:
  • You can allocate the space without calling the constructor. This is a little bit tricky, but std::vector does it for you in the vast majority of situations where you might need to do this (you can call .reserve on a vector to allocate the space, but the object won't be constructed until you do something like calling .push_back).
  • You can set an internal flag to indicate bad status if the construction failed. You can see an example in std::ifstream, where you can use a constructor that tries to open a file, and then you can check if the opening succeeded or not with .good(). I don't believe most classes will need to do this.
  • I have never had a need to call virtual functions from a constructor. Then again, I don't use inheritance a whole lot.
But the main problem with two-phase construction is that you can't really take advantage of RAII, which is a wonderful feature of well-written C++.
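
To make the first point concrete, here is a minimal sketch of allocating storage up front and constructing objects later with std::vector. The class Enemy and the function makeEnemies are hypothetical names invented for this illustration; they are not from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class with no default constructor.
class Enemy {
public:
    explicit Enemy(int health) : health_(health) {}
    int health() const { return health_; }
private:
    int health_;
};

// Allocates storage once, then constructs each element with its own value.
std::vector<Enemy> makeEnemies(std::size_t count, int startingHealth) {
    std::vector<Enemy> enemies;
    enemies.reserve(count);      // raw storage only; no Enemy exists yet
    assert(enemies.empty());
    for (std::size_t i = 0; i < count; ++i) {
        // The element is constructed here, inside the reserved storage.
        enemies.push_back(Enemy(startingHealth + static_cast<int>(i)));
    }
    return enemies;
}

// Usage sketch; call from main().
void demoMakeEnemies() {
    std::vector<Enemy> enemies = makeEnemies(3, 10);
    assert(enemies.size() == 3);
    assert(enemies[2].health() == 12);
}
```

No Init() function is needed: every Enemy that exists in the vector has already run its constructor.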

Using the non-OO method of two-phase creation means you can not have class invariants to help you reason about your code. You are in effect using procedural idioms. Do not fool yourself into thinking otherwise.

It is possible to maintain a class invariant in a procedural manner if you ignore the object state between creation and initialization and between tear-down and deallocation, and you add state checking to each and every access (so every member function needs to ensure the state of the object is valid, i.e. the invariants are true). Or, like most production code, you just ignore error states and let the OS clean up on crashes -- this is the 'simpler than throwing an exception' method.

Here's another set of data points for your consideration.

  • All members are initialized in the constructor regardless of your intentions, and the 'initializer' only assigns the members new values later. That's fine if all your members are held by pointer, but otherwise you've just increased the construction cost -- and given that the new operator is very expensive, you might want to reconsider always holding members by pointer.
  • Using procedural methods of object construction prevents the effective use of RAII, the single most powerful and useful idiom in C++.
  • Procedural initialization means explicitly chaining initializers up the class hierarchy.
  • Procedural initialization has a much greater potential for resource leaks, since you need to check and manually unwind on failure. See above about holding members by pointer.
  • Procedural initialization requires either convoluted if-blocks or (preferably in this case) cascading gotos for any non-trivial classes. Or you just use the method that's simpler than throwing an exception.
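
As a rough illustration of the RAII and unwinding points in the list above, the sketch below shows a constructor that fails partway through. Resource, Owner and tryMakeOwner are hypothetical names made up for this example; the live counter exists only so the cleanup can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource that counts live instances.
struct Resource {
    static int live;
    explicit Resource(bool fail) {
        if (fail) throw std::runtime_error("acquisition failed");
        ++live;
    }
    ~Resource() { --live; }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};
int Resource::live = 0;

// Single-phase construction: members are built in declaration order.
// If second_ throws, first_ is destroyed automatically by the language;
// no if-blocks or gotos are written to unwind.
class Owner {
public:
    explicit Owner(bool failSecond) : first_(false), second_(failSecond) {}
private:
    Resource first_;
    Resource second_;
};

// Returns true if an Owner could be constructed.
bool tryMakeOwner(bool failSecond) {
    try {
        Owner owner(failSecond);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Usage sketch; call from main().
void demoUnwind() {
    assert(!tryMakeOwner(true));   // second resource failed
    assert(Resource::live == 0);   // first resource was released anyway
}
```

With a two-phase Init() the equivalent code would have to release the first resource by hand on every failure path.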

Two-phase destruction of objects is even more problematic, especially in terms of resource leakage. That's just lunacy gone mad.

If your standards require that you write more code than necessary and that you write more error-prone code, you might be doing it wrong. If your standards require you write code that makes it harder to reason about the internal state of your program and makes it harder for another programmer to come along and pick up on what's going on, it might be failing to fulfill its main purpose.

By the way, if you're using a compiler that conforms to the C++11 standard, uniform initialization lets you initialize arrays and even std::vectors in a constructor without the need for default constructors.
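
A small sketch of what that can look like, assuming a C++11 compiler. MyClass mirrors the class used later in this thread; Holder and its accessors are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// No default constructor, as in the thread's example class.
class MyClass {
public:
    MyClass(int i) : m_i(i) {}
    int value() const { return m_i; }
private:
    int m_i;
};

// Hypothetical owner with an array member and a vector member,
// both brace-initialized in the member initializer list.
class Holder {
public:
    explicit Holder(int v)
        : fixed_{MyClass(v), MyClass(v), MyClass(v)},
          dynamic_{MyClass(v), MyClass(v + 1)} {}

    int fixedAt(std::size_t i) const { return fixed_[i].value(); }
    int dynamicAt(std::size_t i) const { return dynamic_[i].value(); }
    std::size_t dynamicSize() const { return dynamic_.size(); }

private:
    MyClass fixed_[3];
    std::vector<MyClass> dynamic_;
};

// Usage sketch; call from main().
void demoHolder() {
    Holder h(5);
    assert(h.fixedAt(2) == 5);
    assert(h.dynamicSize() == 2);
    assert(h.dynamicAt(1) == 6);
}
```

Note that the braces still need one entry per element, so this suits small fixed sizes rather than very large arrays.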

Stephen M. Webb
Professional Free Software Developer

I'm a little confused about how I could go about doing this...


class MyClass
{
	public:
		MyClass(int i) : m_i(i) {}
		~MyClass() {}
		
	private:
		int m_i;
};

int main()
{
	const unsigned int uiArraySize = 1000;
	int iInitializeValue = 5;
	
	MyClass oClass[uiArraySize] = {iInitializeValue, iInitializeValue, iInitializeValue...};

	return 0;
}

If I want all instances of the class to initialize with the value "iInitializeValue", is there some automatic way this can be done instead of copy-pasting 1000 times xD?

EDIT: ...Without using anything other than syntax? I don't want to be forced to use something in std::

EDIT: Also, I assume achieving the same thing is out of the question with the new operator when creating an array, though it would then seem more appropriate to use std::vector... unless the object provided to be copied from is non-copyable. /sigh

You should really do it using std::vector. If you are allergic to std::, you are programming in the wrong language, but you can make an array of char of the correct size, somehow enforce proper alignment, and then use placement new.
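
For reference, a minimal sketch of the char-array approach described above, assuming C++11 (for alignas). MyClass matches the question's class with a value() accessor added; sumAfterPlacementNew is a hypothetical helper written only to exercise the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <new>   // placement new

class MyClass {
public:
    MyClass(int i) : m_i(i) {}
    int value() const { return m_i; }
private:
    int m_i;
};

// Constructs 1000 objects in a local, suitably aligned buffer,
// sums their values, then destroys them by hand.
int sumAfterPlacementNew(int initialValue) {
    const std::size_t count = 1000;

    // Raw storage: correct size and alignment, but no MyClass objects yet.
    alignas(MyClass) unsigned char storage[sizeof(MyClass) * count];
    MyClass* items = reinterpret_cast<MyClass*>(storage);

    // Construct each element in place with the same argument.
    for (std::size_t i = 0; i < count; ++i) {
        new (items + i) MyClass(initialValue);
    }

    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += items[i].value();
    }

    // Placement new has no matching delete: call destructors explicitly,
    // in reverse order of construction.
    for (std::size_t i = count; i > 0; --i) {
        items[i - 1].~MyClass();
    }
    return sum;
}

// Usage sketch; call from main().
void demoPlacementNew() {
    assert(sumAfterPlacementNew(5) == 5000);
}
```

The bookkeeping here (tracking which slots are constructed, destroying them on every exit path) is exactly what std::vector does for you.
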
If I want all instances of the class to initialize with the value "iInitializeValue", is there some automatic way this can be done instead of copy-pasting 1000 times xD?

Instead of an argument to the ctor set a static member and initialize the value from that?

Why are you not wanting to use a vector?
void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.
Why are you not wanting to use a vector?

If possible, I'd like the memory required by the object to be local to the object rather than off in some random place. Plus, I'm under the impression the vector will copy the object into each instance (as opposed to calling each instance's constructor with the value individually), so if the object is not meant to be copied, that could be a problem down the track, I'm thinking.

I guess I'm just seeing if there are alternatives to the vector, though it appears there aren't any without two-phase construction.

No offence or unfair stereotyping intended, but the only code-bases I've used that were based around two-phase construction, were projects where the majority of staff weren't very good at C++, but were decent at C, so they were more comfortable with the old way of doing things.
Can control initialization and cleanup of object outside allocating space for the object. So if I wanted an array of objects that initialize with certain variables it doesn't get annoying if objects don't have default constructors.
C++ has a tool for this -- placement new (see below), which is the standard way to separate allocation from construction, and is what every container class will use internally to do so.
Can have a return result for success or failure. Little simpler than throwing an exception & catching it outside object construction.
In my experience, it's very rare to have classes that can fail construction. However, just because a constructor has no return value doesn't mean you can't return a failure code. Use an out-argument (a non-const pointer or reference argument) and write an error code to it.
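
A rough sketch of the out-argument idea, using only the C standard I/O functions. ErrorCode, LogFile and tryOpen are hypothetical names chosen for this example, and the path in the demo is assumed not to exist.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical error codes for this illustration.
enum ErrorCode { kOk, kOpenFailed };

class LogFile {
public:
    // The constructor reports success or failure through `error`.
    LogFile(const char* path, ErrorCode& error)
        : file_(std::fopen(path, "r")) {
        error = file_ ? kOk : kOpenFailed;
    }
    ~LogFile() {
        if (file_) std::fclose(file_);
    }
    LogFile(const LogFile&) = delete;
    LogFile& operator=(const LogFile&) = delete;

    bool isOpen() const { return file_ != nullptr; }

private:
    std::FILE* file_;
};

// Returns the error code produced while constructing a LogFile.
ErrorCode tryOpen(const char* path) {
    ErrorCode error = kOk;
    LogFile log(path, error);
    return error;
}

// Usage sketch; call from main(). The path is assumed to be missing.
void demoOutArgument() {
    assert(tryOpen("/no/such/dir/missing-file.txt") == kOpenFailed);
}
```

The object still exists after a failed construction, so it has to tolerate that state (here the destructor checks file_), which is the same trade-off as the internal status flag mentioned earlier in the thread.
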
Can call virtual functions within initialization.
Again, in my experience it's very rare to require virtual constructors, but if you've got a genuine use-case that requires them, and doesn't fit in the typical C++ solutions, then two-phase might have a purpose there. Often these classes will use a factory function, so from the user's point of view there's still only one line of code to fully construct the object.
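
One way such a factory function might look, as a sketch only. Shape, Circle, create and kind are hypothetical names; the point is that the virtual call happens after the constructor has finished, and the caller still writes a single line.

```cpp
#include <cassert>
#include <memory>
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    const std::string& label() const { return label_; }

    // Factory: constructs the object, then runs the second phase.
    // Callers never see a half-initialized object.
    template <class Derived>
    static std::unique_ptr<Derived> create() {
        std::unique_ptr<Derived> shape(new Derived());
        Shape& base = *shape;
        base.init();   // safe: the object is fully constructed here
        return shape;
    }

protected:
    Shape() {}

private:
    // Calling kind() from Shape's constructor would not dispatch to the
    // derived class; from init() it does.
    void init() { label_ = kind(); }
    virtual std::string kind() const = 0;

    std::string label_;
};

class Circle : public Shape {
public:
    // Public for brevity; a stricter version would hide this constructor
    // and befriend the factory.
    Circle() {}
private:
    std::string kind() const override { return "circle"; }
};

// Usage sketch; call from main().
void demoFactory() {
    std::unique_ptr<Circle> circle = Shape::create<Circle>();
    assert(circle->label() == "circle");
}
```
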
I'm a little confused about how could go about doing this...
Do you really have a use-case where you need an array with a very large, hard-coded size, where they're all constructed up-front?

The standard solution would be as follows, but yes, this won't work on non-copyable objects. However, on copyable objects, the theoretical initialize-temp/copy-construct/destruct-temp operation can be optimized down to just a regular construct operation per element. Also, C++11 adds the move-constructor, which allows you to have non-copyable classes that are still movable, so they can be moved into vectors instead of being copied into them.
std::vector<MyClass> vec(1000, iInitializeValue);
Or to construct them one at a time, you'd use something like below (and again, although they're copied, if the copy operation has no side effects it will be optimized out):
std::vector<MyClass> vec;
vec.reserve(1000);
//...
vec.push_back(1337);
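
For the non-copyable case, a C++11 sketch along these lines avoids copies entirely: emplace_back forwards the constructor arguments and builds the element inside the vector's storage. Texture and loadTextures are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical move-only class: copying is disabled, moving is allowed.
class Texture {
public:
    explicit Texture(int id) : id_(id) {}
    Texture(const Texture&) = delete;
    Texture& operator=(const Texture&) = delete;
    Texture(Texture&& other) noexcept : id_(other.id_) { other.id_ = -1; }
    Texture& operator=(Texture&& other) noexcept {
        if (this != &other) {
            id_ = other.id_;
            other.id_ = -1;   // mark the moved-from object as empty
        }
        return *this;
    }
    int id() const { return id_; }
private:
    int id_;
};

std::vector<Texture> loadTextures(std::size_t count) {
    std::vector<Texture> textures;
    textures.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        // Constructs Texture(i) directly in the vector; nothing is copied.
        textures.emplace_back(static_cast<int>(i));
    }
    return textures;
}

// Usage sketch; call from main().
void demoMoveOnly() {
    std::vector<Texture> textures = loadTextures(3);
    assert(textures.size() == 3);
    assert(textures[1].id() == 1);
}
```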

However, a large, hard-coded array sounds more like it's going to be used as a pool, which means the elements shouldn't be constructed until they're needed. If that's the case, then your example is going to give you the wrong kinds of suggested answers.

Using placement-new, mentioned above, looks like this and separates allocation from construction:
MyClass* data = (MyClass*)malloc(sizeof(MyClass)*1000);//allocate array
MyClass* instanceAtIndex42 = new(data+42) MyClass(1337);//construct element at index 42
instanceAtIndex42->~MyClass();//destruct element at index 42
free(data);//deallocate array
However, instead of using placement-new directly, it's usually used inside container classes (like std::vector, or your own reinvention).

In my engine (1,2,3), I use scope-stack allocation, and I'd write your large array case like:
class MyThingThatOwnsAHugeArray
{
public:
	MyThingThatOwnsAHugeArray(Scope& a, int size) : data(eiNewArray(a, MyClass, size)) {}
private:
	MyClass* data;
};
//...
Scope a( stack );
MyThingThatOwnsAHugeArray* bigThing = eiNew(a, MyThingThatOwnsAHugeArray)(a, 1000);
Although the array is dynamically allocated, it will exist sequentially in memory right after the parent allocation, so it's just as 'local' as if it were a hard-coded array like in your example; not in 'some random place'.

Alternatively, I'd use some kind of pool, like:
Pool<MyClass> myPool(a, 1000);//allocate space for 1000 items

//if the pool is designed to call constructors:
MyClass* item = myPool.Alloc(MyClass(1337));//construct one via copy constructor
myPool.Release(item);//destruct one

//or a POD pool:
MyClass* item = new(myPool.Alloc())MyClass(1337);//allocate one and then call constructor
item->~MyClass();//destruct
myPool.Release(item);//deallocate
No offence or unfair stereotyping intended, but the only code-bases I've used that were based around two-phase construction, were projects where the majority of staff weren't very good at C++, but were decent at C.

The largest project I've ever worked on took 1 min to compile, lol. I believe that's why it's hard for me to grasp the reasons for avoiding two-phase construction. I don't have much real C++ experience, though I realised my design was wrong because of the research I did into it. I came here to the beginners' area for convincing and your post was most helpful, ty.

I have used placement new before, though I didn't make the connection that the allocation/construction happened in an identical way to my two-phase approach. I was using a pool, allocating space for the object and calling initialise when a new object was requested.

I'll avoid using two-phase construction indiscriminately in my design.
