C/C++, Constructing and assigning

Started by Kest
17 comments, last by Deyja 14 years, 11 months ago
First, I want to make it clear that I'm not optimizing performance. I'm just seeking knowledge. Here's a structure:

struct Thing
{
	int a,b,c,d,e,f,g,h;

	Thing(int fa,int fb,int fc,int fd,int fe,int ff,int fg,int fh)
		: a(fa),b(fb),c(fc),d(fd),e(fe),f(ff),g(fg),h(fh) {}

	Thing& operator += (const Thing &t)
	{
		a+=t.a;
		b+=t.b;
		c+=t.c;
		d+=t.d;
		e+=t.e;
		f+=t.f;
		g+=t.g;
		h+=t.h;
		return *this;
	}
	Thing operator + (const Thing &t) const
	{
		return Thing(a+t.a,b+t.b,c+t.c,d+t.d,e+t.e,f+t.f,g+t.g,h+t.h);
	}
	void AsSum(const Thing &ta,const Thing &tb)
	{
		a=ta.a+tb.a;
		b=ta.b+tb.b;
		c=ta.c+tb.c;
		d=ta.d+tb.d;
		e=ta.e+tb.e;
		f=ta.f+tb.f;
		g=ta.g+tb.g;
		h=ta.h+tb.h;
	}
};
With the above structure, which of these would be more efficient?

Thing first, second; //(assume these already contain interesting data)

1: // This would create a temporary storage structure, but is the nicest to use
Thing third = first + second;

2: // This doesn't create a temporary storage structure
Thing third = first;
third += second;

3: // The only drawback here is that it doesn't assign on construction
Thing third;
third.AsSum( first, second );
Would a compiler normally optimize code like 1 so that no temporary is created, even if the operator code is not inlined? Since a class/struct can define arbitrary behavior for + and +=, it's hard to imagine how it would manage that. Do any of you find yourselves using code like 2 or 3 in any situation at all? For example, if the struct is huge, or needs to run complex constructor routines? Thanks for any information.
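For reference, here is a rough sketch of how one might watch for copies by instrumenting a cut-down version of the struct; the CountedThing name and the static counter are purely for illustration, not part of the real code:

// A cut-down, instrumented version of the struct (illustration only):
// the static 'copies' counter records how many copy constructions happen,
// so each of the three forms can be compared directly.
#include <iostream>

struct CountedThing
{
	int a,b;
	static int copies;

	CountedThing(int fa = 0, int fb = 0) : a(fa), b(fb) {}
	CountedThing(const CountedThing &o) : a(o.a), b(o.b) { ++copies; }

	CountedThing& operator += (const CountedThing &t)
	{
		a+=t.a; b+=t.b;
		return *this;
	}
	CountedThing operator + (const CountedThing &t) const
	{
		return CountedThing(a+t.a, b+t.b);
	}
};
int CountedThing::copies = 0;

int main()
{
	CountedThing first(1,2), second(3,4);

	CountedThing third = first + second;     // form 1
	std::cout << "copies after form 1: " << CountedThing::copies << "\n";

	CountedThing fourth = first;             // form 2
	fourth += second;
	std::cout << "copies after form 2: " << CountedThing::copies << "\n";
}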
Quote:Original post by Kest
First, I want to make it clear that I'm not optimizing performance. I'm just seeking knowledge.


You're seeking optimizing performance knowledge.

Rule #1 is profile. You want to know which would be faster? It depends on your compiler, settings, environment, actions performed, everything. Profile.

The answer will be that your bottleneck is elsewhere. Adding shit together is fast (tm). There are far more important uses of your time than micro-optimizing additions.
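If you do want numbers, a minimal timing loop is enough for something this small. A rough sketch, assuming the Thing struct from the first post is in scope (the iteration count and use of <chrono> are illustrative only):

#include <chrono>
#include <iostream>

int main()
{
	Thing first(1,2,3,4,5,6,7,8), second(8,7,6,5,4,3,2,1);
	long long checksum = 0;

	// Time form 1 in a tight loop; accumulate and print a checksum so the
	// optimizer can't discard the work as dead code.
	auto start = std::chrono::steady_clock::now();
	for (int i = 0; i < 10000000; ++i)
	{
		Thing third = first + second;
		checksum += third.a;
	}
	auto elapsed = std::chrono::steady_clock::now() - start;

	std::cout << checksum << " "
	          << std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count()
	          << " ms\n";
}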

Quote: Thing operator + (const Thing &t) const
{
return Thing(a+t.a,b+t.b,c+t.c,d+t.d,e+t.e,f+t.f,g+t.g,h+t.h);
}


This way of implementing operator+ is not recommended. It allows for this to compile:

Thing a = thing + somethingconvertabletothing;

But not:

Thing a = somethingconvertabletothing + thing;

It is recommended instead that you use a non-member function. It is also recommended to implement this in terms of operator+= for reduction of code duplication:

// right after the ending }; in class Thing { ... };
Thing operator+( const Thing& lhs, const Thing& rhs )
{
    Thing copy(lhs);
    copy += rhs;
    return copy;
}


I've chosen a version which I've found takes advantage of return value optimization in reducing extraneous code on modern versions of MSVC -- it should hopefully perform on par with using operator+= itself, thanks to your optimizing compiler. But profile.
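To see why the non-member form matters for conversions, here is a small sketch; the Thing2 type and its converting int constructor are made up purely for illustration:

struct Thing2
{
	int a;
	Thing2(int v) : a(v) {}                  // implicit conversion from int
	Thing2& operator += (const Thing2 &t) { a += t.a; return *this; }
};

// Non-member operator+, implemented in terms of operator+=
Thing2 operator + (const Thing2 &lhs, const Thing2 &rhs)
{
	Thing2 copy(lhs);
	copy += rhs;
	return copy;
}

int main()
{
	Thing2 t(1);
	Thing2 x = t + 5;   // fine either way: 5 converts to Thing2
	Thing2 y = 5 + t;   // only compiles with the non-member operator+,
	                    // because a member operator+ never converts the left operand
	return x.a + y.a;
}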
You don't have to give your parameters different names in the constructor to use initialisation lists. The following is perfectly valid:
	Thing(int a,int b,int c,int d,int e,int f,int g,int h)
		: a(a),b(b),c(c),d(d),e(e),f(f),g(g),h(h) {}


Also your assumption about number 1 isn't necessarily true. RVO and NRVO can typically optimise out the temporary in release builds of modern compilers.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Quote:Original post by MaulingMonkey
Quote:Original post by Kest
First, I want to make it clear that I'm not optimizing performance. I'm just seeking knowledge.


You're seeking optimizing performance knowledge.

No, I just like to know which direction to lean when either direction takes the same effort.

Quote:Rule #1 is profile. You want to know which would be faster? It depends on your compiler, settings, environment, actions performed, everything. Profile.

It's not just about which is faster. If A is friendlier to use, but B is a little faster, I'll probably use A. But I'd rather find C, which is just as friendly as A, and just as fast as B.

To use my imagination, it could be something like an advanced constructor that inlines a member function to generate its construction values. The syntax would be something like Third = Thing<AsSum>(First,Second); I doubt something like that exists, but as often as this kind of thing is used, it doesn't hurt to take a minute to ask.
Why does the title mention the imaginary language "C/C++", when this is clearly C++?
Quote:1: // This would create a temporary storage structure, but is the nicest to use
Thing third = first + second;

I can't see any temporary here, assuming NRVO.
Quote:Original post by Kest
Quote:Original post by MaulingMonkey
Quote:Original post by Kest
First, I want to make it clear that I'm not optimizing performance. I'm just seeking knowledge.


You're seeking optimizing performance knowledge.

No, I just like to know which direction to lean when either direction takes the same effort.

Flip a coin then. They're never going to be exactly the same in terms of effort and readability, so the answer doesn't matter [lol].

Quote:But I'd rather find C, which is just as friendly as A, and just as fast as B.

To use my imagination, it could be something like an advanced constructor that inlines a member function to generate its construct values.

It's called (N)RVO and is actually the same as case A -- when it can be applied.
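For what it's worth, something close to that imagined syntax can be had with an ordinary free function; when (N)RVO applies, the result is constructed straight into the destination. The MakeSum name below is just an example, not an existing facility:

// Hypothetical helper: builds the sum in its return value, which (N)RVO
// will typically construct directly into the caller's variable.
Thing MakeSum(const Thing &ta, const Thing &tb)
{
	return Thing(ta.a+tb.a, ta.b+tb.b, ta.c+tb.c, ta.d+tb.d,
	             ta.e+tb.e, ta.f+tb.f, ta.g+tb.g, ta.h+tb.h);
}

// Usage:
// Thing third = MakeSum(first, second);   // no extra temporary when RVO applies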
Any and all optimizations are solely at the discretion of the compiler. The standard is merely flexible enough to allow some rather drastic optimizations, but it doesn't require any of them to be performed.

For example, GCC will never inline a function pointer even if it is declared as a template parameter (hence fully determined at compile time), whereas MSVC often will, sometimes so aggressively that the entire function call, if it has no side effects, is omitted.

MSVC is also incredibly aggressive with construction. If types and values are known at compile time, it will sometimes pre-calculate the result completely outside of the function, in a static context. The degree to which it can detect certain patterns (loop summation, multiplication and similar) is surprising.

MSVC will also aggressively optimize many well-known code paths. Example:
template < class Message >
void dispatch(Message m) {
  foo(m.param2, m.param4);
}

// where foo is a function pointer to
void foo(int a, int b) {
  std::cout << (a + b);
}

// and called as
Message m;
m.param1 = 100;
m.param2 = x;
m.param3 = "Hello World";
m.param4 = 20;
m.param5 = "Hi";
m.param6 = 30;
dispatch(m);
is very likely to result in the following being generated in assembly:
std::cout << (x+20);


Using initialization lists improves the chances of such optimizations. GCC, however, would refuse to inline the function pointer to foo, and generate something like:
foo(x, 20);
at best.
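For reference, this is roughly the pattern under discussion, with the callee fixed as a non-type template parameter so it is known at compile time; the names here are illustrative only:

#include <iostream>

void foo(int a, int b)
{
	std::cout << (a + b);
}

// The function pointer is baked in as a non-type template parameter, so the
// callee is fully determined at compile time and an aggressive optimizer may
// inline it completely.
template < void (*F)(int, int) >
void dispatch2(int a, int b)
{
	F(a, b);
}

int main()
{
	int x = 7;
	dispatch2<&foo>(x, 20);   // a sufficiently aggressive compiler can reduce
	                          // this to the equivalent of std::cout << (x + 20);
}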
That's a GCC bug; they're not doing that on purpose...
It might even have been fixed already. What version did you use?

Quote:Message m;
m.param1 = 100;
m.param2 = x;
m.param3 = "Hello World";
m.param4 = 20;
m.param5 = "Hi";
m.param6 = 30;
dispatch(m);

is very likely to result in the following being generated in assembly:

std::cout << (x+20);

Well yeah, a good compiler should be able to do constant propagation, function inlining and dead code elimination...

The harder parts are aliasing analyses.
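A tiny illustration of why aliasing is the hard part (the scale function below is just an example, not from the thread):

// If the compiler cannot prove that 'factor' never points into 'data',
// it has to reload *factor on every iteration instead of hoisting the
// load out of the loop; constant propagation alone doesn't help here.
void scale(double *data, const double *factor, int n)
{
	for (int i = 0; i < n; ++i)
		data[i] *= *factor;
}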
Quote:Original post by MaulingMonkey
Quote:Original post by Kest
Quote:Original post by MaulingMonkey
Quote:Original post by Kest
First, I want to make it clear that I'm not optimizing performance. I'm just seeking knowledge.


You're seeking optimizing performance knowledge.

No, I just like to know which direction to lean when either direction takes the same effort.

Flip a coin then. They're never going to be exactly the same in terms of effort and readability, so the answer doesn't matter [lol].

You're saying one or more of my examples were difficult in some way to type or understand? It took all of a few seconds to post, and I would think even a non-programmer could understand the example code for all three cases. So where's the leaning problem?

Quote:It's called (N)RVO and is actually the same as case A -- when it can be applied.

That's actually useful information. But several helpful posters beat you to it while you were micro-optimizing my usage of time.
