Jump to content

  • Log In with Google      Sign In   
  • Create Account


C/C++ Compiler Optimisation, multiplying by zero.


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
7 replies to this topic

#1 JamesCobras   Members   -  Reputation: 112

Like
0Likes
Like

Posted 24 January 2012 - 07:03 AM

Hi guys, just out of interest, I'm implementing an algoithm in which I'm using a mask on pixels (I'm doing a canny edge filter).

The mask has a column where the contents are multiplied by 0. I could just skip the column, by deleting the code for this column, but I want it to be clear.

If I times by a constant of zero will the compiler ignore that part of the code? Or will it perform the calculation?

Cheers Guys

James



Sponsor:

#2 Hodgman   Moderators   -  Reputation: 29493

Like
1Likes
Like

Posted 24 January 2012 - 07:17 AM

float or int?

A float multiplied by zero doesn't always result in zero (e.g. +infinity * 0.0f == NaN) so it might not always be valid for the compiler to make that optimisation.

#3 clb   Members   -  Reputation: 1780

Like
0Likes
Like

Posted 24 January 2012 - 07:28 AM

Also, if it's a very tight inner loop, where you add a zero check like an 'if (fabs(multiplier) < 1e-5f)' or 'if (multiplier == 0)', it can be much slower than just doing the multiplication, since branches aren't actually cheap either.

As always, profiling will tell.
Me+PC=clb.demon.fi | C++ Math and Geometry library: MathGeoLib, test it live! | C++ Game Networking: kNet | 2D Bin Packing: RectangleBinPack | Use gcc/clang/emcc from VS: vs-tool | Resume+Portfolio | gfxapi, test it live!

#4 JamesCobras   Members   -  Reputation: 112

Like
0Likes
Like

Posted 24 January 2012 - 10:03 AM

Yeah I was thinking about an integer... So will it optimize for an integer? I can understand the float not.

James

#5 Josh Petrie   Moderators   -  Reputation: 3111

Like
0Likes
Like

Posted 24 January 2012 - 11:02 AM

Look at the generated assembly and see.

This isn't an FB topic, moving.

Josh Petrie | Core Tools Engineer, 343i | Microsoft C++ MVP


#6 Antheus   Members   -  Reputation: 2397

Like
0Likes
Like

Posted 24 January 2012 - 12:02 PM

Profile these two cases:
N = something big, megabytes.
int a[N];

for (int i = 0; i < N; i++) {
  a[i] = 0;
}

// vs.

for (int i = 0; i < N; i += 16) { // note the step
  a[i] = 0;
}
It should be obvious that second loop does 16 times less work, so it should run 16 times faster.

Here's results of a quick benchmark:
(4299788 - 5144680) |#################################################################
(5144680 - 5989572) |############
(5989572 - 6834465) |##
(6834465 - 7679357) |##
(7679357 - 8524249) |##################
(8524249 - 9369142) |
(9369142 - 10214034) |
(10214034 - 11058927) |
5448480.2 [4299788.0 - 11058927.0]
0
(2871890 - 3112349) |###############################
(3112349 - 3352809) |#####################################
(3352809 - 3593268) |#####################
(3593268 - 3833728) |####
(3833728 - 4074188) |##
(4074188 - 4314647) |###
(4314647 - 4555107) |
(4555107 - 4795567) |#
3302247.6 [2871890.0 - 4795567.0]

First loop takes about : 5.5 million cycles
Second loop takes about 3.3 million cycles.

Or, despite writing 16 times less data, second loop is not even twice as fast.

Why?

Cache lines. Whether touching one or 64 bytes, simple operations complete faster than data can be fetched from memory.

#7 Álvaro   Crossbones+   -  Reputation: 12921

Like
0Likes
Like

Posted 24 January 2012 - 02:01 PM

The mask has a column where the contents are multiplied by 0. I could just skip the column, by deleting the code for this column, but I want it to be clear.

How about writing a comment instead?

#8 AllEightUp   Moderators   -  Reputation: 4186

Like
0Likes
Like

Posted 24 January 2012 - 08:46 PM

I've never done a Canny filter myself so did a quick wikipedia to get a general idea. If I read it correctly and we are talking about a 5x5 matrix convolution then it is highly unlikely that a compiler will notice the zero's in a certain row/column given the non-algorithmic nature of the source data. I.e. reading from an array doesn't tell the compiler to look for a "pattern of zero's" which could be optimized out.

So, if I'm thinking in the correct direction for what you are actually doing, yeah, make sure you don't call that code. There are so many ways you could code this though, it's really up to the specifics and what you think should be optimized.

Of course, going to SIMD could be very useful, if it is 5x5 and one column or row is all zeros, perfect for a SIMD optimization.




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS