How smart is the compiler?
I certainly appreciate your point about long code. Perhaps, though, you could do it with some function pointers? Say you have three bits of code in your while loop, each of which depends on some logic; move each of those snippets to a function, then do the logic outside your while loop and assign the results to function pointers which get called inside your loop.
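A minimal sketch of that idea, assuming two made-up per-element steps (`stepDouble`, `stepNegate`) and a hypothetical `processAll` wrapper; none of these names come from the original code:

```cpp
#include <vector>

// Hypothetical per-element steps; names and bodies are illustrative only.
int stepDouble(int v) { return v * 2; }
int stepNegate(int v) { return -v; }

using Step = int (*)(int);

// Do the logic once, outside the loop, and store the result as a function
// pointer. The loop then makes an indirect call instead of re-testing the flag.
long long processAll(const std::vector<int>& data, bool useDouble) {
    Step step = useDouble ? stepDouble : stepNegate;
    long long sum = 0;
    for (int v : data) {
        sum += step(v);
    }
    return sum;
}
```

Bear in mind that an indirect call has a cost of its own and usually stops the compiler from inlining the step, so this is worth profiling against the plain `if` before committing to it.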
Quote:
It looks like the compiler isn't as smart as you'd like it to be. At least mine isn't:
Not surprising.
The presumed optimization cannot be made if the code AshleysBrain supplied is taken literally, as you have done; the value of x is not necessarily known at compile-time. If "GetSomeBool()" is replaced with something that is actually known at compile-time, then I would imagine the compiler would make the optimization.
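One way to make the value "actually known at compile-time" is to pass it as a template parameter. This is only a sketch: `sumWithFlag` and its loop body are invented for illustration, and `kFlag` stands in for x.

```cpp
#include <vector>

// Hypothetical loop; kFlag plays the role of x, but as a template parameter
// it is a compile-time constant in each instantiation.
template <bool kFlag>
long long sumWithFlag(const std::vector<int>& data) {
    long long sum = 0;
    for (int v : data) {
        if (kFlag) {      // constant condition: an optimizer can drop the dead arm
            sum += v * 2;
        } else {
            sum -= v;
        }
    }
    return sum;
}
```

`sumWithFlag<true>` and `sumWithFlag<false>` are separate functions, so an optimizing build would be expected to emit each loop without the test. Whether a given compiler actually does so is something to confirm in the generated assembly.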
Again, the compiler is compiling for a processor that has branch prediction. It knows perfectly well that a comparison whose result never changes inside the loop will be predicted correctly and is essentially free, so there is no need to make the "optimization".
Quote:Original post by jpetrie
Quote:
It looks like the compiler isn't as smart as you'd like it to be. At least mine isn't:
Not surprising.
The presumed optimization cannot be made if the code AshleysBrain supplied is taken literally, as you have done; the value of x is not necessarily known at compile-time. If "GetSomeBool()" is replaced with something that is actually known at compile-time, then I would imagine the compiler would make the optimization.
Compiling it out is not possible, but hoisting the comparison out of the loop and splitting the loop is (i.e. do the comparison first, and then choose between two versions of the loop using the result).
However, like Promit said, this is a step backwards on modern processors with good branch prediction: the comparison is essentially free in the "normal" version, and therefore the only thing the "optimized" version can accomplish is to bloat the working set, possibly spilling out of the instruction cache and making performance *worse*.
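For reference, the transformation described above (usually called loop unswitching) looks roughly like this when written by hand. The function names and loop bodies are made up for the example; only the shape matters.

```cpp
#include <vector>

// "Normal" version: the loop-invariant flag x is tested on every iteration.
long long sumBranchInside(const std::vector<int>& data, bool x) {
    long long sum = 0;
    for (int v : data) {
        if (x) sum += v * 2;
        else   sum -= v;
    }
    return sum;
}

// Hoisted version: test x once, then run one of two copies of the loop.
long long sumBranchHoisted(const std::vector<int>& data, bool x) {
    long long sum = 0;
    if (x) {
        for (int v : data) sum += v * 2;
    } else {
        for (int v : data) sum -= v;
    }
    return sum;
}
```

Both functions return the same result. The second one carries two copies of the loop body, which is exactly the code growth being warned about here.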
This function is actually a member function of an object that is instanced many times, most of them having differing member data, with GetSomeBool() returning different values between the objects. This member function is then called on all instances of the object.
Assume for example that for each instance GetSomeBool() returns randomly either 0 or 1, and assume there is only one instance of the function assembly. Won't the processor mispredict the branches? Isn't it therefore in this case more efficient to do the original proposed optimisation?
Of course this is a matter for the compiler - I will be moving the ifs outside the while at the end of my project, and I'll profile it to check the results.
Quote:Original post by AshleysBrain
This function is actually a member function of an object that is instanced many times, most of them having differing member data, with GetSomeBool() returning different values between the objects. This member function is then called on all instances of the object.
Assume for example that for each instance GetSomeBool() returns randomly either 0 or 1, and assume there is only one instance of the function assembly. Won't the processor horribly mispredict the branches? Isn't it therefore in this case more efficient to do the original proposed optimisation?
Of course this is a matter for the compiler - I will be moving the ifs outside the while at the end of my project, and I'll profile it to check the results.
It might mispredict it initially, but unless your loop is being executed by several objects simultaneously, it should predict correctly after that. The exception is when the inner portions are so lengthy that the branch predictor forgets about your test. In that case, your optimization wouldn't help anyhow, because the test is such a small portion of the entire loop.
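To put rough numbers on "mispredict it initially", here is a toy simulation. It assumes a single 2-bit saturating counter, which is a textbook model and far simpler than any real predictor; `TwoBitPredictor` and `countMispredictions` are names invented for this sketch.

```cpp
#include <vector>

// Toy 2-bit saturating counter: states 0-1 predict "not taken",
// states 2-3 predict "taken". Real hardware is far more elaborate.
struct TwoBitPredictor {
    int state = 1;  // start at "weakly not taken" (arbitrary assumption)

    // Returns true if the prediction matched the outcome, then updates state.
    bool predictAndUpdate(bool taken) {
        bool predictedTaken = state >= 2;
        if (taken) { if (state < 3) ++state; }
        else       { if (state > 0) --state; }
        return predictedTaken == taken;
    }
};

// Each entry of flags is one object's GetSomeBool() result. Every object runs
// the same branch for iterationsPerObject iterations, one object at a time.
int countMispredictions(const std::vector<bool>& flags, int iterationsPerObject) {
    TwoBitPredictor predictor;
    int misses = 0;
    for (bool flag : flags) {
        for (int i = 0; i < iterationsPerObject; ++i) {
            if (!predictor.predictAndUpdate(flag)) ++misses;
        }
    }
    return misses;
}
```

In this model, four objects with alternating flags and 100 iterations each produce 7 mispredictions out of 400 branches: the cost is paid only when the flag changes between objects. It says nothing about actual timings, so profiling is still the only real answer.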
CM