Nice to see a simple but interesting question get discussed in detail tbh.
loop break
I'm going to be honest with all of you. This seems like a pointless and impossible argument. I knew going into it that playing devil's advocate wouldn't make me very popular here.
However, if you're on a platform where you actually care about code size in terms of free bytes, then unrolling loops and using extra registers for simple bookkeeping are counter-productive.
Things I am arguing:
- If you're counting bytes, small loops are important.
- If you're on a low-powered platform, you may not even have a very good optimizing compiler.
- If you're on a slow, small-cache/cacheless CPU, that extra register matters.
Things I am NOT arguing:
- There isn't better code for modern machines and processors.
- If you [modify the scenario in some way that suits some particular argument], it won't optimize to nothing or some large, but fast, control structure.
- This is the preferred coding method.
- Counting down is easier to read.
- Optimizers are bad at optimizing.
I feel this is turning into a flood of "Look, when I set the limit to zero, the counting-up version converts to an AVX loop that plays 'Moonlight Sonata' out of my PC speaker. Take THAT!" while everyone high-fives and up-votes. To be honest, this all misses the point. Most of the arguments here throw out the reasons I gave for offering a use case and attack only the flimsiest part of the argument, offering solutions that generate astronomically large code for a simple loop, use two or more registers just for loop counting (which is a waste if you only have four registers), and even do things as bizarre as turning it into a function call.
I suggested that where it matters, it can make a difference. Showing that it doesn't matter on a high-speed processor with a giant three-tiered cache and 16 64-bit general-purpose registers is great. However, treating those results as relevant to the point I originally made is misleading.
EDIT: As evidenced by the downvote.
I have to say that if you'd qualified your original posts with these points, things might have gone differently; the ensuing discussion might even have been interesting rather than tedious!
From my point of view, your earlier posts in this thread came across as though you were claiming that "for (i = something; i--;)" was faster than "for (i = something; i > something_else; i--)", and as though you were providing a disingenuous, contrived example that wasn't an apples-to-apples comparison in order to prove a point. I'm not saying that's what you were doing; I'm saying that's how you came across (I need to stress this because you've been guilty of doing what you claim others have done to you too, i.e. misrepresenting their positions and arguments).
"Where it matters it can make a difference" is something that's true of anything: one could hypothesise a processor that's several orders of magnitude slower at subtraction than it is at addition and in that use case - even for asm instructions and register usage - you'd probably still want to count up. I think that hypothetical (and, I admit, contrived) example defeats your final statement (with specific applicability to counting down vs counting up, not in general terms), but I've no arguments with the thinking behind it, which seems to me to essentially boil down to: "choose your optimization strategies according to what's good and what actually gives results on your target platform".
But surely we all already knew that?
Up:
; 10 : for(unsigned int i = 0; i != limit; ++i){
mov ecx, DWORD PTR _limit$[ebp]
add esp, 12 ; 0000000cH
xor eax, eax
test ecx, ecx
je SHORT $LN1@up
$LL3@up:
; 11 : buf[i] = i;
mov BYTE PTR _buf$[ebp+eax], al
inc eax
cmp eax, ecx
jne SHORT $LL3@up
$LN1@up:
; 12 : }
Down:
; 23 : for(unsigned int i = limit; i-- ; ){
mov eax, DWORD PTR _limit$[ebp]
add esp, 12 ; 0000000cH
test eax, eax
je SHORT $LN6@down
$LL2@down:
dec eax
; 24 : buf[i] = i;
mov BYTE PTR _buf$[ebp+eax], al
jne SHORT $LL2@down
$LN6@down:
; 25 : }
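For anyone who wants to reproduce a comparison like this, here is a minimal C sketch built around the two source loops quoted in the listing comments above. The function names `fill_up`, `fill_down`, and `loops_agree`, the `BUF_SIZE` constant, and the choice to pass the buffer as a pointer are all assumptions made for illustration; the original functions around those loops are not shown, so a compiler will not necessarily emit the same instructions for this version.

```c
#include <assert.h>
#include <string.h>

enum { BUF_SIZE = 256 }; /* hypothetical buffer size, not from the listing */

/* Counting up: same loop as source line 10 in the listing. */
static void fill_up(unsigned char *buf, unsigned int limit) {
    for (unsigned int i = 0; i != limit; ++i) {
        buf[i] = (unsigned char)i; /* stores the low byte of i */
    }
}

/* Counting down: same loop as source line 23 in the listing.
   The condition tests i and then decrements it, so the body
   sees the values limit-1 down to 0. */
static void fill_down(unsigned char *buf, unsigned int limit) {
    for (unsigned int i = limit; i--; ) {
        buf[i] = (unsigned char)i;
    }
}

/* Returns 1 if both loops leave identical buffer contents for this limit. */
static int loops_agree(unsigned int limit) {
    unsigned char a[BUF_SIZE], b[BUF_SIZE];
    assert(limit <= BUF_SIZE);   /* caller must stay inside the buffers */
    memset(a, 0xFF, sizeof a);   /* sentinel so untouched bytes are visible */
    memset(b, 0xFF, sizeof b);
    fill_up(a, limit);
    fill_down(b, limit);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling `loops_agree` with a few limits (including 0) only confirms that the two loops produce the same result; any size or speed difference has to be judged from the assembly your own compiler generates for your target.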
We do appear to have saved an additional compare instruction when counting down. I'm not familiar with the Visual C++ command line, and the project I used is a bit of a dumping ground for helping people here and elsewhere, so it is possible I've some odd options enabled, but the basic optimisation ones appear to
But surely we all already knew that?
Not everyone; there's a For Beginners section of the site, and a lot of people may interpret this subforum as a "what not to do" of programming. It's entirely possible for someone to read this and start automatically condemning every use of it without bothering to ask why.
I'd prefer that they learn to write readable code if they are just learning, but there's no reason to ban parts of their toolbox if it is the right tool for the job, unless there's red tape.