loop break

Started by
47 comments, last by Ectara 9 years, 11 months ago

I guess this is rejected because it looks like a "normal" for loop, but does something slightly different. If you carefully read it, then it is of course obvious, but it could easily lead to misunderstandings when only scanning over the code quickly.

You might very well be inclined read this as (and nobody could blame you) a loop where i counts from 10 down to 1.

While one could argue "go home if you can't read C", this booby trap is just unnecessary.

Advertisement

I do use this idiom in non-client-facing code, mostly in three cases:
1. Loop limit is calculated just before the loop executes. Instead of doing this:


int limit = getLimit();
for(int i = 0; i < limit; ++i)
    ;

I do:


for(int i = getLimit(); i--;)
    ;

to only use one temporary instead of two (barring the option of calling the function on every iteration... >_>).

2. Loop limit is a member variable. Instead of having to look it up at an offset from the instance's base address, I'd prefer to increase the likelihood of the limit being optimized into being stored in a register.


int limit = this->myLimit;
for(int i = 0; i < limit; ++i)
    ;

However, I can take it one step further, and dramatically increase the likelihood of it being optimized by the compiler:


for(int i = this->myLimit; i--;)
    ;

No second temporary, and for a tight loop, register storage is more likely.

3. Smaller code size in frequently used templates. If a template is going to be frequently instantiated with many different parameters, but its functions can't be generalized to a non-templated base class, I prefer to make the functions compile to as small code as possible (especially if marked inline). Thus, this is a combination of the above, in a way: as few temporaries as possible (if limit isn't constant), constrain to registers, spare the cache sparse lookups (if limit isn't constant), and simplify conditional expressions.

However, I'd understand why this would fail a code review. Even if the loop must be iterated in reverse, there are cleaner ways of writing it.

Perhaps a fair compromise can be found?


/* for(int i = 0; i < 10; ++i) in reverse */
for(int i=10; i--;)
{
    /* do something */
}

or


/* for(int i = limit - 1; i + 1 > 0; --i) */
for(int i=limit; i--;)
{
    /* do something */
}

to essentially write an equivalent, but easier to understand loop just above it in a comment if you have a good reason for a more concise loop, but don't want to sacrifice readability.

Note that these are just examples; I don't know what you or your co-workers would find more readable, since the post-decrement loop is very readable to me.

EDIT: Perhaps even in plain text?


/* for i in [limit to 1] do something with i - 1 */
for(int i=limit; i--;)
{
    /* do something */
}

To me, it isn't about saving time or keystrokes when writing it, but saving time and memory during execution (I'd feel safer about a recursive function using one temporary instead of two). Since comments aren't compiled, they should be used to mark intent in an obvious way.

2. Loop limit is a member variable. Instead of having to look it up at an offset from the instance's base address, I'd prefer to increase the likelihood of the limit being optimized into being stored in a register.


int limit = this->myLimit;
for(int i = 0; i < limit; ++i)
    ;
However, I can take it one step further, and dramatically increase the likelihood of it being optimized by the compiler:

for(int i = this->myLimit; i--;)
    ;
No second temporary, and for a tight loop, register storage is more likely.

I always write those in 'iterator' style, with the end variable inline like this:


for(int i=0, end=this->myLimit; i != end; ++i)
    ;

As far as I'm concerned, using a member (which you know to be constant) in the condition part of the loop (e.g. for(int i=0; i != this->myLimit; ++i)) is just plain wrong wink.png as it forces the compiler to perform aliasing analysis to try and guess whether the member variable will change or not, when you could've just told it so. Also, the compiler will quite often get it wrong, e.g.


for(int i=0; i != this->myLimit; ++i)//myLimit is fetched from memory once per loop, instead of once at the start of the loop
  this->data[i] = i*2;//writes an int to memory, this->myLimit is an int, therefore it's possible that the loop body changes it


I always write those in 'iterator' style, with the end variable inline like this:

Yes, I often wind up doing that when I'm not concerned with size, or the loop needs to be iterated in ascending order.


is just plain wrong

I agree entirely.

I like it :)

In a way it's reminiscent of the original K&R style code, which stressed 'terseness' (because a byte was a terrible thing to waste).

I wouldn't let it anywhere near production code, but I'd laugh a bit before I sent it back to you (and expect you to have the 'correct' code waiting).

If I saw it in the 'while' loop form I'd consider it much more readable, though I'd still probably send it back.

Back in the day, when Java was still fairly new, IBM sent me to a training course (nice hotel, the whole bit - they let me pay the difference for a suite so I could bring my wife and daughter, made a vacation of it). One of the code problems came down to a return statement written as return (i++). I spotted the error - it needed to be return (++i), but the instructor argued with me a bit before trying it and still couldn't see the difference.

Next morning he told me he understood, having read up a bit on it. I would have explained it to the rest of the group, but none of them had even tried to figure out why one worked and the other didn't, so I figured if they were that lazy they were beyond my help [if they had tried and still didn't understand it, I'd have gladly explained it]. The scary thing is most of them had experience coding in C (or so they said...).

With that in mind, I tended to leave the 'cutesy' tricks behind me or at least comment them well enough if there was a real reason to use them.

"The multitudes see death as tragic. If this were true, so then would be birth"

- Pisha, Vampire the Maquerade: Bloodlines

To me, it isn't about saving time or keystrokes when writing it, but saving time and memory during execution (I'd feel safer about a recursive function using one temporary instead of two). Since comments aren't compiled, they should be used to mark intent in an obvious way.

...and if there's a dependency on the loop running in a certain order - e.g accessing items in a list where they must be accessed from first to last - and if you're not aware of that dependency, your desire to save a few bytes has now introduced a bug.

Well done you.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.


...and if there's a dependency on the loop running in a certain order - e.g accessing items in a list where they must be accessed from first to last - and if you're not aware of that dependency, your desire to save a few bytes has now introduced a bug.



Well done you.

I'd say that's about as likely as me writing a loop that iterates forwards when it needs to iterate backwards. The code I'm working on has no external dependency, and I've personally written it all. Moreover, it's used in small, tight loops, like initializing all of the POD elements in an array to a value, where it is guaranteed to not depend on the order.

I didn't say that I'd blindly stamp that philosophy everywhere I needed a loop.

To me, it isn't about saving time or keystrokes when writing it, but saving time and memory during execution
You do however realize that an optimizing compiler will produce exactly the same code whether you use the obfuscated version or not?

Any of the above alternatives will translate to something like:


    MOV 9, %reg
L1:
    STUFF
    MORE STUFF
    ...
    SUB 1, %reg
    JNS L1

The computer doesn't care, but it's much easier to the sleep-deprivated, overworked human on crunch time reading over your code to identify the "slower" versions correctly for what they are.

Not 10 iterations counting from 10 to 1, nor doing 9 iterations, nor anything else that one might wrongly figure or assume when looking at it with one eye from the other side of the room.


You do however realize that an optimizing compiler will produce exactly the same code whether you use the obfuscated version or not?

up.c


int main(){
	volatile char buf[10000];
	volatile int limit = 10000;
	
	for(int i = 0; i < limit; ++i){
		buf[i] = 0;
	}
}

down.c


int main(){
	volatile char buf[10000];
	volatile int limit = 10000;
	
	for(int i = limit; i--;){
		buf[i] = 0;
	}
}

up.s


	.file	"up.cpp"
	.section	.text.startup,"ax",@progbits
	.p2align 4,,15
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	subq	$9904, %rsp
	.cfi_def_cfa_offset 9912
	movl	$10000, -108(%rsp)
	movl	-108(%rsp), %eax
	testl	%eax, %eax
	jle	.L5
	xorl	%eax, %eax
	.p2align 4,,10
	.p2align 3
.L4:
	movslq	%eax, %rdx
	addl	$1, %eax
	movb	$0, -104(%rsp,%rdx)
	movl	-108(%rsp), %edx
	cmpl	%eax, %edx
	jg	.L4
.L5:
	xorl	%eax, %eax
	addq	$9904, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (Debian 4.8.1-10) 4.8.1"
	.section	.note.GNU-stack,"",@progbits

down.s


	.file	"down.cpp"
	.section	.text.startup,"ax",@progbits
	.p2align 4,,15
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	subq	$9904, %rsp
	.cfi_def_cfa_offset 9912
	movl	$10000, -108(%rsp)
	movl	-108(%rsp), %edx
	testl	%edx, %edx
	leal	-1(%rdx), %eax
	je	.L6
	.p2align 4,,10
	.p2align 3
.L8:
	movslq	%eax, %rdx
	subl	$1, %eax
	cmpl	$-1, %eax
	movb	$0, -104(%rsp,%rdx)
	jne	.L8
.L6:
	xorl	%eax, %eax
	addq	$9904, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (Debian 4.8.1-10) 4.8.1"
	.section	.note.GNU-stack,"",@progbits

They aren't identical, and the counting down version has one fewer instruction (compiled with gcc -O3 -S). Yes, cleaner code is better. However, it is all too easy to say that they compile to the same instructions.

You don't have to do this practice, for sure. However, it does generate smaller code in some cases. I'm just saying that there's a reason. You may prioritize whichever goal you prefer.

EDIT: In a lot of other cases, the compiler can optimize heavily and generate similar instructions. It's possible. However, there is not a case that I have encountered where counting down resulted in larger code.

EDIT2:

In addition to making sure that it doesn't all get optimized out, the volatile limit simulates register pressure; it is entirely possible for that counter to spill to RAM.

They aren't identical

Well of course not, they do different things :-)

The fact that the code for two loops that do different things is not identical isn't surprising, I was obviously talking of loops that do equivalent things, not different things that incidentially have the same values stored in a memory region after execution.

If, whatever you write, translates to "loop and count down from 9 to 0", the compiler will just do that, in the presumably most efficient way it can. No matter whether you write your loop in terms of for or while or do-while, no matter whether you do your decrement in between the first two or after the last semicolon, or whether you use pre- or post-decrement with an accordingly adjusted counter and maybe a (i-1) inside the loop.

With some luck, although I wouldn't rely on that, you may even get the same code if you are counting up but it's obvious from the loop body that you really mean to count down, such as when you only use something like 9-i in the body.

Such things work, for example, spectacularly if you try to simulate rotation (which the language does not support natively) with a shift-or combination. The compiler perfectly figures out that you do not really mean to rotate one way and wordsize minus that amount in the other direction and mash them together, but you really want a completely different thing, so it inserts a rotate instruction.

On the other hand, making the loop counter dependent on a volatile variable has a hundred times more impact than what flavour of loop you use or which way around it goes (or everything else). It forces the compiler to reload the value from memory for every access (the movl -108(%rsp), %edx instruction), to no avail. This is something to worry about much more than if the compiler maybe generates one additional instruction.

This topic is closed to new replies.

Advertisement