loop break

Started by
47 comments, last by Ectara 10 years ago

They aren't identical, and the counting down version has one fewer instruction

ONE instruction! Would you write your own allocator if you felt that malloc was too slow on some platforms?

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.


They aren't identical, and the counting down version has one fewer instruction

ONE instruction! Would you write your own allocator if you felt that malloc was too slow on some platforms?

That's a false analogy. Standard allocators already exist. If I'm writing a loop, it doesn't already exist; it is up to me to choose which characteristics it will have.

EDIT:

That's also ignoring the "one temporary, less register pressure" argument.

Ectara, you don't cache the limit variable into a local in the up case, so it has to be fetched every iteration, which doesn't happen in your down case.
The use of volatile is also going to do god knows what to the optimizer - you can't compare code that touches that keyword to any other code.

Remove volatile, add that local, maybe also use unsigneds so the compiler doesn't have to reason about negative numbers, make the up condition a != rather than a <, and they'll be more equivalent. Then add a loop that passes the output array's values into a printf or something to ensure they're not optimized out.


Ectara, you don't cache the limit variable into a local in the up case, so it has to be fetched every iteration, which doesn't happen in your down case.

...Isn't that the point, that counting down doesn't fetch on every iteration? It is cached into a local variable; if you look at the ASM, it mov's the value from an offset from the stack pointer. That sounds local to me. It's doing exactly what it would if there were too many values held in the loop, and the counter spilled to RAM.


The use of volatile is also going to do god knows what to the optimizer

I posted the ASM. You can see exactly what happened.


Remove volatile, add that local

In other words, allow the compiler to see that it's an unchanging number and inline the constant limit (I've tried that).


maybe also use unsigneds so the compiler doesn't have to reason about negative numbers, make the up condition a != rather than a <, and they'll be more equivalent.

...In other words, just don't use signed values, so that I can feel good about counting up all the time? I fail to see the point of actually modifying the use case to suit the algorithm, rather than the other way around.


Then add a loop that passes the output array's values into a printf or something to ensure they're not optimized out.

I don't mean to be abrasive, but did you read the assembly output? It clearly moves zero to each of the array elements, exactly as the C code says (array is volatile qualified). It's guaranteed to have not been optimized out, because it says right in the resulting code that it is doing it.

What does it do?


Perhaps this slight variation, using the "goes to" operator, will clarify:

while ( i --> 0) {
    // ...
}

I'm not sure if you are being tongue-in-cheek or not, but when I want to iterate backwards, I use (almost) the same thing:


for(int i = 10; i --> 0;)
{
     //...counts backwards: 9,8,7,6,5,4,3,2,1,0
}

Apparently that's frowned upon by some C++ programmers, because people mistake "-->" for its own operator. However, I feel it genuinely helps me remember the syntax of something I rarely use, preventing off-by-one errors. It's a useful mnemonic.
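For anyone who hasn't met it before, here is a minimal sketch of what the "goes to" loop does. The compiler reads `i --> 0` as `(i--) > 0`: it compares the old value against zero, then decrements. The helper name `count_down_arrow` and its `out`/`cap` parameters are made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: records every value the loop body sees.
   Returns how many values were written to out. */
size_t count_down_arrow(int start, int *out, size_t cap)
{
    size_t n = 0;

    /* "i --> 0" is tokenized as "(i--) > 0": test the old value, then decrement. */
    for (int i = start; i --> 0;) {
        assert(n < cap);  /* caller must provide room for `start` values */
        out[n++] = i;     /* the body sees start-1, start-2, ..., 0 */
    }
    return n;
}
```

Called with `start = 3`, the body sees 2, 1, 0, which matches the 9 down to 0 sequence in the loop above when it starts at 10.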

Any programmer who doesn't know this mysterious non-operator will look it up, and henceforth would know it. They won't start actively using an operator that they don't know what it does - and if they do, then your use of --> in your code base is the least of your concerns.

Note that these are just examples; I don't know what you or your co-workers would find more readable, since the post-decrement loop is very readable to me.

It's a bit deeper than that. The coding standards are legal standards, and the code goes off to a lab to get certified. You can break a rule, but it means getting both reviewers and a senior manager booked for a meeting (costly, and it pisses people off, although there are biscuits), then writing a document justifying the deviation from the standard (costly and boring, a massive waste of time and resources), and then the certification lab wants to check it but charges more (maybe £6k). Basically, if you pull that 'clever' shit you can consider yourself unemployed pretty quickly. They canned 12 guys on NYE with no warning and another 4 last week, all for bad business practice; they were probably adequate coders who just didn't understand the industry.

There are times when you get pulled in to make something uber efficient, like context switching on a blisteringly fast FPGA, but mostly, write it like a muppet can read it and you keep on getting paid.

I do like that loop though :)

...Isn't that the point, that counting down doesn't fetch on every iteration?

No, that's not the point. This was never about counting down versus counting up, you're picking a completely different fight here, one that nobody else has even gone near talking about, and you're treating everyone else as if they're having the same fight as you whereas they're not. This is about comparing the cutesy construct given in the OP with a more idiomatic for loop.

I.e., compare this:

for (int i = 10; i--;)

To this:

for (int i = 10; i; i--)

That's what the point is.
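As a rough sketch of that comparison: both forms run the body the same number of times, but the values the body sees are shifted by one. The function names and the `out`/`cap` parameters here are hypothetical, and both helpers assume `start` is not negative.

```c
#include <assert.h>
#include <stddef.h>

/* Decrement inside the condition: the body sees start-1 ... 0. */
size_t post_decrement_in_condition(int start, int *out, size_t cap)
{
    size_t n = 0;
    assert(start >= 0);   /* a negative start would never reach zero cleanly */
    for (int i = start; i--;) {
        assert(n < cap);
        out[n++] = i;
    }
    return n;
}

/* Decrement in the step clause: the body sees start ... 1. */
size_t decrement_in_step(int start, int *out, size_t cap)
{
    size_t n = 0;
    assert(start >= 0);
    for (int i = start; i; i--) {
        assert(n < cap);
        out[n++] = i;
    }
    return n;
}
```

So the first form is the natural fit for indexing an array from the back, while the second needs `i - 1` in the body to do the same job.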

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

This is really just an obfuscated version of the idiom:


while (i--)
{
    /* do something */
}

Where i is a preexisting variable.

Not really complete unless you show the 'i' variable getting its initial value.

Somehow it would have to get the right value previously (and having it set somewhere far above would obscure the code's obvious operation and add to the potential for an error).
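A small sketch of the idiom with the initial value shown right next to the loop, which is where it is easiest to verify. `last_index_of` is a hypothetical helper, not something from this thread. One thing the idiom handles well is an unsigned counter such as `size_t`, where a condition like `i >= 0` would always be true.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: searches backwards and returns the last index
   holding `key`, or -1 if it is not present. */
long last_index_of(const int *data, size_t count, int key)
{
    assert(data != NULL || count == 0);

    size_t i = count;      /* initial value sits directly above the loop */
    while (i--) {          /* the body sees count-1, count-2, ..., 0 */
        if (data[i] == key)
            return (long)i;
    }
    /* If the loop ran to completion, i has wrapped around to SIZE_MAX here,
       so it must not be used as an index afterwards. */
    return -1;
}
```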

--------------------------------------------
Ratings are Opinion, not Fact

...Isn't that the point, that counting down doesn't fetch on every iteration? It is cached into a local variable; if you look at the ASM, it mov's the value from an offset from the stack pointer. That sounds local to me. It's doing exactly what it would if there were too many values held in the loop, and the counter spilled to RAM.

Counting up wouldn't have fetched each iteration if you'd cached the volatile into a local. If you want to simulate register pressure, then use a lot of variables in the loop body rather than disabling the optimizer via volatile.

I posted the ASM. You can see exactly what happened.

And it's not necessarily the same thing that the optimizer would've actually done in real code. You've written an example that does exactly what you assume the compiler will do, but it can't possibly show us what the compiler would actually do in a real situation.
You can't use a forced example to demonstrate the optimizer's actual behavior! ;-)

In other words, allow the compiler to see that it's an unchanging number and inline the constant limit (I've tried that).

You can use a global, or a parameter from another function, or get the value from another translation unit, etc... That will stop the compiler from hard-coding the loop limit, in the same way that it is stopped in real code -- rather than forcing its hand with code that gives it zero leeway.

...In other words, just don't use signed values, so that I can feel good about counting up all the time? I fail to see the point of actually modifying the use case to suit the algorithm, rather than the other way around.

Unsigneds work for both up and down. If you want to compare those two loops fairly then you've got to make them as equivalent as possible -- if one uses "not zero", the other should use "not limit", etc. Neither uses negative values, so signed/unsigned is neutral, but it would be interesting to see if it affects the codegen (many code bases use uint over int by default).
IMO it's also more common to write !=limit and !=0 rather than <limit and >0...

I don't mean to be abrasive, but did you read the assembly output? It clearly moves zero to each of the array elements, exactly as the C code says (array is volatile qualified). It's guaranteed to have not been optimized out, because it says right in the resulting code that it is doing it.

If you're testing whether the compiler can transform between up/down, then the use of volatile on the output array completely voids your test -- volatile writes can't be reordered, so you've told the compiler explicitly not to modify your iteration order!
So to test this behavior of the optimizer, you need to remove volatile, and then add an alternative method of stopping the code from being optimized away entirely.
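To make that concrete, here is a rough sketch of what a fairer pair of test loops might look like. This is not Ectara's original code (which isn't reproduced here); the names `fill_up`, `fill_down` and `checksum` are invented for the example. It assumes the limit arrives as a parameter so it can't be folded into a constant unless the call is inlined, uses unsigned counters with `!=` in both directions, and drops volatile in favour of consuming the results afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Counting up: compares against "not limit". */
void fill_up(unsigned *out, unsigned limit)
{
    for (unsigned i = 0; i != limit; ++i)
        out[i] = i;
}

/* Counting down: compares against "not zero". */
void fill_down(unsigned *out, unsigned limit)
{
    for (unsigned i = limit; i != 0; --i)
        out[i - 1] = i - 1;
}

/* Consumes the array so the stores have an observable effect
   without marking anything volatile. */
unsigned checksum(const unsigned *data, unsigned count)
{
    assert(data != NULL || count == 0);

    unsigned sum = 0;
    for (unsigned i = 0; i != count; ++i)
        sum += data[i];
    return sum;
}
```

Printing the checksum, and ideally keeping the callers in a separate translation unit, gives the optimizer real leeway; the generated assembly for `fill_up` and `fill_down` can then be compared directly.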

There is also that thing I've seen more than a little, where the constant is put first in the test clause:

for(i=0; 10 > i; i++)
{
}

or more commonly

if (10 > varx) { }

or

if (Mode1 == varx) { }

which usually makes me double check rather more than the classic i<10 style

--------------------------------------------
Ratings are Opinion, not Fact

This topic is closed to new replies.
