GOTO, why are you adverse to using it

Started by
74 comments, last by SmkViper 9 years, 7 months ago

Let's not forget that goto fail; happened.

Braceless conditionals are eeeeevuuuuul!


This is true. Though with the goto fail example, shouldn't the compiler have produced an "unreachable code detected" warning? (Or do they still not have that kind of static analysis there?)

Maybe I'm just spoiled by C# compilers...

Who's to say the project doesn't compile with a bajillion warnings, or that particular warning is disabled?

That's the reason for always compiling with /w4 and treat warnings as errors.

Worked on titles: CMR:DiRT2, DiRT 3, DiRT: Showdown, GRID 2, theHunter, theHunter: Primal, Mad Max, Watch Dogs: Legion

Advertisement

This is certainly the case in C or code that is forced to follow similar flow patterns, where a jump to a cleanup stage is often required in lieu of a return. The RAII crowd will claim that this is a fundamental failing of the language. I have mixed feelings on the matter, but it's the damndest thing: hundreds of thousands of lines in our current project, a number of subroutines in the thousands of lines, and no need for goto statements. So I can't really agree with you that its use is a common structural pattern, particularly in situations where you do use RAII constructs (ie locally scoped destructors) to deal with cleanup.

Here's the thing, though. Maybe it's goto. Maybe it's a break/continue on deeply nested looping structures (which we definitely have in our codebase). It sorta doesn't matter which one - most of these direct-jump commands are representative of fundamental design issues, or simple poor code quality in need of refactoring. You've got a big attitude about the wrong thing. If you have a lot of non-local control flow, especially nested, then your code sucks. Avoiding a specific keyword has no effect on that. These are endemic problems. If your use of goto makes completely badly written code only tolerably badly written, that is not an argument in favor of goto, nor is it an argument in favor of other long distance jump instructions.

I really agree with this. I have never had the need to use a GOTO in my whole life. There is only one case I can remember where I was tempted to use a GOTO, and that was when I needed to BREAK out of a multi level FOR/WHILE loop. I don't remember how I solved that problem, but I have so much aversion against GOTO, that I am sure I didn't fall for my temptation. On the other hand, I am feeling a bit guilty, because I write code like this all the time:


    for (uint8_t channel = 0; channel < module->ChannelCount; channel++) {
        
        currentChannel = &module->ChannelList[channel];
        
        if (currentChannel == NULL) {
            printf("Current channel is null...\n");
            isErrorStatus = true;
            return 0;
        }
        
        if (!currentChannel->status.isPlaying) {
            continue;
        }
        
        if (currentChannel->status.isMuted) {
            continue;
        }
        
        if (currentChannel->delaySample != 0) {
            continue;
        }
        
        if (currentChannel->lengthInBytes == 0) {
            currentChannel->status.isPlaying = false;
            printf("Tried to play an empty sample.\n");
            currentChannel->volume = 0;
            currentChannel->status.isPlaying = false;
            continue;
        }

Well... This is maybe the most extreme example I could find, but I think it is cleaner to use a few breaks and continues than lots of levels of nested ifs. Here I break the flow with both returns and continues.

Perhaps you should read Dijkstra's original letter.

I have found the judicious use of goto for handling error conditions in pure C code to be advantageous: it allows for the explicit and graceful unwinding of resource acquisition. I have never found any other advantageous use for the construct. Source code is meant to be read by humans, and unstructured code offers very poor legibility. I say this because of 30 years experience reading other peoples' code. Most people write bad code; unstructured code is badder code.

Stephen M. Webb
Professional Free Software Developer

[...] I have never had the need to use a GOTO in my whole life. There is only one case I can remember where I was tempted to use a GOTO, and that was when I needed to BREAK out of a multi level FOR/WHILE loop. I don't remember how I solved that problem, but I have so much aversion against GOTO, that I am sure I didn't fall for my temptation.


If the alternative to the goto is using boolean variables to control flow, I'd rather take the goto. The enemy is unreadable code. Injudicious use of goto leads to unreadable code, but if you have a situation where goto is more readable than the alternatives, go for it.

On the other hand, I am feeling a bit guilty, because I write code like this all the time:


    for (uint8_t channel = 0; channel < module->ChannelCount; channel++) {
        
        currentChannel = &module->ChannelList[channel];
        
        if (currentChannel == NULL) {
            printf("Current channel is null...\n");
            isErrorStatus = true;
            return 0;
        }
        
        if (!currentChannel->status.isPlaying) {
            continue;
        }
        
        if (currentChannel->status.isMuted) {
            continue;
        }
        
        if (currentChannel->delaySample != 0) {
            continue;
        }
        
        if (currentChannel->lengthInBytes == 0) {
            currentChannel->status.isPlaying = false;
            printf("Tried to play an empty sample.\n");
            currentChannel->volume = 0;
            currentChannel->status.isPlaying = false;
            continue;
        }
Well... This is maybe the most extreme example I could find, but I think it is cleaner to use a few breaks and continues than lots of levels of nested ifs. Here I break the flow with both returns and continues.


Why would you feel guilty about using language constructs in a completely idiomatic way, making the code more readable?

My typical use is goto ex; which is jump to the exit routine, everything is done, clean up and get out.

I just noticed this... If your use of goto is to return, why don't you just put an return in place of the goto? Then the intention will be much clearer. With goto to exit, you are just camouflaging the real intent. You are adding one more level of "complexity" that is unnecessary. You are still breaking the flow with return, but it is still cleaner than a goto in this case. Besides, I am not so sure it is all that bad to break the flow in this manner. The intent with a return statement is clear: I am done with this function.

If you want to think about this in a real technical way, you should be able to draw a flow chart of your function. This should be done in advance, but who does that? The rules of a flow chart is: that you are having only one entry point, and only one exit point, and that absolutely no lines are crossing each other. No, you can't cheat and draw a flow chart that branches to the right side, and then to the left side to avoid a crossing on the right side. If you are ever using any of goto, break, continue or early returns, then you are breaking this flow.

According to these rules, I can't claim that I am being correct myself, since I am using all but the goto one. However, there really aren't many cases where goto will be needed. I am repeating myself a little bit now, because I am curious what view other people have about break, continue and early returns, if they see goto as evil. I sure do see goto as evil, but realise that break, continue, and early returns kind of falls in the same drawer, so that would theoretically make those too evil, even though they seem to be seen as more acceptable.

Edit:

If the alternative to the goto is using boolean variables to control flow, I'd rather take the goto. The enemy is unreadable code. Injudicious use of goto leads to unreadable code, but if you have a situation where goto is more readable than the alternatives, go for it.

On the other hand, I am feeling a bit guilty, because I write code like this all the time:

for (uint8_t channel = 0; channel < module->ChannelCount; channel++) {

[...] //Edited out to reduce amount of "spam"


Why would you feel guilty about using language constructs in a completely idiomatic way, making the code more readable?

Álvaro snuck in and answered my question. Regarding "If the alternative to the goto is using boolean variables to control flow, I'd rather take the goto." It is sure the dilemma I have when I am breaking out of multi level loops. I think I actually used boolean values in the case I had to solve this, and I considered goto for the first time in my life. I somehow felt dirty either way. None of the solutions felt really clean. I think I need to be cured of goto-phobia then.

And regarding the other case, I don't feel guilty over those language constructs, that is why I am using them. It is just that they are technically gotos too, just a bit cleaner since the intention is clearer. I always want to get better at what I am doing, that is why other people's opinions matters, and for me discussions like this is brilliant.

My typical use is goto ex; which is jump to the exit routine, everything is done, clean up and get out.


I just noticed this... If your use of goto is to return, why don't you just put an return in place of the goto? Then the intention will be much clearer. With goto to exit, you are just camouflaging the real intent. You are adding one more level of "complexity" that is unnecessary. You are still breaking the flow with return, but it is still cleaner than a goto in this case. Besides, I am not so sure it is all that bad to break the flow in this manner. The intent with a return statement is clear: I am done with this function.


In C++, you are basically right. In C, you may need to do some cleanup before you return, and you don't have RAII to help you.

If you want to think about this in a real technical way, you should be able to draw a flow chart of your function. This should be done in advance, but who does that? The rules of a flow chart is: that you are having only one entry point, and only one exit point, and that absolutely no lines are crossing each other. No, you can't cheat and draw a flow chart that branches to the right side, and then to the left side to avoid a crossing on the right side. If you are ever using any of goto, break, continue or early returns, then you are breaking this flow.

According to these rules, I can't claim that I am being correct myself, since I am using all but the goto one. However, there really aren't many cases where goto will be needed. I am repeating myself a little bit now, because I am curious what view other people have about break, continue and early returns, if they see goto as evil. I sure do see goto as evil, but realise that break, continue, and early returns kind of falls in the same drawer, so that would theoretically make those too evil, even though they seem to be seen as more acceptable.


Those rules are rather arbitrary, and they can lead to ridiculously nested conditionals in cases like the code you posted.

There is nothing wrong with break, continue or early returns. They can all be used to make the code more clear, which is why you are using them, and you shouldn't feel bad about it. Of course these are all gotos in disguise. If the language didn't provide them, I would go ahead and write explicit gotos in those cases. If I am breaking out of a nested loop and using a goto is the cleanest way to write the code, I will use it.

You speak as if you never programmed anything of more than trivial complexity. BREAK and CONTINUE can be less clear where there are (many) nested loops or where your 'just jump your eyes' might be to along ways off the screen far below.

Don't put words in my mouth (or in my keyboard tongue.png) that I didn't say. I said the comparison between break/continue and goto was unfair because goto is worse than break & continue. I didn't say break/continue were good practice. I try to avoid them too (however for goto, I just never use it).
Furthermore, I said that where there are (many) nested loops it is bad practice to use break/continue there. Read my post again.

Do you have any examples of the sorts of circumstances under which more modern compilers are untrustworthy?

Yes. I have a few examples. I encounter things like this all the time. I used to have a link to a comparison between GCC, Clang & VS showing how bad they all generated intrinsic code but I can't seem to find it now.

Aside from that (which can be treated as codegen performance bugs that will eventually be fixed), there's a limit on how much the compiler can optimize because of assumptions the compiler cannot make due to how the language works (i.e. restrict helps us a lot).

Something as simple as moving member variables to a local variables before going into a loop that calls a function can greatly improve performance because the compiler may not know if the member variables are going to be modified by the function being called; thus the compiler has to reload them on every iteration. With local variables, the compiler can assume that (*).

Another example is avoiding writing a non-const reference to a variable/object until the very end of the routine (i.e. an output variable) and only read from it at the beginning; otherwise the compiler cannot assume the variable isn't aliased to any of our member variables (or to any other non-local variable).

There is a limit on what the compiler can optimize for us. Not because the optimizer is good/bad, but rather because the language enforces some guarantees (with good reason!) that sometimes costs us a lot of CPU time.

This is a very current problem since memory latency and bandwidth has become a bottleneck; and turns out the compiler can't do much to optimize out (unnecessary-in-practice) memory reloads; and needs our help.

If you're writing accountancy software, ERP, or something like that; you will never have to worry about that. But in gamedev where we squeeze every drop of cpu cycles and iterate over many thousands of objects many times per second; then yes; these limitations begin to notice.

(*) Massively copying everything to local variables won't help either; because this data won't fit into registers and will end up in the stack every time we enter that troublesome function and reload when we leave it; which is basically what the compiler would do with member variables, but worse.

In such cases one has to sit down and analyze the sources of aliasing and rewrite the routine in a way that they get minimized or contained to a particular place where it won't matter. Also one has to analyze the possibility of enabling the visibility of the routine to the compiler so the compiler can know what it is doing and what gets modified (but if it's a virtual function, the compiler can't do it even if it sees the function's body). LCGT can help too in this cases; but remember that even then, there are some assumptions the compiler still can't make.

I just noticed this... If your use of goto is to return, why don't you just put an return in place of the goto? Then the intention will be much clearer. With goto to exit, you are just camouflaging the real intent. You are adding one more level of "complexity" that is unnecessary. You are still breaking the flow with return, but it is still cleaner than a goto in this case. Besides, I am not so sure it is all that bad to break the flow in this manner. The intent with a return statement is clear: I am done with this function.

Notice the word "cleanup". This is the key point - many people have brought up C error handling, and for good reason: which idiom do you prefer between these three popular approaches? There are more, but these are just examples. The "sequential error handling" model:


int do_work(int stuff)
{
    int res1, res2, res3;

    res1 = acquire_one_resource(stuff);
    if (!res1)
    {
        return 0; // error!
    }

    res2 = acquire_another_resource(stuff);
    if (!res2)
    {
        free_resource(res1); // gotta clean this up
        return 0;
    }

    res3 = acquire_more_resources(stuff);
    if (!res3)
    {
        free_resource(res2); // oh boy, here we go
        free_resource(res1);
        return 0;
    }

    /* etc, releasing more and more previous resources each time
     * so that we never leave any acquired resource behind */

    /* do work with resources */

    free_resource(res3); // oh, yeah, let's not forget
    free_resource(res2); // wait, didn't we already write this earlier?
    free_resource(res1); // maybe it should go into a function.. oh, wait, it can't, can it

    return 1; // yay
}

The "cascade deallocation" model:


int do_work(int stuff)
{
    int res1, res2, res3;
    int success = 0;

    res1 = acquire_one_resource(stuff);
    if (res1)
    {
        res2 = acquire_another_resource(stuff);
        if (res2)
        {
            res3 = acquire_more_resources(stuff);
            if (res3)
            {
                /* etc. until you finally have everything */

                /* do some stuff with resources */

                success = 1;

                free_resource(res3); // here we go
            }

            free_resource(res2);
        }

        free_resource(res1);
    }

    return success;
}

Or the honest-to-god goto model:


int do_work(int stuff)
{
    int res1 = 0, res2 = 0, res3 = 0;
    int success = 0;

    res1 = acquire_one_resource(stuff);
    if (!res1) goto cleanup;

    res2 = acquire_another_resource(stuff);
    if (!res2) goto cleanup;

    res3 = acquire_more_resources(stuff);
    if (!res3) goto cleanup;

    /* etc. */

    success = 1;

cleanup:
    /* here the only requirement is that the cleanup process
     * ignores unacquired resources, perhaps with some minor
     * logic if you have pointers to resources (unusual) */
    free_resource(res3);
    free_resource(res2); // would you look at that, all the cleanup code
    free_resource(res1); // in one single spot, no code duplication
    return success;
}

(there's also the variant with multiple labels that jump at the appropriate position in the cleanup routine, which relaxes the requirement of having to ignore resources which were never acquired, but that I find a bit too verbose for my taste personally - if you are on a relatively flat memory model, which you should be in C, cleanup itself will be easy, the harder part is determining what to clean up at any given time)

One of the snippets above scales. The other two don't. I think it's clear which is which.

In C++ and other languages you have stuff to help you. In C++ that's the RAII memory ownership model, in C# or Java that's the garbage collector and maybe some other automatic acquire/release mechanism (like scoped using declarations in C#), which basically do the same thing but in a mechanical, behind-the-scenes way, which is - of course - better. In C, you have nothing but you and your pointers, and for most nontrivial programs error handling is going to make up a large fraction of the code, so it better be readable and maintainable.

Sure, you can delegate the resource acquisition code to some other part of your program, or stretch your code out over an interminable amount of two-line helper functions in the name of structured programming, but it doesn't matter how much you move it around, eventually you're going to have to deal with the necessity of having multiple resources around at the same time and to be prepared for the possibility of only acquiring part of those resources and being required to execute a correct cleanup sequence every time.

And if you are thinking why I didn't just move the cleanup bit into its own function, let me ask you this: why would I create an off-screen helper function with critical cleanup code that intrinsically needs to be kept in sync with the do_work() function, when I can just have it naturally at the bottom of the function all by replacing "cleanup(); return 0;" with "goto cleanup;". What concrete, practical advantage is there? None. It's a waste of space and programmer energy, and a prime example of cargo cult programming.

If you want to think about this in a real technical way, you should be able to draw a flow chart of your function. This should be done in advance, but who does that? The rules of a flow chart is: that you are having only one entry point, and only one exit point, and that absolutely no lines are crossing each other. No, you can't cheat and draw a flow chart that branches to the right side, and then to the left side to avoid a crossing on the right side. If you are ever using any of goto, break, continue or early returns, then you are breaking this flow.

That's an arbitrary blanket rule. Early returns are a good thing. Breaks and continues have their places. Goto has its place too when you need some kind of control flow that improves readability (like the error handling example) and no other language construct naturally lends itself to that (admittedly, that is rare even in languages like C, and basically nonexistent for higher level languages). And when I see this kind of rule being repeated blindly all I can think of is the programmer who, told not to use goto, wrapped up his error handling code in a dreadful do{}while(0); loop to obtain the same control flow without writing 'goto' and obfuscating his code in the process. And that is far worse than any amount of gotos (well, almost any amount).

I'm not saying goto is good. Here the use of goto for error handling is justifiable because goto naturally lends itself to a rather elegant, very low boilerplate resource cleanup implementation in C. Clearly jumping all over your code, to labels hundreds of lines away, is nothing short of insane. But simply outright banning goto because there is a paper called 'goto considered harmful' without even considering the possibility that maybe, just maybe, goto could be used in a controlled manner (it is a tool, and just like any other tool it can be abused; break and continue are also tools which can be abused) simply shows a lack of intellectual depth in my opinion.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

Something as simple as moving member variables to a local variables before going into a loop that calls a function can greatly improve performance because the compiler may not know if the member variables are going to be modified by the function being called; thus the compiler has to reload them on every iteration. With local variables, the compiler can assume that (*).

(*) Massively copying everything to local variables won't help either; because this data won't fit into registers and will end up in the stack every time we enter that troublesome function and reload when we leave it; which is basically what the compiler would do with member variables, but worse.

In such cases one has to sit down and analyze the sources of aliasing and rewrite the routine in a way that they get minimized or contained to a particular place where it won't matter. Also one has to analyze the possibility of enabling the visibility of the routine to the compiler so the compiler can know what it is doing and what gets modified (but if it's a virtual function, the compiler can't do it even if it sees the function's body). LCGT can help too in this cases; but remember that even then, there are some assumptions the compiler still can't make.

Oh dear! I happen to do the opposite on occasion, moving local variables to member variable in cases where I have tight loops and want to optimise a bit, and now you are telling me that? Seems I have a lot of work learning my own compilers behaviours...

[...]

The "cascade deallocation" model:

[...]

Or the honest-to-god goto model:

[...]

And if you are thinking why I didn't just move the cleanup bit into its own function, let me ask you this: why would I create an off-screen helper function with critical cleanup code that intrinsically needs to be kept in sync with the do_work() function, when I can just have it naturally at the bottom of the function all by replacing "cleanup(); return 0;" with "goto cleanup;". What concrete, practical advantage is there? None. It's a waste of space and programmer energy, and a prime example of cargo cult programming.

If you want to think about this in a real technical way, you should be able to draw a flow chart of your function. This should be done in advance, but who does that? The rules of a flow chart is: that you are having only one entry point, and only one exit point, and that absolutely no lines are crossing each other. No, you can't cheat and draw a flow chart that branches to the right side, and then to the left side to avoid a crossing on the right side. If you are ever using any of goto, break, continue or early returns, then you are breaking this flow.

That's an arbitrary blanket rule. Early returns are a good thing. Breaks and continues have their places. Goto has its place too when you need some kind of control flow that improves readability (like the error handling example) and no other language construct naturally lends itself to that (admittedly, that is rare even in languages like C, and basically nonexistent for higher level languages). And when I see this kind of rule being repeated blindly all I can think of is the programmer who, told not to use goto, wrapped up his error handling code in a dreadful do{}while(0); loop to obtain the same control flow without writing 'goto' and obfuscating his code in the process. And that is far worse than any amount of gotos (well, almost any amount).

I'm not saying goto is good. Here the use of goto for error handling is justifiable because goto naturally lends itself to a rather elegant, very low boilerplate resource cleanup implementation in C. Clearly jumping all over your code, to labels hundreds of lines away, is nothing short of insane. But simply outright banning goto because there is a paper called 'goto considered harmful' without even considering the possibility that maybe, just maybe, goto could be used in a controlled manner (it is a tool, and just like any other tool it can be abused; break and continue are also tools which can be abused) simply shows a lack of intellectual depth in my opinion.

The cascade model is just plain ugly. The goto way is actually really elegant. In those cases I would probably go for a function in the past to avoid goto, but your arguments have convinced me. There are definitively reasons for goto in certain cases, and the function way is not good either.

Regarding the blanket rule, I am not at all preaching the "flow chart" model, since I am obviously not following those rules myself, just wanted to hear some arguments, and I got a lot of good ones. I am a follower of clean and elegant code, and that is for me more important than any other strictly (and blindly) enforced idioms. Hearing comments like this also eases my insecurities/discomforts when I am breaking out of learned/taught conventions to make things more readable/elegant too, but the goto one has really got stuck in me for some reason.

This topic is closed to new replies.

Advertisement