Jump to content

  • Log In with Google      Sign In   
  • Create Account

Interested in a FREE copy of HTML5 game maker Construct 2?

We'll be giving away three Personal Edition licences in next Tuesday's GDNet Direct email newsletter!

Sign up from the right-hand sidebar on our homepage and read Tuesday's newsletter for details!


We're also offering banner ads on our site from just $5! 1. Details HERE. 2. GDNet+ Subscriptions HERE. 3. Ad upload HERE.


Goto and Loops - NOT a goto discussion


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
20 replies to this topic

#1 Milcho   Crossbones+   -  Reputation: 1177

Like
2Likes
Like

Posted 09 January 2013 - 06:20 PM

First - this isn't meant as a question whether using gotos is appropriate to use in an OO langauge. That's been overdone, so please don't let it degenerate to that type of discussion.

 

I DID however use a goto recently in a nested loop to escape it, and it got me thinking about the variables used in a loop, and the flow.

 

I tested the following minimal, working, c++ code:

#include <iostream>
using namespace std;

int main ()
{
  if (true)
  {
    for ( int i = 0 ; i < 3; i++)
    {
    looped:
      cout << i << " ";
    }
  }
  cout << "\nexit loop\n";
  int j = -1;
  // cout << i; // does not compile, obviously as expected
  goto looped;
}

 

This actually ends up printing:

 

0
1
2
exit loop
3
exit loop
4
exit loop
5
exit loop
6

 

And so on, forever

So... I expected i to be popped off the stack once the loop exited naturally (the first time), and the declaration of j to replace it on the stack. 

I'm using Visual Studio 2012 Express to test this. I haven't tested it on anything else yet. Why does it look like the declaration of i isn't popped off the stack at all?

 

This also makes me wonder what happens when I use a goto to escape the loop - does the variable declared in the loop's header stay on the stack? It seems like it should, but after this experiment, I'm not feeling too confident in how I know the stack to work.

 

Anyone have any comments?



Sponsor:

#2 Paradigm Shifter   Crossbones+   -  Reputation: 5407

Like
0Likes
Like

Posted 09 January 2013 - 06:27 PM

I'll take a punt on "undefined behaviour". Typically the stack pointer is set up only once per function, not once per scope (i.e. the curly brackets), and the compiler could share the same stack location for i and j, maybe the goto has made it realise you jump into another scope so it doesn't? Did you test it in release mode/optimisations on as well?

It definitely wouldn't work like that if there was a nontrivial destructor for i...
"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

#3 L. Spiro   Crossbones+   -  Reputation: 13964

Like
4Likes
Like

Posted 09 January 2013 - 06:36 PM

Why does it look like the declaration of i isn't popped off the stack at all?
Because you never left the function. The stack pointer won’t be modified until you enter or exit a function.
And what you are seeing is undefined behavior. When it goes back into the loop it simply takes the value at the address of “i” and keeps working on it.

Not to confuse you further, but nothing really “leaves” the stack. It is just a series of addresses and a pointer to somewhere in them that gets shifted up and down as you enter/leave functions.
The pointer gets shifted but the data all over the stack is not changed when this happens. Through undefined behavior and hacks you can access values on the stack from the caller of your current function, for example. You can also modify values “ahead” of your current stack position and change data that the next function you call will use.

You are basically associating scopes with stacks. This is wrong. Scopes are a language constraint. Just because a variable goes out of scope does not mean the stack gets modified in any way (destructors can cause this to happen though). Nothing happens to “i” just because you leave the for-loop.


L. Spiro

Edited by L. Spiro, 09 January 2013 - 06:38 PM.

It is amazing how often people try to be unique, and yet they are always trying to make others be like them. - L. Spiro 2011
I spent most of my life learning the courage it takes to go out and get what I want. Now that I have it, I am not sure exactly what it is that I want. - L. Spiro 2013
I went to my local Subway once to find some guy yelling at the staff. When someone finally came to take my order and asked, “May I help you?”, I replied, “Yeah, I’ll have one asshole to go.”
L. Spiro Engine: http://lspiroengine.com
L. Spiro Engine Forums: http://lspiroengine.com/forums

#4 Milcho   Crossbones+   -  Reputation: 1177

Like
0Likes
Like

Posted 09 January 2013 - 06:40 PM

I'll take a punt on "undefined behaviour". Typically the stack pointer is set up only once per function, not once per scope (i.e. the curly brackets), and the compiler could share the same stack location for i and j, maybe the goto has made it realise you jump into another scope so it doesn't? Did you test it in release mode/optimisations on as well?

It definitely wouldn't work like that if there was a nontrivial destructor for i...

 

I was testing this in release mode - as for optimizations /OPT:REF is set, but that's only supposed to be to eliminate variables that are never referenced, not to keep variables like that.

 

 

But then, based on your last line, I tried this code (also fixing my lack of return statement in main... shame on me :) )

#include <iostream>
using namespace std;

class Nontrivial
{
public:
Nontrivial () { i = 0; }
~Nontrivial () { i = -100; }
int i;
};

int main ()
{
  if (true)
  {
    for ( Nontrivial n; n.i < 3; n.i++)
    {
      looped:
      cout << n.i << " ";
    }
  }
  cout << "\nexit loop\n";
  int j = -1;
  // cout << i; // does not compile
  goto looped;
  return 0;
}

This.... didn't compile at all:

 

Error 1 error C2362: initialization of 'n' is skipped by 'goto looped' 

Yeah.... 
 



#5 ApochPiQ   Moderators   -  Reputation: 16001

Like
2Likes
Like

Posted 09 January 2013 - 06:41 PM

This should be totally well-defined AFAIK.

The lifetime of the stack slot is an implementation detail; the language semantics only care about the lifetime of the data placed in that slot. If you place a variable definition inside a loop, it'd be wasteful to push and pop the stack every loop iteration to reserve stack space for the variable; as an optimization, it is common to reserve all needed stack space in a single shot when entering a function, and clean it all up in one pop at the end. (Note that this depends on calling conventions and other details.)

If you put an object with nontrivial construction/destruction in place of a primitive, the compiler will construct and destruct it on scope entry/exit as you would expect. It will typically reuse stack slots as far as is safe, but since you can easily construct cases that can't be statically proven safe, the compiler will often not reuse stack slots specifically to avoid mashing two things into the same slot.


[edit] The reason this fails with your nontrivial example is that the compiler can't statically figure out how to safely construct the object for every possible flow scenario.

Edited by ApochPiQ, 09 January 2013 - 06:43 PM.


#6 SiCrane   Moderators   -  Reputation: 9626

Like
7Likes
Like

Posted 09 January 2013 - 06:41 PM

Actually, this is a should-have-failed-to-compile situation according to the standard. If you jump into a block where a variable with automatic storage duration with an initializer is in scope from a place where it isn't in scope, the program is ill-formed. (Relevant section is 6.7 of all of C++98, C++03 and C++11.) MSVC permits this as an extension. If you disable language extensions it fails to compile with a C2362.

#7 Milcho   Crossbones+   -  Reputation: 1177

Like
0Likes
Like

Posted 09 January 2013 - 06:43 PM

Why does it look like the declaration of i isn't popped off the stack at all?
Because you never left the function. The stack pointer won’t be modified until you enter or exit a function.
And what you are seeing is undefined behavior. When it goes back into the loop it simply takes the value at the address of “i” and keeps working on it.

Not to confuse you further, but nothing really “leaves” the stack. It is just a series of addresses and a pointer to somewhere in them that gets shifted up and down as you enter/leave functions.
The pointer gets shifted but the data all over the stack is not changed when this happens. Through undefined behavior and hacks you can access values on the stack from the caller of your current function, for example. You can also modify values “ahead” of your current stack position and change data that the next function you call will use.

You are basically associating scopes with stacks. This is wrong. Scopes are a language constraint. Just because a variable goes out of scope does not mean the stack gets modified in any way (destructors can cause this to happen though). Nothing happens to “i” just because you leave the for-loop.


L. Spiro

 

Ok, I understand that nothing "leaves" the stack, no confusion there :). That's why I had put in the declaration of j. But the fact that the stack is allocated/deallocated on a per-function basis would actually explain this, I suppose.



#8 Paradigm Shifter   Crossbones+   -  Reputation: 5407

Like
0Likes
Like

Posted 09 January 2013 - 06:44 PM

main is the only function that returns a value that you are allowed to skip the return statement ;) It implicitly returns 0 if you omit it.
"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

#9 Milcho   Crossbones+   -  Reputation: 1177

Like
0Likes
Like

Posted 09 January 2013 - 06:52 PM

Actually, this is a should-have-failed-to-compile situation according to the standard. If you jump into a block where a variable with automatic storage duration with an initializer is in scope from a place where it isn't in scope, the program is ill-formed. (Relevant section is 6.7 of all of C++98, C++03 and C++11.) MSVC permits this as an extension. If you disable language extensions it fails to compile with a C2362.

Yes, you're correct, disabling language extensions I get the "initialization skipped" error.

Oddly however, I left the loop definition like this, without initializing i to 0:

for ( int i; i < 3; i++)

and it compiled. Value of i was obviously random, but it compiled and ran just fine, printing a random number and continuously incrementing it.

 

I don't know why I had the idea that the stack was dealt with inside a function, instead of only on a function level. I suppose language syntax and scoping rules made me assume it worked like that, but in retrospect, I don't think I've read anything to indicate it.



#10 ApochPiQ   Moderators   -  Reputation: 16001

Like
0Likes
Like

Posted 09 January 2013 - 07:20 PM

Welcome to undefined behavior: stuff that makes zero sense, compiles, runs, and might (if you're lucky) indicate something is broken before it's too late.

#11 SiCrane   Moderators   -  Reputation: 9626

Like
0Likes
Like

Posted 09 January 2013 - 07:37 PM

Oddly however, I left the loop definition like this, without initializing i to 0:
for ( int i; i < 3; i++)
and it compiled.


As I said, it's illegal to jump in a block with an automatic variable with an initializer. If it's a POD type, like an int, without an initializer, it's legal to jump into a block that contains it. In C++98/03 it's illegal to jump into a scope with a non-POD type with or without an intializer. In C++11 the rules are basically the same, but are described in a more complex manner because of the shades of what is considered POD.

#12 Bregma   Crossbones+   -  Reputation: 5242

Like
1Likes
Like

Posted 09 January 2013 - 10:05 PM

I don't know why I had the idea that the stack was dealt with inside a function, instead of only on a function level. I suppose language syntax and scoping rules made me assume it worked like that, but in retrospect, I don't think I've read anything to indicate it.

It's a sensible assumption about the implementation of the underlying C++ memory model.  There is in fact nothing in C++ that mandates the use of a stack for anything.  It is quite possible (provably so, because it's been done) to have a C compiler for hardware that does not support stack operations, and I imagine it's equally as possible to do so for C++. I'm not saying it wouldn't be a pain, but technically possible.

 

The very use of the common misnomers "on the stack" or "on the heap" when referring to automatic or free storage respectively can cause a good deal of confusion when people find out that's not how things really work.


Stephen M. Webb
Professional Free Software Developer

#13 Álvaro   Crossbones+   -  Reputation: 13649

Like
1Likes
Like

Posted 09 January 2013 - 10:54 PM

The very use of the common misnomers "on the stack" or "on the heap" when referring to automatic or free storage respectively can cause a good deal of confusion when people find out that's not how things really work.

 

 

How are those misnomers? If you look in the Wikipedia page about memory management, dynamic memory allocation is described as "allocating portions of a large pool of memory called the heap". The C standard doesn't mention "heap" anywhere, but it doesn't mention "free storage" either. 

 

The C standard doesn't talk about "the stack", but it also doesn't talk about "local variables". I am happy to talk about local variables in the context of the C language, and the implementation of local variables in a C compiler has to be a stack of some sort, at least if we are dealing with recursive functions. So I don't see the problem with calling this space "the stack" either. This stack doesn't have to be the same as the call stack, although that is the case in every compiler I have ever used, and when debugging some problems, it has been useful to understand something about the memory layout of these things.

 

 

I knew about the stack and the heap before I knew about C or C++, and I have a hard time using other terms. The mental pictures of the stack and the heap still serve me well, and when I ask my compiler to turn some piece of code into assembly language, it still looks like the stack and the heap are being used. I don't see why the terms used in the language description should be preferred.

 

I guess I just don't allow programming-language standards to determine how I talk about things that are larger than the programming language. For instance, a byte to me is 8 bits, but the standard specifies that a byte is whatever a char is, which could be something other than 8 bits. That might be the convention used in that document, but when I use the word "byte", I mean 8 bits.

 

While I am in rant mode, I call my .c and .cpp files "modules", and not "translation units", and I don't know anybody who uses the latter.


Edited by Álvaro, 09 January 2013 - 10:55 PM.


#14 wqking   Members   -  Reputation: 756

Like
-1Likes
Like

Posted 10 January 2013 - 12:09 AM

As far as I understand (same as others), that should be invalid syntax. Variable 'I" should be only valid in the for loop.

Also, even it's valid syntax, I would never write that kind of code because it's so confusing. Indeed I would not spend my any time on studying "goto". I will stop here, otherwise will bring a flame war of "goto" discussion.


http://www.cpgf.org/
cpgf library -- free C++ open source library for reflection, serialization, script binding, callbacks, and meta data for OpenGL Box2D, SFML and Irrlicht.
v1.5.5 was released. Now supports tween and timeline for ease animation.


#15 Milcho   Crossbones+   -  Reputation: 1177

Like
0Likes
Like

Posted 10 January 2013 - 02:46 AM

I don't know why I had the idea that the stack was dealt with inside a function, instead of only on a function level. I suppose language syntax and scoping rules made me assume it worked like that, but in retrospect, I don't think I've read anything to indicate it.

It's a sensible assumption about the implementation of the underlying C++ memory model.  There is in fact nothing in C++ that mandates the use of a stack for anything.  It is quite possible (provably so, because it's been done) to have a C compiler for hardware that does not support stack operations, and I imagine it's equally as possible to do so for C++. I'm not saying it wouldn't be a pain, but technically possible.

 

The very use of the common misnomers "on the stack" or "on the heap" when referring to automatic or free storage respectively can cause a good deal of confusion when people find out that's not how things really work.

 

I'm also not sure why you refer to these terms as misnomers. These terms were created for the exact purpose of the stack / heap allocation that exists on computers, they are still applicable as far as I'm know. Trying to allocate a large enough local variable will cause a crash, as you run out of space on the stack, yet using new for the same size allocation on the heap works just fine - so there's still a definite difference between the two. It has been a couple of years since I took my OS design courses, and the knowledge there is a bit simplified over the real things, but the 'on the stack' and 'on the heap' should be correctly describing of what's actually going on.

 

The major reason why I posted this is because I wanted to know about the exact specifics of of the stack management and how this specific case was handled. At any rate, I think the previous posts in this thread did describe the reasoning behind this fairly well.



#16 Hodgman   Moderators   -  Reputation: 30950

Like
1Likes
Like

Posted 10 January 2013 - 04:11 AM

I'm also not sure why you refer to these terms as misnomers

The C++ standard uses his wording -- "the stack" is automatic storage and "the heap" is the free store.
"The stack" and "the heap" are just colloquialisms carried over by C programmers. Arguing that these terms are wrong is a bit of a pedantic thing to do though, like pointing out that "The STL" is wrong as should just be called the standard library, instead of just quietly admitting that you know what people mean by these phrases wink.png The point though, is that "the stack" isn't necessarily a stack, and "the heap" isn't necessarily a heap. They're abstract concepts in the C++ standard, and the implementations we're used to are concrete compiler decisions that fulfil those concepts.


Edited by Hodgman, 10 January 2013 - 05:09 AM.


#17 Álvaro   Crossbones+   -  Reputation: 13649

Like
1Likes
Like

Posted 10 January 2013 - 06:07 AM

This post is similar to the one above. I am not trying to beat a dead horse, but perhaps gain some understanding. So please point out if I am wrong about anything below (which is very plausible).

 

 

The point though, is that "the stack" isn't necessarily a stack, and "the heap" isn't necessarily a heap.

 

 

How could the automatic storage not be a stack? If you have a recursive function with parameters and local variables, there must be a stack of some sort somewhere to handle different calls to the function being "active" at any given time. It might not be the x86 stack, but it has to be a stack of some sort. And the large pool of memory from which you dynamically allocate portions is called the heap, and that is just a name, not a statement of a particular implementation of the concept.

 

 

They're abstract concepts in the C++ standard, and the implementations we're used to are concrete compiler decisions that fulfil those concepts.

 

 

Although there is nothing technically wrong with that statement, I think it confuses the history of these things: A beginner will get the impression that the C++ standard was created in a vacuum and then people came along and settled on these particular implementations, as if by some sort of accident. The reality is that the implementations came first and these concepts were well established before the C++ standard was created. The C++ standard is simply a contract between the programmer and the compiler writer as to what things are guaranteed to work. I don't understand why the standard didn't use the prevailing terminology: Calling the heap "the free store" gives the impression that they are allowing more flexibility in the implementations than if they had called it "the heap", but this is not the case.


Edited by Álvaro, 10 January 2013 - 06:09 AM.


#18 reinder   Members   -  Reputation: 134

Like
0Likes
Like

Posted 10 January 2013 - 06:10 AM

The code won't compile using gcc:

 

 

In function "int main()":
10: error: jump to label "looped"
17: error:   from here
8: error:   skips initialization of "int i"
 


#19 Hodgman   Moderators   -  Reputation: 30950

Like
1Likes
Like

Posted 10 January 2013 - 06:59 AM

How could the automatic storage not be a stack? If you have a recursive function with parameters and local variables, there must be a stack of some sort somewhere to handle different calls to the function being "active" at any given time. It might not be the x86 stack, but it has to be a stack of some sort. And the large pool of memory from which you dynamically allocate portions is called the heap, and that is just a name, not a statement of a particular implementation of the concept.

The point about these being a misnomer is only because the standard doesn't actually use those names. "The stack" and "the heap" are what we all call them, but the standard doesn't call them that, so it's informal language.

 

The "call stack" does act like a stack (pushing and popping stack frames), and the stack data-structure can be implemented in different ways (e.g. we're used to the call-stack being an array, but it could also be a linked list). But perhaps on some particular machine it's possible to have a call-tree instead of a call-stack -- e.g. for co-routines -- such a machine could implement a standard C or C++ compiler, with it's own co-routine extensions to the language to make use of it's call-tree, which would allow for yield/resume as well as call/return, etc... It would likely use a linked-list based tree for automatic storage ("the call-stack") instead of an array-based stack.

 

"The heap" could be implemented as a heap, or many other structures. I'm not sure why we call the free-store "the heap"... maybe it was originally implemented as a heap data structure? Or maybe because there's a whole heap of bytes in it cool.png

 

n.b. I did make the point that it's awfully pedantic to argue over whether we should call it the heap or the free-store, etc... I prefer the informal colloquialisms because they're more prevalent (it's what i was taught to call them in Uni, anyway).



#20 Bregma   Crossbones+   -  Reputation: 5242

Like
3Likes
Like

Posted 10 January 2013 - 08:54 AM

How could the automatic storage not be a stack?

On the venerable TM9900, subroutine calls are perfomed using the BLWP instruction (branch-and-link-workspace-pointer), which implements call/returns semantics using what is effectively a linked list.  No stack.  None, nada, not a sausage.  Variables of automatic storage duration are created in scratch RAM indexed by one of the 16 general-purpose registers.  I'm not sure of a C compiler was every available for that platform, but if there was it would make more sense to use that native method than to try to hack one of general-purpose registers for use as a stack pointer (call/return would still use BLWP, since there was no alternative).

 

Most calls on a SPARC were done by switching registers sets, with arguments passed in registers and, if possible, automatic variables were also in registers.  SPARCs had a lot of registers.  That means, unless the parameters or locals were huge, subroutine calls were through a chained linked-list structure, not a stack.  There was definitely a C compiler for the Unixes that run SPARCs.

 

If the size of the locals or arguments on a 68k or a PPC are small enough, only the registers are used and stack use is avoided by most optimizing compilers.  The OP's code is an example of such small code.

 

To assume you have to use some sort of stack structure to implement automatic variables in C or C++ is just an invalid assumption.  That's why it's a misnomer.  The word means "improperly named", and it's an accurate description.  Sure, you can go ahead and use the phrase "on the stack" to describe variables of automatic storage duration and most folks will know what you're talking about, but if you take the name literally and make assumptions about the underlying implementation, you're going to run in to trouble.  That's what happened here.  That's why it happened here.  I do not think it is pedantic or pretentious to explain why it's a misnomer and how it being a misnomer caused misunderstanding.

 

As to the phrase 'on the heap' when referring to the free store, there was never a heap in the technical computer-science sense of the word.  The term originated when Unix was running on the DEC PDP-11 (as the "on the stack').  This was before the era of virtual memory, and each process had its own memory region.  The executable code was in one area, and area was designated for the stack, which grew "downwards", and the rest of the memory was reserved for allocations (using malloc() -- which was an acronym for memory allocation).  The memory allocations and the stack grew towards each other.  Because the free store was always drawn at the bottom of diagrams, depictions of memory allocations looked like a big pile of boxes thrown one on top of the other.  Deallocations could occur in the middle, so eventually the diagrams would look like a big pile of stuff at the bottom.  As Hodgman puts at, 'a whole heap of bytes'.

 

With the advent of virtual memory, this description no longer makes sense.  It's still a misnomer.  It doesn't hurt to use the slang, but don't expect to find any kind of heap if you look under the hood, even if it might be a reasonable design on some system.  Don't expect to find the term in the standard, since that would require the use of a particular implementation where it might be inappropriate.


Stephen M. Webb
Professional Free Software Developer




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS