Goto and Loops - NOT a goto discussion

19 comments, last by alvaro 11 years, 3 months ago
Oddly however, I left the loop definition like this, without initializing i to 0:

for ( int i; i < 3; i++)
and it compiled.


As I said, it's illegal to jump into a block past an automatic variable declared with an initializer. If the variable is a POD type, like an int, declared without an initializer, it's legal to jump into a block that contains it. In C++98/03 it's illegal to jump into the scope of a non-POD type, with or without an initializer. In C++11 the rules are basically the same, but are described in a more complex manner because of the shades of what is considered POD.
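A minimal sketch of that rule, assuming a conforming C++ compiler. The function name `count_iterations` and the label `inside` are made up for illustration. The `goto` enters a block past the declaration of `i`, which is only allowed because `i` is a scalar declared without an initializer.

```cpp
// Hypothetical example: jump forward into a block, past a declaration.
int count_iterations() {
    int count = 0;
    goto inside;            // enters the block below, skipping the declaration of i
    {
        int i;              // scalar, no initializer: bypassing it is permitted
        // int j = 0;       // uncommenting this makes the goto ill-formed,
        //                  // because the jump would skip j's initialization
    inside:
        i = 0;              // i is indeterminate until assigned, so assign before reading
        for (; i < 3; ++i) {
            ++count;
        }
    }
    return count;
}
```

Calling `count_iterations()` runs the loop body three times. Reading `i` before the assignment after the label would be undefined behaviour, which is a good reason to avoid this pattern even where it compiles.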
I don't know why I had the idea that the stack was dealt with inside a function, instead of only on a function level. I suppose language syntax and scoping rules made me assume it worked like that, but in retrospect, I don't think I've read anything to indicate it.

It's a sensible assumption about the implementation of the underlying C++ memory model. There is in fact nothing in C++ that mandates the use of a stack for anything. It is quite possible (provably so, because it's been done) to have a C compiler for hardware that does not support stack operations, and I imagine it's equally as possible to do so for C++. I'm not saying it wouldn't be a pain, but technically possible.

The very use of the common misnomers "on the stack" or "on the heap" when referring to automatic or free storage respectively can cause a good deal of confusion when people find out that's not how things really work.

Stephen M. Webb
Professional Free Software Developer

The very use of the common misnomers "on the stack" or "on the heap" when referring to automatic or free storage respectively can cause a good deal of confusion when people find out that's not how things really work.

How are those misnomers? If you look in the Wikipedia page about memory management, dynamic memory allocation is described as "allocating portions of a large pool of memory called the heap". The C standard doesn't mention "heap" anywhere, but it doesn't mention "free storage" either.

The C standard doesn't talk about "the stack", but it also doesn't talk about "local variables". I am happy to talk about local variables in the context of the C language, and the implementation of local variables in a C compiler has to be a stack of some sort, at least if we are dealing with recursive functions. So I don't see the problem with calling this space "the stack" either. This stack doesn't have to be the same as the call stack, although that is the case in every compiler I have ever used, and when debugging some problems, it has been useful to understand something about the memory layout of these things.

I knew about the stack and the heap before I knew about C or C++, and I have a hard time using other terms. The mental pictures of the stack and the heap still serve me well, and when I ask my compiler to turn some piece of code into assembly language, it still looks like the stack and the heap are being used. I don't see why the terms used in the language description should be preferred.

I guess I just don't allow programming-language standards to determine how I talk about things that are larger than the programming language. For instance, a byte to me is 8 bits, but the standard specifies that a byte is whatever a char is, which could be something other than 8 bits. That might be the convention used in that document, but when I use the word "byte", I mean 8 bits.
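For reference, the standard's notion of a byte can be checked directly. This is only a sketch: `bits_in` is a helper name invented here, while `CHAR_BIT` comes from `<climits>`.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// sizeof counts in units of char, so sizeof(char) is 1 by definition.
static_assert(sizeof(char) == 1, "sizeof(char) is always 1");
// The standard guarantees at least 8 bits per byte, not exactly 8.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Hypothetical helper: size of T's object representation in bits.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}
```

On mainstream desktop platforms `CHAR_BIT` is 8, so the two meanings of "byte" coincide there.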

While I am in rant mode, I call my .c and .cpp files "modules", and not "translation units", and I don't know anybody who uses the latter.

As far as I understand (same as others), that should be invalid syntax. Variable 'i' should only be valid inside the for loop.

Also, even if it's valid syntax, I would never write that kind of code because it's so confusing. Indeed, I would not spend any of my time studying "goto". I will stop here; otherwise I will start a "goto" flame war.

https://www.kbasm.com -- My personal website

https://github.com/wqking/eventpp  eventpp -- C++ library for event dispatcher and callback list

https://github.com/cpgf/cpgf  cpgf library -- free C++ open source library for reflection, serialization, script binding, callbacks, and meta data for OpenGL Box2D, SFML and Irrlicht.

I don't know why I had the idea that the stack was dealt with inside a function, instead of only on a function level. I suppose language syntax and scoping rules made me assume it worked like that, but in retrospect, I don't think I've read anything to indicate it.

It's a sensible assumption about the implementation of the underlying C++ memory model. There is in fact nothing in C++ that mandates the use of a stack for anything. It is quite possible (provably so, because it's been done) to have a C compiler for hardware that does not support stack operations, and I imagine it's equally as possible to do so for C++. I'm not saying it wouldn't be a pain, but technically possible.

The very use of the common misnomers "on the stack" or "on the heap" when referring to automatic or free storage respectively can cause a good deal of confusion when people find out that's not how things really work.

I'm also not sure why you refer to these terms as misnomers. These terms were created for the exact purpose of describing the stack / heap allocation that exists on computers, and they are still applicable as far as I know. Trying to allocate a large enough local variable will cause a crash, as you run out of space on the stack, yet using new for the same size allocation on the heap works just fine, so there's still a definite difference between the two. It has been a couple of years since I took my OS design courses, and the knowledge there is a bit simplified compared to the real thing, but 'on the stack' and 'on the heap' should correctly describe what's actually going on.
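A rough sketch of the difference described above. The helper name `touch_large_buffer` is invented, and the exact limit for automatic storage is implementation-specific (often a few megabytes by default, but that is an assumption about typical platforms, not something the language defines).

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper: allocate n bytes (n >= 2) from the free store and
// write to both ends. Returns the sum of the first and last byte so the
// writes are observable.
int touch_large_buffer(std::size_t n) {
    // unsigned char big[16 * 1024 * 1024];  // as a local array, a buffer this
    //                                       // large may exceed the limit for
    //                                       // automatic storage and crash
    std::unique_ptr<unsigned char[]> big(new unsigned char[n]());  // zero-initialized
    big[0] = 1;
    big[n - 1] = 2;
    return big[0] + big[n - 1];
}
```

For example, `touch_large_buffer(16u * 1024u * 1024u)` requests 16 MiB dynamically, which normally succeeds where the commented-out local array might not.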

The major reason why I posted this is that I wanted to know the exact specifics of the stack management and how this specific case was handled. At any rate, I think the previous posts in this thread did describe the reasoning behind this fairly well.

I'm also not sure why you refer to these terms as misnomers

The C++ standard uses his wording -- "the stack" is automatic storage and "the heap" is the free store.
"The stack" and "the heap" are just colloquialisms carried over by C programmers. Arguing that these terms are wrong is a bit of a pedantic thing to do though, like pointing out that "The STL" is wrong and should just be called the standard library, instead of just quietly admitting that you know what people mean by these phrases ;) The point though, is that "the stack" isn't necessarily a stack, and "the heap" isn't necessarily a heap. They're abstract concepts in the C++ standard, and the implementations we're used to are concrete compiler decisions that fulfil those concepts.

This post is similar to the one above. I am not trying to beat a dead horse, but perhaps gain some understanding. So please point out if I am wrong about anything below (which is very plausible).

The point though, is that "the stack" isn't necessarily a stack, and "the heap" isn't necessarily a heap.

How could the automatic storage not be a stack? If you have a recursive function with parameters and local variables, there must be a stack of some sort somewhere to handle different calls to the function being "active" at any given time. It might not be the x86 stack, but it has to be a stack of some sort. And the large pool of memory from which you dynamically allocate portions is called the heap, and that is just a name, not a statement of a particular implementation of the concept.
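To make the argument concrete, here is a small sketch (function names invented for illustration) in which the pending calls of a recursive function are kept in an explicit last-in, first-out container. It only shows that recursion needs LIFO bookkeeping of some kind; it says nothing about how any particular compiler arranges it.

```cpp
#include <vector>

// Plain recursion: each active call has its own copy of n in automatic storage.
unsigned long factorial_recursive(unsigned n) {
    return n <= 1 ? 1UL : n * factorial_recursive(n - 1);
}

// The same computation with the pending calls kept in an explicit LIFO container.
unsigned long factorial_explicit(unsigned n) {
    std::vector<unsigned> pending;      // stands in for the chain of active calls
    while (n > 1) {
        pending.push_back(n);           // "call": remember this activation's n
        --n;
    }
    unsigned long result = 1;
    while (!pending.empty()) {
        result *= pending.back();       // "return": resume the most recent activation
        pending.pop_back();
    }
    return result;
}
```

Both functions give the same result for the same input, for instance 120 for an argument of 5.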

They're abstract concepts in the C++ standard, and the implementations we're used to are concrete compiler decisions that fulfil those concepts.

Although there is nothing technically wrong with that statement, I think it confuses the history of these things: A beginner will get the impression that the C++ standard was created in a vacuum and then people came along and settled on these particular implementations, as if by some sort of accident. The reality is that the implementations came first and these concepts were well established before the C++ standard was created. The C++ standard is simply a contract between the programmer and the compiler writer as to what things are guaranteed to work. I don't understand why the standard didn't use the prevailing terminology: Calling the heap "the free store" gives the impression that they are allowing more flexibility in the implementations than if they had called it "the heap", but this is not the case.

The code won't compile using gcc:

In function "int main()":
10: error: jump to label "looped"
17: error: from here
8: error: skips initialization of "int i"

[quote name='Álvaro' timestamp='1357819659' post='5019839']
How could the automatic storage not be a stack? If you have a recursive function with parameters and local variables, there must be a stack of some sort somewhere to handle different calls to the function being "active" at any given time. It might not be the x86 stack, but it has to be a stack of some sort. And the large pool of memory from which you dynamically allocate portions is called the heap, and that is just a name, not a statement of a particular implementation of the concept.
[/quote]
The point about these being misnomers is only that the standard doesn't actually use those names. "The stack" and "the heap" are what we all call them, but the standard doesn't call them that, so it's informal language.

The "call stack" does act like a stack (pushing and popping stack frames), and the stack data-structure can be implemented in different ways (e.g. we're used to the call-stack being an array, but it could also be a linked list). But perhaps on some particular machine it's possible to have a call-tree instead of a call-stack -- e.g. for co-routines -- such a machine could implement a standard C or C++ compiler, with its own co-routine extensions to the language to make use of its call-tree, which would allow for yield/resume as well as call/return, etc... It would likely use a linked-list based tree for automatic storage ("the call-stack") instead of an array-based stack.
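A toy sketch of the linked-list idea, assuming nothing about any real machine or compiler: each "activation record" is allocated separately and linked to its caller, so the frames need not be contiguous. `Frame` and `sum_to` are hypothetical names.

```cpp
#include <memory>
#include <utility>

// Hypothetical activation record: one node per active call, linked to its caller.
struct Frame {
    unsigned n;                      // the "local variable" of this activation
    std::unique_ptr<Frame> caller;   // link back to the calling activation
};

// Sum 1..n, emulating nested calls with individually allocated frames.
unsigned long sum_to(unsigned n) {
    std::unique_ptr<Frame> current;  // innermost active frame, empty when none is active
    for (unsigned k = n; k > 0; --k) {
        // "call": allocate a frame anywhere in memory and link it to its caller
        std::unique_ptr<Frame> callee(new Frame{k, std::move(current)});
        current = std::move(callee);
    }
    unsigned long total = 0;
    while (current) {
        total += current->n;                    // "return": use the frame's local
        current = std::move(current->caller);   // unlink it and resume the caller
    }
    return total;
}
```

The frames are still created and destroyed in last-in, first-out order here; a call-tree for co-routines would keep several such chains alive at once.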

"The heap" could be implemented as a heap, or many other structures. I'm not sure why we call the free-store "the heap"... maybe it was originally implemented as a heap data structure? Or maybe because there's a whole heap of bytes in it.

n.b. I did make the point that it's awfully pedantic to argue over whether we should call it the heap or the free-store, etc... I prefer the informal colloquialisms because they're more prevalent (it's what i was taught to call them in Uni, anyway).

[quote name='Álvaro' timestamp='1357819659' post='5019839']
How could the automatic storage not be a stack?
[/quote]

On the venerable TMS9900, subroutine calls are performed using the BLWP instruction (Branch and Load Workspace Pointer), which implements call/return semantics using what is effectively a linked list. No stack. None, nada, not a sausage. Variables of automatic storage duration are created in scratch RAM indexed by one of the 16 general-purpose registers. I'm not sure if a C compiler was ever available for that platform, but if there was, it would make more sense to use that native method than to try to hack one of the general-purpose registers for use as a stack pointer (call/return would still use BLWP, since there was no alternative).

Most calls on a SPARC were done by switching register sets, with arguments passed in registers and, if possible, automatic variables were also in registers. SPARCs had a lot of registers. That means, unless the parameters or locals were huge, subroutine calls were through a chained linked-list structure, not a stack. There was definitely a C compiler for the Unixes that ran on SPARCs.

If the size of the locals or arguments on a 68k or a PPC is small enough, only the registers are used and stack use is avoided by most optimizing compilers. The OP's code is an example of such small code.

To assume you have to use some sort of stack structure to implement automatic variables in C or C++ is just an invalid assumption. That's why it's a misnomer. The word means "improperly named", and it's an accurate description. Sure, you can go ahead and use the phrase "on the stack" to describe variables of automatic storage duration and most folks will know what you're talking about, but if you take the name literally and make assumptions about the underlying implementation, you're going to run into trouble. That's what happened here. That's why it happened here. I do not think it is pedantic or pretentious to explain why it's a misnomer and how it being a misnomer caused misunderstanding.

As to the phrase 'on the heap' when referring to the free store, there was never a heap in the technical computer-science sense of the word. The term originated when Unix was running on the DEC PDP-11 (as did 'on the stack'). This was before the era of virtual memory, and each process had its own memory region. The executable code was in one area, an area was designated for the stack, which grew "downwards", and the rest of the memory was reserved for allocations (using malloc(), short for "memory allocation"). The memory allocations and the stack grew towards each other. Because the free store was always drawn at the bottom of diagrams, depictions of memory allocations looked like a big pile of boxes thrown one on top of the other. Deallocations could occur in the middle, so eventually the diagrams would look like a big pile of stuff at the bottom. As Hodgman puts it, 'a whole heap of bytes'.

With the advent of virtual memory, this description no longer makes sense. It's still a misnomer. It doesn't hurt to use the slang, but don't expect to find any kind of heap if you look under the hood, even if it might be a reasonable design on some system. Don't expect to find the term in the standard, since that would require the use of a particular implementation where it might be inappropriate.
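As an illustration of a free store with no heap data structure in sight, here is a toy fixed-size block pool that tracks free blocks with a singly linked free list. It is only a sketch: `BlockPool` and `kNoBlock` are invented names, it hands out block indices rather than real memory, and it is not how any particular `malloc` or `operator new` is implemented.

```cpp
#include <cstddef>
#include <vector>

// Sentinel meaning "no block available" (a name chosen for this sketch).
constexpr std::size_t kNoBlock = static_cast<std::size_t>(-1);

// Hypothetical fixed-size block pool. Free blocks form a singly linked list
// threaded through next_; no heap data structure is involved.
class BlockPool {
public:
    explicit BlockPool(std::size_t block_count)
        : next_(block_count), free_head_(block_count > 0 ? 0 : kNoBlock) {
        for (std::size_t i = 0; i < block_count; ++i) {
            next_[i] = (i + 1 < block_count) ? i + 1 : kNoBlock;
        }
    }

    // Hand out the index of a free block, or kNoBlock if the pool is exhausted.
    std::size_t allocate() {
        if (free_head_ == kNoBlock) return kNoBlock;
        const std::size_t block = free_head_;
        free_head_ = next_[block];
        return block;
    }

    // Return a block to the pool by pushing it onto the front of the free list.
    void release(std::size_t block) {
        next_[block] = free_head_;
        free_head_ = block;
    }

private:
    std::vector<std::size_t> next_;  // next_[i] is the free block after i
    std::size_t free_head_;          // first free block, or kNoBlock
};
```

Blocks can be released in any order, which is the property that distinguishes the free store from automatic storage; the bookkeeping behind it is an implementation choice.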

Stephen M. Webb
Professional Free Software Developer

