The Last Quiz Had a Point?

Published February 03, 2006

So what was the point of the last quiz?


Well, that's a fairly short one to answer: C++ is a hard language to learn.

Not a single person got #1 completely right (except Promit).

Let's look at the For Beginners forum for a second. Count how many posts you see where the person is asking "what language should I learn?" Finished yet? You won't be finished anytime soon, so let me just give you the answer: A lot.

Now looking at the answers I see an overwhelming majority are: "Learn C++ because... (Various crap)." One of the major contenders for that crap title is: "Because that's what is used in the industry". Having dealt with industry professionals in other mediums before, I can say that they would have scored about as well as most of you who participated in my little quiz did. Learning C++ as a first language is usually a bad choice, IMO, mostly because you get bogged down in the details of the language instead of learning the concepts of programming.

Now looking over the results, there is one question that everyone got wrong (except one person, who didn't answer it because we had already discussed it previously): the first one. Think about what the question is asking. It seems so obvious to you, so clear cut and simple. And yet the behavior of the second line is completely undefined. It can do anything. However, the majority of you answered that it would just result in a pointer to . Why is this? My best guess is the following: You have been taught pointers wrong. At some fundamental level, you have been taught that a pointer is just an integer, and as such operations like adding a number to an integer can't really have any side effect unless you use that integer elsewhere (which that pointer doesn't do). But that is not true. Pointers are NOT integers; as such they don't have all of the properties of integer addition in C++.

Continuing from the first question, here is other pointer behavior that would be undefined: int* l = p - 1;. The only legal range for a pointer that deals with p is: [p, p + 10]. Anything further and you're in the realm of undefined behavior. One might be tempted to say: "Well... on system XYZ that code works fine." But that's not the point of the question. While it may result in predictable behavior on some systems, that doesn't mean it will on all systems. It's just like relying on the VC6 for loop scope issue.
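To make the legal range concrete, here is a small sketch. It assumes, as the range [p, p + 10] above implies, that p points at the first of ten ints; the function name sum_ten is made up for illustration and is not part of the quiz.

```cpp
// Hypothetical helper: sums the ten ints that p points at.
// Assumption: p points to the first element of an array of exactly 10 ints.
int sum_ten(const int* p) {
    const int* end = p + 10;   // one past the last element: legal to form and to compare
    int total = 0;
    for (const int* it = p; it != end; ++it) {
        total += *it;          // it stays inside [p, p + 10), so dereferencing is fine
    }
    // Dereferencing end itself would be undefined.
    // Merely forming p + 11 or p - 1 would be undefined too, even if never dereferenced.
    return total;
}
```

The one-past-the-end pointer is only ever compared against, never read through, which is the same contract an end() iterator gives you.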

There is a heck of a lot of undefined behavior in C++. Simple things that may seem to be entirely legal can easily fall into the realm of undefined behavior. The classic example of this would be: int h = 10; h = h++;. Think about what you would expect that line to result in. Most people "in the know" will say something like 10, or maybe even 11. Someone who is really in the know will say: "Undefined behavior." You could almost say: You have a better chance of finding a C++ program relying on undefined behavior than of finding one that is perfectly compliant. Scary thought, that one.
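Whatever h = h++ was meant to do, the intent can be written so that h is modified exactly once per statement. A minimal sketch follows; the function names are invented for illustration. (Revisions of the standard from C++17 on tightened the ordering rules for assignment, but spelling the intent out reads better under any revision.)

```cpp
// Hypothetical helper: increments h and returns the new value.
int bump(int& h) {
    ++h;            // the only modification of h in this statement
    return h;
}

// Hypothetical helper: increments h but hands back the value it had before.
int bump_keep_old(int& h) {
    int old = h;    // read first...
    ++h;            // ...then modify, in a separate statement
    return old;
}
```

Starting from h = 10, bump leaves h at 11 and returns 11, while bump_keep_old leaves h at 11 and returns 10. Neither outcome depends on the compiler.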

Comments

evolutional
I like your point.
February 03, 2006 10:51 AM
_the_phantom_
ah, and now the point of the last post becomes clear... and I need to go and revise pointers [sad]

and yes, C++ is a terrible language to learn first, unfortunately between the perceptions of 'its what used in the industry', 'its fast' and other such myths and rumours its all newbies wants to learn..
February 03, 2006 11:05 AM
Cypher19
A valid point, I admit that, but how many people even think of accessing vars outside the array range of p as a purposeful solution? Even IF they don't know that just incrementing p there is invalid, surely they know at least that "we don't know what's there" or "we could be, dangerously, modifying data there allocated by some other variable or program" and so on, right?
February 03, 2006 11:45 AM
jollyjeffers
1 - 0 to Washu

Quote:C++ is a hard language to learn
I thought this might be the point of the quiz. I've often thought that one of the key things that differentiates between myself and n00b is that I know what I don't know. Stupid as it sounds, I know that despite using the language regularly for 3-4 years there are still large parts (or finer details) that I don't fully understand or simply don't know about.

I've also defended on a number of occasions my choice (and the choice of others) to start with something like VB6. I looked at C++ early on and it made my head hurt, so I went with VB4 (then 5 and 6) because I could focus on making the program rather than writing complicated syntax and looking at confusing errors [smile]

Quote:between the perceptions of 'its what used in the industry', 'its fast' and other such myths and rumours its all newbies wants to learn..
I could draw a similar analogy (as I'm sure you could) about n00b's downloading the DirectX SDK and wanting to jump straight to High Dynamic Range Rendering and Advanced Per-Pixel Lighting Models before they've even worked out what a depth buffer (etc..) is [wink]

I mean, every commercial game has funky normal-mapping and dynamic soft shadows. Can't be that hard if everyone else is doing it...

Jack
February 03, 2006 11:47 AM
Fruny
One has to realize what "undefined behavior" means to compiler implementers. For every situation, the standard may essentially say one of three things.

  • Mandate specific treatment.
  • Require implementation-defined behavior.
  • Leave behavior undefined.


In the first case, implementers must properly detect the situation and follow the requirements to the letter.
If the standard states that something is an error, the compiler must properly flag that error and reject the program as ill-formed. This category obviously represents the bulk of the document, given that the intent is to specify the language. [smile]

In the second case, implementers must still properly detect the situation, but may treat it however they want.
Take RTTI type names, for example: typeid(something).name() -- the standard mandates that this must return a string naming the type of something, but the actual strings are implementation defined (the standard does not even promise that they are unique). GCC and Visual C++ each return different strings. This category has significant implications on code portability. You shouldn't hard-code such type names, just like you shouldn't hard-code type sizes, nor rely on a specific number representation ... unless you are willing to restrict code portability to a specific version of a specific compiler.
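As a sketch of the portable alternative: compare the type_info objects themselves and leave the name strings alone. Shape, Circle and is_circle are hypothetical names chosen for this example.

```cpp
#include <typeinfo>

// Hypothetical polymorphic hierarchy, only here to give typeid something to inspect.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};

// Portable: compares type_info objects, never the implementation-defined name strings.
bool is_circle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}
```

Because Shape has a virtual function, typeid(s) reports the dynamic type of the object, so the comparison works through a base reference on any conforming compiler.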

Which leaves us with undefined behavior. Either explicit, when the standard states that (I paraphrase) "dereferencing an invalid pointer has undefined behavior", or implicit when the standard just doesn't mention some issue.

What it means to the implementer is "don't bother about that case". They do not have to write code to treat it. Nor even to detect it. In fact, in quite a few cases, it might just be impossible to even detect that something bad happened, or detecting it would completely change the nature of the language (giving rise to, e.g. C++/CLI, where invalid pointers do not happen), be too slow, etc.

Or, to rephrase that, "you can assume this will never happen, users are not supposed to do this. If they do, it's their problem, not yours".

If you were to write a specification for most beginners' C++ programs, in quite a few places, you would end up specifying "FOO has undefined behavior". Why? Because they didn't do error checking and whatever happens when you do "FOO" depends on whatever default code paths get followed in such a case.

Think, for example, about what happens when you write code intended to read in a number, but somebody writes text instead. If the program specification had said "Entering text here is an error", the program would be required to do error checking. If the program specification says "Entering text here has undefined behavior", then anything goes.
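For comparison, here is roughly what the "Entering text here is an error" version of that specification could look like in code. This is only a sketch; parse_int is a made-up helper, and rejecting trailing junk is a design choice of this example, not something the text above requires.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: tries to read a whole int from text.
// Returns true and stores the value in out on success;
// returns false and leaves out untouched otherwise.
bool parse_int(const std::string& text, int& out) {
    std::istringstream in(text);
    int value = 0;
    if (!(in >> value)) return false;   // not a number at all, e.g. "abc"
    char extra;
    if (in >> extra) return false;      // trailing junk after the number, e.g. "42abc"
    out = value;
    return true;
}
```

The caller is forced to look at the return value, so the "somebody typed text" case has a defined outcome instead of whatever the default code path happens to do.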

Now, that probably does not mean that the program is going to go through your hard drive and erase all your files -- unless that undefined behavior happens within a file manipulation routine or unless the implementer was deliberately malicious, but what happens truly depends on how the program was written. Were the program to erase your files, it would still be correct according to the specs. You probably don't want nuclear missile control systems that rely on undefined behavior. Starting WW3 would be a possibility (and have fun telling people that it is perfectly acceptable behavior according to the specs).

Consider what happens (undefined!) when you write past the end of an array. The OS might stop you (unless you *are* writing the OS, in which case you're on your own, pal), you might overwrite part of your application's code, overwrite other variables, corrupt your stack. These are all things that are external to the C++ specification itself: you might not have an OS, your architecture might be weird... Heck, in fact, precisely what happens depends on how the CPU will map the values in memory to opcodes. None of this is under the control of the compiler writers. Nor should it be.
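If you want a defined outcome for an out-of-range write, you have to ask for the check yourself. One way, sketched below with a made-up helper called checked_write, is std::vector's at(), which throws std::out_of_range instead of touching memory past the end.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: writes value at index if it is in range,
// and reports failure (without writing anything) otherwise.
bool checked_write(std::vector<int>& v, std::size_t index, int value) {
    try {
        v.at(index) = value;   // at() throws std::out_of_range rather than writing past the end
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

Plain v[index] skips that check, which is exactly the "don't bother about that case" trade-off described above: faster, and undefined when the index is bad.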

Take another commonly misinterpreted C++ issue that has undefined behavior: polymorphically destroying an object which has a non-virtual destructor: Base* ptr = new Derived; delete ptr;. You often see people claiming that it will only call the Base destructor. That may effectively be what happens because of how classes are implemented in your compiler, but that might change from one version of the compiler to another. It might even change if you modify the compilation options.
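The well-defined version of that snippet only needs a virtual destructor in the base class. In the sketch below, the counter g_destroyed and the function destroy_through_base are illustrative additions so the effect can be observed.

```cpp
// Counts destructor calls so the effect is observable; purely illustrative.
int g_destroyed = 0;

struct Base {
    virtual ~Base() { ++g_destroyed; }   // virtual: delete through a Base* is well-defined
};

struct Derived : Base {
    ~Derived() { ++g_destroyed; }
};

// Creates a Derived, destroys it through a Base*, and returns how many destructors ran.
int destroy_through_base() {
    g_destroyed = 0;
    Base* ptr = new Derived;
    delete ptr;                          // runs ~Derived first, then ~Base
    return g_destroyed;
}
```

With the virtual destructor in place, both destructors run and the function returns 2 on every conforming compiler. Remove the virtual keyword and the standard no longer says anything about what delete ptr does.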

Now, compiler writers are (mostly) rational beings, and with a bit of knowledge about data structures and C++ specifics, you can probably infer how they are doing things internally. This can lead to clever user hacks, which are unportable abominations from a theoretical point of view, but may be reasonable in practice -- because compiler writers are not completely insane, nor malicious, and though behavior is undefined, common problems generally end up being solved in similar ways (at least until someone does something clever).



Deliberately relying on undefined behavior in your program is bordering on insanity. You are, after all, reverse-engineering the compiler. But relying on undefined behavior because you can't be arsed to do things correctly is just plain irresponsible.
February 03, 2006 12:01 PM
ApochPiQ
Interesting. This quiz completely backfired for me. When I first read the questions, my thoughts were, roughly:

1. a) No, it's undefined. Might get lucky on certain systems, but it is strictly illegal.

1. b) Who the heck knows. The question is moot because line 2 isn't even valid.

1. c) Arithmetic and comparison. Dereferencing would be foolish.


2. Undefined. You may get a number, you may get a pie, you may get eternal chaos.

3. The comma operator is the spawn of Satan. If I was forced at gunpoint to figure out that line of code, I'd look up the operator semantics in a good reference before answering. Then, when nobody was looking, I'd rewrite that beastly line into something more readable and less prone to bad bugs. (Incidentally, if anyone has some good examples of the comma operator making code cleaner, I'd be very interested in them. I've yet to see any.)

4. Potential memory leak, potential reliance on order of construction.



Those were going to be my answers, but I decided to cheat and read the other replies first. After seeing a good half-dozen posts indicating that I'd gotten something wrong in question 1, I figured I was way out in left field, and didn't reply at all.

Just goes to show that counting on other people to know the language can be risky [smile]
February 03, 2006 01:14 PM
MustEatYemen
Something about Opening mouth and removing all doubt?
February 03, 2006 02:03 PM
Mithrandir
<blissful tone>Ahhhh, beautiful beautiful C#. Thanks to you, I don't ever have to worry about this shit ever again!</blissful tone>
February 03, 2006 02:19 PM
Rob Loach
Write a book... Seriously.
February 03, 2006 11:44 PM
demonkoryu
Yes, book²

Ok, I thought I was firm on C++... how wrong was I. I have read the replies made, but I will be honest with what I knew:


1.
a) I would have thought it is valid. Like an end() iterator pointing just behind the last element.
b) To the int after the last array element.
c) Arithmetic and comparison. Dereferencing would be foolish.

2. It should (if I was to decide) print 101010. Of course, it is undefined.

3. OMG I first thought this has to do something with functions... but it's only the lowly comma operator.

4. Possible memory leak if either constructor throws.

(Template ripped from ApochPiQ)

So the score is... 3 right, 3 wrong... hmm... yah...
I love ruby[totally]
February 08, 2006 07:35 AM