My question derives from a problem caused by GCC's name decoration (name mangling). When GCC compiles a program (before the linking stage), how does it determine that a function's return type can be converted to the type of the object it is being assigned to?
Ok, that question right away has some ambiguities in it (some may interpret my question as: "Well, a float can be truncated or rounded in value and stored in an int, or an int can be cast to a float..."). That's not what I'm asking. Let me elaborate. (It's going to sound a little rambling, but I think that's necessary to show where I'm coming from and help anyone reading understand where my confusion is. I could probably trim a line here or there, but I really think most of it is needed, unfortunately.)
I've put some code up on ideone.com located here: http://ideone.com/pKty2
The problem caused by the name decoration system used by GCC (and that it somehow works around):
My confusion stems from the fact that return types are not rolled into decorated names (they are with Microsoft's C++ compiler, though; not sure about the .NET compilers...). Anyhow, on line 31 we have an expression -- more specifically, a line that translates into an assignment operator call. According to the Annotated C++ Reference Manual, leaving the return type out of the decorated name is meant to catch certain errors; what it does not do is elaborate on which ones. What I get from its example is that the design choice was a reaction to a problem caused by rolling the return type into the decorated name. More specifically: doing so would make identical function names with identical argument lists produce unique decorated names, since the differing return types would become part of the whole decorated name and make the entire name different. (The book doesn't state those last few bits -- a major blunder, imho, since that's exactly the problem it was trying to avoid; it only states what was done and gives some code, leaving you to legitimize the design choice by figuring out what the problem would have been.)
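To make that concrete, here is a tiny example of my own (these names are hypothetical, not from the ideone code):

```cpp
// Two declarations that differ only in return type. GCC mangles the first
// to roughly _Z8getValuev; the return type appears nowhere in that symbol.
// If return types were rolled into the decorated name, the second
// declaration would get its own distinct symbol, even though C++ forbids
// overloading on return type alone.
int getValue();
// float getValue();   // error: declaration differs only in return type

int main() { return 0; }
```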
A worked example that progresses in complexity:
Let's start out easy and look at a plain, simple function call with no parameters and no assignment. (Rhetorical question) How does the compiler get the correct function? Through the decoration of the scope plus the name plus the arguments; it then goes down the symbol table matching the signature, grabs the address, spins up a function call thunk (might be "stub"; not sure of the proper terminology here), and emits it in the intermediate representation. No problem; pretty straightforward, pretty simple and easy. Now let's throw some arguments into the mix. Again, it looks up the function matching the signature, no problem.
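As a minimal sketch of that simple case (the names are my own, not from the ideone code):

```cpp
#include <cstdio>

namespace demo {
    // GCC decorates this as roughly _ZN4demo4pingEi: namespace demo, name
    // ping, parameter int. The void return type appears nowhere in it.
    void ping(int x) { std::printf("ping %d\n", x); }
}

int main() {
    // To resolve this call, the compiler matches scope, name, and argument
    // types against the declarations it has seen -- the signature lookup
    // described above.
    demo::ping(42);
    return 0;
}
```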
However, going back to the original problem, let's say there is a single function (in a class or not -- kind of irrelevant for the purposes of demonstrating my question), just to make it easier. And let's say that function has no arguments, so its argument list is "v" for void. Now we go back to the assignment statement. Somehow the compiler knows whether or not that function (matching the scope, name, and argument list) has a return type acceptable to the L-value, despite the return type not being rolled into the decorated name. My question is: how?! Please note: this question is irrespective of whether an int can be cast to a float, a float truncated to an int, etc. My question is how the compiler knows the return is an int or a float -- or, in the more complex example, a compound object (a struct, class, etc.), which I suppose would use the same mechanism as determining whether it is a first-class/primitive type (int, char, bool, float, etc.).
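Here is the exact situation I mean, as a minimal sketch of my own (again, hypothetical names, not the ideone code):

```cpp
struct Widget { int data; };

// No arguments, so the mangled argument list is just "v" for void. GCC
// decorates this as roughly _Z10makeWidgetv; Widget is not encoded in it.
Widget makeWidget() { Widget w = { 7 }; return w; }

int main() {
    Widget w = makeWidget();   // accepted: the return type fits the L-value
    // int n = makeWidget();   // rejected: no conversion from Widget to int.
                               // But how does the compiler know the return
                               // type, given the decorated name omits it?
    return w.data;
}
```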
My work towards trying to figure this out:
I started thinking about the Cartesian Product Algorithm (CPA), making a superset of smaller subsets, etc. But then I realized: wait a minute, that doesn't answer the question, because if I'm able to use the CPA, then I've implicitly already discerned the type returned by that function. Then I thought of the typical binary trees. Then I realized: "Well, if that's the case, then the return type IS being referred to on some level, at some point, somehow -- just not in the decorated name." Then I tried looking at the GCC code; that didn't go so well, heh. I can read code just fine, but the formatting, imho, was just about the most hideous I had ever seen. I've done a bit of research on it, but I haven't found an answer to how the compiler knows to kick out an error on an assignment statement without knowing the function's return type (or maybe it does know it?). I know it's obviously not in the decorated name, as per the Annotated C++ Reference Manual and several sites I found online detailing the naming scheme. I've even tried checking the dragon book. This is what I found it says:
p. 391: (Continued from p. 390 on "6.5.3 Overloading of Functions and Operators")
"It is not always possible to resolve overloading by looking only at the arguments of a function. In Ada, instead of a single type, a subexpression standing alone may have a set of possible types for which the content must provide sufficient information to narrow the choice down to a single type"
p. 423:
- "Function types. The type of a function must encode the return type and the types of the formal parameters"
Problem 1) Unfortunately, the dragon book is heavily based on designing a Java-like language, which may have different internal semantics than, say, GCC's C++.
Problem 2) The first snippet references Ada. I'm not interested in Ada; I'm interested in C++.
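For what it's worth, that second snippet does seem to line up with what C++ itself exposes: the static type of a function carries its return type even though GCC's decorated name does not. A minimal sketch of my own (hypothetical names, not the ideone code):

```cpp
#include <type_traits>

float getRatio() { return 0.5f; }   // decorated name: roughly _Z8getRatiov

int main() {
    // The function's type is "float ()"; a pointer to it has type
    // "float (*)()", which plainly encodes the return type even though the
    // mangled symbol does not.
    float (*fp)() = getRatio;       // fine: pointer type matches exactly
    // int (*bad)() = getRatio;     // error: float() does not convert to int()

    static_assert(std::is_same<decltype(getRatio), float()>::value,
                  "the declared type of getRatio includes its return type");
    return (fp() > 0.0f) ? 0 : 1;
}
```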
Then I started thinking that maybe it can tell whether the types are equivalent based on the memory size of the types, but then I realized, again: I'd have to first know the type being returned by the function. There are plenty of solutions I can come up with (essentially all boiling down to throwing more information at the problem, or rolling the return type into the decorated name -- which is itself just throwing more information at the problem), but I kind of, personally, consider GCC the de facto standard, not Microsoft, not Borland, etc. (though they are very respectable compilers, and I don't have a single thing against them). There's just still a piece of the puzzle missing... Any thoughts on this would really be appreciated! I'm kind of at a standstill until I figure this out...