Sign in to follow this  
ToohrVyk

Dynamic overloading

Recommended Posts

Consider the following C++ code:
class A{};
class B: public A{};
void Frobnicate(const A&) { std::cout << "A"; }
void Frobnicate(const B&) { std::cout << "B"; }

A* a = new B();
Frobnicate(*a);
delete a;
This displays "A", and not "B". Why did the language designers make this choice? Implementation issues notwithstanding, what other reasons could they have had for making this choice? While this approach is consistent with itself, I'm having trouble wrapping my mind around why compile-type binding by default (using the static type of variables) is necessary in a language.

Share this post


Link to post
Share on other sites
The compiler is just selfconsistent in this case..

Actually, this is not a polymorphic call, since these cannot be done except through a method (a function inside the class/object?)

Search for MSDN help on overloaded function resolution to see how the compiler chooses the right function that is overloaded.

In this case the one with const A& is an exact match.

Although I can't say whether they thought about it or not, to my own evaluation (if I were the designer), this is a case where the more general design must be taken..

The other behavior may be achieved through virtual member methods.

class A{
virtual void Frobnicate() { std::cout << "A"; }
};
class B: public A{}
{
void Frobnicate() { std::cout << "B"; }
};

void Frobnicate( const A& ){ A.Frobnicate; }

A* a = new B();
Frobnicate(*a);
delete a;



If they took the other way, I don't know how they could achieve the same generality.

Share this post


Link to post
Share on other sites
To rephrase my question better: why perform overloading based on the static type of the object reference, instead of polymorphic dispatch based on the dynamic type of the object itself? What purpose does the type of the object reference serve, that makes it worthy of being the overloading decision factor? Assuming that the language did behave as I proposed, what frequent construct would have been broken by the new behaviour?

How often does one need to make actions depend on the type of a reference, as opposed to the type of an object?

Share this post


Link to post
Share on other sites
All overloads are resolved at compile time, based on the parameters' static type; surely it would be nonobvious and unintuitive if this were not the case and *some* (but not all) overloads would instead be deferred to run-time?

virtual member functions are clearly marked as being "special". It is of course arguable whether the language should also support virtual non-member functions (but then we should probably support multiple dispatch too, for completeness' sake).

Share this post


Link to post
Share on other sites
Quote:
Original post by Sharlin
All overloads are resolved at compile time, based on the parameters' static type; surely it would be nonobvious and unintuitive if this were not the case and *some* (but not all) overloads would instead be deferred to run-time?


Why "some" ? I'm arguing against the use of static reference types in deciding what should be done. In my perfect world, the static type of objects is not used, only the dynamic type is. And I'm arguing that by doing this, no useful feature of the language is cropped.

Quote:
virtual member functions are clearly marked as being "special". It is of course arguable whether the language should also support virtual non-member functions


What's so special about virtual member functions? In an object-oriented context, I would find on the contrary that it is the non-virtual member functions that are "special".

Quote:
(but then we should probably support multiple dispatch too, for completeness' sake).


And the compiler would probably do a much better job than anyone else as far as multiple dispatch is concerned.

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
To rephrase my question better: why perform overloading based on the static type of the object reference, instead of polymorphic dispatch based on the dynamic type of the object itself? What purpose does the type of the object reference serve, that makes it worthy of being the overloading decision factor?


How do you propose the compiler handle this?


// Translation unit 1
class A {};
class B : public A {};

void Frobnicate(const B&) { std::cout << "B"; }

// Translation unit 2
class A;
class B;

void Frobnicate(const B&);

void func(A* a)
{
Frobnicate(*a);
}


Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
Why "some" ? I'm arguing against the use of static reference types in deciding what should be done. In my perfect world, the static type of objects is not used, only the dynamic type is. And I'm arguing that by doing this, no useful feature of the language is cropped.


Static typing is the "default" in C++ because of its history as "C with classes" and its "you don't pay for what you don't use" paradigm — dynamic dispatch is more of a "special" feature brought in to support OO. The simple rule of thumb is that overloading is "fast", dynamic dispatch "slow".

Besides, generalizing dynamic typing in the manner you describe would probably necessitate generating RTTI for every single class in the program, and a pointer to some kind of a generalized vtable structure in every single object ever created. This is clearly an unacceptable breach of the C++'s core philosophy.

Quote:
And the compiler would probably do a much better job than anyone else as far as multiple dispatch is concerned.


Agreed.

Share this post


Link to post
Share on other sites
Quote:
Original post by joanusdmentia
Quote:
Original post by ToohrVyk
To rephrase my question better: why perform overloading based on the static type of the object reference, instead of polymorphic dispatch based on the dynamic type of the object itself? What purpose does the type of the object reference serve, that makes it worthy of being the overloading decision factor?


How do you propose the compiler handle this?

*** Source Snippet Removed ***


It depends on the language philosophy. A language like C++ prevents instantiation of abstract classes or conversion of base-to-derived pointers, because allowing these might lead to incorrect behaviour (calling a pure virtual function, causing a type mismatch). In this philosophy, since func might be called on an A* that points to something other than B for which no overload is known to exist (a brother or parent of B), such a call would be prevented.

In a more flexible language, such as a slightly more lenient version of C# (or even PHP if it had such things as user-defined types and overloads), the compiler would let the function be called, and throw an incorrect type exception if, at runtime, no correct overload is found.

My preferred level of safety on this issue is to consider such functions as being part of the class' signature. If Frobnicate was a member function instead of a free-standing function, the compiler could rightfully refuse the call to it, because the object of type A has no such function. The problem of translation units (where Frobnicate(const A&) might exist in a third, unmentioned translation unit) is solved by deferring this check to the linking stage: the compiler assumes that an overload of Frobnicate exists for A or a supertype of A, until it's proven wrong by the linker, at which point it would be allowed to complain. Java classes can call each other without knowing each other's signatures at compile time, and C# handles partial classes pretty well - both are very similar to this proposal, with the sole exception that I consider a free-standing function instead of a member.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sharlin
Besides, generalizing dynamic typing in the manner you describe would probably necessitate generating RTTI for every single class in the program, and a pointer to some kind of a generalized vtable structure in every single object ever created. This is clearly an unacceptable breach of the C++'s core philosophy.


Not necessarily. I think we both agree that if dynamic dispatch is wanted, then it is philosophically consistent to create a vtable (or enlarge the existing vtable). Where our disagreement stands is the situation where dynamic dispatch is not needed. To summarize the situation:


B b;
Frobnicate(b); // I want this to display "B"
A* a = &b;
Frobnicate(*a); // I want this to display "A";


My first constatation is: I cannot imagine a situation where this behaviour would be wanted. My second constatation is: if it is really necessary to use this behaviour, why not simply constrain it by using types where dispatch is not possible (because subtyping does not exist) ?


void pFrobnicate(B*) { std::cout << "B"; }
void pFrobnicate(A*) { std::cout << "A"; }

B b;
pFrobnicate(&b); // Displays "B"
A* a = &b;
pFrobnicate(a); // "A"


The code is not much more complex than it was before (although the pass-by-pointer is somewhat ugly), but since this code is somewhat rare the effect should not be extremely important. Conversely, it removes the need for implementing functions which DO need dynamic dispatch by adding a layer of member functions.

Or, more simply, use the approach currently used for marking "virtual" functions. By marking "non-virtual" functions instead, it would be possible to use dynamic dispatch only when required, thereby causing a much smaller breach of the philosophy (the only problem with this being that it's opt-out instead of opt-in, but then again, so are exceptions on most compilers).

Share this post


Link to post
Share on other sites
After working on this question some more, I have isolated two potential problems.

The first problem here is that such functions interact badly with namespaces. Consider for instance a function on Base defined in namespace Foo, and an overload on Derived defined in namespace Bar. The Derived overload should be called, if both namespaces are used. What overload should be called if both namespaces are used? If only Foo is used? If only Bar is used?

This problem can be solved by requiring that all overloads be part of the same namespace (therefore allowing the insertion of a method into a namespace as part of this approach). So, Foo::message can only be overloaded by Foo::message, and Bar::message is a different message altogether.

The second problem is that of type-checking. I had mistakenly assumed that types would form a lattice, which is clearly not the case. As mentioned in another thread, checking that the set of overloads is nonambiguous is quite time-consuming.

A third problem would be that someone, somewhere, told me an article exists which claims to prove that such overloading causes conceptual computability problems in certain areas — I haven't been able to isolate the problem or find the article yet.

Share this post


Link to post
Share on other sites
Unrestricted overloading C++-style is problematic, hence G'Caml and typeclasses in Haskell. This might not be such a problem if your language is not type-inferred, I can't remember the details.

Share this post


Link to post
Share on other sites
Quote:
Original post by Rebooted
Unrestricted overloading C++-style is problematic, hence G'Caml and typeclasses in Haskell. This might not be such a problem if your language is not type-inferred, I can't remember the details.


I am considering a language which is not type-inferred (Java would be a good example of language where I'd apply this).

Share this post


Link to post
Share on other sites
Wait, wait, wait.

Do you ask why C++ doesn't perform multiple dispatch? I think the main reason is performance - outlined by the "pay for what you use" philosophy. You example is quite simple, but if it was allowed, the language should also allow
struct S
{
void frobnicate(const A& a);
void frobnicate(const B& b);
};

A *a = new B;
S s;
s.frobnicate(*a);

It would be very inconsistent if it disallowed such code and allowed yours.
Then it should also accept making those method virtuals, and this suppose some kind of introspection (although it can be limited to knowing the type of the parameter) - but it would allow double dispatch and solve the conceptual prblems that I have with the visitor pattern, which would be quite cool :)

Since such construct would be valid C++, it means that the overhead produced by the RTTI system would not be optional (as it currently is) but systematic. This would not be very consistent with the language design philosophy.

Hey, that's why C++ is an OO language, not a full blown object language [smile]

Beside that, I'm not sure that it pose much problems with namespaces. An extended form of Koenig's lookup can come to the rescue, or maybe another carefully constructed lookup rule. Anyway, defining how namespace and this interact solves the namespace problem. I'm also quite confident that multiple dispatch does not lead to conceptual computability problems - if it were, we would be in trouble with languages such as Nice (where this technique is implemented).

Did I miss something important (I felt I have [wink])

Share this post


Link to post
Share on other sites
Quote:
Original post by Emmanuel Deloget
It would be very inconsistent if it disallowed such code and allowed yours.
Then it should also accept making those method virtuals, and this suppose some kind of introspection (although it can be limited to knowing the type of the parameter) - but it would allow double dispatch and solve the conceptual prblems that I have with the visitor pattern, which would be quite cool :)


The compiler could implement the visitor pattern itself, since it's pretty easy to determine which functions are virtual and which aren't. This would add no overhead to constructs which already exist (my point is that almost nobody uses this "lack-of-feature" in C++) but allow more behaviour.

Quote:
Since such construct would be valid C++, it means that the overhead produced by the RTTI system would not be optional (as it currently is) but systematic. This would not be very consistent with the language design philosophy.


The RTTI is not necessary, because all that information can be safely added to the vtable in a visitor-pattern approach. It would, however, add a vtable to objects without virtual functions if they ever are used polymorphically (but that should come as no surprise).

Quote:
Beside that, I'm not sure that it pose much problems with namespaces. An extended form of Koenig's lookup can come to the rescue, or maybe another carefully constructed lookup rule. Anyway, defining how namespace and this interact solves the namespace problem.


Lookup here does not work, since the behaviour of a dynamically-bound function call must be the same everywhere in the program, regardless of visible namespaces. Although I agree it can be solved.

Quote:
I'm also quite confident that multiple dispatch does not lead to conceptual computability problems - if it were, we would be in trouble with languages such as Nice (where this technique is implemented).


There is at least one problem, which is static verification: given the set of all possible implementations of a message, how to prove that no call will ever be ambiguous? If D is-a B, and a function f(B,B) with overloads f(D,B) and f(B,D) is called on two D's, which overload should be used? Additional constraints: handle a potentially infinite number of types, instead of only a finite lattice, by using first-class functions and parametric types.

I'm pretty sure that Nice (or CLOS, or several others, for that matter) do not perform static checking but instead either rely on runtime dispatch or resolve ambiguities arbitrarily.

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
To rephrase my question better: why perform overloading based on the static type of the object reference, instead of polymorphic dispatch based on the dynamic type of the object itself?


1) Because C++ is focused on static typing and verification (everything has a clear mapping at compile time - there's no possibility for throwing based on the absense of a function without implementing such functionality yourself and depending on OS specific dynamic library APIs)

2) For consistency with pointers, which are consistent with C's pointers - for obvious reasons.

Fortunately, there are various alternatives to get your desired behavior for you to pick from:

1) visitor pattern
2) chain of dynamic_cast conditionals
3) chain of type_info conditionals
4) associative map of type_info wrappers
5) member virtual function
6) using a different (more dynamic) language

I've picked #6 personally. The language I chose dosn't even *have* static type info :D.

Share this post


Link to post
Share on other sites
Quote:
Original post by MaulingMonkey
1) Because C++ is focused on static typing and verification (everything has a clear mapping at compile time - there's no possibility for throwing based on the absense of a function without implementing such functionality yourself and depending on OS specific dynamic library APIs)


The validity of an overload (both that the function exists, and that it is nonambiguous) can be done at linking time, in less time complexity than it takes to actually build the dispatch tables.

On the other hand, dispatch tables are quadratic in size (as consistent with their visitor-pattern implementation).

Quote:
2) For consistency with pointers, which are consistent with C's pointers - for obvious reasons.


I agree. Damn pointers!

Quote:

I've picked #6 personally. The language I chose dosn't even *have* static type info :D.


I'm also in the process of building and then picking #6. Out of curiosity, which language do you use?

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
Quote:

I've picked #6 personally. The language I chose dosn't even *have* static type info :D.


I'm also in the process of building and then picking #6. Out of curiosity, which language do you use?


Ruby.

C++ interoperability currently sucks, unfortunately. Their C headers are macro soup in places, so you can't for example use SWIG to make a C++ class available to ruby without first #undefing twenty bazillion things like "close", "open", "write", etc. - at least, assuming you want to use a standard header like, say, <iostream>.

That said, the language just feels so damn natural compared to what I'm used to that I'm currently taking a stab at writing my own interpreter for it again. We'll see how that turns out. It will of course offer much better C++ integration should it ever be finished.

It's worth noting Ruby is a lot more advanced at interoperating with other languages than it is with C++. Funny story: I had no problems using Ruby to load Python modules written using boost::python in C++.

Share this post


Link to post
Share on other sites
Quote:
Original post by ToohrVyk
Quote:
Original post by MaulingMonkey
1) Because C++ is focused on static typing and verification (everything has a clear mapping at compile time - there's no possibility for throwing based on the absense of a function without implementing such functionality yourself and depending on OS specific dynamic library APIs)


The validity of an overload (both that the function exists, and that it is nonambiguous) can be done at linking time, in less time complexity than it takes to actually build the dispatch tables.


Just to expand on this also: Not for delay-loaded libraries that may or may not introduce new classes and/or overloads.

Share this post


Link to post
Share on other sites
The case of adding new classes is quite irrelevant, since their addition does not alter the validity of the message implementations, and can be statically verified as if the library did not exist at all (even if these classes inherit from library classes). The same is true for the creation of new messages that do not exist in the dynamic library.

The problem arises from the creation of a new implementation for a message that already exists in the library. First, the verification algorithm is incremental: this means that the message implementations in the executable can be verified as they are inserted into the library. This verification takes linear time in the number of implementations for the given message in the worst case (but most often it is done in constant-time). Additional constraints on the structure of the loaded library might eliminate the need for individual verification altogether.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this