Glak

Making a new language: what are the ramifications of this feature?

I'm making a programming language that has a lot in common with C++. One difference is that I'm going to move the virtual pointer into the pointers and references. This will make it easier to do multiple dispatch and possibly some higher-level stuff. Actually it won't be a virtual pointer but a type id, though it will play the same role. In C++ you only get polymorphic behavior through pointers and references, and I am fine with that limitation in my language.

Am I missing anything? Is there going to be something that I can't implement? Anything that will be counterintuitive? I have already thought of one situation without a clear answer. Using C++ code to illustrate:

struct Base {};
struct Derived : Base {};
Base* p = new Derived(); // p has two members: an address and the type id of Derived
Base** pp = &p; // pp has two members: the address of p and either the type id of a Base pointer or the type id of a Derived pointer

I am thinking that I will go with the first option, as I think that it will be easier to implement and result in faster code; however, it might be unintuitive.
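
To make the two-word pointer concrete, here is a minimal C++ sketch of the layout described above. The names `FatPtr`, `TypeId`, `pointToDerived`, and `addressOf` are hypothetical and invented for illustration; they are not the language's actual design. The sketch models only the first option, where `pp` is tagged with the static type "pointer to Base".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type ids; a real compiler would generate these.
enum class TypeId : std::uint32_t { Base, Derived, PtrToBase, PtrToDerived };

struct Base {};
struct Derived : Base {};

// A "fat" pointer: the address plus the dynamic type id of the pointee.
template <typename T>
struct FatPtr {
    T* addr;
    TypeId type;
};

// Models "Base* p = new Derived();": the pointer records that the pointee is a Derived.
inline FatPtr<Base> pointToDerived(Derived& d) {
    return {&d, TypeId::Derived};
}

// Models "Base** pp = &p;" under option 1: the tag is the static type of p.
inline FatPtr<FatPtr<Base>> addressOf(FatPtr<Base>& p) {
    return {&p, TypeId::PtrToBase};
}
```

Under this choice the Derived tag is not lost: it is still reachable through `pp.addr->type`, at the cost of one extra load.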

When working with double pointers (or higher), why store a type ID that says 'pointer' of any kind? Why not just have a Base** store a type ID of what it points to (Derived or Base), and the same for Base**********************.
Otherwise, you're going to have a ton of extra type ids for every pointer depth used, which will require the type ids to be larger and will make the EXEs resulting from your programming language very large.

Quote:
Original post by Glak
I'm making a programming language that has a lot in common with C++. One difference is that I'm going to move the virtual pointer into the pointers and references.

What? Why!? The type of an object is part of that object, not part of every pointer referencing it. What possible benefit could this have over putting the typeid in the object itself?

Extrarius: because int and int* are two different types. I want to regularize my language as much as possible to allow for higher levels of abstraction. For example, I am bringing a lot of Lisp-like stuff into the language while still retaining the C++ design philosophy.

HairyTroll: I've looked at it before and it doesn't really go where I am going. Plus working on this project will be fun for me.


Sneftel: I sat here for like three minutes trying to remember why the pointers will have the type ids and eventually I remembered the main reason. I think that there were others though. The original reason was that I don't want primitive types to have to contain this sort of information. I am not restricting polymorphism to the "this" pointer. I want to be able to have virtual parameters. I'm sure that at some point everyone has tried to write some collision code and thrown an ugly switch statement in. Virtual parameters are a cool feature, and why make a language for fun if you can't add some features that you like?

However, one of the laws of my design is that primitive types are full types and get to participate in all of the cool stuff. So I need to be able to write something like this (again, not the actual syntax of my language):

void f(virtual int* ip)=0;
void f(unsigned int* ip){cout<<"unsigned";}
void f(signed int* ip){cout<<"signed";}

At run time the int needs to identify itself as signed or unsigned. Storing the type id of an int with every int would be insane. If the typeid is the same size as the int then you double the size of an int. You more than double the size of a char. The size of objects would be similarly increased, as at some level they are made up of primitive types. Everything would be increased in size.
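
As a rough illustration of how a "virtual int*" parameter could be resolved from a tag carried in the pointer rather than in the int, here is a small C++ sketch. `IntKind`, `IntPtr`, `ptrTo`, and `dispatchF` are hypothetical names, and the switch stands in for whatever lookup a real compiler would generate.

```cpp
#include <cassert>
#include <string>

enum class IntKind { Signed, Unsigned };  // hypothetical type ids

// Tagged pointer: the ints themselves stay 4 bytes and carry no tag.
struct IntPtr {
    void* addr;
    IntKind kind;
};

// Taking the address is where the static type gets recorded as a tag.
inline IntPtr ptrTo(int& i) { return {&i, IntKind::Signed}; }
inline IntPtr ptrTo(unsigned& u) { return {&u, IntKind::Unsigned}; }

// The two concrete overloads from the post, returning text instead of printing.
inline std::string f(int*) { return "signed"; }
inline std::string f(unsigned*) { return "unsigned"; }

// Stand-in for "void f(virtual int* ip)": pick the overload from the tag at run time.
inline std::string dispatchF(IntPtr ip) {
    switch (ip.kind) {
        case IntKind::Signed:   return f(static_cast<int*>(ip.addr));
        case IntKind::Unsigned: return f(static_cast<unsigned*>(ip.addr));
    }
    return "unknown";
}
```

The tag is fixed at the point where the pointer is created, which is why the pointed-to int does not need to store anything extra.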

Java, C#, and similar languages use something called "boxing" to work around this issue. They have int for plain old efficient ints, and Integer for the full-blown object versions, which do little more than wrap the plain old versions. Then they boast about new features that automatically box and unbox ints. Boasting about a feature that other languages (such as C++) don't need doesn't impress me. I really can't bear the idea of treating primitive types so differently.

Pointers to primitive types are rare enough that a little extra overhead would be ok, and a lot of those pointers are to arrays, so it isn't like we are doubling memory usage.

Pointers to more complex types are common enough that the overhead might be an issue, especially in things like linked lists. Iterators would become larger too, as they are generally small and contain a pointer. However if my language is not as efficient as C++ that is ok with me. I want my language to be decently fast but realistically I am making a high level language and so it will be slower.

Quote:
Original post by Glak
I sat here for like three minutes trying to remember why the pointers will have the type ids and eventually I remembered the main reason.

This isn't a good sign.
Quote:
Original post by Glak
I want to be able to have virtual parameters.

This sounds suspiciously like templates. Elaborate, please.

CM

Quote:
Original post by Conner McCloud
[...]
Quote:
Original post by Glak
I want to be able to have virtual parameters.

This sounds suspiciously like templates. Elaborate, please.[...]
He wants multiple dynamic dispatch.

[Edited by - Extrarius on November 3, 2005 11:48:32 PM]

From what I can see, you are only wanting it to do things that can already be done in C++, only C++ compilers do them in a much better, more optimised way.

Quote:
Original post by Glak
I sat here for like three minutes trying to remember why the pointers will have the type ids and eventually I remembered the main reason. I think that there were others though. The original reason was that I don't want primitive types to have to contain this sort of information. I am not restricting polymorphism to the "this" pointer. I want to be able to have virtual parameters. I'm sure that at some point everyone has tried to write some collision code and thrown an ugly switch statement in. Virtual parameters are a cool feature, and why make a language for fun if you can't add some features that you like?

What you're talking about is called "multiple dispatch". But multiple dispatch doesn't require you to store the type information somewhere other than the object. I see what you're saying WRT trying to pretend that primitive types aren't primitive types, but examples of where an unsigned int* needs to be treated like an int* through virtual binding are... well, I simply cannot think of a real-world situation where this would be useful. And really, unsigned int has no IS-A relationship with int; its range is not contained in int's range. Finally, keep in mind that the only sane way to treat integers as polymorphic would be to dynamically bind all arithmetic operations. (Yikes.)
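
To illustrate the point that multiple dispatch can work with type information stored in the object, here is a minimal C++ sketch that looks up a collision handler by the dynamic types of both arguments. The `Shape` hierarchy, `collide`, and the handler names are hypothetical; this is one common table-based approach, not the only way to do it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Polymorphic base, so typeid on a Shape& yields the dynamic type stored in the object.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Box : Shape {};

using Handler = std::string (*)(const Shape&, const Shape&);
using Key = std::pair<std::type_index, std::type_index>;

inline Key key(const std::type_info& a, const std::type_info& b) {
    return Key(std::type_index(a), std::type_index(b));
}

inline std::string circleCircle(const Shape&, const Shape&) { return "circle-circle"; }
inline std::string circleBox(const Shape&, const Shape&) { return "circle-box"; }
inline std::string boxBox(const Shape&, const Shape&) { return "box-box"; }

// Dispatch on the run-time types of both parameters.
inline std::string collide(const Shape& a, const Shape& b) {
    static const std::map<Key, Handler> table = {
        {key(typeid(Circle), typeid(Circle)), &circleCircle},
        {key(typeid(Circle), typeid(Box)), &circleBox},
        {key(typeid(Box), typeid(Box)), &boxBox},
    };
    std::map<Key, Handler>::const_iterator it = table.find(key(typeid(a), typeid(b)));
    if (it != table.end()) return it->second(a, b);
    it = table.find(key(typeid(b), typeid(a)));  // try the symmetric pair
    if (it != table.end()) return it->second(b, a);
    return "unhandled";
}
```

This replaces the switch statement with a table lookup, but it only works for class types that have a virtual function, which is exactly the restriction the original poster wanted to avoid for primitives.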

Ooh! Also, it means that you can't have double pointers to arrays, since it's not clear whether, given int** i, i[1] refers to the second int* in an array of integer arrays, or to the typeid for a single integer.

EDIT: fixed types

Quote:
Original post by hplus0603
Just like real dynamic languages do. There are uses for this mechanism -- if you are prepared to pay the price.

Well, sure... for dynamic languages, it's worth it. But in a language with dynamic polymorphism, rather than full dynamic typing, it's a tough sell.

Just to fuel the fire here... It seems like some of the motivation for this is to "save" memory, by not storing typeids with small types. But except in the case of containers (and even then?), you're not ever saving memory. If I create a "type-of" integer somewhere, I MUST maintain a pointer to it, to remember its actual type. (In your scheme.) So at a bare minimum, I've got the one extra typeid floating around in memory. (Plus a pointer). Basically, in C++ you pay the price per instance. In your scheme, you pay the price per-reference. Unless you have memory leaks, the number of references is strictly greater than or equal to the number of instances. (Or I'm missing something.)
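
The per-instance versus per-reference comparison can be written down as simple arithmetic. This is only a sketch: the 4-byte type id is an assumption, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kTypeIdBytes = 4;  // assumed 32-bit type id

// C++-style scheme: one tag stored in every instance.
constexpr std::size_t perInstanceOverhead(std::size_t instances) {
    return instances * kTypeIdBytes;
}

// Proposed scheme: one tag stored in every pointer or reference.
constexpr std::size_t perReferenceOverhead(std::size_t references) {
    return references * kTypeIdBytes;
}
```

With 1000 separately allocated objects and one pointer to each, both schemes cost 4000 bytes, and every additional pointer makes the per-reference scheme more expensive. The case where the per-reference scheme comes out ahead is one pointer to an array of 1000 elements: 4 bytes against 4000.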

Thanks everyone, I am convinced to drop it for now.

"From what I can see, you are only wanting it to do things that can already be done in C++, only C++ compilers do them in a much better, more optimised way."

Well, I did only post one of the many features of my language here. Plus, multiple dispatch really isn't something that can be done in C++ without ugly hacks and/or some metaprogramming. You shouldn't have to resort to metaprogramming to get the features that you want in a language; that's like saying that we should all use Lisp because it is so powerful. There are more than enough differences between my language and C++.

"Ooh! Also, it means that you can't have double pointers to arrays, since it's not clear whether, given int** i, i[1] refers to the second int* in an array of integer arrays, or to the typeid for a single integer."

No, that's not a problem. To the user a pointer is a pointer. The fact that it has a typeid field is hidden.

"In your scheme, you pay the price per-reference. Unless you have memory leaks, the number of references is strictly greater than or equal to the number of instances. (Or I'm missing something.)"

Yeah, let's say that you have a type A with two members, int B and point C. C has three floating-point members. Let's say that int and float are 32 bits and char is 8. Then sizeof B will be 4, sizeof C will be 12, and sizeof A will be 16. Now if every value has to store a typeid (again 32 bits), the size of B goes from 4 to 8. The size of each of C's members goes from 4 to 8, so the size of C goes to 28 (three 8-byte members plus C's own typeid) and the size of A goes to 40 (8 for B, 28 for C, plus A's own typeid). 40 > 16.
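
The size arithmetic above can be checked with a short C++ sketch. The struct names are hypothetical, and the figures assume a 32-bit float, a 32-bit type id, and 4-byte alignment with no padding, which holds on mainstream platforms but is not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstdint>

using TypeId = std::uint32_t;  // assumed 32-bit type id, as in the post

// Untagged layout: no value carries a type id.
struct Point { float x, y, z; };
struct A { std::int32_t b; Point c; };

// Per-instance tagging: every value, including each primitive member, carries a type id.
struct TaggedInt { TypeId type; std::int32_t value; };
struct TaggedFloat { TypeId type; float value; };
struct TaggedPoint { TypeId type; TaggedFloat x, y, z; };
struct TaggedA { TypeId type; TaggedInt b; TaggedPoint c; };

// These hold under the assumptions stated above.
static_assert(sizeof(Point) == 12, "three 4-byte floats");
static_assert(sizeof(A) == 16, "4 + 12");
static_assert(sizeof(TaggedPoint) == 28, "4 + 3 * 8");
static_assert(sizeof(TaggedA) == 40, "4 + 8 + 28");
```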

"How will pointer arithmetic work?"

In the case of incrementing or otherwise moving the pointer, the address would be changed by the operand. The typeid is totally hidden from the user, and indeed is just an implementation detail that the compiler writer can change, in the same way that a vtable pointer is "merely" an implementation detail.
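
A minimal C++ sketch of that behavior, assuming a hypothetical `FatPtr` wrapper with a 32-bit tag: arithmetic and indexing touch only the address, and the type id is carried along unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tagged pointer; the type field is never exposed through indexing.
template <typename T>
struct FatPtr {
    T* addr;
    std::uint32_t type;

    // Moving the pointer changes only the address.
    FatPtr& operator++() { ++addr; return *this; }
    FatPtr operator+(std::ptrdiff_t n) const { return {addr + n, type}; }

    // p[n] indexes through the address, so it can never land on the tag.
    T& operator[](std::ptrdiff_t n) const { return addr[n]; }
    T& operator*() const { return *addr; }
};
```

Because indexing goes through `addr`, an expression like `i[1]` on a pointer-to-pointer refers to the second element of the pointed-to array, never to the hidden tag.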
