A pragmatic language for the future

I don't think it's important to be talking about details like accessors or syntax yet.

If we are going to have this discussion, we should focus on specific areas, look at the approach of other languages, and decide what is best for "Foo".

I think we should start with the type system. Whatever paradigms the language supports, it's going to have a type system. The type system will be what makes the language safer, and it's the basis of everything.
Yes, I agree; matters like that may not even be needed in this language, but I wanted to relay the ideas before I forgot them. Yeah, I also think the type system is probably one of the best places to start.
Well, no one seems to want to comment.

This is what I think:

We have strong, static typing. Normal data types would be Number, String, etc. Types like Byte and Int32 would be available in the standard library when you needed to work on a concrete level.

On top of this we would have something like dependent types. Epigram and Ontic have systems like this. ApochPiQ's validations on types are the same kind of thing.

We might decide we want dynamic typing. Conceptually, everything in the type system could be done via constraints, so a variable with no constraints placed on it is a dynamic type, and can be assigned any value. Another variable could be constrained to the type of a natural number between 0 and 5.
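As a rough sketch of the constraint idea in C++ (Foo only exists on paper, so the class name SmallNatural and the runtime check are just stand-ins): a value type whose constructor enforces "a natural number between 0 and 5". The point of the proposal is that Foo would verify this constraint statically rather than at runtime.

#include <stdexcept>

// Illustrative only: models the constrained type "natural number in [0, 5]".
// Foo would express this declaratively and check it at compile time.
class SmallNatural {
public:
    explicit SmallNatural(int v) : value_(v) {
        if (v < 0 || v > 5)
            throw std::out_of_range("SmallNatural must lie in [0, 5]");
    }
    int value() const { return value_; }
private:
    int value_;
};

int main() {
    SmallNatural ok(3);      // satisfies the constraint
    // SmallNatural bad(9);  // would throw here; Foo would reject it at compile time
    return ok.value();
}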

These are just my initial thoughts, what do you think?
And I was just about to post in the other thread "how long before somebody proposes creating a language here on gamedev". Anyway, a few things:

1) Regarding dimensional analysis - fractional dimensions must be supported. They're rare but I have seen them.

2) On class design - one of the weak points of C++ is its support for generic programming. If I want to swap two variables in a template, I have to use the following code to be generic:

using std::swap;  // bring std::swap into scope as the fallback for built-in types
swap(a, b);       // unqualified call, so argument-dependent lookup can find a class-specific swap

Because then Koenig lookup (argument-dependent lookup) will find a swap function for class types, while std::swap is still used for built-in types. This problem got me thinking that a function is often defined more by what it does than by what it operates on, so perhaps the concept of member functions is unnecessary. Obviously some kind of access protection is still needed for classes, so class definitions would still need to specify which functions can access their members. Perhaps read and write access could be considered separately to reduce the need for accessor functions.

Of course, if functions were separate from classes they would need a separate access scheme to prevent internal functions being called on classes. This would also allow multiple dispatch instead of single dispatch, but might require all virtual functions to be listed in the class definition so the compiler knows what to look for when building the vtable. I think Lisp's CLOS works something like this, but I've never used it so I'm not sure.

3) Metaprogramming: C++ templates allow all kinds of weird and wonderful things to be done at compile time, but they tend to slow compiling down to a crawl. Boost's parser library is a great example of this. It would be good if a new language could allow portions of code to be processed by other portions somehow, so similar features could be implemented in a simpler way. It seems obvious to me that all code would have to follow some basic syntax rules for this to work at all.

4) Operators: Lisp doesn't suffer from precedence issues because the order is totally explicit. However, I think this leads to a pretty verbose language, and I'd prefer operators to have precedence as in C/C++ because I think it makes formulae clearer. It would also be nice if precedence could be redefined, and in fact if the original precedences were defined by that same mechanism, along the lines of the metaprogramming system (sorry, that all sounds a bit vague really).

5) Syntax: I'm definitely a fan of terse syntax. C/C++ style is probably not the way forward because it ties you into one style of programming too much. A functional style should definitely be available. As far as English-language keywords go, I think that's the least of most people's worries in learning to program well. And as for foreign languages, it's probably no less confusing for the rest of the world to have BEGIN...END rather than {...}; it's just eight meaningless characters rather than two.

6) Garbage collection - I'm not an expert on how to implement garbage collection efficiently, and as far as I remember a lot of the sources I've read have been conflicting. Anyway, it seems to me that in a lot of cases ownership of objects is pretty obvious and garbage collection is therefore superfluous, so I think there should be a mechanism for restricting garbage collection to where it's needed.

7) Static typing - I think static typing is the way forward. Types should be deduced and checked at compile time. I can't see dynamic typing being as efficient, and a static type system with introspection can emulate dynamic typing where it's necessary anyway; or a generic object could be made available.

8) Templates - C++'s system is brilliant in the flexibility it allows, but it seems to me that when a type is passed to a template it is often required to provide some specific capabilities, and failing to do so usually produces unreadable compilation errors. It would be better if the requirements were stated explicitly rather than implicitly in the code, because this would allow the code to be checked before a type was given to it and produce simpler errors in all cases (see the sketch below). It seems to me this would be well described by a concept similar to inheritance, but judging by ApochPiQ's journal he thinks differently.
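For what it's worth, here's a hedged sketch of the "state the requirements explicitly" idea, borrowing C++20 concepts purely as an illustration (the names Doubleable and DoubleIt are made up): the requirement is checked where the template is used, so a bad type produces one short error at the call site instead of pages of noise from inside the template body.

#include <concepts>

// Sketch only: an explicit, named requirement on the template parameter.
template <typename T>
concept Doubleable = requires(T x) {
    { x * 2 } -> std::convertible_to<T>;
};

template <Doubleable T>
T DoubleIt(T x) { return x * 2; }

int main() {
    int n = DoubleIt(21);   // fine: int satisfies Doubleable
    // DoubleIt("hello");   // rejected up front: constraint not satisfied
    return n - 42;          // 0
}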

I've got some more thoughts on the subject of RAII, and others, but this post is long enough as is.
Well, it'd be best not to use templates. Polymorphism should be available through some form of generic code, and if we need a metaprogramming system, something like Nemerle's system would be better than hacking templates, which were not designed for metaprogramming.

Nemerle does it like this: Macros are Nemerle functions, just run at compile time instead of runtime. They have access to a compiler API, and generate code through syntax trees. This is easy too:
<[ Fun(); ]>
represents the syntax tree for
Fun();
Yes - I was about to make a comment along the same lines. I'm not familiar with Nemerle, but I was going to say this: C++ templates seem to conflate two things - polymorphism and metaprogramming. They force compilers to build separate functions for different types, which are more efficient than manipulating those types through virtual functions but conceptually the same. Ideally this decision would be taken by the compiler, not the programmer, but if that is not possible then at least the function should only be written once and the programmer should decide explicitly which method to use. Therefore I agree templates shouldn't be used.
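To make that distinction concrete, here's a small C++ sketch (names are illustrative) of the same operation written both ways: as a template, which the compiler instantiates separately for each argument type, and as a virtual function, which is written and compiled once and dispatched at runtime.

#include <iostream>

// Static dispatch: one copy generated per type used.
template <typename T>
T TwiceStatic(T x) { return x + x; }

// Dynamic dispatch: one function, resolved through the vtable.
struct Doubler {
    virtual ~Doubler() = default;
    virtual double Twice(double x) const { return x + x; }
};

int main() {
    std::cout << TwiceStatic(21) << "\n";   // int instantiation
    std::cout << TwiceStatic(2.5) << "\n";  // separate double instantiation
    Doubler d;
    std::cout << d.Twice(2.5) << "\n";      // single function, virtual call
}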
Type system is definitely a great place to start. Personally, this is what I'd like to see in a type system:

  • Strong typing as a general default

  • Allow selective weakening by defining type convertors

  • Static typing as a general default

  • Permit dynamic typing with generic-types and concepts


Yeah, I know this sounds like "best of all worlds" gibberish and verges a bit on bloat, but bear with me here [smile]

Strong, static typing goes with the philosophy of clearly expressing semantics and contracts. Since we're aiming at having the behavior of a program very well defined, it doesn't make sense to default to highly weak types, or dynamic typing. However, both weak typing and dynamic typing have their uses; I think we can obtain the benefits of those concepts without sacrificing the compile-time verifiability of strong, static typing in general.

In general, I'd like to see types treated as abstractly as possible: "Number" rather than int, unsigned, float, et al. This means that typical whining about strong typing (like munging between numeric types) is not really important as often as it might be in more concretely-typed languages like C.

More importantly, we should be able to define type munging semantics ourselves. The "standard library" will probably need to offer basic numeric-munging support out of the box, such that conversions between Integer and Real are handled properly in both directions, with correct rounding, etc. So if we want, we can "loosen up" the strong typing a bit by offering such munging facilities. This can become an extremely powerful technique when OO concepts start showing up, because a conversion between two object types (classes) may become possible, allowing for all manner of fun stuff with generic programming.
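A rough C++ approximation of programmer-defined munging (this Integer type is purely illustrative, not a proposed standard library): a Real-to-Integer conversion that rounds correctly rather than truncating, plus an exact widening conversion back.

#include <cmath>
#include <iostream>

// Sketch only: a type that defines its own conversions to and from Real (double).
struct Integer {
    long long value;
    Integer(long long v) : value(v) {}
    explicit Integer(double real) : value(std::llround(real)) {}     // Real -> Integer: round, don't truncate
    operator double() const { return static_cast<double>(value); }   // Integer -> Real: exact widening
};

int main() {
    Integer i(2.6);   // rounds to 3 rather than truncating to 2
    double back = i;  // converts back to a Real
    std::cout << i.value << " " << back << "\n";  // prints "3 3"
}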


Type munging will in general take care of the strong/weak decision, but that still leaves static/dynamic to figure out. Again I think the focus on compile-time expressivity and certainty makes static typing the natural choice. However, as we all know, static typing makes generic programming a royal pain in the posterior. Generic programming happens to be a highly nifty thing, so we shouldn't sacrifice it offhand.

C++ has one approach to combining static typing and genericity, but templates are ugly and not really an optimal solution. More importantly, templates are (essentially) preprocessor magic, since they have to be instantiated at compile-time. While template metaprogramming is cool, and compile-time magic is a fun brain exercise, these things tend to get in the way of Real Work. The combination of C++'s file model and the technical behavior of templates makes them virtually useless for true generic programming, since I have to either bloat each code file with the complete template class, or instantiate the template for each type combination I want to specialize into that template. I think we can do far better.

I think there might be some room for type inference in Foo, but I'd prefer to avoid it. I'd really like to see the notion of data types enter the semantic realm, where I define abstract data types myself in each application (like milliseconds, pixels, etc.) instead of just working in vague stuff like float and int. Since each user-defined semantic type can have quirky behavior, it probably will become impossible to do genuine type inference.

For example, consider the oddnumber concept that I've mentioned before. If I need to infer the type of the literal 7, what do I do? If we allow a duck-typing model, we'll have to run the validator of every semantic type in the program's knowledge space in order to try to infer the type. And what if we get multiple positives? The oddnumber validator will say that 7 is definitely a valid oddnumber, but so will the primenumber validator. Type inference doesn't work here. I'm not entirely sure yet, but I suspect that with implicit munging facilities definable by the programmer, type inference won't even really be needed.


So, getting back to generic programming... I think concepts are the way to go here. Let me recall one of my pseudo-examples from earlier:

DataType oddnumber Basis:Integers Validation(x):{ x % 2 == 1 };

This semantic type definition has some important information, specifically the Basis field. I think this is where we capitalize on concepts. If we define that any semantic type (i.e. any DataType) must define a subset of the basis set, we're golden. "Integers" becomes the concept, and oddnumber will always overlap a subset of the integers. If "Integers" is defined in the language to also be a subset of "Number," I get some good results. This means that I can write a generic blurb like this:

Number DoubleNumber(Number x) { return x * 2; }

This will automatically specialize properly to oddnumber. What's important, though, is the return value; although the parameter can be implicitly decayed from oddnumber to Integer without violating semantics (since DataTypes are always a subset of the basis type), we can't do the same for the return value of the function. In fact, DoubleNumber() will never return a valid oddnumber, but that is not verifiable at compile time.

I think the way to handle this is to specialize generic return values based on the way the function is called, not the way the function is declared. For instance:

oddnumber x = 7;
oddnumber y = DoubleNumber(x); // barf!
Integer z = DoubleNumber(x); // all OK; munge to integer i.e. round off
Number q = DoubleNumber(x); // all OK; number is a generic concept

DoubleNumber returns a generic Number. The return value is never actually specialized; it remains a generic. What's important is whether or not I can munge it into a given type. The oddnumber munge attempt will fail, for obvious reasons. The Integer munge will simply round off the result (if needed) and give me an Integer. This should include a warning as to possible loss of precision, as usual. Finally, if I simply treat the function as returning Number, no conversion is needed, and I can go about my merry way.

This kind of builds a layered system on the notion of constraints, as Rebooted mentioned earlier.


This of course raises a few important questions:
  • Can users define concepts? I'd like this ability, but I don't know how it would be done yet.

  • What about a true-generic "Thing" that encompasses numbers, strings, etc?

  • How does this mesh with OO concepts?




Some final assorted thoughts:

Operators
Implicit precedence is good, IMHO. We're used to it, and it allows us to avoid lots of nested parentheses [grin] Mainly, though, my concerns in the area of operators lie in the notion of defining them. Defining custom operators (as words, not magic symbols) would make things very, very handy, especially if operators are allowed to also do type conversions. For instance, I could define a DotProduct and CrossProduct set of operators, and get highly self-descriptive code. Yeah, there is potential for abuse and obfuscation here as well, but as with C++'s operator overloading I think the risk is justified. Defining custom operators (and their precedence hierarchy) is important to defining domain-specific abstraction layers.
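Since C++ can't introduce new word operators, the closest existing approximation of the DotProduct/CrossProduct idea is plain named free functions; a quick sketch (the Vector3 type and function names are illustrative):

#include <iostream>

struct Vector3 { double x, y, z; };

// Named "operators" as free functions: self-descriptive, if not infix.
double DotProduct(const Vector3& a, const Vector3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vector3 CrossProduct(const Vector3& a, const Vector3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

int main() {
    Vector3 u{1, 0, 0}, v{0, 1, 0};
    std::cout << DotProduct(u, v) << " " << CrossProduct(u, v).z << "\n";  // prints "0 1"
}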

Syntax
BEGIN and END are gross. I like the curly-brace-block style, but I think it's important to keep a healthy balance between magic shapes/symbols and readable text. C++ has too many magic squiggles, for instance; stuff like "= 0" for pure virtual function specification is just gross. Some symbols are needed for expression - parentheses, brackets, block braces, operators, etc. But reliance on magic squigglies can be taken too far, and I'd like to avoid it. Part of what gives me a headache any time I try to read Lisp code is the mass of squigglies. Brevity and clearness are good, but when communication of purpose becomes important, let's prefer words to squigglies. I never want to see a comma operator in Foo.

Garbage Collection
Definitely has to be a way to denote stuff that is garbage collected, as well as individual blobs of data and sections of code which are not. I think GC should be the default preference as much as possible. Manual memory allocation/control is going to be important for concreteness, but it should be harder to use than the GC model, to help underscore the fact that it is also much harder to get right.

Metaprogramming
I really like the notion of running code at compile-time. I think C++ templates are half a step in the right direction; they got the Turing completeness, but fail to be really practically useful for a lot of things. I'd like to see the compile-time metaprogramming capabilities be basically identical to runtime - i.e. my compile-time metaprogram is a Foo program that operates on itself. I firmly believe that code should be self-conscious: it should know that it's code, and it should know what to do about it.
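As a point of comparison, C++'s constexpr (which arrived well after templates) is one approximation of "the same code running at compile time and at runtime"; a minimal sketch:

#include <array>

// The same function usable in both worlds.
constexpr int Factorial(int n) {
    return n <= 1 ? 1 : n * Factorial(n - 1);
}

int main() {
    std::array<int, Factorial(4)> buffer{};  // evaluated at compile time: 24 elements
    int atRuntime = Factorial(5);            // the very same function, called at runtime
    return (buffer.size() == 24 && atRuntime == 120) ? 0 : 1;
}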

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

I agree with your points. I think we should only allow implicit casts down the type hierarchy though. I'd be careful with allowing implicit conversions to be defined. So if you code a function that takes a Number argument, it'll work for all "derived" types, but anything else requires an explicit cast. This cast will fail if the validations specified on the type are not met.

I also think we should specify all types in terms of these validations. It will simplify and unify things. So a type with no validations is a dynamic type (your true-generic "Thing"), then you have Number, String. The next layer has Integer, Real; then comes Natural; then comes Natural:{ n > 7 }.

I see your point about type inference. This is also going to make it impossible for a compiler to select the right function to call when it has overloaded forms.

We should definitely look at Epigram, DML and Ontic.

Tim Sweeney talked a bit about type systems here.

I think we have the idea of the type system pretty much sorted.
You guys realize it's going to be really... "fun" implementing dynamic types in a compiled language? Or will this be interpreted? In my opinion, we should use Parrot. It's the VM being built as a universal interpreter for dynamic languages, and it is being built as Perl 6's primary target.
Quote: Original post by ApochPiQ
Static typing as a general default

I strongly disagree. I want inferred typing where possible by default. When inference fails, the most I can handle is a warning. I want to be able to place type constraints and have the type checker generate warnings. I don't want to spend my time making the compiler happy for the sake of making the compiler happy.

