A pragmatic language for the future



#1 ApochPiQ   Moderators   -  Reputation: 16423


Posted 27 March 2006 - 01:18 PM

Introduction

This topic was started as a way to discuss a practical, real language that will aid in solving real programming challenges. Some of these challenges are already old and well understood in the field. Others, like multiprocessing, are only partly understood, but will become increasingly important in the near future. This discussion arose from the observation that currently popular languages (an oft-mentioned culprit being C++) do not address these needs well. Other languages may address these needs in very elegant and effective ways, and yet those languages have failed to achieve mainstream usage in most problem domains.

Our goal is to critically analyze the problems that we face, the solutions to these problems, and practical ways to integrate those solutions into a new language. Our goal is not to develop the "perfect" language, or to build anyone's "dream language." Rather, we accept the reality that any language will have some ugly, rough edges; at some point, a language must disappoint someone. We are not here to make the ultimate language for everything. However, we do believe that a language can be developed which is both a major improvement on existing tools and suitable for a very wide variety of problem domains. Whereas many highly elegant languages are, realistically speaking, only used in narrow problem spaces, we intend to devise an elegant, rich, and powerful language that is suitable for many problem domains, and is appealing to programmers working in those domains.

(Note: this post has been modified to clarify and better express our plans. It has changed with the course of discussion to reflect the changing vision of the language and its goals. I fully expect it to continue to change over time, as described below in the Philosophical Goals.)

Philosophical Goals

These are the cornerstones of our effort. These ideals guide our planning and discussion. They are intended to provide a sort of shibboleth by which we decide which things are good for our language, and which are not. If a decision is to be made regarding the language, it should be made in light of these principles.
  • Seek a careful balance between pragmatism and idealism. By nature, our investigation is seeking to apply the best knowledge of programming theory to the domain of real-world problems. We are interested above all in a language that effectively aids programmers in accomplishing their work, be it academic research, embedded applications, business logic, or games. While pragmatic concerns are intensely important to our efforts, it is also vital to remain grounded in solid theory. We seek not just an elegant and self-consistent design, but also a design that gives us powerful tools to accomplish the creation of software.
  • Produce a real language for real work. At all times we should be mindful of the fact that we are not engaging merely in a mental exercise. We do not seek to produce an idealistic language that exists only in specification papers and committee meetings. Instead, we aim to grow a language organically from the experience, insight, and desires of real programmers who work on real code. This language is being designed with the very real goal of implementing it and using it for doing work. Without this emphasis, our endeavor may be mentally stimulating, but will ultimately not be useful. Driving for an actual realization of our ideals is vital.
  • Use the language early, use the language often. As noted, one of our goals is to produce a real language that is viable for real work. Therefore, it is important that as often as possible we test our designs by doing real work in the language. During the early stages we can expect the language to change rapidly; however, even once this pace slows later in the language's life, it is vitally important to make sure that the design remains consistent with real usage. It is in our best interests to seek to implement the language as quickly as possible, and use the implementation heavily. By using our own creation, we ensure that our designs are truly good. Moreover, we offer ourselves the opportunity to refine our designs based on real feedback from real use of the language. We must truly eat our own dogfood, and we must be prepared to adjust the formula to taste.
  • Provide the tools to develop abstraction and express it clearly. One of the fundamental aims of this project is to assist programmers in clearly communicating their intentions to the machine. It is unfortunately common with popular language tools today for a programmer to "know" much more about the semantics and expectations of a system than he is able to tell the compiler. We aim to reduce this gap in knowledge, and give the programmer tools and a vocabulary by which to express abstract notions precisely and thoroughly.
  • Flexibility is paramount: assume the programmer knows his needs well. It is important that we retain a degree of flexibility. Despite its flaws, C++ has remained a powerful force in the programming world because it permits the programmer a very high degree of control. We must be willing to trust programmers with this control, because in many problem domains, such control is important. The language should exist to support the programmer's efforts, not constrain or limit them.
  • Reliability is paramount: the programmer is fallible. This is a careful balance with the previous goal. While flexibility is important, safety is also equally important. Productivity comes by removing menial concerns. Abstraction and encapsulation aid productivity by removing the concern of duplicating technical and implementation details. Garbage collection aids productivity by removing the concern of memory management. Wherever possible, we should seek to increase productivity by removing the mundane technical concerns of computing, and allow the programmer to focus on solving the problems of his domain. However, it is important to permit the programmer to be concerned with these things, should the need arise.
  • Build what is needed. Build tools to develop the rest. Feature creep and bloat are very real dangers. We must avoid the temptation to add every possible feature under the sun. Rather, we should seek to provide a powerful toolkit by which programmers can assemble a library of useful features themselves. Moreover, where it is practical to do so, we should permit programmers to bend and modify these features to suit their individual needs. Users of the language should be encouraged to develop reusable abstractions and share them with the community as a whole. Some fundamental features should be codified as "standard" elements of either the language or the core library of features that ships with a compliant implementation of the language.
  • Favor programmer productivity and efficiency, even if it makes implementation of language features difficult. Language features only need to be implemented once. After that, they may be used freely. We should strive to implement useful, powerful, and elegant features, even if they are hard or expensive. Our investment of time and effort in developing the language's features will pay itself back many times over in saved time and effort by the users of the language - which, hopefully, should someday include us as well.
Practical Challenges

Since we are aiming to develop a real language for use in real work by real people, we must remain mindful of the real issues that we face. There are challenges all along the way that must be addressed and considered if we are to succeed.
  • Target Audience: We cannot be all languages to all programmers. Such is not practically - if even theoretically - possible. However, we seek to provide sufficient flexibility that the resulting language is useful for a wide variety of people. We aim to allow programmers who are experienced with traditional imperative languages to transition easily and comfortably into this language. We aim to permit functional programmers to use their preferred style in order to gain the advantages offered by the functional approach. We aim to permit a wide range of development paradigms and philosophies while encouraging and facilitating safe, reliable, and productive practices. Ideally, we would like to provide the tools for programmers to work close to hardware, in domains where this is important (embedded systems, realtime systems such as 3D games, etc.) while also offering a rich library of abstractions, so that the language is useful in higher level logic domains (business logic, web logic, simulation logic, etc.). Where existing languages succeed in crossing many paradigms and methodologies, we seek to cross many levels of abstraction; it is our belief that by offering a unified toolkit for developing on many such levels, each level may benefit by being coherent and consistently expressed with the other levels.
  • Platform Support: For better or worse, a programming language will not survive as a viable tool without interoperability with the Windows platform. The ability to interoperate with the family of Unixes is also important, although to a slightly lesser degree. A central goal should be interop with the Win32 API. Ideally, a layer for interacting with the .Net platform should also be developed as quickly as is practical. The language must facilitate the development of interop layers with the Unixes as well.
  • Execution Model Agnosticism: The language should be able to execute under many implementations: interpreters, compiled bytecode, compiled machine code, et al. This allows us to use and develop the language early on (since interpreters and bytecode systems are fairly easy to build) while allowing us to eventually target hardware directly (compiling to machine code). This ability is important, because assuming the existence of a VM layer or interpreter undermines the usefulness of programming close to hardware. Additionally, the ability to compile to machine code (or possibly widely adopted VM formats such as MSIL) is critical for performance, since it is doubtful that we will be able to build a VM system that can rival commercial products.
  • Early and Widespread Buy-in: Just as we must test and use the language while it is being developed, we must also provide incentives for others to do so as well. Enlarging the pool of users and spreading the word is critical. Without widespread support and interest, the language will not be able to compete with established technologies. Achieving this buy-in requires that we be able to offer people the ability to use the language early on. It is vital that, as soon as possible, programmers are able to start writing real, useful code in the language, even if it is just interpreted or bytecode compiled.
  • Growth by Supplementing, Not Replacing, Existing Tools: Realistically, the "big language" to compete with in terms of buy-in is C++. Other languages are also important in various domains, but for Windows applications and games in particular, C++ continues to be the dominant player. This is primarily due to inertia: huge amounts of time and money are invested in existing C++ systems, making it impractical to replace them with systems in a new language. We cannot hope to gain adoption by demanding that people discard C++ and use this language instead. Rather, we must gain adoption by supplementing C++. Interoperability with C-style APIs is mandatory. Providing the tools to build binding layers with C++ systems (similar to embedded-language projects like boost::python) is essential. Programmers must be able to take this new language and add it to their toolset slowly, at a pace which is practical and comfortable for them. We cannot replace the existing Big Players overnight, because the inertia of existing code is too large. However, with careful planning, we can make it more cost-effective to use this new language for new code, rather than writing new code in old languages. Over time this will amount to a gradual shift in language emphasis. During this transition, though, it is important to emphasize compatibility - this language must be usable in concert with existing tools like C++. Demanding a full replacement of existing code will not work.
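[Editor's note: as a concrete illustration of the "binding layer" idea in the last bullet, here is a minimal sketch using boost::python, the project cited above. The function and module names are hypothetical; this is just the shape of the approach, not part of the proposal itself.]

#include <boost/python.hpp>

// Hypothetical legacy C++ function we want to expose to another language.
int add(int a, int b) { return a + b; }

// boost::python generates the glue: existing C++ code is wrapped and can be
// adopted piecemeal from the other side, which is the same migration model
// described above for the new language.
BOOST_PYTHON_MODULE(legacy_math) {
    boost::python::def("add", add);
}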
Language Elements and Features

This is an overview of the features and capabilities that we have chosen to build into the language.
  • Favor high degrees of abstraction and transparency
  • Permit low degrees of abstraction and allow technical concreteness (programming close to the hardware)
  • Maximize the ability to express behavioral semantics of a system, as well as the data in that system
  • Provide rich introspection and metaprogramming capabilities
  • Encourage the use of nested abstraction layers to build domain-specific "dialects" on top of the base language, via metaprogramming
Type System

This section deals with the type system of the language: its nature, features, and limitations.
  • Employ static typing and strong compile-time checking of types.
  • Encourage the definition of abstract datatype semantics, and methods for converting between semantic types.
  • Provide rich features for dimensional analysis.
  • Semantic datatypes are defined by the set of values they may store. A variable of a given semantic type may not be assigned a value which is not in its set. Doing so should cause a static, compile-time error (if possible), a warning if the behavior is not guaranteed to be reliable based on the defined semantics of the type, and a run-time error should an invalid assignment be attempted.
  • Semantic type validation is performed by a boolean validation routine that returns either "true" for valid data or "false" for invalid data. This is done instead of explicitly defining semantics in the language syntax to allow for maximum flexibility in the way validation is performed. Validation routines are encouraged to be kept simple so that they can be checked at compile time. To help enforce this, validation routines are not permitted to interact with data outside the value being validated. This ensures that validation can be checked statically.
  • Support for generic values is accomplished via a subset hierarchy. At the root of this hierarchy is the notion of Anything. All language concepts are a type of Anything (this is done to allow for higher-order functions, introspection, and other metaprogramming capabilities). Various data concepts express restrictions over the broad Anything. Such restrictions include numeric types, container types, and (eventually) objects. In addition, individual restrictions may be further subdivided by semantic types. For instance, numeric types can be split into integers, reals, complex numbers, and so on. A type may implicitly take the place of any superset of its own semantic set; an Integer is always substitutable for a Number, and a Number is always substitutable for Anything. Types may not be substituted across the hierarchy to sibling types, or down the hierarchy to child types, unless a specific semantic conversion function has been defined for the particular conversion being attempted.
  • Support type inference as much as possible. Whenever a type can be statically inferred, this should be done automatically by the language. The simple rule of thumb for the compiler is "infer what I can; complain about what I cannot." If a type cannot be inferred, it must be explicitly specified by the programmer.
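[Editor's note: the boolean-validation idea two bullets up can be roughly sketched in C++ terms. This is illustration only, not the proposed design: the type carries a predicate, and every construction or assignment runs it, raising a run-time error on failure.]

#include <stdexcept>

// Sketch: a "semantic type" is a base representation plus a validation
// routine; assignments that fail validation raise a run-time error.
template <typename T, bool (*Validate)(const T&)>
class Semantic {
public:
    Semantic(const T& v) : value_(checked(v)) {}
    Semantic& operator=(const T& v) { value_ = checked(v); return *this; }
    const T& get() const { return value_; }
private:
    static const T& checked(const T& v) {
        if (!Validate(v)) throw std::domain_error("semantic validation failed");
        return v;
    }
    T value_;
};

// Hypothetical example: a percentage may only hold values in [0, 100].
inline bool percent_ok(const int& x) { return x >= 0 && x <= 100; }
using Percent = Semantic<int, &percent_ok>;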
Additional points will be added as final conclusions are reached in the discussion.
The Current State of Affairs

I have personally taken on the responsibility of bootstrapping the language and delivering an implementation as quickly as my limited spare time allows. A running commentary of my progress is available via my journal, which can be reached by clicking the Journal link below this post. When major events in Epoch's development arise, I will drop a post in here. For the minor bits, keep an eye on my journal. In the meantime, everyone please feel free to continue discussion and making suggestions here.
Optional, But Encouraged Reading

The best starting place is to read this thread itself, as it contains the vast bulk of the history of the discussion, and all of the important conclusions and issues are at least mentioned here. The rest of this is just some background material to help illuminate the train of thought that started this project. With the exception of Tim's presentation, don't take any of this as the word of an expert. Heck, don't even take it as the word of anyone who has the first clue what he's babbling on about [wink]

  • The Pragmatist's Programming Language Wishlist
  • Additional explanations and thoughts
  • Tim Sweeney's presentation on languages (PDF)
  • An early concept for abstraction layering
  • Some early thoughts of mine on dialectic language families
  • Some early thoughts on integrating expressivity of abstraction and automated testing
  • My thoughts on domain-specific languages and the need for expressivity

[Edited by - ApochPiQ on March 10, 2008 5:38:52 PM]


#2 Sneftel   Senior Moderators   -  Reputation: 1781


Posted 27 March 2006 - 01:34 PM

I agree with most of your points, but I can't bring myself to fully support the first one (the requirement of "concreteness"). The way I read it, you're pooh-poohing the idea of hiding complicated tasks such as garbage collection within the language, and I really believe that there are certain things that need to be embedded in the fabric of a language.

My primary requirement for a good language, though, does sort of mesh with that point, and with a couple of the others you mentioned. I think you'd agree with this: The language needs to be rich enough to rely primarily on metaprogramming for functionality. LISPers crow about this perhaps to a fault, but they have a good point. If I could change two things about C++, they would be these: (1) take out classes, and (2) add enough metaprogramming that you could fully define the concept of a class within the standard library. That satisfies me on a language-theoretic level, but it also means that I'm never going to bang my head on the language. If I want to do some weird illegal inheritance thing just this once, then I should be able to with a minimum of fuss.

#3 ApochPiQ   Moderators   -  Reputation: 16423


Posted 27 March 2006 - 01:51 PM

Hmm, not really what I was driving for. What I'd like to see is a sort of "default mode" where garbage collection, abstract numeric types, and other such features are normal. In other words, it should take no effort for me to write a program that is garbage collected, uses transparent type promotion to avoid overflow, checks array boundaries automatically, etc. etc. Those things are extremely important to having good abstraction capabilities in a language, and should not be neglected in any way.

My contention, though, is that these features need to be permitted to co-exist with the capability of working with a high degree of concreteness. It should take extra effort to specify technical limitations like 32 bits of storage for a numeric type - but not too much extra. It is simply important, in my mind, that those capabilities exist.

A language is only as portable as its most expressive level of concreteness. PHP has no technical semantics, and as such is dependent on its execution engine to handle those semantics. Therefore, PHP will never be used for embedded logic development, or doing low-level video hardware interactions. My emphasis on the concreteness aspect is primarily borne from the pragmatic realm of 3D game engines; writing a 3D engine effectively, especially a cutting-edge one, requires concreteness. That's why I think C++ continues to be so popular in game development - it offers such concreteness.

Direct hardware interaction is important also because it allows for portability, as outlined in my second journal entry. Think of an OS built in Foo; the lowest-level, hardware-interaction elements can be done nicely in Foo, without sacrificing any of the expressivity and contractual safety nets, while upper layers can be built entirely in abstract onion-shell combinations of functionality. IMHO the only real hope for native OS "awareness" of concepts like JIT execution models and global garbage collection is to have such interoperability between layers of abstraction, within a single language family.



I definitely agree on metaprogramming. I've been pondering the concept of building "objects" not based on static, code-time classes, but with a more dynamic and fluid structure. This has been inspired largely by evolutional's experiments in tag-oriented data organization; the main challenge I see at this point is figuring out the logical extension of that sort of paradigm to logic organization as well. But in general, yes, introspection and metaprogramming are extremely important. Foo should let me build entire syntactic subdialects for domain-specific purposes.

We don't need a silver-bullet paradigm or a silver-bullet "reinvention" of OO; I don't think such a thing exists. We need a way to express concepts like OO at a fundamental level, dynamically, as you've described.

#4 Azh321   Members   -  Reputation: 569


Posted 27 March 2006 - 04:53 PM

I think I'm falling in love with you Apoch :-p

I've been trying to solve these questions since I started programming (making my own language is what got me into programming... yes, even though I couldn't program, I wanted to make my own language :-p ah ha! the chicken came before the egg)

Though, you say you want abstract types. I have been debating the use of dynamic types in the real world for some time; so far I've come up with this idea:

Dynamic types that allow you to put restrictions upon them. You mentioned this earlier in your journal or a thread, and I've had the idea for quite some time. So you could do:
int(range(0, 2*5)) foo;
That's an example if you wanted a Python approach, though I'd like the syntax to be more Lisp/Haskell-like:
int[1..2*5] foo;
Those are merely examples and imply the use of static types with restrictions. I'm not sure what the syntax would be for dynamic types... it could take an OO approach:
foo.range = [1..2*5];
Or like this:
var[1..2*5] foo


There are many different subjects that would go along with designing a language; the type system is just the biggest aspect I've been thinking about lately.
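[Editor's note: for what it's worth, a rough C++ analogue of the `int[1..2*5] foo;` idea above might look like the sketch below, with the range carried in the type and checked at run time. Names are illustrative only.]

#include <cassert>

// Sketch: the allowed range is part of the type; values outside it trip an
// assertion (a real design would report a proper error instead).
template <int Lo, int Hi>
struct RangedInt {
    int value;
    RangedInt(int v) : value(v) { assert(v >= Lo && v <= Hi); }
};

RangedInt<1, 2 * 5> foo(7);      // fine
// RangedInt<1, 2 * 5> bad(42);  // would fail the range check at run time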

I think you should open up a forum somewhere, Apoch, for us to discuss our ideas and develop this language, so everyone here can help out.

EDIT: I'm looking at this now and I'm curious what you think about the use of dynamic types versus static, Apoch. The syntax looks much cleaner when using static; actually I really like this line: int[1..2*5] foo; Even though that was just made-up syntax, I like it. Ah well, I would like to hear your thoughts on it, along with everyone else's.

#5 ApochPiQ   Moderators   -  Reputation: 16423


Posted 27 March 2006 - 05:35 PM

Expressing range and validation constraints on data types is very important, I think.

I'd take it a step further, though. First of all, I think data type restrictions should be explicitly stated as formal declarations. Having the restrictions inline in every single variable declaration is going to get messy, and is counterproductive if you ever need to tweak the restrictions. Secondly, I think there should be an implicit concept of "validation" in each data type:

DataType oddnumber Basis:Integers Validation(x):{ x % 2 == 1 };

Of course if complex validation is not needed it can be omitted. Such validation should be done as much as possible at compile-time (looking for potential conflicts, etc.) and have a well-defined model for validation at run-time. Off the top of my head, the easiest way to do this is to have validation occur as a pre-copy step when evaluating an assignment. For instance:

// Given some function Foo()
oddnumber x = Foo();


Foo() is first invoked and its return value obtained. Then, the return is passed to the validator of oddnumber, which is a magical boolean function or something. The validator detects that Foo() has returned an even number, and a run-time error is raised to signal the offense.

At compile-time, if Foo() is simply flagged as returning "Integer" rather than an explicit oddnumber, the compiler should raise a warning that Foo() cannot be guaranteed to return compatible data. Conversions between user-defined data semantic types should basically follow traditional conversion-warning rules from languages like C++ (where, for instance, assigning an int into a bool makes compilers grumpy).
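[Editor's note: to make the compile-time/run-time split concrete, here is a hedged C++ sketch using constexpr rather than the proposed language. When the value is known to the compiler, a failed validation stops the build; otherwise the same check runs at run time.]

#include <cstdlib>
#include <stdexcept>

constexpr bool is_odd(int x) { return x % 2 == 1; }

// Sketch of a semantic type whose validator runs in the constructor.
struct OddNumber {
    int value;
    constexpr OddNumber(int x) : value(x) {
        if (!is_odd(x)) throw std::logic_error("oddnumber validation failed");
    }
};

int main() {
    constexpr OddNumber a(7);     // literal: validated by the compiler
    // constexpr OddNumber b(8);  // would not compile: the validator throws
    OddNumber c(std::rand());     // unknown value: validated at run time instead
    return a.value + c.value;
}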



Now of course there is the concept of introspection and metaprogramming, where it becomes important to be able to query a user-defined semantic type about its nature. For instance, I should be able to write code like:

if(2 IsValid oddnumber)
message("barf!");


This has two main benefits. First, and most obviously, it allows for easy testing of the semantics and validation of types. Secondly, it allows the program logic itself to adapt at run-time to the capabilities of data types. Since all programs in the FooLanguage express semantics in the same way, it should then be possible to transparently marshal data between programs without needing explicit converter code for each combination of two interacting programs. There's also some cool metaprogramming stuff that can be done with type introspection; I'll have to poke my brain a bit and see if I can come up with a good example.


Side note on syntax
I don't really like the C/C++ syntax format. In fact, I don't like the syntax philosophy of most languages. The philosophy tends to seem like "memorize this set of magical symbols and words, and then figure out how to make them interact." Inconsistent semantics and quirks make it hard to memorize what certain magic characters are for. (Exhibit A: the eternally annoying angle-bracket semantics in C++)

Personally, if I'm going to go to all the effort to construct a language from scratch, I'm going to make an attempt at a consistent, logical, and readable syntax. A beginner coming into the language should be able to figure out the bulk of the syntax elements just from their names. We've been pushing the merits of self-descriptive identifiers in code for years... now let's take it the next logical step and start looking at self-descriptive syntax, too.

I can't speak for anyone else, but I'd much rather exchange some verbosity and a little extra typing for extremely clear code.


To pick on your examples, I don't agree with the use of brackets to specify type range. Yes, this is a well-known convention in mathematics, but we're not in mathematics here. In practicality, if we lean towards a C-style syntax, brackets are now given extra responsibility, since they work as the array-indexing operator, and are sort of the de-facto standard for expressing the generic indexing concept in C++. The bottom line is that I now have to know multiple possible roles of the [] symbols.

That doesn't seem like such a huge deal to anyone who has been doing real-world programming for any length of time, so I understand that this looks like a really anal bit of nitpickery to many. In fact I had a similar discussion earlier today about a related issue, and was alone in a group of 5 coders in advocating verbosity, simplicity, and consistency over obtuse abbreviations and symbol-overloading. My gut feeling is... yeah, it's a tiny thing to have to memorize, but why should I have to memorize it at all? Tiny things add up; when a language is full of such tiny, "insignificant" things, we get C++, which is a damned confusing language (syntactically) on the best of days.

The main reasons I hear for using tight, compact symbol-hackery instead of clear natural-language-style syntax are:
- English-like syntax takes too much typing
- Verbose syntax takes up a lot of screen real-estate and can inhibit readability

The first objection is frankly laziness, as far as I'm concerned. Typing is such a trivial subset of the work of programming that I honestly don't care about it. My feeling is that unless the amount of typing work increases at least fivefold, typing work is not even a consideration. So what if I have to press six keys instead of two?

The second objection has merit, and as a matter of fact it's something I think is important when approaching the syntax design of a language. At-a-glance perception of code is very important. Highly experienced programmers are sometimes said to be able to "see clean code" just by looking at the formatting and general pattern of certain syntax elements. Screen real-estate I don't think is really that big of a deal, especially since the eye is remarkably good at filtering out irrelevant syntactic fluff to get to the "good stuff" once one gets used to a language.

Overall readability, though, is certainly an issue. My personal view (again, can't speak for anyone else, and I know many would disagree) is that a natural-language syntax is more pleasant to look at in general than a highly symbolic one. I know of few beginners who would say that Lisp is more readable or obvious than BASIC. Yeah, again, those distinctions largely go away once one invests a lot of time and effort learning to "read" a language. All I'm saying is... why should we even have to? I'm lazy. I don't want to learn to recognize Lisp syntax. I already did all the work of learning to read English, so why can't I take advantage of that existing ability to read (parts of) my code?


Anyways... I need to cut this short (as if it were short to begin with...) or I'll babble all night [grin]

#6 Azh321   Members   -  Reputation: 569


Posted 27 March 2006 - 05:45 PM

A friend of mine has been working on a billing program (a "real world" program :-p). He also knows a good bit about language design and has written a few compilers/interpreters, so I asked him if he had any ideas he would want in a language like this. He said that in the billing program he has a ton of accessor and modifier methods that do nothing but return a variable or modify a variable.

So how about something like this:
class foo
{
private:
int bar : accessor;
}
Which would pretty much automatically generate a public: int bar(); method that returned the value of bar.
Not groundbreaking, but it sounds nifty. That's just example syntax of course; another way of representing it might be:
class foo
{
private:
int bar;
public:
int bar() : bar accessor;
}
More typing, but I thought I might mention it; that just says int bar() is an accessor for bar. The same stuff could work for a modifier keyword that modified the variable instead of just returning it.

Another idea was being able to set variable defaults in the class header rather than in a constructor. I thought that was a good idea, and it almost completely eliminates the use of a constructor for me. I think that's a much better and more consistent way. Plus you don't have to search for the constructor to see the default values.

So there you have it, a few more ideas. They both mean less code that is easier to read/maintain, which I believe is quite important. Let's hear your thoughts on the ideas!
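[Editor's note: for comparison, both ideas can be approximated in C++ today, with a small (hypothetical) macro for the accessor and C++11 in-class member initializers for the defaults. A sketch, not a recommendation:]

// Hypothetical helper: declares a private field plus a public read accessor.
#define ACCESSOR(type, name)                        \
    private: type name##_;                          \
    public:  type name() const { return name##_; }

class Billing {
    ACCESSOR(int, invoice_count)    // generates invoice_count_ and invoice_count()
private:
    double tax_rate_ = 0.0825;      // default set in the class header, not in a constructor
};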

#7 ApochPiQ   Moderators   -  Reputation: 16423


Posted 27 March 2006 - 05:47 PM

OK, one last thought, and then I'm done for a while, I promise [smile]


Another area of datatype semantics that could be extremely useful is manually specifying interactions between datatypes. In the Tim Sweeney thread I talked about Conversion functions that can be used to intelligently munge between data types without risk of losing valuable information or accuracy.

However, a challenge comes up in cases like this:

DataType oddnumber Basis:Integers Validation(x):{ x % 2 == 1 };
DataType evennumber Basis:Integers Validation(x):{ x % 2 == 0 };

oddnumber x = 3;
evennumber y = 2;
y = x;


What can we detect and whine about at compile time?
  • We can compile the validation functions for each data type

  • Validation functions can be required to be effects-free by the language

  • Since validation has no side-effects, we can validate data at compile time

  • We can validate that 3 and 2 are acceptable for oddnumber and evennumber, respectively


Therefore, we can validate at compile time, before the program even starts running, that the first four lines of this code are never going to cause problems or violate their own semantics.

Unfortunately, the fifth line is a problem. We cannot, at compile time, detect that these two data types are mutually exclusive. (Typical reduction to the Halting Problem, blah blah blah, etc.) The best we can do is note that both oddnumber and evennumber have explicit validators, and therefore issue a warning during compilation:

Warning W47948: You might be doing something really retarded on line 5.

However, any human reviewer can see immediately that any assignment between evennumber and oddnumber is going to barf. The two are mutually exclusive.


There are two ways to handle this in the language. The first is to be pessimistic and assume that conversion between semantic types never makes sense unless there is an explicit conversion function, or a chain of conversion functions, by which one semantic type can be transformed to another. The second is to warn about any such conversions, but allow the conversions to be flagged specifically as errors by means of an additional declaration:

MutuallyExclusiveDataSemantics oddnumber,evennumber;

Now the compiler knows that if it ever sees an oddnumber getting assigned to an evennumber, it can vomit outright and demand that the programmer fix his mistake.

I'm not sure off the top of my head, but at the moment I favor the second model. Yeah, it requires more typing again, but it has a very important bonus: it clearly defines the semantics to human readers. Sure, the compiler can figure out that oddnumber=evennumber is a dumb idea in either approach. However, by specifically expressing that these two types are mutually exclusive, we now codify more information that the human programmer can use.
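[Editor's note: a hedged C++ sketch of the first ("pessimistic") model described above. Because the two wrapper types are unrelated and no conversion is defined between them, the bad assignment simply does not compile, with no extra declaration needed.]

// Sketch only: distinct wrapper types with no conversion between them.
struct OddNumber  { int value; explicit OddNumber(int v)  : value(v) {} };
struct EvenNumber { int value; explicit EvenNumber(int v) : value(v) {} };

int main() {
    OddNumber  x(3);
    EvenNumber y(2);
    // y = x;   // compile error: no conversion from OddNumber to EvenNumber
    return x.value + y.value;
}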

When considering a system, there are two sets of knowledge that are important: the set of what the compiler/language knows about the system, and the set of what the programmer knows about the system. The idea here is to minimize the difference between these two data sets, by giving linguistic constructs that allow explicit expression of as much of that knowledge as possible. This is really the same principle behind requiring DBC (design by contract) concepts in the language itself.

#8 ApochPiQ   Moderators   -  Reputation: 16423


Posted 27 March 2006 - 05:50 PM

Quote:
Original post by Falling Sky
[snip thoughts on brevity of accessors, mutators, initialization]


Sounds like your friend needs C# [smile]

In general, I agree - these types of things are preferable. However, I think that may be a bit far down the road at the moment, especially considering that we don't even really have a plan for how OO concepts are implemented in the first place. Given the leanings of Sneftel's suggestion earlier (and my heartfelt agreement with it), I think it's safe to say that the OO philosophy will be quite different from C++, which might make some of these subtleties moot.

#9 Azh321   Members   -  Reputation: 569


Posted 27 March 2006 - 05:52 PM

Actually, I was thinking you WANTED a C++-like syntax. I'd be more than happy to try to strive for something else. Though that's one of the hardest aspects, I think, coming up with a new syntax, especially when you're biased since you can only think of syntax similar to languages you've seen. Look at Python: it's somewhat verbose and to me is very natural to read. Maybe we can borrow some things from it, though I don't like its significant whitespace.

PS: are you not on AIM while you post :-p Maybe I can catch you online tomorrow.

#10 Azh321   Members   -  Reputation: 569


Posted 27 March 2006 - 05:55 PM

Yes, I didn't want to get into OOP aspects, but I asked him what he would want and I just relayed the ideas. It would be cool to think of OOP in a different manner than you do in C++. And yes, I promise this is it for me too for tonight... unless Apoch replies again :-p

#11 Rebooted   Members   -  Reputation: 612


Posted 27 March 2006 - 07:35 PM

I don't think it's important to be talking about details like accessors or syntax yet.

If we are going to have this discussion, we should focus on specific areas, look at the approach of other languages, and decide what is best for "Foo".

I think we should start with the type system. Whatever paradigms the language supports, it's going to have a type system. The type system will be what makes the language safer, and it's the basis of everything.

#12 Azh321   Members   -  Reputation: 569


Posted 28 March 2006 - 04:50 AM

Yes, I agree; matters like that may not even be needed in this language, but I wanted to relay the ideas before I forgot them. Yeah, I also think the type system is probably one of the best places to start.

#13 Rebooted   Members   -  Reputation: 612


Posted 28 March 2006 - 05:48 AM

Well, no one seems to want to comment.

This is what I think:

We have strong, static typing. Normal data types would be Number, String, etc. Types like Byte and Int32 would be available in the standard library when you needed to work on a concrete level.

On top of this we would have something like dependent types. Epigram and Ontic have systems like this. ApochPiQ's validations on types are the same kind of thing.

We might decide we want dynamic typing. Conceptually, everything in the type system could be done via constraints, so a variable with no constraints placed on it is a dynamic type, and can be assigned any value. Another variable could be constrained to the type of a natural number between 0 and 5.

These are just my initial thoughts, what do you think?

#14 ZQJ   Members   -  Reputation: 496


Posted 28 March 2006 - 07:24 AM

And I was just about to post in the other thread "how long before somebody proposes creating a language here on gamedev". Anyway, a few things:

1) Regarding dimensional analysis - fractional dimensions must be supported. They're rare but I have seen them.

2) On class design - one of the flaws of C++ is its approach to generic programming. If I want to swap two variables in a template I have to use the following code to be generic:

using std::swap;
swap(a, b);

Because then Koenig lookup (ADL) will find a swap function for any class type, while std::swap is still available for built-in types to work. This problem got me thinking that a function is often more defined by what it does than what it operates on, so perhaps the concept of member functions is unnecessary. Obviously some kind of access protection is still needed for classes, so class definitions will still need to specify which functions can access their members. Perhaps read and write access could be considered separately to reduce the need for accessor functions.

Of course if functions were separate from classes they would need a separate access scheme to prevent internal functions being called on classes. This would also allow multiple instead of single dispatch but might require all virtual functions to be listed in the class definition so the compiler knows what to look for when building the vtable. I think LISP's CLOS works something like this but I've never used it so I'm not sure.
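[Editor's note: for readers unfamiliar with the idiom quoted in point 2, here is the complete picture as a small self-contained C++ example. The Vec type and namespace are made up for illustration.]

#include <utility>

namespace geo {
    struct Vec { double x, y; };
    // A type-specific swap, found through argument-dependent (Koenig) lookup.
    void swap(Vec& a, Vec& b) { std::swap(a.x, b.x); std::swap(a.y, b.y); }
}

// The two-line idiom above, wrapped in a generic function: std::swap is in
// scope for built-in types, and unqualified lookup picks up geo::swap for Vec.
template <typename T>
void generic_swap(T& a, T& b) {
    using std::swap;
    swap(a, b);
}

int main() {
    int i = 1, j = 2;
    geo::Vec u{1.0, 2.0}, v{3.0, 4.0};
    generic_swap(i, j);   // uses std::swap
    generic_swap(u, v);   // uses geo::swap via ADL
}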

3) Metaprogramming: C++ templates allow all kinds of weird and wonderful things to be done at compile time but they tend to slow compiling down to a crawl. Boost's parser library is a great example of this - it would be good if a new language could allow portions of code to be processed by other portions somehow so similar features could be implemented in a simpler way. It seems obvious to me that all code would have to follow some kind of basic syntax rules in order to be able to cope at all.

4) Operators: LISP doesn't suffer any kind of precedence issues because the order is totally explicit. However I think this leads to a pretty verbose language and I'd prefer it if operators had precedence as in C/C++ because I think it makes formulae clearer. However it would be nice if it was possible to redefine precedence, and in fact if the original precedence was defined by the same system along the lines of the metaprogramming system (sorry, that all sounds a bit vague really).

5) Syntax: I'm definitely a fan of terse syntax. C/C++ style is probably not the way forward because it ties you into one style of programming too much. A functional style should definitely be available. As far as English language keywords go I think that's the least of most people's worries in learning to program well. And what about foreign languages - it's probably no less confusing for the rest of the world if they have BEGIN...END rather than {...}; it's just eight meaningless characters rather than two.

6) Garbage collection - I'm not an expert on how to implement garbage collection efficiently, and as far as I remember a lot of the sources I've read have been conflicting. Anyway, it seems to me that in a lot of cases ownership of objects is pretty obvious and therefore garbage collection is superfluous, so I think there should be a mechanism for restricting garbage collection to where it's needed.

7) Static typing - I think static typing is the way forward. Types should be deduced and checked at compile time. I can't see dynamic typing being as efficient and a static type system with introspection can emulate dynamic typing where it's necessary anyway, or a generic object could be made available.

8) Templates - C++'s system is brilliant in the flexibility it allows, but it seems to me that when a type is passed to a template it is often required to provide some specific capabilities, which usually produces unreadable compilation errors. It would be better if the requirements were stated explicitly rather than implicitly in the code, because this would allow the code to be checked before a type was given to it and produce simpler errors in all cases. It seems to me like this would be well described by a concept similar to inheritance, but judging by ApochPiQ's journal he thinks differently.
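[Editor's note: what point 8 asks for is, more or less, what C++ itself later added as concepts (C++20). A minimal sketch of the idea, with made-up names:]

#include <concepts>

// The requirement on T is stated up front instead of being implied by the
// body, so a bad instantiation fails with one readable diagnostic.
template <typename T>
concept Doubleable = requires(T x) {
    { x * 2 } -> std::convertible_to<T>;
};

template <Doubleable T>
T twice(T x) { return x * 2; }

// twice(std::string("oops"));  // error: constraint 'Doubleable' not satisfied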

I've got some more thoughts on the subject of RAII, and others, but this post is long enough as is.

#15 Rebooted   Members   -  Reputation: 612


Posted 28 March 2006 - 07:58 AM

Well, it'd be best not to use templates. Polymorphism should be available through some form of generic code, and if we need a metaprogramming system, something like Nemerle's system would be better than hacking templates, which were not designed for metaprogramming.

Nemerle does it like this: Macros are Nemerle functions, just run at compile time instead of runtime. They have access to a compiler API, and generate code through syntax trees. This is easy too:
<[ Fun(); ]>
represents the syntax tree for
Fun();


#16 ZQJ   Members   -  Reputation: 496


Posted 28 March 2006 - 08:29 AM

Yes - I was about to make a comment along the same lines. I'm not familiar with Nemerle, but I was going to say this: C++ templates seem to confuse two things - polymorphism and metaprogramming. They force compilers to build separate functions for different types which are more efficient than manipulating those types through virtual functions but conceptually the same. Ideally this decision would be taken by the compiler not the programmer but if that is not possible then at least the function should only be written once and the programmer should decide explicitly which method to use. Therefore I agree templates shouldn't be used.

#17 ApochPiQ   Moderators   -  Reputation: 16423


Posted 28 March 2006 - 10:10 AM

Type system is definitely a great place to start. Personally, this is what I'd like to see in a type system:

  • Strong typing as a general default

  • Allow selective weakening by defining type convertors

  • Static typing as a general default

  • Permit dynamic typing with generic-types and concepts


Yeah, I know this sounds like "best of all worlds" gibberish and verges a bit on bloat, but bear with me here [smile]

Strong, static typing goes with the philosophy of clearly expressing semantics and contracts. Since we're aiming at having the behavior of a program very well defined, it doesn't make sense to default to highly weak types, or dynamic typing. However, both weak typing and dynamic typing have their uses; I think we can obtain the benefits of those concepts without sacrificing the compile-time verifiability of strong, static typing in general.

In general, I'd like to see types treated as abstractly as possible: "Number" rather than int, unsigned, float, et al. This means that typical whining about strong typing (like munging between numeric types) is not really important as often as it might be in more concretely-typed languages like C.

More importantly, we should be able to define type munging semantics ourselves. The "standard library" will probably need to offer basic numeric-munging support out of the box, such that Integer to Real conversions are handled properly both directions, perform correct rounding, etc. So if we want, we can "loosen up" the strong typing a bit by offering such munging facilities. This can become an extremely powerful technique when OO concepts start showing up, because a conversion between two object types (classes) may become possible, allowing for all manner of fun stuff with generic programming.


Type munging will in general take care of the strong/weak decision, but that still leaves static/dynamic to figure out. Again I think the focus on compile-time expressivity and certainty makes static typing the natural choice. However, as we all know, static typing makes generic programming a royal pain in the posterior. Generic programming happens to be a highly nifty thing, so we shouldn't sacrifice it offhand.

C++ has one approach to combining static typing and genericity, but templates are ugly and not really an optimal solution. More importantly, templates are (essentially) preprocessor magic, since they have to be instantiated at compile-time. While template metaprogramming is cool, and compile-time magic is a fun brain exercise, these things tend to get in the way of Real Work. The combination of C++'s file model and the technical behavior of templates make them virtually useless for true generic programming, since I have to either bloat each code file with the complete template class, or instantiate the template for each type combination I want to specialize into that template. I think we can do far better.

I think there might be some room for type inference in Foo, but I'd prefer to avoid it. I'd really like to see the notion of data types enter the semantic realm, where I define abstract data types myself in each application (like milliseconds, pixels, etc.) instead of just working in vague stuff like float and int. Since each user-defined semantic type can have quirky behavior, it probably will become impossible to do genuine type inference.

For example, consider the oddnumber concept that I've mentioned before. If I need to infer the type of the literal 7, what do I do? If we allow a duck-typing model, we'll have to run the validator of every semantic type in the program's knowledge space in order to try to infer the type. And what if we get multiple positives? The oddnumber validator will say that 7 is definitely a valid oddnumber, but so will the primenumber validator. Type inference doesn't work here. I'm not entirely sure yet, but I suspect that with implicit munging facilities definable by the programmer, type inference won't even really be needed.


So, getting back to generic programming... I think concepts are the way to go here. Let me recall one of my pseudo-examples from earlier:

DataType oddnumber Basis:Integers Validation(x):{ x % 2 == 1 };

This semantic type definition has some important information, specifically the Basis field. I think this is where we capitalize on concepts. If we define that any semantic type (i.e. any DataType) must define a subset of the basis set, we're golden. "Integers" becomes the concept, and oddnumber will always overlap a subset of the integers. If "Integers" is defined in the language to also be a subset of "Number," I get some good results. This means that I can write a generic blurb like this:

Number DoubleNumber(Number x) { return x * 2; }

This will automatically specialize properly to oddnumber. What's important, though, is the return value; although the parameter can be implicitly decayed from oddnumber to Integer without violating semantics (since DataTypes are always a subset of the basis type), we can't do the same for the return value of the function. In fact, DoubleNumber() will never return a valid oddnumber, but that is not verifiable at compile time.

I think the way to handle this is to specialize generic return values based on the way the function is called, not the way the function is declared. For instance:

oddnumber x = 7;
oddnumber y = DoubleNumber(x); // barf!
Integer z = DoubleNumber(x); // all OK; munge to integer i.e. round off
Number q = DoubleNumber(x); // all OK; number is a generic concept


DoubleNumber returns a generic Number. The return value is never actually specialized; it remains a generic. What's important is whether or not I can munge it into a given type. The oddnumber munge attempt will fail, for obvious reasons. The Integer munge will simply round off the result (if needed) and give me an Integer. This should include a warning as to possible loss of precision, as usual. Finally, if I simply treat the function as returning Number, no conversion is needed, and I can go about my merry way.

This kind of builds a layered system on the notion of constraints, as Rebooted mentioned earlier.
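[Editor's note: a hedged C++ approximation of the call-site behaviour described above, reusing the oddnumber idea. Names are illustrative, not a real design.]

#include <stdexcept>

// Generic "Number" function: the result stays as generic as its argument.
template <typename Number>
Number double_number(Number x) { return x * 2; }

// Constrained semantic type whose validator runs on construction.
struct OddNumber {
    int value;
    explicit OddNumber(int v) : value(v) {
        if (v % 2 != 1) throw std::domain_error("not an odd number");
    }
};

int main() {
    OddNumber x(7);
    int    z = double_number(x.value);      // fine: treat the result as an Integer
    double q = double_number(1.5);          // fine: treat the result as a generic Number
    // OddNumber y(double_number(x.value)); // the "munge" into oddnumber fails at run time
    return z + static_cast<int>(q);
}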


This of course raises a few important questions:
  • Can users define concepts? I'd like this ability, but I don't know how it would be done yet.

  • What about a true-generic "Thing" that encompasses numbers, strings, etc?

  • How does this mesh with OO concepts?




Some final assorted thoughts:

Operators
Implicit precedence is good, IMHO. We're used to it, and it allows us to avoid lots of nested parentheses [grin] Mainly, though, my concerns in the area of operators lie in the notion of defining them. Defining custom operators (as words, not magic symbols) would make things very, very handy, especially if operators are allowed to also do type conversions. For instance, I could define a DotProduct and CrossProduct set of operators, and get highly self-descriptive code. Yeah, there is potential for abuse and obfuscation here as well, but as with C++'s operator overloading I think the risk is justified. Defining custom operators (and their precedence hierarchy) is important to defining domain-specific abstraction layers.

Syntax
BEGIN and END are gross. I like the curly-brace-block style, but I think it's important to keep a healthy balance between magic shapes/symbols and readable text. C++ has too many magic squiggles, for instance; stuff like "= 0" for pure virtual function specification is just gross. Some symbols are needed for expression - parentheses, brackets, block braces, operators, etc. But reliance on magic squigglies can be taken too far, and I'd like to avoid it. Part of what gives me a headache any time I try to read Lisp code is the mass of squigglies. Brevity and clearness are good, but when communication of purpose becomes important, let's prefer words to squigglies. I never want to see a comma operator in Foo.

Garbage Collection
Definitely has to be a way to denote stuff that is garbage collected, as well as individual blobs of data and sections of code which are not. I think GC should be the default preference as much as possible. Manual memory allocation/control is going to be important for concreteness, but it should be harder to use than the GC model, to help underscore the fact that it is also much harder to get right.

Metaprogramming
I really like the notion of running code at compile-time. I think C++ templates are half a step in the right direction; they got the Turing completeness, but fail to be really practically useful for a lot of things. I'd like to see the compile-time metaprogramming capabilities be basically identical to runtime - i.e. my compiled metaprogramming is a Foo program that operates on itself. I firmly believe that code should be self-conscious: it should know that it's code, and it should know what to do about it.
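[Editor's note: one existing point of reference for "the same code runs at compile time and at run time" is C++'s constexpr machinery, which arrived after this discussion. A tiny sketch:]

// The same function is evaluated by the compiler in the static_assert and
// by ordinary generated code at run time.
constexpr int factorial(int n) { return n <= 1 ? 1 : n * factorial(n - 1); }

static_assert(factorial(5) == 120, "evaluated at compile time");

int factorial_at_runtime(int n) { return factorial(n); }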

#18 Rebooted   Members   -  Reputation: 612


Posted 28 March 2006 - 10:50 AM

I agree with your points. I think we should only allow implicit casts down the type hierarchy though. I'd be careful with allowing implicit conversions to be defined. So if you code a function that takes a Number argument, it'll work for all "derived" types, but anything else requires an explicit cast. This cast will fail if the validations specified on the type are not met.

I also think we should specify all types in terms of these validations. It will simplify and unify things. So a type with no validations is a dynamic type (your true-generic "Thing"), then you have Number, String. The next layer has Integer, Real; then comes Natural; then comes Natural:{ n > 7 }.

I see your point about type inference. This is going to make it impossible for a compiler to select the right function to call if it has overloaded forms, too.

We should definitely look at Epigram, DML and Ontic.

Tim Sweeney talked a bit about type systems here.

I think we have the idea of the type system sorted, pretty much.

#19 Azh321   Members   -  Reputation: 569


Posted 28 March 2006 - 11:00 AM

You guys realize it's going to be really... "fun" implementing dynamic types in a compiled language? Or will this be interpreted? In my opinion I think we should use Parrot. It's the VM being built as a universal interpreter for dynamic languages and is Perl 6's primary target.

#20 CoffeeMug   Members   -  Reputation: 852


Posted 28 March 2006 - 11:06 AM

Quote:
Original post by ApochPiQ
Static typing as a general default

I strongly disagree. I want inferred typing where possible by default. When inference fails, the most I can handle is a warning. I want to be able to place type constraints and have the type checker generate warnings. I don't want to spend my time making the compiler happy for the sake of making the compiler happy.
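[Editor's note: for a flavour of the trade-off being asked for here, this is how later C++ handles it; a sketch, not a claim about the proposed language. Types are inferred where possible and written out only where the programmer wants the constraint.]

#include <numeric>
#include <vector>

int inference_demo() {
    std::vector<int> values{1, 2, 3};
    auto total = std::accumulate(values.begin(), values.end(), 0);  // inferred as int
    long constrained = total;  // explicit annotation only where it matters
    return static_cast<int>(constrained);
}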



