I'm really starting to get interested in the notion of building a new programming language. Every language I've ever used has had some really glaring shortcomings and problems, even aside from the normal quirks and personality flakiness that any language is going to have as a matter of course.
Now, I have no formal training in PL theory. Some people might say that makes me a bad candidate for trying to design a language. Perhaps I'm foolish to do so, but I disagree. My view is that the academics have had plenty of chances to design The Perfect Language. Thus far, they haven't done such a great job. On the flip side, people who are concerned primarily with getting crap done seem to be far more successful: PHP, Java, C#, and (to a far lesser extent) Ruby all come to mind.
By nature, as a professional programmer, I'm intensely concerned with getting crap done. I really don't care what some guy in some university thinks is the best way to build a language. Existing theory and "best practices" don't mean much to me, personally. I live out in the real world, and in the real world, the only metric by which we measure the goodness of a tool is whether or not it helps us get crap done. C++ is a good tool because it lets us get work accomplished (mainly by virtue of its sheer inertia, but that's another matter). Malbolge is a bad tool not because it's weird, quirky, or obtuse, but because it gets in the way of doing stuff. It doesn't let us get things done.
A lot of people in the academic sphere are big on stuff like functional programming, aspect-oriented programming, and all this. I've read papers dating as far back as the early 1980s claiming that such concepts are, by necessity, central to producing good software. Yet here we are, smack in the middle of 2006, and we still write code in imperative languages like C++. Why? Because we can get stuff done in them.
However, anyone who has had even fleeting experience with those concepts understands their appeal. There's some good stuff out there. People like and use (sort of) the obscure academic languages for good, practical reasons. There are certainly gaping holes in our current toolset that are going to need to get fixed soon.
So, in the interests of being highly pragmatic, I think it's time we sat down and built a new language. This time, though, let's avoid the trap of doing what sounds good to the PL theorists, and keep a heavy emphasis on just getting crap done. At the end of the day, I really don't give a good goddamn whether I can use a higher-order function, a lexical closure, or currying. I can tell you how many times those terms have come up in the process of me trying to get stuff done: 0. I don't care about being a good PL theorist or a good academic. I want to get crap done.
The Pragmatist's Programming Language Wishlist
- The minimal level of abstraction should be highly concrete.
It is my belief that C++ (and C before it) has had success largely because it allows freedom to build software close to the hardware. Sure, both languages allow the expression of abstract concepts - C++ much more so than C - but there are far superior languages for expressing the abstract. It is not uncommon to embed languages into applications (especially games) so that abstract logic can be done in a language more suited to such things than C++. The longevity of C++ in particular has nothing to do with its abstract expressivity - it has everything to do with its ability to be concrete and unabstract. That's why C++ remains (incorrectly) synonymous with "performance-critical code" for many programmers.
A new language must realize this, and capitalize on it. If a new tool is to have any kind of penetration and staying power in today's language market, it will have to allow programmers to work in highly concrete terms. This is critical for two reasons. First, it allows the New Language to affect many problem domains, like 3D graphics engine development, where concreteness is important. Second, it allows easier portability. If some useful subset of the language can be used without runtime dependencies (like those of Java or .NET languages), the language will have a much quicker transition to platforms other than the one it was born on. Computers have changed a lot since C++ was designed, but we still use it, because C++'s concreteness allows for portability.
- Maximize the ability to express semantics in both abstract and technical terms.
We need a combination of C++'s ability to be very precise about data storage (unsigned 32-bit integers), and the ability of any reasonable high-level language to ignore those technical details. We need a way to be very specific about the way data is handled in hardware, when such things are appropriate. (This also ties in with having a low level of minimal abstraction inherent in the language; low-level programming is impossible without technical semantic expressivity.) At the same time, productive programming demands that we be able to quit worrying about stupid problems like overflow and signedness.
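To make the two sides of that concrete, here is a rough sketch in Python, picked only because it is short to read and not because it is the language being proposed. It uses the standard `struct` module for the "precise storage" half and plain Python integers for the "don't make me think about overflow" half. The variable names are made up for the example.

```python
import struct

# Abstract side: Python integers never overflow, so this just works.
big = 2 ** 64 + 1

# Concrete side: force a value into exactly four little-endian bytes,
# the way an unsigned 32-bit integer would be stored.
packed = struct.pack("<I", 4_000_000_000)

# Wraparound has to be asked for explicitly by masking to 32 bits.
wrapped = (0xFFFFFFFF + 1) & 0xFFFFFFFF

# A value that does not fit is rejected instead of silently truncated.
try:
    struct.pack("<I", 2 ** 32)
    fits = True
except struct.error:
    fits = False
```

The point is only that both modes can live side by side. In this sketch the concrete mode is bolted on through a library, whereas the wishlist asks for it to be part of the language itself.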
We need ways to think about data types more richly than "numeric" and "string." We need a combination of dimensional analysis, flexible type definition, and rigorous storage/behavior semantics. Allowing types to be validated implicitly would be ideal. We need a language that lets us express the fact that milliseconds-numbers are not interchangeable with pixels-numbers, and so on. The use of explicit conversion functions should let us forget, once and for all, stupid things like measurement units (inches vs. kilometres, etc.). However, it is unacceptable to have such functionality at the expense of control over the representation of data in hardware.
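As a rough illustration of the milliseconds-versus-pixels idea, here is a minimal Python sketch. The names `Quantity`, `Milliseconds`, `Seconds`, `Pixels`, and `to_seconds` are all invented for this example, and the checks happen at run time rather than compile time, which is weaker than what this wishlist item really asks for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit; each subclass names one unit."""
    value: float

    def __add__(self, other):
        # Only quantities of exactly the same unit may be added.
        if type(other) is not type(self):
            raise TypeError(
                f"cannot add {type(other).__name__} to {type(self).__name__}"
            )
        return type(self)(self.value + other.value)


class Milliseconds(Quantity):
    pass


class Seconds(Quantity):
    pass


class Pixels(Quantity):
    pass


def to_seconds(ms: Milliseconds) -> Seconds:
    """Explicit conversion function between two time units."""
    return Seconds(ms.value / 1000)


frame_time = Milliseconds(16) + Milliseconds(4)   # fine: same unit
elapsed = to_seconds(frame_time)                  # explicit conversion

try:
    Milliseconds(16) + Pixels(640)                # mixing units is rejected
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Note what the sketch does not do: it says nothing about how `value` is laid out in memory, which is exactly the control the paragraph above refuses to give up.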
I have deposited more detailed thoughts on this concept here.
- Encourage the use of nested layers of abstraction to build domain-specific "dialects" on top of the base language.
I've burbled extensively about my feelings on abstraction before, so I'll refrain from repeating those thoughts. In essence, though, what we need is a way to define a sort of "sandbox" inside the language itself. That sandbox should be highly abstract, have a well-defined interface to lower levels of abstraction, and yet be highly domain-specific. Talking across domains should be forbidden. Talking to other layers of abstraction by any route other than that well-defined interface must be forbidden.
Languages like C++ let us define layers of abstraction via things like class inheritance trees. Yet they also allow us to talk between any two layers of abstraction at will. Any code can, with sufficient pressure on the compiler, talk to any other level of abstraction in the program. This is bad because it allows violations of encapsulation and modularity. By contrast, domain-specific languages are good because they do only what they need to do, and there is literally no way to do other stuff. You can't send network packets from a LaTeX document.
However, creating a DSL for every possible domain is not really practical. Embedding languages has some potential, but it also has a cost: you have to have two distinct languages. In some cases, the binding mechanisms required for this are highly nontrivial, and can even result in a net loss of both coding productivity and runtime performance. Clearly, the solution is not to add to the proliferation of languages.
Instead, we need a language that allows us to build little dialects of functionality and domain. Enforcing abstraction and encapsulation has proven good effects. I think the implementation of this might end up looking vaguely like Java packages or .NET modules, but it needs to be approached from the ground up with the express intent of restricting inter-module interaction, rather than promoting it.
- Provide highly expressive compile-time contract support.
Design By Contract is a very good thing. In general, we need a language that lets us express contracts, in a way that is inherent in the language itself, and verified as much as possible by the compiler as well as at run-time. We need contracts on data types. We need contracts on functions. We need contracts on objects.
We need to be able to specify the exact limits and expectations for every aspect of our code, and make sure that those contracts are obeyed. One of the great benefits of functional programming is the ability to validate code at many different levels, simply by validating that a function does as expected. We can easily have this benefit in imperative-style programming, provided we have sufficiently expressive contract definition and enforcement mechanisms.
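For a feel of what a function contract looks like in practice, here is a small Python sketch. The `contract` decorator and its `pre`/`post` parameters are hypothetical, invented for this example. They are not part of any standard library, and they only check at run time, whereas the wish here is for the compiler to catch as much as it can.

```python
import functools


def contract(pre=None, post=None):
    """Attach a precondition and/or postcondition to a function.

    `pre` receives the call arguments; `post` receives the return value.
    Both conventions are assumptions made for this sketch.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ValueError(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            if post is not None and not post(result):
                raise ValueError(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def integer_sqrt(x: int) -> int:
    """Largest integer r with r * r <= x."""
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r


root = integer_sqrt(17)        # satisfies both halves of the contract

try:
    integer_sqrt(-1)           # caller breaks the precondition
    rejected = False
except ValueError:
    rejected = True
```

The limits of the function are written down next to the function, and a caller who ignores them finds out immediately instead of three modules later.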
- Provide a high degree of introspective capability.
Code should know that it is code. Code should know that it runs alongside other code. Code should be able to express concepts in terms of other code.
This is really a multi-sided issue. On the one hand, we need the ability to do things like comprehensions, wherein a data structure knows how to traverse itself. We then need the ability to tell a structure to traverse itself, and do some operation on each element, or maybe on select elements based on some discrimination criteria. However, it goes deeper. We need to be able to look at a data type and see what we can do with it (à la concepts). We need to be able to ask a function what its semantics are and what sorts of data it can operate on. We need a way to think about objects and what they can do, at run-time.
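Some existing languages already cover pieces of this. The Python sketch below uses the standard `inspect` module, `hasattr`, `callable`, and a list comprehension; the `scale` function and the sample data are made up for the example. It covers the "what does this accept" and "what can this do" questions, but not the harder one of asking a function about its semantics.

```python
import inspect


def scale(points: list, factor: float = 2.0) -> list:
    """Multiply every number in `points` by `factor`."""
    return [p * factor for p in points]


# Ask a function what it accepts.
params = inspect.signature(scale).parameters
param_names = list(params)
factor_default = params["factor"].default

# Ask a value what it can do, rather than what it "is".
can_iterate = hasattr([1, 2, 3], "__iter__")
can_call = callable(scale)

# Let a structure traverse itself, acting only on selected elements.
readings = [3, -1, 8, -5, 2]
doubled_positives = [r * 2 for r in readings if r > 0]
```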
We need to escape academic fluffery and jargon. The term "currying" needs to never be seen in the language's documentation. We need to keep an emphasis, when designing introspection features, on real life uses. If an introspective feature cannot be justified with an example that immediately reveals, to a moderately experienced imperative-language programmer, the benefit of that feature, then the feature is crap and should be axed. We should prefer pissing off the academics by not using the "right terminology" over pissing off people who still have never been told why they should care about these notions.
- Escape from stupid and counterproductive conventions.
The idea that functions should only return 0 or 1 values is stupid. This is a pointless convention that makes no sense. In real-life programming, multiple-return-value functions could save a lot of time and code complexity.
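Here is the kind of thing meant, sketched in Python, where a function can hand back several values at once and the caller unpacks them by position. The function name `min_max_mean` and the sample numbers are invented for the example.

```python
def min_max_mean(values):
    """Return the smallest value, the largest value, and the mean."""
    return min(values), max(values), sum(values) / len(values)


# One call, three results, no output parameters or throwaway struct.
lowest, highest, average = min_max_mean([4, 8, 15, 16, 23, 42])
```

The equivalent in a one-return-value language usually means either out-parameters or a single-use struct declared somewhere else, which is the extra code complexity being complained about.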
The notion that objects are only "of" certain types, defined by a rigid hierarchy of inheritance, is stupid. We need a sort of fuzzy model that lets us work in the realm of concepts and capabilities. Sometimes objects are best expressed in terms that are not definable by "is-a" and "has-a."
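One existing approximation of "capabilities instead of hierarchy" is structural typing. The Python sketch below uses `typing.Protocol` from the standard library (Python 3.8 or later); the names `Drawable`, `Sprite`, `Invoice`, and `render` are hypothetical. It only checks that a method with the right name exists, so it is a long way from the fuzzy concept model described above, but it shows an object qualifying by what it can do rather than by what it inherits from.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    """The capability: anything with a draw() method qualifies."""
    def draw(self) -> str: ...


class Sprite:
    # No inheritance from Drawable; it simply has the capability.
    def draw(self) -> str:
        return "sprite"


class Invoice:
    def total(self) -> float:
        return 0.0


def render(thing: Drawable) -> str:
    """Works with any object that can draw itself."""
    return thing.draw()


sprite_is_drawable = isinstance(Sprite(), Drawable)
invoice_is_drawable = isinstance(Invoice(), Drawable)
output = render(Sprite())
```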
In general, both OO and procedural programming are too weak to really work. Procedural programming doesn't scale nicely to massive code bases with complex data entities. OO doesn't work nicely when the boundaries between object types are fuzzy. Attempts at generic-OO capabilities have been sort of successful, but still are mostly ugly hacks and have plenty of drawbacks. Existing paradigms for organizing logic and data are insufficient.
Entire modes of thought that worked great in single-thread universes will cease to be useful as multiprocessing technology becomes more widespread. Dealing with synchronicity and parallelism is going to be a major responsibility for languages in the future, and a new language must understand multiprocessing concepts deeply in order to remain a viable option. However, as with other areas, it is important to avoid extremes - the language must not forget that a lot of code still makes sense in a single-thread model.
These are just the beginnings of my thoughts here, and are admittedly rather fuzzy (which is a side effect of me not having a good sleeping pattern). I welcome any other thoughts, insights, and wishlist-items.
Let's get some crap done.