The existing core of the language is fairly solid at this point, with most of my work (when it happens) going into cleaning up and refining the implementation of the ahead-of-time compiler and the associated IDE and toolchain. My other primary time sink, however, is designing the parts of the language that are not yet specified. I'm trying to keep to the core structure and feel of the language as much as possible while adding functionality primarily in the form of well-separated features that can compose in rich ways.
One thing the language is weak at right now is encapsulation and separation of concerns. It feels very much like a C- or Pascal-era language, with few abstractions for grouping related data and code. I want to move past this into a future where composing fundamental language features allows for very powerful control over how programs are structured and how interrelated systems connect to each other.
My solution to this is a notion I've started referring to as tasks. In principle a task should solve many of the same problems that objects solve in other languages. Moreover, tasks should seamlessly integrate with a "green thread" model that I want for the runtime at some point in the future.
There are many considerations that have gone into the current design:
- Creation and management syntax
- Binding instances of a task to names
- Construction into a valid state by default, i.e. no need for two-phase initialization
- Encapsulation and composition tools
- Elegant handling of internal, hidden state
- Minimal extra syntax
- A way to truly hide API/state surface from consumers
There are three components:
- A function which has an internal block called a "dispatcher"
- Messages which are received and handled by the dispatcher
- Syntax for invoking the function and creating a task that can be messaged
All well and good. What does it look like?
Averager : -> task
{
    integer total = 0
    integer count = 0

    dispatch
    {
        DataPoint : integer x { total += x count++ }
        GetAverage : -> total / count
    }
}

entrypoint :
{
    task avg = Averager()
    avg => DataPoint(42)
    avg => DataPoint(666)
    print(avg => GetAverage())
}
Note that Averager looks like a function with no parameters and a special return signature of "task." Internally, it behaves like a function at first: it creates two local variables and initializes them. Next, we encounter the dispatch {} block. This block sets up a message handler structure that is bound to the return value of the function. When the function returns, its local variables are stored in a closure and the dispatch table is kept alongside them, much like a v-table in other languages.
The entrypoint function first invokes Averager() to create a task, and then stashes the closure in the variable avg. Next, it sends two DataPoint messages with some values to the closure. These are handled by the correspondingly-named pattern matchers in the dispatch block inside Averager(). Last but not least, avg is sent the GetAverage() message (which is really just a function call bound to the closure avg) and the result is printed to the screen.
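For readers who think more easily in an existing language, here is a rough Python analogy of the closure-plus-dispatch-table idea described above. It is only a sketch of the concept, not how the Epoch compiler or runtime implements tasks. The names `averager` and `send` are made up for illustration, and integer division is an assumption about how `total / count` behaves on integers.

```python
def averager():
    # Local state captured by the closure; callers never see these names.
    total = 0
    count = 0

    def data_point(x):
        nonlocal total, count
        total += x
        count += 1

    def get_average():
        # Assumption: integer division, mirroring integer arithmetic.
        return total // count

    # Stand-in for the dispatch block: message name -> handler.
    dispatch = {"DataPoint": data_point, "GetAverage": get_average}

    def send(message, *args):
        # Look up the handler by message name and invoke it.
        return dispatch[message](*args)

    # Returning `send` plays the role of returning the task.
    return send


avg = averager()
avg("DataPoint", 42)
avg("DataPoint", 666)
print(avg("GetAverage"))  # prints 354
```

The dictionary lookup is where the v-table comparison comes from: the handlers travel with the captured state, and the only way in is by message name.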
This is of course a very simplistic example. A more interesting example involves polymorphism. The way to achieve this in Epoch is to use protocols. If a task implements all the messages specified by a protocol, it is said to be compatible with that protocol.
protocol Average :
    (DataPoint : integer),
    (GetAverage : -> integer)

// This would realistically use an enumeration instead of magic values
MakeAverager : 0 -> Average avg = AverageMean()
MakeAverager : 1 -> Average avg = AverageMedian()
MakeAverager : 2 -> Average avg = AverageMode()
MakeAverager : integer invalid -> Average avg = AverageMean() { assert(false) }

entrypoint :
{
    Average avg = MakeAverager(random(0, 3))
    avg => DataPoint(42)
    avg => DataPoint(666)
    print(avg => GetAverage())
}
Note that the syntax for a protocol is much like a structure that contains only function pointers. We use pattern matching here in MakeAverager to select a particular sort of average based on a numerical input. The return type is a protocol, meaning that the function is free to return any task that is compatible with the named protocol, in this case Average.
The entrypoint works much the same as before, except this time it indirectly creates a task compatible with Average using a random number.
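The "compatible if it implements all the messages" rule is close in spirit to structural typing. As a loose analogy only, the sketch below uses Python's `typing.Protocol`. The class bodies, the snake_case method names, and the choice to raise `ValueError` for an unknown selector (instead of asserting and falling back to the mean) are all assumptions for illustration; the post does not show how AverageMean, AverageMedian, or AverageMode are written.

```python
import random
import statistics
from typing import Protocol, runtime_checkable


@runtime_checkable
class Average(Protocol):
    # Any object with both methods counts as compatible; no inheritance needed.
    def data_point(self, x: int) -> None: ...
    def get_average(self) -> int: ...


class _Collector:
    # Hypothetical shared helper that just remembers the samples.
    def __init__(self) -> None:
        self._values = []

    def data_point(self, x: int) -> None:
        self._values.append(x)


class AverageMean(_Collector):
    def get_average(self) -> int:
        return sum(self._values) // len(self._values)


class AverageMedian(_Collector):
    def get_average(self) -> int:
        return int(statistics.median(self._values))


class AverageMode(_Collector):
    def get_average(self) -> int:
        return statistics.mode(self._values)


def make_averager(kind: int) -> Average:
    # Plays the role of the pattern-matched MakeAverager overloads.
    makers = {0: AverageMean, 1: AverageMedian, 2: AverageMode}
    if kind not in makers:
        raise ValueError(f"unknown averager kind: {kind}")
    return makers[kind]()


avg = make_averager(random.randrange(3))  # picks 0, 1, or 2
avg.data_point(42)
avg.data_point(666)
print(avg.get_average())
```

Note that none of the three classes mention `Average` at all; they qualify purely by having the right methods, which is the property the protocol feature is after.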
So ultimately, the syntax is very simple. "task" is a special type placeholder with limited application, akin to var or auto, designed to help enforce the contracts of the type system without requiring the programmer to utter really gross type signatures - or, worse, redundant information the compiler already knows.
Binding a task to a name looks like any other variable binding, since task creation is just the invocation of a function.
Task functions can be passed parameters, so they can construct their internal closure into a valid state by default, as the Average example (poorly) illustrates. This is the equivalent of object construction.
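To make the construction point a little more concrete, here is a hypothetical variation sketched in Python rather than Epoch. The idea of seeding the averager with its first sample is my own illustration, not part of the design above: because the parameter initializes the captured state, there is never a moment where the average is undefined.

```python
def seeded_averager(first_sample):
    # The parameter puts the closure into a valid state immediately,
    # so no separate "init" message is ever needed.
    total = first_sample
    count = 1

    def send(message, *args):
        nonlocal total, count
        if message == "DataPoint":
            (x,) = args
            total += x
            count += 1
            return None
        if message == "GetAverage":
            return total // count  # count is always >= 1
        raise ValueError(f"unknown message: {message}")

    return send


avg = seeded_averager(42)
print(avg("GetAverage"))  # prints 42
avg("DataPoint", 666)
print(avg("GetAverage"))  # prints 354
```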
Tasks can encapsulate arbitrarily rich logic just like an object. Moreover, they can be composed and arranged into arbitrarily sophisticated structures using the existing rules of the language combined with message passing.
Internal state is perfectly and cleanly hidden by the fact that the closure is not required to expose any of its local variables, and in fact can only do so by means of a message.
The syntactical burden - on both the programmer and the language implementer (hey, that's me!) - is minimal.
Combined with the fully orthogonal language feature of inner functions, this approach makes it trivial to hide portions of an API surface. I still plan to provide a full Access Control List feature at some point which dictates how various protocols can interact.
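As a sketch of how inner functions can keep part of the surface hidden, consider the Python analogy below. The `clamp` helper and the `clamped_averager` name are invented for this example. One caveat: Python closures can still be inspected by a determined caller, so this shows the shape of the idea, not the strength of the guarantee Epoch is meant to give.

```python
def clamped_averager(low, high):
    total = 0
    count = 0

    def clamp(x):
        # Inner helper: used by handlers, but never listed in the
        # dispatch table, so it cannot be reached by sending a message.
        return max(low, min(high, x))

    def data_point(x):
        nonlocal total, count
        total += clamp(x)
        count += 1

    def get_average():
        return total // count

    dispatch = {"DataPoint": data_point, "GetAverage": get_average}

    def send(message, *args):
        handler = dispatch.get(message)
        if handler is None:
            raise LookupError(f"no handler for message {message!r}")
        return handler(*args)

    return send


avg = clamped_averager(0, 100)
avg("DataPoint", 42)
avg("DataPoint", 666)     # clamped to 100 before being added
print(avg("GetAverage"))  # prints 71
```

Sending `"clamp"` as a message fails with `LookupError`, since only the two public handlers are in the table.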
So overall I feel good about this, which means it's time to open it up to feedback and poke a bunch of holes in it!