Proposal: "Tasks" in the Epoch programming language

Started by
16 comments, last by ApochPiQ 8 years, 3 months ago
For the past several years I've been working on designing and implementing the Epoch programming language.

The existing core of the language is fairly solid at this point, with most of my work (when it happens) going into cleaning up and refining the implementation of the ahead-of-time compiler and the associated IDE and toolchain. My other primary time sink, however, involves designing out parts of the language that are not yet specified. I'm trying to keep to the core structure and feel of the language as much as possible while adding functionality primarily in the form of well-separated features that can compose in rich ways.

One thing the language is weak at right now is encapsulation and separation of concerns. It feels very much like a C or Pascal era language, with few abstractions for grouping related data and code. I want to move past this into a future where composing fundamental language features allows for very powerful control over how programs are structured and how interrelated systems connect to each other.


My solution to this is a notion I've started referring to as tasks. In principle a task should solve many of the same problems that objects solve in other languages. Moreover, tasks should seamlessly integrate with a "green thread" model that I want for the runtime at some point in the future.

There are many considerations that have gone into the current design:
  • Creation and management syntax
  • Binding instances of a task to names
  • Construction into a valid state by default, i.e. no need for two-phase initialization
  • Provide encapsulation and composition tools
  • Elegant handling of internal, hidden state
  • Minimal extra syntax required
  • Need a way to truly hide API/state surface from consumers
Fundamentally, the resulting solution is pretty simple. (Fans of PLT will note that this is really just a hefty dose of the object-closure isomorphism.)

There are three components:
  • A function which has an internal block called a "dispatcher"
  • Messages which are received and handled by the dispatcher
  • Syntax for invoking the function and creating a task that can be messaged
In other words, a task is a closure that can be sent messages to alter its internal state, or retrieve computed values. Any function with an embedded dispatch {} block can be invoked to create a task. Tasks may be stored either on the stack or the free-store at the programmer's discretion; this permits the use of custom allocators.

All well and good. What does it look like?

Averager : -> task
{
    integer total = 0
    integer count = 0
 
    dispatch
    {
        DataPoint : integer x { total += x   count++ }
        GetAverage : -> total / count
    }
}
 
entrypoint :
{
    task avg = Averager()
    avg => DataPoint(42)
    avg => DataPoint(666)
 
    print(avg => GetAverage())
}
Note that Averager looks like a function of no parameters and a special return signature of "task."

Internally, it behaves like a function at first: it creates two local variables and initializes them. Next, we encounter the dispatch {} block. This block sets up a message handler structure that is bound to the return value of the function. When the function returns, its local variables are stored in a closure and the dispatch table is kept alongside them, much like a v-table in other languages.

The entrypoint function first invokes Averager() to create a task, and then stashes the closure in the variable avg. Next, it sends two DataPoint messages with some values to the closure. These are handled by the correspondingly-named pattern matchers in the dispatch block inside Averager(). Last but not least, avg is sent the GetAverage() message (which is really just a function call bound to the closure avg) and the result is printed to the screen.



This is of course a very simplistic example. A more interesting example involves polymorphism. The way to achieve this in Epoch is to use protocols. If a task implements all the messages specified by a protocol, it is said to be compatible with that protocol.

protocol Average :
    (DataPoint : integer),
    (GetAverage : -> integer)

// This would realistically use an enumeration instead of magic values
MakeAverager : 0 -> Average avg = AverageMean()
MakeAverager : 1 -> Average avg = AverageMedian()
MakeAverager : 2 -> Average avg = AverageMode()
MakeAverager : integer invalid -> Average avg = AverageMean() { assert(false) }

entrypoint :
{
    Average avg = MakeAverager(random(0, 3))

    avg => DataPoint(42)
    avg => DataPoint(666)

    print(avg => GetAverage())
}
Note that the syntax for a protocol is much like a structure that contains only function pointers.

We use pattern matching here in MakeAverager to select a particular sort of average based on a numerical input. The return type is a protocol, meaning that the function is free to return any task that is compatible with the named protocol, in this case Average.

The entrypoint works much the same as before, except this time it indirectly creates a task compatible with Average using a random number.


So ultimately, the syntax is very simple. "task" is a special type placeholder with limited application, akin to var or auto, designed to help enforce the contracts of the type system without requiring the programmer to utter really gross type signatures - or, worse, redundant information the compiler already knows.

Binding a task to a name looks like any other variable binding, since task creation is just the invocation of a function.

Task functions can be passed parameters, so they can construct their internal closure into a valid state by default, as the Average example (poorly) illustrates. This is the equivalent of object construction.

Tasks can encapsulate arbitrarily rich logic just like an object. Moreover, they can be composed and arranged into arbitrarily sophisticated structures using the existing rules of the language combined with message passing.

Internal state is perfectly and cleanly hidden by the fact that the closure is not required to expose any of its local variables, and in fact can only do so by means of a message.

The syntactical burden - on both the programmer and the language implementer (hey, that's me!) - is minimal.

Combined with the fully orthogonal language feature of inner functions, this approach makes it trivial to hide portions of an API surface. I still plan to provide a full Access Control List feature at some point which dictates how various protocols can interact.



So overall I feel good with this, which means it's time to open it up to feedback and poke a bunch of holes in it!

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Advertisement
Here's a variant as food for thought:


Averager : -> dispatch
(
    DataPoint : integer x { total += x   count++ }
    GetAverage : -> total / count
)
{
    integer total = 0
    integer count = 0
}

entrypoint :
{
    task avg = Averager()
    avg => DataPoint(42)
    avg => DataPoint(666)

    print(avg => GetAverage())
}

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

I prefer this syntax:


Averager : -> task
{
    integer total = 0
    integer count = 0
 
    dispatch
    {
        DataPoint : integer x { total += x   count++ }
        GetAverage : -> total / count
    }
}

but that's primarily because it looks like a C++ class. It looks good overall.

So, would Protocols work with multiple inheritance, for allowing fine-grained interfaces? (just wildly guessing at the syntax here)...


protocol Insertable<T>
    :(Add : T)
    ,(AddAll : Extractable<T>)

protocol Extractable<T>
    :(Empty: -> boolean)
    ,(Next : -> T)

protocol Queue<T>
    :Insertable<T>
    ,Extractable<T>
    ,(Iterator : -> Extractable<T>)
// Both Queue and an Iterator over it have the exact same protocol; Extractable<T>.
// The difference is only in semantics; Queue=>Next() removes the head item, but
// Iterator=>Next() just peeks at each item in turn.

Of course, composition is also always an option for "hid[ing] portions of an API surface", but having a rich, well-defined, extendable standard library is always a great boon to a language in my eyes.

RIP GameDev.net: launched 2 unusably-broken forum engines in as many years, and now has ceased operating as a forum at all, happy to remain naught but an advertising platform with an attached social media presense, headed by a staff who by their own admission have no idea what their userbase wants or expects.Here's to the good times; shame they exist in the past.
There is no inheritance in Epoch. To implement multiple protocols, just have your dispatcher respond to the appropriate message(s) of each. Delegation of internal functionality can and should be modeled by composition (since inheritance is typically just hidden composition anyways).

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Here's a variant as food for thought:

Averager : -> dispatch
(
    DataPoint : integer x { total += x   count++ }
    GetAverage : -> total / count
)
{
    integer total = 0
    integer count = 0
}

entrypoint :
{
    task avg = Averager()
    avg => DataPoint(42)
    avg => DataPoint(666)

    print(avg => GetAverage())
}


Now that I'm not falling asleep, here's the rationale for this alternative format:
  • dispatch stops being a magic block inside a function
  • dispatchers can be instantiated freely in any expression to yield more compact message handling code
  • return type of the function is now concretely expressed instead of a magical "task"
  • more consistent with the internal syntax rules and the ideas of the language

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

I don't mean inheritance in your dispatchers; I mean inheritance in your protocols. If I expose a function which wants a Queue<T>, I would want to be able to pass that Queue to any function which requires an Insertable, for instance. It's also possible I don't know what your design goals are.

Wait, nevermind. I figured out what you meant. Textual interface compliance, as opposed to referential. So my question-example could work as...


protocol Insertable<T>
    :(Add : T)
    ,(AddAll : Extractable<T>)

protocol Extractable<T>
    :(Empty: -> boolean)
    ,(Next : -> T)

protocol Queue<T>
    :(Add : T)
    ,(AddAll : Extractable<T>)
    ,(Empty: -> boolean)
    ,(Next : -> T)
    ,(Iterator : -> Extractable<T>)

... because any object which offers a dispatch for (Add:T) satisfies the Insertable<T> protocol, a la O'Caml-style duck typing.

And a question about one of your examples...


// This would realistically use an enumeration instead of magic values
MakeAverager : 0 -> Average avg = AverageMean()
MakeAverager : 1 -> Average avg = AverageMedian()
MakeAverager : 2 -> Average avg = AverageMode()
MakeAverager : integer invalid -> Average avg = AverageMean() { assert(false) }

Why the "Average avg = " on each line here? That makes it look like construction only possible in combination with assignment, e.g. the following is not syntactically valid...


// This would realistically use an enumeration instead of magic values
MakeAverager : 0 -> AverageMean()
MakeAverager : 1 -> AverageMedian()
MakeAverager : 2 -> AverageMode()
MakeAverager : integer invalid -> AverageMean() { assert(false) }

Or is that just a syntactical structure to explicitly declare a "return type" for MakeAverager?

RIP GameDev.net: launched 2 unusably-broken forum engines in as many years, and now has ceased operating as a forum at all, happy to remain naught but an advertising platform with an attached social media presense, headed by a staff who by their own admission have no idea what their userbase wants or expects.Here's to the good times; shame they exist in the past.
Yep, you've got it - there's a proper formal name for the type system I'm going for, but it escapes me at the moment. Maybe I'll dig it up in a bit.

[edit]It's a flavor of structural typing, but I'm plagued by this itchy feeling that there's a more precise term still. Oh well.[/edit]



The "Average avg=" thing is just a coincidental habit of mine. Currently the compiler's type inference is a little weaker than I'd like, and pattern-matched functions often need explicit type signatures for their return values. In a clean version of the language (which I'm striving towards, I promise!) your second example is what it would look like.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

It sounds like duck typing if the following is true; "If it quacks like a duck, and it swims like a duck, I can call it a duck." Instead of caring about an explicit inheritance tree or declarative "I-am-a" annotation on your types, you only care about the "I-can" and "I-have" of your types. Using my example, the classification "Insertable<T>" applies to anything that can (Add:T) and can (AddAll:Extractable<T>). The naming of protocols then becomes an aliasing of human-readable interface label (and its associated documentation) to a given set of mutual required (or offered) capabilities.

Objective CaML, definitely my favourite static-typed functional programming language, uses this method for its Object-Orientation layer. A method's arguments only need to satisfy the needs of any expressions using that argument in the method (be that operators or method-calls on those arguments).

RIP GameDev.net: launched 2 unusably-broken forum engines in as many years, and now has ceased operating as a forum at all, happy to remain naught but an advertising platform with an attached social media presense, headed by a staff who by their own admission have no idea what their userbase wants or expects.Here's to the good times; shame they exist in the past.
Maybe a goofy nitpick to make, but in my experience, duck typing is typically enforced at runtime. Epoch enforces as much of its type constraints as possible at compile time. (The only exception is failure cases in pattern matching, which is not decidable at compile time, so we typically defer to a runtime error for that stuff.)

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

This topic is closed to new replies.

Advertisement