This is nice because you can return the payload you want normally, and even attach arbitrary additional state data to the exception object itself. Unfortunately, exceptions are complicated and hard to get right, even in managed languages. Implementing exceptions is a major burden on language designers as well.
Of course there are other methods in use, such as multiple-return (return a payload in one "slot" and a success/fail code in another slot - popular in Lua, Go, etc.), discriminated-union-return (return an algebraic sum type that can carry either a success result or a failure code - popular in several functional languages), and so on. There's even the ever-mystical "continuation" method, which is something like exception-style stack unwinding on steroids. (Check it out if you're not familiar with continuations; they're a damned powerful but really tricky concept that can be, by turns, very nice to have and infinitely frustrating to debug.)
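To make the first two styles concrete, here is a minimal Python sketch of the same check written both ways. Everything in it (the port-parsing task, parse_port_multi, parse_port_union, Ok, Err) is invented for illustration and not taken from any particular language or library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


# Style 1: multiple return. Payload and status travel in separate slots,
# and nothing forces the caller to look at the status slot.
def parse_port_multi(text: str) -> Tuple[int, Optional[str]]:
    """Return (port, error); error is None on success."""
    if not (text.isascii() and text.isdigit()):
        return 0, f"not a number: {text!r}"
    port = int(text)
    if not 0 < port < 65536:
        return 0, f"out of range: {port}"
    return port, None


# Style 2: discriminated union. The result is exactly one of two shapes,
# so the payload is only reachable after checking which one came back.
@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Err:
    reason: str


Result = Union[Ok, Err]


def parse_port_union(text: str) -> Result:
    if not (text.isascii() and text.isdigit()):
        return Err(f"not a number: {text!r}")
    port = int(text)
    if not 0 < port < 65536:
        return Err(f"out of range: {port}")
    return Ok(port)


def describe(result: Result) -> str:
    """Caller-side handling: branch on the variant, then use its field."""
    if isinstance(result, Ok):
        return f"port {result.value}"
    return f"failed: {result.reason}"
```

In the multiple-return version the dummy payload (0) is still sitting there on failure, which is the usual complaint about that style; the union version has no payload to misuse.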
I was thinking earlier about how I would implement error handling in an ideal scenario, and came up with a list of mandatory functionality I'd want involved:
Let me return a payload trivially - and with zero runtime overhead - in success cases
Allow arbitrarily rich "failure" objects/codes/etc. so I can be very precise about what went wrong
Complete static type safety
Allow arbitrary handling logic from the caller or callee when errors occur
Keep error processing logic close to, but not intermingled with, success-case code
In this example, we set up a "fallible" function which accepts an integer and a string. We then call this function from inside a "protect" block, which is followed by a task (think actor) which can accept messages.
(NB: think of foo=>Bar() as syntax for "send the Bar message to the foo task." In this case, panic is a special task alias for the "nearest" protector task that can handle the given message.)
Inside the function, we check if the integer passed is "very large", in which case we just "barf." This is equivalent to an unrecoverable error. Then, we set up a recoverable error if the integer is "slightly" too high. In the recoverable case, we actually fire a message to the protector task and use its return value to change the parameter. This repeats until a sane value is passed in; for the sake of simplicity, this can loop infinitely, but more complex and realistic handling would just obscure the example.
Finally, the function constructs a string based on its input parameters, and returns it.
I like this mechanism. It allows trivial returns of payloads in success cases, and even allows totally unguarded execution, similar to exceptions and stack unwinding, if I (as the programmer) so desire. Failure messages can pass arbitrarily rich details to the protector task. Enforcing type safety and even static error-robustness checking is possible, albeit not necessarily trivial. We see a perfect example of caller and callee interacting to correct the "vomit" condition. Last but not least, I could hypothetically use a non-inline task if I wanted to, allowing reuse of error handling logic, or separation of concerns between success and failure paths, etc.
This feels to me like the best of all possible worlds, but I'm curious if it even makes sense to anyone else, or if someone has a better idea for how to handle error situations in code. Keep in mind I'm not looking for a solution to bolt into an existing language so much as a theoretical ideal.
Before we get too far into this, I should say that I'm fully aware that natural language generation (NLG) is a massive field of research, and I'm not trying to pass any Turing tests here. I don't care if the generated "speech" even makes sense half the time; it's more for amusement than anything else.
My first inclination was to build a Markov model and use simple chains to construct sentences. Unfortunately, the space complexity of this is rather nasty, and the real killer is the amount of data needed to train the model adequately. I don't have a readily available corpus of plaintext to feed into the thing that suits the mood and personality I want to create.
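For reference, the simple-chain approach looks roughly like the following in Python. This is a bare-bones sketch, not a tuned generator; build_chain, generate and the toy corpus in the usage note are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=12, rng=random):
    """Walk the chain from a random state, stopping early at a dead end."""
    if not chain:
        return ""
    state = rng.choice(list(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # this state was never followed by anything
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)
```

Something like generate(build_chain("the cat sat on the mat and the cat ran"), length=8) will produce a short babble. The table holds one entry per distinct state, so raising the order for better-sounding output grows it quickly, and a small corpus mostly parrots itself back; those are the space and training-data problems mentioned above.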
The next obvious route would be to construct a Petri net for the language I want to speak. The major advantage is that this is a compact and fairly efficient way to do poor-man's NLG; the disadvantage is that hand-authoring and tuning a Petri net for nontrivial languages can be a huge time sink.
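One way to read the Petri net idea, sketched in Python: places hold tokens, each transition is labeled with a word, and firing an enabled transition emits its word. This is my own guess at a workable shape rather than an established NLG technique, and the tiny greeting net below (TRANSITIONS, the place names, speak) is entirely hypothetical. For simplicity every arc has weight 1 and a transition's input places are distinct.

```python
import random

# Each transition: (word to emit, input places, output places).
TRANSITIONS = [
    ("hello", ["start"], ["greeted"]),
    ("howdy", ["start"], ["greeted"]),
    ("there", ["greeted"], ["addressed"]),
    ("friend", ["greeted"], ["addressed"]),
    ("!", ["addressed"], []),
]


def speak(transitions, marking, rng=random, max_steps=50):
    """Fire random enabled transitions until none remain; collect words."""
    marking = dict(marking)  # work on a copy of the token counts
    words = []
    for _ in range(max_steps):
        enabled = [t for t in transitions
                   if all(marking.get(place, 0) > 0 for place in t[1])]
        if not enabled:
            break
        word, inputs, outputs = rng.choice(enabled)
        for place in inputs:  # consume one token per input place
            marking[place] -= 1
        for place in outputs:  # produce one token per output place
            marking[place] = marking.get(place, 0) + 1
        words.append(word)
    return " ".join(words)
```

Calling speak(TRANSITIONS, {"start": 1}) yields a three-word greeting such as "howdy friend !". The net itself is compact, but every word and every arc had to be written by hand, which is exactly the time sink described above.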
So I figured I'd poke around here and see if anyone knows of good algorithms for simple NLG that I might be able to take advantage of. I don't mind having to use a huge data set as long as the data is easily constructed and/or readily available in an easily digested format. Runtime performance matters, since this is supposed to be a realtime conversational bot.
Non-goals: contextual recognition, memory, progressive refinement/learning, etc. It doesn't even have to do more than dumb keyword recognition for all I care.
I've spent a lot of time recently thinking about things I want my programming tools to do. Some of these things are just improvements to existing concepts, and some are entire categories of functionality that don't exist in common toolkits. (To be honest, though, most of the ideas are implemented out there, just not popular or widely available.)
Fileless code organization: I'd love to stop thinking about my code in units of files. Give me a way to arbitrarily group code, maybe index it in several different ways, and I'll be much happier. I want to be able to organize on my own axes that don't always overlap with file divisions.
Semantic version control: Kind of in line with the above, I'm tired of version control that's just text-based. I want my revision history to understand that I moved a function from one module to another, or renamed a couple of variables, or whatever. Just thinking of programs as text is too limiting.
Better REPLs: The idea of a REPL (read-eval-print loop) is old and well-tested, but I'd like a GUI-based version. Smalltalk is the closest thing I know of to my ideal, wherein I can build arbitrarily complex data structures and run arbitrary code on them. Imagine a system where you can build entire test cases of data using a simple visual editor to lay out the data structures, and then feed that data into a module to get instant unit testing.
Better contracts: I hate the way I have to specify interfaces and contracts in most languages. More accurately, I hate the way that interfaces are everything and contracts are second-class citizens. Aspect-oriented programming is almost a step in the right direction, but most implementations are really clunky and verbose. Why can't I just say something simple, like "this function never returns null", and get automatic optimizations and sanity checks from my compiler?
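As a point of comparison, here is about the closest cheap approximation in Python: a decorator that states the "never returns null" contract in one line and checks it at runtime. It is only a sketch; never_returns_none, ContractViolation and find_user are names invented here, and nothing about it delivers the compile-time checks or optimizations the wish above is really about.

```python
import functools


class ContractViolation(AssertionError):
    """Raised when a declared contract turns out to be false at runtime."""


def never_returns_none(func):
    """Declare, and check on every call, that func never returns None."""
    @functools.wraps(func)
    def checked(*args, **kwargs):
        result = func(*args, **kwargs)
        if result is None:
            raise ContractViolation(f"{func.__name__} returned None")
        return result
    return checked


@never_returns_none
def find_user(users, name):
    # Deliberately sloppy: dict.get returns None for unknown names,
    # so the contract catches the bug at the call that triggers it.
    return users.get(name)
```

The declaration is as terse as I'd like, but the violation only shows up when a bad call actually happens, which is the gap between this and a contract the compiler understands.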
Responsiveness: Visual Studio is a hog, and the other IDEs I've used are all worse. I want something slim and fast, so I can get all my coding goodness at a reasonable pace. Taking two minutes to load a project that comprises only a few dozen KB of XML on disk is just stupid.
One of my big time consumers lately has been the Epoch language compiler. This is a beast which is intended to self-host, i.e. compile itself.
It weighs in at just under 10,000 lines right now or I'd post it inline. You can see the evil monstrosity here and gawk in awe at its massive hideousness. Unfortunately, Google Code does not know how to highlight Epoch code yet, so it doesn't look as pretty as it does in my (fully home-grown) IDE.
Bestow upon me your undying praise, er, expressions of vomitous revulsion.
OK, so I lied. Technically it's very close to being able to self-host, but it's missing a couple features.