Towards better error handling constructs

Started by
10 comments, last by Hodgman 10 years, 2 months ago
We all know about the eternal war between the Big Two error handling philosophies.


From Ages Immemorial we have the return-code method:

enum ReturnCode
{
    RETURN_CODE_SUCCESS,
    RETURN_CODE_BARF,
    RETURN_CODE_VOMIT,
};

ReturnCode MyFallibleFunction(int param1, const std::string& param2)
{
    // Do stuff

    // Ooops!
    return RETURN_CODE_BARF;
}
This works fine if your function only returns a result code; if it needs to have a payload as well, then you're stuck with out-parameters or other nasty hacks.
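To make the out-parameter point concrete, here is a minimal sketch. ParsePositiveInt and its behaviour are invented for illustration and are not from any real API; overflow is deliberately not handled.

```cpp
#include <string>

enum ReturnCode
{
    RETURN_CODE_SUCCESS,
    RETURN_CODE_BARF,
};

// Hypothetical example: the return value is occupied by the status code,
// so the payload has to travel back through an out-parameter.
// (Overflow is ignored to keep the sketch short.)
ReturnCode ParsePositiveInt(const std::string& text, int& outValue)
{
    if (text.empty())
        return RETURN_CODE_BARF;

    int value = 0;
    for (char c : text)
    {
        if (c < '0' || c > '9')
            return RETURN_CODE_BARF;
        value = value * 10 + (c - '0');
    }

    outValue = value;   // only written on success
    return RETURN_CODE_SUCCESS;
}
```

The caller has to declare the int before the call and remember not to read it when the code says failure, which is exactly the awkwardness described above.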


Therefore, from Slightly More Modern Times we have the exception method:

void MyFallibleFunction(int param1, const std::string& param2)
{
    // Do stuff

    // Ooops!
    throw BarfException();
}
This is nice because you can return the payload you want normally, and even attach arbitrary additional state data to the exception object itself. Unfortunately, exceptions are complicated and hard to get right, even in managed languages. Implementing exceptions is a major burden on language designers as well.


Of course there are other methods in use, such as multiple-return (return a payload in one "slot" and a success/fail code in another slot - popular in Lua, Go, etc.), discriminated-union-return (return an algebraic sum type that can carry either a success result or a failure code - popular in several functional languages), and so on. There's even the ever-mystical "continuation" method which is something like exception-style stack unwinding on steroids. (Check it out if you're not familiar with continuations, they're a damned powerful but really tricky concept that can be, at turns, very nice to have and infinitely frustrating to debug.)
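For readers who have not met the discriminated-union style, a rough C++17 approximation using std::variant might look like the following. ParseError, ParseResult, ParseDigit and DigitOrFallback are names made up for this sketch.

```cpp
#include <string>
#include <variant>

// Hypothetical failure type; the name and field are for illustration only.
struct ParseError
{
    std::string reason;
};

// Either a successfully parsed value or a description of what went wrong.
using ParseResult = std::variant<int, ParseError>;

ParseResult ParseDigit(char c)
{
    if (c < '0' || c > '9')
        return ParseError{"not a digit"};
    return c - '0';
}

// The caller has to check which alternative is present before using it.
int DigitOrFallback(char c, int fallback)
{
    ParseResult result = ParseDigit(c);
    if (const int* value = std::get_if<int>(&result))
        return *value;
    return fallback;
}
```

The success payload and the failure details share one return slot, and the type system forces the caller to look at which one came back.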

A notable alternative that is popular in JavaScript (and probably other similar languages) is lambda-style handlers wherein I pass a function two lambdas, one that runs on success, and one that runs on failure. This is kind of a nice idea, but it turns into a soup of nested lambdas in complex scenarios, and gets ugly really, really fast.
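The same two-handler shape can be sketched in C++ with std::function. LoadGreeting is a hypothetical function invented for this example.

```cpp
#include <functional>
#include <string>

// Hypothetical function: instead of returning anything, it invokes exactly
// one of the two callbacks it was given.
void LoadGreeting(const std::string& name,
                  const std::function<void(const std::string&)>& onSuccess,
                  const std::function<void(const std::string&)>& onFailure)
{
    if (name.empty())
    {
        onFailure("name was empty");
        return;
    }
    onSuccess("Hello, " + name);
}
```

One level is readable enough; the soup appears as soon as the success lambda itself has to call another function that takes its own pair of lambdas.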


I was thinking earlier about how I would implement error handling in an ideal scenario, and came up with a list of mandatory functionality I'd want involved:
  • Let me return a payload trivially - and with zero runtime overhead - in success cases
  • Allow arbitrarily rich "failure" objects/codes/etc. so I can be very precise about what went wrong
  • Complete static type safety
  • Allow arbitrary handling logic from the caller or callee when errors occur
  • Keep error processing logic close to, but not intermingled with, success-case code
Some of these are obviously easier than others.

Here's what I came up with:

entrypoint :
{
    protect
    {
        print(FallibleFunction(42, "Test"))
    }
    with task
    {
        Barf : { print("Function barfed.") }

        Vomit : string reason -> integer fallback = 0
        { print("Vomit! Falling back to 0 because: " ; reason) }
    }
}

FallibleFunction : integer p1, string p2 -> string ret = ""
{
    if(p1 > 100)
    {
        panic => Barf()
    }
    
    while(p1 > 20)
    {
        p1 = (panic => Vomit("Number slightly too high"))
    }

    ret = p2 ; " ... " ; cast(string, p1)
}

In this example, we set up a "fallible" function which accepts an integer and a string. We then call this function from inside a "protect" block, which is followed by a task (think actor) which can accept messages.

(NB: think of foo=>Bar() as syntax for "send the Bar message to the foo task." In this case, panic is a special task alias for the "nearest" protector task that can handle the given message.)

Inside the function, we check if the integer passed is "very large" in which case we just "barf." This is equivalent to an unrecoverable error. Then, we set up a recoverable error if the integer is "slightly" too high. In the recoverable case, we actually fire a message to the protector task and use its return value to change the parameter. This repeats until a sane value is passed in; for sake of simplicity, this can loop infinitely, but more complex and realistic handling would just obscure the example.

Finally, the function constructs a string based on its input parameters, and returns it.
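For comparison, the control flow can be roughly approximated in C++ by handing the function a bundle of handlers. This is only a sketch: Protector and Barfed are invented names, the handler is passed explicitly instead of being found as the "nearest" protector, and it assumes that Barf unwinds the stack after its handler runs, which the example above leaves open.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Stand-in for the protector task: one callable per message it accepts.
// All names here are illustrative and not part of Epoch.
struct Protector
{
    std::function<void()> barf;                     // unrecoverable
    std::function<int(const std::string&)> vomit;   // returns a fallback value
};

// Used to unwind after the Barf handler has run (an assumption of this sketch).
struct Barfed : std::runtime_error
{
    Barfed() : std::runtime_error("Function barfed.") {}
};

std::string FallibleFunction(int p1, const std::string& p2, const Protector& panic)
{
    if (p1 > 100)
    {
        panic.barf();     // notify the handler...
        throw Barfed();   // ...then give up on this call entirely
    }

    while (p1 > 20)
    {
        // Recoverable: ask the handler for a replacement and try again.
        p1 = panic.vomit("Number slightly too high");
    }

    return p2 + " ... " + std::to_string(p1);
}
```

What this approximation loses is the static checking that every message a function can send has a matching handler somewhere up the call chain.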



I like this mechanism. It allows trivial returns of payloads in success cases, and even allows totally unguarded execution, similar to exceptions and stack unwinding, if I (as the programmer) so desire. Failure messages can pass arbitrarily rich details to the protector task. Enforcing type safety and even static error-robustness checking is possible, albeit not necessarily trivial. We see a perfect example of caller and callee interacting to correct the "vomit" condition. Last but not least, I could hypothetically use a non-inline task if I wanted to, allowing reuse of error handling logic, or separation of concerns between success and failure paths, etc.

This feels to me like the best of all possible worlds, but I'm curious if it even makes sense to anyone else, or if someone has a better idea for how to handle error situations in code. Keep in mind I'm not looking for a solution to bolt into an existing language so much as a theoretical ideal.


Thoughts?

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]


How is that different to throwing Barf() or Vomit(string reason) from a function?

"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley


Once I grok'd the syntax (Epoch?) the concept certainly made sense although the first concern which popped into my head was noise at the call site.

In fact, going back to your opening, one of my 'issues' with exceptions isn't so much the burden of handling as it is that you end up often wrapping up all the code in a try {} block instead of just the bits you think will fail as it saves you jumping in and out of the block and mixing try{} catch{} groups across a function.

I can see the same happening here, where everything ends up in the 'protect' block and an ever expanding list of 'task' to clean up on the end.

Working from that premise, maybe assume everything is always protected and set up the error handling via some other means?


entrypoint :
{
    print(FallibleFunction(42, "Test"))
}
with handlers
{
    Barf : { print("Function barfed.") }

    Vomit : string reason -> integer fallback = 0
        { print("Vomit! Falling back to 0 because: " ; reason) }
}


FallibleFunction : integer p1, string p2 -> string ret = ""
{
    if(p1 > 100)
    {
        panic => Barf()
    }

    while(p1 > 20)
    {
        p1 = (panic => Vomit("Number slightly too high"))
    }

    ret = p2 ; " ... " ; cast(string, p1)
}
This does introduce scoping issues for variables and the inability to 'catch and recover' within a function, which might prove an issue; I'd have to ponder that some more. Depending on the nature of the agents and how they bubble errors up, though, this might not be a problem - it might not be logical to 'catch and recover' within the semantics of the error handling agents and the rest of the program flow.

Overall however, I do find the idea of separating out error reporting/handling in this manner an interesting idea.

I feel like that sort of syntax would lead to a very tight coupling between the function and the protect block. You'd have to know within a function, who would be calling it, and the calling functions would have to know all the possible 'exceptions'. Having to specify all the 'exceptions' isn't hard for a single function, but once they start getting nested...

It doesn't seem practical to me. Outside of trivial examples I can't for the life of me think of a good example. Even in the case of say an invalid file name in a file handling function, looping on an exception or sentinel return value would be easier. This also looks to me to be a maintenance nightmare.

Clever, but I question its practicality. But just because I can't see its use doesn't mean there isn't one. Any good examples come to mind? Ones that wouldn't be trivial to code using exception handling or sentinel return values?

I have a love/hate relationship with both of them. Both have their flaws. But ultimately I hate return codes less than I hate exceptions.

Think hard about why we have them. What are the reasons for return codes and exceptions?

Most error codes are about input validation problems.

No, seriously, take a look at things like the C errno values. Almost all of them are for conditions that should have been validated but were not. Argument list too long, permission denied, file exists, not a directory, is a directory, file too large, too many links, name too long, directory not empty, out of range, etc. These are all things that could have been caught by validation, but instead of doing so, callers rely on the system to do it for them, and it fails with an error code. The same is true for Windows system error codes. File name too long, disk full, process not found, no children to wait for, invalid format, label too long, segment already locked, and on and on. True, the rest of them are things you might not easily detect, but many of them are just a case of not validating your data.

So if you just validate your data you will solve the vast majority of your error conditions. Liberally cover the top of your functions with asserts to let programmers know they failed at the basic task of validating their inputs. Most of the time you don't need either exceptions or sentinel return values, just validate your own data.
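A small sketch of that style, with hypothetical helpers CharAt and FirstCharOr: the callee asserts its preconditions, and the caller is responsible for validating before it calls.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: preconditions are asserted at the top, so a caller
// that passes bad input is stopped in a debug build instead of receiving
// an error code it might ignore.
char CharAt(const std::string& text, std::size_t index)
{
    assert(!text.empty() && "caller must not pass an empty string");
    assert(index < text.size() && "caller must validate the index first");
    return text[index];
}

// The caller validates its own data before calling.
char FirstCharOr(const std::string& text, char fallback)
{
    if (text.empty())
        return fallback;
    return CharAt(text, 0);
}
```

Note that the asserts compile away when NDEBUG is defined, so this only works if callers really do validate; it is a debugging aid and not a runtime recovery path.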

Next, even though exceptions can provide additional information closer to the source of the error, they degrade back to error codes. Exceptions do provide graceful handling of a few things -- such as errors during a constructor (although that is very bad, write c'tors and d'tors to never throw or else never create an array) -- but they also lack grace in many situations by adding a large amount of error handling code and adding invisible exit points at every function call and operation. Also when an exception must cross a boundary -- such as a library boundary -- the exception invariably gets converted into a sentinel return value.
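The boundary conversion usually looks something like this sketch. The LibStatus codes and the lib_parse_percent entry point are hypothetical; std::stoi and the exception types it throws are standard.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical status codes exposed across the library boundary.
enum LibStatus
{
    LIB_OK = 0,
    LIB_ERR_INVALID,
    LIB_ERR_RANGE,
    LIB_ERR_UNKNOWN,
};

// Internal C++ code is free to throw.
static int ParsePercent(const std::string& text)
{
    int value = std::stoi(text);   // throws std::invalid_argument or std::out_of_range
    if (value < 0 || value > 100)
        throw std::out_of_range("percent must be 0..100");
    return value;
}

// C-style entry point: no exception may escape, so each one is
// flattened into a sentinel return value.
extern "C" int lib_parse_percent(const char* text, int* out_value)
{
    if (text == nullptr || out_value == nullptr)
        return LIB_ERR_INVALID;
    try
    {
        *out_value = ParsePercent(text);
        return LIB_OK;
    }
    catch (const std::invalid_argument&) { return LIB_ERR_INVALID; }
    catch (const std::out_of_range&)     { return LIB_ERR_RANGE; }
    catch (...)                          { return LIB_ERR_UNKNOWN; }
}
```

Whatever message the exception carried is lost at the boundary unless the library adds a separate way to retrieve it.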

In searching for a supporting statement, I found Raymond Chen had an article about it that summed up my feelings nicely. Both are equally powerful in that they allow you to detect and handle error conditions. Exceptions do provide quite a few nice features like a call stack and unique messages along the way back up, but on the other hand they present many more opportunities for error. When everything relies on error codes it is very easy to tell when the code is using those values or not. One glance can tell you if it is following a chain where every return value is tested. But exception based code is a bit harder. Did they catch every significant exception? Did they catch them at the right point to avoid leaking resources? It is harder to see bad exception-based code than to see bad error-code-based code.

Seven decades of active use demonstrate that while return values can be ignored or misused, they can also be used very effectively.

While exceptions are able to generally (but not always) do the job, their track record is much shorter. And when they can't do the job (such as library boundaries) they revert back to sentinel values.

I'm sorry, I disagree, Frob. I use exceptions almost exclusively, and it'd be a cold day in hell before I returned to return values.

Both return values and exceptions have the same amount of error handling code. The difference is that with return values you must handle it at the function return point, with error code spread all over the place. With exceptions you can handle it wherever is the most convenient/easiest/least error prone to do so.

A proper exception hierarchy will also provide much more information, and be quicker and safer to expand, than an error code. Add a new error code to a function and anything that uses that function will have to be re-written. Add a new exception and only a few catch() blocks need to be updated.

Ignored exceptions are also easier to track. Ignored return value errors can cause the program to execute significantly past the point of error, corrupting memory along the way and making tracing the bug tedious. An ignored exception will halt the program, with a very informative stack output and all data up to the exception intact.

I also don't see why constructors throwing exceptions are a bad thing. That was one of the primary situations that exceptions were created for. Obviously exceptions in destructors are bad, but it's not like you can return a return code from them either.

I know exceptions seem to get a lot of hate, but once I got used to them I fell in love. It's one of the few things I love about C++...

What you seem to be describing is a "warning" system that works in place of the "error" system that is exception. That is, you're creating a mechanism to continue on that's controlled at the call site. That's not really one of the 5 priorities you listed.

I think exceptions are much better than return codes, precisely because you can put the code to handle the exception case out of line of the normal path.

But exceptions do have their problems too. For example, sometimes you do want to handle the exception right at the call site, but exception syntax is very verbose for doing that. So you've traded brevity for flexibility.

But is the syntactic difference really that much?


a = GetNumber();
if (a == error_1) { /* handle error 1 here */ }
else if (a == error_2) { /* handle error 2 here */ }
else { /* handle everything else here */ }

vs


try { a = GetNumber(); }
catch (const Error1& e) { /* handle error 1 here */ }
catch (const Error2& e) { /* handle error 2 here */ }
catch (...) { /* handle everything else here */ }

I mean, I understand that prior to exceptions the return value code would be clearer simply due to familiarity. But having used both, from a purely syntactic point of view, the difference is negligible. Only in the simplest of cases are return values smaller:


if (!GetNumber()) {}

vs


try { GetNumber(); } catch(...) {}

That's about the only situation I can think of where return values are clearly tidier.

Perhaps there are situations I'm not aware of? When you say exception handling syntax is verbose what in particular are you thinking of?

Is error handling even a big deal? Most code doesn't even have failure conditions; after all, everything is still just "input -> process -> output" at the core. If a process can produce expected errors, then we're just replacing "output" with "union{output1,output2}" -- these aren't actually processes with "errors", but just processes that can produce multiple different result types.

IMHO, errors are either expected, meaning they won't be ignored no matter the mechanism as long as you write half-decent code (if the retval of "bool DoesFileExist()" or the catch for "File* TryOpen() throws FileNotFound" is ignored, the code should be obviously wrong upon very quick inspection), or errors are unexpected, in which case the error is simply because the code is wrong and needs to be fixed by a programmer.

Expected errors don't need an error handling paradigm because they're just regular core logic (I wouldn't say that DoesFileExist, above is using the "return codes" paradigm; it's just using sensible return values like 'normal' code).

Unexpected errors don't need an error handling paradigm beyond: "just assert and crash hard in the assertion handler". i.e. if you get an unexpected error, your code is wrong, and the error handler is just a debugging aid. The error handler here should actually be an assertion/crash/debug handler.

APIs with complex return codes -- e.g. D3D's HRESULTs -- are mostly a debugging aid (for fixing bad code that produces unexpected errors). 99% of D3D's HRESULTs should be used in assertions, which can be stripped from the retail/shipping build. The other 1% of them are actual expected errors, unfortunately using the same mechanism, which is confusing...

So personally, I'd suggest showing some *real* example use cases for this new error handling paradigm, to demonstrate what kind of code it will benefit. i.e. what kind of failable code is currently ugly, and is made more robust with this system?

I'd also be more interested in language features to help writing errorless code, such as better assertions/preconditions/post conditions/limitation of side-effects/etc ;-)

Back on topic -- it's basically a mixture of callbacks/lambdas and exceptions? You can either do something similar to throwing an exception, or you can have the exception be handled and resume from the throw-site?
It might just be because it's an unfamiliar construct, but I'd probably prefer that Vomit just be a lambda, which is invoked when the input is too high and returns a replacement value.
It's also not obvious to me why Barf is unrecoverable (halts the program?) but Vomit isn't? Is the idea that handlers with no ret-val are basically crash handlers, but ones that have a ret-val are lambdas?

A thing I do in my code since I've switched to exceptions is to take advantage of the following property:

  • many places in which stuff can go wrong;
  • usually one place in which we can trivially say "from now on, nothing can go wrong"

So, I push a sequence of "undo" functions on the stack and disable them when successful. It's still far from the "fire and forget" construct I'd like, but still better than writing an explicit try/catch in my opinion. The lack of error information, however, is a real concern. I suppose I could work out a half-decent interface for it... but in the end, I don't need that, so my research ends here.
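One way the "undo" idea could look in C++ is sketched below. This is a guess at the general shape and not the poster's actual code; UndoStack and TwoStepOperation are invented names, and undo actions are assumed not to throw.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Runs the registered undo actions in reverse order on scope exit,
// unless Dismiss() was called first.
class UndoStack
{
public:
    UndoStack() = default;
    UndoStack(const UndoStack&) = delete;
    UndoStack& operator=(const UndoStack&) = delete;

    ~UndoStack()
    {
        if (!armed_)
            return;
        for (auto it = undo_.rbegin(); it != undo_.rend(); ++it)
            (*it)();   // undo actions are assumed not to throw
    }

    void Push(std::function<void()> undo) { undo_.push_back(std::move(undo)); }
    void Dismiss() { armed_ = false; }

private:
    std::vector<std::function<void()>> undo_;
    bool armed_ = true;
};

// Hypothetical two-step operation: either both entries end up in the log
// or neither does.
void TwoStepOperation(std::vector<std::string>& log, bool failSecondStep)
{
    UndoStack undo;

    log.push_back("step 1");
    undo.Push([&log] { log.pop_back(); });

    if (failSecondStep)
        throw std::runtime_error("step 2 failed");
    log.push_back("step 2");
    undo.Push([&log] { log.pop_back(); });

    undo.Dismiss();   // from now on, nothing can go wrong
}
```

As noted above, the undo actions learn nothing about why the operation failed; they only know that Dismiss() was never reached.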

Continuation appears interesting; I have a testbed for this, maybe in the future I'll investigate.

For now, I'm following the thread.

Previously "Krohm"

