Descriptive Error Handling

Started by
21 comments, last by frob 10 years, 2 months ago

I've been accustomed to using integer return codes for many years, and after spending a lot of time at work using nothing but dynamically typed languages, I've grown impressed with how I can return very descriptive, human-readable error messages (sometimes even generated at run-time).

My basic problem is this: Take for example, my regular expression class. There are a billion places that parsing an expression can fail. I currently have one code to cover this: E_SYNTAX_ERROR. Not very descriptive, at all. This same code is returned if there's a misplaced parenthesis, the brackets don't agree, an unexpected end of input was encountered, the lower bound for a loop is higher than the upper bound, etc. Things get hairy if I have a giant enum that contains a billion little codes like Result::REGEX_INVALID_LOOP_CONSTRUCT, and I have to change the enum every time I add a new module. On the other hand, I could have a ton of enums, which neatly compartmentalizes the slightly less vague errors, but then I can't propagate them upward; if a file loading function encounters a line containing an invalid regex, and the regex parser throws an error, the file loading function returns an enum related to its functionality, which results in a loss of information since the regex-specific error code is lost.

This is, of course, ignoring niceties like giving the character offset where the problem was encountered, which a simple enum can't provide.

What other alternatives are there for this sort of thing? Are exceptions my only (undesirable) option, even if it isn't always an exceptional occurrence?


Any reason your error code cannot be a string literal?

String literals have static storage duration, so returning a const char* rather than an integer error code works: return "You forgot to do the thing. Do the thing next time"; If you pass const char* variables around as parameters, that works as an alternative way of getting the message across. This also works well because a 0 (null) value can mean there is no error message.

For debug-only things, most code bases have advanced assertion functionality that takes strings. If yours doesn't have that yet, it is easy enough to write:

assert( foo==bar && "this message appears along with the statement foo==bar when the condition fails");

You can also use the __FILE__ macro to get the file name and the __LINE__ macro to get the line number (but __LINE__ expands to an integer, so you need to stringize it if you want it as part of a string literal). If you are willing to use platform-specific functionality, you might also use __FUNCTION__ and __PRETTY_FUNCTION__, but they are sometimes variables rather than string literals. Google can find lots of examples of using those in error messages.


Any reason your error code cannot be a string literal?

The biggest one that I can think of is handling the error. If it is all status code, the user can't fix his input without meticulous re-doing of the steps that the machine just did. If it is all human-readable, it takes an undue amount of effort for a program to figure out what happened and report on it.


For debug-only things, most code bases have advanced assertion functionality that takes strings. If yours doesn't have that yet, it is easy enough to write:

assert( foo==bar && "this message appears along with the statement foo==bar when the condition fails");

Funny thing, I decided earlier today to start doing that. I've been making the expressions unnecessarily verbose to indicate intent, like comparing variables against zero rather than just letting them be implicitly converted to bool, but I think text would be best, because if someone else triggers my assertion, they'd have to start with my code and work their way back, rather than starting with their code.

Mostly, what I'm looking to accomplish is a dynamic method of combining a return code with extra information, in a way that doesn't depend on its origin. The flimsiest way that I can imagine accomplishing this is having a return code base class, and returning a pointer to a derived class that is specific to the problem, containing extra information like the offset, line number, precise cause, etc., and some virtual methods to get the general code, maybe the name of the class, and if it is acceptable to not need the details, just a general line of text explaining what the problem is.

Obviously, this sounds like a horrible idea, because I don't want to introduce heap allocations for returning the result of a function, and I'm not sure how to have polymorphism like this without pointers and have the storage duration last longer than the function that created it.

The only way that I know of to do this is exceptions, pretty much for this reason, but I don't want to use them. I know it sounds silly to have a feature that seems to do what I want and not use it, but it has noticeable overhead in multiple ways, and choosing between having overhead whether an exception is thrown or not and just not using exceptions seems like an obvious choice.

What works for you in your high-performance code?

Could exception handling perhaps be an alternative? You can derive a specific class for every error that can occur from std::runtime_error, and catch it:


try
{
    doTheThing();
} catch( const RegexInvalidLoopConstructException& e )
{
    // handle error
} catch( const SyntaxErrorException& e )
{
    // handle error
}
// --etc--
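
For completeness, a sketch of what the classes behind those catch blocks might look like. The class names come from the snippet above; the offset member, the inheritance between the two classes, and the body of doTheThing are assumptions made for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// General syntax error carrying the character offset of the problem.
class SyntaxErrorException : public std::runtime_error {
public:
    SyntaxErrorException(const std::string& what, std::size_t offset)
        : std::runtime_error(what), offset_(offset) {}
    std::size_t offset() const { return offset_; }
private:
    std::size_t offset_;
};

// More specific error; deriving from SyntaxErrorException is an assumption.
class RegexInvalidLoopConstructException : public SyntaxErrorException {
public:
    explicit RegexInvalidLoopConstructException(std::size_t offset)
        : SyntaxErrorException("loop lower bound exceeds upper bound", offset) {}
};

// Stand-in for doTheThing(): always fails at offset 7.
inline void doTheThing()
{
    throw RegexInvalidLoopConstructException(7);
}

// The more derived type must be caught first, as in the snippet above,
// otherwise the base-class handler would swallow it.
inline std::size_t offsetOfFailure()
{
    try {
        doTheThing();
    } catch (const RegexInvalidLoopConstructException& e) {
        return e.offset();
    } catch (const SyntaxErrorException& e) {
        return e.offset();
    }
    return static_cast<std::size_t>(-1); // no failure
}
```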
"I would try to find halo source code by bungie best fps engine ever created, u see why call of duty loses speed due to its detail." -- GettingNifty

Could exception handling perhaps be an alternative? You can derive a specific class for every error that can occur from std::runtime_error, and catch it:


try
{
    doTheThing();
} catch( const RegexInvalidLoopConstructException& e )
{
    // handle error
} catch( const SyntaxErrorException& e )
{
    // handle error
}
// --etc--

I hope to avoid exceptions, for the reasons I gave in my posts above.

I probably would use them, if compilers didn't have an option to disable them. Not only does this provide a more efficient alternative, but should one ever disable them, all code that is capable of recovering if an exception occurs now breaks entirely, and aborts instead of having the exception be catchable. Thus, I guess you could say I'm looking for a way around it, by having some sort of return code process that doesn't penalize code that never encounters an exceptional case.

Exception handling is considered the cleanest approach in C++. However, it is not always viable, such as in your case.

I can recommend reading page 32 and onwards in the following document:

http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf

Also check out this thread:

http://www.gamedev.net/topic/653195-towards-better-error-handling-constructs/




I can recommend reading page 32 and onwards in the following document:

http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf

I'm sorry, but this didn't provide me with any new information.




Also check out this thread:

http://www.gamedev.net/topic/653195-towards-better-error-handling-constructs/

While the words "exception" and "return code" do occur in the thread, I believe the thread's purpose is defining a new language construct, in a custom language, rather than using existing constructs in a new way.

I have to wonder, am I out of options here? Either exceptions with overhead whether I use them or not, millions of return code checks with minimal information, or some Frankenstein's monster combination of both using polymorphism and dynamic allocation?


I have to wonder, am I out of options here? Either exceptions with overhead whether I use them or not, millions of return code checks with minimal information, or some Frankenstein's monster combination of both using polymorphism and dynamic allocation?

Your example was for regex, but I thought you meant this was going to be for a generalized system.

If your example case was your only case, then different rules apply.

For regex specifically, I wouldn't do anything more than return a single error code.

Developers using your program only need to know that their expression is broken. Regular expressions can be brutally difficult to get right, and helpful hints will only go so far. I don't see any need for anything beyond the simple return code and a few assertions. They can be debug-only checks since the programmers should have fixed any errors with it before release.

If this is something that will be exposed to the user, well, you might as well just have a single error and provide a link to this Amazon page. The 550-page book doesn't cover everything, but it is a good start on the subject. The average human isn't going to have any success with regular expressions, and even technically inclined individuals will struggle with a regex of any complexity.


Your example was for regex, but I thought you meant this was going to be for a generalized system.

This is intended to be for a generalized system. The regex library is an example of a localized issue that loses important context when attempting to propagate the information upward to code that is in a position to handle it, or ask the user to rectify the situation.

What's the final destination of the error when it bubbles back out of the system? Is it just displayed to the user in some way? Is it just a string to be printed to a log, etc.?

This topic is closed to new replies.
