
Descriptive Error Handling


22 replies to this topic

#1Ectara  Members

3097

Posted 08 February 2014 - 07:28 PM

I've been accustomed to using integer return codes for many years, and after spending a lot of time at work using nothing but dynamically typed languages, I've grown impressed with how I can return very descriptive, human-readable error messages (sometimes even generated at run-time).

My basic problem is this: Take for example, my regular expression class. There are a billion places that parsing an expression can fail. I currently have one code to cover this: E_SYNTAX_ERROR. Not very descriptive, at all. This same code is returned if there's a misplaced parenthesis, the brackets don't agree, an unexpected end of input was encountered, the lower bound for a loop is higher than the upper bound, etc. Things get hairy if I have a giant enum that contains a billion little codes like Result::REGEX_INVALID_LOOP_CONSTRUCT, and I have to change the enum every time I add a new module. On the other hand, I could have a ton of enums, which neatly compartmentalizes the slightly less vague errors, but then I can't propagate them upward; if a file loading function encounters a line containing an invalid regex, and the regex parser throws an error, the file loading function returns an enum related to its functionality, which results in a loss of information since the regex-specific error code is lost.

This is, of course, ignoring niceties like giving the character offset where the problem was encountered, which a simple enum can't provide.

What other alternatives are there for this sort of thing? Are exceptions my only (undesirable) option, even if it isn't always an exceptional occurrence?

#2frob  Moderators

41340

Posted 08 February 2014 - 08:39 PM

Any reason your error code cannot be a string literal?

String literals have static storage duration, so returning a const char* rather than an integer error code works:

return "You forgot to do the thing. Do the thing next time";

Passing a const char* out through a parameter would work instead, as an alternate way of getting the message across. This also works well because a null pointer (a 0 value) can mean that there is no error message.

For debug-only things, most code bases have advanced assertion functionality that takes strings. If yours doesn't have that yet, it is easy enough to write:

assert( foo==bar && "this message appears along with the statement foo==bar when the condition fails");

You can also use the __FILE__ macro to get the file name and the __LINE__ macro to get the line number (but __LINE__ expands to an integer, so you need to stringize it to splice it into a string literal). If you want to enter platform-specific functionality, you might also use __FUNCTION__ and __PRETTY_FUNCTION__, but they are sometimes variables rather than string literals. Google can find lots of examples about using those in error messages.

Check out my book, Game Development with Unity, aimed at beginners who want to build fun games fast.

Also check out my personal website at bryanwagstaff.com, where I occasionally write about assorted stuff.

#3Ectara  Members

3097

Posted 08 February 2014 - 09:09 PM

Any reason your error code cannot be a string literal?

The biggest one that I can think of is handling the error. If it is all status codes, the user can't fix his input without meticulously redoing the steps that the machine just did. If it is all human-readable text, it takes an undue amount of effort for a program to figure out what happened and report on it.

For debug-only things, most code bases have advanced assertion functionality that takes strings. If yours doesn't have that yet, it is easy enough to write:

assert( foo==bar && "this message appears along with the statement foo==bar when the condition fails");

Funny thing, I decided earlier today to start doing that. I've been making the expressions unnecessarily verbose to indicate intent, like comparing variables against zero rather than just letting them be implicitly converted to bool, but I think text would be best, because if someone else triggers my assertion, they'd have to start with my code and work their way back, rather than starting with their code.

Mostly, what I'm looking to accomplish is a dynamic method of combining a return code with extra information, in a way that doesn't depend on its origin. The flimsiest way that I can imagine accomplishing this is having a return code base class, and returning a pointer to a derived class that is specific to the problem, containing extra information like the offset, line number, precise cause, etc., and some virtual methods to get the general code, maybe the name of the class, and if it is acceptable to not need the details, just a general line of text explaining what the problem is.

Obviously, this sounds like a horrible idea, because I don't want to introduce heap allocations for returning the result of a function, and I'm not sure how to have polymorphism like this without pointers and have the storage duration last longer than the function that created it.

The only way that I know of to do this is exceptions, pretty much for this reason, but I don't want to use them. I know it sounds silly to have a feature that seems to do what I want and not use it, but it has noticeable overhead in multiple ways, and choosing between having overhead whether an exception is thrown or not and just not using exceptions seems like an obvious choice.

What works for you in your high-performance code?

Edited by Ectara, 08 February 2014 - 09:10 PM.

#4TheComet  Members

2676

Posted 09 February 2014 - 04:35 PM

Could exception handling perhaps be an alternative? You can derive a specific class for every error that can occur from std::runtime_error, and catch it:

try
{
    doTheThing();
}
catch( const RegexInvalidLoopConstructException& e )
{
    // handle error
}
catch( const SyntaxErrorException& e )
{
    // handle error
}
// --etc--
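For illustration, the exception classes named above could be declared roughly as follows. This is only a sketch: the thread never shows their definitions, so the hierarchy (the loop error deriving from the syntax error), the offset member, and the helper checkLoopBounds are all assumptions:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Assumed base class for syntax errors; carries the character offset
// in addition to the message stored by std::runtime_error.
class SyntaxErrorException : public std::runtime_error
{
public:
    SyntaxErrorException(const std::string& what, std::size_t offset)
        : std::runtime_error(what), offset_(offset) {}

    std::size_t offset() const { return offset_; }

private:
    std::size_t offset_;
};

// Assumed more specific error. Since it derives from SyntaxErrorException,
// its catch clause must come first, as in the snippet above.
class RegexInvalidLoopConstructException : public SyntaxErrorException
{
public:
    explicit RegexInvalidLoopConstructException(std::size_t offset)
        : SyntaxErrorException("loop lower bound exceeds upper bound", offset) {}
};

// Hypothetical stand-in for part of a parser: rejects bounds like {5,2}.
void checkLoopBounds(int lower, int upper, std::size_t offset)
{
    if (lower > upper)
        throw RegexInvalidLoopConstructException(offset);
}
```

A handler that only cares about the general category can catch SyntaxErrorException and still read the offset, while one that cares about the specific cause can catch the derived type.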

Edited by TheComet, 09 February 2014 - 04:46 PM.

"I would try to find halo source code by bungie best fps engine ever created, u see why call of duty loses speed due to its detail." -- GettingNifty

#5Ectara  Members

3097

Posted 09 February 2014 - 07:12 PM

Could exception handling perhaps be an alternative? You can derive a specific class for every error that can occur from std::runtime_error, and catch it:

try
{
    doTheThing();
}
catch( const RegexInvalidLoopConstructException& e )
{
    // handle error
}
catch( const SyntaxErrorException& e )
{
    // handle error
}
// --etc--

I hope to avoid exceptions, for the reasons that I stated in my posts above.

I probably would use them, if compilers didn't have an option to disable them. Not only does this provide a more efficient alternative, but should one ever disable them, all code that is capable of recovering if an exception occurs now breaks entirely, and aborts instead of having the exception be catchable. Thus, I guess you could say I'm looking for a way around it, by having some sort of return code process that doesn't penalize code that never encounters an exceptional case.

#6TheComet  Members

2676

Posted 10 February 2014 - 05:28 AM

Exception handling is considered the cleanest approach in C++. However, it is not always viable, such as in your case.

I can recommend reading page 32 and onwards in the following document:

http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf

http://www.gamedev.net/topic/653195-towards-better-error-handling-constructs/

Edited by TheComet, 10 February 2014 - 05:49 AM.

"I would try to find halo source code by bungie best fps engine ever created, u see why call of duty loses speed due to its detail." -- GettingNifty

#7Ectara  Members

3097

Posted 10 February 2014 - 04:49 PM

I can recommend reading page 32 and onwards in the following document:

http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf

I'm sorry, but this didn't provide me with any new information.

http://www.gamedev.net/topic/653195-towards-better-error-handling-constructs/

While the words "exception" and "return code" do occur in the thread, I believe the thread's purpose is defining a new language construct, in a custom language, rather than using existing constructs in a new way.

I have to wonder, am I out of options here? Either exceptions with overhead whether I use them or not, millions of return code checks with minimal information, or some Frankenstein's monster combination of both using polymorphism and dynamic allocation?

#8frob  Moderators

41340

Posted 10 February 2014 - 06:33 PM

I have to wonder, am I out of options here? Either exceptions with overhead whether I use them or not, millions of return code checks with minimal information, or some Frankenstein's monster combination of both using polymorphism and dynamic allocation?

Your example was for regex, but I thought you meant this was going to be for a generalized system.

If your example case was your only case, then different rules apply.

For regex specifically, I wouldn't do anything more than return a single error code.

Developers using your program only need to know that their expression is broken. Regular expressions can be brutally difficult to get right, and helpful hints will only go so far. I don't see any need for anything beyond the simple return code and a few assertions. They can be debug-only checks since the programmers should have fixed any errors with it before release.

If this is something that will be exposed to the user, well, you might as well just have a single error and provide a link to this amazon page. The 550 page book doesn't cover everything but it is a good start on the subject. The average human isn't going to have any success with regular expressions, and even technically-inclined individuals will struggle with a regex of any complexity.


#9Ectara  Members

3097

Posted 10 February 2014 - 06:57 PM

Your example was for regex, but I thought you meant this was going to be for a generalized system.

This is intended to be for a generalized system. The regex library is an example of a localized issue that loses important context when attempting to propagate the information upward to code that is in a position to handle it, or ask the user to rectify the situation.

#10Hodgman  Moderators

49421

Posted 10 February 2014 - 07:41 PM

What's the final destination of the error, when it bubbles back out of the system? Is it just displayed to the user in some way? Just a string to be printed to a log, etc?

#11Ectara  Members

3097

Posted 10 February 2014 - 10:10 PM

In this example, it could be formatted and displayed to the user, if they provide the regex or a higher-level file that happens to contain an invalid regex. If this is, for instance, a regex compilation method that is never intended to fail, because the regex is provided by the code, not from user input, the information would be formatted and placed in an assertion. Optionally, it may suffice for calling code to see the type of error and ignore the specifics, if they are irrelevant.

On a side note, that furthers the discussion, how would you choose to represent syntax error return codes for a regex engine? Say we're restricted to ERE, nothing fancy. Would you return a single generic syntax error code? Would you have a unique code for all possibilities?

Furthermore, would these codes be in their own enumeration, that is to say, however you choose to represent them, they don't have unique values across all modules, but separate types keeps them from being compared against a different group of return codes? Or, would they be part of one giant collection of codes, so that one type can represent any of the codes, and you could propagate the codes to anywhere that would expect them?

#12Hodgman  Moderators

49421

Posted 10 February 2014 - 10:42 PM

You don't really need a generic system then. The regex part needs to be able to spit out a human readable error of why it's failed. You can have bool TryCompile(string) and string GetReasonWhyCompilationFailed(void), etc, etc...

Someone who's using hard-coded regexes can then assert that TryCompile succeeds, and they can log the reason if the assertion fails (assuming you've got an assertion macro that takes a reason string).

Someone who's parsing a user-generated file can then check that all the regexes in the file compile successfully, and if they don't, they can return early from parsing in error, with their own reason string. This reason string could prepend the filename and line number that it was up to, and append the error string from the regex engine.

The GUI system that launched the file-parser can then take that returned string and shove it in a pop-up box, etc.

If that kind of thing covers all your use-cases, then there's no real need for error code enums... There's not a real need for exceptions either, unless it makes aborting the file-parser easier.
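A rough sketch of the TryCompile / GetReasonWhyCompilationFailed pairing and of the file parser prepending its own context. The Pattern class, its bracket-only check, and the helper describeFailure are assumptions made for this example, not the poster's library:

```cpp
#include <cstddef>
#include <string>

// Hypothetical pattern class. For the sketch, "compiling" only checks that
// every '[' is closed by a ']'.
class Pattern
{
public:
    // Returns true on success. On failure the reason is kept in the object.
    bool TryCompile(const std::string& source)
    {
        reason_.clear();
        std::size_t open = std::string::npos;
        for (std::size_t i = 0; i < source.size(); ++i)
        {
            if (source[i] == '[' && open == std::string::npos)
                open = i;
            else if (source[i] == ']')
                open = std::string::npos;
        }
        if (open != std::string::npos)
        {
            reason_ = "unterminated '[' at offset " + std::to_string(open);
            return false;
        }
        return true;
    }

    const std::string& GetReasonWhyCompilationFailed() const { return reason_; }

private:
    std::string reason_;
};

// What a file parser might do: prepend the file name and line number,
// then append the regex engine's own reason string.
std::string describeFailure(const std::string& file, int line, const Pattern& p)
{
    return file + ":" + std::to_string(line) + ": invalid regex: "
         + p.GetReasonWhyCompilationFailed();
}
```

The GUI layer then receives one finished string and needs no knowledge of regex internals.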

Edited by Hodgman, 10 February 2014 - 10:43 PM.

#13Ectara  Members

3097

Posted 11 February 2014 - 08:56 AM

You don't really need a generic system then. The regex part needs to be able to spit out a human readable error of why it's failed. You can have bool TryCompile(string) and string GetReasonWhyCompilationFailed(void), etc, etc...

Out of curiosity, should the code care about why the function failed? I have an integer return code that is returned, and if it isn't successful, it propagates the code upward, but with a boolean approach, it either failed, or it didn't.

Now, this approach has overhead. Every object that can fail must now have a dynamic string object, whether it fails or not.

Edited by Ectara, 11 February 2014 - 12:20 PM.

#14King Mir  Members

2391

Posted 11 February 2014 - 02:15 PM

I have to wonder, am I out of options here? Either exceptions with overhead whether I use them or not, millions of return code checks with minimal information, or some Frankenstein's monster combination of both using polymorphism and dynamic allocation?

You could implement setjmp/longjmp exceptions that would not be disabled by compiler settings (provided that the compiler is linked with a complete C library). With some suitable macros you could make the syntax halfway decent. The problem is, calling setjmp is expensive and must occur at every would-be try block, even in the normal flow of code.

Just listing this as an option.

#15Ectara  Members

3097

Posted 11 February 2014 - 06:19 PM

I have to wonder, am I out of options here? Either exceptions with overhead whether I use them or not, millions of return code checks with minimal information, or some Frankenstein's monster combination of both using polymorphism and dynamic allocation?

You could implement setjmp/longjmp exceptions that would not be disabled by compiler settings (provided that the compiler is linked with a complete C library). With some suitable macros you could make the syntax halfway decent. The problem is, calling setjmp is expensive and must occur at every would-be try block, even in the normal flow of code.

Just listing this as an option.

While I agree that it has its uses, it majorly disrupts the flow of execution, without providing any more information than a return code. My main goal is providing as much information as possible, in a way that I can propagate errors to callers, without the pollution of a billion different return codes in the same namespace.

#16Hodgman  Moderators

49421

Posted 11 February 2014 - 07:15 PM

You don't really need a generic system then. The regex part needs to be able to spit out a human readable error of why it's failed. You can have bool TryCompile(string) and string GetReasonWhyCompilationFailed(void), etc, etc...

Out of curiosity, should the code care about why the function failed? I have an integer return code that is returned, and if it isn't successful, it propagates the code upward, but with a boolean approach, it either failed, or it didn't.

Now, this approach has overhead. Every object that can fail must now have a dynamic string object, whether it fails or not.

You said above that the code doesn't care why, but it must print the reason why to the user. So you end up with a boolean success, and a message for the user.

If storing the error in the regex is a concern, then TryCompile can return the reason string instead of a bool if you like -- where a null error string indicates success.
Or it could return a pair{ bool success, string error };
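The pair-returning variant might look like this in C++. It is a sketch only: the two checks are placeholders standing in for a real parser, and the messages are invented:

```cpp
#include <string>
#include <utility>

// Returns {success, error}. The error string is empty on success, so
// nothing has to be stored in the pattern object itself.
std::pair<bool, std::string> TryCompile(const std::string& source)
{
    if (source.empty())
        return std::pair<bool, std::string>(false, "empty pattern");
    if (source.back() == '\\')
        return std::pair<bool, std::string>(false, "pattern ends with a dangling backslash");
    return std::pair<bool, std::string>(true, std::string());
}
```

The caller tests .first and reads .second only on failure.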

Alternatively, you can copy the Windows API's solution. They allocate an error string per thread, not per object. Many functions fail with an error code that's basically equivalent to ERROR__SUCCESS_IS_FALSE. If you get a failure like this, then you call a global function to retrieve a human readable string of the most recent error to occur on the current thread.
Under this design, you'd have a global ThreadLocal<string> g_error, and TryCompile would write to g_error before returning false.
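In C++11 and later the same design can be sketched with the thread_local keyword in place of the ThreadLocal<string> wrapper. The accessor name GetLastErrorString and the empty-pattern check are assumptions for the example:

```cpp
#include <string>

// One error string per thread rather than one per object.
thread_local std::string g_error;

// Writes the reason to g_error before returning false.
bool TryCompile(const std::string& source)
{
    if (source.empty())
    {
        g_error = "empty pattern";
        return false;
    }
    return true;
}

// Hypothetical accessor; the value is only meaningful right after a failure
// on the calling thread, because a later failure overwrites it.
const std::string& GetLastErrorString()
{
    return g_error;
}
```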

My thinking is that if the ultimate destination of the errors is always a human -- either a user-facing GUI, or an assertion/log message for a debugging programmer -- then what you really want is a very descriptive string, to which you can append specific information, such as line numbers, values of variables involved, etc, etc. An error code doesn't allow for that kind of extension.

Also, if the destination is a human, then at some point you need a mechanism to convert the error code into a string anyway. For the sake of loose-coupling, it seems bad for a regex module to return a code, which is then converted to a string by a completely different module (say, the GUI). This would mean that the GUI module has intimate knowledge about regexes, so that it can translate regex-specific errors to English... It seems cleaner to keep that translation local to the regex module.

However, this then complicates international localization -- you don't really want all your text hard-coded to English...
To get around this, you'd need your error objects to be quite fat, containing all the extended information (line numbers, values of bad variables, etc), and a string which acts as a key into a localization dictionary.

That ends up being something pretty complex, like:

Dict[string, string] locale = { "Err_regex_foo" -> "A regex was bad, see column %d" };

struct Error { string key; vector<Variant> data; }

function TryCompile(...)
...
return Error{ "Err_regex_foo", { 42 } }

string TranslateError( Error e, Dict locale )
format = locale[e.key]
return sprintf( format, e.data )
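One way the pseudocode above might be written in real C++, as a sketch. It simplifies the Variant list to integers only and substitutes "%d" placeholders by hand instead of calling sprintf; falling back to the raw key for a missing translation is an added assumption:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Error
{
    std::string key;         // key into the localization dictionary
    std::vector<long> data;  // simplification: integers only, not a Variant
};

using Dict = std::map<std::string, std::string>;

// Replaces each "%d" in the localized format with the next value in e.data.
std::string TranslateError(const Error& e, const Dict& locale)
{
    Dict::const_iterator it = locale.find(e.key);
    if (it == locale.end())
        return e.key;  // no translation available: fall back to the raw key

    std::string out = it->second;
    std::size_t pos = 0;
    for (std::size_t i = 0; i < e.data.size(); ++i)
    {
        pos = out.find("%d", pos);
        if (pos == std::string::npos)
            break;
        const std::string value = std::to_string(e.data[i]);
        out.replace(pos, 2, value);
        pos += value.size();
    }
    return out;
}
```

With the dictionary entry shown above, an Error holding the key "Err_regex_foo" and the value 42 translates to "A regex was bad, see column 42".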

Edited by Hodgman, 11 February 2014 - 07:16 PM.

#17Ectara  Members

3097

Posted 11 February 2014 - 07:40 PM

You said above that the code doesn't care why, but it must print the reason why to the user.

The truth is, a regex can be used for anything. I have no way of predicting where it will be used, even if it is me that does it. Thus, I cannot know if the code will or won't care, and thus, I leave the opportunity to do so.

If storing the error in the regex is a concern, then TryCompile can return the reason string instead of a bool if you like -- where a null error string indicates success.

The way it is set up in my library is to have a pattern class, and a matcher class. It is possible to compile a new pattern in the same pattern object, and if compilation fails, it has the same state as it did before attempting compilation. It seems odd to keep error information in an object that has a valid state.

Since this is a lower-level library to be used by higher objects, this needs to be written first, yet this is sounding as if the design depends upon what will use it, not the other way around.

#18King Mir  Members

2391

Posted 11 February 2014 - 08:03 PM

I have to wonder, am I out of options here? Either exceptions with overhead whether I use them or not, millions of return code checks with minimal information, or some Frankenstein's monster combination of both using polymorphism and dynamic allocation?

You could implement setjmp/longjmp exceptions that would not be disabled by compiler settings (provided that the compiler is linked with a complete C library). With some suitable macros you could make the syntax halfway decent. The problem is, calling setjmp is expensive and must occur at every would-be try block, even in the normal flow of code.

Just listing this as an option.

While I agree that it has its uses, it majorly disrupts the flow of execution, without providing any more information than a return code. My main goal is providing as much information as possible, in a way that I can propagate errors to callers, without the pollution of a billion different return codes in the same namespace.

You could implement that with a global variable that stores a pointer to the thrown exception. In fact, you can implement all of C++ exception handling and more with setjmp/longjmp. (For instance, you can implement multiple concurrent exceptions, as in D.) The two downsides are speed, and ugly macros.

From your later post it sounds like you're trying to pass error information across a library boundary -- in this case setjmp/longjmp is probably too ugly to expose, but you could use it for internal error propagation.

There are other ways to be fancy: you could pass in an "on error" function, you could pass in a "continuation" which doesn't get called when there's an error, or you can pass around an "Object-or-error" class template. But being fancy is usually bad for a library interface.
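A minimal sketch of what an "Object-or-error" class template could look like. The name Result, its interface, and the example function parseBound are assumptions; this simple version requires T to be default-constructible and always carries a std::string, so it does not by itself avoid the allocation concerns raised earlier in the thread:

```cpp
#include <string>
#include <utility>

// Holds either a value of type T or an error message.
template <typename T>
class Result
{
public:
    static Result ok(T value)
    {
        Result r;
        r.value_ = std::move(value);
        r.ok_ = true;
        return r;
    }

    static Result fail(std::string error)
    {
        Result r;
        r.error_ = std::move(error);
        return r;
    }

    bool hasValue() const { return ok_; }
    const T& value() const { return value_; }            // meaningful only if hasValue()
    const std::string& error() const { return error_; }  // meaningful only if !hasValue()

private:
    Result() : value_(), ok_(false) {}

    T value_;
    std::string error_;
    bool ok_;
};

// Hypothetical use: parse a loop bound such as the 3 in "a{3,5}".
// Overflow on very long digit strings is ignored in this sketch.
Result<int> parseBound(const std::string& text)
{
    if (text.empty())
        return Result<int>::fail("missing loop bound");
    int n = 0;
    for (char c : text)
    {
        if (c < '0' || c > '9')
            return Result<int>::fail("loop bound is not a number: " + text);
        n = n * 10 + (c - '0');
    }
    return Result<int>::ok(n);
}
```

A caller can propagate the Result upward unchanged, so the error text survives across module boundaries without a shared enum.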

#19Ectara  Members

3097

Posted 11 February 2014 - 08:15 PM

You could implement that with a global variable that stores a pointer to the thrown exception. In fact, you can implement all of C++ exception handling and more with setjmp/longjmp. (For instance, you can implement multiple concurrent exceptions, as in D.) The two downsides are speed, and ugly macros.

My goal isn't stack unwinding and disrupting the flow of execution, but the informative aspect; an exception object is constructed on throw, and thus, you don't pay for it if you don't use it. However, you must pay for the potential to unwind the stack and RTTI, whether you use it or not.

From your later post it sounds like you're trying to pass error information across a library boundary

Yes, in a way. Other modules are within the same library that use it, so they're in the same translation unit, but code outside of the regex class will use it. There will be code outside of the library that uses it, too.

Though, since it is a header-only templated implementation, technically, it's in every translation unit that uses it.

But being fancy is usually bad for a library interface.

If there is no solution to this request, then I can just go back to using return codes, and make do. There isn't a point in coming up with exotic solutions that have most of the same pitfalls as the established best practices.

Edited by Ectara, 11 February 2014 - 08:21 PM.

#20frob  Moderators

41340

Posted 11 February 2014 - 10:39 PM

King Mir, on 11 Feb 2014 - 7:03 PM, said:

You could implement that with a global variable that stores a pointer to the thrown exception. In fact, you can implement all of C++ exception handling and more with setjmp/longjmp. (For instance, you can implement multiple concurrent exceptions, as in D.) The two downsides are speed, and ugly macros.
My goal isn't stack unwinding and disrupting the flow of execution, but the informative aspect; an exception object is constructed on throw, and thus, you don't pay for it if you don't use it. However, you must pay for the potential to unwind the stack and RTTI, whether you use it or not.

You pay for it either way.

Exceptions can be implemented with code or by data, or a mixed approach, depending on the compiler.

If they are implemented in code, a runtime cost is incurred every time a try or catch is placed in your code, even if you never throw an exception. It also adds a small cost to every function call's prologue. Based on numbers I've read, the cost is about a 6% penalty globally.

If they are implemented as data tables, there is an executable size cost with a minor lookup fee when exceptions trigger. The exact cost depends on the implementation details. The system normally piggybacks on RTTI. If RTTI is not enabled then you can expect around a 10%-15% increase in executable size depending on how you use classes, vtables, and other factors. If RTTI was already enabled the increase may be under 5%.

So if you use RTTI to get the info you pay for it. If you use exceptions to get the info you pay for it. You get to decide how affordable the payment is.

One of the major long-term complaints about C++ exceptions (and also RTTI) is that unlike many other language features, you absolutely pay a cost by their mere presence. This was a major sticking point in the initial standardization of the language and it remains an open performance issue. (Clicky 1, clicky 2, and many more.) These two features have always been among the first to go in game development, in large part because of the size and speed costs. When you are on a game console or embedded system, having that much permanently wasted memory or that much globally lost performance is a very real concern. There is less pressure than there used to be, but when you are programming on a 66MHz console and paying for cartridge storage by the bit, the cost is huge.

Edited by frob, 11 February 2014 - 11:04 PM.
Clickies

