Error handling

3 comments, last by Ashaman73 9 years, 11 months ago

I've been working on a few different projects, and in each one I have handled errors in a different way. I am always unsure of myself when it comes to this. I never really know when I should let stuff fail, when I should write to log files, when I should return something indicating an error, when I should use error states, etc.

I know it's pretty common to let stuff that should break your code break it, so that there is a fixable crash. But I often find that when I do this I am letting things break my code that shouldn't. For example, an assert(pointer != NULL) will crash when some allowable condition produces a NULL pointer, just some condition I didn't think about when I originally wrote the code.

Anyway, do you guys have any methodology for this? Anything you have found that works best for you for error handling in general? Any important remarks on logging? At what level do you do the error checking, and at what level do you handle it? Does anyone ever use standard exceptions?

The company I work for basically makes a server that is a polling engine, so every time you allocate memory, convert something, or call a function, you check for success or failure. If you can continue after a failure, you fail over to another path where possible. When something happens (an error, a warning, or a status or information change), it is pushed to a log service. The log can be filtered by level of detail (LOD) and viewed in real time, as well as being pushed asynchronously to a file. If you view the log through the client, you can reference the objects an entry is related to (and eventually help). I would concentrate on making try/catch blocks as discrete as possible and on checking everything before you use it. Look into Boost smart pointers. Structure your code so that if one thing fails you aren't screwed. Make your logging as rich as possible. A big thing that separates us from the competition is that you can identify problems (communication path, device, and host problems) from the log. Then, if you have a support contract, you can email it to us for help.

I know it's pretty common to let stuff that should break your code break it, so that there is a fixable crash. But I often find that when I do this I am letting things break my code that shouldn't...

Making your code robust means letting it keep running even after an error has occurred. Error handling has one important task: display the error clearly to the developer and the user, and gather enough information to track it down. Obviously, one of the clearest ways of displaying an error is to let the code crash. And to be honest, many users will only react after a crash has occurred multiple times.

So, first of all, try to find a way to clearly display an error:

1. Use asserts to catch development errors.

2. Catch exceptions and transform them into a bug message the user can act on.

3. Include some auto-generated bug reports.
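The three points above could be sketched roughly like this. Everything here is an assumption made for illustration: `BugReport`, `scaleDamage`, and `runGuarded` are made-up names, and a real bug report would carry far more context (version, platform, call stack, and so on).

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical auto-generated report: a message for the user, details for the developer.
struct BugReport {
    bool failed;
    std::string userMessage;
    std::string details;
};

// 1. An assert catches a development error (a caller breaking the contract).
//    It disappears when the code is compiled with NDEBUG defined.
int scaleDamage(int damage, int multiplier) {
    assert(multiplier > 0 && "multiplier must be positive");
    return damage * multiplier;
}

// 2 and 3. A boundary that catches exceptions and turns them into a
// user-manageable message plus a report that is ready to be sent.
BugReport runGuarded(const std::string& taskName,
                     const std::function<void()>& task) {
    BugReport report{false, "", ""};
    try {
        task();
    } catch (const std::exception& e) {
        report.failed = true;
        report.userMessage = "Something went wrong during '" + taskName +
                             "'. A bug report has been prepared.";
        report.details = taskName + ": " + e.what();
    } catch (...) {
        report.failed = true;
        report.userMessage = "Something went wrong during '" + taskName +
                             "'. A bug report has been prepared.";
        report.details = taskName + ": unknown exception";
    }
    return report;
}
```

The user only ever sees `userMessage`; `details` is what goes into the report for the developer.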

Then, try to make non-fatal failure conditions more robust. If you encounter an error, try to use some fallback mechanism, log the error, communicate the error to the user, and continue. A typical example is a missing texture: if you fail to load a texture, use a fallback texture (e.g. a black texture) and log the error; the texture glitch will be clearly visible to the user. But be careful about hiding important failure conditions. If an error is really hard to track down or really important to track, include a graceful crash.
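As a rough illustration of the fallback-texture idea, assuming a made-up `Texture` struct and a hypothetical `TextureLoader` callback in place of a real image loader:

```cpp
#include <string>
#include <vector>

// Hypothetical texture type; a real engine would hold a GPU handle instead.
struct Texture {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> pixels;  // RGBA, 4 bytes per pixel
    bool isFallback = false;
};

// Hypothetical loader signature: returns true and fills `out` on success.
typedef bool (*TextureLoader)(const std::string& path, Texture& out);

// A 1x1 opaque black texture, so the glitch is clearly visible on screen.
Texture makeFallbackTexture() {
    Texture t;
    t.width = 1;
    t.height = 1;
    t.pixels.assign(4, 0);  // R, G, B = 0
    t.pixels[3] = 255;      // fully opaque
    t.isFallback = true;
    return t;
}

// Non-fatal failure: log it, substitute the fallback, and keep running.
Texture loadTextureOrFallback(const std::string& path, TextureLoader loader,
                              std::vector<std::string>& errorLog) {
    Texture loaded;
    if (loader(path, loaded)) {
        return loaded;
    }
    errorLog.push_back("failed to load texture '" + path +
                       "', using fallback texture");
    return makeFallbackTexture();
}
```

The `isFallback` flag is there so that a test build could still count, or refuse to ship with, any fallback textures in use.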


A typical example is a missing texture: if you fail to load a texture, use a fallback texture (e.g. a black texture) and log the error; the texture glitch will be clearly visible to the user. But be careful about hiding important failure conditions. If an error is really hard to track down or really important to track, include a graceful crash.

This is a good point. I actually do use a fallback texture for textures that fail to load, and then log the mistake.

What about things like assigning an incorrect animation to a mesh, that is, let's say the animation's bone structure does not match the mesh's bone tree? If there is a GUI, it is simple enough to alert the user with "Cannot assign animation to mesh because bones do not match" or something. But at the engine library level, would we want to log this type of error and treat it much more seriously?

This is where I get most confused - where is the line of separation? When writing code that other software developers will be using, should errors have harsher results than in, say, GUIs? If I am writing a software library, should I be treating things like incorrect texture parameters, or incorrectly named entities (if no two entities are allowed to have the same name), as reasons for complete failure of the function? Should assigning an animation to an incompatible mesh simply fail with no consequence? Or should it fail and log?
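For reference, the "fail and log" option for the animation case might look something like the sketch below. This is only one of the alternatives being asked about, not a recommendation from the thread, and `Mesh`, `Animation`, `AssignResult`, and `assignAnimation` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified engine types.
struct Animation {
    std::string name;
    std::vector<std::string> boneNames;
};

struct Mesh {
    std::vector<std::string> boneNames;
    const Animation* animation = nullptr;
};

enum class AssignResult { Ok, BoneMismatch };

// Library-level check: on a mismatch the mesh is left untouched, the problem
// is logged, and the caller (a GUI, a tool, a game) decides how serious it is.
AssignResult assignAnimation(Mesh& mesh, const Animation& anim,
                             std::vector<std::string>& errorLog) {
    if (mesh.boneNames != anim.boneNames) {
        errorLog.push_back("cannot assign animation '" + anim.name +
                           "' to mesh: bones do not match");
        return AssignResult::BoneMismatch;
    }
    mesh.animation = &anim;
    return AssignResult::Ok;
}
```

With a return value like this, a GUI can show the message to the user while a batch tool can treat the same result as a hard failure.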


This is where I get most confused - where is the line of separation?

There's no clear line. Here are some tips:

1. Robust code is nice, but it costs time and makes your code more complex, so don't overdo it.

2. Write robust code for errors that have a limited effect on other parts of the code. When you are unsure whether a certain error could cause more errors in other parts of the code, let it fail.

3. Write robust code where your team would otherwise suffer from work-in-progress states, e.g. missing animations, textures, game states, etc.

4. Fail hard, but gracefully, where it is important that a part of the code works and any error should be discovered as soon as possible.

5. Use macro-controlled asserts to switch between robust handling and fail handling: robust during development, fail handling for testing.
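Tip 5 could be sketched as follows. The macro name `CHECK_OR` and the build switch `STRICT_CHECKS` are invented for this example; the point is only that one macro picks between "log and fall back" and "report and stop" at compile time.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical build switch: define STRICT_CHECKS (e.g. -DSTRICT_CHECKS) for
// test builds and leave it undefined during day-to-day development.
#ifdef STRICT_CHECKS
    // Fail handling: report where the check failed and stop immediately.
    #define CHECK_OR(cond, fallback_stmt)                                  \
        do {                                                               \
            if (!(cond)) {                                                 \
                std::fprintf(stderr, "%s:%d: check failed: %s\n",          \
                             __FILE__, __LINE__, #cond);                   \
                std::abort();                                              \
            }                                                              \
        } while (0)
#else
    // Robust handling: report the problem, run the fallback, keep going.
    #define CHECK_OR(cond, fallback_stmt)                                  \
        do {                                                               \
            if (!(cond)) {                                                 \
                std::fprintf(stderr, "%s:%d: check failed: %s\n",          \
                             __FILE__, __LINE__, #cond);                   \
                fallback_stmt;                                             \
            }                                                              \
        } while (0)
#endif

// Example use: an out-of-range volume is clamped in robust builds
// and aborts in strict builds.
int clampVolume(int volume) {
    CHECK_OR(volume >= 0 && volume <= 100, return volume < 0 ? 0 : 100);
    return volume;
}
```

Both variants print the same diagnostic, so a problem noticed during development can be matched against the one that stops a test build.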

Displaying information about errors in the GUI is not the best solution for end users. Most, if not all, will just ignore it. If you want to detect an error, use a graceful fail plus a ready-to-send bug report.

