We don't need no stinkin' assertions

Started by
11 comments, last by Nypyren 7 years, 3 months ago
You know you're having a... well, let's say "interesting" weekend when you find yourself doing this in production code:

//assert(someImportantPointer != nullptr);
if(someImportantPointer == nullptr)
    return;

For backstory, I'm changing the order of operations of a few critical processes, mostly to exploit batching and coherency opportunities. A lot of the changes are "winging it" because the system is old, not well-understood, and full of unwritten assumptions.

One of the few explicit assumptions in the code is that this pointer is valid at a certain moment of execution. I'm making it invalid, on purpose, for performance gains. I suspect the whole chunk of code is full of potential wins like this one, sometimes a 3-5x speedup depending on circumstances. Trouble is, since the system is critical and legacy as hell, it's very hard to know when you're looking at an easy perf win and when you're looking at a really deep, really ugly rabbit hole.


So today, I'm commenting out an assertion and replacing it with an early-out, because it's legitimately better to just silently fail than to crash.


I feel dirty, but sometimes you do what you have to.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Advertisement

I don't see a problem there.

For game systems normally I do both: Assert and return something sane. Most game libraries develop some form of skipable assert, a dialog box that says "This is wrong", with an option to keep going anyway. They should do something moderately sane in response.

Many game systems I've worked with have a generic "failure object". It's a regular old game object, but it is specialized to have all the interfaces have debug-build-only messages, otherwise performing no-ops. So if an operation fails an assertion but is expected to return a game object, it can return a failure object instead.

I'd use that in your case as well. A big noisy skipable assert so programmers know to fix the bug and QA can document it, but return a sane (probably incorrect yet still safe) result so it doesn't break.

Of course, server code and stuff that doesn't go out to game clients often follows a different model of extensive logs and notifications when it fails in final builds, but those are different beasts.

So today, I'm commenting out an assertion and replacing it with an early-out, because it's legitimately better to just silently fail than to crash.
If you care about correctness of results from your critical system, "legitimately better" is quite debatable.

Your need to remove the assert however suggests the code wasn't compiled with some optimization, at least afaik, optimizing code tends to turn of asserts at some point. Maybe it's simpler to gain performance by fixing compile options?

I don't see a problem there.

For game systems normally I do both: Assert and return something sane. Most game libraries develop some form of skipable assert, a dialog box that says "This is wrong", with an option to keep going anyway. They should do something moderately sane in response.

Many game systems I've worked with have a generic "failure object". It's a regular old game object, but it is specialized to have all the interfaces have debug-build-only messages, otherwise performing no-ops. So if an operation fails an assertion but is expected to return a game object, it can return a failure object instead.

I'd use that in your case as well. A big noisy skipable assert so programmers know to fix the bug and QA can document it, but return a sane (probably incorrect yet still safe) result so it doesn't break.

Of course, server code and stuff that doesn't go out to game clients often follows a different model of extensive logs and notifications when it fails in final builds, but those are different beasts.

That reminds me of a bug that took me a hell of a long time to figure out. We had an assert, followed by a safety response, much like the OP.


MyAssert(!badCondition);
if (badCondition)
    return;
RestOfFunction...

Somehow, in our optimized builds the execution managed to get to RestOfFunction despite the fact badCondition was true. It just didn't make any sense, and of course, it was hard to debug our optimized builds to figure out exactly what was going on.

The eventual explanation was that rather than our MyAssert macro being #defined to nothing on our optimized builds, it was defined to __assume that the condition held: https://msdn.microsoft.com/en-us/library/1b3fsfxw.aspx

Using the MSVC __assume keyword let the compiler optimize away the "if (badCondition) return;" statement entirely.

We decided to disable that behaviour (we removed the __assume), because while it makes a certain amount of sense, we felt it was too aggressive an optimization and didn't like the fact that the addition of asserts was actually changing program behaviour (and changing it in release builds only).

Your need to remove the assert however suggests the code wasn't compiled with some optimization, at least afaik, optimizing code tends to turn of asserts at some point. Maybe it's simpler to gain performance by fixing compile options?

I don't think he means the assert is the performance problem. Originally the pointer was never allowed to be invalid. However, the pointer being invalid is now a valid state sometimes (that sounded horrible in my head as I was typing it), but would fail the assertion and crash. It should be allowed to be nullptr.

If some system is dependent of this being correct, couldn't you return something to indicate failure? I don't think that triggers a warning for legacy code that assumes its current type void and discards the result, while new code can make use of it. Actually kind of what frob already mentioned.

Perhaps I'm taking this too seriously. I'd sure post this on my wall of coding horrors fame :D


[quote name='Alberth' timestamp='1485671961' post='5329944'] Your need to remove the assert however suggests the code wasn't compiled with some optimization, at least afaik, optimizing code tends to turn of asserts at some point. Maybe it's simpler to gain performance by fixing compile options? [/quote]

I don't think he means the assert is the performance problem.

I know, but asserts check for "impossible" situations. They are mostly aimed at developers, and do not produce anything of use for a normal user.

So afaik, it is common to set NDEBUG while compiling for a release, which simply removes all assert checks. Under that assumption you wouldn't even stumble on the assert, and need to "fix" it by removing the assert. However, since the OP posted about removing the assert to avoid the crash, clearly the code was not running under NDEBUG. That raises the question what optimization settings were used for compiling the release. Maybe no optimization was used for building the release!

I thought just turning on optimization already set NDEBUG, but that seems to be a false idea, at least for gcc. I learned at least something new from this horrorr thread :D

Some context on how we use assertions is probably in order.

I will spare you all my rant about skippable asserts, since that's a flamewar waiting to happen. Suffice it to say I think they're a really, really Bad Idea.

What's important is that we have two basic modes of thinking about assertions:

1. The world is essentially on fire and continuing to execute code is probably just going to make it worse;
2. The slightly stronger statement - that if we continue to run code, we will certainly corrupt user data in some way.

Outside of these two situations, we ban assertions in server code. They are always on, even in release builds, because they are saying that some crucial invariant of the universe has failed to hold and we MUST stop executing or we will make the world a worse place.

Assertions are reserved for what they should be - not bounds checking, or error handling, or dealing with transient failures.

Assertions in our code are for when shit has splattered off the fan and we are sinking fast.


This particular one deals with a central part of the game simulation. (I won't be more specific.) Taking out the assert() is tantamount to super-gluing down the handles on all the fire alarms in a building.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Outside of these two situations, we ban assertions in server code. They are always on, even in release builds

I completely agree for server code. Servers are things that are basically living code for as long as the project exists. They behave quite differently. Error conditions on servers need to do all kinds of announcements to the developers to let them know something is wrong.

Code that ships on a disk, or a cartridge, or otherwise is gone forever once you release it, that code needs to behave differently. Hence you need to have something that is extremely visible during in-house builds, but will silently fail safe, for whatever a proper definition of safe is. These assertions are to tell the developers something is wrong and get the bug fixed, but it also doesn't block QA or other developers from running the game because it fails 'safely'.

Games that go through certification processes must never crash. It is generally an automatic certification failure.

I've done the same thing quite a few times. You have the assertion to enforce some invariant that should never occur. Six months later the entire architecture has changed enough to break this invariant. You get called over to look at the code, being the author/expert on it, quickly identify that the assumption is no longer worth enforcing, and quickly develop a safe failure-handling code path so the person who's caused this assertion to fail can keep working.

Sometimes it's a case of "just quickly comment the assert out and add [return blah; //hodgman: shit shit shit will fix soon] so you can keep working, and I'll go work on a real fix now".

I will spare you all my rant about skippable asserts, since that's a flamewar waiting to happen. Suffice it to say I think they're a really, really Bad Idea.

Yeah I literally go straight to the crash handler in my assertion failure path, save a log and tear down the process.

I have worked on some very large teams though, and skippable assertions did save quite a bit of time on slightly-broken builds. Quite often the game would be broken, but still working enough for you to keep working. The assertion failure would cause "ASSERT" to be written on the screen in large red text, and made obvious in the logs, so any logs/screenshots/videos attached to a bug report/etc were obviously taken after an assertion had been skipped, tainting them.

IMHO the first thing a team like this should do though is to put a CI solution in place that tests every commit and rejects any that fail tests before they're merged into master. Master should never contain code that fails the test cases, and the CI server should enforce that. If you do end up with a broken build, you should expand the test cases immediately.

The need for skippable assertions as a crutch to deal with broken builds is then greatly reduced.

As an alternative solution:


if (importantPointer == nullptr) return;

// keep all existing code
assert(importantPointer != nullptr);
....

This topic is closed to new replies.

Advertisement