TapewormTuna

Bugs caused by stupidity are hard to find


My rendering code is meant to be API-agnostic (I am actually using two different APIs). I have a VideoDriver class and a pure virtual HardwareBuffer class. Whenever I need a new HardwareBuffer, I create it via the VideoDriver class. Internally, the VideoDriver implementation stores a vector of the actual HardwareBuffer objects. When you create a new one, it pushes a new object to the back of the vector and returns a pointer to the virtual class.

This worked until I changed some of the code. After that, nvogl.dll would sporadically crash because it tried to dereference a null pointer. I eventually figured out that the crash was related to the creation of the buffers, but after probing various parts of the code I couldn't figure out why it wasn't working.

I spent maybe 10 hours searching through my code trying to figure out what could've possibly gone wrong. I eventually found out that after a certain number of buffers were created, the vector would reallocate its storage, and every pointer it had previously returned became dangling (which trashed the VBO handle). The fact that the memory addresses of the elements change after a reallocation had never crossed my mind. I think I've been using Java too much recently.
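For anyone who hasn't hit this before, here is a minimal sketch of the problem and one possible fix. It is not the actual code from the post: GLBuffer, createBuffer(), handle() and the helper function names are made up for illustration, and the "VBO handle" is just an integer. The first helper shows that an element's address changes when a by-value vector grows; the second shows that pointers stay valid if the vector only holds owning pointers to heap-allocated objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the classes described in the post.
struct HardwareBuffer {
    virtual ~HardwareBuffer() = default;
    virtual unsigned handle() const = 0;
};

struct GLBuffer : HardwareBuffer {
    explicit GLBuffer(unsigned h) : vbo(h) {}
    unsigned handle() const override { return vbo; }
    unsigned vbo;  // pretend VBO handle
};

// Buggy pattern: objects stored by value. When the vector outgrows its
// capacity it moves every element to new storage, so earlier addresses die.
// Returns true if the first element's address changed after growth.
bool byValueAddressMoves() {
    std::vector<GLBuffer> buffers;
    buffers.emplace_back(1u);
    const auto before = reinterpret_cast<std::uintptr_t>(&buffers.front());
    const std::size_t oldCapacity = buffers.capacity();
    while (buffers.capacity() == oldCapacity) {
        buffers.emplace_back(static_cast<unsigned>(buffers.size() + 1));
    }
    const auto after = reinterpret_cast<std::uintptr_t>(&buffers.front());
    return before != after;
}

// One possible fix: the vector owns pointers, and each object lives in its
// own heap allocation that never moves when the vector reallocates.
class VideoDriver {
public:
    HardwareBuffer* createBuffer() {
        buffers_.push_back(std::make_unique<GLBuffer>(nextHandle_++));
        return buffers_.back().get();
    }

private:
    std::vector<std::unique_ptr<GLBuffer>> buffers_;
    unsigned nextHandle_ = 1;
};

// Returns true if the first buffer is still intact after many more creations.
bool ownedPointersStayValid() {
    VideoDriver driver;
    HardwareBuffer* first = driver.createBuffer();
    for (int i = 0; i < 1000; ++i) {
        driver.createBuffer();
    }
    return first->handle() == 1u;
}
```

Other options would be a std::deque, which does not invalidate pointers to existing elements when you push at either end, or handing out indices instead of pointers. Calling reserve() only postpones the problem unless the maximum number of buffers is known up front.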

Edited by TapewormTuna


Indeed.  I just had my own episode of stupid.

I was implementing CEF for a UI system in my game, but not using the standard C/C++ family, so I had to use some bindings.  Unfortunately, the existing bindings for the language I was using were about 3 years old, and since CEF has very few qualms about changing its API, I had to update a lot of the function signatures (usually deleting functions entirely) in order to get it to build at all.  After spending about two straight days trying to get the DLL's debug symbols working due to various mishaps (which is a hilarious story in itself), I discovered that it was crashing on initialization.

Okay, fine.  Digging through, I realized it was trying to copy a configuration struct (CefSettings for the initiated), presumably to marshal it over to another thread.  After some debug prints and sanity checks, I realized that it was failing to allocate because it was trying to copy a string in this struct with some ungodly length, like 182375342980 or something.  "That doesn't make any sense", I thought to myself.  The string itself is empty, the debugger shows it as empty, and that same struct in my application's code is initialized properly.  How can a struct in my code reliably get corrupted immediately down the stack when passed into a well-used and tested library?  Of course, after asking that question I immediately realized the problem: the struct definitions in the binding were as out of date as the function signatures.  So I spent the next few hours fixing all the struct definitions in the binding.
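To illustrate how an empty string can show up with an enormous length, here is a rough sketch of a stale struct definition, assuming the mismatch was a field added in a newer library version. These are not the real CEF definitions: NativeString, LibrarySettings, StaleSettings, new_flag and the helper functions are all hypothetical. The caller fills in the old, smaller layout, and the code then reads the same bytes through the newer layout the way a library built against it would.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical string type: a pointer plus a length.
struct NativeString {
    const char* str;
    std::size_t length;
};

// Hypothetical layout the library was compiled with (newer version).
struct LibrarySettings {
    std::size_t size;
    int new_flag;         // assumed field added after the binding was written
    NativeString locale;
};

// Hypothetical layout the stale binding still declares.
struct StaleSettings {
    std::size_t size;
    NativeString locale;  // sits at a different offset than the library expects
};

static_assert(sizeof(StaleSettings) < sizeof(LibrarySettings),
              "this sketch assumes the newer struct is larger");

// Returns true if the two definitions disagree about where 'locale' lives.
bool layoutsDisagree() {
    return offsetof(StaleSettings, locale) != offsetof(LibrarySettings, locale);
}

// Returns the string length the library would read from the caller's struct.
std::size_t lengthSeenByLibrary() {
    StaleSettings fromBinding{};  // zero-initialised: null string, length 0
    fromBinding.size = sizeof(StaleSettings);

    // Simulate whatever bytes happen to follow the smaller struct in memory.
    unsigned char raw[sizeof(LibrarySettings)];
    std::memset(raw, 0xCD, sizeof(raw));
    std::memcpy(raw, &fromBinding, sizeof(fromBinding));

    // Reinterpret the same bytes using the library's layout.
    LibrarySettings asLibrarySeesIt;
    std::memcpy(&asLibrarySeesIt, raw, sizeof(asLibrarySeesIt));
    return asLibrarySeesIt.locale.length;  // garbage, not 0
}
```

On the caller's side everything looks fine, which matches the debugger showing an empty string. The garbage only appears once the callee reads the bytes with its own idea of the layout, because the length field it expects lies past the end of the smaller struct.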

 

Lesson learned:  Don't try to change a binding "as little as possible to get it to compile."  It took me almost three days to fix something that should've just been common sense.  I did experience all kinds of fun things in the process, like spending half a day just trying to figure out why my built DLL crashed the Visual Studio debugger 100% of the time, and discovering that neither the "how to build" documentation nor the minimum space requirements should ever be trusted.

Edited by SeraphLance
