The Bag of Holding

Concurrent programming bug - the solution

Posted 04 February 2013 · 832 views

In my last post I outlined a bug that recently bit me in a reference-counting mechanism in a concurrent system.

If you haven't solved the mystery yet, here are some hints, drawn from common guesses I've seen from various people:
  • The reference count is implemented using atomic intrinsics.
  • Atomicity and alignment are proven correct for the platform in question.
  • Mutual exclusion and other locking mechanisms are not necessary for the solution.
  • Reference counts are "correct" at all times, in that no reference is ever leaked.
  • RAII is already in use, so it will not magically make the problem go away.
If you're still scratching your head, here's one last clue: the bug manifests as an access to an object that has already been deleted. I strongly encourage everyone to try to figure it out before reading on to the following spoilers.

Spoiler hint 1

Spoiler hint 2


A couple of lucky people picked up the solution pretty fast, but for the most part this is something that the programmers I've shown it to simply aren't thinking about. Ironically, a few have had "aha!" moments where they recalled various coding conventions and rules about reference counting, and suddenly understood why those rules exist.
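To make the kind of rule mentioned here concrete, below is a minimal sketch of one common reference-counting convention: the caller holds a reference of its own until it has finished handing the resource out. It is a single-threaded replay of one possible schedule, not the real concurrent code. The names `Resource`, `retain`, and `release` are hypothetical, deallocation is modelled as a flag so the replay can inspect it safely, and a plain `int` stands in for the atomic counter. The schedule chosen (the first asynchronous procedure finishing before the second increment) is an assumption that fits the clues above, not a quote of the original fix.

```cpp
// Hypothetical stand-in for the reference-counted resource. "Deallocation"
// is modelled as a flag, so nothing here ever touches freed memory.
struct Resource {
    int refs = 0;       // plain int: this replay is single-threaded
    bool alive = true;  // false once the count has dropped to zero
};

void retain(Resource& r) { ++r.refs; }

void release(Resource& r) {
    if (--r.refs == 0) r.alive = false;  // count hit zero: resource is gone
}

// Replays the steps as listed in the puzzle, under the assumed schedule in
// which the first asynchronous procedure completes before step 5 runs.
// Returns whether the resource is still alive when step 5 is about to touch it.
bool alive_at_step5_as_listed() {
    Resource r;            // steps 1-2: allocate, count = 0
    retain(r);             // step 3: count = 1
    release(r);            // step 4: first procedure finishes at once, count = 0
    return r.alive;        // state seen by step 5's increment
}

// Same schedule, but the caller takes a reference of its own first and only
// drops it after both hand-offs are done.
bool alive_at_step5_with_caller_reference() {
    Resource r;
    retain(r);             // caller's own reference, count = 1
    retain(r);             // reference for the first procedure, count = 2
    release(r);            // first procedure finishes at once, count = 1
    bool alive = r.alive;  // state seen by the second increment
    retain(r);             // reference for the second procedure, count = 2
    release(r);            // caller drops its own reference, count = 1
    release(r);            // second procedure finishes, count = 0
    return alive;
}
```

In this model the first function reports the resource as already gone, while the second reports it as alive. At no point is a reference leaked in either version, which is consistent with the hints listed earlier.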

Thanks for playing!

Concurrent programming is hard, mmmkay?

Posted 01 February 2013 · 1,004 views

Here's a fun bug that I've had in one of my projects for quite some time, which I finally figured out and fixed today.

As with most bugs of this nature, it took dozens of readings through the code to spot it, and by the time I finally realized what I'd done, I felt incredibly stupid. In hindsight it's bloody obvious, but it highlights exactly the kind of mistake that is frighteningly easy to make in concurrent systems.

I'll pose the puzzle here first, and post the solution later. If you figure out my mistake, feel free to say so - but please don't post spoilers :-)
// Step 1: allocate a reference counted resource
// Step 2: initialize reference count to 0
// Step 3: increment reference count
// Step 4: pass resource to an asynchronous (multithreaded/concurrent) procedure
// Step 5: increment reference count again
// Step 6: pass the same resource to a second asynchronous procedure
// Step 7: return from function and wait for the processes to finish

// Both asynchronous procedures decrement the reference counter when they complete
// When the reference counter hits 0, the resource is deallocated
Good luck!
