 Introduction to Debugging
An excellent article. We now have a place to point beginners to as the introduction to debugging.

Great job.

 User Rating: 1823

Great material!

I'd personally offer one minor addition that I feel is extremely important: the single most effective way to prevent bugs is eager failure. Eager failure is a simple principle: if something goes wrong, bring the entire system to a halt as soon as possible, but no sooner.

For example, suppose you have a Square Root function, which is called from your AI code to do pathfinding. (This is kind of a bad example because you should just be using a library function, but the illustration is still valid.) Giving a negative number to SquareRoot is an error. There are two different ways you can deal with this: if SquareRoot detects that it has been passed a negative number, it can either return a "dummy value" like 0, or it can crash immediately.

Which is better? Obviously nobody wants to see the game crash - but a crash is far easier to contain than the random behavior that you might get by using the dummy value method. If your pathfinding gets a bogus 0 back and doesn't double-check itself, your units might do all kinds of crazy things. Worse, they might not do anything immediately, leaving a bug that takes time to appear. Diagnosing these kinds of bugs is extremely hard and time consuming. In the long run, it's better for the program to "barf" as quickly as possible. Of course, be sure to provide helpful error messages.
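To make the two options concrete, here is a minimal sketch. The function names SquareRootLenient and SquareRootEager are made up for illustration, and the eager version throws an exception as a stand-in for "crash immediately" (an assert or abort would follow the same principle):

```cpp
#include <cmath>
#include <stdexcept>

// "Dummy value" approach: the bad input is silently swallowed and the
// caller carries on with a bogus 0.
double SquareRootLenient(double x) {
    if (x < 0.0) {
        return 0.0;
    }
    return std::sqrt(x);
}

// Eager failure: the bad input stops execution at the point where it is
// detected, with a message that says what went wrong.
double SquareRootEager(double x) {
    if (x < 0.0) {
        throw std::domain_error("SquareRoot: negative input");
    }
    return std::sqrt(x);
}
```

With the lenient version a bad distance calculation flows on into pathfinding unnoticed; with the eager version the failure surfaces in SquareRoot itself, right next to the call that caused it.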

Maybe a bit closer to home: if your bank's software has a bug when calculating interest on your savings account, would you prefer for their software to crash and leave your account inaccessible for a few hours, or to have $100 a month mysteriously deducted from your account for no apparent reason?


Using compiler warnings, strict compiler checks (like typesafe containers), and using assertions are all practices that follow the principle of eager failure. The best bug is one that causes a compiler error: you can fix it before you even run the code. The next best bug is one that fires a runtime error and leaves you an explicit error message to check into. The worst bugs are ones that just happen and rely on someone noticing that something is wrong - like money disappearing from your bank account.
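As a rough illustration of the "best bug is a compiler error" idea, the sketch below wraps raw doubles in distinct types so that mixing them up fails at compile time. Meters, Seconds and SpeedMetersPerSecond are hypothetical names invented for this example, not something from the article:

```cpp
#include <type_traits>

// Distinct wrapper types: a Seconds value cannot be passed where a
// Meters value is expected.
struct Meters  { double value; };
struct Seconds { double value; };

inline double SpeedMetersPerSecond(Meters distance, Seconds time) {
    return distance.value / time.value;
}

// The compiler itself confirms that the two types do not convert into
// each other, so swapped arguments are rejected before the code runs.
static_assert(!std::is_convertible<Seconds, Meters>::value,
              "Seconds must not convert to Meters");
static_assert(!std::is_convertible<Meters, Seconds>::value,
              "Meters must not convert to Seconds");
```

Calling SpeedMetersPerSecond(Seconds{2.0}, Meters{10.0}) would not compile, whereas the same mistake with two plain doubles would compile and quietly give the wrong answer.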


Dirge of the Derelict
[Work - Egosoft] [Epoch Language] [Journal - Peek into my shattered mind]


 User Rating: 1950

Good article, definitely one that needed to be written. :)
Off the top of my head, here are a few more things I've seen used ( whether I personally use them or not ) in debugging -

When the program is running under a debugger, the asm instruction int 3 causes the program to break, so you can do something like this -
#define ForceBreakpoint() __asm int 3

if you always want to break under some condition. For instance, you might conditionally compile a special assert macro that makes the debugger break on the offending line whenever an assertion fails -
#ifdef _BREAKONASSERT
// evaluate the condition once, report it, and break only when it failed
#define Assert(isTrue) do { const bool ok_ = !!(isTrue); _Assert( TEXT(__FILE__), __LINE__, ok_ ); if (!ok_) ForceBreakpoint(); } while (0)
#endif

Not something I generally use, but there you are..

If you want to turn __LINE__ into a string to save yourself a step ( since in general you're always going to be using it as a string ), you can coax it into one like this:
// kludge for getting the line number as a string
#define STRINGIZE(x) #x
// expands x then STRINGIZEs it
#define EXPAND(x) STRINGIZE(x)
// stringize line number by first expanding it
#define LINE_STR EXPAND(__LINE__)

Then use LINE_STR in place of __LINE__. For some reason this doesn't work correctly in MSVC if your debug information format is set to Edit and Continue (/ZI), though...
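A small sketch of how the stringized line number might be used: because LINE_STR expands to a string literal, it can be glued to __FILE__ at compile time. SOURCE_LOCATION and WhereAmI are names made up for this example:

```cpp
#include <string>

#define STRINGIZE(x) #x
#define EXPAND(x) STRINGIZE(x)
#define LINE_STR EXPAND(__LINE__)

// Adjacent string literals are concatenated by the compiler, so this
// becomes a single literal such as "main.cpp(42)".
#define SOURCE_LOCATION __FILE__ "(" LINE_STR ")"

std::string WhereAmI() {
    return SOURCE_LOCATION;
}
```

The two-level EXPAND/STRINGIZE step matters: STRINGIZE(__LINE__) on its own would yield the text "__LINE__" rather than the number.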

If you want a call stack trace in error reports from testers or end users on exceptions ( or asserts assuming your asserts throw ), there's a few ways to do it, one I've seen was something like this:
// hacky callstack trace on exception
// (adjacent string literals concatenate on their own, so no ## is needed)
#define TRY(funcName) static string _message = TEXT(__FILE__) TEXT("(") \
	TEXT(LINE_STR) TEXT(")") TEXT(" : stack unwinding in ") \
	TEXT(#funcName) TEXT("\n"); try {
#define CATCH } catch(...) { CallStackTrace.push_back( _message ); throw; }

// start empty and reserve a decent capacity at startup, e.g. CallStackTrace.reserve(2048),
// so push_back is unlikely to throw bad_alloc while unwinding
std::vector<string> CallStackTrace;



And then you can iterate through the vector and dump the strings in your outermost catch block. This of course requires placing try/catch around all of your functions in order to have a thorough report, but you can use a TRYd/CATCHd version for performance critical sections that will be compiled out in your final release. Another way is to use StackWalk64 in Dbghelp.h.
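For anyone who wants to try the idea outside of MSVC, here is a portable sketch of the same technique using plain char strings. All the names here (TRACE_TRY, TRACE_CATCH, g_callStackTrace, LoadTexture, LoadLevel, RunAndCollectTrace) are invented for the example, and it only covers void functions:

```cpp
#include <stdexcept>
#include <string>
#include <vector>

#define TRACE_STRINGIZE(x) #x
#define TRACE_EXPAND(x) TRACE_STRINGIZE(x)

// One entry is appended per function the exception passes through.
std::vector<std::string> g_callStackTrace;

#define TRACE_TRY(funcName) \
    static const char* const traceMessage_ = \
        __FILE__ "(" TRACE_EXPAND(__LINE__) ") : stack unwinding in " #funcName; \
    try {
#define TRACE_CATCH \
    } catch (...) { g_callStackTrace.push_back(traceMessage_); throw; }

void LoadTexture() {
    TRACE_TRY(LoadTexture)
    throw std::runtime_error("file not found");
    TRACE_CATCH
}

void LoadLevel() {
    TRACE_TRY(LoadLevel)
    LoadTexture();
    TRACE_CATCH
}

// Outermost catch block: by the time the exception arrives here, the
// trace holds the innermost function first.
std::vector<std::string> RunAndCollectTrace() {
    g_callStackTrace.clear();
    try {
        LoadLevel();
    } catch (const std::exception&) {
        // a real program would dump g_callStackTrace to a log here
    }
    return g_callStackTrace;
}
```

Note that the trace is built while the stack unwinds, so the entries come out innermost first (LoadTexture, then LoadLevel).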

Anyways, again it was a well written article, nice job. :)
Daniel

EDIT: It appears that even the FORUMS hate macros, and I can't get it to format quite right. :D

 User Rating: 1047

As others have said, this is one of the articles that GDNet can really do with.
On the topic of asserts, one thing that I found annoying about the standard assert() macro is that when it pops up the dialog box, other threads keep running. I ended up writing my own version of Assert(), which suspends all threads in the current process except for the current thread. That stops everything else freaking out while the user decides what to do about the assert.

For anyone who cares:

Assert.h:
//==========================================================================
// Assert.h - Custom assert statement
//==========================================================================

#ifndef __ASSERT_H__
#define __ASSERT_H__

#ifdef _DEBUG
#	define Assert(exp) (void)((!!(exp)) || (DbgAssertFailed(#exp,__FILE__,__LINE__,__FUNCTION__), 0) )
#else
#	define Assert(exp) ((void)0)
#endif

void DbgAssertFailed(const char* szExp, const char* szFile, unsigned int nLine, const char* szFunc);

#endif /* __ASSERT_H__ */




Assert.cpp:
//==========================================================================
// Assert.cpp - Custom assert statement
//==========================================================================

#include <cstdio>    // sprintf
#include <cstring>   // strrchr
#include <string>
#include <windows.h>
#include <tlhelp32.h>

void FreezeAllThreadsExceptThis()
{
THREADENTRY32 theThread;
DWORD dwProcess = GetCurrentProcessId();

	HANDLE hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD,dwProcess);
	if(hSnapshot == INVALID_HANDLE_VALUE)
		return;
	theThread.dwSize = sizeof(theThread);
	if(!Thread32First(hSnapshot,&theThread))
	{
		CloseHandle(hSnapshot);
		return;
	}

	do {
		if(theThread.th32OwnerProcessID != dwProcess)
			continue;
		if(theThread.th32ThreadID != GetCurrentThreadId())
		{
			HANDLE hThread = OpenThread(THREAD_SUSPEND_RESUME,FALSE,theThread.th32ThreadID);
			if(hThread)
			{
				SuspendThread(hThread);
				CloseHandle(hThread);
			}
		}
		theThread.dwSize = sizeof(theThread);
	} while(Thread32Next(hSnapshot,&theThread));
	CloseHandle(hSnapshot);
}

void ResumeAllThreads()
{
THREADENTRY32 theThread;
DWORD dwProcess = GetCurrentProcessId();

	HANDLE hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD,dwProcess);
	if(hSnapshot == INVALID_HANDLE_VALUE)
		return;
	theThread.dwSize = sizeof(theThread);
	if(!Thread32First(hSnapshot,&theThread))
	{
		CloseHandle(hSnapshot);
		return;
	}

	do {
		if(theThread.th32OwnerProcessID != dwProcess)
			continue;
		if(theThread.th32ThreadID != GetCurrentThreadId())
		{
			HANDLE hThread = OpenThread(THREAD_SUSPEND_RESUME,FALSE,theThread.th32ThreadID);
			if(hThread)
			{
				ResumeThread(hThread);
				CloseHandle(hThread);
			}
		}
		theThread.dwSize = sizeof(theThread);
	} while(Thread32Next(hSnapshot,&theThread));
	CloseHandle(hSnapshot);
}

void DbgAssertFailed(const char* szExp, const char* szFile, unsigned int nLine, const char* szFunc)
{
char szBuff[32];

	if(!szFile) szFile = "??";
	const char* szRealFile = strrchr(szFile,'\\');
	if(!szRealFile) szRealFile = szFile;
	else szRealFile++;

	std::string str("Assertion failed:\nFile: ");
	sprintf(szBuff,"%u (",nLine);	// nLine is unsigned, so %u rather than %d
	str += szRealFile;
	str += ":";
	str += szBuff; str += szFunc;
	str += ")\nExpression:\n";
	str += szExp;
	str += "\n\nDebug?";

	FreezeAllThreadsExceptThis();
	// MessageBoxA: the message is a char string, so call the ANSI version explicitly
	if(MessageBoxA(NULL,str.c_str(),"Assertion failed!",MB_YESNO|MB_ICONEXCLAMATION) == IDYES)
		_asm {int 3};
	ResumeAllThreads();
}



One thing to note is that it doesn't play nicely if you suspend threads in your application. I haven't needed to yet, so I haven't bothered implementing that work-around, but it'd be simple enough to build a std::list or std::vector of thread IDs to resume instead of resuming them all.

 User Rating: 2014

A few tips for the watch window:
  • @err is the return value of GetLastError()

  • @err,hr is the error message associated with @err

  • p,n (where p is a pointer and n is a literal constant, neither a symbolic constant nor a variable) treats the pointer p as an array of n entries, instead of displaying only the first entry when you expand it. Example: myvector._MyFirst,5 will display the first 5 entries of myvector (where myvector is a std::vector<>).

  • @clk is the clock. I read somewhere on the net that since watches are evaluated from top to bottom, you can do simple profiling while debugging your code by adding two watches: "@clk" and then "@clk = 0"

  • if you are into asm, great news: the registers can be viewed from the watch window - just add the name of the register (eax, ecx, ...)

  • postfixing a variable name with ,wm will treat the value as a Windows message (ie it will display WM_COMMAND instead of 273)

These are not documented very well, so I guess you'll find the list useful.

Regards,

-- Emmanuel D. [blog, in French] [blog, very bad googlized translation] [NEW: English version of teh blog! (WIP)]

 User Rating: 1828

Excellent - I've been waiting for this to appear... I even had a placeholder for it in the DX FAQ ready and waiting

Keep up the good work!
Jack


Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

 User Rating: 1947

I found this section EXTREMELY useful.

Quote:

# Run the program in the debugger and do what you did to make the program crash again. The debugger should catch it this time, telling you that it has halted the program because "an unhandled exception of type 0xc0000005 has occurred: Access Violation" and a memory address.
# Look at the memory address reported. If it's 0x00000000, or a very low value like 0x0000000B, then we're probably looking at a null pointer dereferencing. If it's a higher value like 0x00455CD2, then we're probably looking at a pointer corruption bug.
# There are a few other special codes to look out for as well - values close to 0xCDCDCDCD, 0xCCCCCCCC or 0xBAADF00D indicate an uninitialised variable, while values close to 0xDDDDDDDD and 0xFEEEFEEE indicate recently deleted variables. If you see these in a pointer, it doesn't mean that the pointer is pointing to uninitialised or deleted memory - it means that the pointer itself is uninitialised or has been deleted. The other one to watch out for is 0xFDFDFDFD - it can indicate that you're reading past the beginning or end of a buffer.
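The rules of thumb in the quoted section can be written down as a small lookup. This is only a sketch: ClassifyFaultAddress is a made-up helper, and both the 0x10000 cut-off for "very low" and the 0x1000 tolerance for "close to" are assumptions chosen for illustration, not values from the article:

```cpp
#include <cstdint>
#include <string>

namespace {
// Assumption: "close to" a fill pattern means within this many bytes.
const std::uint32_t kTolerance = 0x1000;

bool IsNear(std::uint32_t addr, std::uint32_t pattern) {
    const std::uint32_t diff = addr > pattern ? addr - pattern : pattern - addr;
    return diff <= kTolerance;
}
}  // namespace

// Maps a 32-bit faulting address to the likely cause described above.
std::string ClassifyFaultAddress(std::uint32_t addr) {
    if (addr < 0x10000) {  // assumption: treat the first 64 KB as "very low"
        return "null pointer dereference";
    }
    if (IsNear(addr, 0xCDCDCDCD) || IsNear(addr, 0xCCCCCCCC) ||
        IsNear(addr, 0xBAADF00D)) {
        return "uninitialised pointer";
    }
    if (IsNear(addr, 0xDDDDDDDD) || IsNear(addr, 0xFEEEFEEE)) {
        return "pointer that has been deleted";
    }
    if (IsNear(addr, 0xFDFDFDFD)) {
        return "read past the beginning or end of a buffer";
    }
    return "possible pointer corruption";
}
```

For instance, ClassifyFaultAddress(0x0000000B) reports a null pointer dereference, while 0x00455CD2 falls through to the pointer corruption case.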


 User Rating: 75

Very nice article, good work!
GameDev has needed an introduction to debugging for a while; finally we have a link to refer newcomers to.

 User Rating: 1743

Quote:
Original post by FBMachine
The asm instruction int 3 when the program is attached to the debugger causes


Or you could simply use DebugBreak().

The article did not mention memory trashing bugs (unless I missed it): the ones that appear to work properly in Debug builds but crash in Release builds. Or the code might behave differently if you just attach a debugger or add a single debug print. These are the worst to debug IMHO.

 User Rating: 1050

Quote:
Original post by tksuoran
Quote:
Original post by FBMachine
The asm instruction int 3 when the program is attached to the debugger causes


Or you could simply use DebugBreak().


The only minor inconvenience with DebugBreak, IIRC, is that it shows up on the call stack - so the frame you get given immediately after the break is almost always one call too deep compared to what you care about. __asm int 3, on the other hand, doesn't touch the stack, so you can insert it in the middle of a function and get the right stack out of it.
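One possible middle ground, sketched below, is a macro that picks a break mechanism per compiler. FORCE_BREAKPOINT, BREAK_IF and ClampHealth are names invented for this example. On MSVC it uses the __debugbreak() intrinsic, which as far as I know is emitted inline (so it should behave like __asm int 3 with respect to the call stack) and is also usable in x64 builds where inline asm is not available. The non-MSVC branch raises SIGTRAP, which is an assumption about a POSIX-style platform and does add a frame of its own:

```cpp
#include <csignal>

#if defined(_MSC_VER)
#  include <intrin.h>
#  define FORCE_BREAKPOINT() __debugbreak()
#else
// Assumption: a POSIX-style platform where SIGTRAP stops in the debugger.
#  define FORCE_BREAKPOINT() std::raise(SIGTRAP)
#endif

// Breaks only when the condition holds; the condition is evaluated once.
#define BREAK_IF(condition) \
    do { if (condition) { FORCE_BREAKPOINT(); } } while (0)

// Hypothetical caller: stop in the debugger as soon as health goes negative.
int ClampHealth(int health) {
    BREAK_IF(health < 0);
    return health < 0 ? 0 : health;
}
```

Without a debugger attached, hitting the breakpoint will normally terminate the program, so this kind of macro is best kept to debug builds.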

 User Rating: 2118

Quote:
Original post by soconne
I found this section EXTREMELY useful.

Quote:

# Run the program in the debugger and do what you did to make the program crash again. The debugger should catch it this time, telling you that it has halted the program because "an unhandled exception of type 0xc0000005 has occurred: Access Violation" and a memory address.
# Look at the memory address reported. If it's 0x00000000, or a very low value like 0x0000000B, then we're probably looking at a null pointer dereferencing. If it's a higher value like 0x00455CD2, then we're probably looking at a pointer corruption bug.
# There are a few other special codes to look out for as well - values close to 0xCDCDCDCD, 0xCCCCCCCC or 0xBAADF00D indicate an uninitialised variable, while values close to 0xDDDDDDDD and 0xFEEEFEEE indicate recently deleted variables. If you see these in a pointer, it doesn't mean that the pointer is pointing to uninitialised or deleted memory - it means that the pointer itself is uninitialised or has been deleted. The other one to watch out for is 0xFDFDFDFD - it can indicate that you're reading past the beginning or end of a buffer.


You'll find this more useful -

http://www.nobugs.org/developer/win32/debug_crt_heap.html

 User Rating: 1015
