Telastyn

Suggestions on tracking down a particular bug.

This topic is 4954 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I have managed to stumble across a pretty nasty bug while porting the server side of my game over to Windows. Somehow, I manage to hit all the interesting ones... Thankfully, the bug has a KB article specific to it: http://support.microsoft.com/default.aspx?scid=kb;en-us;240746

Unfortunately, looking through the code, I've not found the situation the KB describes. So, before I go back to looking for my needle, I'd like to ask those here who know more about compiler inner workings what nuances could cause the same effect. Note that the same code compiles and runs fine on NetBSD using a relatively old version of gcc [actually egcs-2.91.66].

Some similar things I've thought of, but do not know if they're applicable:
- the member function wasn't implemented, simply declared, so VC6 chokes trying to find it.
- the member function was implemented, but as null.
- the member function was optimized out.
- a member function with no parameters is run without parens [object.foo], but VC6 thinks it's a redeclaration.

I've tried compiling the code under gcc with all warnings enabled, hoping it would flag an empty or unimplemented function, but nothing turned up. Anyway, any suggestions on how to approach this problem, or insight into the root cause beyond Microsoft's article, would be appreciated.

[edit: found another situation where this occurred, though it's with some functor code...: http://www.flipcode.com/cgi-bin/fcarticles.cgi?show=64065 ]

C1001 can occur for any number of reasons, not just the one described in the KB entry you linked to. Template code, namespace issues, all sorts of things can cause VC 6 to spit out a C1001. My first word of advice would be to upgrade your compiler; at the very least, check that you have the latest service pack (I believe MSVC 6 is on service pack 6 right now).

But if you're porting code from NetBSD, it shouldn't be a big deal to grab the MSVC .NET 2003 Toolkit from the Microsoft site and try using that on your code. You can also get Windows ports for gcc such as MinGW.

Hrm, the 2003 toolkit compiler using VC6's libs/includes at least provided a slew of errors to point me kinda where to look. It looks like somewhere the STL templates are being fed a blank type.

Many thanks. I'll investigate further tomorrow.

Funny things might help, like making sure there is a carriage return at the end of all your resource files (for one), and of your header and class files. If you are using Unix-style line endings instead of Windows ones, I suppose there is a very outside chance it could be that.
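The line-ending suspicion is cheap to rule out mechanically. Below is a minimal sketch, not anything from the project in question: `count_line_endings` and `mixed_line_endings` are hypothetical helper names, and the only assumption is that the file's raw bytes have been read into a `std::string`.

```cpp
#include <cstddef>
#include <string>

// Counts of each line-ending style found in a buffer.
struct LineEndingCounts {
    std::size_t crlf;     // "\r\n" (Windows)
    std::size_t bare_lf;  // "\n" not preceded by "\r" (Unix)
    std::size_t bare_cr;  // "\r" not followed by "\n"
};

// Hypothetical helper: scan the raw bytes of one source file.
LineEndingCounts count_line_endings(const std::string& data) {
    LineEndingCounts c = {0, 0, 0};
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\r') {
            if (i + 1 < data.size() && data[i + 1] == '\n') {
                ++c.crlf;
                ++i;  // skip the '\n' that completes this pair
            } else {
                ++c.bare_cr;
            }
        } else if (data[i] == '\n') {
            ++c.bare_lf;
        }
    }
    return c;
}

// True when more than one line-ending style appears in the buffer.
bool mixed_line_endings(const std::string& data) {
    const LineEndingCounts c = count_line_endings(data);
    const int styles = (c.crlf > 0) + (c.bare_lf > 0) + (c.bare_cr > 0);
    return styles > 1;
}
```

When feeding it a real file, open the stream with `std::ios::binary`; in text mode the Windows runtime translates line endings before the program sees them, which would hide exactly what is being looked for.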

Also, things can get odd if some compiler options are changed between compiles and the cache isn't cleared, like swapping between precompiled headers and not. If the precompiled headers option is switched on and the project doesn't suit them, weird stuff can happen.

If STL is being fed a blank type, I would check any #define used with templates that are being given to the STL. And I suppose check for namespaces being properly used, and if necessary fully-qualifying a few things just to be safe.
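One way to make a blank type show up early is to name the type once with a typedef and hand only that name to the STL. This is a sketch under assumptions: `ELEMENT_TYPE`, `element_type`, `element_list` and `make_elements` are made-up names for illustration, not identifiers from the code being ported.

```cpp
#include <cstddef>
#include <vector>

// Assumption: ELEMENT_TYPE stands in for whatever macro the project
// uses to select a type. It is hypothetical, not from the original code.
#define ELEMENT_TYPE int

// Name the type exactly once. If ELEMENT_TYPE expands to nothing, this
// declaration is ill-formed by itself, before any template is involved.
typedef ELEMENT_TYPE element_type;

// Containers are declared in terms of the typedef, never the macro.
typedef std::vector<element_type> element_list;

// Small usage example: build a list holding 0, 1, ..., n-1.
element_list make_elements(std::size_t n) {
    element_list v;
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<element_type>(i));
    }
    return v;
}
```

If the macro were defined as empty, a standard-conforming compiler should reject the `typedef` line itself, which is far easier to read than an error from deep inside the container headers. Older compilers may be more lenient here, so treat this as a diagnostic aid rather than a guarantee.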

Sounds like a nasty thing to track down though, good luck :-/

From my experience with tracking down internal compiler errors, you need to isolate the line(s) of code that produces the error. This is not always the one where the compiler complains! It's a lot of work, but once you do that, the reason is often not hard to detect or work around by tweaking the code.

Here's a step by step guide:
1. Back up your current (faulty) version of the code. Make sure you do not overwrite any older versions you may have (that actually happened to me a few times, IIRC..).
2. If you have any recent backups:
. 2.a. Find the latest backup that does not produce the error.
. 2.b. Use a text comparison tool to find all the differences between the current and older versions, and comment them out in the current version. Alternatively, use compiler directives to disable those sections (I think commenting out is easier though). Mark all the disabled areas by some marker, e.g. "// FOOBAR", so that you can easily find them later.
. 2.c. Go to step 4.
3. If you don't have a backup, gradually comment out sections of your code until the error no longer occurs. Start with the block or function where the compiler complains, and work your way up as far as necessary. As in 2, mark all the disabled areas by some marker, e.g. "// FOOBAR", so that you can easily find them later.
4. Now you can do a binary search for the problematic area(s). Start with the smallest disabled area of the code (if there is more than one):
. 4.a. Try to enable the entire area. If it doesn't produce the error, continue to the next disabled area. Otherwise disable it again, and continue to 4.b.
. 4.b. Try to enable half the area.
. 4.c. If it produces the error, try to enable the other half instead.
. 4.d. If it doesn't produce the error, recurse for the remaining disabled area(s). As your disabled areas get split, keep marking them with the special marker so that they can be easily found later.
5. You should be left with one or more small isolated disabled areas, each of which produces the error if any part of them is enabled. You may want to back this up, especially if there are a lot of such areas, since it probably took a lot of work to get to this stage.
6. This is where it gets hard. The remaining disabled code may contain the problem itself, or it may only malfunction because of a problem in some other place. If you're lucky, it's the former, and you should be able to identify the problem(s) after several minutes of staring at the code and scratching your head (or something of that sort). It may help to go do something else for a while and then come back with a fresh perspective. It may also help to try and rewrite the code some other way, to see if that sits better with the compiler. If you're not that lucky, it's time for the real detective work. I can't help you with that, so good luck.
7. After you've finally found and fixed a problem, repeat from step 4 with the remaining disabled areas, until you can enable all the remaining code (you should be able to find the remaining areas by searching for that marker).
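
The binary search in step 4 can be sketched in code. This is only an illustration under simplifying assumptions: the disabled code is modelled as `n` numbered chunks, exactly one chunk triggers the error, and `breaks` is a hypothetical callback standing in for "rebuild with these chunks enabled and see whether the compiler falls over". It uses `std::function`, so it needs a C++11 compiler rather than VC6.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// breaks(enabled) is a stand-in for a rebuild: it returns true when the
// error appears with exactly the flagged chunks enabled.
typedef std::function<bool(const std::vector<bool>&)> BuildCheck;

// Returns the index of the chunk that triggers the error.
// Assumes exactly one of the n chunks is the culprit.
std::size_t find_culprit(std::size_t n, const BuildCheck& breaks) {
    assert(n > 0);
    std::vector<bool> enabled(n, false);  // everything starts disabled
    std::size_t lo = 0, hi = n;           // culprit lies in [lo, hi)
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        // Enable the first half of the remaining range (step 4.b).
        for (std::size_t i = lo; i < mid; ++i) enabled[i] = true;
        if (breaks(enabled)) {
            // The culprit is in this half: disable it again and narrow.
            for (std::size_t i = lo; i < mid; ++i) enabled[i] = false;
            hi = mid;
        } else {
            // This half is clean: leave it enabled and search the rest.
            lo = mid;
        }
    }
    return lo;
}
```

With one culprit, this takes about log2(n) rebuilds instead of n. Real code is messier, since several areas may interact, which is why the steps above keep every disabled area marked and repeat the search after each fix.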

Indeed, I had thought of that last night after posting. Many of the actual classes are already included in my Windows client, so I checked the diff between the client classes and the server classes, and focused on the differences. Still nothing though. The client doesn't use the STL, so it likely couldn't trigger the bug anyway.

I did manage to get the thing to compile using the 2003 toolkit, and at this point I'm likely to write it off as a VC++ 6 issue and move on.

I found the/a solution/workaround to the bug in question. Explicitly including libcp.lib rather than libcpd.lib [ie, remove debugging for libc ?] allows VC++6 to build the app nicely. I also did some futzing with other debugging options [mainly the preprocessor _DEBUG/NDEBUG] which might have influenced things...
