Strange performance differences and crashes on different systems


Hi everyone,

I have written a game that uses a custom engine, and it seems to work on most systems. I even got it to run on my old laptop with a GT630M (at 20 FPS in 720p, but still). However, I am seeing strange behavior on some of my customers' and testers' systems. For example, one guy reported 15 FPS at 3440x1440 with a GTX 970. I didn't believe it and bought a 970 to test it, only to find out that it runs at 30-40 FPS with the default settings on my test system. That system has a Core 2 Quad processor and DDR2 RAM, and despite that I get more than twice the performance. How does this make any sense? And some other guy has random crashes every few minutes that I cannot reproduce.

What should I do about that? My code seems to be (mostly) correct; I personally didn't experience any crash with any of the released versions of my game.

Cheers,

Magogan


Antivirus software, browsers, not enough RAM. Building Google Chrome from source in parallel. Bad GPU drivers (or no driver installed at all).

26 minutes ago, Magogan said:

I personally didn't experience any crash with any of the released versions of my game.

This usually tells you very little in terms of compatibility. Generally the only time this kind of info is useful is when programming for fixed hardware / a fixed environment, such as a console. You might, for example, have loads of errors that your particular system is quietly recovering from for you, or be using features that are only present / only work that way on your particular system.

It would probably be useful to know what platform / APIs (graphics, sound, etc.) you are using to start with. I'm guessing DX11 from your other posts. There may be some kind of reference / debug implementation that will help flag up errors (I haven't used DirectX for years).
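For DX11 specifically, the debug layer is the closest thing to that: create the device with the debug flag and the runtime will report API misuse in the debugger output. A minimal sketch, assuming the Windows SDK is installed (error handling trimmed):

```cpp
// Minimal sketch: create the D3D11 device with the debug layer enabled in
// debug builds so the runtime reports API misuse in the debugger output.
#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")

HRESULT CreateDeviceWithDebugLayer(ID3D11Device** device, ID3D11DeviceContext** context)
{
    UINT flags = 0;
#if defined(_DEBUG)
    // Requires the D3D11 SDK layers (Windows SDK / "Graphics Tools" optional feature).
    flags |= D3D11_CREATE_DEVICE_DEBUG;
#endif
    D3D_FEATURE_LEVEL obtainedLevel;
    return D3D11CreateDevice(
        nullptr,                     // default adapter
        D3D_DRIVER_TYPE_HARDWARE,
        nullptr,                     // no software rasterizer module
        flags,
        nullptr, 0,                  // use the default feature level list
        D3D11_SDK_VERSION,
        device,
        &obtainedLevel,
        context);
}
```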

Part of the reason game engines (Unity, Unreal, etc.) are so successful is that they do a lot of this compatibility testing for you (unless you are doing something particularly funky).

I am no expert on the subject, but here are some general thoughts:

On PCs or phones / tablets, crashes and slowdowns can be caused by a multitude of different things. You need to test on loads of different systems (combinations of OS, OS patches, GPU, drivers, antivirus, etc.), and you should give customers a way to send you a log from their machine so you can get some environment info when they have a problem and identify patterns (although don't collect this info without their permission, data protection...). Also code defensively: check return values of API calls, check memory allocations for failure, and in all cases provide meaningful error messages with the code location and so on; don't just crash (a small sketch of what that might look like follows below). Keep non-time-critical asserts in release builds.
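As an illustration of the "check return values and log a code location" point, here is a hypothetical sketch; LogHResult and the log file name are made up for the example, not anything from the engine in question:

```cpp
// Hypothetical sketch of logging failed API calls with file/line, so a
// customer can send you a log instead of just reporting "it crashed".
#include <windows.h>
#include <cstdio>

void LogHResult(const char* file, int line, HRESULT hr)
{
    if (FILE* f = std::fopen("game_log.txt", "a"))  // example path only
    {
        std::fprintf(f, "%s(%d): call failed, HRESULT 0x%08lX\n",
                     file, line, static_cast<unsigned long>(hr));
        std::fclose(f);
    }
}

#define CHECK_HR(call)                                          \
    do {                                                        \
        HRESULT hr_ = (call);                                   \
        if (FAILED(hr_)) LogHResult(__FILE__, __LINE__, hr_);   \
    } while (0)

// Usage: CHECK_HR(device->CreateBuffer(&desc, &initData, &buffer));
```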

You may be able to pay a company to do some compatibility testing for you, and / or utilize virtual machines. Big companies often have rooms full of different hardware / configurations to test on. It can be a nightmare, but my advice is to work out your lowest-spec target and work to that to start with. Often there will be low-hanging fruit, easy fixes that cure problems for a large percentage of machines. Other times you might need separate code paths for different environments.

And some other guy has random crashes every few minutes that I cannot reproduce.


That describes perfectly how faulty your code is; maybe consider rewriting it.

I already tested it on 4 different systems and with 7 different GPUs in total. The only crashes I had were due to an old GPU driver on my oldest laptop. My code is not that bad and I do check a lot of return values where it makes sense. I also check all pointers before using them.

I usually don't run out of memory. I actually managed to crash Windows with a BSOD once without any allocation failing (I experimented with a ridiculously high render distance). According to this Stack Overflow thread, it does not make much sense to check for failed allocations, since there is no way to recover in most cases.

I am using DirectX 11, and I made sure to only use features that are available on all DX11 systems. The nature of the crashes doesn't make me believe that something is wrong with my code, as they happen randomly. If I were using any API incorrectly, you should be able to reproduce the error on that system, and it would happen on more systems. But only 1-2% of the users have problems, so it's either a strange bug that I didn't catch or something is wrong with their systems.

I guess it may actually be related to anti-virus software.

41 minutes ago, Magogan said:

According to this Stack Overflow thread, it does not make much sense to check for failed allocations, since there is no way to recover in most cases.

Well, if nothing else, you the developer now know that the crash was due to a failed allocation and not something else. This information could also be useful to the user (perhaps they could close their parallel build of Google Chrome?). If you just crash, it could have been anything, and you are left praying to the gods as if crashes were some cosmic intervention from the astral plane.

Besides this, why are you getting a failed allocation in the first place? Have you got a memory leak? Is the app trying to allocate an unreasonable amount of memory / resources? Could there be a fallback method (maybe a smaller texture)? Etc.
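To make the point concrete, even an unrecoverable out-of-memory condition can at least be reported before the program dies. A minimal sketch, where ChunkData is a hypothetical type standing in for whatever large allocation a voxel engine makes:

```cpp
// Hypothetical sketch: report out-of-memory instead of crashing silently.
#include <new>
#include <cstdio>
#include <cstdlib>

struct ChunkData { unsigned char voxels[32 * 32 * 32]; };  // hypothetical

ChunkData* AllocateChunk()
{
    ChunkData* chunk = new (std::nothrow) ChunkData();
    if (chunk == nullptr)
    {
        // Recovery may be impossible, but now the log tells you (and the user) why.
        std::fprintf(stderr, "Out of memory: failed to allocate %zu bytes for a chunk\n",
                     sizeof(ChunkData));
        std::abort();
    }
    return chunk;
}
```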

"crashed because of anti-virus" is usually non-sense.

From my personal experience of random crashes: I had some users who experienced them, sometimes as soon as the app started, sometimes only after an hour. Eventually I tracked it down to a bug related to SSE: it requires aligned memory, but due to my bug the alignment wasn't guaranteed; it just happened to be aligned on my system, thus "works on my system".

It's not memory alignment. My game is a 64-bit application and all 64-bit allocations on Windows are 16-byte aligned by default.
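For what it's worth, the 16-byte default only covers the pointer the allocator hands back; an SSE member buried inside one of your own structs, or an over-aligned type stored in a pre-C++17 container, can still end up under-aligned. A quick sanity check, assuming MSVC/x64 and a hypothetical Particle struct:

```cpp
// Quick sanity check for the alignment theory (Particle is a hypothetical type).
#include <xmmintrin.h>
#include <cassert>
#include <cstdint>

struct Particle
{
    __m128 position;  // requires 16-byte alignment for aligned SSE loads/stores
    float  lifetime;
};

void VerifyAlignment(const Particle* p)
{
    // Fires in debug builds if this instance happens to be mis-aligned on this machine.
    assert(reinterpret_cast<std::uintptr_t>(&p->position) % alignof(__m128) == 0);
}
```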

I don't have failed allocations normally. I just tried a 2048-meter render distance (which is too much for a voxel game without a level-of-detail system); it needed over 128 GB of RAM, and after a while I got a BSOD because of it. Apart from that experiment, I never ran out of memory on any system.

11 minutes ago, Magogan said:

I don't have failed allocations normally.

How do you know this if you are not checking for failed allocations? Again, as with Zaoshi Kaba's example, this is just an example; always question your reasoning. Given how vague the information is, it is difficult to magically guess what is causing your problem. This is where a logical, methodical approach to debugging will help you pin down what is going wrong.

In your case, though, since you are using voxels, as a wild stab in the dark I'd double-check how you are handling graphics resources. I'm assuming you are using pools / overallocating buffers for the worst case rather than attempting to allocate resources on the fly. If not, this is a prime candidate for slowdowns / crashes (there's a rough sketch of the idea below).
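To illustrate what "overallocating for the worst case" might look like in D3D11: create a dynamic buffer once, sized for the largest chunk you expect, and refill it with Map/WRITE_DISCARD instead of creating and releasing buffers as chunks change. Vertex and kMaxVerticesPerChunk are hypothetical names for the sketch, not the engine's actual types:

```cpp
// Hypothetical sketch of an overallocated, reusable dynamic vertex buffer in D3D11.
#include <d3d11.h>
#include <cstring>

struct Vertex { float position[3]; float uv[2]; };   // hypothetical layout
constexpr UINT kMaxVerticesPerChunk = 65536;         // worst-case estimate

ID3D11Buffer* CreateChunkVertexBuffer(ID3D11Device* device)
{
    D3D11_BUFFER_DESC desc = {};
    desc.ByteWidth      = kMaxVerticesPerChunk * static_cast<UINT>(sizeof(Vertex)); // allocate once, up front
    desc.Usage          = D3D11_USAGE_DYNAMIC;                                      // CPU writes, GPU reads
    desc.BindFlags      = D3D11_BIND_VERTEX_BUFFER;
    desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;

    ID3D11Buffer* buffer = nullptr;
    device->CreateBuffer(&desc, nullptr, &buffer);   // check the HRESULT in real code
    return buffer;
}

void UploadChunkVertices(ID3D11DeviceContext* ctx, ID3D11Buffer* buffer,
                         const Vertex* vertices, UINT count)
{
    D3D11_MAPPED_SUBRESOURCE mapped;
    if (SUCCEEDED(ctx->Map(buffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped)))
    {
        std::memcpy(mapped.pData, vertices, count * sizeof(Vertex));
        ctx->Unmap(buffer, 0);
    }
}
```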

If the machines you're having poor performance on are laptops, make sure they're actually using their dedicated graphics card. Many laptops that ship these days have two graphics adapters: an integrated GPU and a dedicated one. Sometimes OS or driver settings cause apps to run on the integrated GPU, which usually doesn't perform nearly as well as the dedicated GPU. One commonly used hint to avoid this is shown below.
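The hint in question (behaviour ultimately depends on the driver and user settings, so treat it as a request rather than a guarantee) is to export two variables from the executable itself; NVIDIA Optimus and AMD switchable-graphics drivers look for them and prefer the dedicated GPU:

```cpp
// Hint to hybrid-graphics drivers to run this executable on the dedicated GPU.
// Must live in the .exe itself (not a DLL); honoured by NVIDIA Optimus and
// AMD PowerXpress drivers, though driver/user settings can still override it.
extern "C"
{
    __declspec(dllexport) unsigned long NvOptimusEnablement = 0x00000001;
    __declspec(dllexport) int AmdPowerXpressRequestHighPerformance = 1;
}
```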
