SLI with an advanced OpenGL game?

4 comments, last by theagentd 9 years ago

Hello.

I have a rather advanced game engine that uses deferred shading, lots of post-processing, etc. I also have two GTX 770s in SLI, and for a while I was able to get around 1.9x scaling using a small SLI profile I made with Nvidia Inspector, but after adding some new special effects and post-processing it no longer works. In addition, the last time it DID work it somehow seemed to corrupt the GL_TEXTURE_2D_ARRAY texture used for particles...

The point is that the entire engine was made with SLI in mind. No framebuffers are reused between frames, except in my temporal supersampling, which buffers frames so that each frame reuses the texture from N frames ago rather than the previous frame's, to compensate for N GPUs running in parallel with Alternate Frame Rendering (AFR). All I want to do is disable all driver-side synchronization of framebuffers and the like, but no matter what SLI compatibility settings I use, the game won't scale beyond a single GPU.
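
To illustrate, here's a stripped-down sketch of that buffering scheme (the struct and names are made up for this post, not my actual code); with N GPUs in AFR, frame i reads the history texture written N frames earlier, which already lives on the GPU rendering frame i:

#include <GL/glew.h>
#include <vector>

// Minimal sketch of the frame-history rotation described above. Each GPU gets
// its own ping-pong pair of history targets, so nothing has to be copied
// between GPUs for the temporal pass.
struct TemporalHistory {
    int gpuCount = 2;                    // assumed number of GPUs running AFR
    int bufferCount = 2 * gpuCount;      // one ping-pong pair per GPU
    std::vector<GLuint> textures;
    std::vector<GLuint> fbos;
    long frameIndex = 0;

    void init(int width, int height) {
        textures.resize(bufferCount);
        fbos.resize(bufferCount);
        glGenTextures(bufferCount, textures.data());
        glGenFramebuffers(bufferCount, fbos.data());
        for (int i = 0; i < bufferCount; ++i) {
            glBindTexture(GL_TEXTURE_2D, textures[i]);
            glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA16F, width, height);
            glBindFramebuffer(GL_FRAMEBUFFER, fbos[i]);
            glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                                   GL_TEXTURE_2D, textures[i], 0);
        }
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
    }

    // History produced N frames ago (skip blending for the first N frames,
    // since these targets start out uninitialized).
    GLuint historyTexture() const {
        return textures[(frameIndex + gpuCount) % bufferCount];
    }

    // Target for this frame's temporal resolve output.
    GLuint currentFbo() const {
        return fbos[frameIndex % bufferCount];
    }

    void endFrame() { ++frameIndex; }
};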

What can I do to disable all synchronization? Are there specific compatibility bits for this for OpenGL games? Is there some way of debugging the driver's behavior to find out what's going on?

Heh, yeah. Doing SLI or Crossfire on your own is a nightmare. The GL API doesn't help at all.

Here's the situation: you cannot debug it. Some of NVIDIA's performance tools (Nsight) might help, but sometimes these things blow up or fail to give productive results in GL. Basically, something in your code is shutting off SLI or breaking it. It might be a bad Map or BufferData. It might be an extension that isn't supported.

The way to find it is to throw an FPS counter in and then shut off the entire pipeline. Then start bringing phases of the render back in until you find which piece breaks the speedup. It's going to be boring and time-consuming, and the offending call may not make any sense when viewed from outside the driver. Good luck.
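
Something like this, just to illustrate the shape of it (the pass names and hooks are made up, not anything from your engine):

#include <chrono>
#include <cstdio>

// Hypothetical engine hooks, stand-ins for whatever the real renderer has.
void renderShadowPass();
void renderGBufferPass();
void renderLightingPass();
void renderParticlePass();
void renderPostProcessing();
void swapBuffers();
bool running();

// Debug toggles, flipped from a key binding or console. Turn everything off,
// then re-enable one pass at a time and watch which one kills the scaling.
bool enableShadows   = true;
bool enableParticles = true;
bool enablePostFx    = true;

void renderFrame() {
    if (enableShadows)   renderShadowPass();
    renderGBufferPass();
    renderLightingPass();
    if (enableParticles) renderParticlePass();
    if (enablePostFx)    renderPostProcessing();
}

void mainLoop() {
    using clock = std::chrono::steady_clock;
    auto last = clock::now();
    int frames = 0;
    while (running()) {
        renderFrame();
        swapBuffers();
        ++frames;
        auto now = clock::now();
        if (now - last >= std::chrono::seconds(1)) {
            std::printf("FPS: %d\n", frames);   // compare against the single-GPU number
            frames = 0;
            last = now;
        }
    }
}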
SlimDX | Ventspace Blog | Twitter | Diverse teams make better games. I am currently hiring capable C++ engine developers in Baltimore, MD.

Oh, god. That sounds horrible. Sigh. Any general tips on what I should and shouldn't do? I do texture streaming from a second OpenGL context, framebuffer blits...

If it sounds clever, or if it's a relatively recent feature, then you're potentially in trouble, and it sounds like you have a lot of surface area for trouble. The last time I did this was on AMD drivers, so I don't have recent info on what to look out for. I would try Nsight and see whether it's able to give you any useful information. Other than that, it's all just guesswork. But without a known-good and a known-bad version of the code to bisect from, there's little else you can do.


I managed to get it working again. It turns out Nvidia Surround was preventing the game from going into proper fullscreen, which fucked things up. I get around 90-95% scaling with two GTX 770s in my game (scene 1: 54 FPS ---> 105, scene 2: 67 FPS ---> 128). The particles are STILL messed up though, and it doesn't make sense why. Every other frame seems to make the particle textures flicker, but not on all particles. I am also getting quite a few debug output messages related to SLI performance:

ARB_debug_output message
ID: 131234
Source: API
Type: PERFORMANCE
Severity: MEDIUM
Message: SLI performance warning: SLI AFR copy and synchronization for texture mipmaps (120).

This only seems to happen once per synchronized texture during initialization though, which is good. It triggers on three textures whose mipmaps I render to through an FBO, and on two other textures without mipmaps that I call glBlitFramebuffer() on. Anyway, that isn't really what's bothering me here. I really need to figure out why the particle textures are flickering. The particle texture is just a GL_TEXTURE_2D_ARRAY where each layer is its own particle, initialized like everything else.
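
For reference, a debug-output callback along these lines is enough to surface messages like the one above (a sketch using the core GL 4.3 entry points; the ARB_debug_output variants work the same way):

#include <GL/glew.h>
#include <cstdio>

// Filter for PERFORMANCE-type messages so the driver's SLI copy and
// synchronization warnings stand out from the rest of the debug output.
static void APIENTRY onGlDebugMessage(GLenum source, GLenum type, GLuint id,
                                      GLenum severity, GLsizei length,
                                      const GLchar* message, const void* userParam) {
    (void)source; (void)length; (void)userParam;
    if (type == GL_DEBUG_TYPE_PERFORMANCE) {
        std::printf("[GL PERF] id=%u severity=0x%x: %s\n", id, severity, message);
    }
}

void installGlDebugCallback() {
    // Requires a debug context for reliable output.
    glEnable(GL_DEBUG_OUTPUT);
    glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);   // report at the offending call site
    glDebugMessageCallback(onGlDebugMessage, nullptr);
}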

I think I have the basics figured out. To get SLI working with an OpenGL game, you need to use NVIDIA Inspector.

1. Create a new profile for your game.

2. Add the game's executable(s) to the profile.

3. Set "SLI rendering mode" to "SLI_RENDERING_MODE_FORCE_AFR2", hex value 0x00000003. Do NOT touch the "NVIDIA predefined---" values, and there is no need to choose a GPU count.

4. (Optional) Enable the SLI indicator to show the (pretty worthless) SLI scaling overlay, which can at least tell you whether SLI is enabled at all.

At this point, you can try running your game. Make sure you run it in true fullscreen, not just a borderless window covering the whole screen, or SLI won't scale (but an empty SLI indicator will still show!). If all goes well, you should see roughly a 90% boost in FPS (assuming you're GPU limited) and the SLI indicator should fill with green. However, many functions can inhibit SLI performance (most notably FBO rendering to textures, mipmap generation, etc.), and in some cases the driver may completely kill your scaling by forcing synchronization between the GPUs, often leading to negative scaling. In that case, there is a special compatibility setting that seems to disable most SLI synchronization and restore proper scaling. If you see no scaling, try these last few steps:

5. Click the "Show unknown settings from NVIDIA predefined profiles" button on the menu bar (the icon with two cogwheels and a magnifying glass).

6. Scroll down quite far until you reach a category called "Unknown".

7. Find the setting called "MULTICHIP_OGL_OPTIONS (0x209746C1)". Change it from the default 0x00000000 to 0x00000002 by typing in the value by hand.

Explanation:

MULTICHIP_OGL_OPTIONS seems to serve the same purpose for OpenGL as the SLI compatibility bits do for DirectX games. I tried all of the predefined values in the dropdown list for that setting, but many of them gave either no scaling or graphical artifacts. What I realized is that changing it to a value that is NOT on the list seems to disable the synchronization, regardless of which value you pick. I was expecting each bit to serve a specific function, but that does not seem to be the case; the hex value may be some kind of hash or key that alters the driver's behavior. Setting it to anything that isn't predefined (0x00000002 being the first "free" value) seems to disable all synchronization between the GPUs.
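
For completeness: the same two settings could in principle be written from code through NVAPI's driver settings (DRS) interface instead of NVIDIA Inspector. The sketch below is untested and full of assumptions: it relies on the NVAPI SDK headers (nvapi.h / NvApiDriverSettings.h), the profile name is hypothetical, error handling is stripped, and the game's executable would still need to be attached to the profile with NvAPI_DRS_CreateApplication (not shown).

#include <nvapi.h>
#include <NvApiDriverSettings.h>
#include <cstring>
#include <cwchar>

// Untested sketch: create a driver profile and write the two settings
// discussed in this thread via the NVAPI DRS API.
bool writeSliProfileSettings() {
    if (NvAPI_Initialize() != NVAPI_OK) return false;

    NvDRSSessionHandle session = nullptr;
    NvAPI_DRS_CreateSession(&session);
    NvAPI_DRS_LoadSettings(session);

    NVDRS_PROFILE profile = {};
    profile.version = NVDRS_PROFILE_VER;
    const wchar_t* name = L"MyGame";   // hypothetical profile name
    // On Windows wchar_t is 16-bit, matching NvAPI_UnicodeString.
    std::memcpy(profile.profileName, name, (wcslen(name) + 1) * sizeof(wchar_t));
    NvDRSProfileHandle hProfile = nullptr;
    NvAPI_DRS_CreateProfile(session, &profile, &hProfile);

    NVDRS_SETTING setting = {};
    setting.version = NVDRS_SETTING_VER;
    setting.settingType = NVDRS_DWORD_TYPE;

    // Step 3 above: SLI rendering mode = FORCE_AFR2 (0x00000003).
    setting.settingId = SLI_RENDERING_MODE_ID;
    setting.u32CurrentValue = SLI_RENDERING_MODE_FORCE_AFR2;
    NvAPI_DRS_SetSetting(session, hProfile, &setting);

    // Step 7 above: MULTICHIP_OGL_OPTIONS = 0x00000002 (raw ID from this thread).
    setting.settingId = 0x209746C1;
    setting.u32CurrentValue = 0x00000002;
    NvAPI_DRS_SetSetting(session, hProfile, &setting);

    NvAPI_DRS_SaveSettings(session);
    NvAPI_DRS_DestroySession(session);
    NvAPI_Unload();
    return true;
}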

Sadly, I still haven't figured out the GL_TEXTURE_2D_ARRAY problem. It seems to be a driver issue where the generated mipmaps are not copied to both GPUs when glGenerateMipmap() is called. This may be intended behavior, but the same function certainly works for normal GL_TEXTURE_2D textures.
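
For reference, this is the shape of the code in question (a simplified sketch, not my actual loader): a GL_TEXTURE_2D_ARRAY with full mip storage, one particle per layer, mipmapped with glGenerateMipmap(). If the driver really only generates the mips on one GPU under AFR, uploading pre-built mip levels with glTexSubImage3D per level, like any other pixel upload, might work around it.

#include <GL/glew.h>

// Sketch of the particle texture array: immutable storage with a full mip
// chain, level 0 uploaded per layer, then mipmaps generated by the driver.
GLuint createParticleArray(int size, int layers, int mipLevels,
                           const void* const* layerPixels) {
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
    glTexStorage3D(GL_TEXTURE_2D_ARRAY, mipLevels, GL_RGBA8, size, size, layers);

    // Upload mip level 0 of every layer.
    for (int layer = 0; layer < layers; ++layer) {
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, layer,
                        size, size, 1, GL_RGBA, GL_UNSIGNED_BYTE, layerPixels[layer]);
    }

    // The call that appears to leave the other GPU with stale mip levels under AFR:
    glGenerateMipmap(GL_TEXTURE_2D_ARRAY);

    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    return tex;
}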

