Multithreaded game design - help

Started by Sc4Freak
10 comments, last by Nairou 16 years, 2 months ago
I'm trying to design my game with multithreading in mind. It's going to be quite a simple system, consisting of only 2 main threads: "Simulation" and "Render". One of the key goals of the system is to completely detach the renderer from the simulation. That is, the simulation runs at a speed independent of the renderer. I chose a fixed timestep of 10hz (100ms) for the simulation, and the renderer will just push out frames as fast as it can.

Because of the low update rate of the simulation thread, the renderer needs to interpolate data between timesteps. Since the renderer could potentially be running at upwards of 30-60fps, there would need to be 3-6 interpolated frames for each 100ms timestep. To achieve this, I'm thinking of a buffering system: the simulation thread does its calculations and writes to a buffer, while the renderer reads from two buffers containing the data from the past 2 timesteps and interpolates between them (a rough sketch of what I mean is at the end of this post). That's what I'm envisioning.

There are a few problems with it. One is latency. Since my timestep is 100ms, the time between input and render is 200ms at a minimum. If the rendering or simulation takes longer than expected, the latency will be even higher. Since my game is a space sim, latency isn't a *huge* issue, but 200ms (or more) seems a bit much.

The other problem is the potential for data starvation. If the simulation thread takes longer to process a timestep than expected, then the render thread will finish interpolating through its two buffers and have to sit waiting for the next timestep to complete. If this happened every timestep, it would result in extremely jerky motion.

Can anyone give any suggestions to improve my design?
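Here's roughly the interpolation I have in mind on the renderer side (just a C++ sketch; the snapshot struct and names are made up):

    #include <vector>

    struct Vec3 { float x, y, z; };

    // One complete set of simulation results, written once per 100ms timestep.
    struct SimSnapshot {
        double time;                 // simulation time this snapshot represents
        std::vector<Vec3> positions; // one entry per object
    };

    Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
        return { a.x + (b.x - a.x) * t,
                 a.y + (b.y - a.y) * t,
                 a.z + (b.z - a.z) * t };
    }

    // The renderer blends the two most recent snapshots based on how far the
    // current render time has advanced between them.
    void buildRenderPositions(const SimSnapshot& prev, const SimSnapshot& curr,
                              double renderTime, std::vector<Vec3>& out) {
        float t = static_cast<float>((renderTime - prev.time) /
                                     (curr.time - prev.time));
        out.resize(curr.positions.size());
        for (size_t i = 0; i < out.size(); ++i)
            out[i] = lerp(prev.positions[i], curr.positions[i], t);
    }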
NextWar: The Quest for Earth available now for Windows Phone 7.
I see where you're going with the whole timestep-at-10hz thing, but I don't think it's a good idea. The interpolation part will essentially serialize your renderer, which will heavily reduce the benefits of using multiple threads.
The buffer approach is quite good, but you should use a ring buffer to account for unexpected delays.
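Something along these lines, roughly (only a sketch; the slot count and names are arbitrary, and synchronization between the threads is omitted):

    #include <array>
    #include <cstddef>

    // Fixed-size ring of simulation snapshots. The simulation thread writes
    // into the slot at 'head'; the renderer interpolates between the two most
    // recently completed slots, so a late update doesn't leave it empty-handed.
    template <typename Snapshot, std::size_t N = 4>
    class SnapshotRing {
    public:
        Snapshot& writeSlot() { return slots_[head_ % N]; }
        void publish() { ++head_; }   // call after the simulation finishes a step

        bool hasTwo() const { return head_ >= 2; }
        const Snapshot& newest()   const { return slots_[(head_ - 1) % N]; }
        const Snapshot& previous() const { return slots_[(head_ - 2) % N]; }

    private:
        std::array<Snapshot, N> slots_{};
        std::size_t head_ = 0;
    };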
If you want to speed up your simulation, make sure each entity in your scene, given the current frame time, can catch up with the simulation, even if it missed a couple of frames. That way you only have to simulate entities that pass the visibility test (i.e. entities that are inside the view frustum and not occluded). Of course you'll still have to do the transformations for all entities (or your culling won't work), but you won't have to do particle simulations or bone transforms unless they're really visible.
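For the catch-up part, I mean something like this (a sketch; 'Entity' and its members are placeholders):

    // Each entity remembers when it was last simulated, so it can advance by
    // however much time it missed while it was culled.
    struct Entity {
        double lastUpdateTime = 0.0;
        // position, velocity, animation state, ...

        void advanceTo(double now) {
            double dt = now - lastUpdateTime;   // may cover several missed steps
            if (dt <= 0.0) return;
            // integrate movement / animation over the whole dt here
            lastUpdateTime = now;
        }
    };

Only entities that pass the visibility test get advanced each frame; the rest simply fall behind and catch up the moment they become visible again.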
Also, make sure you use all the available cores for your simulation or your app will scale very poorly. You're on the right track with your buffer approach.
What are the goals you hope to achieve by putting your renderer in a separate thread? Interpolation is very problematic because you have no reliable way of knowing how long a logic frame or a render frame is going to take. Whenever the render thread stalls for the logic thread, the motion looks jerky. Whenever the logic thread stalls for the render thread, your simulation appears to be running in slow motion. There's also the issue of the interpolation itself, which isn't necessarily well-defined in all cases. For instance, if the logic thread says to render an object one frame but not the next frame (or vice versa), how does your renderer "interpolate" that? How do you interpolate between the different renderstates?

I'm not saying that you shouldn't put logic and rendering in separate threads, just that interpolation probably isn't the best reason, because it's incredibly difficult to get it looking right without jerkiness or slow motion. Instead, try to go for application responsiveness. Put your GUI and input in the render thread as well, so even if the simulation is running at 10 FPS the user can still interact with your application at 60 FPS or however fast the render thread is running.

Don't have the render thread stall on the logic thread -- whenever the logic thread has new data (about every 100ms in your case), it waits for the render thread to finish its current frame (which shouldn't be very long since the render thread is running as fast as possible), and flags that they should sync up their data. You can use whatever buffering scheme you like here, but note that ring buffers or double buffers can be troublesome in low-memory environments, e.g. consoles. The render thread just keeps rendering the same game scene over and over until the logic thread tells it otherwise, but it still responds to user input and updates the GUI appropriately.

The end result is that all interactions with the GUI itself are really fast and everything *feels* very smooth, even though the simulation itself will be running at a slower rate. 10 FPS is a bit slow though; you might want to try 15-25 FPS so that the user doesn't notice the discrete simulation steps as much.
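In rough C++ terms, the hand-off I'm describing could look something like this (only a sketch; the flag and names are made up):

    #include <atomic>
    #include <mutex>

    struct SceneData { /* whatever the renderer needs to draw a frame */ };

    std::mutex        sharedMutex;
    SceneData         sharedScene;          // written by logic, copied by render
    std::atomic<bool> newDataReady{false};

    // Logic thread, roughly every 100ms:
    void logicPublish(const SceneData& latest) {
        std::lock_guard<std::mutex> lock(sharedMutex);
        sharedScene = latest;
        newDataReady = true;
    }

    // Render thread, every frame:
    void renderFrame(SceneData& localScene) {
        if (newDataReady.exchange(false)) {
            std::lock_guard<std::mutex> lock(sharedMutex);
            localScene = sharedScene;        // quick copy, then carry on
        }
        // handle input, update the GUI, and draw localScene as usual
    }

The copy under the lock is the only point where the two threads can hold each other up, and it should be very short compared to a whole frame.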
Just stick to using a single thread for now. It's hard enough to program correctly with a sequential model. Doing so with a concurrent model is near impossible in existing languages. Don't be suckered into jumping on the silly "OMG use threads" bandwagon that so many others have... instead, build good software and don't worry so much about how many threads and cores you're utilizing.
My rating perfectly reflects the pathetic yes-men in-crowd attitude of this forum.
Has nothing to do with threading, but you might find this interesting.
Quote:Original post by Zipster
What are the goals you hope to achieve by putting your renderer in a separate thread?

The reasons are twofold. On one hand, I want a framerate-independent simulation rate, and the system has to deal with the possibility of either side running slow. If the renderer is running slower than the simulation, the simulation should keep simulating at its set speed. If the simulation is running slower than the renderer, the renderer should keep interpolating between steps as fast as it can.

On the other hand, I like the extra performance it can give, although this is a secondary concern.

Quote:I'm not saying that you shouldn't put logic and rendering in separate threads, just that interpolation probably isn't the best reason, because it's incredibly difficult to get it looking right without jerkiness or slow motion. Instead, try to go for application responsiveness. Put your GUI and input in the render thread as well, so even if the simulation is running at 10 FPS the user can still interact with your application at 60 FPS or however fast the render thread is running. Don't have the render thread stall on the logic thread -- whenever the logic thread has new data (about every 100ms in your case), it waits for the render thread to finish its current frame (which shouldn't be very long since the render thread is running as fast as possible), and flags that they should sync up their data. You can use whatever buffering scheme you like here, but note that ring buffers or double buffers can be troublesome in low-memory environments, e.g. consoles. The render thread just keeps rendering the same game scene over and over until the logic thread tells it otherwise, but it still responds to user input and updates the GUI appropriately. The end result is that all interactions with the GUI itself are really fast and everything *feels* very smooth, even though the simulation itself will be running at a slower rate. 10 FPS is a bit slow though; you might want to try 15-25 FPS so that the user doesn't notice the discrete simulation steps as much.

But that has its own problems. Since that system uses no interpolation, you'd have very jerky motion. And if the simulation were running at 30fps and the renderer at 60fps, the renderer would just be wasting time, since it would render the same frame twice (except possibly for the GUI).

Memory isn't of particular concern right now, but I don't think my simulation can run very fast. I plan to have, effectively, a very large game world to simulate: dynamic economies, hundreds of NPCs, AI, etc. 100ms is a lot more realistic than 20ms for this kind of simulation. And even if it weren't, if the simulation actually runs slower or faster than expected, I'd prefer that the entire system didn't break down.

I can't really think of any method that would satisfy my requirements, since I need:
- Decoupled simulation and rendering
- If the renderer is slow, the simulation is not affected
- If the simulation is slow, the renderer is not affected
- Low enough latency

Quote:Original post by Gage64
Has nothing to do with threading, but you might find this interesting.

I'll take a look at that, thanks.
NextWar: The Quest for Earth available now for Windows Phone 7.
Why not decouple the wide-scale simulation (economy, out-of-zone NPCs, etc.) into a separate thread, but keep input and things directly affected by input (physics, local AI, etc.) in the render thread? That would take quite a bit of load off the render thread without running into lag problems. ;)
Excellent thread. I am working on a system almost exactly like Sc4Freak describes, for many of the same reasons (e.g. allowing the use of V-Sync without stalling the game logic). Interpolation is definitely a pain, but I don't see an alternative when the two threads are running at different speeds.

Like you, I use a fixed number of buffers between the logic and render threads (currently 4), and the buffers are rotated between the threads: one buffer for the logic thread to write to, two buffers for the render thread to read from (containing the two logic updates for it to interpolate between), and one 'pending' buffer holding the most recent update from the logic thread, for the renderer to use when it is ready.

If the logic thread finishes an update but the renderer is for some reason running so slowly that it hasn't taken the pending buffer, it swaps the buffer it just finished with the pending buffer, to provide the renderer with the latest update, and then reuses the other buffer for the next update.
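Very roughly, the rotation looks like this (a simplified sketch; the real code needs locking around the swaps, and the names here are just for illustration):

    #include <utility>

    struct StateBuffer { /* one logic update's worth of data */ };

    StateBuffer buffers[4];
    int  writeBuf     = 0;    // logic thread is filling this one
    int  pendingBuf   = 1;    // most recent completed update (when pendingFresh)
    int  renderPrev   = 2;    // renderer interpolates from this...
    int  renderCurr   = 3;    // ...to this
    bool pendingFresh = false;

    // Logic thread, after finishing an update. If the renderer hasn't taken the
    // previous pending buffer yet, this just replaces it with the fresher update
    // and reuses the stale buffer for the next step.
    void logicFinishedUpdate() {
        std::swap(writeBuf, pendingBuf);
        pendingFresh = true;
    }

    // Render thread, when it's ready for a newer update:
    void rendererTakePending() {
        if (!pendingFresh) return;   // nothing new yet; keep interpolating/extrapolating
        int freed  = renderPrev;     // oldest buffer, no longer needed
        renderPrev = renderCurr;
        renderCurr = pendingBuf;
        pendingBuf = freed;          // hand the freed buffer back for reuse
        pendingFresh = false;
    }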

However, if the renderer finishes its rendering of the current frame (or interpolation between the current and previous updates) before the logic thread has prepared a new update, then the renderer attempts to extrapolate beyond the data it has and "guess" where things should be rendered based on previous interpolation trends. Not ideal, but better than stalling and creating visual jitters. Once a new update becomes available, it can go back to interpolating from its current extrapolated data to the new updated data.
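The extrapolation itself is nothing clever; it just continues the velocity implied by the last two updates (sketch):

    struct Vec3 { float x, y, z; };

    // 'prev' and 'curr' are an object's positions from the last two logic
    // updates, 'stepSeconds' is the logic timestep, and 'secondsPastCurr' is
    // how far the renderer has gone beyond the newest update.
    Vec3 extrapolate(const Vec3& prev, const Vec3& curr,
                     float stepSeconds, float secondsPastCurr) {
        Vec3 v = { (curr.x - prev.x) / stepSeconds,
                   (curr.y - prev.y) / stepSeconds,
                   (curr.z - prev.z) / stepSeconds };
        return { curr.x + v.x * secondsPastCurr,
                 curr.y + v.y * secondsPastCurr,
                 curr.z + v.z * secondsPastCurr };
    }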

Interpolation and extrapolation in the renderer just exist to guarantee that everything always looks smooth between logic updates, whether they are on time or not. I don't know of any other way around this.

Latency is still an issue, but at least it is fairly predictable, and if your logic updates are fast enough then it won't be noticeable. I too think 10hz is way too slow for anything on the client, unless you have very little going on in-game. And, as mentioned, such a low update rate will make player input (e.g. mouse movement) feel very sluggish unless you do additional hacking to speed up its sample rate. 20-30hz is better, and if you have a lot of speed or physics in your simulation I'd go so far as to say 80-100hz is ideal to prevent anomalies.
Quote:Original post by Pox
Why not decouple the wide-scale simulation (economy, out-of-zone NPCs, etc.) into a separate thread, but keep input and things directly affected by input (physics, local AI, etc.) in the render thread? That would take quite a bit of load off the render thread without running into lag problems. ;)

Because that violates one of the key goals of the system. Physics and local AI, as well as the economy and out-of-zone NPCs, must operate on a fixed timestep. Otherwise, you make physics and local AI dependent on the renderer's framerate, which would be bad.

@Nairou
Yeah, I recently had the idea to include a fourth buffer as well, serving the exact same function as yours. The fourth buffer was needed because otherwise the simulation thread would stall if the renderer was still using the other two buffers.

I've had another idea, though. I could have 3 threads: one for "in-zone" simulation, one for "out-of-zone" simulation, and one for rendering. The "in-zone" thread would handle physics, local AI, and everything else that's near you. It'd run at a relatively high update rate (30hz or above, depending on how fast I can make it), which would help mitigate input lag. The "out-of-zone" simulation (which would contain a lot of the heavy lifting - dynamic economy simulation, wide-scale AI, etc.) would run at a much lower rate - 10hz or so.

It doesn't *completely* solve the input lag situation, since all input would still be 3 or 4 timesteps behind what's currently rendering. But at 30hz (or more, if I can manage it) it won't be as painfully obvious as it is at 10hz.

I think this might be the system I'll use, since I really don't see much of an alternative. Unless anyone has suggestions or improvements?
NextWar: The Quest for Earth available now for Windows Phone 7.
Running the less important large-scale stuff at a lower frequency is definitely a good idea, though I think adding a third thread to do (essentially) more of the same sort of work would be a bit of overkill in this case. The alternative would be to run the entire logic thread at a high rate, but use a multiplier to perform the "out-of-zone" tasks at a lower rate. So if you had the logic thread running at, say, 30hz, and you wanted the "out-of-zone" tasks running at 10hz, you would just have your logic thread perform those tasks every third iteration for the same result.
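In other words, something like this (sketch; the function names are placeholders):

    // A single logic thread at 30hz, with the heavy "out-of-zone" work
    // performed only every third tick (so effectively 10hz).
    void updateInZone();       // physics, local AI, input response
    void updateOutOfZone();    // economy, wide-scale AI, etc.
    void waitForNextTick();    // sleep until the next ~33ms boundary

    const int kOutOfZoneDivider = 3;

    void logicThreadLoop() {
        int tick = 0;
        for (;;) {
            updateInZone();
            if (tick % kOutOfZoneDivider == 0)
                updateOutOfZone();
            ++tick;
            waitForNextTick();
        }
    }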

This topic is closed to new replies.
