My game engine, vastly simplified, does two things: simulate the game world, then render the game world. Note that I'm including getting input and some other subsystems as part of the "simulate" step.
Based on this forum discussion, I decided to try moving the rendering to a second thread.
Unfortunately, on my dual-core CPU the multithreaded mode performs worse than the single-threaded mode, and by a fair margin. Let me explain in a little more detail what's going on.
Here's what the execution looks like in single threaded mode:
|---sim---|---sync---|-------render-------|
and repeat. Note: the "sync" represents the time spent sending the simulation updates to the rendering system.
Here's what the execution looks like in multithreaded mode:
main thread:   |---sim---|---wait---|---sync---|
render thread: |-------render-------|---wait---|
and repeat.
Note that the main thread's simulation step finishes much faster than the rendering, so the main thread then has to wait for the rendering to finish before it can start a sync (sending data to the render system). While the main thread is doing a sync, the render thread has to wait.
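To make the handoff described above concrete, here is a minimal sketch of the two-semaphore scheme. It is an assumed reconstruction, not the engine's actual code: Python's `threading.Semaphore` stands in for the SDL semaphores (where `SDL_SemWait` and `SDL_SemPost` would play the roles of `acquire` and `release`), and `simulate`, `sync`, `FRAMES`, and the state dictionaries are hypothetical placeholders.

```python
import threading

FRAMES = 5  # hypothetical frame count for the demo

render_done = threading.Semaphore(1)  # starts at 1: nothing is being rendered yet
sync_done = threading.Semaphore(0)    # starts at 0: nothing has been synced yet

sim_state = {"frame": -1}     # owned by the main thread
render_state = {"frame": -1}  # written only during sync, read only while rendering
rendered = []                 # frames the render thread has drawn, in order

def simulate(frame):
    sim_state["frame"] = frame  # placeholder for the real simulation step

def sync():
    render_state["frame"] = sim_state["frame"]  # placeholder for sending updates

def render_loop():
    for _ in range(FRAMES):
        sync_done.acquire()                     # the "wait" in the render thread's row
        rendered.append(render_state["frame"])  # placeholder for drawing the scene
        render_done.release()                   # tell the main thread rendering is finished

def main_loop():
    for frame in range(FRAMES):
        simulate(frame)        # overlaps with rendering of the previous frame
        render_done.acquire()  # the "wait" in the main thread's row
        sync()                 # render thread is blocked on sync_done meanwhile
        sync_done.release()    # let the render thread draw this frame

render_thread = threading.Thread(target=render_loop)
render_thread.start()
main_loop()
render_thread.join()
print(rendered)  # [0, 1, 2, 3, 4]
```

In this arrangement the sync and the render can never overlap, which is exactly why each thread spends part of every frame blocked on the other.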
Performance data indicates that the single-threaded mode is using 100% of one core, as expected, and I get about 65 FPS. In multithreaded mode, each thread uses a little over 50% of its core, spending the rest of the time waiting as per the diagram above, and I get about 45 FPS.
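Converting those frame rates into per-frame times makes the gap easier to reason about. The snippet below is only back-of-the-envelope arithmetic on the numbers quoted above; treating "a little over 50%" as exactly 0.5 is an assumption made to get a rough estimate.

```python
def frame_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

single_ms = frame_ms(65)  # one thread doing sim + sync + render
multi_ms = frame_ms(45)   # measured with the render thread enabled

# Assumption: "a little over 50%" is rounded down to 0.5 for a rough estimate.
busy_fraction = 0.5
main_busy_ms = busy_fraction * multi_ms    # sim + sync on the main thread
render_busy_ms = busy_fraction * multi_ms  # render on the render thread
total_busy_ms = main_busy_ms + render_busy_ms

print(f"single-threaded frame: {single_ms:.1f} ms")      # 15.4 ms
print(f"multithreaded frame:   {multi_ms:.1f} ms")       # 22.2 ms
print(f"busy time per frame:   {total_busy_ms:.1f} ms")  # 22.2 ms
```

Under that assumption the two threads together spend about 22.2 ms of CPU time per frame on work that took about 15.4 ms on a single thread. That hints that the work itself became more expensive in multithreaded mode (or that some of the waiting is being counted as busy time), rather than the threads simply sitting idle.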
I was expecting the multithreaded version to do a fair amount of waiting, because right now my simulation of the game world isn't very CPU intensive and my sync step is poorly optimized and takes longer than it should. I expected the multithreaded version to be maybe marginally faster than the single-threaded version, and then, as I added complexity to the sim and sped up the sync, to start really outpacing it. However, at this point the multithreaded version is so much slower than the single-threaded version that I'm wondering if I've done something terribly, terribly wrong in the way I architected it. So, I thought I'd ask the gamedev folks:
Am I doing something fundamentally wrong with my multithreaded architecture? Here it is again:
main thread:   |---sim---|---wait---|---sync---|
render thread: |-------render-------|---wait---|
I'm implementing the waiting with SDL (libsdl.org) semaphores. I'm on a dual-core, 32-bit Linux system. One more data point: if I make the simulation much simpler and reduce the scene to something very simple (which reduces sync and render times), I can get upwards of 400 FPS out of the multithreaded mode and maybe 600 FPS from single-threaded mode.
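Putting the two scenes side by side as per-frame times shows how the multithreaded penalty changes with scene complexity. This is plain arithmetic on the approximate FPS figures from this post ("upwards of 400" and "maybe 600" are taken at face value); the `scenes` table and its labels are just illustrative names.

```python
def frame_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

# (single-threaded FPS, multithreaded FPS), approximate figures from the post
scenes = {"full scene": (65, 45), "simple scene": (600, 400)}

gaps = {}
for name, (single_fps, multi_fps) in scenes.items():
    # Extra time each multithreaded frame takes compared with single-threaded
    gaps[name] = frame_ms(multi_fps) - frame_ms(single_fps)
    print(f"{name}: multithreaded frame is {gaps[name]:.2f} ms longer")
# full scene: multithreaded frame is 6.84 ms longer
# simple scene: multithreaded frame is 0.83 ms longer
```

If these figures are roughly right, the penalty is not a constant per-frame cost: it grows from under 1 ms to almost 7 ms as the scene gets heavier, so a fixed semaphore handoff overhead alone would not seem to explain it.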
Thanks.