Single-core hyperthreaded machine optimizations

Started by
0 comments, last by Promit 17 years, 7 months ago
I have an Intel Pentium 4, 2.8 GHz, single-core, hyperthreaded CPU. I had a go at making my simple test-bed engine multi-threaded, and the results are interesting.

The main program (main thread) handles the logic and builds a display list that is executed on another thread (the render thread). I have an option to turn the render thread on or off; when it's off, the display list commands are executed immediately on the main thread.

Here's what's weird: when the render thread is turned on, the logic time goes up (not expected!) and the display list building time goes down (expected). The end result is that with either method (thread on or off), the sum of the logic time and the display list building time is roughly the same. This of course results in the same frame rate when the CPU is the bottleneck.

Can someone explain this? I'm not entirely sure how hyperthreading works on a single CPU, but I'm thinking the logic time goes up with the render thread active because of cache misses: the render thread reads from a different area of memory (and invalidates the shared cache?), which fights with the main thread's memory access.

Thanks
Steve Broumley
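
For readers following along, the setup described in the post could be sketched roughly as below. This is not the poster's code: `Command`, `DisplayList`, `Renderer`, `submit` and `flush` are hypothetical names, and a display list is modeled simply as a vector of callables. With `threaded` set to false, commands run immediately on the calling thread; with it set to true, they are handed to a render thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one recorded render command.
using Command = std::function<void()>;
using DisplayList = std::vector<Command>;

// Runs every command in the list, in order, on the calling thread.
inline void execute(const DisplayList& list) {
    for (const Command& cmd : list) cmd();
}

class Renderer {
public:
    explicit Renderer(bool threaded) : threaded_(threaded) {
        if (threaded_) worker_ = std::thread([this] { run(); });
    }

    ~Renderer() {
        if (!threaded_) return;
        {
            std::lock_guard<std::mutex> lock(m_);
            quit_ = true;
        }
        cv_.notify_all();
        worker_.join();  // the worker drains any queued lists before exiting
    }

    // Render thread on: queue the list. Render thread off: execute it now.
    void submit(DisplayList list) {
        if (!threaded_) {
            execute(list);
            return;
        }
        {
            std::lock_guard<std::mutex> lock(m_);
            assert(!quit_ && "submit after shutdown");
            pending_.push(std::move(list));
        }
        cv_.notify_all();
    }

    // Blocks until everything submitted so far has been executed.
    void flush() {
        if (!threaded_) return;
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return pending_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return quit_ || !pending_.empty(); });
            if (pending_.empty()) return;  // quit requested, nothing left
            DisplayList list = std::move(pending_.front());
            pending_.pop();
            busy_ = true;
            lock.unlock();   // execute without holding the lock
            execute(list);
            lock.lock();
            busy_ = false;
            cv_.notify_all();  // wake any thread waiting in flush()
        }
    }

    bool threaded_;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<DisplayList> pending_;
    bool busy_ = false;
    bool quit_ = false;
    std::thread worker_;
};
```

In a frame loop one would build a `DisplayList` during the logic step, call `submit`, and call `flush` before presenting the frame. Timing the logic step and the list-building step separately, with `threaded` on and off, gives the two numbers being compared in the post.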
Roughly speaking, hyperthreading leverages the fact that, thanks to the ludicrous depth and substantial width of the pipeline combined with the double-pumped ALUs, a single thread can almost never keep all of the different parts of the chip occupied at once. So what the P4 does is sneak a second thread into the units that aren't in use. The two threads still share the same execution units and caches, so when they compete for the same resources, one thread's work can slow the other down.
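
One way to check whether a second thread is slowing the logic down is to time the same work with and without a competing thread. The following is only a rough sketch: `busyWork`, `timeLogicMs` and `timeLogicWithCompetitorMs` are made-up names, and the workload is plain integer arithmetic standing in for real game logic. On a machine with several physical cores the OS will probably schedule the two threads on different cores, so a difference would only be expected when both threads land on the two logical processors of one hyperthreaded core. Pinning threads is platform-specific and is left out here.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for per-frame logic: integer work, no shared data.
inline std::uint64_t busyWork(std::uint64_t iterations) {
    std::uint64_t x = 88172645463325252ULL;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        // One xorshift step per iteration.
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    return x;
}

// Wall-clock milliseconds taken by busyWork on the calling thread.
inline double timeLogicMs(std::uint64_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    volatile std::uint64_t sink = busyWork(iterations);  // keep the result live
    (void)sink;
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Same measurement while a second thread runs its own busyWork in a loop.
inline double timeLogicWithCompetitorMs(std::uint64_t iterations) {
    std::atomic<bool> stop{false};
    std::atomic<std::uint64_t> sink{0};
    std::thread competitor([&stop, &sink] {
        while (!stop.load(std::memory_order_relaxed)) {
            sink.fetch_add(busyWork(1000), std::memory_order_relaxed);
        }
    });
    const double ms = timeLogicMs(iterations);
    stop.store(true, std::memory_order_relaxed);
    competitor.join();
    return ms;
}
```

Comparing `timeLogicMs(n)` against `timeLogicWithCompetitorMs(n)` over many runs gives a rough idea of how much the second thread costs the first. A single run is too noisy to draw conclusions from.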

Intel has a page all about hyperthreading and how to write code that deals with it effectively. (Beware, applying these techniques may result in degraded performance on other systems.)
